Informetrics, Webometrics and Web Use metrics - PowerPoint PPT Presentation

Provided by: ruil1

1
Informetrics, Webometrics and Web Use metrics
Huimin Lu 10/21/2004
2
Outline
History
Article 1: Bibliometrics and the WWW
Article 2: Bibliometrics of the WWW
Article 3: Authoritative Sources
Article 4: ParaSite
Conclusion
3
History
The term "bibliometrics" was introduced by Pritchard in 1969.
Pritchard's explanation: "the application of mathematical and statistical
methods to books and other media of communication."
4
A1 Bibliometrics and the World Wide Web, by Don Turnbull

Bibliometrics
Bibliometric laws
Applying bibliometrics to the WWW
Metrics design
5
A1 Bibliometrics
Classic citation analysis
Refined classic bibliometrics
- Standard formula for impact: n journal citations / n citable articles
  published
- Basic formula for the immediacy index of influence: n citations received
  by articles during the year / total number of citable articles published
Bibliographic coupling
- Measures the number of references two papers have in common to test for
  similarity
Cocitation analysis
- Measures the relations between cited documents
Common errors
- Multiple authors lost, self-citation, similar author names, human error,
  etc.
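The measures on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Turnbull's own code: the function names, the toy reference lists, and the sample counts are all assumptions made for the example.

```python
def impact_factor(journal_citations, citable_articles):
    """Impact: n journal citations / n citable articles published."""
    return journal_citations / citable_articles


def immediacy_index(citations_this_year, citable_articles_this_year):
    """Immediacy: citations received during the year / citable articles published."""
    return citations_this_year / citable_articles_this_year


def bibliographic_coupling(refs_a, refs_b):
    """Number of references that two papers have in common."""
    return len(set(refs_a) & set(refs_b))


def cocitation_count(reference_lists, doc_x, doc_y):
    """Number of citing papers whose reference lists contain both documents."""
    return sum(1 for refs in reference_lists.values()
               if doc_x in refs and doc_y in refs)


# Hypothetical toy data: each citing paper maps to the documents it cites.
references = {
    "P1": {"A", "B", "C"},
    "P2": {"B", "C", "D"},
    "P3": {"A", "D"},
}

print(impact_factor(150, 60))                                      # 2.5
print(immediacy_index(12, 48))                                     # 0.25
print(bibliographic_coupling(references["P1"], references["P2"]))  # 2
print(cocitation_count(references, "B", "C"))                      # 2
```

Coupling compares the reference lists of two citing papers, while cocitation counts how often two cited documents appear together, so the two measures look at the same citation data from opposite directions.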
6
A1 Bibliometric Laws
  • Bradford's Law of Scattering
  • - clustering method: R·a^n (n from 0, a < 1), sum = R/(1 - a)
  • Lotka's Law
  • - inverse square
  • Zipf's Law
  • - familiar words occur with high frequency (the nth word occurs
    k/n times)
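The three laws can be written as small functions. This is a rough sketch that reads the Bradford formula on the slide as a geometric series R·a^n with a < 1, which is an interpretation of the garbled slide text. The function names and sample numbers are illustrative assumptions.

```python
def bradford_total(R, a, zones=None):
    """Sum of R * a**n for n = 0, 1, 2, ... (requires 0 <= a < 1).

    With zones=None the closed form R / (1 - a) is returned;
    otherwise only the first `zones` terms are summed.
    """
    if not 0 <= a < 1:
        raise ValueError("a must satisfy 0 <= a < 1")
    if zones is None:
        return R / (1 - a)
    return sum(R * a ** n for n in range(zones))


def lotka_authors(authors_with_one_paper, n_papers):
    """Lotka's inverse-square law: authors with n papers ~ (authors with one) / n**2."""
    return authors_with_one_paper / n_papers ** 2


def zipf_frequency(k, rank):
    """Zipf's law: the word of rank n occurs about k / n times."""
    return k / rank


print(bradford_total(100, 0.5))   # 200.0
print(lotka_authors(100, 2))      # 25.0
print(zipf_frequency(1000, 4))    # 250.0
```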

7
A1 Applying Bibliometric to Web
  • Web surveys
  • - Georgia Tech Graphics, Visualization, and
    Usability Web Surveys
  • Web servers
  • Add programming logic
  • - Inaccurate data gathered: standard procedures are skipped, state
    information between usage hits is missed, and server hits themselves
    don't represent true usage.

8
A1 Metrics Design
Configure Web server to gather comprehensive metrics
Manage log files
- Enhance reliability: regular backup, store log file analysis results and
  logs, begin new logs in a timely manner, post results and log information
  for comparison
- Log analysis tools: Analog, WWWStat, GetStats, Perl scripts
- Standardization: Extended Log File Format by the WWW Consortium Standards
  Committee
Downie's attempt: user-based, request, and byte-based analysis
Optimal Web content setup
External bibliometric gathering
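A small log-analysis script shows what request, byte-based, and user-based counts look like in practice. This is a hedged sketch only: it assumes log lines in the Common Log Format rather than the Extended Log File Format named on the slide, and it treats each distinct host as one "user", which is a crude stand-in. The sample lines and the `summarize` function are invented for illustration.

```python
import re
from collections import Counter

# Assumed format: Common Log Format, e.g.
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines):
    """Return (requests per path, bytes per path, set of distinct hosts)."""
    requests = Counter()
    bytes_by_path = Counter()
    hosts = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        path = match.group("path")
        size = match.group("size")
        requests[path] += 1
        bytes_by_path[path] += 0 if size == "-" else int(size)
        hosts.add(match.group("host"))
    return requests, bytes_by_path, hosts


# Hypothetical sample log lines.
sample = [
    '10.0.0.1 - - [21/Oct/2004:10:00:00 -0400] "GET /index.html HTTP/1.0" 200 1024',
    '10.0.0.2 - - [21/Oct/2004:10:00:05 -0400] "GET /index.html HTTP/1.0" 200 1024',
    '10.0.0.1 - - [21/Oct/2004:10:01:00 -0400] "GET /paper.pdf HTTP/1.0" 200 50000',
    '10.0.0.1 - - [21/Oct/2004:10:02:00 -0400] "GET /missing HTTP/1.0" 404 -',
]

requests, bytes_by_path, hosts = summarize(sample)
print(requests.most_common(1))      # [('/index.html', 2)]
print(bytes_by_path["/paper.pdf"])  # 50000
print(len(hosts))                   # 2
```

Counting distinct hosts undercounts or overcounts real users (proxies, shared machines), which matches the earlier caveat that server hits do not represent true usage.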
9
A2 Bibliometrics of the World Wide Web: An Exploratory Analysis of the
Intellectual Structure of Cyberspace, by Ray R. Larson
Analysis of 30G Web pages collected by the Inktomi Web Crawler
Cocitation analysis using the DEC AltaVista search engine
10
A2 Growth and Usage of Web
WWW
11
A2 Cocitation Analysis of Web
Attempt: Map the intellectual structure of the Web
Question: Can cocitation techniques be applied to charting the contents of
cyberspace?
12
A2 Methods
Selection of a core set of items for study
Retrieval of cocitation frequency information
Compilation of the raw cocitation frequency matrix
Correlation analysis to convert the raw frequencies into correlation
coefficients
Multivariate analysis of the correlation matrix
Interpretation of the resulting map and validation
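The matrix-building and correlation steps above can be sketched as follows. This is an illustrative toy, not Larson's procedure: the core set, the citing pages, and the choice to leave the diagonal at zero are assumptions, and the multivariate analysis step (for example clustering or scaling) is left out.

```python
from itertools import combinations
from math import sqrt


def cocitation_matrix(core, citing_pages):
    """Raw cocitation frequencies: how often each pair of core items is cited together."""
    index = {item: i for i, item in enumerate(core)}
    n = len(core)
    matrix = [[0] * n for _ in range(n)]
    for links in citing_pages:
        cited = sorted(index[item] for item in set(links) if item in index)
        for i, j in combinations(cited, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1
    return matrix


def pearson(x, y):
    """Pearson correlation of two equal-length rows (0.0 if either row is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def correlation_matrix(matrix):
    """Convert raw cocitation frequencies into correlation coefficients."""
    n = len(matrix)
    return [[pearson(matrix[i], matrix[j]) for j in range(n)] for i in range(n)]


# Hypothetical core set and the outgoing links of four citing pages.
core = ["a", "b", "c", "d"]
citing_pages = [["a", "b"], ["a", "b", "c"], ["c", "d"], ["a", "b", "x"]]

raw = cocitation_matrix(core, citing_pages)
print(raw[0][1])  # 3: "a" and "b" are cited together by three pages
corr = correlation_matrix(raw)
```

The resulting correlation matrix is what a multivariate method would take as input to produce the map that is then interpreted and validated.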
13
A2 Results
14
A3 Authoritative Sources in a Hyperlinked Environment, by Jon M. Kleinberg
A new method for automatically extracting certain
types of information about a hypermedia
environment from its link structure.
15
A3 Goal
  • Types of search query and their problems
  • - Specific queries: scarcity problem
  • - Broad-topic queries: abundance problem
  • - Similar-page queries
  • Synthesize the unreliable information contained
    in the presence of individual links to provide a
    set of authoritative pages relevant to an initial
    query.

16
A3 Common Approaches
Only S
- Define S to be the top k pages indexed by AltaVista
- Rank pages according to their in-degree
S -> T
- Define the same root set S
- Grow S to a larger base set T
- Rank pages by their in-degree
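Both baseline approaches rank by in-degree and differ only in the page set used. A minimal sketch, assuming the link graph is available as (source, target) pairs; the function name and toy graph are hypothetical:

```python
def rank_by_in_degree(pages, links):
    """Rank pages by the number of distinct in-set pages that link to them."""
    pages = set(pages)
    in_degree = {page: 0 for page in pages}
    for source, target in set(links):
        # Only count links whose endpoints both lie in the chosen set (S or T).
        if source in pages and target in pages and source != target:
            in_degree[target] += 1
    # Highest in-degree first; ties broken alphabetically for determinism.
    return sorted(pages, key=lambda page: (-in_degree[page], page))


# Hypothetical toy graph; "x" lies outside the page set, so its link is ignored.
toy_links = [("a", "c"), ("b", "c"), ("c", "a"), ("x", "b")]
print(rank_by_in_degree({"a", "b", "c"}, toy_links))  # ['c', 'a', 'b']
```

Passing the root set S gives the first approach, and passing the larger base set T gives the second.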
17
A3 Their Approach
Extract small core sets, a community of hubs and authorities, from T
Authoritative pages
- A novel type of quality measure of a document in hypermedia, obtained by
  algorithmic means
- Large in-degree, with considerable overlap in the sets of pages that point
  to them
Hub pages
- Have links to multiple relevant authoritative pages
18
A3 Algorithm and Output
Method: Iteratively propagates authority weight and hub weight across links
of the web graph, converging simultaneously to steady states for both types
of weights
Output: a pair of sets (X, Y) (X a small set of authorities, Y a small set
of hubs), referred to by the author as a community of hubs and authorities
Claim: authoritative pages can be identified as belonging to dense bipartite
communities in the link graph of the WWW via the algorithm.
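The iterative propagation described above can be sketched as a short power iteration. This is a simplified illustration on a toy graph, assuming a fixed number of iterations and sum-of-squares normalization. The names `hits` and `normalize` and the sample links are made up for the example, and building the base set T is not shown.

```python
from math import sqrt


def normalize(weights):
    """Scale weights so that their squares sum to 1 (no-op if all are zero)."""
    norm = sqrt(sum(value * value for value in weights.values()))
    if norm == 0:
        return dict(weights)
    return {node: value / norm for node, value in weights.items()}


def hits(links, iterations=50):
    """Return (authority, hub) weights for a graph given as (source, target) pairs."""
    links = set(links)
    nodes = sorted({node for edge in links for node in edge})
    auth = {node: 1.0 for node in nodes}
    hub = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority weight: sum of hub weights of the pages pointing to the page.
        new_auth = {node: 0.0 for node in nodes}
        for source, target in links:
            new_auth[target] += hub[source]
        # Hub weight: sum of authority weights of the pages the page points to.
        new_hub = {node: 0.0 for node in nodes}
        for source, target in links:
            new_hub[source] += new_auth[target]
        auth, hub = normalize(new_auth), normalize(new_hub)
    return auth, hub


# Hypothetical toy graph: three hub-like pages pointing at two authorities.
toy_graph = [("h1", "a1"), ("h1", "a2"), ("h2", "a1"), ("h2", "a2"), ("h3", "a2")]
auth, hub = hits(toy_graph)
print(max(auth, key=auth.get))  # a2
```

Taking the few pages with the largest authority weights and the few with the largest hub weights gives a pair of sets in the spirit of the (X, Y) output on the slide.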
19
A4 ParaSite: Mining Structural Information on the Web, by Ellen Spertus
Varieties of link information on the Web
How the Web differs from conventional hypertext
How the links can be exploited to build useful applications
20
A4 Classical Hypertext vs. Web
Classical hypertext
- Links don't cross site or even document boundaries
- Documents are limited to a single topic
- A manual answers each question in exactly one place or not at all
- Hardly changes
Web
- Links can cross site and document boundaries
- Multiple topics are permitted in one Web page
- An answer could appear any number of times on the Web
- Constantly changing
21
A4 Mining Links
Naïve link geometry
- A useful technique for finding pages on a given set of topics
Hypertext links (example)
- Categorized into upward, downward, crosswise, and outward
Directory links
- Directory structure relates pages in the absence of hypertext links
Structure within a page
- A page can be considered a tree of nodes, each with attached text and
  links embedded in the text
Other
- Domain names, relationships between concepts represented by words and
  phrases, paths traveled through Web sites by visitors
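The four hypertext link categories can be approximated from URLs alone. The sketch below is an assumption-laden illustration, not Spertus's definition. It calls a link outward when the host differs, upward or downward when the target directory is a strict ancestor or descendant of the source directory, and crosswise otherwise (including links within the same directory).

```python
import posixpath
from urllib.parse import urlsplit


def _directory_parts(path):
    """Directory components of a URL path, e.g. '/a/b/page.html' -> ['a', 'b']."""
    directory = posixpath.dirname(path or "/")
    return [part for part in directory.split("/") if part]


def classify_link(source_url, target_url):
    """Classify a link as 'upward', 'downward', 'crosswise', or 'outward'."""
    source, target = urlsplit(source_url), urlsplit(target_url)
    if source.netloc.lower() != target.netloc.lower():
        return "outward"
    src, dst = _directory_parts(source.path), _directory_parts(target.path)
    if len(dst) < len(src) and src[:len(dst)] == dst:
        return "upward"    # target directory is a strict ancestor
    if len(dst) > len(src) and dst[:len(src)] == src:
        return "downward"  # target directory is a strict descendant
    return "crosswise"     # same site, neither above nor below


# Hypothetical URLs on a placeholder site.
page = "http://example.org/a/b/page.html"
print(classify_link(page, "http://example.org/a/index.html"))     # upward
print(classify_link(page, "http://example.org/a/b/c/deep.html"))  # downward
print(classify_link(page, "http://example.org/x/other.html"))     # crosswise
print(classify_link(page, "http://example.com/"))                 # outward
```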
22
A4 Application
Finding moved pages
- Exploiting hyperlinks
- Exploiting directory links
Finding related pages
- Collaborative filtering
- Given a set of similar pages the user already has, ParaSite can find the
  page (A) that has the most links to those pages and return the other pages
  referenced by A.
A person finder
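The related-pages idea above can be sketched as a simple overlap search. This is an illustrative toy and not ParaSite's implementation: the `find_related` function, the out-link table, and the page names are all hypothetical, and a real system would need a way to find pages that link to the known pages.

```python
def find_related(known_pages, outlinks):
    """Find the page A linking to the most known pages; return A and its other links.

    known_pages: pages the user already has.
    outlinks: mapping from a candidate page to the pages it links to.
    """
    known = set(known_pages)
    best_page, best_overlap = None, 0
    for page in sorted(outlinks):  # sorted for deterministic tie-breaking
        if page in known:
            continue
        overlap = len(known & set(outlinks[page]))
        if overlap > best_overlap:
            best_page, best_overlap = page, overlap
    if best_page is None:
        return None, []
    return best_page, sorted(set(outlinks[best_page]) - known)


# Hypothetical link data.
outlinks = {
    "A": ["p1", "p2", "p3", "p4", "p5"],
    "B": ["p1", "p6"],
    "C": ["p7"],
}
print(find_related(["p1", "p2", "p3"], outlinks))  # ('A', ['p4', 'p5'])
```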
23
Conclusion
Information on the World Wide Web is increasing exponentially, and the
Internet's architecture is becoming more complicated. Applying bibliometrics
to the Web will help us control and manage Web information wisely.
24
Example of Hypertext Link
Back to hypertext link