The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank

1
The Intelligent Surfer: Probabilistic Combination
of Link and Content Information in PageRank
  • Matt Richardson
  • Pedro Domingos

2
Surfer with PageRank
  • The surfer follows each outlink from a page with equal probability.
  • Can we do it more intelligently?
  • What about taking the query into account?

3
PageRank random surfer
  • β is a constant, N is the total number of nodes,
    Fi is the set of pages page i links to, Bj is the
    set of pages which link to page j.
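
With these definitions, the usual PageRank recurrence is P(j) = (1 - β)/N + β Σ_{i ∈ Bj} P(i)/|Fi|. The following is a minimal sketch of iterating it, not the authors' implementation. It assumes every page has at least one outlink and that every link target is also a key of the graph; the default β = 0.85, the iteration count, the function name and the toy graph are illustrative assumptions.

```python
def pagerank(links, beta=0.85, iterations=50):
    """Iterate P(j) = (1 - beta)/N + beta * sum over i in B_j of P(i)/|F_i|.

    links maps each page i to the list of pages it links to (F_i).
    Assumes no page has an empty outlink list.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from the uniform distribution
    for _ in range(iterations):
        # Random-jump term: every page receives (1 - beta)/N.
        new_rank = {page: (1.0 - beta) / n for page in pages}
        # Link-following term: page i splits its rank equally over F_i.
        for i, outlinks in links.items():
            for j in outlinks:
                new_rank[j] += beta * rank[i] / len(outlinks)
        rank = new_rank
    return rank


# Hypothetical three-page graph, for illustration only.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_links)
```

In this toy graph, page c is linked from both a and b, so it ends up with a higher rank than b, which is linked only from a.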

4
Intelligent surfer Model
  • It probabilistically hops from page to page,
    depending on the content of the pages and the
    query terms the surfer is looking for.

5
Most Important Formula
  • Pq(i→j): the probability that the surfer
    transitions to page j given that he is on page i
    and is searching for query q.
  • P'q(j): specifies where the surfer chooses to jump
    when not following links.
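
For reference, the Richardson and Domingos paper combines the link-following probability and the jump distribution into a query-dependent PageRank roughly as written below. Here β and B_j are as on the PageRank slide, and the jump distribution is written P'_q(j), following the paper's notation.

```latex
P_q(j) = (1 - \beta)\, P'_q(j) + \beta \sum_{i \in B_j} P_q(i)\, P_q(i \to j)
```

Setting P'_q(j) = 1/N and P_q(i → j) = 1/|F_i| recovers the ordinary random-surfer recurrence.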

6
Formula, continued
  • The surfer tends to follow those links which lead to
    pages whose content has been deemed relevant to
    the query.
  • What is Rq(j)?
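
One way to make this concrete, following the paper as I read it, is to weight both the jump and the link choice by relevance: P'q(j) = Rq(j) / Σ_k Rq(k) over all pages k, and Pq(i→j) = Rq(j) / Σ_{k ∈ Fi} Rq(k). The sketch below takes Rq as a given per-page score and is not the authors' code. The function name, the toy graph, the relevance values, the default β = 0.85, and the fallback used when none of a page's outlinks is relevant (jump according to P'q) are all assumptions made for illustration.

```python
def qd_pagerank(links, relevance, beta=0.85, iterations=50):
    """Query-dependent PageRank for one query, given relevance[j] = R_q(j).

    links maps each page i to the list of pages it links to (F_i).
    Assumes at least one page has positive relevance.
    """
    pages = sorted(links)
    total = sum(relevance[p] for p in pages)
    jump = {p: relevance[p] / total for p in pages}  # P'_q(j)
    rank = dict(jump)
    for _ in range(iterations):
        new_rank = {p: (1.0 - beta) * jump[p] for p in pages}
        for i, outlinks in links.items():
            out_total = sum(relevance[k] for k in outlinks)
            if out_total == 0:
                # Assumption of this sketch: with no relevant outlink,
                # the surfer jumps according to P'_q instead.
                for p in pages:
                    new_rank[p] += beta * rank[i] * jump[p]
                continue
            for j in outlinks:
                # P_q(i -> j) = R_q(j) / sum of R_q(k) over k in F_i
                new_rank[j] += beta * rank[i] * relevance[j] / out_total
        rank = new_rank
    return rank


# Hypothetical graph and relevance scores, for illustration only.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
uniform = qd_pagerank(toy_links, {"a": 1.0, "b": 1.0, "c": 1.0})
biased = qd_pagerank(toy_links, {"a": 0.1, "b": 0.3, "c": 0.1})
```

With uniform relevance the computation coincides with ordinary PageRank; raising the relevance of page b shifts rank toward b.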

7
Rq(j): relevance of page j to query q
  • Many choices
  • A single constant R for every page and query: the model reduces to PageRank
  • 1 if the term q appears on page j, 0 otherwise
  • Complex functions, e.g. TF-IDF
  • What is used in their experiment?
    The fraction of words equal to q in page j
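
The term-fraction relevance can be sketched in a few lines. Lowercasing and splitting on whitespace are assumptions of this sketch; the slides do not say how pages were tokenized, and the function name is hypothetical.

```python
def term_fraction(page_text, term):
    """Return the fraction of words in page_text that equal term."""
    words = page_text.lower().split()  # naive whitespace tokenization (assumption)
    if not words:
        return 0.0  # an empty page is treated as irrelevant to every term
    return words.count(term.lower()) / len(words)


# "the" accounts for 2 of the 5 words in this made-up page text.
example = term_fraction("the surfer follows the links", "the")
```

A score like this can be used directly as Rq(j) for a single-term query q.
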
8
Scalability
  • Factor S/N
  • S: the space required to store all of the
    PageRanks
  • N: the number of web pages
  • For the test data set, this factor is 165

9
Results