Accelerating the Pagerank Algorithm

Slides: 30
Provided by: mathe181
Transcript and Presenter's Notes

1
Accelerating the Pagerank Algorithm
  • M. Campbell
  • Missouri State University REU

2
The Information Retrieval Problem
3
Actually, two problems
  • Given a finite set of Documents D and a query q
  • I. Which elements of D are relevant to q?
  • II. Of the relevant documents, which are most
    relevant?

4
Exploiting the Structure of the document set
  • Scholarly papers
  • Papers cite other papers.
  • Papers which are cited the most are likely to
    be very important in their field. Additionally,
    the papers cited by important papers gain in
    relative importance.

5
What about the internet?
  • The internet has a similar structure due to
    hyperlinking.
  • Pages which are very important get linked to by
    many pages, and pages linked to by important
    pages will likely be deemed to be more important
    than others.

6
Looking abstractly at the link structure of the
web
7
The Pagerank Equation

8
The Iterative Pagerank Equation
9
Determining the Pagerank Vector by the Power
Method

This is the power method: we compute the
eigenvector of H associated with the eigenvalue
1.
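The iteration can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes H is row-normalized (row i holds the out-link probabilities of page i), the rank vector multiplies H from the left, and the small 3-page web, the function name power_method, and the tolerance are all hypothetical.

```python
def power_method(H, tol=1e-10, max_iter=1000):
    """Iterate pi <- pi * H (pi as a row vector) until the 1-norm change drops below tol."""
    n = len(H)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for k in range(1, max_iter + 1):
        new = [sum(pi[i] * H[i][j] for i in range(n)) for j in range(n)]
        diff = sum(abs(a - b) for a, b in zip(new, pi))
        pi = new
        if diff < tol:
            return pi, k
    return pi, max_iter


# Hypothetical 3-page web; every row sums to 1, so there are no dangling nodes.
H = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
]
pi, iterations = power_method(H)
```

For this toy matrix the stationary vector can be checked by hand: it is (4/9, 2/9, 3/9).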
10
Fixing the Link matrix to ensure the pagerank
vector exists
  • I. Dangling nodes


11
Fixing The link matrix
  • II. Reducibility (dangling webs)


U is a probability vector (its entries sum to one),
called the personalization vector
12
The Google matrix
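The slide's formula did not survive extraction. The sketch below shows how the two fixes are commonly combined in the PageRank literature, which may differ from the author's notation. It assumes a row-normalized link matrix H, a damping factor alpha (0.85 is a conventional value, not taken from the slides), and a uniform personalization vector by default. The function name google_matrix is hypothetical.

```python
def google_matrix(H, alpha=0.85, u=None):
    """Build G = alpha * S + (1 - alpha) * e u^T, where S is H with dangling rows replaced by u."""
    n = len(H)
    if u is None:
        u = [1.0 / n] * n  # assumption: uniform personalization vector
    G = []
    for row in H:
        # Fix I: a dangling node (all-zero row) jumps according to u.
        s_row = list(u) if sum(row) == 0 else row
        # Fix II: blending with u makes every entry positive, so G is irreducible.
        G.append([alpha * s + (1.0 - alpha) * uj for s, uj in zip(s_row, u)])
    return G


# Hypothetical 3-page web in which page 1 has no out-links.
H = [
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
G = google_matrix(H)
```

Every row of G sums to one and every entry is strictly positive, which is what guarantees a unique PageRank vector.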

13
An alternate Method

14
The linear system

Letting
It has been shown that v = rx for some scalar r
where x is the solution of the system
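The slide's equations did not survive extraction. For orientation, one standard form of the linear system from the PageRank literature is given below. The symbols are assumptions, not the author's: H is the row-normalized link matrix with zero rows for dangling nodes, alpha is the damping factor with 0 < alpha < 1, u is the personalization vector, and v is the PageRank vector.

```latex
\[
  (I - \alpha H^{T})\,x = u,
  \qquad
  v = r\,x,
  \qquad
  r = \frac{1}{\lVert x \rVert_{1}} .
\]
```

Under these assumptions the matrix I - alpha H^T is nonsingular and x has nonnegative entries, so the scaling by r simply normalizes x to sum to one.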
15
Options for solving the system
  • There are many options for solving this system. I
    focused on three.
  • I. Jacobi
  • II. Gauss-Seidel
  • III. Successive Over-Relaxation (SOR)
  • But first we study reorderings of the matrix that
    make it better suited to the solver
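The three solvers can be sketched on a small dense system. This is an illustration under stated assumptions, not the author's code: the system is taken to be A x = b with A = I - alpha * H^T and b a uniform personalization vector (a common form in the literature), the 3-page web is hypothetical, and alpha = 0.85 and omega = 1.05 are illustrative values rather than tuned ones.

```python
def residual_norm(A, b, x):
    """1-norm of b - A x."""
    n = len(b)
    return sum(abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))


def jacobi(A, b, tol=1e-10, max_iter=1000):
    """Jacobi: every component is updated from the previous iterate only."""
    n = len(b)
    x = [0.0] * n
    for k in range(1, max_iter + 1):
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
        if residual_norm(A, b, x) < tol:
            return x, k
    return x, max_iter


def sor(A, b, omega=1.0, tol=1e-10, max_iter=1000):
    """SOR sweep using the newest values in place; omega = 1.0 is Gauss-Seidel."""
    n = len(b)
    x = [0.0] * n
    for k in range(1, max_iter + 1):
        for i in range(n):
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gs_value = (b[i] - sigma) / A[i][i]
            x[i] = (1.0 - omega) * x[i] + omega * gs_value
        if residual_norm(A, b, x) < tol:
            return x, k
    return x, max_iter


# Hypothetical 3-page web; row i of H holds the out-link probabilities of page i.
alpha = 0.85
H = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [1.0, 0.0, 0.0]]
n = len(H)
# A = I - alpha * H^T and b = uniform personalization vector.
A = [[(1.0 if i == j else 0.0) - alpha * H[j][i] for j in range(n)] for i in range(n)]
b = [1.0 / n] * n

x_j, it_j = jacobi(A, b)
x_gs, it_gs = sor(A, b, omega=1.0)
x_sor, it_sor = sor(A, b, omega=1.05)
pagerank = [xi / sum(x_gs) for xi in x_gs]  # rescale the solution to sum to one
```

Jacobi needs only the previous iterate, so it parallelizes easily. Gauss-Seidel and SOR use each updated component immediately, which is why the order of the rows, and hence the reordering of the matrix, affects them.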

16
Stanford.edu web
17
stanford.edu/berkeley.edu web
18
Stanford Reordered by descending outdegree
19
SB Reordered by descending outdegree
20
Stanford Reordered by descending indegree
21
SB Reordered by descending indegree
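A degree-based reordering amounts to sorting the pages by their link counts and relabeling them. The sketch below assumes the web is given as a list of (source, destination) link pairs; the function names and the 4-page example are hypothetical.

```python
def degree_permutation(links, n, by="out"):
    """Return page indices sorted by descending out-degree or in-degree."""
    degree = [0] * n
    for src, dst in links:
        degree[src if by == "out" else dst] += 1
    # sorted() is stable, so pages with equal degree keep their original order.
    return sorted(range(n), key=lambda page: -degree[page])


def relabel(links, perm):
    """Apply the symmetric permutation: old page perm[k] becomes new page k."""
    new_index = {old: new for new, old in enumerate(perm)}
    return [(new_index[s], new_index[d]) for s, d in links]


# Hypothetical 4-page web.
links = [(0, 1), (2, 0), (2, 1), (2, 3), (3, 1)]
out_perm = degree_permutation(links, 4, by="out")  # [2, 0, 3, 1]
in_perm = degree_permutation(links, 4, by="in")    # [1, 0, 3, 2]
reordered = relabel(links, out_perm)
```

The same permutation is applied to rows and columns, so the reordered matrix describes the same web with renamed pages.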
22
Reverse Cuthill-McKee
23
The breadth-first search (BFS)
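A breadth-first ordering visits a start page, then everything it links to, then everything those pages link to, and so on. The sketch below is one way to produce such an ordering and is not taken from the slides. It assumes the web is stored as an adjacency dictionary and follows out-links only; the restart rule for unreachable pages and the function name bfs_order are assumptions.

```python
from collections import deque


def bfs_order(adj, start=0):
    """Return all nodes in breadth-first discovery order.

    adj maps each node to a list of its out-neighbours. When the queue
    empties before every node has been seen, the search restarts from the
    lowest-numbered unvisited node so disconnected pieces are still covered.
    """
    order, seen = [], set()
    for root in [start] + sorted(adj):
        if root in seen:
            continue
        seen.add(root)
        queue = deque([root])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
    return order


# Hypothetical 6-page web; page 5 cannot be reached from page 0.
adj = {0: [2, 3], 1: [4], 2: [1], 3: [], 4: [0], 5: [3]}
order = bfs_order(adj)  # [0, 2, 3, 1, 4, 5]
```

Using the returned list as a permutation places pages that are close in the link graph near each other in the matrix.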
24
BFS reorder on Stanford web
25
BFS on Stanford/Berkeley
26
The dangling node/BFS reordering
27
Solving the BFS/Dangling system
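The slide's formulas did not survive extraction. One approach from the PageRank literature, which may or may not match the author's, places the dangling pages last so that H takes the block form [[H11, H12], [0, 0]]. The system (I - alpha * H^T) x = u then splits into a smaller system (I - alpha * H11^T) x1 = u1 and a direct formula x2 = u2 + alpha * H12^T x1. The sketch below assumes a uniform u and alpha = 0.85; the function names and the 4-page example are hypothetical.

```python
def dangling_last(H):
    """Permutation with non-dangling pages first and dangling (zero-row) pages last."""
    n = len(H)
    nondangling = [i for i in range(n) if any(H[i])]
    dangling = [i for i in range(n) if not any(H[i])]
    return nondangling + dangling, len(nondangling)


def solve_block(H, alpha=0.85, tol=1e-12, max_iter=1000):
    """PageRank via the block system obtained after moving dangling pages last."""
    n = len(H)
    perm, m = dangling_last(H)
    P = [[H[perm[i]][perm[j]] for j in range(n)] for i in range(n)]  # reordered H
    u = [1.0 / n] * n  # uniform, so it is unchanged by the permutation
    # Fixed-point iteration on the small system (I - alpha * H11^T) x1 = u1.
    x1 = [0.0] * m
    for _ in range(max_iter):
        new = [u[i] + alpha * sum(P[j][i] * x1[j] for j in range(m)) for i in range(m)]
        done = sum(abs(a - b) for a, b in zip(new, x1)) < tol
        x1 = new
        if done:
            break
    # The dangling block needs no solve: x2 = u2 + alpha * H12^T x1.
    x2 = [u[i] + alpha * sum(P[j][i] * x1[j] for j in range(m)) for i in range(m, n)]
    x = x1 + x2
    total = sum(x)
    ranks = [0.0] * n
    for k, page in enumerate(perm):
        ranks[page] = x[k] / total  # undo the permutation and normalize
    return ranks


# Hypothetical 4-page web; pages 1 and 3 have no out-links.
H = [
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0 / 3.0, 1.0 / 3.0, 0.0, 1.0 / 3.0],
    [0.0, 0.0, 0.0, 0.0],
]
ranks = solve_block(H)
```

Only the non-dangling block is iterated, so the more dangling pages the web has, the smaller the system that actually has to be solved.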


28
Comparative Results
Method   Stanford                          Stanford/Berkeley
         Time (s)  Iterations  Residual    Time (s)  Iterations  Residual
Power    10.32     132         8x10^-12    28.9      134         7x10^-12
Jacobi   5.8186    146         9x10^-12    17.42     144         1x10^-11
GS       11.289    68          5x10^-11    29.441    70          3x10^-11
SOR      10.7      64          6x10^-11    29        68          4x10^-11
29
Further studies
  • I. Preconditioning
  • II. Optimal implementation of the
    Gauss-Seidel/SOR algorithm
  • III. Markov chain updating problem with
    linear solving
  • IV. Using the Kendall tau measure as a
    convergence criterion
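For item IV, the idea is to stop iterating once the ordering of the pages stops changing, rather than waiting for the numerical residual to become small. The sketch below is a simple quadratic-time version of the Kendall tau measure (the tau-a variant, which ignores ties) and is an illustration only; the two score vectors are made up.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall tau-a between two score vectors: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1  # both vectors rank pages i and j in the same order
        elif sign < 0:
            discordant += 1  # the two vectors disagree on the order of i and j
    pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical PageRank iterates from two successive steps.
previous = [0.40, 0.30, 0.20, 0.10]
current = [0.38, 0.31, 0.12, 0.19]
tau = kendall_tau(previous, current)  # 5 concordant pairs, 1 discordant: 4/6
```

A value of 1 means the two iterates rank every pair of pages identically, so a stopping rule could require tau to reach 1, or stay above a chosen threshold, for several consecutive iterations.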