Title: Pregel: A System for Large-Scale Graph Processing
1. Pregel: A System for Large-Scale Graph Processing
- Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski - Google, Inc.
- SIGMOD '10
- 15 Mar 2013
- Dong Chang
 
2. Outline
- Introduction
- Computation Model
- Writing a Pregel Program
- System Implementation
- Experiments
- Conclusion & Future Work
 
3. Outline
- Introduction
- Computation Model
- Writing a Pregel Program
- System Implementation
- Experiments
- Conclusion & Future Work
 
4. Introduction (1/2)

5. Introduction (2/2)
- Many practical computing problems concern large graphs
- MapReduce is ill-suited for graph processing
  - Many iterations are needed for parallel graph processing
  - Materialization of intermediate results at every MapReduce iteration harms performance
- Large graph data: the Web graph, transportation routes, citation relationships, social networks
- Graph algorithms: PageRank, shortest paths, connected components, clustering techniques
6. MapReduce Execution
- Map invocations are distributed across multiple machines by automatically partitioning the input data into a set of M splits
- The input splits can be processed in parallel by different machines
- Reduce invocations are distributed by partitioning the intermediate key space into R pieces using a hash function: hash(key) mod R (see the sketch after this list)
- R and the partitioning function are specified by the programmer
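A minimal sketch of that partitioning rule, assuming string keys and std::hash as the hash function (the real partitioning function is chosen by the programmer):

```cpp
#include <functional>
#include <string>

// Route an intermediate key to one of R reduce partitions:
// reducer index = hash(key) mod R.
int ReducerFor(const std::string& key, int R) {
  return static_cast<int>(std::hash<std::string>{}(key) % R);
}
```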
7. MapReduce Execution
8. Data Flow
- Input and final output are stored on a distributed file system
  - The scheduler tries to schedule map tasks close to the physical storage location of the input data
- Intermediate results are stored on the local file system of the map and reduce workers
- Output can be the input to another MapReduce task
 
9. MapReduce Execution

10. MapReduce Parallel Execution
11. Outline
- Introduction
- Computation Model
- Writing a Pregel Program
- System Implementation
- Experiments
- Conclusion & Future Work
 
12. Computation Model (1/3)
13. Computation Model (2/3)
- "Think like a vertex"
- Inspired by Valiant's Bulk Synchronous Parallel model (1990)
- Source: http://en.wikipedia.org/wiki/Bulk_synchronous_parallel
14. Computation Model (3/3)
- Superstep: the vertices compute in parallel
- Each vertex:
  - Receives messages sent in the previous superstep
  - Executes the same user-defined function
  - Modifies its value or that of its outgoing edges
  - Sends messages to other vertices (to be received in the next superstep)
  - Mutates the topology of the graph
  - Votes to halt if it has no further work to do (see the sketch after this list)
- Termination condition:
  - All vertices are simultaneously inactive
  - There are no messages in transit
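A minimal sketch of the halt/reactivation rule implied by these bullets; VertexState and RunsThisSuperstep are illustrative names, not Pregel's actual implementation:

```cpp
#include <vector>

// Illustrative only: per-vertex state a worker could keep between supersteps.
struct Message { int value; };

struct VertexState {
  bool active = true;           // becomes false when the vertex votes to halt
  std::vector<Message> inbox;   // messages sent to it in the previous superstep
};

// A vertex runs Compute() in a superstep if it is still active, or if it
// received a message, which reactivates a halted vertex.
bool RunsThisSuperstep(VertexState& v) {
  if (!v.inbox.empty()) v.active = true;
  return v.active;
}
```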
 
15. An Example
16. Example: SSSP (Parallel BFS) in Pregel
17. Example: SSSP (Parallel BFS) in Pregel
[Figure: initial superstep - every vertex distance is still unknown ("?"); edge weights such as 10 and 5 are shown on the graph]
18. Example: SSSP (Parallel BFS) in Pregel
19. Example: SSSP (Parallel BFS) in Pregel
[Figure: a later superstep - tentative distances 11, 14, 8, 12, and 7 have been assigned]
20. Example: SSSP (Parallel BFS) in Pregel
21. Example: SSSP (Parallel BFS) in Pregel
[Figure: a later superstep - tentative distances 9, 13, 14, and 15 have been assigned]
22. Example: SSSP (Parallel BFS) in Pregel
23. Example: SSSP (Parallel BFS) in Pregel
[Figure: a later superstep - one distance is improved to 13]
24. Example: SSSP (Parallel BFS) in Pregel
25. Differences from MapReduce
- Graph algorithms can be written as a series of chained MapReduce invocations
- Pregel
  - Keeps vertices & edges on the machine that performs computation
  - Uses network transfers only for messages
- MapReduce
  - Passes the entire state of the graph from one stage to the next
  - Needs to coordinate the steps of a chained MapReduce
26. Outline
- Introduction
- Computation Model
- Writing a Pregel Program
- System Implementation
- Experiments
- Conclusion & Future Work
 
27. C++ API
- Writing a Pregel program
  - Subclass the predefined Vertex class
  - Override the virtual Compute() method ("Override this!")
  - Incoming messages ("in msgs") are read from the message iterator passed to Compute()
  - Outgoing messages ("out msg") are sent with SendMessageTo()
- A sketch of the Vertex class follows this list
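An abridged sketch of that Vertex base class, following the API figure in the paper; the using-declarations and forward declarations are added here only so the fragment is self-contained, and exact signatures may differ from Google's code:

```cpp
#include <cstdint>
#include <string>
using std::string;
using int64 = std::int64_t;

class MessageIterator;   // framework-provided iterator over incoming messages
class OutEdgeIterator;   // framework-provided iterator over outgoing edges

template <typename VertexValue, typename EdgeValue, typename MessageValue>
class Vertex {
 public:
  virtual ~Vertex() = default;

  // User code overrides this; called once per active vertex per superstep.
  virtual void Compute(MessageIterator* msgs) = 0;

  const string& vertex_id() const;       // this vertex's identifier
  int64 superstep() const;               // current superstep number

  const VertexValue& GetValue();         // read the vertex value
  VertexValue* MutableValue();           // modify the vertex value
  OutEdgeIterator GetOutEdgeIterator();  // walk the outgoing edges

  // Deliver a message to dest_vertex at the start of the next superstep.
  void SendMessageTo(const string& dest_vertex, const MessageValue& message);

  void VoteToHalt();                     // go inactive until a message arrives
};
```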
28. Example Vertex Class for SSSP
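The code on this slide is essentially the paper's single-source shortest paths vertex; the version below assumes the Vertex API sketched above, plus user-defined helpers IsSource() and INF:

```cpp
#include <algorithm>

// Shortest-paths vertex: value, edge weight, and message type are all int.
// IsSource() and INF are assumed to be defined by the user program.
class ShortestPathVertex : public Vertex<int, int, int> {
 public:
  void Compute(MessageIterator* msgs) override {
    // Start from 0 at the source, infinity elsewhere, then take the
    // minimum over all tentative distances received this superstep.
    int mindist = IsSource(vertex_id()) ? 0 : INF;
    for (; !msgs->Done(); msgs->Next())
      mindist = std::min(mindist, msgs->Value());

    if (mindist < GetValue()) {
      *MutableValue() = mindist;
      // Relax every outgoing edge: send (my distance + edge weight).
      OutEdgeIterator iter = GetOutEdgeIterator();
      for (; !iter.Done(); iter.Next())
        SendMessageTo(iter.Target(), mindist + iter.GetValue());
    }
    VoteToHalt();  // sleep until a shorter distance arrives
  }
};
```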
29. Outline
- Introduction
- Computation Model
- Writing a Pregel Program
- System Implementation
- Experiments
- Conclusion & Future Work
 
30. MapReduce Coordination
- Master data structures
  - Task status: idle, in-progress, or completed (see the sketch after this list)
- Idle tasks get scheduled as workers become available
- When a map task completes, it sends the master the locations and sizes of its R intermediate files, one for each reducer
- Master pushes this info to the reducers
- Master pings workers periodically to detect failures
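A minimal sketch of the per-task bookkeeping described above; the names are illustrative, not the actual MapReduce implementation:

```cpp
#include <string>
#include <vector>

// Illustrative only: state the master keeps for each map task.
enum class TaskStatus { kIdle, kInProgress, kCompleted };

struct IntermediateFile {
  std::string location;   // where the map worker wrote the file
  long size_bytes;
};

struct MapTask {
  TaskStatus status = TaskStatus::kIdle;
  std::string worker;                               // machine running the task
  std::vector<IntermediateFile> files_for_reducer;  // R entries on completion,
                                                    // pushed to the reducers
};
```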
31. MapReduce Failures
- Map worker failure
  - Map tasks completed or in progress at the worker are reset to idle
  - Reduce workers are notified when a task is rescheduled on another worker
- Reduce worker failure
  - Only in-progress tasks are reset to idle
- Master failure
  - The MapReduce task is aborted and the client is notified
 
32. System Architecture
- The Pregel system also uses the master/worker model
- Master
  - Maintains the workers
  - Recovers from worker faults
  - Provides a Web-UI monitoring tool for job progress
- Worker
  - Processes its task
  - Communicates with the other workers
- Persistent data is stored as files on a distributed storage system (such as GFS or Bigtable)
- Temporary data is stored on local disk
 
33. Execution of a Pregel Program
- Many copies of the program begin executing on a cluster of machines
- The master assigns a partition of the input to each worker
- Each worker loads its vertices and marks them as active
- The master instructs each worker to perform a superstep
  - Each worker loops through its active vertices and computes for each vertex
  - Messages are sent asynchronously, but are delivered before the end of the superstep
- This step is repeated as long as any vertices are active or any messages are in transit (see the control-flow sketch after this list)
- After the computation halts, the master may instruct each worker to save its portion of the graph
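A control-flow sketch of this sequence, seen from the master; Master and its methods are invented names for this illustration, not the actual Pregel interfaces:

```cpp
// Illustrative only: the overall job flow described on this slide.
class Master {
 public:
  virtual ~Master() = default;
  virtual void AssignPartitionsToWorkers() = 0;  // one input partition per worker
  virtual void LoadGraph() = 0;                  // workers load vertices, mark them active
  virtual void RunSuperstep() = 0;               // workers call Compute() on active vertices
  virtual long ActiveVertexCount() = 0;
  virtual long MessagesInTransit() = 0;
  virtual void SaveOutput() = 0;                 // each worker saves its portion of the graph
};

void RunPregelJob(Master& master) {
  master.AssignPartitionsToWorkers();
  master.LoadGraph();
  // Repeat supersteps while any vertex is active or any message is in transit.
  while (master.ActiveVertexCount() > 0 || master.MessagesInTransit() > 0) {
    master.RunSuperstep();
  }
  master.SaveOutput();
}
```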
34. Fault Tolerance
- Checkpointing (see the sketch after this list)
  - The master periodically instructs the workers to save the state of their partitions to persistent storage
  - e.g., vertex values, edge values, incoming messages
- Failure detection
  - Using regular "ping" messages
- Recovery
  - The master reassigns graph partitions to the currently available workers
  - The workers all reload their partition state from the most recent available checkpoint
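A minimal sketch of the checkpointing decision, assuming a fixed checkpoint interval chosen by the master; Worker and SavePartitionState() are hypothetical names standing in for "write vertex values, edge values, and incoming messages to persistent storage":

```cpp
// Illustrative only: checkpoint partition state every N supersteps.
class Worker {
 public:
  virtual ~Worker() = default;
  virtual void SavePartitionState(int superstep) = 0;  // hypothetical helper
};

void MaybeCheckpoint(Worker& worker, int superstep, int checkpoint_interval) {
  if (checkpoint_interval > 0 && superstep % checkpoint_interval == 0) {
    worker.SavePartitionState(superstep);
  }
}
```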
35. Outline
- Introduction
- Computation Model
- Writing a Pregel Program
- System Implementation
- Experiments
- Conclusion & Future Work
 
36. Experiments
- Environment
  - H/W: a cluster of 300 multicore commodity PCs
  - Data: binary trees and log-normal random graphs (general graphs)
- Naïve SSSP implementation
  - The weight of all edges = 1
  - No checkpointing
 
37. Experiments
- SSSP: 1-billion-vertex binary tree, varying the number of worker tasks
38. Experiments
- SSSP: binary trees of varying sizes on 800 worker tasks
39. Experiments
- SSSP: random graphs of varying sizes on 800 worker tasks
40. Outline
- Introduction
- Computation Model
- Writing a Pregel Program
- System Implementation
- Experiments
- Conclusion & Future Work
 
41. Conclusion & Future Work
- Pregel is a scalable and fault-tolerant platform with an API that is sufficiently flexible to express arbitrary graph algorithms
- Future work
  - Relaxing the synchronicity of the model
    - Not waiting for slower workers at inter-superstep barriers
  - Assigning vertices to machines to minimize inter-machine communication
  - Handling dense graphs in which most vertices send messages to most other vertices
42. Thank You!