Title: A HighPerformance Interactive Tool for Exploring Large Graphs
1A High-Performance Interactive Tool for Exploring
Large Graphs
John R. Gilbert University of California, Santa
Barbara Aydin Buluc Viral Shah (UCSB) Brad
McRae (NCEAS) Steve Reinhardt (Interactive
Supercomputing) with thanks to Alan Edelman (MIT
ISC) and Jeremy Kepner (MIT-LL)
Support DOE Office of Science, NSF, DARPA, SGI,
ISC
23D Spectral Coordinates
32D Histogram RMAT Graph
4Strongly Connected Components
5Social Network Analysis in Matlab 1993
Co-author graph from 1993 Householdersymposium
6Social Network Analysis in Matlab 1993
Sparse Adjacency Matrix
- Which author hasthe most collaborators?
- gtgtcount,author max(sum(A))
- count 32
- author 1
- gtgtname(author,)
- ans Golub
7Social Network Analysis in Matlab 1993
- Have Gene Golub and Cleve Moler ever been
coauthors? - gtgt A(Golub,Moler)
- ans 0
- No.
- But how many coauthors do they have in common?
- gtgt AA A2
- gtgt AA(Golub,Moler)
- ans 2
- And who are those common coauthors?
- gtgt name( find ( A(,Golub) . A(,Moler) ), )
- ans
- Wilkinson
- VanLoan
8 Outline
- Infrastructure Array-based sparse graph
computation - An application Computational ecology
- Some nuts and bolts Sparse matrix multiplication
9Combinatorial Scientific Computing
- Emerging large scale, high-performance
applications -
- Web search and information retrieval
- Knowledge discovery
- Computational biology
- Dynamical systems
- Machine learning
- Bioinformatics
- Sparse matrix methods
- Geometric modeling
- . . .
- How will combinatorial methods be used by
nonexperts?
10Analogy Matrix Division in Matlab
- x A \ b
- Works for either full or sparse A
- Is A square?
- no gt use QR to solve least squares problem
- Is A triangular or permuted triangular?
- yes gt sparse triangular solve
- Is A symmetric with positive diagonal elements?
- yes gt attempt Cholesky after symmetric minimum
degree - Otherwise
- gt use LU on A(, colamd(A))
11MatlabP
- A rand(4000p, 4000p)
- x randn(4000p, 1)
- y zeros(size(x))
- while norm(x-y) / norm(x) gt 1e-11
- y x
- x Ax
- x x / norm(x)
- end
12Star-P Architecture
Star-P
MATLAB
processor 0
client manager
processor 1
package manager
processor 2
dense/sparse
sort
processor 3
ScaLAPACK
FFTW
Ordinary Matlab variables
. . .
FPGA interface
MPI user code
UPC user code
processor n-1
server manager
Distributed matrices
matrix manager
13Distributed Sparse Array Structure
31
1
41
P0
53
59
26
3
2
P1
Each processor stores local vertices edges in
a compressed row structure. Has been scaled to
gt108 vertices, gt109 edges in interactive session.
P2
Pn
14The sparse( ) Constructor
- A sparse (I, J, V, nr, nc)
- Input ddense vectors I, J, V, dimensions nr,
nc - Output A(I(k), J(k)) V(k)
- Sum values with duplicate indices
- Sorts triples lt i, j, v gt by lt i, j gt
- Inverse I, J, V find(A)
15Sparse Array and Matrix Operations
- dsparse layout, same semantics as ordinary full
sparse - Matrix arithmetic , max, sum, etc.
- matrix matrix and matrix vector
- Matrix indexing and concatenation
- A (13, 4 5 2) B(, J) C
- Linear solvers x A \ b using SuperLU (MPI)
- Eigensolvers V, D eigs(A) using PARPACK
(MPI)
16 Large-Scale Graph Algorithms
- Graph theory, algorithms, and data structures are
ubiquitous in sparse matrix computation. - Time to turn the relationship around!
- Represent a graph as a sparse adjacency matrix.
- A sparse matrix language is a good start on
primitives for computing with graphs. - Leverage the mature techniques and tools of
high-performance numerical computation.
17Sparse Adjacency Matrix and Graph
?
AT
x
ATx
- Adjacency matrix sparse array w/ nonzeros for
graph edges - Storage-efficient implementation from sparse data
structures
18Breadth-First Search Sparse mat vec
?
AT
x
ATx
- Multiply by adjacency matrix ? step to neighbor
vertices - Work-efficient implementation from sparse data
structures
19Breadth-First Search Sparse mat vec
?
AT
x
ATx
- Multiply by adjacency matrix ? step to neighbor
vertices - Work-efficient implementation from sparse data
structures
20Breadth-First Search Sparse mat vec
?
AT
x
ATx
- Multiply by adjacency matrix ? step to neighbor
vertices - Work-efficient implementation from sparse data
structures
21SSCA2 Graph Analysis Benchmark(spec version
1)
Fine-grained, irregular data access Searching and
clustering
- Many tight clusters, loosely interconnected
- Input data is edge triples lt i, j, label(i,j) gt
- Vertices and edges permuted randomly
22Clustering by Breadth-First Search
- Grow local clusters from many seeds in parallel
- Breadth-first search by sparse matrix matrix
- Cluster vertices connected by many short paths
- Grow each seed to vertices
- reached by at least k
- paths of length 1 or 2
- C sparse(seeds, 1ns, 1, n, ns)
- C A C
- C C A C
- C C gt k
23Toolbox for Graph Analysis and Pattern
Discovery
- Layer 1 Graph Theoretic Tools
- Graph operations
- Global structure of graphs
- Graph partitioning and clustering
- Graph generators
- Visualization and graphics
- Scan and combining operations
- Utilities
24Typical Application Stack
Computational ecology, CFD, data exploration
Applications
CG, BiCGStab, etc. combinatorial
preconditioners (AMG, Vaidya)
Preconditioned Iterative Methods
Graph querying manipulation, connectivity,
spanning trees, geometric partitioning, nested
dissection, NNMF, . . .
Graph Analysis PD Toolbox
Arithmetic, matrix multiplication, indexing,
solvers (\, eigs)
Distributed Sparse Matrices
25Landscape Connnectivity Modeling
- Landscape type and features facilitate or impede
movement of members of a species - Different species have different criteria,
scales, etc. - Habitat quality, gene flow, population stability
- Corridor identification, conservation planning
26Pumas in Southern California
Habitat quality model
27Predicting Gene Flow with Resistive Networks
28Early Experience with Real Genetic Data
- Good results with wolverines, mahogany, pumas
- Matlab implementation
- Needed
- Finer resolution
- Larger landscapes
- Faster interaction
5km resolution(too coarse)
29Combinatorics in Circuitscape
- Initial grid models connections to 4 or 8
neighbors. - Partition landscape into connected components
with GAPDT - Graph contraction from GAPDT contracts habitats
into single nodes in resistive network. (Need
current flow between entire habitats.) - Data-parallel computation on large graphs - graph
construction, querying and manipulation. - Ideally, model landscape at 100m resolution (for
pumas). Tradeoff between resolution and time.
30Numerics in Circuitscape
- Resistance computations for pairs of habitats in
the landscape - Direct methods are too slow for largest problems
- Use iterative solvers via Star-P
- Hypre (PCGAMG)
- Experimenting with support graph preconditioners
31Parallel Circuitscape Results
- Pumas in southern California
- 12 million nodes
- Under 1 hour (16 processors)
- Original code took 3 days at coarser resolution
- Targeting much larger problems
- Yellowstone-to-Yukon corridor
Figures courtesy of Brad McRae, NCEAS
32Sparse Matrix times Sparse Matrix
- A primitive in many array-based graph algorithms
- Parallel breadth-first search
- Shortest paths
- Graph contraction
- Subgraph / submatrix indexing
- Etc.
- Graphs are often not mesh-like, i.e. geometric
locality and good separators. - Often do not want to optimize for one repeated
operation, as in matvec for iterative methods
33Sparse Matrix times Sparse Matrix
- Current work
- Parallel algorithms with 2D data layout
- Sequential hypersparse algorithms
- Matrices over semirings
34ParSpGEMM
B(K,J)
J
K
K
C(I,J)
I
A(I,K)
- Based on SUMMA
- Simple for non-square matrices, etc.
C(I,J) A(I,K)B(K,J)
35 How Sparse? HyperSparse !
nnz(j)
nnz(j) c
blocks
- Any local data structure that depends on local
submatrix dimension n (such as CSR or CSC)
is too wasteful.
36SparseDComp Data Structure
- Doubly compressed data structure
- Maintains both DCSC and DCSR
- C AB needs only A.DCSC and B.DCSR
- 4nnz values communicated for AB in the worst
case (though we usually get away with much less)
37Sequential Operation Counts
Required non- zero operations (flops)
- Matlab O(nnnz(B)f)
- SpGEMM O(nzc(A)nzr(B)flogk)
Number of columns of A containing at least one
non-zero
Break-even point
38Parallel Timings
time vs n/nnz, log-log plot
- 16-processor Opteron, hypertransport, 64
GB memory - R-MAT R-MAT
- n 220
- nnz 8, 4, 2, 1, .5 220
39Matrices over Semirings
- Matrix multiplication C AB (or
matrix/vector) -
- Ci,j Ai,1?B1,j Ai,2?B2,j Ai,n?Bn,j
- Replace scalar operations ? and by
- ? associative, distributes over ?, identity 1
- ? associative, commutative, identity 0
annihilates under ? - Then Ci,j Ai,1?B1,j ? Ai,2?B2,j ? ?
Ai,n?Bn,j - Examples (?,) (and,or) (,min) . . .
- Same data reference pattern and control flow
40Remarks
- Tools for combinatorial methods built on parallel
sparse matrix infrastructure - Easy-to-use interactive programming environment
- Rapid prototyping tool for algorithm development
- Interactive exploration and visualization of data
- Sparse matrix sparse matrix is a key primitive
- Matrices over semirings like (min,) as well as
(,)