Title: Star-P and the Knowledge Discovery Suite
Steve Reinhardt, spr@InteractiveSupercomputing.com
Viral Shah, vshah@InteractiveSupercomputing.com
John Gilbert, gilbert@cs.ucsb.edu
Stefan Karpinski, sgk@cs.ucsb.edu
2 Context for Knowledge Discovery
3 Goal for Large-scale Knowledge Discovery via Star-P/KDS
- Enable domain expert to explore big unstructured data interactively
  - Domain expert: scientist or analyst, not math or graph expert
  - Explore: human-guided characterization, from simple statistics to complex clustering or factoring, even when the best algorithm is not known
  - Big: 20 GB of data commonplace, >1 TB largest
  - Unstructured data: e.g., arising from metabolic networks, climate change, social interactions, and Internet traffic
    - Allow analyst to discern structure in the data via exploration
    - KDT implements many key algorithms; extensible for other algorithms
  - Interactively: common queries take O(10 seconds) on 30-128P Altix
    - Depends on algorithms being general-purpose, reusable, and usable by non-experts
4 Star-P Basics
MATLAB is a registered trademark of The MathWorks, Inc. ISC's products are not sponsored or endorsed by The MathWorks, Inc. or by any other trademark owner referred to in this document.
5 A Comprehensive (?) Picture of Knowledge Discovery
Legend: "SVD" style = implemented by ISC; "Hadoop" style = implemented by others
- Input: HDF5, OPeNDAP, Hadoop, live data feeds
- Visualization: Renoir, In-Spire, ...
- Dimensionality reduction / factorization: SVD, eigen, NMF, PCA
- Clustering: K-means, ...
- Classification: SVM, Bayesian, HMMs, ...
- Etc.
- Graph primitives (BFS, MIS, conncomp, ...)
- Linear algebra (spmatvec, ...)
- Solvers (MUMPS, SuperLU, ...)
- Utility (sort, indexing)
- Data structures (sparse matrices, ...)
- Parallel constructs
6 Knowledge Discovery Suite
- Simple data analysis operations at very large scale
  - Sorting, set operations, statistical operations
- Graph operations on very large graphs
  - Simple queries, breadth-first search, connected components, independent sets
- Visualization with desktop tools
  - Distributed image generation for large graphs and datasets
- Clustering and decomposition
  - K-means clustering, non-negative matrix factorization, principal component analysis
- Bayesian network modeling
  - Expectation maximization (Baum-Welch), hidden Markov models
- What would you like to see?
7 Ways to Work Together
- You use Star-P infrastructure for parallel algorithm development/deployment
- You develop serial algorithms, we develop parallel versions
- We jointly develop parallel algorithms
- We develop or integrate missing functionality you need
- We target joint client with integrated package
8 A Brief Demo
9 Sample Kernels, Algorithms, and Workflows
10 Simplest KDS operation: Parallel Sorting
3 6 8 1 5 4 7 2 9
1 2 3 4 5 6 7 8 9
- Simple, widely used combinatorial primitive
- [W, perm] = sort (V)
- Used in many sparse matrix and array algorithms: sparse(), indexing, concatenation, transpose, reshape, repmat, etc.
- Communication efficient
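As a small illustration of the two-output sort on this slide, here is a serial Python sketch (not Star-P code) that returns both the sorted values and the permutation that produced them. The function name `sort_with_perm` is hypothetical, and indices are 0-based here, whereas MATLAB-style indices are 1-based.

```python
def sort_with_perm(values):
    """Return (sorted_values, perm) with sorted_values[i] == values[perm[i]]."""
    # Sort the index range by the value each index points at.
    perm = sorted(range(len(values)), key=lambda i: values[i])
    return [values[i] for i in perm], perm


V = [3, 6, 8, 1, 5, 4, 7, 2, 9]
W, perm = sort_with_perm(V)

# The permutation can reorder a companion array the same way, which is
# how sort typically gets reused inside indexing-style operations.
labels = ["c", "f", "h", "a", "e", "d", "g", "b", "i"]
labels_sorted = [labels[i] for i in perm]
```

Keeping the permutation is what makes sort reusable as a building block: any array aligned with `V` can be brought into the same order without sorting again.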
11 Sorting performance
12 A complex workflow (SSCA#2, kernel 3)
    function subgraphs = kernel3 (G, pathlen, starts)
    % KERNEL3 : SSCA#2 Kernel 3 -- Graph Extraction

    starts = starts(:,2);
    nstarts = length(starts);
    A = grsparse (G);
    nv = nverts (G);

    % Use sparse matrix multiplication to do several BFS searches at once.
    s = sparse (starts, 1:nstarts, 1, nv, nstarts);
    for k = 1:pathlen
        s = A * s;      % Ideally reach should support this. Not yet.
        s = (s ~= 0);
    end

    for i = 1:nstarts
        x = s(:,i);
        vtxmap = find(x);
        S.graph = subgraph (G, vtxmap);
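The idea in this kernel, advancing several searches at once by multiplying the adjacency matrix into a matrix with one column per start vertex, can be sketched in plain Python. This is an illustrative serial analogue, not the Star-P implementation: each column of `s` is modelled as a set of vertex ids, `adj[v]` is assumed to hold the vertices reachable from `v` in one step, and the names `multi_source_step` and `walk_reach` are hypothetical.

```python
def multi_source_step(adj, columns):
    """Boolean analogue of s = (A * s ~= 0): every column becomes the set
    of vertices adjacent to any vertex currently in that column."""
    return [set().union(*(adj[v] for v in col)) for col in columns]


def walk_reach(adj, starts, pathlen):
    """Advance all start vertices together for pathlen steps."""
    columns = [{s} for s in starts]          # one column per start vertex
    for _ in range(pathlen):
        columns = multi_source_step(adj, columns)
    return [sorted(col) for col in columns]  # like find(x) per column


# Small directed example: 4 -> 0 -> 1 -> 2 -> 3
adj = {0: {1}, 1: {2}, 2: {3}, 3: set(), 4: {0}}
reach = walk_reach(adj, [0, 4], 2)
```

Like the loop on the slide, this sketch replaces the frontier at every step, so a column only keeps earlier vertices if the graph has self-loops; accumulating a union across steps would give everything within `pathlen` hops instead.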
13 A complex workflow (SSCA#2, kernel 4)
    function leader = kernel4f (G)
    % KERNEL4F : SSCA#2 Kernel 4 -- Graph Clustering

    % Find a Maximal Independent Set in G
    [IS, misrounds] = mis (G);
    fprintf ('MIS rounds: %d. MIS nodes: %d\n', misrounds, length(IS));

    % Find neighbours of each node from the IS
    neighFromIS = G * sparse(IS, IS, 1, n, n);

    % Pick one of the neighbouring IS nodes as a leader
    [ign, leader] = max (neighFromIS, [], 2);

    % Collect votes from neighbours
    [I, J] = find (G);
    S = sparse (I, leader(J), 1, n, n);

    % Pick the most popular leader among neighbours and join that cluster
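The clustering steps in this kernel (find a maximal independent set, give every vertex a tentative leader from that set, then let neighbours vote) can be sketched serially in Python. This is only an illustration under stated assumptions: a sequential greedy pass stands in for the parallel `mis` routine, the graph is undirected, each vertex counts as its own neighbour, and ties are broken toward the smaller vertex id. The function names are hypothetical.

```python
from collections import Counter


def greedy_mis(adj):
    """Sequential greedy maximal independent set (stand-in for mis())."""
    chosen, blocked = [], set()
    for v in sorted(adj):
        if v not in blocked:
            chosen.append(v)
            blocked.add(v)
            blocked.update(adj[v])   # neighbours can no longer be chosen
    return chosen


def cluster_by_leader(adj):
    """Assign every vertex to a cluster named after an independent-set vertex."""
    mis = set(greedy_mis(adj))
    # Tentative leader: an IS vertex among v and its neighbours.
    # Maximality guarantees that at least one exists.
    leader = {v: min((adj[v] | {v}) & mis) for v in adj}
    final = {}
    for v in adj:
        # Collect the tentative leaders of v's neighbourhood as votes.
        votes = Counter(leader[u] for u in adj[v] | {v})
        # Most popular leader wins; ties go to the smaller id.
        final[v] = min(votes, key=lambda c: (-votes[c], c))
    return final


# Path graph 0 - 1 - 2 - 3 - 4 - 5
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
clusters = cluster_by_leader(adj)
```

On the path graph the greedy set is `[0, 2, 4]`, and the vote step groups the vertices into the pairs led by 0, 2, and 4.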
14 Scaling Performance: cSSCA#2 on 128P
scale  vertices (= 2^scale)  edges (= 10 x vertices)  graph size (bytes)  time (seconds)
22     4.194E06              4.194E07                 7.550E08            122.51
24     1.678E07              1.678E08                 3.020E09            402.31
26     6.711E07              6.711E08                 1.208E10            1237.1
- Timings scale well for large graphs
  - 2x problem size → 2x time
  - 2x problem size and 2x processors → same time
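The scaling claim can be checked against the table with a few lines of arithmetic. The Python sketch below copies the edge counts and timings from this slide and compares how much the problem size and the run time grow between consecutive rows; the helper name `growth` is hypothetical.

```python
# (scale, edges, seconds) copied from the cSSCA2 table on this slide.
rows = [
    (22, 4.194e07, 122.51),
    (24, 1.678e08, 402.31),
    (26, 6.711e08, 1237.1),
]


def growth(rows):
    """Return (scale_from, scale_to, size_ratio, time_ratio) per step."""
    out = []
    for (s0, e0, t0), (s1, e1, t1) in zip(rows, rows[1:]):
        out.append((s0, s1, e1 / e0, t1 / t0))
    return out


ratios = growth(rows)
for s0, s1, size_ratio, time_ratio in ratios:
    print(f"scale {s0} -> {s1}: {size_ratio:.2f}x edges, {time_ratio:.2f}x time")
```

Each step of two in scale multiplies the edge count by about 4, while the run time grows by roughly 3.1x to 3.3x, so on these three data points time grows no faster than problem size.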
15 Distributed visualization
16 App1: Computational Ecology
- Modeling dispersal of species within a habitat (to maximize range)
- Large geographic areas, linked with GIS data
- Blend of numerical and combinatorial algorithms
Brad McRae and Paul Beier, "Circuit theory predicts gene flow in plant and animal populations," PNAS, Vol. 104, no. 50, December 11, 2007
17 Results
- Solution time reduced from 3 days (desktop) to 5 minutes (14p) for typical problems
- Aiming for much larger problems
  - Yellowstone-to-Yukon (Y2Y)
18 App2: Factoring network flow behavior
Karpinski, Almeroth, Belding
19 Algorithmic exploration
- Many NMF variants exist in the literature
  - Not clear how useful on large data
  - Not clear how to calibrate (i.e., number of iterations to converge)
- NMF algorithms combine linear algebra and optimization methods
- Basic and improved NMF factorization algorithms implemented
  - euclidean (Lee & Seung 2000)
  - K-L divergence (Lee & Seung 2000)
  - semi-nonnegative (Ding et al. 2006)
  - left/right-orthogonal (Ding et al. 2006)
  - bi-orthogonal tri-factorization (Ding et al. 2006)
  - sparse euclidean (Hoyer et al. 2002)
  - sparse divergence (Liu et al. 2003)
  - non-smooth (Pascual-Montano et al. 2006)
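To make the first variant in this list concrete, here is a minimal pure-Python sketch of the Euclidean multiplicative-update rule from Lee & Seung. It is a serial illustration, not the KDS implementation: the helper names, the random initialisation, the fixed iteration count, and the small `eps` guard in the denominator are all assumptions of the sketch. It records the squared error after every iteration, which is one simple way to look at the calibration question (how many iterations are enough).

```python
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def sq_error(V, W, H):
    """Squared Frobenius norm of V - W*H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def scale(X, num, den, eps):
    """Element-wise X * num / (den + eps)."""
    return [[x * a / (b + eps) for x, a, b in zip(rx, ra, rb)]
            for rx, ra, rb in zip(X, num, den)]


def nmf_euclidean(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor nonnegative V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in V]
    H = [[rng.random() + 0.1 for _ in V[0]] for _ in range(rank)]
    errors = []
    for _ in range(iters):
        Wt = transpose(W)
        # H <- H .* (W'V) ./ (W'WH)
        H = scale(H, matmul(Wt, V), matmul(matmul(Wt, W), H), eps)
        Ht = transpose(H)
        # W <- W .* (VH') ./ (WHH')
        W = scale(W, matmul(V, Ht), matmul(W, matmul(H, Ht)), eps)
        errors.append(sq_error(V, W, H))
    return W, H, errors


# Small nonnegative matrix whose rows are combinations of two patterns.
V = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [3, 2, 5]]
W, H, errors = nmf_euclidean(V, rank=2)
```

Because the updates only multiply by nonnegative ratios, `W` and `H` stay nonnegative, and the recorded error should not increase from one iteration to the next; plotting `errors` is a quick way to judge when further iterations stop paying off.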
20 NMF traffic analysis results
- NMF identifies essential components of the traffic
- Analyst labels different types of external behavior
21 Future Directions
- What should KDS contain?
  - More algorithms?
  - Other classes of algorithms?
  - Visualization?
  - Easy use of hardware accelerators (GPU, Cell)?