The PSciCo Project: Parallel Scientific Computing in ML presentation

About This Presentation

Transcript and Presenter's Notes

Title: The PSciCo Project: Parallel Scientific Computing in ML

1
The PSciCo Project Parallel Scientific Computing
in ML

Robert Harper
Carnegie Mellon University
August, 2000

2
Acknowledgements

This talk represents collaboration with Guy
Blelloch (CS), Gary Miller (CS), and Noel
Walkington (Math).
Much of the work is done by our students Umut
Acar, Hal Burch, Franklin Chen, Perry Cheng,
David Garmire, Aleks Nanevski, Leaf Petersen,
Seth Porter, and Chris Stone.

3
Project Goal

Use high-level languages for scientific computing
to achieve ...
Smooth integration of production and prototyping.
Richer data structures and more complex
algorithms.
Free exchange of re-usable components.
Expressiveness with acceptable efficiency.

4
Project Thesis

We may achieve these goals by exploiting
Functional programming.
Persistent data structures.
Higher-order functions, especially staging
(currying) and parameterization.
Implicit coarse-grained parallelism.
Parallel operations on aggregates.
Modular programming.
Interfaces, parameterization, hierarchy.

5
How We Can Win

Use methods that are unthinkable in current
languages.
persistence, parallelism, higher-order functions,
modularity
Change the terms of the debate.
Efficiency matters, but its not everything.
Expressiveness, concision, robustness,
flexibility matter too.

6
How We Can Win

Eliminate the distinction between
Prototyping languages, for experimenting with new
ideas.
Encourage quick and dirty hacks.
Poor software engineering properties.
Production languages, for building systems for
real.
Sacrifice ease of use and maintainability for the
sake of efficiency.

7
The State of the Art

Prototyping languages
eg, Mathematica, MatLab
Convenient, especially interactively
Support numeric and symbolic computation
High-level, language-oriented
Packages / toolkits for various domains
Easy to build an application

8
The State of the Art

Production languages
eg, Fortran, C, C
Batch-oriented, not interactive
Low-level, machine-oriented
Poor support for symbolic computation
Large collection of libraries, packages
More effort to build a system

9
Starting Point ML NESL

ML
Excellent support for complex data structures and
symbolic computing
Excellent support for modularity
Poor support for numeric computation
Poor support for parallelism

10
Starting Point ML NESL

NESL
First-order functional language.
Excellent support for implicit parallelism.
scan vector model of data parallelism
No support for complex data stuctures or symbolic
computing.
No support for modularity.

11
Current Work

Shared libraries.
Adaptive precision arithmetic.
General purpose geometry and physics.
Triangulated surfaces (simplicial complexes).
Applications.
Convex hull.
Delaunay triangulation.
n-Body problem.
Terrain modelling.
Finite element method.

12
Current Work

Language technology.
TILT compiler for Standard ML.
typed intermediate languages for efficient data
representation
Parallel run-time system with space- and
time-efficient thread scheduling.
exploited by data parallel sequence primitives.
Parallel, concurrent, real-time copying GC.
Multiple mutators and collectors on an SMP

13
Shared Libraries

Geometry primitives
Points, vectors, figures.
Line-side test, in-circle test.
Arithmetic.
adaptive precision floating point
interval arithmetic
Simplicial complexes.
Persistent representation of surfaces.

14
Representing Geometry

Single dimension-independent interface signature
GEOMETRY ...
Several dimension-dependent implementations
structure Geo2D gt GEOMETRY ...
structure Geo3D gt GEOMETRY ...
Different types for different dimensions!
avoids run-time checking for conformance

15
Representing Geometry

signature GEOMETRY sigstructure Vec
VECstructure Point POINTstructure HalfSpace
HALF_SPACEstructure Box BOXsharing Point.Vec
Vecand HalfSpace.Point Point
end

16
Representing Geometry

signature VEC sigstructure Scalar
NUMBERtype tval t t -gt tval scale t
Scalar.t -gt tval dot t t -gt Scalar.tval
norm t -gt Scalar.t
end

17
Representing Geometry

signature POINT sigstructure Vec
VECstructure Number NUMBERtype tval origin
tval translate t Vec.t -gt tval - t t -gt
Vec.t
end

18
Representing Geometry

signature HALF_SPACE sigexception
Degeneratestructure Pt POINTstructure Vec
VECtype tval from_pts Pt.t Seq.t -gt tval
side_test t -gt Pt.t -gt Side.t ( 3-valued )
end

19
An Object-Oriented Approach

CGAL uses OOP techniques.
No common interface for any dimension.
One implementation for all dimensions.
Point translate (Point p, Vector v)
Requires run-time conformance checks.
Uses templates for numeric precision (float,
double, etc)

20
ML vs OO

In ML conformance is ensured by sharing
specifications.
Vectors and points must agree on common types.
Run-time checks are avoided.
ML supports coherent integration of
abstractions better than do C or Java.
Geometry library is a good example of this.

21
Arithmetic

Key questions in geometric algorithms
Side test is a point above or below a given line
or plane?
Circle test is a point within a circle or
sphere?
These functions convert a floating point number
to a boolean.
Check sign of a determinant.

22
Arithmetic

It is essential that these questions be answered
consistently and accurately.
Otherwise higher-level results are nonsense (eg,
convex hull not convex).
It is essential that these be implemented
efficiently.
Lie at the heart of just about any algorithm in
CG.

23
Arithmetic

Adaptive precision via error analysis (Priest,
Shewchuk).
Calculate using single precision.
Calculate fuzz factor by forward error
analysis.
If reliable, answer, otherwise re-do in double
precision, then arbitrary precision.
Key weakness must derive and implement the error
calculation not a general arithmetic package.

24
Arithmetic

Lazy interval arithmetic.
Use a general interval arithmetic package for all
arithmetic.
IA determines reliability of result (eg, for
positivity test, interval cannot span origin).
Carry a symbolic representation of the history
of the computation of the point.
Re-do using full precision if unreliable.

25
Arithmetic

Intervals work well for polynomials.
Enough to do CG tests, assuming virgin data.
Polynomials are not enough, in general!
eg, projections onto a sphere require square
roots
How much can we feasibly support?
algebraic numbers?
transcendentals?

26
Real Numbers

Why not represent all (computable) real numbers?
eg, Edalat and Potts representation as infinite
sequences of convergent intervals
Basic limitation equality and ordering of reals
is not decidable.
eg, cannot decide whether xgt0 in general
most algorithms assume a decidable equality test

27
Real Numbers

CG algorithms assume that geometric objects are
discrete.
eg, line-side and in-circle tests
Computationally, this is nonsense!
cannot decide in general if x lies on L or within
C
Edalat suggests to treat geometric objects
themselves as sequences of intervals.
successive refinement to interior and exterior
But we must re-work CG in this framework.

28
Representing Surfaces

Triangulated surfaces arise frequently.
Finite element method for solving boundary-value
problems.
Computation of convex hull of a set of points.
Terrain modeling from altitude data.
Putting clothes on the Toy Story characters.

29
Representing Surfaces

A triangulated surface is a simplicial complex.
An n-complex is a set of n-simplexes that fit
together well.
Intersection of two n-simplexes is an
(n-1)-simplex.
A 1-simplex is a point, a 2-simplex a line, a
3-simplex a triangle, etc.
Vertices are affinely independent when embedded
in n-space.

30
Representing Surfaces

signature SIMPLEX sig
structure Vertex VERTEXtype simplexval
compare simplex simplex -gt orderval dim
simplex -gt intval vertices simplex -gt
Vertex.vertex seqval simplex Vertex.vertex seq
-gt simplexval down simplex -gt Vertex.vertex
simplexval join Vertex.vertex simplex -gt
simplexval faces simplex -gt simplex seqval
flip simplex -gt simplex
end

31
Representing Surfaces

signature VERTEX sig
type vertexval compare vertex vertex -gt
ordertype statetype pointval new state
point -gt state vertexval loc vertex -gt point
end

32
Representing Surfaces

signature SIMPCOMP sig
structure Simplex SIMPLEXtype a complexval
dim intval empty a complexval isempty
a complex -gt boolval simplices a complex
-gt int -gt Simplex.simplex seqval data a
complex -gt Simplex.simplex -gt a option...
end

33
Representing Surfaces

signature SIMPCOMP sig ... val grep a
complex -gt int Simplex.simplex -gt
Simplex.simplex seq val add a complex -gt
Simplex.simplex a -gt a complex ...
end

34
Representing Surfaces

Signature of complexes is dimension-independent.
Allows dimension-independent code to manipulate
surfaces.
Implementations are dimension-dependent.
Different dimensions have different types.

35
Implementing Surfaces

Usual implementation is imperative.
Doubly-linked edge list.
Constant-time traversal operations.
Ephemeral, not easily parallelizable.
Our implementation is functional.
Simplex is a sequence of vertices.
Logarithmic-time traversal of complex.
Persistent, easily parallelizable.

36
Cost of Persistence?

Lower bound on 3D hull ?(n lg n).
Matching upper bounds are known.
eg, Quickhull algorithm.
But they rely on constant-time surface
operations!
eg, find all three neighbors of a triangle
Can we match this in the persistent case?

37
Cost of Persistence?

Naively, wed expect O(n lg2 n) to pay for
persistence.
Surface operations are O(lg n) instead of O(1).
Burch and Miller discovered an optimal randomized
algorithm for the persistent case!
Touches surface O(n) times, at a cost of O(n lg
n).
O(n lg n) floating point operations.
Plus it is persistent and parallelizable!

38
Delaunay Triangulation

Generate a triangulation of a region with
boundary (in 2-space).
Must include the boundary segments.
Configuration of triangles must be good in
several senses (well discuss one).
A triangulation is a 2-complex.
Use simplicial complex package to represent it.

39
Delaunay Triangulation
40
Delaunay Triangulation

Goal avoid skinny triangles.
Lead to ill-conditioned numeric problems.
Method improve a triangulation by adding new
vertexes.
Turn a skinny triangle into two fat ones.
Where to add vertices?
Circumcenter of triangle usually works.
Other choices are sometimes forced.

41
Delaunay Triangulation
42
Ruperts Algorithm

Adding a vertex may violate the boundary!
New vertex may introduce a new skinny triangle
somewhere else in the region.
If the new triangle is on the boundary, further
splitting may remove a boundary segment!

43
Ruperts Algorithm
44
Ruperts Algorithm
45
Ruperts Algorithm

Typical solution search whenever a vertex is
about to be added.
Lots of ad hoc code for a rare case.
Introduces overhead at each step.
A simpler solution back out when a boundary
violation occurs.
Can be arbitrarily far in the future.

46
Ruperts Algorithm in ML

Use dynamic exceptions and persistent surface
representation.
Associate an exception with each new vertex.
If the boundary is violated, raise the exception
of some new vertex, passing a better location
for it.
Handler restarts triangulation from there.

47
Ruperts Algorithm in ML

For the 2D case this is nice.
simplifies coding
easier to explain
For the 3D case this is a big win.
conditions much harder to state and enforce
persistence helps enormously
first implementation of 3D Ruppert nearly complete

48
N-Body Problem

Problem given n bodies, compute the potentials
between them.
Gravitational potential between stars.
Electrical potential between charged bodies.
Naive solution O(n2).
For each body, sum potential due to all other
bodies in the system

49
N-Body Problem

Approximate potentials by clustering distant
bodies.
eg, think of Andromeda as one body relative to
the Sun
Several choices
How to form clusters?
How far is distant?
How to approximate potentials?

50
Clustering

Build a tree by recursively dividing space into
regions.
Each node is a cluster.
Each leaf is a body.
Two division methods
Equal-sized regions of space (quaternary).
Split bounding box on longest dimension (binary).

51
Clustering Even Division
52
Clustering Split Bounding Box
53
Approximation

Treat each node as a pseudo-body.
Potential based on approximation method.
Valid approximation for distant objects.
Several approximation methods.
Monopole first-order term (center of gravity).
Multipole expansion higher-order terms.
Spherical distribution k points evenly
distributed on the surface of a sphere.

54
Force Calculation

For each body, traverse the tree.
If far enough away from the cluster, use the
approximation.
Otherwise, sum potentials due to each child.
Distance criterion s/d gt ?.
s size of cell, d distance away
? threshold factor

55
Force Calculation

There is a trade-off between the approximation
and the separation criterion.
Smaller ? implies more interactions to be
calculated.
More terms in the expansion makes approximation
harder to calculate, but improves accuracy.

56
A Modular Implementation

Use a functor parameterized on separation,
partition, and approximation method
functor NBody(structure Sep SEPARATION
structure Part PARTITION structure Approx
APPROX sharing Sep.Geom Part.Geom and
Part.Geom Approx.Geom)

57
A Modular Implementation

We can mix-and-match methods to compare speed,
accuracy, etc.
Facilitates prototyping and comparison of
methods.
A single framework for all tree-based methods.
Standard methods use one trajectory through the
space of options.

58
Opportunities For Parallelism

Tree-based methods are very amenable to
parallelization.
During construction, can construct cells in
parallel.
During evaluation, can compute body-cell
interactions in parallel.
Can calculate effect on all bodies simultaneously.

59
Summary

Scientific computing demands a programming
language with
A flexible module system,
Numeric and symbolic computation,
Dynamically-allocated data structures
... while providing
Competitive efficiency on stock hardware.
Super reliability and maintainability.

60
Summary

Merging ML and NESL is a good start ...
superior expressiveness, modularity
natural support for parallelism
... but theres much more to be done!
need a killer app (3D Ruppert?)
performance comparisons

The PSciCo Project: Parallel Scientific Computing in ML PowerPoint PPT Presentation