Title: Dv: A toolkit for building remote interactive visualization services
1. Dv: A toolkit for building remote interactive visualization services
- David O'Hallaron
- School of Computer Science and
- Department of Electrical and Computer Engineering
- Carnegie Mellon University
- September, 1999
- joint work with Martin Aeschlimann, Peter Dinda,
- Julio Lopez, and Bruce Lowekamp
www.cs.cmu.edu/dv
2. Internet service models

[Diagram: client sends a request to a server; the server returns a response]

- Traditional lightweight service model
- Small to moderate amount of computation to satisfy requests
- e.g., serving web pages, stock quotes, online trading, current search engines
- Proposed heavyweight service model
- Massive amount of computation to satisfy requests
- e.g., remote viz, data mining, future search engines
- Approach: provide heavyweight services on a computational grid of hosts.
3. Heavyweight grid service model

[Diagram: remote compute hosts (allocated once per service by the service provider) connected across the best-effort Internet to local compute hosts (allocated once per session by the service user)]
4. Initial heavyweight service: remote visualization of earthquake ground motion

Jacobo Bielak and Omar Ghattas (CMU CE), David O'Hallaron (CMU CS and ECE), Jonathan Shewchuk (UC-Berkeley CS), Steven Day (SDSU Geology and SCEC)

www.cs.cmu.edu/quake
5. Teora, Italy, 1980
6. San Fernando Valley
7. San Fernando Valley (top view)

[Diagram: map showing the soft soil basin and surrounding hard rock, with the epicenter marked x]
8. San Fernando Valley (side view)

[Diagram: cross section showing a soft soil layer over hard rock]
9. Initial node distribution
10. Partitioned unstructured finite element mesh of San Fernando

[Diagram: mesh detail showing nodes and elements]
11. Communication graph

Vertices: processors. Edges: communications.
12. Quake solver code

    NODEVECTOR3 disp[3], M, C, M23;
    MATRIX3 K;

    /* matrix and vector assembly */
    FORELEM(i) {
      ...
    }

    /* time integration loop */
    for (iter = 1; iter <= timesteps; iter++) {
      MV3PRODUCT(K, disp[dispt], disp[disptplus]);
      disp[disptplus] *= -IP.dt * IP.dt;
      disp[disptplus] += 2.0 * M * disp[dispt] -
        (M - IP.dt / 2.0 * C) * disp[disptminus] - ...;
      disp[disptplus] = disp[disptplus] / (M + IP.dt / 2.0 * C);
      i = disptminus; disptminus = dispt;
      dispt = disptplus; disptplus = i;
    }
13. Archimedes

www.cs.cmu.edu/quake

[Diagram: the Archimedes toolchain. Inputs: a problem geometry (.poly) and a finite element algorithm (.arch) containing statements such as MVPRODUCT(A,x,w), DOTPRODUCT(x,w,xw), and r = r/xw. The geometry is meshed (.node, .ele), partitioned (.part), and packed (.pack); the algorithm is compiled to .c and then to an a.out that runs on the parallel system.]
14. 1994 Northridge quake simulation

- 40 seconds of an aftershock from the Jan 17, 1994 Northridge quake in the San Fernando Valley of Southern California.
- Model
- 50 x 50 x 10 km region of the San Fernando Valley.
- unstructured mesh with 13,422,563 nodes, 76,778,630 linear tetrahedral elements, 1 Hz frequency resolution, 20 meter spatial resolution.
- Simulation
- 0.0024 s timestep.
- 16,666 timesteps (45M x 45M SMVP each timestep).
- 15-20 GBytes of DRAM.
- 6.5 hours on 256 PEs of Cray T3D (150 MHz 21064 Alphas, 64 MB/PE).
- Comp: 16,679 s (71%), Comm: 575 s (2%), I/O: 5,995 s (25%).
- 80 trillion (80 x 10^12) flops (sustained 3.5 GFLOPS).
- 800 GB / 575 s (burst rate of 1.4 GB/s).
15. (Image slide, no transcript)
16. Visualization of 1994 Northridge aftershock: shock wave propagation path

Generated by Greg Foss, Pittsburgh Supercomputing Center
17. Visualization of 1994 Northridge aftershock: behavior of waves within basin

Generated by Greg Foss, Pittsburgh Supercomputing Center
18. Animations of 1994 Northridge aftershock and 1995 Kobe mainshock

- Data produced by the Quake group at Carnegie Mellon University.
- Images rendered by Greg Foss, Pittsburgh Supercomputing Center.
19. Motivation for Dv

- Quake datasets are too large to store and manipulate locally.
- 40 GB - 6 TB depending on the degree of downsampling.
- Common problem now because of advances in hardware, software, and simulation methodology.
- Current visualization approach
- Make a request to the supercomputing center (PSC) graphics department.
- Receive MPEGs, JPEGs, and/or videotapes in a couple of weeks/months.
- Desired visualization approach
- Provide a remote visualization service that lets us visualize Quake datasets interactively and in collaboration with colleagues around the world.
- Useful for qualitative debugging, demos, and solid engineering results.
20. Challenges to providing heavyweight services on a computational grid

- Local resources are limited
- We must find an easy way to grid-enable existing packages.
- Grid resources are heterogeneous
- Programs should be performance-portable (at load time) in the presence of heterogeneous resources.
- Grid resources are dynamic
- Programs should be performance-portable (at run time) in the face of dynamic resources.
- Bottom line: applications that provide heavyweight grid services must be resource-aware.
21. Example: Quake viz flowgraph

[Diagram: a pipeline of vtk (visualization toolkit) routines fed by a remote database and an FEM solver engine with a materials database: reading, interpolation or decimation (resolution), isosurface extraction (contours), scene synthesis (ROI, scene), and rendering to the local display and input. The amount of data decreases along the pipeline.]
22. Approaches for providing remote viz services

- Do everything on the remote server.
- Pros: very simple to grid-enable existing packages.
- Cons: high latency, eliminates the possibility of proxying and caching at the local site, can overuse the remote site, not appropriate for smaller datasets.

[Diagram: very high-end remote server connected to a local machine over a moderate bandwidth link (1-10 Mb/s)]
23. Approaches for providing remote viz services

- Do everything but the rendering on the remote server.
- Pros: fairly simple to grid-enable existing packages, removes some load from the remote site.
- Cons: requires every local site to have good rendering power.

[Diagram: high-end remote server connected to a machine with good rendering power over a moderate bandwidth link (1-10 Mb/s)]
24. Approaches for providing remote viz services

- Use a local proxy for the rendering.
- Pros: offloads work from the remote site, allows local sites to contribute additional resources.
- Cons: local sites may not have sufficiently powerful proxy resources, the application is more complex, requires high bandwidth between the local and remote sites.

[Diagram: high-end remote server connected to a powerful local proxy server over a moderate bandwidth link (1-10 Mb/s); the proxy drives a low-end local PC or PDA over a high bandwidth link (100 Mb/s)]
25. Approaches for providing remote viz services

- Do everything at the local site.
- Pros: low latency, easy to grid-enable existing packages.
- Cons: requires a high-bandwidth link between sites, requires powerful compute and graphics resources at the local site.

[Diagram: low-end remote server connected to a powerful local server over a very high bandwidth link (1 Gb/s)]
26. Providing remote viz services

- Claim: static application partitions are not appropriate for heavyweight Internet services on computational grids.
- Thus, the goal of Dv is to provide a framework for automatically scheduling and partitioning heavyweight services.
- The Dv approach is based on the notion of an active frame.
27. Active frames

[Diagram: an Active Frame Server running on a host. An input active frame (frame data + frame program) arrives; the active frame interpreter executes the frame program against application libraries (e.g., vtk); an output active frame (frame data + frame program) is emitted.]
28. Overview of a Dv visualization service

[Diagram: user inputs/display drive a local Dv client. Request frames flow from the client through local Dv active frame servers to remote Dv active frame servers holding the remote datasets; response frames flow back through the chain of Dv servers to the client.]
29. Grid-enabling vtk with Dv

[Diagram: the local Dv client on the local machine sends a request frame (request server, scheduler, flowgraph, data reader) to the request server on the remote machine. The reader and scheduler on the remote machine emit response frames (application data, scheduler, flowgraph, control) to other Dv servers, and the local Dv server delivers the result to the client.]
30. Scheduling Dv programs

- Scheduling at request frame creation time
- all response frames use the same schedule
- can be performance-portable at load time
- cannot be performance-portable at run time
- Scheduling at response frame creation time
- performance-portable at load time and partially at run time.
- Scheduling at response frame delivery time
- can be performance-portable at both load and run time.
- per-frame scheduling overhead is a potential disadvantage.
31. Scheduling: single client resource aggregation example

[Diagram: the visualization flowgraph mapped across multiple Dv servers. Frames 1, 2, and 3 flow from a Dv server (source) at the remote site through intermediate Dv servers at the local site to the Dv server (client), aggregating the resources of several hosts.]
32. Scheduling: single client adaptation example

[Diagram: the same visualization flowgraph partitioned differently across the moderate bandwidth link between the remote and local sites: with high CPU or battery availability, more stages run at the local site; with low CPU or battery availability, more stages run at the remote site.]
33. Current Dv issues

- Collaboration with viz groups
- need to exploit existing and new viz techniques
- e.g., progressive viz
- Flowgraphs with multiple inputs/outputs
- Caching
- Static data such as meshes needs to be cached on intermediate servers.
- Scheduling interface
- Must support a wide range of scheduling strategies, from completely static (once per session) to completely dynamic (each time a frame is sent by each frame server).
- Network and host resource monitoring
- network queries and topology discovery (Lowekamp, O'Hallaron, Gross, HPDC99)
- host load prediction (Dinda and O'Hallaron, HPDC99)
34. Issues (cont.)
- Managing the Dv grid
- Resource discovery
- where are the Dv servers?
- which of them are running?
- Resource allocation
- which Dv servers are available to use?
- Collective operations
- Broadcast?
- Global synchronization of servers
- Client model
- one generic client that runs in a browser
- config files that personalize client interface
for each new service (dataset)
35. Related work
- PVM and MPI (MSU, ORNL, Argonne)
- Active messages (Berkeley, Illinois)
- Active networks (CMU, MIT, GA Tech, ...)
- Globus (Argonne and USC)
- Legion (UVA)
- Harness and Cumulvs (ORNL)
- AppLEs (UCSD)
- NWS (UTenn and UCSD)
- Remos (CMU)
- svPilot and Autopilot (UIUC)
36. Conclusions

- Heavyweight services on computational grids are emerging.
- Static partitioning is not appropriate for heavyweight grid services.
- Active frames provide a uniform framework for grid-enabling and partitioning heavyweight services such as remote visualization.
- Dv is a toolkit based on active frames that we have used to grid-enable vtk.
- Dv provides a flexible framework for experimenting with grid scheduling techniques.