Title: Developing HPC Scientific and Engineering Applications: From the Laptop to the Grid
Slide 1: Developing HPC Scientific and Engineering Applications: From the Laptop to the Grid
- Gabrielle Allen, Tom Goodale, Thomas Radke, Ed Seidel (Max Planck Institute for Gravitational Physics, Germany)
- John Shalf (Lawrence Berkeley Laboratory, USA)
- These slides: http://www.cactuscode.org/Tutorials.html
Slide 2: Outline for the Day
- Introduction (Ed Seidel, 30 min)
- Issues for HPC (John Shalf, 60 min)
- Cactus Code (Gabrielle Allen, 90 min)
- Demo: Cactus, I/O and Viz (John, 15 min)
- LUNCH
- Introduction to Grid Computing (Ed, 15 min)
- Grid Scenarios for Applications (Ed, 60 min)
- Demo: Grid Tools (John, 15 min)
- Developing Grid Applications Today (Tom Goodale, 60 min)
- Conclusions (Ed, 5 min)
Slide 3: Introduction
Slide 4: Outline
- Review of application domains requiring HPC
- Access and availability of computing resources
- Requirements from end users
- Requirements from application developers
- The future of HPC
Slide 5: What Do We Want to Achieve?
- Overview of HPC applications and techniques
- Strategies for developing HPC applications that are:
  - Portable from laptop to Grid
  - Future-proof
  - Grid-ready
- Introduce frameworks for HPC application development
- Introduce the Grid: what is it, what isn't it, what will it be?
- Grid toolkits: how to prepare and develop apps for the Grid, today and tomorrow
- What are we NOT doing?
  - Application-specific algorithms
  - Parallel programming
  - Optimizing Fortran, etc.
Slide 6: Who Uses HPC?
- Scientists and engineers
  - Simulating nature: black hole collisions, hurricanes, ground water flow
  - Modeling processes: a space shuttle entering the atmosphere
  - Analyzing data: lots of it!
- Financial markets
  - Modeling currencies
- Industry
  - Airlines, insurance companies
  - Transactions, data, etc.
- All face similar problems:
  - Computational needs not met
  - Remote facilities
  - Heterogeneous and changing systems
- Look now at three types:
  - High-capacity, high-throughput, and large-data computing
Slide 7: High-Capacity Computing: We Want to Compute What Happens in Nature!
Slide 8: Computational Needs: 3D Numerical Relativity
- t=0: Get physicists and CS people together; find resources (TByte, TFlop crucial)
- Initial data: 4 coupled nonlinear elliptic equations
- Choose gauge (elliptic/hyperbolic)
- Evolution: hyperbolic evolution coupled with elliptic equations
- Find resources...
- Analysis: interpret, find apparent horizons (AH), etc.
Slide 9: Any Such Computation Requires an Incredible Mix of Varied Technologies and Expertise!
- Many scientific/engineering components
  - Physics, astrophysics, CFD, engineering, ...
- Many numerical algorithm components
  - Finite difference methods? Finite volume? Finite elements?
  - Elliptic equations: multigrid, Krylov subspace, preconditioners, ...
  - Mesh refinement?
- Many different computational components
  - Parallelism (HPF, MPI, PVM, ???)
  - Architecture efficiency (MPP, DSM, vector, PC clusters, ???)
  - I/O bottlenecks (gigabytes generated per simulation, checkpointing)
  - Visualization of all that comes out!
- Scientists/engineers want to focus on the top layer, but all of it is required for results...
- Such work cuts across many disciplines and areas of CS
- And now do it on a Grid??!!
Slide 10: How to Achieve This?
- (The same mix of scientific, numerical, and computational components as on the previous slide)
Slide 11: High-Throughput Computing: Task Farming
- Running hundreds to millions of jobs as quickly as possible
  - Collecting statistics, doing ensemble calculations, surveying large parameter spaces, etc.
- Typical characteristics:
  - Many small, independent jobs: they must be managed!
  - Usually not much data transfer
  - Sometimes jobs can be moved from site to site
- Example problems: climatemodeling.com, NUG30
- Example solutions: Condor, SC02 demos, etc.
- Later: examples that combine capacity and throughput
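The task-farming pattern above can be sketched in a few lines: many small, independent jobs, a manager that farms them out to workers, and a collection step for the statistics. This is a hedged Python illustration (a real deployment would use a system like Condor, with scheduling, retries, and remote hosts); the function names are invented for the example.

```python
# Minimal task-farming sketch: farm independent parameter-survey jobs
# out to a worker pool and collect the results. Illustrative only.
from concurrent.futures import ThreadPoolExecutor

def run_job(param):
    """One small, independent job (hypothetical stand-in for a run)."""
    return param, param * param     # pretend "result" of the simulation

def task_farm(params, max_workers=4):
    """Manager: dispatch every job, then gather results by parameter."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_job, params))

if __name__ == "__main__":
    survey = task_farm(range(10))   # survey a small parameter space
    print(sorted(survey.items()))
```

Because the jobs share no data, the same pattern scales from a laptop's worker pool to thousands of Grid hosts without changing the manager logic.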
Slide 12: Large-Data Computing
- Data: more and more the killer app for the Grid
- Data mining
  - Looking for patterns in huge databases distributed over the world
  - E.g. genome analysis
- Data analysis
  - Large astronomical observatories
  - Particle physics experiments
  - Huge amounts of data from different locations to be correlated and studied
- Data generation
  - As resources grow, huge simulations will each generate TB-PB of data to be studied
- Visualization
  - How to visualize such large data: here, at a distance, distributed
- Soon: dynamic combinations of all types of computing and data on Grids
- Our goal is to give strategies for dealing with all types of computing
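One standard strategy for data too large to hold in memory is to stream it and accumulate statistics incrementally. A minimal sketch, using the well-known one-pass (Welford) mean/variance recurrence; the setup and names are invented for illustration, and a real pipeline would read chunks from files or the network rather than a generator.

```python
# Minimal sketch: one-pass (streaming) statistics over data too large
# to load at once, via Welford's online mean/variance algorithm.
# Illustrative only; real analyses stream chunks from disk or network.

def streaming_stats(samples):
    """Return (count, mean, variance) in a single pass over samples."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in samples:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    variance = m2 / n if n else 0.0
    return n, mean, variance

if __name__ == "__main__":
    # Pretend this generator is a terabyte-scale remote dataset:
    # nothing is ever materialized in memory at once.
    big = (float(i % 5) for i in range(1_000_000))
    print(streaming_stats(big))
```

The same single-pass idea underlies distributed analysis: each site reduces its local data, and only the small summaries travel over the network.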
Slide 13: Grand Challenge Collaborations: Going Large Scale, Needs Dwarf Capabilities
- NASA Neutron Star Grand Challenge
  - 5 US institutions
  - Solve the problem of colliding neutron stars (try)
- NSF Black Hole Grand Challenge
  - 8 US institutions, 5 years
  - Solve the problem of colliding black holes (try)
- EU Network Astrophysics
  - 10 EU institutions, 3 years, 1.5M
  - Continue these problems
- Entire community becoming Grid-enabled
- Examples of the future of science and engineering:
  - Require large-scale simulations, beyond the reach of any single machine
  - Require large, geographically distributed, cross-disciplinary collaborations
  - Require Grid technologies, but are not yet using them!
  - Both the apps and the Grids are dynamic
Slide 14: Growth of Computing Resources (from Dongarra)
Slide 15: Not Just Growth, Proliferation
- Systems getting larger by 2-4x per year!
  - Moore's law (processor performance doubles every 18 months)
  - Increasing parallelism: add more and more processors
- More systems
  - Many more organizations recognizing the need for HPC:
  - Universities
  - Labs
  - Industry
  - Business
- New kind of parallelism: the Grid
  - Harness these machines, which are themselves growing
- Machines are all different! Be prepared for the next thing
Slide 16: Today's Computational Resources
- PDAs
- Laptops
- PCs
- SMPs
  - Shared memory, up to now
- Clusters
  - Distributed memory; must use message passing or task farming
- Traditional supercomputers
  - SMPs of up to 64 processors, clustering above this
  - Vectors
- Clusters of large systems: metacomputing
- The Grid
- Everyone uses PDAs to PCs
- Industry prefers traditional machines
- Academia: clusters, for price/performance
- We show how to minimize the effort to go between systems, and to prepare for the Grid
Slide 17: The Same Application
[Diagram: the same Application code running unchanged in three settings, each on top of a Middleware layer: a laptop ("No network!"), a supercomputer ("Biggest machines!"), and the Grid.]
Slide 18: What is Difficult About HPC?
- Many different architectures and operating systems
- Things change very rapidly
- Must worry about many things at the same time:
  - Single-processor performance, caches, etc.
  - Different languages
  - Different operating systems (but now, at least, everything is (nearly) Unix!)
  - Parallelism
  - I/O
  - Visualization
  - Batch systems
  - Portability: compilers, datatypes, and associated tools
Slide 19: Requirements of End Users
- We have problems that need to be solved
- We want to work at the conceptual level
  - Build on top of other things that have been solved for us
  - Use libraries, modules, etc.
- We don't want to waste time with:
  - Learning a new parallel layer
  - Writing high-performance I/O
  - Learning a new batch system, etc.
- We have collaborators distributed all over the world
- We want answers fast, on whatever machines are available
- Basically, we want to write simple Fortran or C code and have it work
Slide 20: Requirements of Application Developers
- We must have access to the latest technologies
  - These should be available through simple interfaces and APIs
  - They should be interchangeable when the same functionality is available from different packages
- The code we develop must be as portable and as future-proof as possible
  - Run on all the architectures we have today
  - Be easily adapted to those of tomorrow
  - If possible, the top-level user application code should not change, only the layers underneath
- We'll give strategies for doing this, on today's machines and on the Grid of tomorrow
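The "top-level code should not change, only the layers underneath" idea can be sketched as a thin abstraction layer: the application codes against one small interface, and the layer behind it is swappable. A hedged Python illustration; the interface and class names are invented, and frameworks such as Cactus realize this idea for C/Fortran codes.

```python
# Minimal sketch of an interchangeable-layer design: application code
# calls one small "driver" API; serial and threaded drivers both
# implement it, so swapping layers never touches the application.
from concurrent.futures import ThreadPoolExecutor

class SerialDriver:
    """Layer 1: plain serial execution (e.g. on a laptop)."""
    def map(self, fn, items):
        return [fn(x) for x in items]

class ThreadedDriver:
    """Layer 2: same API, but work is spread over a worker pool."""
    def __init__(self, workers=4):
        self.workers = workers
    def map(self, fn, items):
        with ThreadPoolExecutor(self.workers) as pool:
            return list(pool.map(fn, items))

def application(driver, data):
    """Top-level 'science' code: identical whichever driver is used."""
    return sum(driver.map(lambda x: x * x, data))

if __name__ == "__main__":
    data = range(100)
    print(application(SerialDriver(), data),
          application(ThreadedDriver(), data))
```

A future Grid-aware driver would slot in the same way: only the layer underneath changes, exactly as the slide demands.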
Slide 21: Where is This All Going?
- Dangerous to predict, but:
  - Resources will continue to grow for some time
  - Machines will keep getting larger at this rate: TeraFlop now, PetaFlop tomorrow
  - Collecting resources into Grids is happening now, and will be routine tomorrow
  - Very heterogeneous environments
  - The data explosion will be exponential
  - Mixtures of real-time simulation and data analysis will become routine
  - Bandwidth from point to point will be allocatable on demand!
  - Applications will become very sophisticated, able to adapt to their changing needs and to a changing environment (on time scales of minutes to years)
- We are trying today to help you prepare for this!