Title: Developing HPC Scientific and Engineering Applications: From the Laptop to the Grid
Slide 1: Developing HPC Scientific and Engineering Applications: From the Laptop to the Grid
- Gabrielle Allen, Tom Goodale, Thomas Radke, Ed Seidel (Max Planck Institute for Gravitational Physics, Germany)
- John Shalf (Lawrence Berkeley Laboratory, USA)
- These slides: http://www.cactuscode.org/Tutorials.html
Slide 2: Outline for the Day
- Introduction (Ed Seidel, 30 min)
- Issues for HPC (John Shalf, 60 min)
- Cactus Code (Gabrielle Allen, 90 min)
- Demo: Cactus, I/O and Viz (John, 15 min)
- LUNCH
- Introduction to Grid Computing (Ed, 15 min)
- Grid Scenarios for Applications (Ed, 60 min)
- Demo: Grid Tools (John, 15 min)
- Developing Grid Applications Today (Tom Goodale, 60 min)
- Conclusions (Ed, 5 min)
Slide 3: Introduction
Slide 4: Outline
- Review of application domains requiring HPC
- Access and availability of computing resources
- Requirements from end users
- Requirements from application developers
- The future of HPC
Slide 5: What Do We Want to Achieve?
- Overview of HPC applications and techniques
- Strategies for developing HPC applications that are:
  - Portable from laptop to Grid
  - Future-proof
  - Grid-ready
- Introduce frameworks for HPC application development
- Introduce the Grid: what is it, what isn't it, what will it be?
- Grid toolkits: how to prepare and develop apps for the Grid, today and tomorrow
- What are we NOT doing?
  - Application-specific algorithms
  - Parallel programming
  - Optimizing Fortran, etc.
Slide 6: Who Uses HPC?
- Scientists and engineers
  - Simulating nature: black hole collisions, hurricanes, ground water flow
  - Modeling processes: a space shuttle entering the atmosphere
  - Analyzing data: lots of it!
- Financial markets
  - Modeling currencies
- Industry
  - Airlines, insurance companies
  - Transactions, data, etc.
- All face similar problems:
  - Computational needs not met
  - Remote facilities
  - Heterogeneous and changing systems
- Look now at three types:
  - High-capacity, high-throughput, and large-data computing
Slide 7: High-Capacity Computing: We Want to Compute What Happens in Nature!
Slide 8: Computational Needs: 3D Numerical Relativity
- t=0: Get physicists and CS people together; find resources (TByte, TFlop crucial)
- Initial data: 4 coupled nonlinear elliptic equations
- Choose gauge (elliptic/hyperbolic)
- Evolution: hyperbolic evolution coupled with elliptic equations
- Find resources...
- Analysis: interpret, find apparent horizons (AH), etc.
Slide 9: Any Such Computation Requires an Incredible Mix of Varied Technologies and Expertise!
- Many scientific/engineering components
  - Physics, astrophysics, CFD, engineering, ...
- Many numerical algorithm components
  - Finite difference methods? Finite volume? Finite elements?
  - Elliptic equations: multigrid, Krylov subspace, preconditioners, ...
  - Mesh refinement?
- Many different computational components
  - Parallelism (HPF, MPI, PVM, ???)
  - Architecture efficiency (MPP, DSM, vector, PC clusters, ???)
  - I/O bottlenecks (gigabytes generated per simulation, checkpointing)
  - Visualization of all that comes out!
- Scientists/engineers want to focus on the top layer, but all of it is required for results...
- Such work cuts across many disciplines and areas of CS
- And now do it on a Grid??!!
Slide 10: How to Achieve This?
- (The same mix of scientific, numerical, and computational components as on the previous slide)
Slide 11: High-Throughput Computing: Task Farming
- Running hundreds to millions of jobs as quickly as possible
  - Collecting statistics, doing ensemble calculations, surveying large parameter spaces, etc.
- Typical characteristics:
  - Many small, independent jobs: they must be managed!
  - Usually not much data transfer
  - Sometimes jobs can be moved from site to site
- Example problems: climatemodeling.com, NUG30
- Example solutions: Condor, SC02 demos, etc.
- Later: examples that combine capacity and throughput
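The task-farming pattern above can be sketched in a few lines: many small, independent jobs, a manager that farms them out to workers, and a collection step for the statistics. This is a hedged Python illustration (a real deployment would use a system like Condor, with scheduling, retries, and remote hosts); the function names are invented for the example.

```python
# Minimal task-farming sketch: farm independent parameter-survey jobs
# out to a worker pool and collect the results. Illustrative only.
from concurrent.futures import ThreadPoolExecutor

def run_job(param):
    """One small, independent job (hypothetical stand-in for a run)."""
    return param, param * param     # pretend "result" of the simulation

def task_farm(params, max_workers=4):
    """Manager: dispatch every job, then gather results by parameter."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_job, params))

if __name__ == "__main__":
    survey = task_farm(range(10))   # survey a small parameter space
    print(sorted(survey.items()))
```

Because the jobs share no data, the same pattern scales from a laptop's worker pool to thousands of Grid hosts without changing the manager logic.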
Slide 12: Large-Data Computing
- Data: more and more the killer app for the Grid
- Data mining
  - Looking for patterns in huge databases distributed over the world
  - E.g. genome analysis
- Data analysis
  - Large astronomical observatories
  - Particle physics experiments
  - Huge amounts of data from different locations to be correlated and studied
- Data generation
  - As resources grow, huge simulations will each generate TB-PB of data to be studied
- Visualization
  - How to visualize such large data: here, at a distance, distributed
- Soon: dynamic combinations of all types of computing and data on Grids
- Our goal is to give strategies for dealing with all types of computing
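One standard strategy for data too large to hold in memory is to stream it and accumulate statistics incrementally. A minimal sketch, using the well-known one-pass (Welford) mean/variance recurrence; the setup and names are invented for illustration, and a real pipeline would read chunks from files or the network rather than a generator.

```python
# Minimal sketch: one-pass (streaming) statistics over data too large
# to load at once, via Welford's online mean/variance algorithm.
# Illustrative only; real analyses stream chunks from disk or network.

def streaming_stats(samples):
    """Return (count, mean, variance) in a single pass over samples."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in samples:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    variance = m2 / n if n else 0.0
    return n, mean, variance

if __name__ == "__main__":
    # Pretend this generator is a terabyte-scale remote dataset:
    # nothing is ever materialized in memory at once.
    big = (float(i % 5) for i in range(1_000_000))
    print(streaming_stats(big))
```

The same single-pass idea underlies distributed analysis: each site reduces its local data, and only the small summaries travel over the network.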
Slide 13: Grand Challenge Collaborations: Going Large Scale, Needs Dwarf Capabilities
- NASA Neutron Star Grand Challenge
  - 5 US institutions
  - Solve the problem of colliding neutron stars (try)
- NSF Black Hole Grand Challenge
  - 8 US institutions, 5 years
  - Solve the problem of colliding black holes (try)
- EU Network Astrophysics
  - 10 EU institutions, 3 years, 1.5M
  - Continue these problems
- Entire community becoming Grid-enabled
- Examples of the future of science and engineering:
  - Require large-scale simulations, beyond the reach of any single machine
  - Require large, geographically distributed, cross-disciplinary collaborations
  - Require Grid technologies, but are not yet using them!
  - Both the apps and the Grids are dynamic
Slide 14: Growth of Computing Resources (from Dongarra)
Slide 15: Not Just Growth, Proliferation
- Systems getting larger by 2-4x per year!
  - Moore's law (processor performance doubles every 18 months)
  - Increasing parallelism: add more and more processors
- More systems
  - Many more organizations recognizing the need for HPC:
  - Universities
  - Labs
  - Industry
  - Business
- New kind of parallelism: the Grid
  - Harness these machines, which are themselves growing
- Machines are all different! Be prepared for the next thing
Slide 16: Today's Computational Resources
- PDAs
- Laptops
- PCs
- SMPs
  - Shared memory, up to now
- Clusters
  - Distributed memory; must use message passing or task farming
- Traditional supercomputers
  - SMPs of up to 64 processors, clustering above this
  - Vectors
- Clusters of large systems: metacomputing
- The Grid
- Everyone uses PDAs to PCs
- Industry prefers traditional machines
- Academia: clusters, for price/performance
- We show how to minimize the effort to go between systems, and to prepare for the Grid
Slide 17: The Same Application
[Diagram: the same Application code running unchanged in three settings, each on top of a Middleware layer: a laptop ("No network!"), a supercomputer ("Biggest machines!"), and the Grid.]
Slide 18: What is Difficult About HPC?
- Many different architectures and operating systems
- Things change very rapidly
- Must worry about many things at the same time:
  - Single-processor performance, caches, etc.
  - Different languages
  - Different operating systems (but now, at least, everything is (nearly) Unix!)
  - Parallelism
  - I/O
  - Visualization
  - Batch systems
  - Portability: compilers, datatypes, and associated tools
Slide 19: Requirements of End Users
- We have problems that need to be solved
- We want to work at the conceptual level
  - Build on top of other things that have been solved for us
  - Use libraries, modules, etc.
- We don't want to waste time with:
  - Learning a new parallel layer
  - Writing high-performance I/O
  - Learning a new batch system, etc.
- We have collaborators distributed all over the world
- We want answers fast, on whatever machines are available
- Basically, we want to write simple Fortran or C code and have it work
Slide 20: Requirements of Application Developers
- We must have access to the latest technologies
  - These should be available through simple interfaces and APIs
  - They should be interchangeable when the same functionality is available from different packages
- The code we develop must be as portable and as future-proof as possible
  - Run on all the architectures we have today
  - Be easily adapted to those of tomorrow
  - If possible, the top-level user application code should not change, only the layers underneath
- We'll give strategies for doing this, on today's machines and on the Grid of tomorrow
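The "top-level code should not change, only the layers underneath" idea can be sketched as a thin abstraction layer: the application codes against one small interface, and the layer behind it is swappable. A hedged Python illustration; the interface and class names are invented, and frameworks such as Cactus realize this idea for C/Fortran codes.

```python
# Minimal sketch of an interchangeable-layer design: application code
# calls one small "driver" API; serial and threaded drivers both
# implement it, so swapping layers never touches the application.
from concurrent.futures import ThreadPoolExecutor

class SerialDriver:
    """Layer 1: plain serial execution (e.g. on a laptop)."""
    def map(self, fn, items):
        return [fn(x) for x in items]

class ThreadedDriver:
    """Layer 2: same API, but work is spread over a worker pool."""
    def __init__(self, workers=4):
        self.workers = workers
    def map(self, fn, items):
        with ThreadPoolExecutor(self.workers) as pool:
            return list(pool.map(fn, items))

def application(driver, data):
    """Top-level 'science' code: identical whichever driver is used."""
    return sum(driver.map(lambda x: x * x, data))

if __name__ == "__main__":
    data = range(100)
    print(application(SerialDriver(), data),
          application(ThreadedDriver(), data))
```

A future Grid-aware driver would slot in the same way: only the layer underneath changes, exactly as the slide demands.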
Slide 21: Where is This All Going?
- Dangerous to predict, but:
  - Resources will continue to grow for some time
  - Machines will keep getting larger at this rate: TeraFlop now, PetaFlop tomorrow
  - Collecting resources into Grids is happening now, and will be routine tomorrow
  - Very heterogeneous environments
  - The data explosion will be exponential
  - Mixtures of real-time simulation and data analysis will become routine
  - Bandwidth from point to point will be allocatable on demand!
  - Applications will become very sophisticated, able to adapt to their changing needs and to a changing environment (on time scales of minutes to years)
- We are trying today to help you prepare for this!