The Role of Analysis in Modern High Performance Computing

- J. L. Schwarzmeier, Cray Inc, July 6, 2007

Outline

- Where I am coming from
- Describe the process of going from the problem to be solved to a solution on a computer, and the role of analysis throughout
- Conclusions

Where I am coming from

- As much as I have studied and enjoyed math, I am not a researcher in analysis, so this is a personal view of analysis and applied math
- However, I did see lots of use of analysis for theoretical and numerical purposes in my own work and the work of others
- 20 years since I left LANL for Cray
- Cray personnel generally do not work with customers at the level of their fundamental equations, but we continue to see sophisticated uses of analysis by customers
- Sometimes we do recommend alternative numerical schemes to improve performance
- Will describe the journey from identifying the problem to be solved to obtaining a solution on a computer, and the role of analysis, sprinkled with examples I have encountered

Role of analysis in the sciences

- Is the use of analysis in science/engineering relegated to a bygone era, before computers and before all science had been discovered? NO!!
- Myth: all science has already been discovered
- Reality: no. The laws of physics are known, but the focus has changed to solving more realistic problems
- The wave function of the universe [expression missing], for the N particles in the universe, satisfies [expression missing]
- This is an approximately true but totally useless equation. Since the 1920s, about the only quantum mechanical problems that have been solved exactly are the hydrogen atom, a single particle in a harmonic oscillator potential, and other similarly idealized situations. Even on the most powerful supercomputers, quantum chemistry codes struggle mightily to approximately solve the N-atom Schroedinger equation for N = O(10) atoms.
- Even on today's supercomputers, important problems cannot be solved by brute force: clever algorithms and implementations are required, as is knowing what approximations to make

Circle of dependency

- Scientific progress today often requires multi-disciplinary collaboration
- Team of scientists
- Mathematics, numerical methods analysis
- Computer specialists
- Do we have enough students entering analysis/applied math? Will their training allow them to bridge the gap from scientist/engineer to computer programmer?

1. Scientists define problem
2. Analysis to understand properties, guide solution
3. Numerics and computer implementation

Step 1 of the Journey: role of scientists/engineers

- A big opportunity for analysis and computing today is solving new, realistic, specific, often multi-disciplinary problems. Specific model equations are derived from general equations by physical and mathematical intuition
- Focus on a narrowed range of problems of interest
- Of the fundamental equations, which terms are important? How are multi-disciplinary PDEs coupled together? How to deal with greatly disparate space/time scales? The result is a modified set of equations
- In climate studies, the current model couples atmosphere, ocean, land, and sea ice. Seek to improve the model by including 100 chemical species in the atmosphere; full carbon cycle modeling with interaction of vegetation, plant decay, and release of CO2; full fresh water hydrology with river basin modeling and drainage into oceans; higher spatial resolution to model cloud physics and the effects of narrow land formations (Florida) on oceanic currents; and better modeling of man-made interactions such as fires, deforestation, development, irrigation, pollution, etc. This is a much expanded set of equations and unknowns that needs a mathematical foundation

Another multi-disciplinary example

- Researchers at Rice University Use Cray Supercomputer to Unlock Biomedical Mysteries and Aid Future Diagnostics
- Members of the Team for Advanced Flow Simulation and Modeling at Rice are collaborating with colleagues from other institutions to create computational fluid dynamics models that mimic how blood courses through the brain's arteries and interacts with an aneurysm on a vessel wall. An aneurysm is a balloon-like protrusion of an artery that could be fatal if it bursts. The team uses the Cray system to simulate numerically how the blood, artery and aneurysm interact with each other. The data is then loaded into a program from Computational Engineering International called EnSight, which provides visualization and analytical capabilities.
- "Accurate blood-flow simulations are extremely complex because an artery wall isn't rigid and blood pressure fluctuates with the beating of the heart," says Tayfun Tezduyar, professor of mechanical engineering at Rice. "We want to understand how much a cerebral artery wall deforms, how blood flow is affected and what stresses are created that could affect the aneurysm. A precise understanding of this dynamic will be of great benefit to brain surgeons when they have to make a decision about whether or not to operate."
- A traditional Singular Value Decomposition algorithm running on a conventional computer does not preserve the symmetry of the molecule, making it difficult to isolate and study a protein's characteristics. The team developed more accurate algorithms that they parallelized to run quickly on the supercomputer

Example of HPC for economic competitiveness

- CRAY SUPERCOMPUTERS PLAY KEY ROLE IN DESIGNING BOEING 787 DREAMLINER: 800,000 Simulation Hours Helped Create Design For Highly Successful Commercial Aircraft
- SEATTLE, WA, July 5, 2007 -- Global supercomputer leader Cray Inc. (Nasdaq GM: CRAY) reported today that 800,000 processor hours of computing time on Cray supercomputers went into the design of the highly successful Boeing 787 Dreamliner. Supercomputer-based modeling and simulation is far more efficient, cost-effective and practical than physical prototyping for testing large numbers of design variables. While physical prototyping is still important for final design validation, Boeing engineers were able to build the 787 Dreamliner after physically testing only 11 wing designs, versus 77 wing designs for the earlier Boeing 767 aircraft. The Boeing 787 Dreamliner is 20% lighter and produces 20% fewer emissions than similarly sized airplanes, while providing 10% better per-seat costs per mile, according to Boeing.

Step 2: mathematical analysis

- Analysis gives a range of analytical techniques for posing or formally solving boundary value problems
- What are the mathematical properties of the equations? Are the equations parabolic, elliptic, or hyperbolic? What are appropriate BCs? Are the initial conditions continuous, differentiable? Should we allow for weak solutions? What are the continuity properties of the operators?
- Can perturbation theory be used to find first-order solutions? Are there transformations of independent or dependent variables that simplify the problem? Are there analytical, limiting solutions? Are there symmetry properties to take advantage of?
- What are possible methods of solution: eigenfunction expansions, Green's functions, method of characteristics, transform methods, divergence theorem, Stokes' theorem? Are there special functions that form an expansion set with the desired continuity, orthogonality, and completeness properties?
- Is the solution derivable from a variational principle? Are there constraints in the problem (Lagrange multipliers)?
- Many of these techniques apply not only as traditional mathematical solutions but also as numerical solutions on computers, only now the linear spaces have finite dimensionality

Resolving coordinate-induced singularities

- For problems involving cylindrical or spherical domains, cylindrical or spherical coordinates introduce artificial singularities at the origin. Singular solutions must be eliminated; they are made worse by increasing resolution. For expansions [expression missing], determine the minimum continuity conditions on the function [expression missing] that allow only analytic solutions about the origin. Do this separately for scalar quantities versus vector components [expression missing]. This can be done independent of any PDE.
- For example, for the simple scalar case, for [expression missing], derivatives of the latter functional form eventually will diverge at the origin unless we have [expression missing]. For [expression missing], we have [expression missing].
- Thus the analytic solution is of the form [expression missing]
- Furthermore, in finite element implementations, shape functions that intersect the origin can be made explicit functions of [expression missing]. Shape functions whose finite elements do not reach the origin can be functions of [expression missing].
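
For reference, the standard regularity argument for the scalar case in polar coordinates can be written out as below. This is a sketch of the textbook result, not a recovery of the slide's own formulas: the symbols f, f_m, g_m, and the coefficients c are chosen here for illustration. It shows why only particular powers of r are allowed in each Fourier harmonic if the field is to be analytic at the origin.

```latex
% A scalar field analytic in Cartesian (x, y) near the origin has a power
% series in x + iy = r e^{i\theta} and x - iy = r e^{-i\theta}:
f(x,y) = \sum_{a,b \ge 0} c_{ab}\,(x+iy)^a (x-iy)^b
       = \sum_{a,b \ge 0} c_{ab}\, r^{a+b}\, e^{i(a-b)\theta}.
% Collect the terms with a - b = m. Then a + b = |m| + 2k, k = 0, 1, 2, ...
f(r,\theta) = \sum_{m=-\infty}^{\infty} f_m(r)\, e^{im\theta},
\qquad
f_m(r) = r^{|m|} \sum_{k \ge 0} c^{(m)}_{k}\, r^{2k} = r^{|m|}\, g_m(r^2).
```

Any other leading power of r in the m-th harmonic produces a derivative that diverges at r = 0, which is the kind of spurious singular solution the slide says must be eliminated.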

Use of Fourier Expansion

- For theoretical purposes, decomposition of the solution in terms of Fourier wavelengths and frequencies will always be important for judging the physical scales of interest
- Specific applicability for a) periodic boundary conditions, b) smooth, long wavelength solutions, c) when the number of basis functions needs to be as small as possible, and d) when analysis or computation is aided by an orthogonal basis
- Currently used in the following HPC applications
- Turbulence modeling
- Long range molecular forces in molecular dynamics and materials science
- Applications that have geometries (cylindrical, spherical) with periodic BCs
- Some weather/climate codes for latitude-longitude coordinates
- But, Fourier expansion is not for every problem
- Solutions with steep gradients exhibit Gibbs phenomena, where the Fourier solution has strong, spurious oscillations around discontinuities
- Complicated boundaries are difficult with global expansion functions
- Some problems have coordinate singularities in Fourier expansions
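
The Gibbs phenomenon mentioned above can be seen in a few lines. The following is a minimal sketch, assuming a unit square wave as the discontinuous function; the function names are illustrative and not from the talk. It evaluates the truncated Fourier series at its first maximum next to the jump and shows that the overshoot does not shrink as more terms are added.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Partial Fourier sum of the square wave sign(sin x), using n_terms odd harmonics."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * j - 1) * x) / (2 * j - 1) for j in range(1, n_terms + 1)
    )

def gibbs_peak(n_terms):
    """Value of the partial sum at its first maximum, x = pi / (2 * n_terms)."""
    return square_wave_partial_sum(math.pi / (2 * n_terms), n_terms)

if __name__ == "__main__":
    # The exact function equals 1 here, yet the peak stays near 1.18 for every n.
    for n in (10, 100, 1000):
        print(n, round(gibbs_peak(n), 4))
```

Adding terms moves the peak closer to the discontinuity but leaves its height at roughly 9% of the jump, which is why global Fourier bases are a poor fit for solutions with steep gradients.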

DNS code and Fourier Transforms

- E.g. 3-D turbulence. A Direct Numerical Simulation (DNS) code seeks to understand the distribution of energy in density/velocity fluctuations versus wavelength for the incompressible Navier-Stokes equations, for a specified source of turbulence. Understanding turbulence is needed for efficient design of airplanes, vehicles, combustion systems, etc.
- Code written in Fortran 90 with MPI
- Time evolution: Runge-Kutta, 2nd order
- Spatial derivative calculation: pseudospectral method
- Typically, FFTs are done in all 3 dimensions
- Parallel 3D FFT: so-called transpose strategy, as opposed to direct strategy. That is, make sure all data in the direction of the 1D transform resides in one processor's memory. Parallelize over the orthogonal dimension(s).
- Data decomposition: N^3 grid points over P processors
- Originally 1D (slab) decomposition: divide one side of the cube over P, assign N/P planes to each processor. Limitation: P < N
- Currently 2D (pencil) decomposition: divide a side of the cube (N^2) over P, assign N^2/P pencils (columns) to each processor
- DNS code needs PFLOPS sustained performance to achieve turbulence scaling on a [grid size missing] grid in 40 hours
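
The transpose strategy can be illustrated serially. The sketch below is an assumption-laden toy, not the DNS code: it works in 2D rather than 3D, uses a naive O(n^2) DFT in place of an FFT library, and treats each row of a nested list as the data held by one processor. All function names are hypothetical. The point is only that every 1D transform runs along locally held data, with a transpose (an all-to-all exchange in a real MPI code) in between.

```python
import cmath

def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of one line of data."""
    n = len(values)
    return [
        sum(values[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]

def transpose(grid):
    """Swap the two axes; in a parallel code this is the all-to-all exchange."""
    return [list(col) for col in zip(*grid)]

def dft_2d_transpose(grid):
    """2D DFT in which every 1D transform runs along a locally held row."""
    rows_done = [dft_1d(row) for row in grid]      # transform along axis 1
    swapped = transpose(rows_done)                 # bring axis 0 into the rows
    cols_done = [dft_1d(row) for row in swapped]   # transform along axis 0
    return transpose(cols_done)                    # restore the original layout

def dft_2d_direct(grid):
    """Reference: the double sum evaluated directly, used only for checking."""
    n0, n1 = len(grid), len(grid[0])
    return [
        [
            sum(
                grid[a][b] * cmath.exp(-2j * cmath.pi * (a * k0 / n0 + b * k1 / n1))
                for a in range(n0)
                for b in range(n1)
            )
            for k1 in range(n1)
        ]
        for k0 in range(n0)
    ]
```

In the slab decomposition each processor would own a block of rows, so the first pass needs no communication and all of the network traffic is concentrated in the transpose.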

Issues of Fourier expansions for climate models

- Pick geometry, boundary conditions (BCs), coordinate system
- In climate/weather studies, use of latitude-longitude (and height) coordinates leads to efficient Fourier expansions in longitude and Legendre transforms in latitude. But there are issues with this approach:
- 1) Artificial singularities at the poles: singular solutions at the poles must be eliminated, which requires complicated Fourier filtering, which in turn gives poor load balancing on computers
- 2) 1D data decomposition in either the longitude or latitude dimension leads to limited scalability on computers: more processors cannot be used to reduce wall time. Furthermore, when switching between the Fourier phase and the Legendre phase, data must be re-distributed across processors via global transposes, which require very high bandwidth networks
- For these reasons the climate/weather community is moving away from Fourier/spectral methods to finite element approaches

More Detailed Models with High Resolution

Sea surface temperature (°C) on the last day of year 4 from the 1/10 degree, 42 level POP spinup simulation on Jaguar. The result of this spinup run will be used as the ocean initial condition for a fully coupled climate run.

(Mat Maltrud, LANL)

(courtesy M. Gunzberger)

New methods and scaling

- Scaling to 100,000 processors using cubed sphere
- Finite Volume with GFDL (Lin, Kerr, Putman)
- Spectral Elements with NCAR (Taylor, Nair)
- Cloud resolving icosahedral dynamical core being developed by Randall at CSU under SciDAC2

Example of perturbation theory, Hamiltonian dynamics, etc. in Magnetic Fusion

- Plasma physics is the regime of high temperature, ionized gases. At T > 100 million degrees nuclear fusion can occur. Energy of fusion reactions is released as high energy neutrons, photons, or alpha particles, depending on the type of reaction. The energy can be captured in a reactor to make electricity: the ultimate energy source
- After equilibrium and gross stability are ensured, need to reduce microturbulence-induced thermal transport to the walls so applied heating can raise the temperature. Microturbulence requires a particle description rather than a continuous fluid model
- Collisionless plasma is described by the Vlasov equation for [expression missing], in 6D phase space
- Electromagnetic field is given by Maxwell's equations, where plasma particles are the sources of charge density and current density. Highly coupled, nonlinear system
- Solving the Vlasov equation is equivalent to solving the equations of motion for millions of particles subject to applied and self-consistent forces: Particle-In-Cell (PIC)
- Ion temperature gradient instability drives microturbulence, but Tokamak fusion devices have a small parameter of [expression missing], where [expression missing] is the major radius of the torus. Perturbation analysis for small [expression missing] is used to simplify the Hamiltonian
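
In a Particle-In-Cell method, the particles act as sources for the fields through a charge deposition step onto a grid. The following is a minimal sketch of that one step, assuming a periodic 1D grid and linear (cloud-in-cell) weighting; it is not the scheme of any particular fusion code, and the function and parameter names are hypothetical.

```python
import math

def deposit_charge(positions, charge, n_cells, length):
    """Cloud-in-cell deposition of point charges onto a periodic 1D grid.

    Returns the charge density at the n_cells grid points x_i = i * dx.
    """
    dx = length / n_cells
    density = [0.0] * n_cells
    for x in positions:
        s = (x % length) / dx              # position in units of the cell size
        left = int(math.floor(s)) % n_cells
        right = (left + 1) % n_cells       # periodic wrap-around
        w_right = s - math.floor(s)        # linear weight toward the right node
        density[left] += charge * (1.0 - w_right) / dx
        density[right] += charge * w_right / dx
    return density

if __name__ == "__main__":
    rho = deposit_charge([0.25, 1.9, 3.5], charge=1.0, n_cells=4, length=4.0)
    print(rho, sum(rho) * 1.0)  # total deposited charge equals the particle charge
```

Each particle's charge is split between its two neighboring grid points, so the total charge on the grid is conserved. A full PIC cycle would follow this with a field solve and a particle push.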

Example of choosing proper coordinates

Gyrokinetic Toroidal Code (GTC)

- Tokamaks are the main candidates for fusion research today
- Plasma is contained in a toroidal (donut) shaped device. The long way around is the toroidal direction, the short way around is the poloidal direction, and the third is the minor radial direction.
- The obvious independent spatial coordinates are [expression missing]
- Better to transform from [expression missing] to poloidal flux, [expression missing]

Efficiency of Global Field-aligned Mesh

- Transform from toroidal coordinates to canonical magnetic coordinates [expression missing]
- Use perturbation analysis to find canonical coordinates with [expression missing]. Following particle motion with magnetic coordinates in the Hamiltonian straightens out [expression missing] and changes the charge deposition step of PIC from a 3D to a 2D process. This allows for a coarse grid and saves a factor of 100 in CPU time. This is huge and easily justifies a team spending a year doing analysis to get this right


Domain Decomposition

- Domain decomposition
- Each MPI process holds a toroidal section
- Each particle is assigned to a processor according to its position
- Initial memory allocation is done locally on each processor to maximize efficiency
- Communication between domains is done with MPI calls (runs on most parallel computers)
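
Assigning particles to processes by position can be sketched as follows. This is plain Python with no MPI, assuming the torus is cut into equal toroidal sections, one per rank, and that each particle carries a toroidal angle in radians. The name `zeta` for that angle and both function names are hypothetical.

```python
import math

def owner_rank(zeta, n_ranks):
    """Rank of the toroidal section containing toroidal angle zeta (radians)."""
    section = 2.0 * math.pi / n_ranks
    return int((zeta % (2.0 * math.pi)) // section) % n_ranks

def assign_particles(zetas, n_ranks):
    """Group particle indices by owning rank."""
    owned = {rank: [] for rank in range(n_ranks)}
    for index, zeta in enumerate(zetas):
        owned[owner_rank(zeta, n_ranks)].append(index)
    return owned

if __name__ == "__main__":
    print(assign_particles([0.1, 3.2, 6.2, -0.1], n_ranks=4))
```

In a running simulation the same test would be applied after each particle push, and particles whose owner has changed would be sent to the neighboring process.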

Step 2: Example of taking advantage of symmetry

- Find the eigenfunctions and eigenvalues of an operator [expression missing] in 1D geometry with periodic boundary conditions at [expression missing], but which also has periodicity length [expression missing]. The eigenvalue problem is [expression missing].
- The eigenfunctions are of the form [expression missing], where [expression missing].
- That is, rather than find eigenfunctions over the interval [expression missing], find reduced eigenfunctions over the interval [expression missing], for each value of [expression missing]. This is an example of the Floquet-Bloch theorem.
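
For reference, the usual statement of the Floquet-Bloch result for this kind of problem is sketched below. The notation is an assumption made for illustration (operator \mathcal{L}, short period a, domain length L = N a, wavenumber k), not a recovery of the slide's own symbols.

```latex
% Assumed setting: operator \mathcal{L} with coefficients of period a,
% posed on 0 \le x \le L with L = N a and periodic boundary conditions.
\mathcal{L}\,\psi(x) = \lambda\,\psi(x), \qquad \psi(x + L) = \psi(x).
% Bloch form: a plane-wave phase times a function with the short period a.
\psi_k(x) = e^{ikx}\, u_k(x), \qquad u_k(x + a) = u_k(x),
\qquad k = \frac{2\pi n}{L}, \quad n = 0, 1, \dots, N-1.
% Reduced problem on one cell 0 \le x \le a, solved once per k:
e^{-ikx}\, \mathcal{L}\, \bigl(e^{ikx} u_k(x)\bigr) = \lambda_k\, u_k(x).
```

The payoff is computational: one eigenproblem on a domain of length L is traded for N independent eigenproblems on a domain of length a.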

Step 3: Analysis in computer solution

- Many new, critical problems of interest will involve computer solution. Embrace this reality, as analysis can greatly aid implementation of computer solutions
- Do contributions from new sciences enter as source terms to old models, or are new PDEs introduced? What is the coupling at interfaces? What continuity conditions, conservation laws, and boundary conditions are appropriate?
- What are asymptotic solutions, symmetric solutions, or other first-order solutions? They can be crucial in understanding properties of the general solution space. Special solutions often can be used as sanity checks, to improve the choice of basis functions, reduce computation, or improve convergence.
- Applied math is needed to steer among the myriad of possible numerical algorithms, based on the mathematical properties of the final system of equations: methods for different types of PDEs; finite differencing versus finite elements; global eigenfunction expansions versus basis functions with compact support; direct versus iterative solvers; explicit versus implicit iterative methods; convergence and stability of iterative methods; what kind of iterative solver (multi-grid, conjugate gradient, pre-conditioners)?
- How is the problem distributed among processors? Are scalable algorithms chosen for inter-processor communication? Is the code written to allow full expression of parallelism to CPUs, including vectorization?
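
Two of the points above, using a special solution as a sanity check and choosing an iterative solver, can be combined in one small sketch. The model problem is an assumption chosen for illustration: -u'' = pi^2 sin(pi x) on (0, 1) with u(0) = u(1) = 0, whose exact solution is sin(pi x). It is discretized with second-order finite differences and solved with an unpreconditioned conjugate gradient iteration. All names are hypothetical.

```python
import math

def apply_laplacian(u, h):
    """Matrix-free product A @ u for A = tridiag(-1, 2, -1) / h^2 (Dirichlet BCs)."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / (h * h))
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def conjugate_gradient(b, h, rel_tol=1e-12, max_iter=1000):
    """Solve A x = b for the symmetric positive definite A defined above."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the zero initial guess
    p = list(r)
    rs_old = dot(r, r)
    stop = (rel_tol ** 2) * rs_old   # compare squared norms
    for _ in range(max_iter):
        if rs_old <= stop:
            break
        Ap = apply_laplacian(p, h)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

def max_error(n):
    """Largest difference between the numerical and exact solutions on n interior points."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    rhs = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    u = conjugate_gradient(rhs, h)
    return max(abs(ui - math.sin(math.pi * x)) for ui, x in zip(u, xs))
```

Halving the mesh spacing should cut `max_error` by about a factor of four, the expected second-order convergence. A result that fails this check points to a bug in the discretization or the solver before any expensive production run.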

How do university math departments view analysis?

Is there an applied math major?

- In my opinion, applied math majors should take
- Required: 2 semesters calculus-based introductory physics
- 1 semester undergraduate mechanics
- 1 semester undergraduate atomic/quantum physics
- 1 semester undergraduate numerical methods/analysis
- 1 semester programming for scientific applications, including Fortran/C/C++/Matlab
- 1 semester undergraduate electromagnetism
- 22-25 credits outside math courses
- Each of the physics courses has graduate counterparts
- Or, rather than physics, could focus on engineering fields, chemistry, biology, medicine, etc.
- Math I found most useful: calculus, vector analysis, complex variables, linear vector spaces, ODEs, PDEs, probability, calculus of variations, linear algebra

Conclusions

- Analysis and applied math are crucial components of training the professionals needed to help mankind understand the environment, achieve fusion energy, enable drugs by design, develop new materials, etc. These also keep the US on top economically and in terms of national security
- Today's efforts are more targeted at solving specific problems that are directly tied to industrial, national, or international need
- Many of today's problems are multi-disciplinary, and all require sound mathematical foundation and understanding
- Today's big problems are solved on supercomputers, and use of these machines requires solid understanding of algorithms and computer system architecture