Title: I/O for Structured-Grid AMR
Phil Colella, Lawrence Berkeley National Laboratory
Coordinating PI, APDEC CET
2. Block-Structured Local Refinement (Berger and Oliger, 1984)
- Refined regions are organized into rectangular patches.
- Refinement is performed in time as well as in space (see the sketch below).
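To illustrate refinement in time, here is a minimal sketch of Berger-Oliger subcycled time stepping: each finer level takes refinement-ratio smaller steps per coarse step, then its result is averaged back onto the coarse grid. The helper names (advanceLevel, averageDown, finestLevel, nRefine) are hypothetical stand-ins, not the framework's actual API.

    // Berger-Oliger subcycling sketch. These four helpers are
    // hypothetical stand-ins for the framework's real level-advance
    // and coarse-fine synchronization operations.
    void advanceLevel(int level, double dt);
    void averageDown(int fineLevel, int crseLevel);
    int  finestLevel();
    int  nRefine(int level);   // refinement ratio, same in time and space

    void timeStep(int level, double dt)
    {
      advanceLevel(level, dt);              // regular single-grid updates
      if (level < finestLevel()) {
        int r = nRefine(level);
        for (int i = 0; i < r; ++i)
          timeStep(level + 1, dt / r);      // finer level subcycles in time
        averageDown(level + 1, level);      // overwrite coarse data covered
                                            // by fine grids
      }
    }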
3. Stakeholders
- SciDAC projects
  - Combustion, astrophysics (cf. John Bell's talk).
  - MHD for tokamaks (R. Samtaney).
  - Wakefield accelerators (W. Mori, E. Esarey).
  - AMR visualization and analytics collaboration (VACET).
  - AMR elliptic solver benchmarking / performance collaboration (PERI, TOPS).
- Other projects
  - ESL edge plasma project - 5D gridded data (LLNL, LBNL).
  - Cosmology - AMR fluids + PIC (F. Miniati, ETH).
  - Systems biology - PDE in complex geometry (A. Arkin, LBNL).
- Larger structured-grid AMR community: Norman (UCSD), Abel (SLAC), Flash (Chicago), SAMRAI (LLNL). We all talk to each other and have common requirements.
4. Chombo: a Software Framework for Block-Structured AMR
Requirement: support a wide variety of applications that use block-structured AMR with a common software framework.
- Mixed-language model: C++ for higher-level data structures, Fortran for regular single-grid calculations (see the sketch below).
- Reusable components: component design based on mapping mathematical abstractions to classes.
- Built on public-domain standards: MPI, HDF5, VTK.
Previous work: BoxLib (LBNL/CCSE), KeLP (Baden et al., UCSD), FIDIL (Hilfinger and Colella).
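A minimal sketch of the mixed-language pattern, assuming the common trailing-underscore Fortran calling convention; Chombo itself generates such bindings with its ChomboFortran (ChF) preprocessor, so the names below are illustrative, not Chombo's actual macros.

    // C++ side hands a raw rectangular array to a Fortran kernel.
    // The trailing underscore is a common (compiler-dependent) Fortran
    // name-mangling convention; relaxkernel is a hypothetical kernel.
    extern "C" void relaxkernel_(double* phi, const int* lo,
                                 const int* hi, const double* dx);

    void applySmoother(double* phi, const int lo[3], const int hi[3],
                       double dx)
    {
      // Fortran performs the regular single-grid work on the box [lo, hi].
      relaxkernel_(phi, lo, hi, &dx);
    }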
5. Layered Design
- Layer 1. Data and operations on unions of boxes: set calculus, rectangular array library (with interface to Fortran), data on unions of rectangles, with SPMD parallelism implemented by distributing boxes over processors.
- Layer 2. Tools for managing interactions between different levels of refinement in an AMR calculation: interpolation, averaging operators, coarse-fine boundary conditions.
- Layer 3. Solver libraries: AMR-multigrid solvers, Berger-Oliger time-stepping.
- Layer 4. Complete parallel applications.
- Utility layer. Support and interoperability libraries: API for HDF5 I/O, visualization package implemented on top of VTK, C APIs.
6. Distributed Data on Unions of Rectangles
Provides a general mechanism for distributing data defined on unions of rectangles onto processors, and for communication between processors.
- Metadata of which all processors have a copy: BoxLayout is a collection of Boxes and processor assignments.
- template <class T> LevelData<T> and other container classes hold data distributed over multiple processors. For each k = 1, ..., nGrids, an array of type T corresponding to the box B_k is located on processor p_k. Straightforward APIs for copying, exchanging ghost-cell data, and iterating over the arrays on your processor in an SPMD manner (see the sketch below).
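A minimal sketch of the usage pattern this implies, written against Chombo-style names (LevelData, FArrayBox, DataIterator); the exact signatures are recalled rather than confirmed and should be treated as illustrative.

    // SPMD pattern: exchange ghost cells, then iterate over the boxes
    // this processor owns. Signatures are illustrative Chombo-style.
    #include "LevelData.H"
    #include "FArrayBox.H"

    void smoothOnce(LevelData<FArrayBox>& phi)
    {
      // Fill ghost cells with neighboring boxes' data, communicating
      // between processors where necessary.
      phi.exchange();

      // Each rank visits only the arrays it owns (B_k on processor p_k).
      for (DataIterator dit = phi.dataIterator(); dit.ok(); ++dit) {
        FArrayBox& fab = phi[dit];
        // ... hand fab's contiguous storage to a Fortran kernel ...
      }
    }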
7. Typical I/O Requirements
- Loads are balanced to fill available memory on all processors.
- Typical output data size for a single time slice: 10% - 100% of the total memory image.
- Current problems scale to 100 - 1000 processors.
- Combustion and astrophysics simulations write one file per processor; other applications use the Chombo API for HDF5.
8. HDF5 I/O
- HDF5 file structure maps onto a file-system analogy:
  - disk file ↔ /
  - group ↔ subdirectory
  - attribute, dataset ↔ files
- Attribute: small metadata that multiple processes in an SPMD program can write out redundantly.
- Dataset: large data; each processor writes out only what it owns.
- Chombo API for HDF5:
  - Parallel-neutral: can change the processor layout when reading output data back in.
  - Dataset creation is expensive, so create only one dataset for each LevelData. The data for each patch is written at offsets from the origin of that dataset (see the sketch below).
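To illustrate the one-dataset-per-LevelData pattern, here is a minimal sketch (not Chombo's actual implementation) in which every rank writes its patch into a single shared dataset at a precomputed flat offset, using HDF5's parallel MPI-IO driver and the HDF5 1.8+ API. The function and argument names (writePatch, patchOffset, totalSize) are assumptions for the example.

    // One dataset per LevelData: each rank writes its own patch at an
    // offset within the shared dataset. Hypothetical sketch, assuming
    // HDF5 built with parallel (MPI-IO) support.
    #include <hdf5.h>
    #include <mpi.h>
    #include <vector>

    void writePatch(const std::vector<double>& patch, // this rank's data
                    hsize_t patchOffset,              // flat offset of patch
                    hsize_t totalSize)                // sum of all patch sizes
    {
      // All ranks open one file collectively through the MPI-IO driver.
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
      hid_t file = H5Fcreate("plot.hdf5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

      // Create a single dataset for the whole LevelData; dataset creation
      // is an expensive collective metadata operation, so do it once.
      hid_t filespace = H5Screate_simple(1, &totalSize, NULL);
      hid_t dset = H5Dcreate(file, "level_0_data", H5T_NATIVE_DOUBLE,
                             filespace, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

      // Each rank selects the hyperslab corresponding to its own patch.
      hsize_t count = patch.size();
      H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &patchOffset, NULL,
                          &count, NULL);
      hid_t memspace = H5Screate_simple(1, &count, NULL);

      // Collective write: ranks write disjoint regions of one dataset.
      hid_t xfer = H5Pcreate(H5P_DATASET_XFER);
      H5Pset_dxpl_mpio(xfer, H5FD_MPIO_COLLECTIVE);
      H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, xfer,
               patch.data());

      H5Pclose(xfer); H5Sclose(memspace); H5Sclose(filespace);
      H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
    }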
9. Performance Analysis (Shan and Shalf, 2006)
- Observed performance of HDF5 applications in Chombo: no (weak) scaling.
- More detailed measurements indicate two causes: misalignment with disk block boundaries, and lack of aggregation (standard mitigations for both are sketched below).
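Neither mitigation is named in the talk, but two standard knobs address exactly these causes and sketch what a fix might look like: HDF5's alignment property for disk-block alignment, and MPI-IO collective buffering for aggregation. The threshold and alignment values below are illustrative assumptions; real values are file-system specific.

    // Hypothetical tuning sketch for the two measured causes above.
    #include <hdf5.h>
    #include <mpi.h>

    hid_t makeTunedFapl()
    {
      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

      // Cause 1, misalignment: align objects larger than 64 KB to
      // 1 MB boundaries (example values, not recommendations).
      H5Pset_alignment(fapl, 64 * 1024, 1024 * 1024);

      // Cause 2, lack of aggregation: ask ROMIO to combine many small
      // per-patch writes into large ones at a few aggregator ranks.
      MPI_Info info;
      MPI_Info_create(&info);
      MPI_Info_set(info, "romio_cb_write", "enable");
      H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);
      MPI_Info_free(&info);

      return fapl;
    }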
10. Future Requirements
- Weak scaling to 10^4 processors.
- Need for finer time resolution will add another 10x in data.
- Other data types: sparse data, particles.
- One file per processor doesn't scale.
- Interfaces to VACET, FastBit.
11. Potential for Collaboration with SDM
- Common AMR data API developed under SciDAC I.
- The APDEC weak-scaling benchmark for solvers could be extended to I/O.
- Minimum buy-in: high-level API, portability, sustained support.