1
Requirements for an end-to-end solution for the
Center for Plasma Edge Simulation (FSP)
  • SDM AHM
  • October 5, 2005
  • Scott A. Klasky
  • ORNL

2
Perhaps not just the CPES FSP
  • Can we form the CAFÉ solution?
  • Combustion, Astrophysics, Fusion End-to-end
    framework
  • Combustion SciDAC
  • Astrophysics: TSI SciDAC
  • Fusion SciDACs (CPES, SWIM, GPS, CEMM)
  • SNS: follow closely, and try to exchange
    technology.

3
Center for Plasma Edge Simulation (a Fusion
Simulation Project SciDAC)
How can a particular plasma edge condition
dramatically improve the confinement of fusion
plasma, as observed in experiments? The physics
of the transitional edge plasma that connects the
hot core (of order 100 million degrees C, or tens
of keV) with the material walls is the subject of
this research question. 5-year goal: predict the
edge pedestal behavior for ITER and existing
devices. This question must be answered for the
success of ITER.
We are developing a testable pedestal simulation
framework which incorporates the relevant
spectrum of physics processes (e.g., transport,
kinetic and magnetohydrodynamic stability and
turbulence, flows, and atomic physics in
realistic geometry) and spans the range of plasma
parameters relevant to ITER.
Use Kepler for the end-to-end solution, with
autonomic high-performance NxM data transfers for
code coupling, code monitoring, and saving results.

[Figure: Kepler workflow diagram with components
labeled Input Files, Data Interpolation, Job
Submission, XGC-ET Simulation on leadership-class
computer, MHD Linear Stability Monitor with a
STABLE? true/false branch, Noise Monitor, M3D
Simulation, XGC-ET Compute SOL, Distributed
Storage, Portal, and Out-of-core Isosurface.
Inset: M3D simulation depicting edge localized
modes (ELMs); the pedestal region is labeled.]
  • Codes used in this project
  • XGC-ET
  • A fully kinetic PIC code which will solve
    turbulence, neoclassical, and neutral dynamics
    self-consistently.
  • High velocity-space resolution and an
    arbitrarily shaped wall are necessary to solve
    this research problem.
  • Will acquire the gyrokinetic machinery from the
    GTC code, part of the GPS SciDAC.
  • Will include DEGAS 2 for more accurate neutral
    atomic physics around the boundary.
  • M3D-edge
  • An edge-modified version of the M3D MHD/2-fluid
    code, part of the CEMM SciDAC.
  • For nonlinear MHD ELM crashes.
  • Linear solvers
  • Simple preconditioners for diagonally dominant
    systems.
  • Multigrid for scalable elliptic solves, with
    perfect weak scaling.
  • Investigation of tree-code methods (e.g., fast
    multipole) for direct calculation of
    electrostatic forces (i.e., PIC without cells).

4
Code Coupling: Forming a computational pipeline
  • 2 computers (or more)
  • 1 computer runs in batch.
  • The other system(s) are for interactive parallel
    use.
  • Security will be bypassed if we can have all
    computers at ORNL.

[Figure: coupling pipeline. A Cray XT3 runs XGC on
1,024 processors and moves 10 MB in <1 second to
an interactive cluster running MHD-L on 4
processors and to one running M3D on 32
processors; 30 GB/minute flows to an
interactive-cluster noise monitor on 80
processors.]
5
Interfaces must be designed to couple codes.
  • What variables are to be moved, and in what
    units?
  • What is the data decomposition on the sending
    side? On the receiving side?
  • InterComm (Sussman) seems very interesting (PVM)
  • Development of algorithms and techniques for
    effectively solving key problems in software
    support for coupled simulations.
  • Concentrate on three main issues:
  • Comprehensive support for determining at runtime
    what data is to be moved between simulations
  • Flexibly and efficiently determining when the
    data should be moved
  • Effectively deploying coupled simulation codes in
    a Grid computing environment.
  • A major goal is to minimize the changes that must
    be made to each individual simulation code.
  • This is accomplished by having an individual
    simulation only specify what data will be made
    available for a potential data transfer, not
    when an actual data transfer will take place.
  • Decisions about when data transfers take place
    are made through a separate coordination
    specification, generally provided by the person
    building the complete coupled simulation (see
    the sketch after this list).
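
A minimal sketch of this "declare what, coordinate
when" split, in Python. All names here (Coupler,
register_export, Coordinator, maybe_transfer) are
illustrative assumptions, not InterComm's actual API:

    class Coupler:
        def __init__(self):
            self.exports = {}

        def register_export(self, name, get_array, decomposition):
            # The simulation declares only WHAT data may be transferred
            # and how it is decomposed; it never decides WHEN.
            self.exports[name] = (get_array, decomposition)

    class Coordinator:
        # Separate coordination specification, written by the person
        # who builds the coupled simulation, e.g. "ship B_field every
        # 10 steps".
        def __init__(self, coupler, rules):
            self.coupler = coupler
            self.rules = rules        # {export name: period in steps}

        def maybe_transfer(self, step, send):
            for name, period in self.rules.items():
                if step % period == 0:
                    get_array, decomp = self.coupler.exports[name]
                    send(name, get_array(), decomp)  # NxM redistribution here

    # The only coupling hook a physics code needs at setup time:
    #   coupler.register_export("B_field", lambda: b_array, "block-2D")
    # Each step the coordinator, not the physics code, fires transfers:
    #   coordinator.maybe_transfer(step, send=network_send)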

6
Look at Mb/s, not total data sizes
  • Hawkes (SciDAC 2005)
  • INCITE calculation
  • 2,000 Seaborg processors, 2.5 million CPU-hours
    total
  • 5 TB of data: 9.3 Mb/s sustained.
  • Blondin (SciDAC 2005)
  • 4 TB in 30 hours: 310 Mb/s.
  • CPES: code coupling 1.3 Mb/s; data saving (3D)
    300 - 30(0) GB per 10 minutes. (A sanity check
    on these rates follows after this list.)
  • The future is difficult to predict for data
    generation rates.
  • Codes add more physics, which slows them down;
    algorithms speed the code up; new variables are
    generated; computers speed up.
  • This is also true for analysis of the data.
  • Do we need all of the data at all of the
    timesteps before we can analyze?
  • Can we do analysis and data movement together?
  • Analysis/visualization systems might have to be
    changed.
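
The sustained rate is just total bytes over
wall-clock time. A quick check of the figures above
(assuming the 2.5 million CPU-hours ran across 2,000
processors, i.e. about 1,250 wall-clock hours; small
gaps from the quoted numbers come down to TB-vs-TiB
conventions and the actual wall time):

    def sustained_mbps(total_bytes, wall_seconds):
        # Sustained output rate in megabits per second.
        return total_bytes * 8 / wall_seconds / 1e6

    # Hawkes: 5 TB over ~1,250 wall-clock hours
    print(sustained_mbps(5e12, 1250 * 3600))   # ~8.9 Mb/s
    # Blondin: 4 TB in 30 hours
    print(sustained_mbps(4e12, 30 * 3600))     # ~296 Mb/s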

7
What happens when the Mb/s gets too large?
  • Must understand the features in the data.
  • Use an AMR-like scheme to save the data (see the
    sketch after this list).
  • Does the data change dramatically everywhere?
  • Is the data smooth in some regions?
  • Compression techniques can save 100x, but the
    data must remain usable.
  • New viz/analysis tools?
  • Could just stitch up the grid, and use old tools.
  • Useful for level-of-detail visualization (more
    detail in regions which change).
  • Use in combination with smart data caching /
    data compression (see below).
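
A minimal sketch of the AMR-like save, assuming 2-D
block-structured data: keep full resolution only in
blocks that changed significantly since the last
save, and a single coarse value elsewhere. All the
names and thresholds are illustrative assumptions:

    import numpy as np

    def adaptive_save(curr, prev, block=16, tol=1e-3):
        # Full-resolution blocks where the field changed dramatically,
        # one coarse mean per block where it is smooth/static.
        out = {}
        for i in range(0, curr.shape[0], block):
            for j in range(0, curr.shape[1], block):
                c = curr[i:i+block, j:j+block]
                p = prev[i:i+block, j:j+block]
                if np.max(np.abs(c - p)) > tol:
                    out[(i, j)] = ("full", c.copy())
                else:
                    out[(i, j)] = ("coarse", float(c.mean()))
        return out

A reader can stitch the grid back up from the block
map, so old viz tools keep working, and fully smooth
regions compress by block*block to 1.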

8
End-to-end/workflow requirements.
  • Easy to install
  • Good examples: MPI, netCDF, HDF5, LN, bbcp
  • Easy to use
  • EnSight Gold
  • Must have value added over simple approaches.
  • Value added is discussed in the following slides.
  • Must be robust/fault tolerant.
  • The workflow cannot crash our simulations/nodes!

9
Need a data model
  • Allows the CS community to design modules which
    can understand the data.
  • Allow for netCDF, HDF5.
  • Develop interfaces to extract portions of the
    data from the files/memory.
  • Must come from the application areas teaming up
    with the CS community.
  • HDF5/netCDF is not a data model.
  • Can we use the data model in SCIRun, AVS/Express,
    or EnSight as a start?
  • Meshes (uniform, rectilinear, structured,
    unstructured).
  • Hierarchy in meshes (AMR).
  • Cell-centered, vertex-centered, edge-centered
    data.
  • Multiple variables on a mesh.
  • Can we use simple APIs in the codes which can
    write the data out? (A sketch follows this list.)
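
A minimal sketch of what such a data model and write
API could look like from inside a code; the modules
program against the model (mesh + centering +
variables), not against HDF5 or netCDF. All names
here are hypothetical:

    from dataclasses import dataclass, field
    import numpy as np

    @dataclass
    class Mesh:
        kind: str        # "uniform" | "rectilinear" | "structured" | "unstructured"
        dims: tuple
        level: int = 0   # position in an AMR hierarchy

    @dataclass
    class Dataset:
        mesh: Mesh
        variables: dict = field(default_factory=dict)

        def add(self, name, data, centering):
            # centering: "cell" | "vertex" | "edge"
            self.variables[name] = (centering, data)

    # In the simulation, independent of the file format underneath:
    ds = Dataset(Mesh("rectilinear", dims=(128, 64)))
    ds.add("temperature", np.zeros((128, 64)), centering="cell")
    # write(ds, "step0042")   # a backend would map this to HDF5/netCDF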

10
Monitoring.
  • We want to watch portions of the data from the
    simulation, as the simulation progresses.
  • Want the ability to play back from t0 to the
    current frame, i.e., snapshot movies (see the
    sketch after this list).
  • Want this information presented so that we can
    collaborate during/after the simulation.
  • Highlight parts of the data, to discuss with
    other users.
  • Draw on the figures.
  • Mostly 1D plots, some 2D (surface/contour) plots,
    some 3D plots.
  • Example: http://w3.pppl.gov/transp/ElVis/121472A03_D3D.html
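
One way to read the playback requirement, assuming
the simulation appends one output file per timestep
(the file pattern and names are illustrative): poll
for new steps, render each one, and keep every frame
so a user can replay from t0.

    import glob
    import time

    def monitor(pattern="run/ts_*.nc", render=print, poll_seconds=30):
        # Keep every rendered frame so the user can play back from t0
        # to the current frame (a snapshot movie).
        seen, frames = set(), []
        while True:
            for path in sorted(glob.glob(pattern)):
                if path not in seen:
                    seen.add(path)
                    frames.append(render(path))  # mostly 1D, some 2D/3D
            time.sleep(poll_seconds)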

11
Portal to launch workflow/monitor jobs
  • Use the portal as a front-end to the workflow.
  • Would like to see the workflow, but not monitor
    it.
  • Perhaps it will allow us to choose among
    different workflows which were created?
  • Would like to launch the workflow, and have
    automatic job submission for known
    clusters/HPC systems.
  • Submit to all, kill all when one starts running
    (see the sketch after this list).
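
A sketch of the submit-to-all policy. The submit,
status, and cancel callables are placeholders that
would wrap a real batch scheduler; none of this is
an existing API:

    import time

    def run_on_first_available(job, clusters, submit, status, cancel, poll=60):
        # Submit the same job everywhere; when one copy starts
        # running, cancel all of the others.
        job_ids = {c: submit(c, job) for c in clusters}
        while True:
            for cluster, jid in job_ids.items():
                if status(cluster, jid) == "RUNNING":
                    for other, other_jid in job_ids.items():
                        if other != cluster:
                            cancel(other, other_jid)
                    return cluster, jid
            time.sleep(poll)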

12
Users want to write their own analysis
  • Requires that they can do this in F90, C/C++, or
    Python.
  • Need wizards to allow users to describe their
    input/output.
  • Similar to AVS/Express, SCIRun, OpenDX.
  • Common scenario
  • Users want the main data field (field_in), they
    want a string (temperature), they want a
    condition (gt), and they want an output field.
    They also want this to run on their cluster with
    M processors, and to change the inputs at any
    given time. (A sketch follows this list.)
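
The scenario above as a self-describing analysis
module a wizard could generate. Only field_in, the
string "temperature", the gt condition, and M
processors come from the slide; the declaration
convention and everything else are assumptions:

    import numpy as np

    # Declared I/O, so the framework can wire the module into a workflow.
    INPUTS = {
        "field_in":  "field",    # bound by the framework to the variable
        "variable":  "string",   # named here, e.g. "temperature"
        "condition": "string",   # e.g. "gt"
        "threshold": "float",
    }
    OUTPUTS = {"field_out": "field"}
    RESOURCES = {"cluster": "user's", "processors": "M"}

    _OPS = {"gt": np.greater, "lt": np.less}

    def run(field_in, condition="gt", threshold=0.0):
        # Keep values where the condition holds; stateless, so the
        # inputs can be changed at any time while the workflow runs.
        mask = _OPS[condition](field_in, threshold)
        return np.where(mask, field_in, 0.0)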

13
Efficient Data Movement
  • On the same node
  • Use a memory reference.
  • On the same cluster
  • Use MPI communication.
  • On different clusters (NxM communication)
  • 2 approaches: memory-to-memory vs. files.
  • The file approach is not always usable.
  • It will break the solution for code-coupling
    approaches, since I/O (open/close/read/write)
    can become the bottleneck.
  • Working with Parashar/Kohl to look into the NxM
    problem.
  • Do we make this part of Kepler? (A dispatch
    sketch follows this list.)
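
The locality hierarchy above as a dispatch sketch.
The endpoint attributes and the two transfer helpers
are illustrative stubs, with the memory-to-memory
NxM path being the open research piece:

    def mpi_send(data, dest_rank):
        raise NotImplementedError    # placeholder for a real MPI send

    def nxm_stream(data, src_decomp, dst_decomp):
        raise NotImplementedError    # placeholder memory-to-memory NxM move

    def move(data, src, dst):
        # Pick the cheapest transfer mechanism for a pair of endpoints.
        if src.node == dst.node:
            return data                      # same node: memory reference
        if src.cluster == dst.cluster:
            return mpi_send(data, dst.rank)  # same cluster: MPI
        # Different clusters: NxM redistribution, memory-to-memory
        # preferred; files make I/O the bottleneck.
        return nxm_stream(data, src.decomp, dst.decomp)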

14
Distributed Data Storage - 1
  • Users do NOT want to know where their data is
    stored.
  • Users want the FASTEST possible method to get to
    their data.
  • Users seldom look at all of their data at once.
  • Usually, we look at a handful of variables at a
    time, with only a few time slices at a time.
    (We DON'T need 4 TB in a second.)
  • Users require that the solution works on their
    laptop when traveling! (Must cache results on
    local disk.)
  • Users do NOT want to change their mode of
    operation during travel.

15
Distributed Data Storage - 2
  • LN is a good example of an almost usable
    system.
  • Needs to directly understand HDF5/netCDF.
  • Needs to be able to cache information on local
    disks, and modify the eXnodes.
  • Needs to be able to work with HPSS.
  • But this is NOT enough!

16
Smart data cache
  • Users typically access their data in similar
    patterns.
  • Look at timestep 1 for variables A and B; look at
    timestep 2 for A and B; ...
  • If we know what the user wants, and when he/she
    wants it, then we can use smart technologies.
  • In a collaboration, the data access gets more
    complicated.
  • Neural networks to the rescue! (A simpler
    prediction sketch follows this list.)
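
A minimal sketch of the single-user case, using
simple repeat-the-pattern prediction rather than a
neural network (which the slide reserves for the
harder collaborative case); fetch and all the names
are assumptions:

    class SmartCache:
        # After seeing reads of (ts, A), (ts, B), prefetch the same
        # variables at timestep ts + 1.
        def __init__(self, fetch):
            self.fetch = fetch       # fetch(ts, var) -> data (slow path)
            self.cache = {}
            self.recent_vars = []

        def get(self, ts, var):
            if var not in self.recent_vars:
                self.recent_vars.append(var)
            if (ts, var) not in self.cache:
                self.cache[(ts, var)] = self.fetch(ts, var)
            # A real cache would prefetch in the background.
            for v in self.recent_vars:
                if (ts + 1, v) not in self.cache:
                    self.cache[(ts + 1, v)] = self.fetch(ts + 1, v)
            return self.cache[(ts, var)]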

17
Need data mining technology integrated into the
solution
  • We must understand the features of the data.
  • Requires a working relationship between
    application scientists and computer scientists.
  • Want to detect features on the fly (from the
    current and previous timesteps; see the sketch
    after this list).
  • Could feature-based analysis be done by the end
    of the simulation?
  • Pre-compute everything possible by the end of the
    simulation. DO NOT REQUIRE the end user to wait
    for anything that we know we want.
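
One concrete reading of on-the-fly detection from
the current and previous timesteps, sketched as a
simple change-outlier test (the criterion is an
assumption, standing in for real feature-detection
methods):

    import numpy as np

    def flag_features(curr, prev, sigma=3.0):
        # Flag cells whose change since the previous timestep is a
        # sigma-level outlier; run every step so results are ready
        # before the end user ever asks.
        delta = np.abs(curr - prev)
        return delta > delta.mean() + sigma * delta.std()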

18
Security
  • Users do NOT want to deal with this.
  • But of course, they have to.
  • Will DOE require single sign-on?
  • Can trusted sites talk to other trusted sites
    via ports being opened from A to B?
  • Will this be the death of workflow automation?
  • Can we automate data movement if we must sign on
    each time with unique passwords?

19
Conclusions.
  • We need Kepler in order for the CPES project to
    be successful.
  • We need efficient NxM data movement and
    monitoring.
  • We need to be able to provide feedback to the
    simulation(s).
  • Codes must be coupled, and we need an efficient
    mechanism to couple the data.
  • What do we do with single sign-on?
  • ORNL tells me that we can have ports open from
    one site to another without violating the
    security model. What about other sites?
  • Are we prepared for new architectures?
  • The Cray XT3 has only one small pipe out to the
    world.