1
Preconditioning Implicit Methods for Coupled Physics Problems
David Keyes
Center for Computational Science, Old Dominion University
Institute for Scientific Computing Research, Lawrence Livermore National Laboratory
2
Plan of presentation
  • Background/progress report
  • Fusion applications in the TOPS SciDAC portfolio
  • Center for Extended Magnetohydrodynamic Modeling
    (CEMM)
  • Center for Magnetic Reconnection Studies (CMRS)
  • Parallel nonlinearly implicit solvers in TOPS
  • PETSc's domain-decomposed JFNK framework
  • Hypre's multilevel preconditioners
  • What's wrong with this picture?
  • Curse of the fast waves
  • Curse of the splitting errors
  • Curse of the brute force algebraic solvers
  • Current and future needs/plans
  • Physics-based preconditioning
  • Coupling multi-physics applications

3
Acknowledgments
  • For mentoring, application codes, and some
    slides, our primary SciDAC fusion collaborators
    to date
  • Steve Jardin and Jin Chen (CEMM)
  • Amitava Bhattacharjee and Kai Germaschewski
    (CMRS)
  • For solver codes, new development, and some
    slides, our TOPS partners
  • Barry Smith and the PETSc team (ANL)
  • Rob Falgout and the Hypre team (LLNL)
  • For putting it all together
  • Florin Dobrian, TOPS project post-doc (ODU)
  • For future algorithmic philosophy
  • Xiao-Chuan Cai (CU Boulder) and Dana Knoll (LANL)
  • For support and computational resources
  • DOE SciDAC program and earlier computational math
    investments
  • NERSC and ORNL computing centers

4
Two fusion energy applications
  • Center for Magnetic Reconnection Studies (CMRS,
    based at Iowa)
  • Idealized 2D Cartesian geom., periodic BCs,
    simple FD discretization
  • Mix of primitive variables and streamfunctions
  • Sequential nonlinearly coupled explicit
    evolution, w/ 2 Poisson inversions on each
    timestep
  • Want from TOPS
  • Now: scalable Jacobian-free Newton-Krylov
    nonlinearly implicit solver for higher resolution
    in 3D (and for AMR)
  • Later: physics-based preconditioning for
    nonlinearly implicit solver
  • Center for Extended MHD Modeling (CEMM, based at
    PPPL)
  • Realistic toroidal geom., unstructured mesh,
    hybrid FE/FD discretization
  • Fields expanded in scalar potentials and
    streamfunctions
  • Operator-split, linearized, w/ 11 potential solves
    in each poloidal cross-plane/step (90% of
    execution time)
  • Parallelized w/ PETSc (Tang et al., SIAM PP'01,
    Chen et al., SIAM AN'02, Jardin et al., SIAM
    CSE'03)
  • Want from TOPS
  • Now: scalable linear implicit solver for much
    higher resolution (and for AMR)
  • Later: fully nonlinearly implicit solvers and
    coupling to other codes

5
CEMM profile from SciDAC website
Center for Extended Magnetohydrodynamic
Modeling This research project will develop
computer codes that will enable a realistic
assessment of the mechanisms leading to
disruptive and other stability limits in the
present and next generation of fusion devices.
With an improvement in the efficiency of codes
and with the extension of the leading 3D
nonlinear magneto-fluid models of hot, magnetized
fusion plasmas, this research will pioneer new
plasma simulations of unprecedented realism and
resolution. These simulations will provide new
insights into low frequency, long-wavelength
non-linear dynamics in hot magnetized plasmas,
some of the most critical and complex phenomena
in plasma and fusion science. The underlying
models will be validated through cross-code and
experimental comparisons. PI: Steve Jardin,
Princeton Plasma Physics Lab
6
CMRS profile from SciDAC website
Magnetic Reconnection Applications to Sawtooth
Oscillations, Error Field Induced Islands and the
Dynamo Effect The research goals of this project
include producing a unique high performance code
and using this code to study magnetic
reconnection in astrophysical plasmas, in smaller
scale laboratory experiments, and in fusion
devices. The modular code that will be developed
will be a fully three-dimensional, compressible
Hall MHD code with options to run in slab,
cylindrical and toroidal geometry and flexible
enough to allow change in algorithms as needed.
The code will use adaptive grid refinement, will
run on massively parallel computers, and will be
portable and scalable. The research goals include
studies that will provide increased understanding
of sawtooth oscillations in tokamaks, magnetotail
substorms, error-fields in tokamaks, reverse
field pinch dynamos, astrophysical dynamos, and
laboratory reconnection experiments. PI: Amitava
Bhattacharjee, University of Iowa
7
Summary of progress on CEMM
  • TOPS running full-up M3D code on PPPL's preferred
    NERSC Seaborg platform
  • Using up to 75 processors per poloidal
    cross-plane in weak scaling (fixed memory per
    processor); parallel concurrency can be 10-100
    times this in the toroidal direction; PETSc handles
    all aspects of the distributed data structures
  • Have integrated Hypre as preconditioner to PETSc
    and created new Hypre coarsening rules to
    accommodate a peculiarity of M3D's Dirichlet
    boundary conditions
  • 1-level additive Schwarz preconditioner has
    excellent per-iteration implementation
    efficiency, but nonscalable convergence
  • Algebraic Multigrid preconditioner has perfect
    convergence scalability, but (presently) high
    set-up time and poor set-up scalability, and
    mildly degrading per-iteration implementation
    efficiency
  • PLAN: separately tune each of the 11 different
    Poisson solves per timestep
  • PLAN: work on improving Hypre solves and
    amortizing Hypre set-ups
  • PLAN: (with APDEC team) incorporate AMR

8
Hypre's AMG in CEMM app
  • M3D's custom discretization and parallel data
    structures are dictated by geometry and physics:
    periodic toroidally, unstructured poloidally, fit
    to flux surfaces, partitioned into 3n^2 chunks
    poloidally times an arbitrary toroidal
    decomposition, with some mesh anisotropy
  • Mesh mapped and reordered using PETSc's index
    sets, solved by GMRES, preconditioned with
    Hypre's parallel algebraic MG solver of
    Ruge-Stueben type

figures c/o S. Jardin, CEMM
9
Hypre's AMG in CEMM app
  • Iteration count results below are averaged over
    19 different PETSc SLESSolve calls in the
    initialization and first timestep loop for this
    operator-split unsteady code; the abscissa is the
    number of procs in a scaled problem; problem size
    ranges from 12K to 303K unknowns (approx 4K per
    processor)

[Plot: average iteration count versus problem size / number of procs]
10
Hypre's AMG in CEMM app, cont.
  • Scaled speedup timing results below are summed
    over 19 different PETSc SLESSolve calls in the
    initialization and first timestep loop for this
    operator-split unsteady code
  • Much of the AMG cost is coarse-grid formation
    (preprocessing); in production, these coarse
    hierarchies will be saved for reuse (the same linear
    systems are called in each timestep loop), making
    AMG less expensive

[Plot: solve time versus problem size / number of procs]
11
Hypre's AMG in CEMM app, cont.
12
Hypre's AMG in CEMM app, cont.
13
Summary of progress on CMRS
  • CMRS team has provided TOPS with discretization
    of model 2D multicomponent MHD evolution code in
    PETSc's FormFunctionLocal format using DMMG and
    automatic differentiation for Jacobian objects
  • TOPS has implemented fully nonlinearly implicit
    GMRES-MG-ILU parallel solver with custom
    deflation of nullspace in CMRS's doubly periodic
    formulation
  • CMRS and TOPS reproduce the same dynamics on the
    same grids with the same time-stepping, up to a
    finite-time singularity due to collapse of
    current sheet (that falls below presently uniform
    mesh resolution)
  • TOPS code, being implicit, can choose timesteps
    an order of magnitude larger, with potential for
    higher ratio in more physically realistic
    parameter regimes, but is still slower in
    wall-clock time
  • PLAN: tune PETSc solver by profiling, blocking,
    reuse, etc.
  • PLAN: go to higher-order in time
  • PLAN: identify the numerical complexity benefits
    from implicitness (in suppressing fast
    timescales) and quantify (explicit versus
    implicit)
  • PLAN: (with APDEC team) incorporate AMR

14
2D Hall MHD sawtooth instability
(Porcelli et al., 1993, 1999)
Model equations
Equilibrium
figures c/o A. Bhattacharjee, CMRS
15
PETSc's DMMG in CMRS app
  • Mesh and time refinement studies of the CMRS Hall
    magnetic reconnection model problem (4 mesh
    sizes; dt = 0.1 (nondimensional, near the CFL limit
    for the fastest wave) on left, dt = 0.8 on right)
  • Measure of a functional inverse to the thickness
    of the current sheet versus time, for 0 < t < 200
    (nondimensional); the singularity occurs around
    t = 215

16
PETSc's DMMG in CMRS app, cont.
  • Implicit timestep increase studies of CMRS Hall
    magnetic reconnection model problem, on finest
    (192x192) mesh of previous slide, in absolute
    magnitude, rather than semi-log

17
Scope for TOPS
  • Design and implementation of solvers
  • Time integrators (w/ sens. anal.)
  • Nonlinear solvers
  • Constrained optimizers (w/ sens. anal.)
  • Linear solvers
  • Eigensolvers
  • Software integration
  • Performance optimization

18
Jacobian-Free Newton-Krylov Method
  • In the Jacobian-Free Newton-Krylov (JFNK) method
    for F(u) = 0, a Krylov method solves the linear
    Newton correction equation, requiring
    Jacobian-vector products
  • These are approximated by Fréchet derivatives,
    J(u)v ≈ [F(u + εv) - F(u)] / ε
  • so that the actual Jacobian elements are
    never explicitly needed, where ε is chosen
    with a fine balance between approximation and
    floating point rounding error (or by automatic
    differentiation)
  • Build the preconditioner using convenient
    approximations without losing Newton convergence
    in the outer loop
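
As an illustration of the Fréchet-derivative trick outside of PETSc, here is a minimal Python/SciPy sketch of one matrix-free Newton-Krylov step. The 1D reaction-diffusion residual, the choice of epsilon, and the tolerances are invented for illustration; this is not the M3D or CMRS residual, nor PETSc's implementation.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def F(u):
    # Illustrative nonlinear residual: 1D diffusion + exponential reaction
    # (a stand-in for the application's F(u), not the M3D/CMRS residual).
    r = np.empty_like(u)
    r[0], r[-1] = u[0], u[-1]                       # Dirichlet boundaries
    r[1:-1] = -(u[2:] - 2.0*u[1:-1] + u[:-2]) + np.exp(u[1:-1]) - 2.0
    return r

def jfnk_step(F, u):
    """One inexact Newton correction: solve J(u) du = -F(u) matrix-free."""
    r0 = F(u)

    def Jv(v):
        # Frechet approximation J(u) v ~ [F(u + eps*v) - F(u)] / eps, with eps
        # balancing truncation error against floating-point rounding error.
        eps = np.sqrt(np.finfo(float).eps) * max(1.0, np.linalg.norm(u)) \
              / max(np.linalg.norm(v), 1e-30)
        return (F(u + eps * v) - r0) / eps

    J = LinearOperator((u.size, u.size), matvec=Jv)   # Jacobian never formed
    du, info = gmres(J, -r0, atol=1e-8)
    return u + du

u = np.zeros(65)
for _ in range(20):                                   # outer Newton loop
    u = jfnk_step(F, u)
    if np.linalg.norm(F(u)) < 1e-10:
        break
print("||F(u)|| =", np.linalg.norm(F(u)))
```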

19
Background of PETSc Library
  • Developed at ANL to support research,
    prototyping, and production parallel solutions of
    operator equations in message-passing
    environments
  • Distributed data structures as fundamental
    objects - index sets, vectors/gridfunctions, and
    matrices/arrays
  • Iterative linear and nonlinear solvers,
    combinable modularly and recursively, and
    extensibly
  • Portable, and callable from C, C++, Fortran
  • Uniform high-level API, with multi-layered entry
  • Aggressively optimized: copies minimized,
    communication aggregated and overlapped, caches
    and registers reused, memory chunks preallocated,
    inspector-executor model for repetitive tasks
    (e.g., gather/scatter)

See http://www.mcs.anl.gov/petsc
20
User Code/PETSc Interactions
[Diagram: the main routine calls PETSc's solver layers -- Timestepping Solvers (TS), Nonlinear Solvers (SNES), and Linear Solvers (SLES) with KSP and PC -- while user code supplies application initialization, function evaluation, Jacobian evaluation, and post-processing]
21
User Code/PETSc Interactions
[Diagram: as on the previous slide, with an AD (automatic differentiation) code supplying the Jacobian evaluation]
22
Background of Hypre Library
  • Developed at LLNL to support research,
    prototyping, and production parallel solutions of
    linear equations in message-passing environments
  • Object-oriented design similar to PETSc
  • Concentrates on linear problems only
  • Richer in preconditioners than PETSc, with focus
    on algebraic multigrid, including many research
    prototypes not yet in the public release
  • Includes other preconditioners, including sparse
    approximate inverse (ParaSails) and parallel ILU
    (Euclid)

See http://www.llnl.gov/CASC/hypre/
23
Algebraic multigrid background
  • AMG assumes only information about the underlying
    matrix structure
  • AMG automatically forms coarse operators, based
    on problem data, using heuristics
  • There are many AMG algorithms, and the area is
    under active theoretical and experimental
    development

[Figure: coarse grids chosen to capture algebraically smooth error; recur, as in geometric MG]
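
The talk drives Hypre's AMG through PETSc; as a stand-alone illustration of the same idea (a Ruge-Stueben hierarchy built only from matrix entries and used to precondition GMRES), here is a hedged sketch using the PyAMG package. The Poisson test matrix, sizes, and tolerances are placeholders, not the M3D operators.

```python
import numpy as np
import pyamg                                     # algebraic multigrid package
from scipy.sparse.linalg import gmres

# Placeholder operator: a 2D Poisson matrix standing in for one of the
# poloidal-plane potential solves.
A = pyamg.gallery.poisson((200, 200), format='csr')
b = np.random.default_rng(0).standard_normal(A.shape[0])

# Classical (Ruge-Stueben) AMG hierarchy, formed automatically from the matrix.
ml = pyamg.ruge_stuben_solver(A)
print(ml)                                        # grid/operator complexity per level

# Use the hierarchy as a preconditioner for a Krylov method (GMRES + AMG, as in
# the CEMM runs above), rather than as a stand-alone solver.
M = ml.aspreconditioner(cycle='V')
x, info = gmres(A, b, M=M, atol=1e-8)
print("GMRES info:", info)

# The setup phase (coarse-grid formation) dominates the cost; reusing `ml`
# across repeated solves with the same matrix amortizes it, as noted earlier.
```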
24
Geometric multigrid: a special case
[Figure: grid hierarchy descending from the finest grid]
figure c/o hypre team, LLNL
25
Sample of Hypre's scaled efficiency
26
What's wrong with this picture?
  • Curse of the fast waves
  • MHD phenomena include waves at many different
    timescales, some of which (e.g., Whistler,
    Alfvén) are not dynamically relevant to some
    simulations but suppress the timestep of explicit
    methods due to instability
  • Curse of the splitting errors
  • Traditional operator-split approaches to
    advancing evolutionary PDEs incur
    first-order-in-time splitting errors that
    suppress the timestep of semi-implicit methods
    due to inaccuracy
  • Curse of the brute force algebraic solvers
  • Fully implicit methods to overcome timestep
    limitations demand powerful solvers, whose cost
    per iteration for large timesteps may take back
    any savings they provide by allowing larger
    timesteps for a given accuracy

27
What do we need?
  • To overcome the algebraic complexity of implicit
    solvers
  • Knowledge of the physics already present in
    traditional operator-split approaches
  • To begin to couple existing large-scale nonlinear
    physics codes
  • An algebraic framework like Schwarz methods for
    linear problems, which shows how to use
    conveniently obtained solutions to subproblems of
    the new fully coupled problem (e.g., the solver
    procedures in the original component codes) to
    arrive at the solution of the full problem,
    faster and more reliably than by nonlinear Picard
    iteration (successive substitutions) alone
  • TOPS is addressing these near-downstream issues
  • Physics-based preconditioning (work w/ D. Knoll's
    LANL group)
  • Nonlinear Schwarz (work w/ X.-C. Cai's Boulder
    group (part of TOPS))
  • Nonlinear solver interface (work w/ SciDAC CCA
    project)

28
Philosophy of Jacobian-free NK
  • To evaluate the linear residual, we use the true
    F(u) , giving a true Newton step and asymptotic
    quadratic Newton convergence
  • To precondition the linear residual, we do
    anything convenient that uses understanding of
    the dominant physics/mathematics in the system
    and respects the limitations of the parallel
    computer architecture and the cost of various
    operations
  • combinations of operator-split Jacobians (for
    reasons of physics or reasons of numerics)
  • Jacobian of related discretization (for fast
    solves)
  • Jacobian of lower-order discretization (for more
    stability, less storage)
  • Jacobian with lagged values for expensive terms
    (for less computation per degree of freedom)
  • Jacobian stored in lower precision (for less
    memory traffic per preconditioning step)
  • Jacobian blocks decomposed for parallelism
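
A hedged sketch of this philosophy using SciPy's Newton-Krylov driver rather than PETSc's SNES: the outer iteration sees the true F (finite-difference Jacobian-vector products), while the Krylov preconditioner is built from a deliberately crude, frozen approximation to the Jacobian. The model problem and the particular approximation (only the constant linear part) are invented for illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu, LinearOperator
from scipy.optimize import newton_krylov

n = 200

def F(u):
    # True nonlinear residual (illustrative 1D diffusion + cubic reaction).
    r = np.empty_like(u)
    r[0], r[-1] = u[0], u[-1]
    r[1:-1] = -(u[2:] - 2.0*u[1:-1] + u[:-2]) + u[1:-1]**3 - 5.0
    return r

# "Convenient" approximate Jacobian, for preconditioning only: just the frozen
# linear (diffusion) part, i.e. a lagged / lower-order approximation.
B = (sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
     + sp.identity(n)).tocsc()
B_lu = splu(B)                                   # factor once, reuse every iteration
M = LinearOperator((n, n), matvec=B_lu.solve)    # preconditioner action v -> B^{-1} v

# Outer loop: true F, so true Newton steps and asymptotic quadratic convergence;
# inner Krylov: finite-difference J*v products, preconditioned by M.
u = newton_krylov(F, np.zeros(n), method='gmres', inner_M=M, f_tol=1e-8)
print("||F(u)|| =", np.linalg.norm(F(u)))
```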

29
Philosophy of Jacobian-free NK, cont.
  • These motivations are not new; most large-scale
    application codes also take shortcuts on the
    approximate Jacobian operator to be inverted,
    showing physical intuition
  • The problem with many codes is that they do not
    anywhere have an accurate global Jacobian
    operator; they use only the weak Jacobian
  • This leads to a weakly nonlinearly converging
    defect correction method
  • Defect correction: iterate with a convenient
    approximate Jacobian B, solving B du = -F(u)
  • in contrast to preconditioned Newton, which solves
    the true linearized system and uses B only as a
    preconditioner, e.g. (J B^-1)(B du) = -F(u)

30
Physics-based preconditioning: example of 1D
Shallow Water Equations
  • Continuity
  • Momentum
  • Gravity wave speed
  • Typically the flow speed is much less than the
    gravity wave speed, but stability
    restrictions require timesteps based on the CFL
    criterion for the fastest wave, for an explicit
    method
  • One can solve fully implicitly, or one can filter
    out the gravity wave by solving semi-implicitly

31
Physics-based preconditioning, cont.
  • Continuity (*)
  • Momentum (**)
  • Solving (**) for the momentum unknown and
    substituting into (*) gives a parabolic equation,
  • where the parabolic coefficient involves the
    (squared) gravity wave speed and the timestep

32
Physics-based preconditioning, cont.
  • After the parabolic equation is spatially
    discretized and solved for the continuity
    variable, the momentum variable can be recovered
    from (**)
  • One scalar parabolic solve and one scalar
    explicit update replace an implicit hyperbolic
    system
  • Similar tricks are employed in aerodynamics
    (sound waves), MHD (Alfvén waves), etc.
  • Temporal truncation error remains, due to
    lagged advection, but in delta form it makes a
    great preconditioner
  • Mousseau et al. for gravity waves (generalization
    of this example)
  • Pernice et al. for acoustic waves (SIMPLE scheme)
  • Chacon et al. for Whistler waves in MHD (very
    intricate; see JCP)

33
Physics-based Preconditioning
  • When the shallow-water wave splitting is used as
    a solver, it leaves a first-order in time
    splitting error
  • In the Jacobian-free Newton-Krylov framework,
    this solver, which maps a residual into a
    correction, can be regarded as a preconditioner
  • The true Jacobian is never formed, yet the
    time-implicit nonlinear residual at each time
    step can be made as small as needed for nonlinear
    consistency in long time integrations

34
Physics-based preconditioning
  • In Newton iteration, one seeks to obtain a
    correction (delta) to the solution by applying
    the inverse of the Jacobian matrix to (the
    negative of) the nonlinear residual
  • A typical operator-split code also derives a
    delta to the solution, by some implicitly
    defined means, through a series of implicit and
    explicit substeps
  • This implicitly defined mapping from residual to
    delta is a natural preconditioner
  • TOPS Software must (and will) accommodate this!

35
1D Shallow water preconditioning
  • Define continuity residual for each timestep
  • Define momentum residual for each timestep
  • Continuity delta form (*)
  • Momentum delta form (**)

36
1D Shallow water preconditioning, cont.
  • Solving (**) for the momentum correction and
    substituting into (*) gives a parabolic equation
  • After this parabolic equation is solved for the
    continuity correction, the momentum correction
    follows by explicit update from (**)
  • This completes the application of the
    preconditioner to one Newton-Krylov iteration at
    one timestep
  • Of course, the parabolic solve need not be done
    exactly; one sweep of multigrid can be used
    (a sketch follows below)
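
To make the delta-form recipe concrete, here is a small sketch of applying such a physics-based preconditioner for a periodic 1D shallow-water discretization. The variable names (h for the continuity variable, m for the momentum), the centered differences, and the direct sparse solve standing in for one multigrid sweep are assumptions for illustration, not the Mousseau/Knoll implementation.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def sw_preconditioner(r_h, r_m, h, dt, dx, g=9.81):
    """Map residuals (r_h, r_m) to corrections (dh, dm) in delta form.

    Continuity delta form (*):   dh + dt * d/dx(dm)       = r_h
    Momentum  delta form (**):   dm + dt * g*h * d/dx(dh) = r_m   (advection lagged)
    Eliminating dm gives one parabolic equation for dh,
        (I - dt^2 * d/dx(g*h * d/dx)) dh = r_h - dt * d/dx(r_m),
    after which dm is recovered explicitly from (**).
    """
    n = h.size

    # Periodic centered first-difference operator D ~ d/dx.
    D = sp.diags([1.0, -1.0], [1, -1], shape=(n, n), format='lil')
    D[0, -1], D[-1, 0] = -1.0, 1.0
    D = (D / (2.0 * dx)).tocsr()

    # Variable-coefficient parabolic operator I - dt^2 * D (g h) D.
    A = sp.identity(n, format='csr') - dt**2 * (D @ sp.diags(g * h) @ D)

    dh = spsolve(A.tocsc(), r_h - dt * (D @ r_m))   # one scalar parabolic solve
    dm = r_m - dt * g * h * (D @ dh)                # one scalar explicit update
    return dh, dm

# Example: apply the preconditioner to one incoming Krylov residual vector.
n, dx, dt = 128, 1.0 / 128, 0.05
h = 1.0 + 0.1 * np.sin(2.0 * np.pi * dx * np.arange(n))     # background depth
rng = np.random.default_rng(1)
dh, dm = sw_preconditioner(rng.standard_normal(n), rng.standard_normal(n), h, dt, dx)
```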

37
Operator-split preconditioning
  • Subcomponents of a PDE operator often have
    special structure that can be exploited if they
    are treated separately
  • Algebraically, this is just a generalization of
    Schwarz, by term instead of by subdomain
  • Suppose the Jacobian splits as J = S + R, and a
    preconditioner for J is to be constructed, where
    S and R are each easy to invert
  • Form a preconditioned vector by applying the two
    easy inverses in succession
  • Equivalent to replacing the unsplit operator with
    its operator-split (factored) approximation
  • First-order splitting error, yet often used as a
    solver!

38
Operator-split preconditioning, cont.
  • Suppose S is convection-diffusion and R is
    reaction, among a collection of fields stored as
    gridfunctions
  • On a small regular 2D grid with a five-point
    stencil
  • R is trivially invertible in block diagonal form
  • S is invertible with one multilevel solve per
    field



[Figure: sparsity patterns of the Jacobian J, the convection-diffusion block S, and the reaction block R]
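
One concrete, hedged realization of this splitting: precondition the backward-Euler operator I + dt*(S + R) by the factored operator (I + dt*S)(I + dt*R), applying the two easy inverses in sequence inside GMRES. The two-field grid, the coefficients, and the direct factorizations (standing in for one multilevel sweep on S and a trivial pointwise block solve on R) are invented for illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu, gmres, LinearOperator

nx, nfields, dt = 32, 2, 0.1
n = nx * nx                                          # grid points per field

# S: scalar convection-diffusion (5-point stencil), applied field by field.
lap1d = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(nx, nx))
conv1d = sp.diags([-0.5, 0.5], [-1, 1], shape=(nx, nx))
S1 = -(sp.kron(sp.identity(nx), lap1d) + sp.kron(lap1d, sp.identity(nx))) \
     + sp.kron(sp.identity(nx), conv1d)
S = sp.block_diag([S1] * nfields, format='csr')

# R: reaction coupling the two fields pointwise (trivially invertible blocks).
Rblock = np.array([[2.0, -1.0], [-1.0, 2.0]])
R = sp.kron(Rblock, sp.identity(n), format='csr')    # field-major ordering

A = (sp.identity(nfields * n) + dt * (S + R)).tocsc()      # unsplit operator
MS = splu((sp.identity(nfields * n) + dt * S).tocsc())     # easy: per-field solve
MR = splu((sp.identity(nfields * n) + dt * R).tocsc())     # easy: pointwise blocks

def split_prec(v):
    # (I + dt R)^{-1} then (I + dt S)^{-1}: the operator-split approximation to
    # (I + dt (S + R))^{-1}, carrying a first-order (dt^2 S R) splitting error.
    return MS.solve(MR.solve(v))

M = LinearOperator(A.shape, matvec=split_prec)
b = np.random.default_rng(2).standard_normal(A.shape[0])
x, info = gmres(A, b, M=M, atol=1e-10)
print("GMRES info:", info, " residual:", np.linalg.norm(A @ x - b))
```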
39
Operator-split preconditioning, cont.
  • Preconditioners assembled from just the strong
    elements of the Jacobian, alternating the source
    term and the diffusion term operators, are
    competitive in convergence rates with block-ILU
    on the Jacobian
  • particularly since the decoupled scalar
    diffusion systems are amenable to simple
    multigrid treatment, which is not as trivial for
    the coupled system
  • The decoupled preconditioners store many fewer
    elements, significantly reduce memory
    bandwidth requirements, and are expected to be
    much faster per iteration when carefully
    implemented
  • See the alternative block factorization by Bank
    et al., incorporated into the SciDAC TSI solver
    by D'Azevedo

40
Nonlinear Schwarz preconditioning
  • Nonlinear Schwarz has Newton both inside and
    outside and is fundamentally Jacobian-free
  • It replaces F(u) = 0 with a new nonlinear
    system possessing the same root
  • Define a correction to the ith partition (e.g.,
    subdomain) of the solution vector by solving a
    local nonlinear system in which only the residual
    components of that partition are driven to zero
  • where the correction is nonzero only in
    the components of the ith partition
  • Then sum the corrections

41
Nonlinear Schwarz, cont.
  • It is simple to prove that if the Jacobian of
    F(u) is nonsingular in a neighborhood of the
    desired root, then the original system and the
    preconditioned system have the same unique root
  • To lead to a Jacobian-free Newton-Krylov
    algorithm we need to be able to evaluate, for any
    iterate
  • the residual of the preconditioned system
  • the Jacobian-vector product of the preconditioned
    system
  • Remarkably (Cai-Keyes, 2000), it can be shown
    that the preconditioned Jacobian action can be
    applied using only the partition (subdomain)
    Jacobians and the original Jacobian
  • All required actions are available in terms of
    the original residual F! (see the sketch below)
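
A hedged sketch of the nonlinear Schwarz idea in Python/SciPy: each partition's correction comes from a local nonlinear solve, the corrections are summed into a new residual with the same root, and an outer Jacobian-free Newton-Krylov iteration is applied to it. Unlike the Cai-Keyes ASPIN algorithm, this sketch simply lets SciPy difference the preconditioned residual for its Jacobian-vector products; the model problem, partitioning, and tolerances are invented for illustration.

```python
import numpy as np
from scipy.optimize import newton_krylov, root

def F(u):
    # Original nonlinear system (illustrative 1D diffusion + cubic reaction).
    r = np.empty_like(u)
    r[0], r[-1] = u[0], u[-1] - 1.0
    r[1:-1] = -(u[2:] - 2.0*u[1:-1] + u[:-2]) + u[1:-1]**3 - 0.5
    return r

n = 64
partitions = [np.arange(0, n // 2), np.arange(n // 2, n)]   # two nonoverlapping pieces

def local_correction(u, idx):
    """Correction supported only on partition idx, chosen so that the idx
    components of F(u + correction) vanish (a local nonlinear solve)."""
    def local_res(d_local):
        d = np.zeros_like(u)
        d[idx] = d_local
        return F(u + d)[idx]
    sol = root(local_res, np.zeros(idx.size), method='krylov',
               options={'fatol': 1e-10})
    d = np.zeros_like(u)
    d[idx] = sol.x
    return d

def F_schwarz(u):
    # Nonlinearly preconditioned residual: the sum of partition corrections.
    # It shares its root with F(u) = 0 when the Jacobian is nonsingular there.
    return sum(local_correction(u, idx) for idx in partitions)

# Outer Jacobian-free Newton-Krylov on the preconditioned system.
u = newton_krylov(F_schwarz, np.zeros(n), f_tol=1e-6)
print("||F(u)|| =", np.linalg.norm(F(u)))
```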

42
Example of nonlinear Schwarz
43
Coupling using Nonlinear Schwarz
  • Consider systems F1(u1, u2) = 0 and F2(u1, u2) = 0
  • Couple together as F(u) = 0
  • Apply nonlinear Schwarz with nonoverlapping
    partitions corresponding to original systems
  • Then
  • where
  • (gives insight into coupling
    direction/scaling)
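
For the coupling case, the same sketch specializes naturally: the two nonoverlapping partitions are the two component systems, and each component's own solver (here a generic SciPy root call standing in for a legacy code's solver) supplies its correction. All models and coefficients below are invented; only the structure follows the slide.

```python
import numpy as np
from scipy.optimize import newton_krylov, root

n = 40

def F1(u1, u2):                     # "physics 1": pointwise cubic reaction model
    return u1**3 + u1 - 1.0 - 0.1 * u2

def F2(u1, u2):                     # "physics 2": linear diffusion model
    r = -2.0 * u2 + 0.1 * u1 - 1.0
    r[1:] += u2[:-1]
    r[:-1] += u2[1:]
    return r

def coupled_F(u):                   # monolithic residual F(u) = 0
    u1, u2 = u[:n], u[n:]
    return np.concatenate([F1(u1, u2), F2(u1, u2)])

def schwarz_residual(u):
    """Nonlinear Schwarz residual whose nonoverlapping partitions are the two
    component systems: each component solver corrects only its own field."""
    u1, u2 = u[:n], u[n:]
    d1 = root(lambda v: F1(u1 + v, u2), np.zeros(n), method='krylov',
              options={'fatol': 1e-10}).x
    d2 = root(lambda v: F2(u1, u2 + v), np.zeros(n), method='krylov',
              options={'fatol': 1e-10}).x
    return np.concatenate([d1, d2])

u = newton_krylov(schwarz_residual, np.zeros(2 * n), f_tol=1e-6)
print("coupled residual:", np.linalg.norm(coupled_F(u)))
```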

44
Abstract Gantt Chart for TOPS
Each colored module represents an algorithmic
research idea on its way to becoming part of a
supported community software tool. At any moment
(vertical time slice), TOPS has work underway at
multiple levels. While some codes are already in
applications, they are being improved in
functionality and performance as part of the TOPS
research agenda.
[Gantt chart levels, bottom to top: Algorithmic Development (e.g., ASPIN); Research Implementations (e.g., TOPSLib); Hardened Codes (e.g., PETSc); Applications Integration; Dissemination -- plotted against time]
45
Summary
  • Important codes in fusion MHD are bottlenecked
    without powerful parallel implicit solvers
  • These same codes often contain valuable
    approximate solvers being employed as the only
    solver, thereby limiting timestep effectiveness
  • Coupling such codes together will only impose
    further limits on the timestep; it is likely to
    be smaller than the smallest timestep permitted
    by any component code, to accommodate higher
    levels of coupling
  • Nonlinearly implicit solvers for each component
    can leverage existing solvers as physics-based
    preconditioners, if software allows,
    strengthening the components even prior to their
    integration
  • Nonlinear versions of Schwarz contain promise for
    preconditioning coupled codes
  • Doing physics is more fun than doing driven
    cavities ☺

46
For more information ...
Come to the poster session tonight, see the Knoll
and Keyes (2002) JCP preprint at
http://www.math.odu.edu/keyes, and see
http://www.tops-scidac.org