Transcript and Presenter's Notes

Title: Challenges in Performance Evaluation and Improvement of Scientific Codes


1
Challenges in Performance Evaluation and
Improvement of Scientific Codes
  • Boyana Norris
  • Argonne National Laboratory
  • http://www.mcs.anl.gov/norris
  • Ivana Veljkovic
  • Pennsylvania State University

2
Outline
  • Performance evaluation challenges
  • Component-based approach
  • Motivating example: adaptive linear system solution
  • A component infrastructure for performance
    monitoring and adaptation of applications
  • Summary and future work

3
Acknowledgments
  • Ivana Veljkovic, Padma Raghavan (Penn State)
  • Sanjukta Bhowmick (ANL/Columbia)
  • Lois Curfman McInnes (ANL)
  • TAU developers (U. Oregon)
  • PERC members
  • Sponsors: DOE and NSF

4
Challenges in performance evaluation
  • Many tools for performance data gathering and
    analysis
  • PAPI, TAU, SvPablo, KOJAK, ...
  • Various interfaces, levels of automation, and
    approaches to information presentation
  • User's point of view:
  • What do the different tools do? Which is most
    appropriate for a given application?
  • (How) can multiple tools be used in concert?
  • I have tons of performance data, now what?
  • What automatic tuning tools are available, what
    exactly do they do?
  • How hard is it to install/learn/use tool X?
  • Is instrumented code portable? What's the overhead of instrumentation? How does code evolution affect the performance analysis process?

5
Incomplete list of tools
  • Source instrumentation: TAU/PDT, KOJAK (MPI/OpenMP), SvPablo, Performance Assertions, ...
  • Binary instrumentation: HPCToolkit, Paradyn, DyninstAPI, ...
  • Performance monitoring: MetaSim Tracer (memory), PAPI, HPCToolkit, Sigma (memory), DPOMP (OpenMP), mpiP, gprof, psrun, ... (see the PAPI sketch below)
  • Modeling/analysis/prediction: MetaSim Convolver (memory), DIMEMAS (network), SvPablo (scalability), Paradyn, Sigma, ...
  • Source/binary optimization: Automated Empirical Optimization of Software (ATLAS), OSKI, ROSE
  • Runtime adaptation: ActiveHarmony, SALSA
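As a concrete illustration of the performance-monitoring category, here is a minimal sketch of hardware-counter collection using PAPI's low-level C API (event choice and error handling are kept deliberately minimal):

    #include <stdio.h>
    #include <papi.h>

    /* Count total cycles and floating-point operations around a code region. */
    int main(void)
    {
        int eventset = PAPI_NULL;
        long long counts[2];

        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            return 1;                               /* header/library version mismatch */

        PAPI_create_eventset(&eventset);
        PAPI_add_event(eventset, PAPI_TOT_CYC);     /* total cycles */
        PAPI_add_event(eventset, PAPI_FP_OPS);      /* floating-point operations */

        PAPI_start(eventset);
        /* ... application kernel to be measured ... */
        PAPI_stop(eventset, counts);

        printf("cycles = %lld, fp ops = %lld\n", counts[0], counts[1]);
        return 0;
    }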

7
Challenges (where is the complexity?)
  • More effective use → integration
  • Tool developers' perspective:
  • Overhead of initially implementing one-to-one interoperability
  • Managing dependencies on other tools
  • Maintaining interoperability as different tools evolve
  • Individual scientist's perspective:
  • Learning curve for performance tools → less time to focus on own research (modeling, physics, mathematics)
  • Potentially significant time investment needed to find out whether/how using someone else's tool would improve performance → tend to do own hand-coded optimizations (time-consuming, non-reusable)
  • Lack of tools that automate (at least partially)
    algorithm discovery, assembly, configuration, and
    enable runtime adaptivity

8
What can be done
  • How to manage complexity? Provide:
  • Performance tools that are truly interoperable
  • Uniform, easy access to tools
  • Component implementations of software, esp.
    supporting numerical codes, such as linear
    algebra algorithms
  • New algorithms (e.g., interactive/dynamic
    techniques, algorithm composition)
  • Implementation approach: components, both for tools and the application software

9
What is being done
  • No integrated environment for performance
    monitoring, analysis, and optimization
  • Most past efforts:
  • One-to-one tool interoperability
  • More recently:
  • OSPAT (initial meeting at SC04), focused on common data representation and interfaces
  • Tool-independent performance databases: PerfDMF
  • Eclipse parallel tools project (LANL)

10
OSPAT
  • The following areas were recommended for OSPAT to investigate:
  • A common instrumentation API for source-level, compiler-level, library-level, and binary instrumentation
  • A common probe interface for routine entry and exit events (a sketch follows this list)
  • A common profile database schema
  • An API to walk the callstack and examine the heap
    memory
  • A common API for thread creation and fork
    interface
  • Visualization components for drawing histograms
    and hierarchical displays typically used by
    performance tools
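To make the probe-interface recommendation concrete, here is one possible shape for such an interface; the perf_probe_* names are hypothetical illustrations, not an existing OSPAT or TAU API:

    /* Hypothetical common probe interface for routine entry/exit events.
     * Each tool (TAU, HPCToolkit, ...) would supply its own implementation
     * behind these declarations. */
    typedef struct perf_probe perf_probe_t;      /* opaque per-routine handle */

    perf_probe_t *perf_probe_register(const char *routine_name,
                                      const char *source_file, int line);
    void perf_probe_enter(perf_probe_t *probe);  /* routine entry event */
    void perf_probe_exit(perf_probe_t *probe);   /* routine exit event  */

    /* A routine instrumented by a source- or binary-level instrumentor: */
    void solve(void)
    {
        static perf_probe_t *probe = 0;
        if (!probe)
            probe = perf_probe_register("solve", __FILE__, __LINE__);
        perf_probe_enter(probe);
        /* ... original routine body ... */
        perf_probe_exit(probe);
    }

A common interface of this kind would let source instrumenters, binary instrumenters, and measurement back-ends interoperate without one-to-one integration work.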

11
Components
  • Working definition: a component is a piece of software that can be composed with other components within a framework; composition can be either static (at link time) or dynamic (at run time); a minimal sketch follows this list
  • A plug-and-play model for building applications
  • For more info: C. Szyperski, Component Software: Beyond Object-Oriented Programming, ACM Press, New York, 1998
  • Components enable:
  • Tool interoperability
  • Automation of performance instrumentation/monitoring
  • Application adaptivity (automated or user-guided)
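A minimal sketch of what this working definition can look like in C; the component_t table and the gmres_* names are hypothetical illustrations, not the CCA specification:

    /* Hypothetical component interface: the framework composes components only
     * through this table, so implementations can be chosen at link time (static
     * composition) or loaded with dlopen at run time (dynamic composition). */
    typedef struct {
        const char *name;
        int  (*initialize)(void *config);
        int  (*execute)(void *data);
        void (*finalize)(void);
    } component_t;

    /* One concrete component: a GMRES linear solver exposing the interface. */
    static int  gmres_init(void *config) { /* read tolerances, options */ return 0; }
    static int  gmres_exec(void *data)    { /* solve the linear system  */ return 0; }
    static void gmres_done(void)          { /* release solver resources */ }

    component_t gmres_component = {
        "linear_solver.gmres", gmres_init, gmres_exec, gmres_done
    };

Because the framework sees only the component interface, a performance monitor can wrap any component's execute entry point, and an adaptive driver can swap one solver component for another without changing application code.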

12
Example component infrastructure for multimethod
linear solvers
  • Goal: provide a framework for
  • Performance monitoring of numerical components
  • Dynamic adaptivity, based on
  • Off-line analyses of past performance information
  • Online analysis of current execution performance information
  • Motivating application examples:
  • Driven cavity flow (Coffey et al., 2003), nonlinear PDE solution
  • FUN3D: incompressible and compressible Euler equations
  • Prior work in multimethod linear solvers:
  • McInnes et al., 03; Bhowmick et al., 03 and 05; Norris et al., 05

13
Example: driven cavity flow
  • Linear solver: GMRES(30); vary only the fill level of the ILU preconditioner
  • Adaptive heuristic based on:
  • Previous linear solution convergence rate, nonlinear solution convergence rate, and rate of increase of linear solution iterations (a sketch of this adaptation step follows)
  • 96x96 mesh, Grashof number 10^5, lid velocity 100
  • Intel P4 Xeon, dual 2.2 GHz, 4GB RAM
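A sketch of the kind of adaptation step the heuristic implies, using the KSP/PC interface of recent PETSc versions; the threshold logic is purely illustrative and is not the published heuristic:

    #include <petscksp.h>

    /* Called between nonlinear iterations: pick the ILU fill level for the next
     * linear solve based on how the previous solve converged.  Thresholds are
     * illustrative placeholders. */
    PetscErrorCode adapt_ilu_fill(KSP ksp, PetscInt *fill, PetscInt prev_its)
    {
        PC pc;

        if (prev_its > 40 && *fill < 3)        /* slow convergence: strengthen ILU(k)          */
            (*fill)++;
        else if (prev_its < 10 && *fill > 0)   /* fast convergence: cheapen the factorization  */
            (*fill)--;

        PetscCall(KSPGetPC(ksp, &pc));
        PetscCall(PCSetType(pc, PCILU));
        PetscCall(PCFactorSetLevels(pc, *fill));
        return 0;
    }

The iteration count from the previous solve would come from KSPGetIterationNumber after each KSPSolve.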

14
Example: compressible PETSc-FUN3D
  • Finite volume discretization, variable order Roe
    scheme on a tetrahedral, vertex-centered mesh
  • Initial discretization: first-order scheme; switch to second-order after the shock position has settled down
  • Large sparse linear system solution takes approximately 72% of the overall solution time

Original FUN3D developer: W.K. Anderson et al., NASA Langley. Image: Dinesh Kaushik
15
PETSc-FUN3D, cont.
  • A3: non-sequence-based adaptive strategy based on polynomial interpolation (Bhowmick et al., 05)
  • A3 vs. base method time: 1% slowdown to 32% improvement
  • Hand-tuned adaptive vs. base method time: 7% to 42% improvement

16
Component architecture
[Architecture diagram: the running Experiment is instrumented with TAU; a Monitor extracts performance data (and metadata via a metadata extractor), inserts it into a runtime DB and into PerfDMF for off-line analysis, can checkpoint the application, and issues adapt requests (start, stop, trigger; adapt algorithm and parameters) back to the Experiment.]
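A rough sketch of the control flow this architecture implies; every name below is a hypothetical placeholder, not an existing API:

    /* Hypothetical pieces of the monitoring/adaptation infrastructure. */
    typedef struct Experiment Experiment;                  /* the running numerical experiment */
    typedef struct RuntimeDB  RuntimeDB;                   /* runtime performance database     */
    typedef struct { double time; int iterations; } PerfRecord;

    PerfRecord tau_extract(Experiment *exp);               /* pull current TAU measurements    */
    void       runtime_db_insert(RuntimeDB *db, const PerfRecord *rec);
    int        trigger_fires(const RuntimeDB *db, const PerfRecord *rec);
    void       checkpoint(Experiment *exp);                /* save state before adapting       */
    int        choose_method(const RuntimeDB *db);         /* off-line + online analysis       */
    void       experiment_adapt(Experiment *exp, int method);
    int        experiment_done(const Experiment *exp);

    /* Monitor: watches the running experiment and issues adapt requests. */
    void monitor_loop(Experiment *exp, RuntimeDB *db)
    {
        while (!experiment_done(exp)) {
            PerfRecord rec = tau_extract(exp);              /* extract                         */
            runtime_db_insert(db, &rec);                    /* insert into runtime DB          */
            if (trigger_fires(db, &rec)) {                  /* e.g., convergence slowdown      */
                checkpoint(exp);
                experiment_adapt(exp, choose_method(db));   /* adapt algorithm or parameters   */
            }
        }
    }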
17
Future work
  • Integration of ongoing efforts in:
  • Performance tools: common interfaces and data representation (leverage OSPAT, PerfDMF, TAU performance interfaces, and similar efforts)
  • Numerical components: emerging common interfaces (e.g., TOPS solver interfaces) increase the choice of solution method → automated composition and adaptation strategies
  • Long term:
  • Is a more organized (but not too restrictive)
    environment for scientific software lifecycle
    development possible/desirable?

18
Typical application development cycle
[Diagram: cycle of stages - design; implementation (configure/make, compilation, linking, external dependencies, version control); debugging; testing; performance evaluation (performance tools); deployment; and production execution (job management, results).]
19
Future work
  • Beyond components:
  • Work flow
  • Reproducible results: associate all necessary information for reproducing a particular application instance
  • An ontology of tools, and tools to guide selection and use

20
Summary
  • No shortage of performance evaluation, analysis,
    and optimization technology (and new capabilities
    are continuously added)
  • Little shared infrastructure, limiting the
    utility of performance technology in scientific
    computing
  • Components, in both performance tools and numerical software, can be used to manage complexity and enable better performance through dynamic adaptation or multimethod solvers
  • A life-cycle environment may be the best
    long-term solution
  • Some relevant sites:
  • http://www.mcs.anl.gov/norris
  • http://perc.nersc.gov (performance tools)
  • http://cca-forum.org (component specification)