Performance and Memory Evaluation using the TAU Performance System

1
Performance and Memory Evaluation using the TAU Performance System
Sameer Shende, Allen D. Malony, Alan Morris
University of Oregon
{sameer, malony, amorris}@cs.uoregon.edu
Holger Brunst, Wolfgang Nagel
T.U. Dresden
{holger.brunst, wolfgang.nagel}@.tu-dresden.de
MS14: Application Performance Analysis and Optimization on BlueGene/L
SIAM Parallel Processing Conference
Wed. Feb 22, 2006, Franciscan Room, 5-5:25pm
2
Outline of Talk
  • Overview of TAU
  • Instrumentation
  • Measurement
  • Analysis: ParaProf and Vampir/VNG
  • Future work and concluding remarks

3
TAU Performance System
  • Tuning and Analysis Utilities (13-year project
    effort)
  • Open-source performance system for HPC systems
  • Integrated, scalable, flexible, and parallel
  • Targets a general complex system computation
    model
  • Entities: nodes / contexts / threads
  • Multi-level: system / software / parallelism
  • Measurement and analysis abstraction
  • Integrated toolkit for performance problem
    solving
  • Instrumentation, measurement, analysis, and
    visualization
  • Portable performance profiling and tracing
    facility
  • Performance data management and data mining
  • http://www.cs.uoregon.edu/research/tau

4
Definitions: Profiling
  • Profiling
  • Recording of summary information during execution
  • inclusive and exclusive time, number of calls,
    hardware statistics, ...
  • Reflects performance behavior of program entities
  • functions, loops, basic blocks
  • user-defined semantic entities
  • Very good for low-cost performance assessment
  • Helps to expose performance bottlenecks and
    hotspots
  • Implemented through
  • sampling: periodic OS interrupts or hardware
    counter traps
  • instrumentation: direct insertion of measurement
    code
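
The difference between inclusive and exclusive time is easiest to see in a small instrumentation-based profiler. The sketch below is illustrative only: the Profiler class, its start/stop methods and the fake clock are assumptions made for this example and are not TAU's API. It keeps a stack of active timers and, when a timer stops, charges the elapsed time to the parent's "time spent in children" so the parent's exclusive time can be computed.

```python
import time
from collections import defaultdict


class Profiler:
    """Toy instrumentation-based profiler (hypothetical, not TAU's API)."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.calls = defaultdict(int)
        self.inclusive = defaultdict(float)
        self.exclusive = defaultdict(float)
        # One [name, start_time, time_spent_in_children] entry per active timer.
        self._stack = []

    def start(self, name):
        self._stack.append([name, self.clock(), 0.0])

    def stop(self):
        name, started, child_time = self._stack.pop()
        elapsed = self.clock() - started
        self.calls[name] += 1
        self.inclusive[name] += elapsed
        self.exclusive[name] += elapsed - child_time
        if self._stack:
            # The whole elapsed time counts as child time for the parent.
            self._stack[-1][2] += elapsed


def profile_demo():
    # A fake clock keeps the numbers reproducible: each read advances by 1.
    ticks = iter(range(100))
    prof = Profiler(clock=lambda: float(next(ticks)))
    prof.start("main")   # t = 0
    prof.start("work")   # t = 1
    prof.stop()          # t = 2, closes "work"
    prof.stop()          # t = 3, closes "main"
    return prof
```

Calling profile_demo() yields a profile in which "main" has more inclusive than exclusive time because "work" ran inside it. Recursive routines would need extra care with inclusive time, which this sketch does not attempt.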

5
Definitions: Tracing
  • Tracing
  • Recording of information about significant points
    (events) during program execution
  • entering/exiting a code region (function, loop,
    block, ...)
  • thread/process interactions (e.g., send/receive
    message)
  • Save information in an event record
  • timestamp
  • CPU identifier, thread identifier
  • event type and event-specific information
  • An event trace is a time-sequenced stream of event
    records
  • Can be used to reconstruct dynamic program
    behavior
  • Typically requires code instrumentation
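
As a rough illustration of how a time-sequenced stream of event records can be replayed to reconstruct program behavior, the sketch below pairs enter and exit events per thread. The Event fields follow the record contents listed above; the field names, the "enter"/"exit" strings and the sample trace are assumptions made for this example, not a real trace format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    timestamp: float
    cpu: int
    thread: int
    kind: str      # "enter" or "exit" (hypothetical event types)
    region: str    # event-specific information: the code region name


def region_intervals(trace: List[Event]) -> List[Tuple[int, str, float, float]]:
    """Replay a trace and return (thread, region, enter_time, exit_time) tuples."""
    stacks: Dict[int, List[Event]] = {}
    intervals = []
    for ev in sorted(trace, key=lambda e: e.timestamp):
        stack = stacks.setdefault(ev.thread, [])
        if ev.kind == "enter":
            stack.append(ev)
        elif ev.kind == "exit":
            opened = stack.pop()
            if opened.region != ev.region:
                raise ValueError(f"unbalanced trace at t={ev.timestamp}")
            intervals.append((ev.thread, ev.region, opened.timestamp, ev.timestamp))
    return intervals


TRACE = [
    Event(0.0, 0, 0, "enter", "main"),
    Event(1.0, 0, 0, "enter", "solve"),
    Event(4.0, 0, 0, "exit", "solve"),
    Event(5.0, 0, 0, "exit", "main"),
]
```

region_intervals(TRACE) returns the intervals in the order the regions were closed, which is the raw material a timeline display needs.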

6
TAU Parallel Performance System Goals
  • Multi-level performance instrumentation
  • Multi-language automatic source instrumentation
  • Flexible and configurable performance measurement
  • Widely-ported parallel performance profiling
    system
  • Computer system architectures and operating
    systems
  • Different programming languages and compilers
  • Support for multiple parallel programming
    paradigms
  • Multi-threading, message passing, mixed-mode,
    hybrid
  • Support for performance mapping
  • Support for object-oriented and generic
    programming
  • Integration in complex software, systems,
    applications

7
TAU Performance System Architecture
[Architecture diagram; the only label recovered from the slide is "event selection"]
8
TAU Performance System Architecture
9
Program Database Toolkit (PDT)
[Diagram of the PDT tool chain. Recoverable labels: Application / Library; C / C++ parser; Fortran parser (F77/90/95); IL (intermediate language); C / C++ IL analyzer; Fortran IL analyzer; Program Database Files; DUCTAPE. Tools shown with their roles: PDBhtml (program documentation), SILOON (application component glue), CHASM (C++ / F90/95 interoperability), TAU_instr (automatic source instrumentation).]
10
TAU Instrumentation Approach
  • Support for standard program events
  • Routines
  • Classes and templates
  • Statement-level blocks
  • Support for user-defined events
  • Begin/End events (user-defined timers)
  • Atomic events (e.g., size of memory
    allocated/freed)
  • Selection of event statistics
  • Support definition of semantic entities for
    mapping
  • Support for event groups
  • Instrumentation optimization (eliminate
    instrumentation in lightweight routines)
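
Unlike a begin/end timer, an atomic event is triggered with a value at a single point, and only summary statistics are kept. The sketch below shows one way such statistics could be accumulated; the AtomicEvent class and its trigger method are hypothetical names chosen for this example and do not reproduce TAU's API.

```python
class AtomicEvent:
    """Running statistics for a user-defined atomic event (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.total = 0.0
        self.minimum = float("inf")
        self.maximum = float("-inf")

    def trigger(self, value):
        # Each trigger folds one observed value into the summary.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


def atomic_event_demo():
    ev = AtomicEvent("malloc size (bytes)")
    for size in (64, 256, 1024):
        ev.trigger(size)
    return ev
```

Keeping only count, sum, minimum and maximum means the cost per trigger stays constant no matter how many values are observed.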

11
TAU Instrumentation
  • Flexible instrumentation mechanisms at multiple
    levels
  • Source code
  • manual (TAU API, TAU Component API)
  • automatic
  • C, C++, F77/90/95 (Program Database Toolkit
    (PDT))
  • OpenMP (directive rewriting (Opari), POMP spec)
  • Object code
  • pre-instrumented libraries (e.g., MPI using PMPI)
  • statically-linked and dynamically-linked
  • Executable code
  • dynamic instrumentation (pre-execution)
    (DynInstAPI)
  • virtual machine instrumentation (e.g., Java using
    JVMPI)
  • Proxy Components

12
Using TAU: A Tutorial
  • Configuration
  • Instrumentation
  • Manual
  • MPI: Wrapper interposition library
  • PDT: Source rewriting for C, C++, F77/90/95
  • OpenMP: Directive rewriting
  • Component-based instrumentation: Proxy
    components
  • Binary instrumentation
  • DyninstAPI: Runtime instrumentation/rewriting
    binary
  • Java: Runtime instrumentation
  • Python: Runtime instrumentation
  • Measurement
  • Performance Analysis

13
Building Bridges to Other Tools: TAU
14
TAU Performance System Interfaces
  • PDT (U. Oregon, LANL, FZJ) for instrumentation of
    C++, C99, F95 source code
  • PAPI (UTK) and PCL (FZJ) for accessing hardware
    performance counter data
  • DyninstAPI (U. Maryland, U. Wisconsin) for
    runtime instrumentation
  • KOJAK (FZJ, UTK)
  • Epilog trace generation library
  • CUBE callgraph visualizer
  • Opari OpenMP directive rewriting tool
  • Vampir/Intel Trace Analyzer (Pallas/Intel)
  • VTF3 trace generation library for Vampir (TU
    Dresden; available from the TAU website)
  • Paraver trace visualizer (CEPBA)
  • Jumpshot-4 trace visualizer (MPICH, ANL)
  • JVMPI from JDK for Java program instrumentation
    (Sun)
  • ParaProf profile browser/PerfDMF database
    supports:
  • TAU format
  • Gprof (GNU)
  • HPM Toolkit (IBM)
  • MpiP (ORNL, LLNL)
  • Dynaprof (UTK)
  • PSRun (NCSA)

15
PAPI (UTK)
  • Performance Application Programming Interface
  • The purpose of the PAPI project is to design,
    standardize and implement a portable and
    efficient API to access the hardware performance
    monitor counters found on most modern
    microprocessors.
  • Parallel Tools Consortium project
  • University of Tennessee, Knoxville
  • http://icl.cs.utk.edu/papi

16
TAU Measurement System Configuration
  • configure [OPTIONS]
  • -c++=<CC>, -cc=<cc>: Specify C++ and C
    compilers
  • -pthread, -sproc: Use pthread or SGI sproc
    threads
  • -openmp: Use OpenMP threads
  • -jdk=<dir>: Specify Java instrumentation (JDK)
  • -opari=<dir>: Specify location of Opari OpenMP
    tool
  • -papi=<dir>: Specify location of PAPI
  • -pdt=<dir>: Specify location of PDT
  • -dyninst=<dir>: Specify location of DynInst
    package
  • -mpi[inc/lib]=<dir>: Specify MPI library
    instrumentation
  • -shmem[inc/lib]=<dir>: Specify PSHMEM library
    instrumentation
  • -python[inc/lib]=<dir>: Specify Python
    instrumentation
  • -epilog=<dir>: Specify location of EPILOG
  • -slog2=<dir>: Specify location of SLOG2/Jumpshot
  • -otf=<dir>: Specify location of Open Trace Format
  • -vtf=<dir>: Specify location of VTF3 trace package
  • -arch=<architecture>: Specify architecture
    explicitly (bgl, ibm64, ibm64linux)

17
TAU Measurement System Configuration
  • configure [OPTIONS]
  • -TRACE: Generate binary TAU traces
  • -PROFILE (default): Generate profiles (summary)
  • -PROFILECALLPATH: Generate call path profiles
  • -PROFILEPHASE: Generate phase-based profiles
  • -PROFILEMEMORY: Track heap memory for each routine
  • -PROFILEHEADROOM: Track memory headroom to grow
  • -MULTIPLECOUNTERS: Use hardware counters + time
  • -COMPENSATE: Compensate timer overhead
  • -CPUTIME: Use user time + system time
  • -PAPIWALLCLOCK: Use PAPI's wallclock time
  • -PAPIVIRTUAL: Use PAPI's process virtual time
  • -SGITIMERS: Use fast IRIX timers
  • -LINUXTIMERS: Use fast x86 Linux timers

18
Using TAU on IBM BG/L
  • Configure PDT
  • configure -XLC -exec-prefix=bgl; make clean
    install
  • Use XLC compiler
  • Configure TAU for front-end
  • configure; make clean install
  • Add <taudir>/ppc64/bin/ to your path
  • Configure TAU for back-end
  • configure -arch=bgl -mpi -pdt=<dir>
    -pdt_c++=xlC
  • Use IBM's Blue Gene/L blrts_xlC compilers for
    building the library and xlC for building
    tau_instrumentor (-pdt_c++=xlC), which executes on
    the front-end.
  • Libraries are built in the <taudir>/bgl/lib/
    directory
  • Each configuration creates a unique
    <arch>/lib/Makefile.tau-<options> stub makefile
    that corresponds to the configuration options
    specified, e.g.,
  • /usr/local/tau/tau-2.15.2/bgl/lib/Makefile.tau-mpi-pdt

19
TAU_SETUP: A GUI for Installing TAU
tau-2.x> ./tau_setup
20
Configuration Parameters in Stub Makefiles
  • Each TAU stub makefile resides in the <tau>/<arch>/lib
    directory
  • Variables
  • TAU_CXX: Specify the C++ compiler used by TAU
  • TAU_CC, TAU_F90: Specify the C, F90 compilers
  • TAU_DEFS: Defines used by TAU. Add to CFLAGS
  • TAU_LDFLAGS: Linker options. Add to LDFLAGS
  • TAU_INCLUDE: Header files include path. Add to
    CFLAGS
  • TAU_LIBS: Statically linked TAU library. Add to
    LIBS
  • TAU_SHLIBS: Dynamically linked TAU library
  • TAU_MPI_LIBS: TAU's MPI wrapper library for C/C++
  • TAU_MPI_FLIBS: TAU's MPI wrapper library for F90
  • TAU_FORTRANLIBS: Must be linked in with C++ linker
    for F90
  • TAU_CXXLIBS: Must be linked in with F90 linker
  • TAU_INCLUDE_MEMORY: Use TAU's malloc/free wrapper
    lib
  • TAU_DISABLE: TAU's dummy F90 stub library
  • TAU_COMPILER: Instrument using tau_compiler.sh
    script
  • Note: Not including TAU_DEFS in CFLAGS disables
    instrumentation in C/C++ programs (TAU_DISABLE
    for F90).

21
Using TAU
  • Install TAU
  • configure; make clean install
  • Typically modify application makefile
  • Change the name of the compiler to tau_cxx.sh,
    tau_f90.sh
  • Set environment variables
  • Name of the stub makefile: TAU_MAKEFILE
  • Options passed to tau_compiler.sh: TAU_OPTIONS
  • Execute application
  • mpirun -np <procs> a.out
  • Analyze performance data
  • paraprof, vampir, paraver, jumpshot

22
Manual Instrumentation: C++ Example
#include <TAU.h>

int foo(void);

int main(int argc, char **argv)
{
  TAU_PROFILE("int main(int, char **)", " ", TAU_DEFAULT);
  TAU_PROFILE_INIT(argc, argv);
  TAU_PROFILE_SET_NODE(0); /* for sequential programs */
  foo();
  return 0;
}

int foo(void)
{
  TAU_PROFILE("int foo(void)", " ", TAU_DEFAULT); // measures entire foo()
  TAU_PROFILE_TIMER(t, "foo(): for loop", "[23:45 file.cpp]", TAU_USER);
  TAU_PROFILE_START(t);
  for (int i = 0; i < N; i++) {
    work(i);
  }
  TAU_PROFILE_STOP(t);
  // other statements in foo
  return 0;
}
23
Manual Instrumentation F90 Example
cc34567 Cubes program                          comment line
      PROGRAM SUM_OF_CUBES
        integer profiler(2)
        save profiler
        INTEGER H, T, U
        call TAU_PROFILE_INIT()
        call TAU_PROFILE_TIMER(profiler, 'PROGRAM SUM_OF_CUBES')
        call TAU_PROFILE_START(profiler)
        call TAU_PROFILE_SET_NODE(0)
      ! This program prints all 3-digit numbers that
      ! equal the sum of the cubes of their digits.
        DO H = 1, 9
          DO T = 0, 9
            DO U = 0, 9
              IF (100*H + 10*T + U == H**3 + T**3 + U**3) THEN
                PRINT "(3I1)", H, T, U
              ENDIF
            END DO
          END DO
        END DO
        call TAU_PROFILE_STOP(profiler)
      END PROGRAM SUM_OF_CUBES
24
TAU's MPI Wrapper Interposition Library
  • Uses standard MPI Profiling Interface
  • Provides name-shifted interface
  • MPI_Send = PMPI_Send
  • Weak bindings
  • Interpose TAU's MPI wrapper library between MPI
    and TAU
  • -lmpi replaced by -lTauMpi -lpmpi -lmpi
  • No change to the source code! Just re-link the
    application to generate performance data
  • setenv TAU_MAKEFILE <dir>/<arch>/lib/Makefile.tau-mpi-[options]
  • Use tau_cxx.sh, tau_f90.sh and tau_cc.sh as
    compilers
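
The name-shifted interface can be pictured as two names for the same routine, one of which a tool is allowed to take over. The sketch below imitates that idea in a single Python module: the PMPI_Send stand-in, the stats table and make_wrapper are all assumptions made for illustration, and rebinding a module-level name is only an analogy for what the linker does with weak symbols.

```python
import time
from collections import defaultdict


# Stand-in for the MPI library: the real work lives under the shifted name.
def PMPI_Send(buf, dest):
    return len(buf)  # pretend this is the number of bytes sent


# Default binding: without a tool, MPI_Send is simply PMPI_Send.
MPI_Send = PMPI_Send

stats = defaultdict(lambda: {"calls": 0, "bytes": 0, "seconds": 0.0})


def make_wrapper(name, pmpi_func):
    """Build a replacement for `name` that measures, then forwards to the PMPI entry."""
    def wrapper(buf, dest):
        start = time.perf_counter()
        try:
            return pmpi_func(buf, dest)
        finally:
            rec = stats[name]
            rec["calls"] += 1
            rec["bytes"] += len(buf)
            rec["seconds"] += time.perf_counter() - start
    return wrapper


# "Re-linking": the wrapper takes over the public name.
MPI_Send = make_wrapper("MPI_Send", PMPI_Send)


def application():
    # Application code still calls MPI_Send and knows nothing about the wrapper.
    return MPI_Send(b"hello", dest=1) + MPI_Send(b"tau", dest=2)
```

After application() runs, stats["MPI_Send"] holds the call count, byte total and elapsed time, while the application source itself was never edited.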

25
Using Program Database Toolkit (PDT)
  • Parse the program to create foo.pdb
  • cxxparse foo.cpp -I/usr/local/mydir -DMYFLAGS
  • or
  • cparse foo.c -I/usr/local/mydir -DMYFLAGS
  • or
  • f95parse foo.f90 -I/usr/local/mydir
  • f95parse *.f -omerged.pdb -I/usr/local/mydir
    -R free
  • Instrument the program
  • tau_instrumentor foo.pdb foo.f90 -o
    foo.inst.f90 -f select.tau
  • Compile the instrumented program
  • ifort foo.inst.f90 -c -I/usr/local/mpi/include -o foo.o

26
Using TAU
Step 1: Configure and install TAU
  configure -pdt=<dir> -pdt_c++=xlC -arch=bgl -mpi
  make clean; make install
  Builds <taudir>/<arch>/lib/Makefile.tau-<options>
  set path=($path <taudir>/ppc64/bin)
Step 2: Choose target stub Makefile
  setenv TAU_MAKEFILE /usr/local/tau-2.15.2/bgl/lib/Makefile.tau-mpi-pdt
  setenv TAU_OPTIONS '-optVerbose -optKeepFiles'
  (see tau_compiler.sh for all options)
Step 3: Use tau_f90.sh, tau_cxx.sh and tau_cc.sh as the F90,
C++ or C compilers respectively.
  tau_f90.sh -c app.f90
  tau_f90.sh app.o -o app -lm -lblas
Or use these in the application Makefile.
27
tau_[cxx,cc,f90].sh Improves Integration in Makefiles
# set TAU_MAKEFILE and TAU_OPTIONS env vars
CXX = tau_cxx.sh
F90 = tau_f90.sh
CFLAGS =
LIBS = -lm
OBJS = f1.o f2.o f3.o ... fn.o
app: $(OBJS)
	$(CXX) $(LDFLAGS) $(OBJS) -o $@ $(LIBS)
.cpp.o:
	$(CXX) $(CFLAGS) -c $<
28
Using Stub Makefile and TAU_COMPILER
include /usr/common/acts/TAU/tau-2.15.2/bgl/lib/Makefile.tau-mpi-pdt-trace
MYOPTIONS = -optVerbose -optKeepFiles
F90 = $(TAU_COMPILER) $(MYOPTIONS) mpxlf90
OBJS = f1.o f2.o f3.o
LIBS = -Lappdir -lapplib1 -lapplib2
app: $(OBJS)
	$(F90) $(OBJS) -o app $(LIBS)
.f90.o:
	$(F90) -c $<
29
TAU_COMPILER Options
  • Optional parameters for $(TAU_COMPILER)
    (see tau_compiler.sh -help)
  • -optVerbose: Turn on verbose debugging messages
  • -optPdtDir="": PDT architecture directory.
    Typically $(PDTDIR)/$(PDTARCHDIR)
  • -optPdtF95Opts="": Options for Fortran parser in
    PDT (f95parse)
  • -optPdtCOpts="": Options for C parser in PDT
    (cparse). Typically $(TAU_MPI_INCLUDE)
    $(TAU_INCLUDE) $(TAU_DEFS)
  • -optPdtCxxOpts="": Options for C++ parser in PDT
    (cxxparse). Typically $(TAU_MPI_INCLUDE)
    $(TAU_INCLUDE) $(TAU_DEFS)
  • -optPdtF90Parser="": Specify a different
    Fortran parser, e.g., f90parse instead of
    f95parse
  • -optPdtUser="": Optional arguments for
    parsing source code
  • -optPDBFile="": Specify merged PDB file.
    Skips parsing phase.
  • -optTauInstr="": Specify location of
    tau_instrumentor. Typically
    $(TAUROOT)/$(CONFIG_ARCH)/bin/tau_instrumentor
  • -optTauSelectFile="": Specify selective
    instrumentation file for tau_instrumentor
  • -optTau="": Specify options for
    tau_instrumentor
  • -optCompile="": Options passed to the
    compiler. Typically $(TAU_MPI_INCLUDE)
    $(TAU_INCLUDE) $(TAU_DEFS)
  • -optLinking="": Options passed to the
    linker. Typically $(TAU_MPI_FLIBS)
    $(TAU_LIBS) $(TAU_CXXLIBS)
  • -optNoMpi: Removes -lmpi libraries
    during linking (default)
  • -optKeepFiles: Does not remove
    intermediate .pdb and .inst.* files
  • e.g.,
  • setenv TAU_OPTIONS '-optTauSelectFile=select.tau
    -optVerbose -optPdtCOpts="-I/home -DFOO"'
  • tau_cxx.sh matrix.cpp -o matrix -lm

30
Optimization of Program Instrumentation
  • Need to eliminate instrumentation in frequently
    executing lightweight routines
  • Throttling of events at runtime
  • setenv TAU_THROTTLE 1
  • Disables instrumentation in routines that execute
    over 100000 times (TAU_THROTTLE_NUMCALLS) and
    take less than 10 microseconds of inclusive time
    per call (TAU_THROTTLE_PERCALL)
  • Selective instrumentation file to filter events
  • tau_instrumentor [options] -f <file>
  • Compensation of local instrumentation overhead
  • configure -COMPENSATE
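
The throttling rule described above combines two thresholds: a routine must be called more than 100000 times and cost less than 10 microseconds of inclusive time per call. The sketch below applies that rule to a made-up profile. The RoutineStats type, the function names and the sample numbers are assumptions for illustration; only the two threshold values and their environment variable names come from the slide.

```python
from dataclasses import dataclass

# Defaults quoted on the slide; the slide names TAU_THROTTLE_NUMCALLS and
# TAU_THROTTLE_PERCALL as the corresponding settings.
THROTTLE_NUMCALLS = 100_000
THROTTLE_PERCALL_USEC = 10.0


@dataclass
class RoutineStats:
    name: str
    calls: int
    inclusive_usec: float


def should_throttle(r, numcalls=THROTTLE_NUMCALLS, percall=THROTTLE_PERCALL_USEC):
    """True when a routine is both called very often and very cheap per call."""
    if r.calls <= numcalls:
        return False
    return r.inclusive_usec / r.calls < percall


def exclude_list(profile):
    """Names of routines the rule would stop instrumenting."""
    return [r.name for r in profile if should_throttle(r)]


PROFILE = [
    RoutineStats("vec_get", calls=5_000_000, inclusive_usec=2_500_000.0),  # 0.5 us/call
    RoutineStats("solve", calls=200_000, inclusive_usec=40_000_000.0),     # 200 us/call
    RoutineStats("init", calls=1, inclusive_usec=3.0),                     # cheap but rare
]
```

Both conditions must hold: "solve" is called often but is expensive per call, and "init" is cheap but rare, so only "vec_get" would be excluded here.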

31
TAU_REDUCE
  • Reads profile files and rules
  • Creates selective instrumentation file
  • Specifies which routines should be excluded from
    instrumentation

[Diagram: a profile and a rules file are the inputs to tau_reduce, which produces a selective instrumentation file]
32
Memory Profiling in TAU
  • Configuration option -PROFILEMEMORY
  • Records global heap memory utilization for each
    function
  • Takes one sample at beginning of each function
    and associates the sample with function name
  • Configuration option -PROFILEHEADROOM
  • Records headroom (amount of free memory to grow)
    for each function
  • Takes one sample at beginning of each function
    and associates it with the callstack
    (TAU_CALLPATH_DEPTH env variable)
  • Useful for debugging memory usage on IBM BG/L.
  • Independent of instrumentation/measurement
    options selected
  • No need to insert macros/calls in the source code
  • User defined atomic events appear in
    profiles/traces
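
The idea of taking one heap sample at the beginning of each function and attributing it to that function's name can be sketched with a decorator. This is an analogy only: Python's tracemalloc module stands in for the heap measurement, and track_heap_at_entry, heap_samples and the two sample functions are hypothetical names made up for this example.

```python
import functools
import tracemalloc
from collections import defaultdict

# Function name -> list of heap sizes (bytes) observed at each entry.
heap_samples = defaultdict(list)


def track_heap_at_entry(func):
    """Record the traced heap size once at the beginning of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        current, _peak = tracemalloc.get_traced_memory()
        heap_samples[func.__name__].append(current)
        return func(*args, **kwargs)
    return wrapper


@track_heap_at_entry
def build_table(n):
    return [bytearray(1024) for _ in range(n)]


@track_heap_at_entry
def summarize(table):
    return sum(len(row) for row in table)


def run():
    tracemalloc.start()
    try:
        table = build_table(1000)   # sampled before the allocation happens
        total = summarize(table)    # sampled while the table is still alive
    finally:
        tracemalloc.stop()
    return total
```

Because the sample is taken on entry, a function that allocates a lot shows up in the samples of whatever is called after it, which is worth keeping in mind when reading per-function memory values.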

33
Memory Profiling in TAU
Flash2 code profile (-PROFILEMEMORY) on IBM
BlueGene/L (MPI rank 0)
34
Memory Profiling in TAU
  • Instrumentation based observation of global heap
    memory (not per function)
  • call TAU_TRACK_MEMORY()
  • call TAU_TRACK_MEMORY_HEADROOM()
  • Triggers one sample every 10 secs
  • call TAU_TRACK_MEMORY_HERE()
  • call TAU_TRACK_MEMORY_HEADROOM_HERE()
  • Triggers sample at a specific location in source
    code
  • call TAU_SET_INTERRUPT_INTERVAL(seconds)
  • To set inter-interrupt interval for sampling
  • call TAU_DISABLE_TRACKING_MEMORY()
  • call TAU_DISABLE_TRACKING_MEMORY_HEADROOM()
  • To turn off recording memory utilization
  • call TAU_ENABLE_TRACKING_MEMORY()
  • call TAU_ENABLE_TRACKING_MEMORY_HEADROOM()
  • To re-enable tracking memory utilization

35
ParaProf Full Profile (Miranda)
8K processors!
36
ParaProf Flat Profile (Miranda)
37
ParaProf Callpath Profile (Flash)
38
Gprof Style Callpath View in Paraprof
39
TAU's ParaProf Profile Browser: Static Timers
40
ParaProf 3D Full Profile (Miranda)
16k processors
41
ParaProf Bar Plot (Zoom in/out: +/-)
42
ParaProf 3D Scatterplot (Miranda)
  • Each point is a thread of execution
  • A total of four metrics shown in relation
  • ParaVis 3D profile visualization library
  • JOGL

43
Vampir, VNG, and OTF
  • Commercial trace based tools developed at ZiH,
    T.U. Dresden
  • Wolfgang Nagel, Holger Brunst and others
  • Vampir Trace Visualizer (aka Intel Trace
    Analyzer v4.0)
  • Sequential program
  • Vampir Next Generation (VNG)
  • Client (vng) runs on a desktop, server (vngd) on
    a cluster
  • Parallel trace analysis
  • Orders of magnitude bigger traces (more memory)
  • State of the art in parallel trace visualization
  • Open Trace Format (OTF)
  • Hierarchical trace format, efficient streams
    based parallel access with VNGD
  • Replacement for proprietary formats such as STF
  • Tracing library available on IBM BG/L platform
  • Development of OTF supported by LLNL
  • http://www.vampir-ng.de and
    http://www.paratools.com/otf

44
Vampir Next Generation (VNG) Architecture
45
VNG Parallel Analysis Server
46
TAU Tracing Enhancements
  • Configure TAU with -TRACE -vtf=<dir> -otf=<dir>
    options
  • configure -TRACE -vtf=<dir>
  • configure -TRACE -otf=<dir>
  • Generates tau_merge, tau2vtf, tau2otf tools in
    <tau>/<arch>/bin directory
  • tau_f90.sh app.f90 -o app
  • Instrument and execute application: mpirun -np
    4 app
  • Merge and convert trace files to VTF3/OTF format
  • tau_treemerge.pl
  • tau2vtf tau.trc tau.edf app.vpt.gz
  • vampir app.vpt.gz
  • OR
  • tau2otf tau.trc tau.edf app.otf -n
    <numstreams>
  • vampir app.otf
  • OR use VNG to analyze OTF/VTF trace files

47
Environment Variables
  • Configure TAU with -TRACE -otf=<dir> option
  • configure -TRACE -otf=<dir> -arch=bgl -MULTIPLECOUNTERS
    -papi=<dir> -mpi -pdt=<dir> -pdt_c++=xlC
  • Set environment variables
  • setenv TRACEDIR /p/gm1/<login>/traces
  • setenv COUNTER1 GET_TIME_OF_DAY (reqd)
  • setenv COUNTER2 PAPI_FP_INS
  • setenv COUNTER3 PAPI_TOT_CYC
  • Execute application
  • srun -N8 -n16 -p pdebug ./a.out args
  • tau_treemerge.pl and tau2otf/tau2vtf

48
Using Vampir Next Generation (VNG v1.4)
49
VNG Timeline Display
50
VNG Calltree Display
51
VNG Timeline Zoomed In
52
VNG Grouping of Interprocess Communications
53
VNG Process Timeline with PAPI Counters
54
OTF/VNG Support for Counters
55
VNG Communication Matrix Display
56
VNG Message Profile
57
VNG Process Activity Chart
58
VNG Preferences
59
TAU Performance System Status
  • Computing platforms (selected)
  • IBM SP/pSeries/BGL, SGI Altix/Origin, Cray
    T3E/SV-1/X1/XT3, HP (Compaq) SC (Tru64), Sun,
    Linux clusters (IA-32/64, Alpha, PPC, PA-RISC,
    Power, Opteron), Apple (G4/5, OS X), Hitachi
    SR8000, NEC SX-5/6, Windows
  • Programming languages
  • C, C++, Fortran 77/90/95, HPF, Java, Python
  • Thread libraries (selected)
  • pthreads, OpenMP, SGI sproc, Java, Windows,
    Charm++
  • Compilers (selected)
  • Intel, PGI, GNU, Fujitsu, Sun, PathScale, SGI,
    Cray, IBM, HP, NEC, Absoft, Lahey, Nagware

60
Concluding Discussion
  • Performance tools must be used effectively
  • More intelligent performance systems for
    productive use
  • Evolve to application-specific performance
    technology
  • Deal with scale by full range performance
    exploration
  • Autonomic and integrated tools
  • Knowledge-based and knowledge-driven process
  • Performance observation methods do not
    necessarily need to change in a fundamental sense
  • More automatically controlled and efficiently used
  • Develop next-generation tools and deliver to
    community
  • Open source with support by ParaTools, Inc.
  • http://www.cs.uoregon.edu/research/tau

61
Support Acknowledgements
  • Department of Energy (DOE)
  • Office of Science contracts
  • University of Utah ASC Level 1 sub-contract
  • LLNL ASC/NNSA Level 3 contract
  • LLNL ParaTools/GWT contract
  • Argonne National Laboratory
  • Pete Beckman
  • T.U. Dresden, GWT
  • Dr. Wolfgang Nagel and Holger Brunst
  • Research Centre Juelich
  • Dr. Bernd Mohr
  • Los Alamos National Laboratory contracts