Performance Technology for Complex Parallel Systems Sameer Shende University of Oregon - PowerPoint PPT Presentation

1
Performance Technology for Complex Parallel
Systems Sameer Shende University of Oregon
2
General Problems
  • How do we create robust and ubiquitous
    performance technology for the analysis and
    tuning of parallel and distributed software and
    systems in the presence of (evolving) complexity
    challenges?
  • How do we apply performance technology
    effectively for the variety and diversity of
    performance problems that arise in the context of
    complex parallel and distributed computer systems?

3
Computation Model for Performance Technology
  • How to address dual performance technology goals?
  • Robust capabilities + widely available
    methodologies
  • Contend with problems of system diversity
  • Flexible tool composition/configuration/integration
  • Approaches
  • Restrict computation types / performance problems
  • limited performance technology coverage
  • Base technology on abstract computation model
  • general architecture and software execution
    features
  • map features/methods to existing complex system
    types
  • develop capabilities that can adapt and be
    optimized

4
General Complex System Computation Model
  • Node: physically distinct shared memory machine
  • Message passing node interconnection network
  • Context: distinct virtual memory space within
    node
  • Thread: execution threads (user/system) in context
[Figure: physical view (SMP nodes, each with node memory, joined by an interconnection network for inter-node message communication) and model view (a context is a VM space containing threads)]
5
Definitions: Profiling
  • Profiling
  • Recording of summary information during execution
  • inclusive, exclusive time, calls, hardware
    statistics, ...
  • Reflects performance behavior of program entities
  • functions, loops, basic blocks
  • user-defined semantic entities
  • Very good for low-cost performance assessment
  • Helps to expose performance bottlenecks and
    hotspots
  • Implemented through
  • sampling: periodic OS interrupts or hardware
    counter traps
  • instrumentation: direct insertion of measurement
    code
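
The difference between inclusive and exclusive time can be made concrete with a small sketch. It assumes a well-nested sequence of region enter/exit events; the `Event`, `Summary`, and `summarize` names are hypothetical and are not part of TAU or any other tool. Recursive calls would double-count inclusive time in this simplified version.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical event type for this sketch (not a real tool's API).
struct Event {
    std::string region;  // name of the function, loop, or block
    bool enter;          // true = region entry, false = region exit
    double time;         // timestamp in arbitrary units
};

struct Summary {
    long calls = 0;
    double inclusive = 0.0;  // time in the region, nested regions included
    double exclusive = 0.0;  // time in the region, nested regions excluded
};

// Fold a well-nested enter/exit sequence into per-region summaries.
std::map<std::string, Summary> summarize(const std::vector<Event>& events) {
    struct Frame { std::string region; double start; double child; };
    std::vector<Frame> stack;
    std::map<std::string, Summary> profile;
    for (const Event& e : events) {
        if (e.enter) {
            stack.push_back({e.region, e.time, 0.0});
            continue;
        }
        // An exit must match the innermost open region.
        assert(!stack.empty() && stack.back().region == e.region);
        Frame f = stack.back();
        stack.pop_back();
        const double incl = e.time - f.start;
        Summary& s = profile[e.region];
        s.calls += 1;
        s.inclusive += incl;
        s.exclusive += incl - f.child;  // subtract time spent in children
        if (!stack.empty()) stack.back().child += incl;
    }
    return profile;
}
```

For a `main` region spanning times 0 to 10 that calls `f` from 2 to 5, this yields inclusive 10 and exclusive 7 for `main`, and 3 for both measures of `f`.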

6
Definitions: Tracing
  • Tracing
  • Recording of information about significant points
    (events) during program execution
  • entering/exiting code region (function, loop,
    block, ...)
  • thread/process interactions (e.g., send/receive
    message)
  • Save information in event record
  • timestamp
  • CPU identifier, thread identifier
  • Event type and event-specific information
  • Event trace is a time-sequenced stream of event
    records
  • Can be used to reconstruct dynamic program
    behavior
  • Typically requires code instrumentation
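
As an illustration of the event record and the time-sequenced stream described above, the following sketch merges per-thread event buffers into a single trace. The record layout and the `merge_traces` helper are assumptions made for this example and do not reflect any particular trace format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record layout; field names are illustrative only.
enum class EventType { Enter, Exit, Send, Receive };

struct EventRecord {
    double timestamp;
    int cpu;
    int thread;
    EventType type;
    std::int64_t info;  // event-specific data, e.g. region id or message tag
};

inline bool by_time(const EventRecord& a, const EventRecord& b) {
    return a.timestamp < b.timestamp;
}

// Merge per-thread buffers into one stream ordered by timestamp.
std::vector<EventRecord> merge_traces(
        const std::vector<std::vector<EventRecord>>& buffers) {
    std::vector<EventRecord> trace;
    for (const auto& b : buffers) {
        // Each local buffer is expected to be in nondecreasing time order.
        assert(std::is_sorted(b.begin(), b.end(), by_time));
        trace.insert(trace.end(), b.begin(), b.end());
    }
    // stable_sort keeps the original order of records with equal timestamps.
    std::stable_sort(trace.begin(), trace.end(), by_time);
    return trace;
}
```

This sketch assumes all timestamps share one clock; in practice clocks on different nodes may need adjustment before merging.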

7
Event Tracing: Instrumentation, Monitor, Trace
[Figure labels: event definition, timestamp, CPU A, CPU B, MONITOR]
8
Event Tracing: Timeline Visualization
[Figure labels: main, master, slave]
9
TAU Performance System Framework
  • Tuning and Analysis Utilities
  • Performance system framework for scalable
    parallel and distributed high-performance
    computing
  • Targets a general complex system computation
    model
  • nodes / contexts / threads
  • Multi-level system / software / parallelism
  • Measurement and analysis abstraction
  • Integrated toolkit for performance
    instrumentation, measurement, analysis, and
    visualization
  • Portable performance profiling/tracing facility
  • Open software approach

10
TAU Performance System Architecture
11
Levels of Code Transformation
  • As program information flows through stages of
    compilation/linking/execution, different
    information is accessible at different stages
  • Each level poses different constraints and
    opportunities for extracting information
  • At what level should performance instrumentation
    be done?

12
TAU Instrumentation
  • Flexible instrumentation mechanisms at multiple
    levels
  • Source code
  • manual
  • automatic using Program Database Toolkit (PDT),
    OPARI
  • Object code
  • pre-instrumented libraries (e.g., MPI using PMPI)
  • statically linked
  • dynamically linked (e.g., Virtual machine
    instrumentation)
  • fast breakpoints (compiler generated)
  • Executable code
  • dynamic instrumentation (pre-execution) using
    DynInstAPI

13
TAU Instrumentation (continued)
  • Targets common measurement interface (TAU API)
  • Object-based design and implementation
  • Macro-based, using constructor/destructor
    techniques
  • Program units: functions, classes, templates,
    blocks
  • Uniquely identify functions and templates
  • name and type signature (name registration)
  • static object creates performance entry
  • dynamic object receives static object pointer
  • runtime type identification for template
    instantiations
  • C and Fortran instrumentation variants
  • Instrumentation and measurement optimization
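
The constructor/destructor technique with a static and a dynamic object can be sketched roughly as follows. This is only an illustration of the general idea: `sketch::FunctionEntry`, `sketch::ScopedTimer`, and `SKETCH_PROFILE` are hypothetical names, not TAU's classes or macros, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

namespace sketch {  // hypothetical names, not TAU's implementation

struct FunctionEntry {  // "static object": one per instrumented program unit
    std::string name, type;
    long calls = 0;
    double inclusive_us = 0.0;
    FunctionEntry(const std::string& n, const std::string& t);
};

// Registry keyed by name plus type signature, so overloads stay distinct.
inline std::map<std::string, FunctionEntry*>& registry() {
    static std::map<std::string, FunctionEntry*> r;
    return r;
}

inline FunctionEntry::FunctionEntry(const std::string& n, const std::string& t)
    : name(n), type(t) {
    registry()[n + " " + t] = this;  // name registration
}

class ScopedTimer {  // "dynamic object": one per activation
public:
    explicit ScopedTimer(FunctionEntry* e)
        : entry_(e), start_(std::chrono::steady_clock::now()) {
        assert(entry_ != nullptr);
    }
    ~ScopedTimer() {  // runs on every exit path from the scope
        std::chrono::duration<double, std::micro> d =
            std::chrono::steady_clock::now() - start_;
        entry_->calls += 1;
        entry_->inclusive_us += d.count();
    }
private:
    FunctionEntry* entry_;
    std::chrono::steady_clock::time_point start_;
};

}  // namespace sketch

// The static entry is created once; the timer is created on each call.
#define SKETCH_PROFILE(name, type)                          \
    static sketch::FunctionEntry sketch_entry_(name, type); \
    sketch::ScopedTimer sketch_timer_(&sketch_entry_)

// Example of an instrumented function.
inline int square(int x) {
    SKETCH_PROFILE("square", "int (int)");
    return x * x;
}
```

Because the timer stops in a destructor, early returns and exceptions are measured without extra instrumentation at each exit point.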

14
Multi-Level Instrumentation
  • Uses multiple instrumentation interfaces
  • Shares information: cooperation between
    interfaces
  • Taps information at multiple levels
  • Provides selective instrumentation at each level
  • Targets a common performance model
  • Presents a unified view of execution

15
Program Database Toolkit (PDT)
  • Program code analysis framework for developing
    source-based tools
  • High-level interface to source code information
  • Integrated toolkit for source code parsing,
    database creation, and database query
  • commercial grade front end parsers
  • portable IL analyzer, database format, and access
    API
  • open software approach for tool development
  • Target and integrate multiple source languages
  • Use in TAU to build automated performance
    instrumentation tools

16
PDT Architecture and Tools
[Figure labels: C/C++, Fortran 77/90]
17
PDT Components
  • Language front end
  • Edison Design Group (EDG): C, C++, Java
  • Mutek Solutions Ltd.: F77, F90
  • creates an intermediate-language (IL) tree
  • IL Analyzer
  • processes the intermediate language (IL) tree
  • creates program database (PDB) formatted file
  • DUCTAPE (Bernd Mohr, ZAM, Germany)
  • C++ program Database Utilities and Conversion
    Tools APplication Environment
  • processes and merges PDB files
  • C++ library to access the PDB for PDT applications

18
TAU Measurement
  • Performance information
  • High-resolution timer library (real-time /
    virtual clocks)
  • General software counter library (user-defined
    events)
  • Hardware performance counters
  • PCL (Performance Counter Library) (ZAM, Germany)
  • PAPI (Performance API) (UTK, Ptools Consortium)
  • consistent, portable API
  • Organization
  • Node, context, thread levels
  • Profile groups for collective events (runtime
    selective)
  • Performance data mapping between software levels

19
TAU Measurement (continued)
  • Parallel profiling
  • Function-level, block-level, statement-level
  • Supports user-defined events
  • TAU parallel profile database
  • Function callstack
  • Hardware counter values (in place of time)
  • Tracing
  • All profile-level events
  • Inter-process communication events
  • Timestamp synchronization
  • User-configurable measurement library (user
    controlled)

20
TAU Measurement System Configuration
  • configure [OPTIONS]
  • -c++=<CC>, -cc=<cc> Specify C++ and C
    compilers
  • -pthread, -sproc Use pthread or SGI sproc
    threads
  • -openmp Use OpenMP threads
  • -jdk=<dir> Specify location of Java Dev. Kit
  • -opari=<dir> Specify location of Opari OpenMP
    tool
  • -pcl=<dir>, -papi=<dir> Specify location of PCL or
    PAPI
  • -pdt=<dir> Specify location of PDT
  • -dyninst=<dir> Specify location of DynInst
    Package
  • -mpiinc=<d>, -mpilib=<d> Specify MPI library
    instrumentation
  • -TRACE Generate TAU event traces
  • -PROFILE Generate TAU profiles
  • -CPUTIME Use user time + system time
  • -PAPIWALLCLOCK Use PAPI to access wallclock time
  • -PAPIVIRTUAL Use PAPI for virtual (user) time

21
TAU Measurement Configuration Examples
  • ./configure -c++=xlC -cc=xlc
    -pdt=/usr/packages/pdtoolkit-2.1 -pthread
  • Use TAU with IBM's xlC compiler, PDT and the
    pthread library
  • Enable TAU profiling (default)
  • ./configure -TRACE -PROFILE
  • Enable both TAU profiling and tracing
  • ./configure -c++=guidec++ -cc=guidec
    -papi=/usr/local/packages/papi -openmp
    -mpiinc=/usr/packages/mpich/include
    -mpilib=/usr/packages/mpich/lib
  • Use OpenMP+MPI using KAI's Guide compiler suite
    and use PAPI for accessing hardware performance
    counters for measurements
  • Typically configure multiple measurement libraries

22
TAU Measurement API
  • Initialization and runtime configuration
  • TAU_PROFILE_INIT(argc, argv);
    TAU_PROFILE_SET_NODE(myNode);
    TAU_PROFILE_SET_CONTEXT(myContext);
    TAU_PROFILE_EXIT(message);
    TAU_REGISTER_THREAD();
  • Function and class methods
  • TAU_PROFILE(name, type, group);
  • Template
  • TAU_TYPE_STRING(variable, type);
    TAU_PROFILE(name, type, group);
    CT(variable);
  • User-defined timing
  • TAU_PROFILE_TIMER(timer, name, type, group);
    TAU_PROFILE_START(timer);
    TAU_PROFILE_STOP(timer);

23
Compiling: TAU Makefiles
  • Include TAU Makefile in the user's Makefile.
  • Variables
  • TAU_CXX Specify the C++ compiler
  • TAU_CC Specify the C compiler used by TAU
  • TAU_DEFS Defines used by TAU. Add to CFLAGS
  • TAU_LDFLAGS Linker options. Add to LDFLAGS
  • TAU_INCLUDE Header files include path. Add to
    CFLAGS
  • TAU_LIBS Statically linked TAU library. Add to
    LIBS
  • TAU_SHLIBS Dynamically linked TAU library
  • TAU_MPI_LIBS TAU's MPI wrapper library for C/C++
  • TAU_MPI_FLIBS TAU's MPI wrapper library for F90
  • TAU_FORTRANLIBS Must be linked in with C++ linker
    for F90.
  • Note: Not including TAU_DEFS in CFLAGS disables
    instrumentation in C/C++ programs.

24
Including TAU Makefile - Example
include /usr/tau/sgi64/lib/Makefile.tau-pthread-kcc
CXX = $(TAU_CXX)
CC = $(TAU_CC)
CFLAGS = $(TAU_DEFS)
LIBS = $(TAU_LIBS)
OBJS = ...
TARGET = a.out
$(TARGET): $(OBJS)
	$(CXX) $(LDFLAGS) $(OBJS) -o $@ $(LIBS)
.cpp.o:
	$(CC) $(CFLAGS) -c $< -o $@
25
TAU Makefile for PDT
include /usr/tau/include/Makefile
CXX = $(TAU_CXX)
CC = $(TAU_CC)
PDTPARSE = $(PDTDIR)/$(CONFIG_ARCH)/bin/cxxparse
TAUINSTR = $(TAUROOT)/$(CONFIG_ARCH)/bin/tau_instrumentor
CFLAGS = $(TAU_DEFS)
LIBS = $(TAU_LIBS)
OBJS = ...
TARGET = a.out
$(TARGET): $(OBJS)
	$(CXX) $(LDFLAGS) $(OBJS) -o $@ $(LIBS)
.cpp.o:
	$(PDTPARSE) $<
	$(TAUINSTR) $*.pdb $< -o $*.inst.cpp
	$(CC) $(CFLAGS) -c $*.inst.cpp -o $@
26
Setup: Running Applications
setenv PROFILEDIR /home/data/experiments/profile/01
setenv TRACEDIR /home/data/experiments/trace/01
set path=($path <taudir>/<arch>/bin)
setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH\:<taudir>/<arch>/lib
For PAPI/PCL:
setenv PAPI_EVENT PAPI_FP_INS
setenv PCL_EVENT PCL_FP_INSTR
For Java (without instrumentation):
java application
With instrumentation:
java -XrunTAU application
java -XrunTAU:exclude=sun/io,java application
For DyninstAPI:
a.out
tau_run a.out
tau_run -XrunTAUsh-papi a.out
27
TAU Analysis
  • Profile analysis
  • pprof
  • parallel profiler with text-based display
  • racy
  • graphical interface to pprof (Tcl/Tk)
  • jracy
  • Java implementation of Racy
  • Trace analysis and visualization
  • Trace merging and clock adjustment (if necessary)
  • Trace format conversion (ALOG, SDDF, Vampir)
  • Vampir (Pallas) trace visualization

28
Pprof Command
  • pprof [-c|-b|-m|-t|-e|-i] [-r] [-s] [-n num]
    [-f file] [-l] [nodes]
  • -c Sort according to number of calls
  • -b Sort according to number of subroutines called
  • -m Sort according to msecs (exclusive time total)
  • -t Sort according to total msecs (inclusive time
    total)
  • -e Sort according to exclusive time per call
  • -i Sort according to inclusive time per call
  • -v Sort according to standard deviation
    (exclusive usec)
  • -r Reverse sorting order
  • -s Print only summary profile information
  • -n num Print only first number of functions
  • -f file Specify full path and filename without
    node ids
  • -l List all functions and exit

29
Pprof Output (NAS Parallel Benchmark LU)
  • Intel Quad PIII Xeon, RedHat, PGI F90
  • F90 + MPICH
  • Profile for Node Context Thread
  • Application events and MPI events

30
jRacy (NAS Parallel Benchmark LU)
Routine profile across all nodes
Global profiles
n: node, c: context, t: thread
Individual profile
31
Vampir Trace Visualization Tool
  • Visualization and Analysis of MPI Programs
  • Originally developed by Forschungszentrum Jülich
  • Current development by Technical University
    Dresden
  • Distributed by PALLAS, Germany
  • http://www.pallas.de/pages/vampir.htm

32
Vampir (NAS Parallel Benchmark LU)
Callgraph display
Timeline display
Parallelism display
Communications display
33
Case Study: Hybrid Computation (OpenMP + MPI)
  • Portable hybrid parallel programming
  • OpenMP for shared memory parallel programming
  • Fork-join model
  • Loop level parallelism
  • MPI for cross-box message-based parallelism
  • OpenMP performance measurement
  • Interface to OpenMP runtime system (RTS events)
  • Compiler support and integration
  • 2D Stommel model of ocean circulation
  • Jacobi iteration, 5-point stencil
  • Timothy Kaiser (San Diego Supercomputer Center)

34
OpenMP Instrumentation
  • OPARI FZJ, Germany
  • OpenMP Pragma And Region Instrumentor (OPARI)
  • Source-to-Source translator to insert POMP calls
    around OpenMP constructs and API functions
  • POMP
  • OpenMP Directive Instrumentation
  • OpenMP Runtime Library Routine Instrumentation
  • Performance Monitoring Library Control
  • User Code Instrumentation
  • Context Descriptors
  • Conditional Compilation
  • Conditional / Selective Transformations

35
Example: !$OMP PARALLEL DO Instrumentation
Original code:
!$OMP PARALLEL DO clauses...
      do loop
!$OMP END PARALLEL DO
Instrumented code:
      call pomp_parallel_fork(d)
!$OMP PARALLEL other-clauses...
      call pomp_parallel_begin(d)
      call pomp_do_enter(d)
!$OMP DO schedule-clauses, ordered-clauses, lastprivate-clauses
      do loop
!$OMP END DO NOWAIT
      call pomp_barrier_enter(d)
!$OMP BARRIER
      call pomp_barrier_exit(d)
      call pomp_do_exit(d)
      call pomp_parallel_end(d)
!$OMP END PARALLEL
      call pomp_parallel_join(d)

36
Tracing Hybrid Executions: TAU and Vampir
37
Profiling Hybrid Executions
38
Case Study: Utah ASCI/ASAP Level 1 Center
  • C-SAFE was established to build a problem-solving
    environment (PSE) for the numerical simulation of
    accidental fires and explosions
  • Fundamental chemistry and engineering physics
    models
  • Coupled with non-linear solvers, optimization,
    computational steering, visualization, and
    experimental data verification
  • Very large-scale simulations
  • Computer science problems
  • Coupling of multiple simulation codes
  • Software engineering across diverse expert teams
  • Achieving high performance on large-scale systems

39
Example C-SAFE Simulation Problems
Heptane fire simulation
Typical C-SAFE simulation with a billion degrees
of freedom and non-linear time dynamics
Material stress simulation
40
Uintah High-Level Component View
41
Uintah Parallel Component Architecture
42
Uintah Computational Framework
  • Execution model based on software (macro)
    dataflow
  • Exposes parallelism and hides data transport
    latency
  • Computations expressed as directed acyclic graphs
    of tasks
  • consumes input and produces output (input to
    future task)
  • input/outputs specified for each patch in a
    structured grid
  • Abstraction of global single-assignment memory
  • DataWarehouse
  • Directory mapping names to values (array
    structured)
  • Write value once then communicate to awaiting
    tasks
  • Task graph gets mapped to processing resources
  • Communication schedule approximates the global
    optimum
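
The dataflow model above (tasks consuming and producing named values, each written once) can be illustrated with a small ordering sketch. It is not Uintah's scheduler: the `Task` struct and `schedule` function are hypothetical, and the sketch only derives a valid serial order from input/output names using Kahn's topological sort.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical task description for this sketch.
struct Task {
    std::string name;
    std::vector<std::string> inputs;   // names of values the task consumes
    std::vector<std::string> outputs;  // names of values the task produces
};

// Return an order in which each task runs after the producers of its inputs.
std::vector<std::string> schedule(const std::vector<Task>& tasks) {
    std::map<std::string, std::size_t> producer;
    for (std::size_t i = 0; i < tasks.size(); ++i)
        for (const auto& v : tasks[i].outputs) {
            assert(producer.count(v) == 0);  // single assignment: one writer
            producer[v] = i;
        }
    std::vector<std::vector<std::size_t>> consumers(tasks.size());
    std::vector<int> pending(tasks.size(), 0);
    for (std::size_t i = 0; i < tasks.size(); ++i)
        for (const auto& v : tasks[i].inputs) {
            auto it = producer.find(v);
            if (it == producer.end()) continue;  // initial data, already there
            consumers[it->second].push_back(i);
            ++pending[i];
        }
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < tasks.size(); ++i)
        if (pending[i] == 0) ready.push(i);
    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::size_t t = ready.front();
        ready.pop();
        order.push_back(tasks[t].name);
        for (std::size_t c : consumers[t])
            if (--pending[c] == 0) ready.push(c);
    }
    assert(order.size() == tasks.size());  // fails if the graph has a cycle
    return order;
}
```

A parallel scheduler would run every task in the ready queue concurrently and overlap communication of produced values with computation; the sketch shows only the dependency bookkeeping.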

43
Uintah Task Graph (Material Point Method)
  • Diagram of named tasks (ovals) and data (edges)
  • Imminent computation
  • Dataflow-constrained
  • MPM
  • Newtonian material point motion time step
  • Solid values defined at material point
    (particle)
  • Dashed values defined at vertex (grid)
  • Prime (') values updated during time step

44
Uintah PSE
  • UCF automatically sets up
  • Domain decomposition
  • Inter-processor communication with
    aggregation/reduction
  • Parallel I/O
  • Checkpoint and restart
  • Performance measurement and analysis (stay tuned)
  • Software engineering
  • Coding standards
  • CVS (Commits Y3 - 26.6 files/day, Y4 - 29.9
    files/day)
  • Correctness regression testing with bugzilla bug
    tracking
  • Nightly build (parallel compiles)
  • 170,000 lines of code (Fortran and C tasks
    supported)

45
Performance Technology Integration
  • Uintah presents challenges to performance
    integration
  • Software diversity and structure
  • UCF middleware, simulation code modules
  • component-based hierarchy
  • Portability objectives
  • cross-language and cross-platform
  • multi-parallelism: thread, message passing, mixed
  • Scalability objectives
  • High-level programming and execution abstractions
  • Requires flexible and robust performance
    technology
  • Requires support for performance mapping

46
Performance Analysis Objectives for Uintah
  • Micro tuning
  • Optimization of simulation code (task) kernels
    for maximum serial performance
  • Scalability tuning
  • Identification of parallel execution bottlenecks
  • overheads: scheduler, data warehouse,
    communication
  • load imbalance
  • Adjustment of task graph decomposition and
    scheduling
  • Performance tracking
  • Understand performance impacts of code
    modifications
  • Throughout course of software development
  • C-SAFE application and UCF software

47
Uintah Performance Engineering Approach
  • Contemporary performance methodology focuses on
    control flow (function) level measurement and
    analysis
  • C-SAFE application involves coupled-models with
    task-based parallelism and dataflow control
    constraints
  • Performance engineering on algorithmic (task)
    basis
  • Observe performance based on algorithm (task)
    semantics
  • Analyze task performance characteristics in
    relation to other simulation tasks and UCF
    components
  • scientific component developers can concentrate
    on performance improvement at algorithmic level
  • UCF developers can concentrate on bottlenecks not
    directly associated with simulation module code

48
Task Execution in Uintah Parallel Scheduler
  • Profile methods and functions in scheduler and in
    MPI library

Task execution time dominates (what task?)
Task execution time distribution
MPI communication overheads (where?)
  • Need to map performance data!

49
Semantics-Based Performance Mapping
  • Associate performance measurements with
    high-level semantic abstractions
  • Need mapping support in the performance
    measurement system to assign data correctly

50
Hypothetical Mapping Example
  • Particles distributed on surfaces of a cube

Particle* P[MAX]; /* Array of particles */
int GenerateParticles() {
  /* distribute particles over all faces of the cube */
  for (int face = 0, last = 0; face < 6; face++) {
    /* particles on this face */
    int particles_on_this_face = num(face);
    /* i ranges over this face's slice of P */
    for (int i = last; i < last + particles_on_this_face; i++) {
      /* particle properties are a function of face */
      P[i] = ... f(face);
      ...
    }
    last += particles_on_this_face;
  }
}
51
Hypothetical Mapping Example (continued)
int ProcessParticle(Particle *p) {
  /* perform some computation on p */
}
int main() {
  GenerateParticles();
  /* create a list of particles */
  for (int i = 0; i < N; i++)
    /* iterates over the list */
    ProcessParticle(P[i]);
}
  • How much time is spent processing face i
    particles?
  • What is the distribution of performance among
    faces?
  • How is this determined if execution is parallel?

52
Semantic Entities/Attributes/Associations (SEAA)
  • New dynamic mapping scheme
  • Entities defined at any level of abstraction
  • Attribute entity with semantic information
  • Entity-to-entity associations
  • Two association types (implemented in TAU API)
  • Embedded: extends data structure of associated
    object to store performance measurement entity
  • External: creates an external look-up table
    using address of object as the key to locate
    performance measurement entity
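
The two association types can be contrasted with a short sketch based on the particle example. The `Timer`, `Particle`, and `ExternalMap` types and the two `process_*` functions are hypothetical and are not the TAU mapping API; the `cost` field stands in for measured time so the example is deterministic.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a performance measurement entity.
struct Timer {
    long calls = 0;
    double work = 0.0;  // accumulated cost in arbitrary units
    void add(double cost) { calls += 1; work += cost; }
};

struct Particle {
    int face;      // semantic attribute: cube face the particle lies on
    double cost;   // simulated processing cost
    Timer* timer;  // embedded association: the object stores the link itself
};

// External association: look-up table keyed by the object's address.
using ExternalMap = std::unordered_map<const Particle*, Timer*>;

inline void process_embedded(Particle& p) {
    assert(p.timer != nullptr);
    p.timer->add(p.cost);  // no lookup, but Particle had to be extended
}

inline void process_external(const Particle& p, const ExternalMap& table) {
    auto it = table.find(&p);  // lookup cost, but Particle stays unmodified
    assert(it != table.end());
    it->second->add(p.cost);
}
```

With one timer per face, either variant attributes processing cost to the face a particle belongs to, which is what the routine-level view of `ProcessParticle` alone cannot show. The external table depends on object addresses staying stable while the mapping is in use.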

53
No Performance Mapping versus Mapping
  • Typical performance tools report performance with
    respect to routines
  • Does not provide support for mapping
  • Performance tools with SEAA mapping can observe
    performance with respect to scientists'
    programming and problem abstractions

TAU (w/ mapping)
TAU (no mapping)
54
Uintah Task Performance Mapping
  • Uintah partitions individual particles across
    processing elements (processes or threads)
  • Simulation tasks in task graph work on particles
  • Tasks have domain-specific character in the
    computation
  • interpolate particles to grid in Material Point
    Method
  • Task instances generated for each partitioned
    particle set
  • Execution scheduled with respect to task
    dependencies
  • How to attribute execution time among different
    tasks?
  • Assign semantic name (task type) to a task
    instance
  • SerialMPM::interpolateParticleToGrid
  • Map TAU timer object to (abstract) task (semantic
    entity)
  • Look up timer object using task type (semantic
    attribute)
  • Further partition along different domain-specific
    axes

55
Using External Associations
  • Two level mappings
  • Level 1: <task name, timer>
  • Level 2: <task name, patch, timer>
  • Embedded association vs. external
    association

Hash Table
Data (object)
Performance Data
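
A two-level mapping of this kind might be sketched as two look-up tables, one keyed by task name and one keyed by task name and patch. The `TaskTimer` and `TaskTimers` names are hypothetical, and elapsed seconds are passed in directly rather than measured, so this is only an outline of the bookkeeping.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical accumulator standing in for a timer object.
struct TaskTimer {
    long calls = 0;
    double seconds = 0.0;
};

class TaskTimers {
public:
    // Attribute one task execution to both mapping levels.
    void record(const std::string& task, int patch, double seconds) {
        assert(seconds >= 0.0);
        add(by_task_[task], seconds);                               // level 1
        add(by_task_patch_[std::make_pair(task, patch)], seconds);  // level 2
    }
    const TaskTimer& level1(const std::string& task) const {
        return by_task_.at(task);
    }
    const TaskTimer& level2(const std::string& task, int patch) const {
        return by_task_patch_.at(std::make_pair(task, patch));
    }
private:
    static void add(TaskTimer& t, double seconds) {
        t.calls += 1;
        t.seconds += seconds;
    }
    std::map<std::string, TaskTimer> by_task_;
    std::map<std::pair<std::string, int>, TaskTimer> by_task_patch_;
};
```

Level 1 answers how much time a task type takes overall; level 2 splits the same time by patch, which is one way to expose load imbalance between patches.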
56
Task Performance Mapping Instrumentation
void MPIScheduler::execute(const ProcessorGroup * pc,
                           DataWarehouseP & old_dw,
                           DataWarehouseP & dw ) {
  ...
  TAU_MAPPING_CREATE(
    task->getName(), "MPIScheduler::execute()",
    (TauGroup_t)(void*)task->getName(),
    task->getName(), 0);
  ...
  TAU_MAPPING_OBJECT(tautimer)
  TAU_MAPPING_LINK(tautimer, (TauGroup_t)(void*)task->getName());
  // EXTERNAL ASSOCIATION
  ...
  TAU_MAPPING_PROFILE_TIMER(doitprofiler, tautimer, 0)
  TAU_MAPPING_PROFILE_START(doitprofiler, 0);
  task->doit(pc);
  TAU_MAPPING_PROFILE_STOP(0);
  ...
}

57
Task Performance Mapping (Profile)
Mapped task performance across processes
Performance mapping for different tasks
58
Task Performance Mapping (Trace)
Work packet computation events colored by task
type
Distinct phases of computation can be identified
based on task
59
Task Performance Mapping (Trace - Zoom)
Startup communication imbalance
60
Task Performance Mapping (Trace - Parallelism)
Communication / load imbalance
61
Comparing Uintah Traces for Scalability Analysis
62
Scaling Performance Optimizations
Last year: initial correct scheduler
Reduce communication by 10x
Reduce task graph overhead by 20x
ASCI Nirvana SGI Origin 2000 Los Alamos National
Laboratory
63
Scalability to 2000 Processors (Fall 2001)
ASCI Nirvana SGI Origin 2000 Los Alamos National
Laboratory
64
TAU Performance System Status
  • Computing platforms
  • IBM SP, SGI Origin, Intel Teraflop, Cray T3E,
    Compaq SC, HP, Sun, Apple, Windows, IA-32, IA-64
    (Linux), ...
  • Programming languages
  • C, C++, Fortran 77/90, HPF, Java
  • Communication libraries
  • MPI, PVM, Nexus, Tulip, ACLMPL, MPIJava
  • Thread libraries
  • pthread, Java, Windows, SGI sproc, Tulip, SMARTS,
    OpenMP
  • Compilers
  • KAI, PGI, GNU, Fujitsu, HP, Sun, Microsoft, SGI,
    Cray, IBM, Compaq

65
PDT Status
  • Program Database Toolkit (Version 2.1, web
    download)
  • EDG C++ front end (Version 2.45.2)
  • Mutek Fortran 90 front end (Version 2.4.1)
  • C++ and Fortran 90 IL Analyzer
  • DUCTAPE library
  • Standard C++ system header files (KCC Version
    4.0f)
  • PDT-constructed tools
  • TAU instrumentor (C/C++/F90)
  • Program analysis support for SILOON and CHASM
  • Platforms
  • SGI, IBM, Compaq, SUN, HP, Linux (IA32/IA64),
    Apple, Windows, Cray T3E

66
Evolution of the TAU Performance System
  • Customization of TAU for specific needs
  • TAU's existing strength lies in its robust
    support for performance instrumentation and
    measurement
  • TAU will evolve to support new performance
    capabilities
  • Online performance data access via
    application-level API
  • Dynamic performance measurement control
  • Generalize performance mapping
  • Runtime performance analysis and visualization

67
Information
  • TAU (http://www.acl.lanl.gov/tau)
  • PDT (http://www.acl.lanl.gov/pdtoolkit)

68
Support Acknowledgement
  • TAU and PDT support
  • Department of Energy (DOE)
  • DOE 2000 ACTS contract
  • DOE MICS contract
  • DOE ASCI Level 3 (LANL, LLNL)
  • U. of Utah DOE ASCI Level 1 subcontract
  • DARPA
  • NSF National Young Investigator (NYI) award