CS380 C lecture 20 - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
CS380 C lecture 20
  • Last time
  • Linear scan register allocation
  • Classic compilation techniques
  • On to a modern context
  • Today
  • Jenn Sartor
  • Experimental evaluation for managed languages
    with JIT compilation and garbage collection

2
Wake Up and Smell the Coffee: Performance
Analysis Methodologies for the 21st Century
  • Kathryn S. McKinley
  • Department of Computer Sciences
  • University of Texas at Austin

3
Shocking News!
  • In 2000, Java overtook C and C++ as the most
    popular programming language
  • TIOBE 2000--2008

4
Systems Research in Industry and Academia
  • ISCA 2006
  • 20 papers use C and/or C++
  • 5 papers are orthogonal to the programming
    language
  • 2 papers use specialized programming languages
  • 2 papers use Java and C from SPEC
  • 1 paper uses only Java from SPEC

5
What is Experimental Computer Science?
6
What is Experimental Computer Science?
  • An idea
  • An implementation in some system
  • An evaluation

7
The success of most systems innovation hinges on
evaluation methodologies.
  1. Benchmarks reflect current and ideally, future
    reality
  2. Experimental design is appropriate
  3. Statistical data analysis
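Point 3, statistical data analysis, typically means reporting the mean of repeated runs together with a confidence interval rather than a single "best" number. A minimal sketch in Java; the z-value 1.96 assumes enough runs for a normal approximation and is an illustrative choice, not prescribed by the talk:

```java
import java.util.Arrays;

// Sketch: report the mean of repeated timing runs with a ~95%
// confidence interval instead of a single "best" number.
public class ConfidenceInterval {
    // Returns {mean, halfWidth}; report mean +/- halfWidth.
    public static double[] meanAndCI95(double[] times) {
        int n = times.length;
        double mean = Arrays.stream(times).average().orElse(0.0);
        double variance = Arrays.stream(times)
                .map(t -> (t - mean) * (t - mean))
                .sum() / (n - 1);                  // sample variance
        double halfWidth = 1.96 * Math.sqrt(variance / n);
        return new double[] { mean, halfWidth };
    }

    public static void main(String[] args) {
        double[] runs = { 10.1, 9.8, 10.3, 10.0, 9.9 };  // e.g., seconds
        double[] ci = meanAndCI95(runs);
        System.out.printf("%.2f +/- %.2f%n", ci[0], ci[1]);
    }
}
```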

8
The success of most systems innovation hinges on
experimental methodologies.
  1. Benchmarks reflect current and ideally, future
    reality [DaCapo Benchmarks 2006]
  2. Experimental design is appropriate.
  3. Statistical data analysis [Georges et al. 2006]

9
Experimental Design
  • We're not in Kansas anymore!
  • JIT compilation, GC, dynamic checks, etc.
  • Methodology has not adapted
  • Needs to be updated and institutionalized

"this sophistication provides a significant
challenge to understanding complete system
performance, not found in traditional languages
such as C or C++" [Hauswirth et al., OOPSLA '04]
10
Experimental Design
  • Comprehensive comparison
  • 3 state-of-the-art JVMs
  • Best of 5 executions
  • 19 benchmarks
  • Platform: 2GHz Pentium-M, 1GB RAM, Linux 2.6.15

11
Experimental Design
12
Experimental Design
13
Experimental Design
14
Experimental Design
[Charts: performance for the first, second, and third iterations]
15
Experimental Design
  • Another Experiment
  • Compare two garbage collectors
  • Semispace Full Heap Garbage Collector
  • Marksweep Full Heap Garbage Collector

16
Experimental Design
  • Another Experiment
  • Compare two garbage collectors
  • Semispace Full Heap Garbage Collector
  • Marksweep Full Heap Garbage Collector
  • Experimental Design
  • Same JVM, same compiler settings
  • Second iteration for both
  • Best of 5 executions
  • One benchmark: SPEC _209_db
  • Platform: 2GHz Pentium-M, 1GB RAM, Linux 2.6.15
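The protocol above (time the second iteration, take the best of 5 executions) can be sketched as a small harness. One simplification to note: the five "executions" here are in-process repetitions, whereas the methodology in the talk uses separate JVM invocations so executions do not share JIT state:

```java
// Sketch of the timing protocol: time the second of two iterations
// (the first absorbs class loading and most JIT compilation), and
// report the best of five such measurements.
public class BestOfFive {
    public interface Workload { void run(); }

    public static long secondIterationNanos(Workload w) {
        w.run();                          // first iteration: warm-up
        long start = System.nanoTime();
        w.run();                          // second iteration: measured
        return System.nanoTime() - start;
    }

    public static long bestOfFive(Workload w) {
        long best = Long.MAX_VALUE;
        for (int i = 0; i < 5; i++) {
            best = Math.min(best, secondIterationNanos(w));
        }
        return best;
    }

    public static void main(String[] args) {
        long t = bestOfFive(() -> {
            double x = 0;
            for (int i = 1; i <= 1_000_000; i++) x += 1.0 / i;
        });
        System.out.println("best second-iteration time: " + t + " ns");
    }
}
```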

17
Marksweep vs Semispace
18
Marksweep vs Semispace
19
Marksweep vs Semispace
20
Experimental Design
21
Experimental Design: Best Practices
  • Measuring JVM innovations
  • Measuring JIT innovations
  • Measuring GC innovations
  • Measuring Architecture innovations

22
JVM Innovation: Best Practices
  • Examples
  • Thread scheduling
  • Performance monitoring
  • Workload triggers differences
  • real workloads, perhaps microbenchmarks
  • e.g., force frequency of thread switching
  • Measure and report multiple iterations
  • start up
  • steady state (aka server mode)
  • never configure the VM to use completely
    unoptimized code!
  • Use a modest or multiple heap sizes computed as a
    function of maximum live size of the application
  • Use and report multiple architectures
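The heap-size guideline above can be made concrete: measure the application's maximum live size, then derive the heap sizes to test as multiples of it. A minimal sketch; the multipliers are illustrative, not prescribed by the talk:

```java
// Sketch: derive a range of heap sizes (as -Xmx settings) from the
// application's maximum live size. Multipliers are illustrative.
public class HeapSizes {
    public static long[] heapSizes(long maxLiveBytes, double[] multipliers) {
        long[] sizes = new long[multipliers.length];
        for (int i = 0; i < multipliers.length; i++) {
            sizes[i] = (long) (maxLiveBytes * multipliers[i]);
        }
        return sizes;
    }

    public static void main(String[] args) {
        long maxLive = 60L << 20;  // e.g., a 60 MB maximum live size
        for (long s : heapSizes(maxLive, new double[] {1.5, 2.0, 3.0})) {
            System.out.println("-Xmx" + (s >> 20) + "m");
        }
    }
}
```

`-Xmx` is the standard JVM flag for the maximum heap size, so each derived size maps directly onto one experimental configuration.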

23
Best Practices
24
JIT Innovation: Best Practices
  • Example: new compiler optimization
  • Code quality: Does it improve the application
    code?
  • Compile time: How much compile time does it add?
  • Total time: compiler and application time
    together
  • Problem: adaptive compilation responds to
    compilation load
  • Question: How do we tease all these effects apart?

25
JIT Innovation: Best Practices
  • Teasing apart compile time and code quality
    requires multiple experiments
  • Total time: Mix methodology
  • Run adaptive system as intended
  • Result: mixture of optimized and unoptimized code
  • First and second iterations (that include compile
    time)
  • Set and/or report the heap size as a function of
    maximum live size of the application
  • Report average and show statistical error
  • Code quality
  • OK: Run iterations until performance stabilizes,
    report best, or
  • Better: Run several iterations of the benchmark,
    turn off the compiler, and measure a run
    guaranteed to have no compilation
  • Best: Replay mix compilation
  • Compile time
  • Requires the compiler to be deterministic
  • Replay mix compilation
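The "OK" option for code quality (iterate until performance stabilizes) can be sketched as follows; the 2% agreement threshold and the iteration cap are assumptions for illustration, not values from the talk:

```java
// Sketch: iterate until two successive measurements agree within 2%,
// approximating steady state once the JIT has compiled the hot code.
public class StabilizedTiming {
    public interface Workload { void run(); }

    public static long stableIterationNanos(Workload w, int maxIters) {
        long prev = -1;
        for (int i = 0; i < maxIters; i++) {
            long start = System.nanoTime();
            w.run();
            long t = System.nanoTime() - start;
            if (prev >= 0 && Math.abs(t - prev) < 0.02 * prev) {
                return t;          // two successive runs agree: stable
            }
            prev = t;
        }
        return prev;               // never stabilized: report last run
    }

    public static void main(String[] args) {
        long t = stableIterationNanos(() -> {
            double x = 0;
            for (int i = 1; i < 100_000; i++) x += Math.sqrt(i);
        }, 30);
        System.out.println("steady-state estimate: " + t + " ns");
    }
}
```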

26
Replay Compilation
  • Force the JIT to produce a deterministic result
  • Make a compilation profiler and replayer
  • Profiler
  • Profile first or later iterations with adaptive
    JIT, pick best or average
  • Record profiling information used in compilation
    decisions, e.g., dynamic profiles of edges,
    paths, and/or dynamic call graph
  • Record compilation decisions, e.g., compile
    method bar at level two, inline method foo into
    bar
  • Mix of optimized and unoptimized, or all
    optimized/unoptimized
  • Replayer
  • Reads in profile
  • As the system loads each class, apply profile +/-
    innovation
  • Result
  • controlled experiments with deterministic
    compiler behavior
  • reduces statistical variance in measurements
  • Still not a perfect methodology for inlining
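The profiler/replayer split above can be sketched as a data structure: the profiler records each compilation decision, and the replayer looks decisions up as classes load. The record format and method names below are illustrative, not taken from any particular VM:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a replay-compilation profile: the profiler side records
// compilation decisions; the replayer side looks them up on class load.
public class CompilationProfile {
    // One decision, e.g., "compile bar at level 2, inlining foo".
    public record Decision(String method, int optLevel, List<String> inlined) {}

    private final List<Decision> decisions = new ArrayList<>();

    // Profiler side: called when the adaptive system compiles a method.
    public void record(String method, int optLevel, List<String> inlined) {
        decisions.add(new Decision(method, optLevel, inlined));
    }

    // Replayer side: look up what to do with a method as its class
    // loads; null means the method stays unoptimized.
    public Decision lookup(String method) {
        for (Decision d : decisions) {
            if (d.method().equals(method)) return d;
        }
        return null;
    }

    public static void main(String[] args) {
        CompilationProfile profile = new CompilationProfile();
        profile.record("bar", 2, List.of("foo"));
        System.out.println(profile.lookup("bar"));
    }
}
```

Because every run replays the same recorded decisions, compiler behavior is deterministic across experiments, which is what reduces the statistical variance noted above.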

27
GC Innovation: Best Practices
  • Requires more than one experiment...
  • Use and report a range of fixed heap sizes
  • Explore the space-time tradeoff
  • Measure heap size with respect to the maximum
    live size of the application
  • VMs should report total memory, not just
    application memory
  • Different GC algorithms vary in the meta-data
    they require
  • JIT and VM use memory...
  • Measure time with a constant workload
  • Do not measure throughput
  • Best: run two experiments
  • mix with adaptive methodology: what users are
    likely to see in practice
  • replay: hold the compiler activity constant
  • Choose a profile with "best" application
    performance in order to keep from hiding mutator
    overheads in bad code.

28
Architecture Innovation: Best Practices
  • Requires more than one experiment...
  • Use more than one VM
  • Set a modest heap size and/or report heap size as
    a function of maximum live size
  • Use a mixture of optimized and uncompiled code
  • Simulator needs the same code in many cases to
    perform comparisons
  • Best for microarchitecture-only changes
  • Multiple traces from live system with adaptive
    methodology
  • start up and steady state with compiler turned
    off
  • what users are likely to see in practice
  • Won't work if architecture change requires
    recompilation, e.g., new sampling mechanism
  • Use replay to make the code as similar as possible

29
"There are lies, damn lies, and benchmarks"
  • after Disraeli ("statistics")

30
Conclusions
  • Methodology includes
  • Benchmarks
  • Experimental design
  • Statistical analysis [Georges et al., OOPSLA 2007]
  • Poor Methodology
  • can focus or misdirect innovation and energy
  • We have a unique opportunity
  • Transactional memory, multicore performance,
    dynamic languages
  • What we can do
  • Enlist VM builders to include replay
  • Fund and broaden participation in benchmarking
  • Research and industrial partnerships
  • Funding through NSF, ACM, SPEC, industry or ??
  • Participate in building community workloads

31
CS380 C
  • More on Java Benchmarking
  • www.dacapobench.org
  • Alias analysis
  • Read A. Diwan, K. S. McKinley, and J. E. B.
    Moss, "Using Types to Analyze and Optimize
    Object-Oriented Programs," ACM Transactions on
    Programming Languages and Systems, 23(1):30-72,
    January 2001.

32
Suggested Readings: Performance Evaluation of JVMs
  • How Java Programs Interact with Virtual Machines
    at the Microarchitectural Level, Lieven Eeckhout,
    Andy Georges and Koen De Bosschere, The 18th
    Annual ACM SIGPLAN Conference on Object-Oriented
    Programming, Systems, Languages and Applications
    (OOPSLA'03), Oct. 2003
  • Method-Level Phase Behavior in Java Workloads,
    Andy Georges, Dries Buytaert, Lieven Eeckhout and
    Koen De Bosschere, The 19th Annual ACM SIGPLAN
    Conference on Object-Oriented Programming,
    Systems, Languages and Applications (OOPSLA'04),
    Oct. 2004
  • Myths and Realities: The Performance Impact of
    Garbage Collection, S. M. Blackburn, P. Cheng,
    and K. S. McKinley, ACM SIGMETRICS Conference on
    Measurement and Modeling of Computer Systems, pp.
    25--36, New York, NY, June 2004.
  • The DaCapo Benchmarks: Java Benchmarking
    Development and Analysis, S. M. Blackburn, et
    al., The ACM SIGPLAN Conference on Object
    Oriented Programming Systems, Languages and
    Applications (OOPSLA), Portland, OR, pp.
    191--208, October 2006.
  • Statistically Rigorous Java Performance
    Evaluation, A. Georges, D. Buytaert, and L.
    Eeckhout, The ACM SIGPLAN Conference on Object
    Oriented Programming Systems, Languages and
    Applications (OOPSLA), Montreal, Canada, Oct
    2007. To appear.