1 JAVA WORKLOAD CHARACTERIZATION USING BYTECODES
Carl Herder and Jozo Dujmovic
Department of Computer Science, San Francisco State University
2 CONTENTS
- White-box metric based on resource utilization
- White-box metric based on bytecodes
- Classification of workload difference models
- Workload analysis using all bytecodes
- Workload analysis using selected bytecodes
- Workload analysis using bytecode groups
- General conclusions about approaches to workload characterization (WC)
3 Quantitative Characterization of Computer Workloads
- White-box approach to workload characterization
- Identify a set of relevant internal hardware and/or software components
- Workload attributes describe the use of these components
- Define difference metric
- Compare workloads using cluster analysis techniques
4 A hierarchy of Java workload characterization models
- Physical models
- System resource utilizations
- Machine instructions
- Virtual models
- JVM bytecodes
- Java source instructions
- Functional models
- Functions and/or libraries
- Application areas
5 With a hierarchy of characterization vectors, we can
- Understand reasons for differences between workloads
- Detect similarities and redundancies among a set of workloads
- Evaluate and compare benchmark suites
- Design benchmark suites with a controlled level of redundancy between component benchmarks
- Improve the reliability and relevance of computer performance indicators measured using benchmark suites
- Provide a theoretical framework for decomposition and aggregation of Java workloads
6 Java Bytecode Characterization Vectors
- All countable bytecodes
- Selected bytecodes
- Grouped bytecodes
7 Questions raised by this
- Which level is the most appropriate for workload characterization?
- Are there workload relationships that are invariant across approaches?
8 White-box Difference Metric Based on Resource Utilization
Let U1, ..., UN represent utilization measurements of N system resources.
Difference of workloads A and B
For non-overlapped execution
9 Other white-box metrics based on utilizations are also possible
11 White Box model based on bytecodes
Let N be the number of defined bytecodes (max 256). For a workload, consider the frequencies of the executed bytecodes and their average execution times. From these follow:
- Total execution time (non-overlapped)
- Total bytecodes executed by the workload
- Average bytecode execution time
12 White Box model based on bytecodes, cont.
- Relative frequency (probability) of executing bytecode i
- Relative execution time of bytecode i
- Utilization of bytecode i: the fraction of time spent executing it
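The quantities named on the two slides above can be written out as follows. This is a sketch only: the symbol names ($n_i$, $t_i$, $T$, $n$, $\bar{t}$, $p_i$, $\tau_i$, $U_i$) are assumptions chosen for illustration, and only the verbal definitions come from the slides.

```latex
\begin{align*}
T &= \sum_{i=1}^{N} n_i t_i && \text{total execution time (non-overlapped)}\\
n &= \sum_{i=1}^{N} n_i && \text{total bytecodes executed by the workload}\\
\bar{t} &= T / n && \text{average bytecode execution time}\\
p_i &= n_i / n && \text{relative frequency of bytecode } i\\
\tau_i &= t_i / \bar{t} && \text{relative execution time of bytecode } i\\
U_i &= \frac{n_i t_i}{T} = p_i \tau_i && \text{utilization of bytecode } i
\end{align*}
```

Under these definitions the utilizations sum to 1, and setting every $\tau_i$ to 1 reduces $U_i$ to the relative frequency $p_i$.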
13 Problems With Relative Execution Times
- Dependent upon
- Virtual machine implementation
- Execution time argument values
- Just-in-time compilation
- Underlying machine
- Underlying architecture
- Resource contention
- Can be interpreted as relative bytecode weight for workload characterization
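14 Machine Independent Model
Let the relative execution time of every bytecode be 1. The workload difference then depends only on the relative bytecode frequencies. This is the probability vector difference.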
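A minimal sketch of a probability vector difference follows. The exact formula is not given in the slide text, so the form used here (half the sum of absolute differences between relative frequencies, scaled to percent so that it ranges from 0 to 100) is an assumption. The function names and the execution counts are hypothetical.

```python
def probabilities(counts):
    """Convert raw bytecode execution counts into relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("workload executed no bytecodes")
    return {op: n / total for op, n in counts.items()}


def probability_vector_difference(counts_a, counts_b):
    """Assumed metric: half the L1 distance between frequency vectors, in percent."""
    p = probabilities(counts_a)
    q = probabilities(counts_b)
    ops = set(p) | set(q)  # a bytecode missing from one workload has frequency 0
    return 50.0 * sum(abs(p.get(op, 0.0) - q.get(op, 0.0)) for op in ops)


# Hypothetical execution counts, for illustration only (not measured data).
workload_a = {"iload": 60, "iadd": 30, "getfield": 10}
workload_b = {"iload": 30, "fadd": 30, "getfield": 40}
print(probability_vector_difference(workload_a, workload_b))
```

With this scaling, identical frequency vectors give 0 and workloads that share no bytecodes give 100, which matches the 0 to 100 similarity/difference scale of the dendrograms.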
15 Cluster Analysis Using All Bytecodes
[Figure: dendrogram of the SPEC Java benchmarks (Compress, DB, Javac, Jess, Mpegaudio, Jack, Mtrt, Raytrace) under the all-bytecode frequency metric, using probability vector difference. Axes: similarity (100 down to 47.79), difference (0 up to 52.21).]
16 Selected Bytecodes
- More than half the bytecodes support the stack-based architecture.
- They say nothing about the functional nature of the workload.
- May introduce differences of questionable relevance.
- Disappear with JIT compilation.
- Execution time measurement is problematic.
17 Cluster Analysis Using Selected Bytecodes
[Figure: dendrogram of the SPEC Java benchmarks (Compress, Mpegaudio, DB, Javac, Jess, Jack, Mtrt, Raytrace) under the selected bytecode frequency metric. Axes: similarity (100 down to 51.49), difference (0 up to 48.51).]
18 Grouped Bytecodes
- Bytecode granularity may introduce artificial differences. Should ifle differ from ifgt? double from float? pop2 from pop pop?
- Views workloads at higher abstraction level.
- Facilitates workload decomposition/composition.
19 Grouped Bytecodes
- Inter-workload differences are provably smaller after aggregation.
- Exposes the operations responsible for workload differences.
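The first claim can be checked with the triangle inequality, assuming the difference metric is a sum of absolute differences between bytecode probabilities (an assumption; the metric's formula is not in the slide text). Let the groups $G_1, \dots, G_K$ partition the $N$ bytecodes, and let $P_k = \sum_{i \in G_k} p_i$ be the aggregated probability of group $k$. The symbols are illustrative.

```latex
\begin{align*}
\sum_{k=1}^{K} \bigl| P_k(A) - P_k(B) \bigr|
  &= \sum_{k=1}^{K} \Bigl| \sum_{i \in G_k} \bigl( p_i(A) - p_i(B) \bigr) \Bigr| \\
  &\le \sum_{k=1}^{K} \sum_{i \in G_k} \bigl| p_i(A) - p_i(B) \bigr|
   = \sum_{i=1}^{N} \bigl| p_i(A) - p_i(B) \bigr|
\end{align*}
```

Equality holds only when, inside every group, all per-bytecode differences have the same sign. Strictly speaking, the aggregated difference is therefore never larger than the original one, and it is smaller whenever some group mixes positive and negative differences.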
20 Ten Steps of Workload Characterization by Stepwise Aggregation of Bytecodes
- 1. Create a decomposition tree of Java data types. For example:
- integers
- int, short, byte, char, long
- real numbers
- float, double
- language structures
- reference, array, class
- internals
- stack, program counter
21 Ten Steps of Workload Characterization by Stepwise Aggregation of Bytecodes
- 2. Create a decomposition tree of Java operation types. For example:
- data processing
- arithmetic, bit-level operations, type conversions
- control flow
- branches, jumps
- memory access
- variables, arrays, objects, stack
- memory allocation
- class, array
22 Ten Steps of Workload Characterization by Stepwise Aggregation of Bytecodes
- 3. Create a bytecode distribution table, with operation types as rows and data types as columns.
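The example table from this slide is not available in the text. A small sketch of what such a distribution table might look like is given below. The mnemonics are real JVM instructions, but the choice of rows, columns, and cell placement is an illustrative assumption rather than the authors' table, and `DISTRIBUTION_TABLE` and `cell_of` are hypothetical names.

```python
# Rows are operation types, columns are data types; each cell lists the
# bytecode mnemonics that fall into it. The placement is an assumption.
DISTRIBUTION_TABLE = {
    ("arithmetic", "int"): ["iadd", "isub", "imul", "idiv"],
    ("arithmetic", "long"): ["ladd", "lsub", "lmul", "ldiv"],
    ("arithmetic", "float"): ["fadd", "fsub", "fmul", "fdiv"],
    ("arithmetic", "double"): ["dadd", "dsub", "dmul", "ddiv"],
    ("branches", "int"): ["ifle", "ifgt", "if_icmplt"],
    ("variables", "int"): ["iload", "istore"],
    ("variables", "reference"): ["aload", "astore"],
    ("arrays", "int"): ["iaload", "iastore"],
    ("stack", "stack"): ["pop", "pop2", "dup"],
}


def cell_of(mnemonic):
    """Return the (operation type, data type) cell that holds a mnemonic."""
    for cell, mnemonics in DISTRIBUTION_TABLE.items():
        if mnemonic in mnemonics:
            return cell
    raise KeyError(f"{mnemonic} is not in the distribution table")


print(cell_of("dadd"))
print(cell_of("pop2"))
```

Each mnemonic appears in exactly one cell, so replacing mnemonics with measured frequencies (step 4) turns the table into a matrix of execution counts.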
23 Ten Steps of Workload Characterization by Stepwise Aggregation of Bytecodes
- 4. For each workload, replace the bytecode mnemonic with the measured bytecode frequency.
- 5. Create an initial matrix of differences between analyzed workloads based on all bytecode frequencies. Cluster the workloads and create an initial dendrogram.
- 6. Highest-granularity grouping: sum across each row and column of the distribution table. Use the sums to create a new matrix of differences, and generate a dendrogram.
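The clustering in step 5 can be sketched as a plain agglomerative procedure over the matrix of differences. The slides do not say which linkage rule was used, so average linkage is an assumption here. The function name, the workload names `w1` to `w4`, and the difference values are hypothetical.

```python
from itertools import combinations


def agglomerate(names, diff):
    """Average-linkage agglomerative clustering (the linkage rule is an assumption).

    names: list of unique workload names
    diff:  dict mapping frozenset({a, b}) to the difference between a and b
    Returns the merge sequence as (left_members, right_members, merge_height).
    """
    clusters = [(name,) for name in names]
    merges = []

    def linkage(c1, c2):
        # Mean pairwise difference between the members of two clusters.
        pairs = [diff[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical differences in percent, for illustration only.
names = ["w1", "w2", "w3", "w4"]
diff = {
    frozenset(("w1", "w2")): 2.0,
    frozenset(("w1", "w3")): 30.0,
    frozenset(("w1", "w4")): 34.0,
    frozenset(("w2", "w3")): 32.0,
    frozenset(("w2", "w4")): 36.0,
    frozenset(("w3", "w4")): 10.0,
}
for left, right, height in agglomerate(names, diff):
    print(left, right, height)
```

The merge heights are the positions on the difference axis at which branches of the dendrogram join. Repeating the procedure on the difference matrix of each aggregation level yields the sequence of dendrograms that the later steps compare.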
24 Ten Steps of Workload Characterization by Stepwise Aggregation of Bytecodes
- 7. Develop a strategy for merging operation rows and data type columns. Add all execution counts in each aggregated row or column.
- 8. At each level of granularity, compare the new and the previous dendrogram and determine the effect of the aggregation.
- 9. Continue the stepwise aggregation process and create a sequence of dendrograms, increasing the level of abstraction and decreasing the differences between workloads.
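The merging in step 7 can be sketched as follows, assuming the distribution table is held as a mapping from (operation type, data type) cells to execution counts. The group names follow the decomposition trees of steps 1 and 2. The function name, the table representation, and the counts are hypothetical.

```python
from collections import defaultdict

# Groupings taken from the example decomposition trees in steps 1 and 2.
DATA_TYPE_GROUPS = {
    "int": "integers", "short": "integers", "byte": "integers",
    "char": "integers", "long": "integers",
    "float": "real numbers", "double": "real numbers",
}
OPERATION_GROUPS = {
    "arithmetic": "data processing",
    "bit-level operations": "data processing",
    "type conversions": "data processing",
    "branches": "control flow",
    "jumps": "control flow",
}


def aggregate(cell_counts, row_groups, column_groups):
    """Merge rows and columns of a distribution table, adding execution counts."""
    merged = defaultdict(int)
    for (operation, data_type), count in cell_counts.items():
        # Rows or columns without a group mapping are kept unchanged.
        key = (row_groups.get(operation, operation),
               column_groups.get(data_type, data_type))
        merged[key] += count
    return dict(merged)


# Hypothetical execution counts, for illustration only (not measured data).
cells = {
    ("arithmetic", "int"): 40,
    ("arithmetic", "long"): 5,
    ("arithmetic", "float"): 20,
    ("type conversions", "double"): 10,
    ("branches", "int"): 25,
}
coarse = aggregate(cells, OPERATION_GROUPS, DATA_TYPE_GROUPS)
print(coarse)
```

The total execution count is preserved by the merge, so the aggregated table still normalizes to a probability vector, only with fewer components.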
25 Ten Steps of Workload Characterization by Stepwise Aggregation of Bytecodes
- 10. Use the generated sequence of dendrograms to
determine and explain similarities and
differences between component workloads. Use the
results of this analysis to solve workload
characterization problems (e.g. elimination of
redundant workloads, design and validation of
benchmark suites, etc.).
26 Experimental Application to JVM98
29 Experimental Application to JVM98
[Figure: dendrograms of the JVM98 workloads (compress, DB, javac, jess, mpegaudio, jack, mtrt, raytrace). Panels: Grouping Level 1, Grouping Level 2. Axes: similarity (100 down to 68.85), difference (0 up to 31.15).]
30 Experimental Application to JVM98
[Figure: dendrograms of the JVM98 workloads (compress, DB, mpegaudio, javac, jess, jack, mtrt, raytrace). Panels: Grouping Level 3, Grouping Level 4. Axes: similarity (100 down to 71.55), difference (0 up to 28.45).]
31 What Have We Learned About Workload Relationships?
- Mtrt and raytrace are almost identical across the cluster analysis approaches.
- Javac and jess form the next most similar workload pair under all approaches.
- Jack and Mtrt/raytrace add significant coverage only when the analysis is at higher levels of granularity.
- Mpegaudio adds significant coverage only when distinctions among JVM internal structure operations are relevant.
- Javac is the most central workload in all analyses:
- best representative of the entire benchmark suite
- primary candidate for elimination
32 Advantages of a hierarchical approach to Java workload characterization
- Workload characterization at different levels of granularity explains the origin of similarities and differences between workloads.
- A range of metric variations is available, from which the most suitable can be chosen for solving a specific workload characterization problem.
- Essential workload properties may be identified by examining relationships between workloads that are invariant at different granularities.
33 Future Work
- Design of workloads with desired properties through composition of workloads with known properties
- Creation of Universal Benchmark Suite
- Performance prediction