A Framework for Assessing the Real-Time Performance of Generic Code

Slides: 47

Provided by: marka47

1
A Framework for Assessing the Real-Time
Performance of Generic Code
Mark A. Rybka, Washington University in Saint Louis
Advisors: Dr. Ron K. Cytron, Dr. Christopher D. Gill
Department of Computer Science and Engineering, Washington University
This work was supported in part by DARPA under
contracts F33615-00-C-1697 and F33615-03-C-4111.
The author thanks The Boeing Company for
supporting his graduate studies.
2
Context
What was the research about?
Set out to understand the performance
characteristics of the C++ Standard Template
Library (STL)
LIST<TYPE, ALLOC>
DEQUE<TYPE, ALLOC>
VECTOR<TYPE, ALLOC>
3
Context, continued
What was the research about?
Developed a framework for profiling the
performance of three different interfaces of the
STL
4
Context, continued
What was the research about?

Used the framework to examine the performance of
push_back and pop_back operations of the STL
sequence containers
Noticed patterns in the three different
instances of the framework developed
Used patterns to describe a conceptual
architecture for assessing the performance of
generic code

5
Context, continued
Issues and Challenges Addressed
Black box versus White Box Testing

STL and any generic code defines an interface
that is a black box without visibility of an
implementation
Some aspects of the implementation may need to
be addressed to perform testing accurately
Black box testing: the goal is to identify
problems, not to prove the absence of problems

6
Context, continued
Issues and Challenges Addressed

Determining the interface-to-subinterface
interactions
Determining percentages of time spent in
subinterfaces during operations

CONTAINER<TYPE, ALLOC<TYPE> >
?
?
TYPE
ALLOC<?>
7
Context, continued
Issues and Challenges Addressed

Reducing interference from the test platform or
framework
Correlation of spikes in the data to software
versus system

8
Real-time Systems
A real-time system is one in which correctness of
the system depends not only on the LOGICAL
RESULTS but also on THE TIME AT WHICH the results
are provided

Scheduling analysis requires reliable estimates
of the running time of the program's tasks
If bounds are overestimated, CPU resources can
be wasted, or scheduling would be deemed
infeasible
If bounds are underestimated, deadlines may be
missed, causing system failure

9
Real-time Systems, continued
Items that must be addressed to ensure execution
time is bounded

Understanding of the runtime system
OS process priorities
Memory system and potential paging
Understanding of software subcomponents
(middleware)
Must be able to predict the time bounds of a
component
Must be made ready for real-time, which implies
a mechanism to profile performance readiness
Non-predictability is contagious

10
Type Independent Performance Framework
Requirements
11
Experimental Procedure
Description

One test run consists of 30,000 push_backs on
the container, one after the other
Time for each operation is output
Interface-to-subinterface interactions output
Type of container changed between tests
Allocator changes
Container changes
Contained Type changes
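
The procedure above can be sketched as a small timing loop. This is a minimal illustration and not the framework from the presentation: the helper name `time_push_backs` is hypothetical, and `std::chrono::steady_clock` is an assumed time source (the slides do not say which timer was used).

```cpp
#include <chrono>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper (not from the presentation): performs `count`
// push_back calls on a fresh container, one after the other, and
// records the duration of each call in nanoseconds.
template <typename Container>
std::vector<long long> time_push_backs(
    std::size_t count, const typename Container::value_type& value) {
    using clock = std::chrono::steady_clock;  // assumed time source
    std::vector<long long> samples;
    samples.reserve(count);  // keep the harness's own allocation out of the timed loop
    Container c;
    for (std::size_t i = 0; i < count; ++i) {
        const auto start = clock::now();
        c.push_back(value);
        const auto stop = clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
                .count());
    }
    return samples;
}
```

Calling `time_push_backs<std::list<int>>(30000, 42)` would yield one sample per operation; changing the container, allocator, or contained type only changes the template argument.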

12
Experimental Data, continued
Handling System Jumps
The STL specification says constant time inserts
on a list. Why are there jumps in the data? Is
it system noise or the list implementation?
13
Experimental Data, continued
Handling System Jumps
[Figure annotation: FILTERED OUT]

Software test does the same thing every time it
is run
Multiple runs can be performed and jumps can be
compared
Jumps that do not occur between test runs are
not attributed to the software
10 test runs were performed for each test
Spikes that did not occur in every test were
filtered out
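
One simple way to realize this kind of filter is to take the per-operation minimum across runs: a spike that appears at the same operation index in every run survives, while a spike missing from at least one run is removed. The slides do not say how the filter was actually implemented, so the function below (`filter_system_noise`, a hypothetical name) is only a sketch of the idea.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical filter: each inner vector holds the per-operation times
// of one test run. The result keeps, for every operation index, the
// smallest time seen in any run, so only spikes present in all runs remain.
std::vector<long long> filter_system_noise(
    const std::vector<std::vector<long long>>& runs) {
    if (runs.empty()) return {};
    std::vector<long long> filtered = runs.front();
    for (std::size_t r = 1; r < runs.size(); ++r) {
        // Assumption: runs have equal length; truncate defensively if not.
        const std::size_t n = std::min(filtered.size(), runs[r].size());
        filtered.resize(n);
        for (std::size_t i = 0; i < n; ++i) {
            filtered[i] = std::min(filtered[i], runs[r][i]);
        }
    }
    return filtered;
}
```

With three runs `{10, 90, 10, 500}`, `{10, 95, 400, 10}` and `{12, 91, 11, 11}`, the jump at index 1 occurs in every run and is kept, while the isolated jumps at indices 2 and 3 are dropped.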

14
Experimental Data, continued
Handling System Jumps
Unfiltered
True Behavior of List
The STL specification says constant time inserts
on a list. The list implementation has jumps in
it; system spikes have been filtered out.
15
Experimental Data, continued
List using an ACE cached allocator
Using a different allocator made the list
performance tightly bounded. The allocator has
been isolated as the cause of the jumps (after
system noise filtering).
16
Experimental Data, continued
Another example of system noise filtering
Unfiltered
True Behavior of List
Filtering this run clearly highlights the
need for system noise filtering. Note that the
jump on the first push_back operation survives
the filter.
17
Experimental Data, continued
Comparing the performance of the sequence
containers
A comparison of list, vector, and deque with
SimpleClass for push_back.
18
Experimental Data, continued
List

In 30,000 push_back calls
30,000 T copy constructors
30,000 calls to allocate
One copy constructor and one allocate for each
push_back. Spikes due to allocator, as shown.
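
Counts like these can be gathered by instrumenting the contained type and the allocator. The sketch below is an assumption about how that might look, not the presentation's code: `Counts`, `Counted`, `CountingAllocator` and `count_push_backs` are all hypothetical names. Exact totals depend on the standard library implementation.

```cpp
#include <cstddef>
#include <list>
#include <memory>

// Hypothetical tallies shared by the instrumented type and allocator.
struct Counts {
    std::size_t copies = 0, destroys = 0, allocs = 0, deallocs = 0;
};
inline Counts& counts() { static Counts c; return c; }

// Contained type that reports copy construction and destruction.
// Declaring the copy constructor suppresses implicit moves, so any
// relocation shows up as a copy.
struct Counted {
    Counted() = default;
    Counted(const Counted&) { ++counts().copies; }
    Counted& operator=(const Counted&) = default;
    ~Counted() { ++counts().destroys; }
};

// Minimal allocator that reports allocate/deallocate calls and forwards
// to std::allocator. Rebound copies share the same global tallies.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}
    T* allocate(std::size_t n) {
        ++counts().allocs;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) {
        ++counts().deallocs;
        std::allocator<T>().deallocate(p, n);
    }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Runs n push_back calls and returns the tallies accumulated by them.
template <typename Container>
Counts count_push_backs(std::size_t n) {
    const typename Container::value_type value{};
    Container c;
    counts() = Counts{};  // reset after setup so only push_back work is counted
    for (std::size_t i = 0; i < n; ++i) c.push_back(value);
    return counts();      // snapshot is copied before the container is destroyed
}
```

For `std::list<Counted, CountingAllocator<Counted>>` and 30,000 push_backs, this should report one copy and one allocate per call, in line with the figures on this slide.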

19
Experimental Data, continued
Deque

In 30,000 push_back calls
30,937 T copy constructors
937 T destructors
945 calls to allocate
8 calls to deallocate
Does not do an allocate on every push_back. Some
spikes due to allocator, as shown.

20
Experimental Data, continued
Deque
21
Experimental Data, continued
Vector

In 30,000 push_back calls
62,767 T copy constructors
32,767 T destructors
16 calls to allocate
15 calls to deallocate
Large spikes occur with 1 allocate, 1 deallocate,
and many copy constructors and destructors.
The ACE allocator does not make vector performance tightly bounded.
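
The vector figures are consistent with a capacity-doubling growth policy. The small model below assumes capacity goes 1, 2, 4, ... and that every reallocation copies and then destroys all existing elements; that policy is an assumption (the slides do not name the STL implementation), and `model_doubling_vector` is a hypothetical name.

```cpp
#include <cstddef>

// Tallies for the model; field names are illustrative only.
struct GrowthCounts {
    std::size_t copies = 0, destroys = 0, allocs = 0, deallocs = 0;
};

// Models push_back on a vector whose capacity doubles when full.
// No real container is involved; this only does the arithmetic.
GrowthCounts model_doubling_vector(std::size_t pushes) {
    GrowthCounts g;
    std::size_t size = 0, capacity = 0;
    for (std::size_t i = 0; i < pushes; ++i) {
        if (size == capacity) {
            capacity = (capacity == 0) ? 1 : capacity * 2;
            ++g.allocs;                  // new, larger block
            g.copies += size;            // existing elements copied over
            g.destroys += size;          // old elements destroyed
            if (size > 0) ++g.deallocs;  // old block released
        }
        ++g.copies;  // the element being pushed
        ++size;
    }
    return g;
}
```

For 30,000 pushes the model gives 30,000 + (1 + 2 + ... + 16,384) = 62,767 copies, 32,767 destructor calls, 16 allocations and 15 deallocations, matching the counts listed above. The largest spike corresponds to the final reallocation, which copies 16,384 elements in a single push_back.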

22
Experimental Data, continued
Comparing the performance of the sequence
containers
A comparison of list, vector, and deque for
push_back.
30,000 push_back operations invoke.
23
Experimental Data, continued
Comparing the performance of the sequence
containers
A comparison of list, vector, and deque for
push_back.
Deque using the default allocator wins on both average
and worst case! Its worst-to-average (W/A) ratio is
similar to list with the default allocator!

Deque does not need to allocate on every
push_back like list, lowering its average
Deque does not need to mass copy and destroy
like vector, lowering its worst
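
Reading "W/A" as the ratio of worst-case to average operation time, the comparison can be computed from the per-operation samples as follows. The `Summary` type and `summarize` function are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Illustrative summary of one test run's per-operation times.
struct Summary {
    double average;   // mean time per operation
    long long worst;  // largest single-operation time
    double ratio;     // worst / average (the W/A ratio)
};

Summary summarize(const std::vector<long long>& samples) {
    if (samples.empty()) return {0.0, 0, 0.0};
    const long long worst = *std::max_element(samples.begin(), samples.end());
    const double average =
        std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
    const double ratio = average > 0.0 ? worst / average : 0.0;
    return {average, worst, ratio};
}
```

A ratio close to 1 indicates tightly bounded behavior; a container with a low average but rare large spikes, such as the vector above, shows a high ratio.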

24
Experimental Data, continued
Comparing the performance of the sequence
containers
A comparison of list, deque and vector with more
complicated TYPE, map, for push_back.
25
Experimental Data, continued
Comparing the performance of the sequence
containers

In 30,000 push_back calls
30,000 T copy constructors
30,000 calls to allocate
One copy constructor and one allocate for each
push_back. Not all spikes are due to the allocator;
the ACE allocator does not fully help! Is TYPE now
the problem?

26
Experimental Data, continued
Comparing the performance of the sequence
containers
The copy constructor for map uses the default
allocator, causing jumps! Map using the ACE allocator
makes list behavior tightly bounded again.
27
Experimental Data, continued
Deque

In 30,000 push_back calls
30,714 T copy constructors
714 T destructors
721 calls to allocate
7 calls to deallocate
Moving to the ACE allocator eliminated the spikes
caused by deque accessing the allocator, as before.
How is the map copy constructor affecting deque?

28
Experimental Data, continued
Deque
Map using the ACE allocator does not have the spikes
associated with its copy constructor. Graph
looks more like behavior of deque with
SimpleClass again.
29
Experimental Data, continued
Vector
Changing allocator for both vector and map did
not change overall container behavior
significantly.
30
Experimental Data, continued
Comparing the performance of the sequence
containers
A comparison of list and deque for push_back.
Deque with default allocator beats list on
average and worst! W/A ratio similar to list with
ACE allocator!

Deque performs better than list with map as
type. Map copy constructor spike occurs with an
allocate call in list case, not necessarily an
allocate call made in deque case
Deque with default allocator performs better
than with ACE allocator! Remember, allocator
given to deque is not the cause of the worst
spike. Map copy constructor spike not as large
due to deque use of default allocator

31
Type Independent Performance Framework Design
Patterns
Problem/Context: Need a module to represent a
single operation on an interface.
32
Type Independent Performance Framework Design
Patterns, continued
Problem/Context: Need a module to represent the
usage-pattern of a type for a test run, including
operations that change object state.
33
Type Independent Performance Framework Design
Patterns, continued
Problem/Context: Need a module to represent an
already executed test run, correlating the Test
Signature and the results.
34
Type Independent Performance Framework Design
Patterns, continued
Problem/Context: Need a module to execute the
tests and measure the time. Must take a Test
Signature and return a Test Record.
35
Type Independent Performance Framework Design
Patterns, continued
Problem/Context: Need a module to query time.
36
Type Independent Performance Framework Design
Patterns, continued
Problem/Context: Need a module that can
statistically reduce the data found in Test
Records.
37
Type Independent Performance Framework Design
Patterns, continued
Problem/Context: Need a module that can store
Test Records for future queries to save time
associated with rerunning tests.
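
The relationship between the modules named on the preceding slides (a Test Signature goes in, a Test Record comes out) can be sketched in a few lines. The slides give only the problem statements, so every type and function name here is hypothetical, and `std::function` plus `std::chrono::steady_clock` are assumed stand-ins for the operation and time-query modules.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <vector>

// Hypothetical Test Signature: a named usage pattern made of single operations.
struct TestSignature {
    std::string name;
    std::vector<std::function<void()>> operations;
};

// Hypothetical Test Record: ties a signature to the measured results.
struct TestRecord {
    std::string signature_name;
    std::vector<long long> nanoseconds;  // one entry per operation
};

// Hypothetical test runner: executes each operation and measures its time.
TestRecord run_test(const TestSignature& signature) {
    using clock = std::chrono::steady_clock;  // stand-in for the time-query module
    TestRecord record;
    record.signature_name = signature.name;
    record.nanoseconds.reserve(signature.operations.size());
    for (const auto& operation : signature.operations) {
        const auto start = clock::now();
        operation();
        const auto stop = clock::now();
        record.nanoseconds.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
                .count());
    }
    return record;
}
```

Note that calling through `std::function` adds overhead of its own to every sample; a real harness would need to measure or remove that interference, which is one of the challenges the presentation raises.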
38
Performance Framework Realized on STL

Type independent performance framework design
patterns realized on three different interfaces,
each related to the C++ Standard Template Library
(STL)
Sequence Containers - Exercises and queries the
performance of the push_back and pop_back
operations of provided type
Allocators - Exercises and queries the
performance of the allocate and deallocate
operations of the provided type
Container types - Exercises any type that may
be used in a container (default constructor,
destructor, copy constructor, assignment operator
of any type)

39
Performance Framework Realized on STL
Container Tester
How does a container interface with the
subcomponents? This became a unique
responsibility of the Container Tester. Used
signature generation technique.
40
Performance Framework Realized on STL
Allocator Tester
Which allocator is actually used in the container
operations? This became a unique responsibility
of the Allocator Tester. Used rebind allocator
interception technique.
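
To illustrate why interception at the rebound allocator matters: a container handed an allocator for TYPE may never allocate a TYPE directly. The sketch below records the type each allocate call is actually made for. It is only an illustration of the general idea, with hypothetical names (`InterceptingAllocator`, `allocated_type_names`, `types_allocated_by_int_list`), and is not the technique's actual implementation from the presentation.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <set>
#include <string>
#include <typeinfo>

// Hypothetical log of the (implementation-defined) names of every type
// for which allocate() was called.
inline std::set<std::string>& allocated_type_names() {
    static std::set<std::string> names;
    return names;
}

// Allocator that logs its value_type on every allocation. When a container
// rebinds it to an internal type, the rebound instantiation logs that type.
template <typename T>
struct InterceptingAllocator {
    using value_type = T;
    InterceptingAllocator() = default;
    template <typename U>
    InterceptingAllocator(const InterceptingAllocator<U>&) {}
    T* allocate(std::size_t n) {
        allocated_type_names().insert(std::string(typeid(T).name()));
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <typename T, typename U>
bool operator==(const InterceptingAllocator<T>&, const InterceptingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const InterceptingAllocator<T>&, const InterceptingAllocator<U>&) { return false; }

// Pushes one int onto a list and reports which types were allocated.
std::set<std::string> types_allocated_by_int_list() {
    allocated_type_names().clear();
    std::list<int, InterceptingAllocator<int>> values;
    values.push_back(1);
    return allocated_type_names();
}
```

On common standard library implementations the logged name belongs to an internal list node type rather than `int`, which shows that the allocator doing the work is a rebound one.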
41
Conclusions
Black box versus White Box Testing

Profiling behavior of a black box library can
only find problems, not prove the absence of problems
Finding problems is useful for diagnosis and
solution
Profiling behavior can build confidence

42
Conclusions, continued
Issues and Challenges Addressed
White box testing versus black box testing

Some items needed white box examination. In
particular:
Allocator rebind type
No assurance that only one type is used for all
allocator requests by a container
Nothing in interface says that rebind even
occurs
First run-time interception point for allocator
Explaining software jumps in data needs some
implementation knowledge
Examination for specializations, in case of
interface-to-subinterface interactions

43
Conclusions, continued
Issues and Challenges Addressed

Determining the interface-to-subinterface
interactions
Used signature generation techniques
Determining percentages of time spent in
subinterfaces during operations
Given generated subinterface test signature, can
execute to see how much time sub-operations take
Can provide percentage of time in sub-operations
on a per test basis

44
Conclusions, continued
Issues and Challenges Addressed

Reducing interference from the framework
Run test of known performance through framework
and should see expected behavior
Get confidence that parts of framework executed
during that test did not interfere
Correlation of spikes in the data to software
versus system
Deterministically running the software tests
means spikes in software should occur in the same
place between tests; this requires some white box
information
Spikes that do not occur in all tests are
considered system noise and removed

45
Conclusions, continued
C++/STL Workarounds

Container rebinding: taking a container
parameterized on one type and rebinding to
associated types
Useful for creating a signature generating
container in a generic way
Generic programming rule: provide rebind on all
parameterized types?
Compile time access to rebound allocator types
Some way to know more about how a container
interfaces with the allocator
Type-to-string conversions
Some way to get a string from a type, built-in
to the language

46
Conclusions, continued
Future Work

Extend framework template to other interfaces
Generative containers
May use existing Test Signature concept
Container choice is left to system based on
provided usage-pattern
Code is self-documenting
Compile-time Test Records, stored in C++
