Sampling profiler for Rotor as part of optimizing compilation system

1
Sampling profiler for Rotor as part of
optimizing compilation system
Sofia Chilingarova, St-Petersburg, Russia
Prof. Vladimir O. Safonov, St-Petersburg, Russia
2
Agenda
  • Problem Statement
  • Rotor Sampling Profiler Implementation
  • Results

3
Problem Statement
  • Rotor does not implement optimizations in its
    JIT compiler
  • Implementing optimizations requires runtime
    profiling
  • A sampling-based profiler is the best option:
    fairly complete information at low cost

4
Typical Optimizing Dynamic Compilation Subsystem
Architecture
[Diagram labels: IL (bytecode, CIL); Base Compiler/Interpreter; Executable Code; Multilevel Optimizing Compiler; Profiling Data; Compilation Queue; Profiler; Data; Controller; Methods list; Profiling plan]
5
Rotor Sampling Profiler Implementation
  • Goals
  • Profiling Subsystem Architecture
  • Data Storage Structure
  • Self-Tuning
  • Integration with Rotor

6
Goals
  • To estimate the call frequency of individual
    methods
  • To construct a call graph
  • To keep the cost reasonably low
      • small total profiling overhead
      • no long suspensions of user threads
  • To make good use of existing Rotor facilities

7
Profiling Subsystem Architecture
[Diagram labels: SSCLI Threads; Profiler marks managed threads; buffer; Profiler; Marking-Thread; local queue; raw samples data; Manager-Thread; Global queue; Data Storage]
8
Data Storage Structure - previous approaches
  • A bunch of samples
  • DCG (Dynamic Call Graph)
  • PCCT (Partial Call Context Tree)
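
To make the difference between the structures listed above concrete, here is a minimal Python sketch, not the Rotor implementation. It builds a dynamic call graph and a calling context tree from the same stack samples. The sample data and the function names `build_dcg` and `build_cct` are hypothetical. The tree here uses full stacks; a partial tree would, as an assumption, be built the same way from truncated stack walks.

```python
from collections import Counter

# Hypothetical stack samples, each listed from the outermost caller
# to the topmost frame.
samples = [
    ["Main", "A", "C", "D"],
    ["Main", "B", "C", "E"],
    ["Main", "A", "C", "D"],
]

def build_dcg(samples):
    """Dynamic call graph: one weighted edge per (caller, callee) pair."""
    edges = Counter()
    for stack in samples:
        for caller, callee in zip(stack, stack[1:]):
            edges[(caller, callee)] += 1
    return edges

def build_cct(samples):
    """Calling context tree: one node per distinct call path."""
    root = {"count": 0, "children": {}}
    for stack in samples:
        node = root
        for method in stack:
            node = node["children"].setdefault(
                method, {"count": 0, "children": {}}
            )
            node["count"] += 1
    return root

dcg = build_dcg(samples)
cct = build_cct(samples)

# In the graph, C is a single node with edges to both D and E.
print(dcg[("C", "D")], dcg[("C", "E")])
# In the tree, C appears once under A and once under B, so the
# context in which D and E were called is preserved.
print(list(cct["children"]["Main"]["children"]["A"]["children"]["C"]["children"]))
```

The graph is more compact because it merges all calls to a method into one node, while the tree keeps a separate node for each call path.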
9
Data Storage Structure - our approach
10
Self-Tuning
  • When taking a sample, if an already visited
    frame is encountered, the stack walk stops there
  • The sample is then marked with a visited mark
  • When processing samples, if a marked sample
    contains data for only one frame (the topmost
    frame), a special repetitions counter is
    incremented
  • The profiling interval is tuned based on the
    repetitions counter value each time a fixed
    number of samples has been processed
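
The self-tuning steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Rotor code: the names `Frame`, `Sample`, `take_sample` and `IntervalTuner` are hypothetical, and so are the batch size, the thresholds and the doubling/halving rule. The slides say only that the interval is tuned from the repetitions counter, not how.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    method: str
    visited: bool = False  # set the first time the profiler walks this frame

@dataclass
class Sample:
    methods: list = field(default_factory=list)  # topmost frame first
    hit_visited: bool = False                    # the "visited mark"

def take_sample(stack):
    """Walk from the topmost frame down; stop at the first visited frame."""
    sample = Sample()
    for frame in reversed(stack):  # stack[-1] is the topmost frame
        sample.methods.append(frame.method)
        if frame.visited:
            sample.hit_visited = True
            break
        frame.visited = True
    return sample

class IntervalTuner:
    # All default values below are assumptions chosen for illustration.
    def __init__(self, interval_ms=100, batch_size=4, min_ms=10, max_ms=1000):
        self.interval_ms = interval_ms
        self.batch_size = batch_size
        self.min_ms, self.max_ms = min_ms, max_ms
        self.processed = 0
        self.repetitions = 0

    def process(self, sample):
        self.processed += 1
        # A marked sample holding only the topmost frame means the thread
        # was still in an activation that an earlier sample already saw.
        if sample.hit_visited and len(sample.methods) == 1:
            self.repetitions += 1
        if self.processed == self.batch_size:
            self._retune()

    def _retune(self):
        ratio = self.repetitions / self.batch_size
        if ratio > 0.5:    # assumed rule: many repeats, sample less often
            self.interval_ms = min(self.interval_ms * 2, self.max_ms)
        elif ratio < 0.1:  # assumed rule: few repeats, sample more often
            self.interval_ms = max(self.interval_ms // 2, self.min_ms)
        self.processed = 0
        self.repetitions = 0

stack = [Frame("Main"), Frame("Worker"), Frame("Loop")]
tuner = IntervalTuner()
first = take_sample(stack)   # full walk: Loop, Worker, Main
again = take_sample(stack)   # stops at the already visited topmost frame
for s in (first, again, take_sample(stack), take_sample(stack)):
    tuner.process(s)
print(tuner.interval_ms)
```

Stopping at a visited frame keeps repeated walks short, and the repetitions counter then acts as a cheap signal that consecutive samples carry little new information.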

11
Integration with Rotor
  • Threads are stopped at safe points to collect
    profile samples
      • just as they are stopped for GC or debugging
  • The built-in SSCLI stack walk mechanism is used
    to collect managed stack samples
  • Internal SSCLI VM hash tables and synchronization
    locks are used to store and maintain profile data

12
Results - testing environment
  • Tests from the Rotor test suite were used:
    sscli\tests\bcl\threadsafety
      • many threads execute the same code
  • Measures used
      • statistical correlation of the total call
        counters of individual methods
      • Arnold and Ryder's tree overlap percentage
        measure
  • Self-tuning was turned off for simplicity of
    measurement
      • but the best results were obtained with the
        same interval that self-tuning had set
        automatically (100 ms)
  • Each value is the average over 10 consecutive
    runs
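
As a rough illustration of the two measures named above, the following Python sketch computes a Pearson correlation between exact and sampled call counts, and an overlap percentage in the style of Arnold and Ryder: the sum, over shared entries, of the smaller of the two normalized weights. The slides do not say how the measure was applied to trees, so keying the profiles by call path is an assumption, and all data below is made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def overlap(profile_a, profile_b):
    """Sum of min(normalized weight) over entries present in both profiles."""
    total_a = sum(profile_a.values())
    total_b = sum(profile_b.values())
    shared = profile_a.keys() & profile_b.keys()
    return sum(
        min(profile_a[k] / total_a, profile_b[k] / total_b) for k in shared
    )

# Hypothetical per-method call counts: exact counters vs. sampled estimates.
exact_calls = [100, 50, 10]
sampled_calls = [9, 5, 1]
r = pearson(exact_calls, sampled_calls)

# Hypothetical profiles keyed by call path (an assumed tree encoding).
exact_tree = {("Main", "A"): 60, ("Main", "B"): 30, ("Main", "A", "C"): 10}
sampled_tree = {("Main", "A"): 5, ("Main", "B"): 4, ("Main", "A", "C"): 1}
ov = overlap(exact_tree, sampled_tree)

print(round(r, 3), round(ov, 2))
```

Both measures equal 1.0 for a sampled profile that matches the exact one perfectly, which is how the figures on the results slide should be read.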

13
Results
Test                               Correlation  Overlap
co8545int32                        0.99         0.97
co8546int16                        0.99         0.92
co8547sbyte                        0.99         0.94
co8548intptr                       0.99         0.98
co8549uint16                       0.99         0.95
co8550uint32                       0.99         0.95
co8502multiplereaders              0.96         0.85
co8503singlewritermultiplereaders  0.96         0.80
14
Questions
  • Author: Sofia Chilingarova, e-mail:
    sofie-chil_at_hotmail.ru