
Title: Effective Fine-Grain Synchronization For Automatically Parallelized Programs Using Optimistic Synchronization Primitives


1
  • Effective Fine-Grain Synchronization For
    Automatically Parallelized Programs Using
    Optimistic Synchronization Primitives
  • Martin Rinard
  • University of California, Santa Barbara

2
  • Problem
  • Efficiently Implementing Atomic Operations On
    Objects
  • Key Issue
  • Mutual Exclusion Locks Versus Optimistic
    Synchronization Primitives
  • Context
  • Parallelizing Compiler For Irregular Object-Based
    Programs
  • Linked Data Structures
  • Commutativity Analysis

3
Talk Outline
  • Histogram Example
  • Advantages and Limitations of Optimistic
    Synchronization
  • Synchronization Selection Algorithm
  • Experimental Results

4
Histogram Example
  • class histogram {
  •   private int counts[N];
  •   public void update(int i) { counts[i]++; }
  • }
  • parallel for (i = 0; i < iterations; i++) {
  •   int c = f(i);
  •   h->update(c);
  • }

[Figure: example histogram counts]
5
Cloud Of Parallel Histogram Updates
[Figure: iterations 0 through 8 of the parallel loop updating the shared histogram concurrently]
Updates Must Execute Atomically
6
One Lock Per Object
  • class histogram {
  •   private int counts[N];
  •   lock mutex;
  •   public void update(int i) {
  •     mutex.acquire();
  •     counts[i]++;
  •     mutex.release();
  •   }
  • }

Problem: False Exclusion
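
The one-lock-per-object scheme can be sketched in standard C++ as follows. This is a minimal illustration, not the compiler's generated code: `std::mutex` stands in for the slide's `lock` type, and the class name `LockedHistogram`, the `count` accessor, and the bin count `N = 8` are assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical bin count; the slides leave N unspecified.
constexpr std::size_t N = 8;

class LockedHistogram {
public:
    // Every update takes the same lock, so updates to different
    // bins still exclude one another (false exclusion).
    void update(std::size_t i) {
        assert(i < N);
        std::lock_guard<std::mutex> guard(mutex_);
        ++counts_[i];
    }

    // Accessor added for the example; not part of the slide's class.
    int count(std::size_t i) {
        assert(i < N);
        std::lock_guard<std::mutex> guard(mutex_);
        return counts_[i];
    }

private:
    std::array<int, N> counts_{};  // zero-initialized counts
    std::mutex mutex_;             // one lock per object
};
```

Two threads calling `update(3)` and `update(5)` never touch the same count, yet they serialize on `mutex_`, which is the false exclusion the slide names.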
7
One Lock Per Item
  • class histogram {
  •   private int counts[N];
  •   lock mutex[N];
  •   public void update(int i) {
  •     mutex[i].acquire();
  •     counts[i]++;
  •     mutex[i].release();
  •   }
  • }

Problem: Memory Consumption
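
A sketch of the one-lock-per-item variant, again assuming `std::mutex` in place of the slide's `lock` type. The helpers `lock_bytes` and `data_bytes` are hypothetical additions that make the memory cost visible; the actual sizes depend on the platform's `std::mutex`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical bin count; the slides leave N unspecified.
constexpr std::size_t N = 8;

class ItemLockedHistogram {
public:
    // Only updates to the same bin exclude one another.
    void update(std::size_t i) {
        assert(i < N);
        std::lock_guard<std::mutex> guard(mutexes_[i]);
        ++counts_[i];
    }

    int count(std::size_t i) {
        assert(i < N);
        std::lock_guard<std::mutex> guard(mutexes_[i]);
        return counts_[i];
    }

    // Bytes spent on locks versus bytes spent on the counts themselves.
    static constexpr std::size_t lock_bytes() { return N * sizeof(std::mutex); }
    static constexpr std::size_t data_bytes() { return N * sizeof(int); }

private:
    std::array<int, N> counts_{};        // zero-initialized counts
    std::array<std::mutex, N> mutexes_;  // one lock per item
};
```

Comparing `lock_bytes()` with `data_bytes()` on a given platform shows how much of the object is locks rather than data, which is the memory consumption problem.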
8
Optimistic Synchronization
  • Load Old Value
  • Compute New Value Into Local Storage
  • Commit Point
  • No Write Between Load and Commit: Commit Succeeds,
    Write New Value
  • Write Between Load and Commit: Commit Fails, Retry
    Update
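
The load, compute, commit cycle can be sketched in C++. Standard C++ exposes no LL/SC pair, so this sketch swaps in `std::atomic<int>::compare_exchange_weak` as the commit step; that substitution is an assumption of the example, not the primitive the slides use. Compare-and-swap detects a changed value rather than an intervening write, which is enough when the new value depends only on the old one. The name `optimistic_update` is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Applies `compute` to the value in `cell` optimistically and returns
// the value that was committed.
template <typename F>
int optimistic_update(std::atomic<int>& cell, F compute) {
    int old_value = cell.load();             // Load Old Value
    for (;;) {
        int new_value = compute(old_value);  // Compute New Value Into Local Storage
        // Commit Point: can succeed only if cell still holds old_value.
        if (cell.compare_exchange_weak(old_value, new_value)) {
            return new_value;                // Commit Succeeds: new value written
        }
        // Commit Fails: old_value now holds the current contents; retry.
    }
}
```

A call such as `optimistic_update(cell, [](int v) { return v + 1; })` performs an atomic increment without any lock object.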
9
Parallel Updates With Optimistic Synchronization
  • Load Old Value
  • Compute New Value Into Local Storage
  • Commit Succeeds, Write New Value
[Figure: histogram counts being updated in parallel]
10
Optimistic Synchronization In Modern Processors
  • Load Linked (LL) - Used To Load Old Value
  • Store Conditional (SC) - Used To Commit New Value
  • Atomic Increment Using Optimistic Synchronization
    Primitives
  • retry: LL $2,0($4)      # Load Old Value
  • addiu $3,$2,1           # Compute New Value Into Local Storage
  • SC $3,0($4)             # Attempt To Store New Value
  • beq $3,$0,retry         # Retry If Failure

11
Optimistically Synchronized Histogram
  • class histogram {
  •   private int counts[N];
  •   public void update(int i) {
  •     do {
  •       new_count = LL(counts[i]);
  •       new_count++;
  •     } while (!SC(new_count, counts[i]));
  •   }
  • }
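
A runnable approximation of the optimistically synchronized histogram, under the assumption that `compare_exchange_weak` on `std::atomic<int>` stands in for the LL/SC pair (C++ has no portable LL/SC intrinsics). The class name `OptimisticHistogram`, the `count` accessor, the bin count `N = 4`, and the driver `hammer` are hypothetical names introduced for the example.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical bin count; the slides leave N unspecified.
constexpr std::size_t N = 4;

class OptimisticHistogram {
public:
    OptimisticHistogram() {
        for (auto& c : counts_) c.store(0);
    }

    void update(std::size_t i) {
        assert(i < N);
        int old_count = counts_[i].load();  // load old value
        // Retry until the commit succeeds; on failure old_count is
        // refreshed with the current contents of counts_[i].
        while (!counts_[i].compare_exchange_weak(old_count, old_count + 1)) {
        }
    }

    int count(std::size_t i) const { return counts_[i].load(); }

private:
    std::array<std::atomic<int>, N> counts_;  // no lock objects stored
};

// Runs `threads` workers, each performing `per_thread` updates on `bin`.
void hammer(OptimisticHistogram& h, std::size_t bin, int threads, int per_thread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&h, bin, per_thread] {
            for (int k = 0; k < per_thread; ++k) h.update(bin);
        });
    }
    for (auto& w : workers) w.join();
}
```

Calling `hammer(h, 2, 4, 10000)` should leave `h.count(2)` at 40000, since every failed commit is retried and no increment is lost.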

12
Aspects of Optimistic Synchronization
  • Advantages
  • Slightly More Efficient Than Locked Updates
  • No Memory Overhead
  • No Data Cache Overhead
  • Potentially Fewer Memory Consistency Requirements
  • Advantages In Other Contexts
  • No Deadlock, No Priority Inversions, No Lock
    Convoys
  • Limitations
  • Existing Primitives Support Only Single Word
    Updates
  • Each Update Must Be Synchronized Individually
  • Lack of Fairness

13
Synchronization In Automatically Parallelized
Programs
Serial Program
Assumption: Operations Execute Atomically
Commutativity Analysis
Unsynchronized Parallel Program
Requirement: Correctly Synchronize Atomic Operations
Synchronization Selection
Goal: Choose An Efficient Synchronization Mechanism For Each Operation
Synchronized Parallel Program
14
Atomicity Issues In Generated Code
Serial Program
Assumption: Operations Execute Atomically
Commutativity Analysis
Unsynchronized Parallel Program
Requirement: Correctly Synchronize Atomic Operations
Synchronization Selection
Goal: Choose An Efficient Synchronization Mechanism For Each Operation
Synchronized Parallel Program
15
  • Use Optimistic Synchronization Whenever Possible

16
Model Of Computation
  • Objects With Instance Variables
  • class histogram {
  •   private int counts[N];
  • }
  • Operations Update Objects By Modifying Instance
    Variables
  • void histogram::update(int i) {
  •   counts[i]++;
  • }

[Figure: effect of h->update(1) on the histogram counts]
17
Commutativity Analysis
  • Compiler Computes Extent Of Computation
  • Representation of All Operations in Computation
  • In Example: histogram::update
  • Do All Pairs Of Operations Commute?
  • No - Generate Serial Code
  • Yes - Automatically Generate Parallel Code
  • In Example:
  • h->update(i) and h->update(j) commute for all i,
    j
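
The commutativity claim for the example can be illustrated with a small dynamic check. The compiler described in the talk establishes commutativity by static analysis; the sketch below only tests it on concrete states and is not that analysis. The names `Counts`, `update`, `updates_commute`, and the contrasting `assign` operation are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical bin count; the slides leave N unspecified.
constexpr std::size_t N = 4;
using Counts = std::array<int, N>;

// Serial version of histogram::update applied to a copy of the state.
Counts update(Counts counts, std::size_t i) {
    assert(i < N);
    ++counts[i];
    return counts;
}

// True if update(i) then update(j) yields the same state as
// update(j) then update(i), starting from `start`.
bool updates_commute(const Counts& start, std::size_t i, std::size_t j) {
    return update(update(start, i), j) == update(update(start, j), i);
}

// Contrast case: overwriting a count does not commute with another
// overwrite of the same count that stores a different value.
Counts assign(Counts counts, std::size_t i, int value) {
    assert(i < N);
    counts[i] = value;
    return counts;
}
```

Increments commute even when `i == j`, because the final count depends only on how many updates hit each bin, not on their order.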

18
Synchronization Requirements
  • Traditional Parallelizing Compilers
  • Parallelize Loops With Independent Iterations
  • Barrier Synchronization
  • Commutativity Analysis
  • Parallel Operations May Update Same Object
  • For Generated Code To Execute Correctly,
    Operations Must Execute Atomically
  • Code Generation Algorithm Must Insert
    Synchronization

19
Default Synchronization Algorithm
  • class histogram {
  •   private int counts[N];
  •   lock mutex;  // One Lock Per Object
  •   public void update(int i) {
  •     mutex.acquire();
  •     counts[i]++;
  •     mutex.release();
  •   }
  • }

Operations Acquire and Release Lock
20
Synchronization Constraints
  • Operation: counts[i] = counts[i] + 1;
  • Synchronization Constraint: Can Use Optimistic
    Synchronization (Read/Compute/Write Update To A
    Single Instance Variable)
  • Operation: temp = counts[i]; counts[i] = counts[j];
    counts[j] = temp;
  • Synchronization Constraint: Must Use Lock
    Synchronization (Updates Involve Multiple
    Interdependent Instance Variables)
21
Synchronization Selection Constraints
  • Can Use Optimistic Synchronization Only For
    Single Word Updates That:
  •   Read An Instance Variable
  •   Compute A New Value That Depends On No Other
      Updated Instance Variable
  •   Write New Value Back Into Instance Variable
  • All Updates To Same Instance Variable Must Use
    Same Synchronization Mechanism
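
A sketch of how these constraints play out when a class has both kinds of update. The swap touches two interdependent counts, so it needs the lock; because all updates to the same instance variable must use the same mechanism, the increment then uses the lock as well. `SwappableHistogram`, `swap_bins`, `count`, and `N = 4` are hypothetical names, and `std::mutex` stands in for the slides' `lock` type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical bin count; the slides leave N unspecified.
constexpr std::size_t N = 4;

class SwappableHistogram {
public:
    // Read/compute/write of one variable. Alone it could be optimistic,
    // but it must match the mechanism that swap_bins requires.
    void update(std::size_t i) {
        assert(i < N);
        std::lock_guard<std::mutex> guard(mutex_);
        counts_[i] = counts_[i] + 1;
    }

    // Touches two interdependent locations, so a single-word commit
    // cannot make it atomic; it needs the lock.
    void swap_bins(std::size_t i, std::size_t j) {
        assert(i < N && j < N);
        std::lock_guard<std::mutex> guard(mutex_);
        int temp = counts_[i];
        counts_[i] = counts_[j];
        counts_[j] = temp;
    }

    int count(std::size_t i) {
        assert(i < N);
        std::lock_guard<std::mutex> guard(mutex_);
        return counts_[i];
    }

private:
    std::array<int, N> counts_{};  // zero-initialized counts
    std::mutex mutex_;             // shared by every update to counts_
};
```

If `update` used an optimistic commit while `swap_bins` held the lock, an increment could land between the swap's reads and writes and be lost, which is why the mechanisms cannot be mixed on one variable.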
22
Synchronization Selection Algorithm
  • Operates At Granularity Of Instance Variables
  • Compiler Scans All Updates To Each Instance
    Variable
  •   If All Updates Can Use Optimistic
      Synchronization, Instance Variable Is Marked
      Optimistically Synchronized
  •   If At Least One Update Must Use Lock
      Synchronization, Instance Variable Is Marked Lock
      Synchronized
  • If A Class Has A Lock Synchronized Variable,
    Class Is Marked Lock Synchronized
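
The selection rules can be sketched as a small pass over a list of update sites. This is an illustration under stated assumptions, not the compiler's implementation: the `Update` record, the `single_word_rcw` flag (true when the update is a read/compute/write of that one variable), the `Selection` result, and the function name `select_synchronization` are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

enum class Mechanism { Optimistic, Lock };

// One update site found by a (hypothetical) scan of the program.
struct Update {
    std::string class_name;
    std::string variable;
    bool single_word_rcw;  // read/compute/write of this variable only
};

struct Selection {
    std::map<std::string, Mechanism> variables;  // keyed by "class::variable"
    std::set<std::string> lock_classes;          // classes that need a lock
};

Selection select_synchronization(const std::vector<Update>& updates) {
    Selection result;
    for (const Update& u : updates) {
        const std::string key = u.class_name + "::" + u.variable;
        // A variable starts out optimistic; one lock-only update
        // marks it, and its class, as lock synchronized for good.
        auto it = result.variables.emplace(key, Mechanism::Optimistic).first;
        if (!u.single_word_rcw) {
            it->second = Mechanism::Lock;
            result.lock_classes.insert(u.class_name);
        }
    }
    return result;
}
```

For the histogram example, every update to `counts` is a single-word read/compute/write, so `counts` stays optimistic and `histogram` never enters `lock_classes`.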
23
Synchronization Selection In Example
class histogram {
  private int counts[N];  // Optimistically Synchronized Instance Variable
  public void update(int i) { counts[i]++; }
}
histogram NOT Marked As Lock Synchronized Class
24
Code Generation Algorithm
  • All Lock Synchronized Classes Augmented With
    Locks
  • Operations That Update Lock Synchronized
    Variables Acquire and Release the Lock in the
    Object
  • Operations That Update Optimistically
    Synchronized Variables Use Optimistic
    Synchronization Primitives

25
Optimistically Synchronized Histogram
  • class histogram {
  •   private int counts[N];
  •   public void update(int i) {
  •     do {
  •       new_count = LL(counts[i]);
  •       new_count++;
  •     } while (!SC(new_count, counts[i]));
  •   }
  • }

26
Experimental Results
27
Methodology
  • Implemented Parallelizing Compiler
  • Implemented Synchronization Selection Algorithm
  • Parallelized Three Complete Scientific
    Applications
  • Barnes-Hut, String, Water
  • Produced Four Versions
  • Optimistic (All Updates Optimistically
    Synchronized)
  • Item Lock (Produced By Hand)
  • Object Lock
  • Coarse Lock
  • Used Inline Intrinsic Locks With Exponential
    Backoff
  • Measured Performance On SGI Challenge XL
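
For readers unfamiliar with exponential backoff, the idea behind the locks used in the lock-based versions can be sketched as below. This is not the inline intrinsic lock from the experiments: the class `BackoffLock`, the `try_acquire` helper, the use of `sleep_for` as the delay, and the cap `kMaxDelayUs` are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

class BackoffLock {
public:
    void acquire() {
        int delay_us = 1;
        // exchange returns the previous value: true means the lock was held.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // Wait, then double the delay (up to a cap) before retrying,
            // so contending threads back away from the lock word.
            std::this_thread::sleep_for(std::chrono::microseconds(delay_us));
            if (delay_us < kMaxDelayUs) delay_us *= 2;
        }
    }

    // Single attempt; returns true if the lock was obtained.
    bool try_acquire() {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    void release() { locked_.store(false, std::memory_order_release); }

private:
    static constexpr int kMaxDelayUs = 1024;  // hypothetical cap
    std::atomic<bool> locked_{false};
};
```

Doubling the delay after each failed attempt reduces traffic on the lock word under contention, at the cost of some added latency once the lock becomes free.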

28
Time For One Update
Time for One Cached Update On Challenge XL
Time for One Uncached Update On Challenge XL
29
Synchronization Frequency
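[Bar chart: microseconds per synchronization (axis 0 to 15) for Barnes-Hut (Optimistic and Item Lock, Object Lock, Coarse Lock), String (Optimistic and Item Lock, Object Lock), and Water (Optimistic and Item Lock, Object Lock, Coarse Lock); off-scale values of 661 and 25 are labeled in the Barnes-Hut and Water groups respectively]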
30
Memory Consumption For Barnes-Hut
[Bar chart: total memory used to store objects, in MBytes (axis 0 to 50), for the Optimistic, Item Lock, Object Lock, and Coarse Lock versions]
31
Memory Consumption For String
[Bar chart: total memory used to store objects, in MBytes (axis 0 to 5), for the Optimistic, Item Lock, and Object Lock versions]
32
Memory Consumption For Water
[Bar chart: total memory used to store objects, in MBytes (axis 0 to 1.5), for the Optimistic, Item Lock, Object Lock, and Coarse Lock versions]
33
Speedups For Barnes-Hut
[Plots: speedup for the Optimistic, Item Lock, Object Lock, and Coarse Lock versions]
34
Speedups For String
[Plots: speedup (0 to 24) versus number of processors (0 to 24) for the Optimistic, Item Lock, and Object Lock versions]
35
Speedups For Water
[Plots: speedup (0 to 24) versus number of processors (0 to 24) for the Optimistic, Item Lock, Object Lock, and Coarse Lock versions]
36
Acknowledgements
  • Pedro Diniz
  • Parallelizing Compiler
  • Silicon Graphics
  • Challenge XL Multiprocessor
  • Rohit Chandra, T.K. Lakshman, Robert Kennedy,
    Alex Poulos
  • Technical Assistance With SGI Hardware and
    Software

37
Bottom Line
  • Optimistic Synchronization Offers
  • No Memory Overhead
  • No Data Cache Overhead
  • Reasonably Small Execution Time Overhead
  • Good Performance On All Applications
  • Good Choice For Parallelizing Compiler
  • Minimal Impact On Parallel Program
  • Simple, Robust, Works Well In Range Of Situations
  • Major Drawback
  • Current Primitives Support Only Single Word
    Updates
  • Use Optimistic Synchronization Whenever Applicable

38
Future
  • The Efficient Implementation Of Atomic Operations
    On Objects Will Become A Crucial Issue For
    Mainstream Software
  • Small-Scale Shared-Memory Multiprocessors
  • Multithreaded Applications and Libraries
  • Popularity of Object-Oriented Programming
  • Specific Example: Java Standard Library
  • Optimistic Synchronization Primitives Will Play
    An Important Role