Lowering the Overhead of Software Transactional Memory


1
Lowering the Overhead of Software Transactional Memory
Virendra J. Marathe, Michael F. Spear, Christopher Heriot, Athul Acharya,
David Eisenstat, William N. Scherer III, Michael L. Scott
  • Featuring: RSTM, a low-overhead STM library for C++
  • Presenting: Yosef Etigin

2
What is this paper about?
  • Design and implementation of RSTM.
  • RSTM is meant to be a fast STM library for
    multi-threaded C++ programs.
  • RSTM main features:
      • Cache-optimized metadata organization.
      • No memory allocations during runtime, except for
        cloning objects.
      • A contention manager to tune performance.
      • A choice of strategies: eager/lazy acquire,
        visible/invisible readers.

3
Where does RSTM fit in?
  • Requires atomic load/store and CAS in hardware.
  • Provides a C++ smart-pointer API that can be
    used to safely access shared data within
    transactions.

4
Overview
  • RSTM Theory
      • Transaction Semantics
      • Readers
      • Writers
  • RSTM Design
      • Descriptor
      • Data Object
      • Shared Object Handle
  • RSTM Implementation
      • Resolving the data object
      • Open for read-only
      • Acquire
      • Open for read-write
      • Commit
      • Abort
  • Performance results
  • Conclusion

5
Transaction Semantics
  • Data is managed at object granularity.
  • Objects are shadowed, rather than changed in
    place.
  • Inside a transaction, objects may be opened for
    read-only or for read-write.
  • Objects that are opened for read-write are
    cloned, and those for read-only are not.
  • Commit tries to set the clone as the current
    object.
  • Abort tries to set the original as the current
    object.
  • Transactions may abort each other, but they
    consult the Contention Manager (CM) before doing
    so.
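
The shadowing scheme above can be illustrated with a single-threaded toy. This is only a minimal sketch, not RSTM code: `Account`, `Slot`, `open_ro`, `open_rw`, `commit`, and `abort_tx` are hypothetical names, and there is no concurrency control or contention manager here.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type, used only for illustration.
struct Account { int balance; };

// A slot holding the current version plus an optional shadow copy (clone).
struct Slot {
    std::unique_ptr<Account> current;
    std::unique_ptr<Account> shadow;   // exists only while opened read-write
};

// Open for read-only: no clone is made, the current version is read directly.
const Account& open_ro(const Slot& s) { return *s.current; }

// Open for read-write: clone the current version and hand out the clone.
Account& open_rw(Slot& s) {
    if (!s.shadow) s.shadow = std::make_unique<Account>(*s.current);
    return *s.shadow;
}

// Commit: the clone becomes the current object.
void commit(Slot& s) {
    if (s.shadow) s.current = std::move(s.shadow);
}

// Abort: the clone is dropped and the original stays current.
void abort_tx(Slot& s) { s.shadow.reset(); }
```

Writes made through `open_rw` stay invisible to `open_ro` until `commit` runs, which is the property that lets an abort simply discard the clone.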

6
Readers
  • A thread that opens an object for reading may
    become a visible or an invisible reader.
      • Visible: the reader is visible to writers.
  • A reader must have a consistent view of its opened
    objects.
      • Consistent: no writer has made a change that
        the reader sees in only some of its opened
        objects.
  • Inconsistency might cause hardware exceptions and
    infinite loops, thus:
      • An invisible reader must, on every open, validate
        all previously opened objects (O(n^2) total cost
        for n opened objects).
      • A visible reader must be explicitly aborted by a
        writer that acquires the object it is reading.
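
The quadratic cost of invisible reads can be seen in a small sketch. It assumes, purely for illustration, that validating an object means comparing its header word with the value seen when it was opened; `Shared`, `ReadSet`, and the `checks` counter are hypothetical names, not part of RSTM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a shared object: only its header word matters here.
struct Shared { long header; };

// Read set of an invisible reader: each entry remembers the header seen at open time.
struct ReadSet {
    struct Entry { const Shared* obj; long seen; };
    std::vector<Entry> entries;
    std::size_t checks = 0;            // number of header comparisons performed

    // Returns false if any previously opened object changed since it was opened.
    bool validate() {
        for (const Entry& e : entries) {
            ++checks;
            if (e.obj->header != e.seen) return false;
        }
        return true;
    }

    // Open one more object: validate everything opened so far, then record it.
    bool open(const Shared& s) {
        if (!validate()) return false;   // the caller would abort the transaction
        entries.push_back({&s, s.header});
        return true;
    }
};
```

Opening n objects performs 0 + 1 + ... + (n - 1) comparisons, which is where the O(n^2) bound comes from.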

7
Writers
  • Opening an object for writing involves
    acquiring it.
      • Acquiring means getting exclusive access to the
        object.
  • Writers conflict with other writers and with
    visible readers.
      • Visible readers can coexist with each other.
  • Acquiring can be done in an eager or a lazy fashion:
      • Eager: acquire an object as soon as it is opened.
      • Lazy: acquire it just before committing the
        transaction.
  • Eager acquire aborts doomed transactions
    immediately, but causes more conflicts.
  • Lazy acquire lets readers run alongside
    a writer that is not committing yet.
      • It has the same consistency issue as invisible
        reads.

8
Contention Management
  • The CM is a thread-local object that:
      • Is notified of transaction events
      • Decides what to do on a conflict:
          • Whether to abort a transaction or spin-wait
          • Which transaction to abort, if any
  • For instance, the Polka CM
      • Prefers writers over readers
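
To make the CM's role concrete, here is a minimal sketch of a thread-local manager that waits a bounded number of times and then aborts whichever transaction has opened fewer objects. It is not Polka and not RSTM's interface: `SimpleCM`, `TxInfo`, `Decision`, and the priority rule are assumptions chosen for illustration.

```cpp
#include <cassert>

enum class Decision { Wait, AbortEnemy, AbortSelf };

// Hypothetical per-transaction bookkeeping a CM might keep.
struct TxInfo {
    int opened = 0;    // objects opened so far, used as a crude priority
};

// Hypothetical thread-local contention manager.
class SimpleCM {
public:
    explicit SimpleCM(int max_waits) : max_waits_(max_waits) {}

    // Event notifications from the transaction.
    void onBegin() { waits_ = 0; }
    void onOpen(TxInfo& me) { ++me.opened; waits_ = 0; }

    // Conflict decision: wait a bounded number of times, then abort the
    // transaction that has invested less work.
    Decision handleConflict(const TxInfo& me, const TxInfo& enemy) {
        if (waits_ < max_waits_) { ++waits_; return Decision::Wait; }
        return me.opened >= enemy.opened ? Decision::AbortEnemy
                                         : Decision::AbortSelf;
    }

private:
    int max_waits_;
    int waits_ = 0;
};
```

Keeping the manager thread-local means a conflict decision needs no synchronization of its own; only the resulting abort touches shared state.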

10
RSTM Design
[Figure: RSTM design diagram with Thread 1, Thread 2, and Thread 3]

11
Descriptor
  • Each thread has a static descriptor that is used
    for all transactions of this thread.
      • Nested transactions are not supported.
  • The descriptor has:
      • Status: ACTIVE / COMMITTED / ABORTED
      • Lists of opened objects:
          • Visible and invisible reads.
          • Eager and lazy writes.
12
Data Object
  • Shared objects hold, in addition to data fields,
    owner and next fields.
  • Owner is the descriptor of the current writer
    thread, if any.
  • Next is the original object, if this is a
    writer-made clone.

13
Shared Object Handle (1)
  • Encapsulates a reference to a shared object.
  • Global variables are handles rather than
    pointers.
  • Direct pointers are obtained within a
    transaction, via open.
  • Holds:
      • Header word: identifies the current version of
        the object.
      • Visible readers word: a bitmap of the visible
        readers.

14
Shared Object Handle (2)
  • The header is a single word that holds a pointer
    and a dirty bit.
      • This takes advantage of address alignment: the
        low bit of an aligned pointer is always zero.
  • The pointer refers to some data object, pObj.
  • The dirty bit tells whether pObj is a clean
    object or a writer-made clone.
  • This saves one dereference in the common case of
    non-conflicting access.
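
The pointer-plus-dirty-bit packing can be sketched as follows. This is an illustration under the assumption that objects are at least 2-byte aligned; `make_header`, `is_dirty`, and `object_of` are hypothetical helper names, not RSTM functions.

```cpp
#include <cassert>
#include <cstdint>

struct Object { int value; };   // hypothetical payload

static_assert(alignof(Object) >= 2, "bit 0 of an Object address must be free");

// Pack an object pointer and a dirty flag into one word.
inline std::uintptr_t make_header(Object* p, bool dirty) {
    auto bits = reinterpret_cast<std::uintptr_t>(p);
    assert((bits & 1u) == 0 && "pointer must have a free low bit");
    return bits | (dirty ? 1u : 0u);
}

// The low bit says whether the pointer refers to a writer-made clone.
inline bool is_dirty(std::uintptr_t header) { return (header & 1u) != 0; }

// Mask out the low bit to recover the object pointer.
inline Object* object_of(std::uintptr_t header) {
    return reinterpret_cast<Object*>(header & ~static_cast<std::uintptr_t>(1));
}
```

Because both pieces live in one word, a single atomic load or CAS observes or changes the pointer and the dirty flag together.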

15
Shared Object Handle (3)
  • The visible readers word is a bitmap of the visible
    readers.
  • Bit i of the mask is set if thread i is a visible
    reader of the object.
  • Allows getting all readers or adding a reader in
    a single atomic operation.
  • Limits the number of visible readers to the number
    of bits in the word; all other readers must be
    invisible.
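
A readers bitmap of this kind might look like the sketch below. It assumes a 32-bit word and uses `fetch_or` / `fetch_and` from `std::atomic` where the slides' pseudo-code uses a CAS loop; `ReadersWord` and its members are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical readers word: bit i set means thread i is a visible reader.
struct ReadersWord {
    std::atomic<std::uint32_t> bits{0};
    static constexpr unsigned kMaxVisible = 32;

    // Register thread `id`; threads beyond the bitmap width must read invisibly.
    bool add(unsigned id) {
        if (id >= kMaxVisible) return false;
        bits.fetch_or(std::uint32_t{1} << id);      // one atomic operation
        return true;
    }

    // Clear the bit of thread `id` (on commit or abort).
    void remove(unsigned id) {
        if (id < kMaxVisible) bits.fetch_and(~(std::uint32_t{1} << id));
    }

    // A writer obtains the full set of visible readers with one load.
    std::uint32_t snapshot() const { return bits.load(); }
};
```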

17
RSTM Implementation
  • This section provides pseudo-code for the
    most important STM operations:
      • Open object for read-only
      • Open object for read-write
      • Commit
      • Abort
  • The pseudo-code shows methods of the Descriptor
    class, which is the object that implements RSTM
    functionality.

18
Resolving the Data Object
    // Returns the up-to-date data object associated with a handle.
    // If the object has an active owner, call the CM.
    Object* Descriptor::resolve(Handle* shared)
    {
        long snapshot = shared->header;
        Object* ptr = (Object*)(snapshot & ~1);   // mask out LSB
        if (snapshot & 1) {                       // dirty
            switch (ptr->owner->m_status) {
            case ACTIVE:
                m_cm.handleConflict(this, ptr->owner);
                return NULL;
            case COMMITTED:
                return ptr;
            case ABORTED:
                return ptr->next;
            }
        }
        else {                                    // clean
            return ptr;
        }
    }
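
The three-way decision in resolve can be exercised in a compilable form. The sketch below is an assumption-laden reduction, not RSTM source: it uses `std::atomic` for the header and status, returns `nullptr` where the pseudo-code would consult the contention manager, and all type layouts are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

enum Status { ACTIVE, COMMITTED, ABORTED };

struct Descriptor { std::atomic<Status> status{ACTIVE}; };

struct Object {
    Descriptor* owner = nullptr;   // writer that made this clone, if any
    Object* next = nullptr;        // original version, if this is a clone
    int value = 0;                 // hypothetical data field
};

struct Handle { std::atomic<std::uintptr_t> header{0}; };

// Returns the current version, or nullptr when an active writer owns the
// object (the point where the pseudo-code calls the contention manager).
inline Object* resolve(const Handle& h) {
    std::uintptr_t snap = h.header.load();
    Object* ptr = reinterpret_cast<Object*>(snap & ~static_cast<std::uintptr_t>(1));
    if ((snap & 1u) == 0) return ptr;          // clean: ptr is current
    switch (ptr->owner->status.load()) {       // dirty: ptr is a clone
        case COMMITTED: return ptr;            // clone won
        case ABORTED:   return ptr->next;      // fall back to the original
        default:        return nullptr;        // ACTIVE: conflict
    }
}
```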

19
Open for Read-Only
    // Open an object for read-only
    Object* Descriptor::openRO(Handle* shared)
    {
        long headerSnapshot = shared->header;
        // find the data object
        Object* ptr;
        do {
            ptr = resolve(shared);
        } while (!ptr);
        if (m_isVisible) {
            m_visibleReads.add(shared);
            // install this tx as a visible reader of the object
            while (!CAS(&shared->readers, shared->readers,
                        shared->readers | (1 << m_id)))
                ;
            // make sure no writer acquired this object
            // before it could see the CAS above
            if (headerSnapshot != shared->header)
                abort();
        }
        else {
            ...

20
Open for Read-Write
    // Open an object for read-write
    Object* Descriptor::openRW(Handle* shared)
    {
        // find the data object
        Object* ptr;
        do {
            ptr = resolve(shared);
        } while (!ptr);
        // make a writable clone
        Object* clone = ptr->clone();
        clone->owner = this;
        clone->next = ptr;
        // eager acquires now; lazy acquires later
        if (m_isEager) {
            acquire(shared, clone);
            m_eagerWrites.add(shared, clone);
        }
        else {
            m_lazyWrites.add(shared, clone);
        }
        return clone;
    }

21
Acquire
    // Acquire the object
    void Descriptor::acquire(Handle* shared, Object* clone)
    {
        // replace the header with a dirty reference to the clone
        if (!CAS(&shared->header, shared->header, (long)clone | 1))
            abort();
        // abort all visible readers
        for (int i = 0; i < sizeof(shared->readers) * 8; i++)
            if (shared->readers & (1 << i))
                allDescriptors[i]->abort();
        // record this object for cleanup
        m_acquiredObjects.add(<shared, clone>);
    }
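
The header swap at the heart of acquire can be sketched with `std::atomic`. This version makes one assumption explicit that the pseudo-code leaves loose: the CAS compares against the header value the writer observed when it opened the object (`seen`). `try_acquire` and the type layouts are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct Object { Object* next = nullptr; int value = 0; };   // hypothetical layout
struct Handle { std::atomic<std::uintptr_t> header{0}; };

// Try to install `clone` as the dirty header, but only if the header still
// holds the value observed when the object was opened (`seen`).
// Returns false if another writer got there first.
inline bool try_acquire(Handle& h, std::uintptr_t seen, Object* clone) {
    std::uintptr_t desired = reinterpret_cast<std::uintptr_t>(clone) | 1u;
    return h.header.compare_exchange_strong(seen, desired);
}
```

If two writers open the same clean object and both try to acquire it, only the first CAS succeeds; the loser sees a changed header and would abort or retry.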

22
Commit
    // Commit a transaction
    void Descriptor::onCommit()
    {
        validate();
        // acquire now the lazily opened-for-rw objects
        acquireLazyWrites();
        // if this CAS succeeds, our clones (if any) become the
        // active objects; this is the linearization point
        CAS(&m_status, ACTIVE, COMMITTED);
        if (m_status == COMMITTED) {
            // replace a dirty reference to our clone
            // with a clean reference to our clone
            for (<shared, clone> in m_acquiredObjects)
                CAS(&shared->header, clone | 1, clone);
        }
        // remove the thread from the readers bitmap of all
        // visibly opened objects
        for (Handle* shared in m_visibleReads)
            while (!CAS(&shared->readers, shared->readers,
                        shared->readers & ~(1 << m_id)))
                ;
    }

23
Abort
    // Called when an Aborted exception is caught
    void Descriptor::onAbort()
    {
        // after this CAS, our clones (if any) are discarded
        CAS(&m_status, ACTIVE, ABORTED);
        // clean up the written objects:
        // replace a dirty reference to our clone
        // with a clean reference to the original object
        for (<shared, clone> in m_acquiredObjects)
            CAS(&shared->header, clone | 1, clone->next);
        // remove the thread from the readers bitmap of all
        // visibly opened objects
        for (Handle* shared in m_visibleReads)
            while (!CAS(&shared->readers, shared->readers,
                        shared->readers & ~(1 << m_id)))
                ;
    }
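
Commit and abort both hinge on a single CAS on the descriptor status. A minimal sketch of that idea, with a hypothetical `Descriptor` reduced to its status word, shows that the two transitions are mutually exclusive:

```cpp
#include <atomic>
#include <cassert>

enum Status { ACTIVE, COMMITTED, ABORTED };

// Hypothetical descriptor reduced to its status word.
struct Descriptor {
    std::atomic<Status> status{ACTIVE};

    // Both transitions start from ACTIVE, so at most one of them can succeed.
    bool try_commit() {
        Status expected = ACTIVE;
        return status.compare_exchange_strong(expected, COMMITTED);
    }
    bool try_abort() {
        Status expected = ACTIVE;
        return status.compare_exchange_strong(expected, ABORTED);
    }
};
```

Since every clone points at its owner's descriptor, flipping this one word decides the fate of all of the transaction's clones at once; the header cleanup that follows is only an optimization for later readers.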

25
Performance Results (1)
  • Compare ASTM and RSTM (previous work showed that
    ASTM outperforms DSTM and OSTM).
  • Platform: 16-processor SunFire 6800 at 1.2 GHz.
  • Use several benchmarks with different
    configurations: visible/invisible readers,
    eager/lazy writers.
  • Each benchmark was run for 10 seconds with 1 to
    28 threads.
  • Contention manager: Polka.
  • Metric: number of successful transactions.

26
Performance Results (2)
  • RSTM with invisible readers is 2 times better
    than ASTM.
  • Visible readers are expensive because each
    transaction that reads the root node also updates
    its readers bitmap, which causes cache
    invalidations.
  • The only difference between the C++ ASTM and RSTM is
    metadata organization.

27
Performance Results (3)
  • In LinkedList, FGL (fine-grained locking) performs
    badly when there are more threads than CPUs, due to
    preemption.
  • In LinkedList, ASTM outperforms RSTM since each
    writer invalidates objects for many readers.
  • HashTable allows great concurrency, so RSTM works
    well (3 times faster than ASTM).

28
Performance Results (4)
  • In RandomGraph and LFUCache, all STMs perform
    worse than CGL (coarse-grained locking), because
    these data structures do not allow much concurrency.
  • Nevertheless, RSTM beats ASTM.

29
Conclusion
  • RSTM has a novel metadata organization that
    reduces overhead through:
      • One level of indirection instead of the usual
        two.
      • Static instead of dynamic data structures.
  • RSTM provides a variety of policies for conflict
    detection, so it can be customized for a given
    workload.
  • Compared to ASTM, RSTM gives better performance
    due to better metadata organization.