Lecture 12: TM, Consistency Models - PowerPoint PPT Presentation

Author: Rajeev Balasubramonian
Created: 9/20/2002

Slides: 28
Provided by: RajeevBalas168
Learn more at: https://my.eng.utah.edu

1
Lecture 12: TM, Consistency Models
  • Topics: TM pathologies, sequential consistency, hw and hw/sw optimizations

2
Paper on TM Pathologies (ISCA'08)
  • LL: lazy versioning, lazy conflict detection, committing transaction wins conflicts
  • EL: lazy versioning, eager conflict detection, requester succeeds and others abort
  • EE: eager versioning, eager conflict detection, requester stalls

3
Pathology 1: Friendly Fire
  • VM: any
  • CD: eager
  • CR: requester wins
  • Two conflicting transactions that keep aborting each other
  • Can do exponential back-off to handle livelock
  • Fixable by doing requester stalls? Also fixable by doing requester wins only if the requester is older
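
The exponential back-off mentioned above can be sketched as follows. This is a minimal illustration and is not taken from the ISCA'08 paper: the function name `backoff_delay`, the `base` and `cap` parameters, and the uniformly random delay are all assumptions made for the example.

```python
import random

def backoff_delay(attempt, base=1, cap=1024, rng=random):
    """Pick a random wait (in abstract cycles) before retry number `attempt`.

    The window doubles with every abort, up to `cap`, so two transactions
    that keep aborting each other become less likely to collide again.
    All parameter names and defaults here are hypothetical.
    """
    window = min(cap, base * (2 ** attempt))
    return rng.randint(0, window)  # randint is inclusive on both ends

# Example: delays chosen for the first six retries of one transaction.
rng = random.Random(0)  # fixed seed so the run is repeatable
delays = [backoff_delay(n, rng=rng) for n in range(6)]
```

The randomness matters as much as the doubling: if both transactions waited exactly the same number of cycles, they would simply collide again later.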

4
Pathology 2: Starving Writer
  • VM: any
  • CD: eager
  • CR: requester stalls
  • A writer has to wait for the reader to finish, but if more readers keep showing up, the writer is starved (note that the directory allows new readers to proceed by just adding them to the list of sharers)
  • Fixable by forcing the directory to override requester-stalls on a starvation alarm

5
Pathology 3: Serialized Commit
  • VM: lazy
  • CD: lazy
  • CR: any
  • If there's a single commit token, transaction commit is serialized
  • There are ways to alleviate this problem (discussed in the last class)

6
Pathology 4: Futile Stall
  • VM: any
  • CD: eager
  • CR: requester stalls
  • A transaction is stalling on another transaction that ultimately aborts and takes a while to reinstate old values; no good workaround

7
Pathology 5: Starving Elder
  • VM: lazy
  • CD: lazy
  • CR: committer wins
  • Small successful transactions can keep aborting a large transaction
  • The large transaction can eventually grab the token and not release it until after it commits

8
Pathology 6: Restart Convoy
  • VM: lazy
  • CD: lazy
  • CR: committer wins
  • A number of similar (conflicting) transactions execute together: one wins, the others all abort; shortly, these transactions all return and repeat the process
  • Use exponential back-off

9
Pathology 7: Dueling Upgrades
  • VM: eager
  • CD: eager
  • CR: requester stalls
  • If two transactions both read the same object and then both decide to write it, a deadlock is created
  • Exacerbated by the Futile Stall pathology
  • Solution?
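
One way to see why Dueling Upgrades deadlocks is to draw a wait-for graph: under requester stalls, each upgrading writer waits for every other reader of the object, which closes a cycle. The sketch below only detects that cycle and does not answer the slide's question. The names `has_cycle`, `waits_for`, and the transaction labels are hypothetical.

```python
def has_cycle(waits_for):
    """Depth-first search for a cycle in a wait-for graph.

    `waits_for` maps each transaction to the set of transactions it stalls on.
    """
    visiting, done = set(), set()

    def visit(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True  # reached a transaction already on the current path
        visiting.add(txn)
        if any(visit(other) for other in waits_for.get(txn, ())):
            return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(visit(txn) for txn in list(waits_for))

# Both transactions have read object X and now both request write permission.
readers = {"T1", "T2"}
# Requester stalls: each upgrader waits for all *other* readers to finish.
dueling = {txn: readers - {txn} for txn in readers}
# Contrast: only T1 upgrades, so T2 waits on nobody.
single_upgrade = {"T1": {"T2"}, "T2": set()}
```

Here `has_cycle(dueling)` is true while `has_cycle(single_upgrade)` is false, matching the slide's claim that the deadlock needs both readers to upgrade.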

10
Four Extensions
  • Predictor: predict if the read will soon be followed by a write and acquire write permissions aggressively
  • Hybrid: if a transaction believes it is a Starving Writer, it can force other readers to abort; for everything else, use requester stalls
  • Timestamp: in the EL case, requester wins only if it is the older transaction (handles Friendly Fire pathology)
  • Backoff: in the LL case, aborting transactions invoke exponential back-off to prevent convoy formation
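
The Timestamp rule can be written as a one-line predicate. This is a minimal sketch: the `Txn` class, its `timestamp` field (smaller means older), and the function name `requester_wins` are assumptions for illustration. What a losing requester does next is not stated on the slide and is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Txn:
    name: str
    timestamp: int  # hypothetical field: smaller value = started earlier (older)

def requester_wins(requester: Txn, holder: Txn) -> bool:
    """Timestamp rule: the requester wins only if it is the older transaction."""
    return requester.timestamp < holder.timestamp

old = Txn("T1", timestamp=10)
young = Txn("T2", timestamp=20)
```

With distinct timestamps, `requester_wins(old, young)` and `requester_wins(young, old)` can never both be true, so the two transactions cannot keep aborting each other as in Friendly Fire.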

11
Coherence vs. Consistency
  • Recall that coherence guarantees (i) that a write will eventually be seen by other processors, and (ii) write serialization (all processors see writes to the same location in the same order)
  • The consistency model defines the ordering of writes and reads to different memory locations; the hardware guarantees a certain consistency model and the programmer attempts to write correct programs with those assumptions

12
Example Programs

Initially, A = B = 0
  P1                    P2
  A ← 1                 B ← 1
  if (B == 0)           if (A == 0)
    critical section      critical section

Initially, A = B = 0
  P1          P2                P3
  A ← 1       if (A == 1)       if (B == 1)
                B ← 1             register ← A

  P1                 P2
  Data ← 2000        while (Head == 0) { }
  Head ← 1           ... ← Data

13
Sequential Consistency
  P1          P2
  Instr-a     Instr-A
  Instr-b     Instr-B
  Instr-c     Instr-C
  Instr-d     Instr-D
  • We assume:
  • Within a program, program order is preserved
  • Each instruction executes atomically
  • Instructions from different threads can be interleaved arbitrarily
  • Valid executions: abAcBCDdeE or ABCDEFabGc or abcAdBe or aAbBcCdDeE or ...
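
The three assumptions above can be turned into a small checker that decides whether a string of instruction letters is a legal interleaving. This is a sketch under assumptions: each instruction is a single letter, the programs `P1` and `P2` are taken to continue through g/G so that the slide's examples fit, and the function name is hypothetical.

```python
def preserves_program_order(execution, programs):
    """True if `execution` uses only known instructions and, for every thread,
    the instructions it contributes form a prefix of that thread's program."""
    known = set("".join(programs))
    if any(instr not in known for instr in execution):
        return False
    for program in programs:
        issued = "".join(i for i in execution if i in program)
        if not program.startswith(issued):
            return False
    return True

# Hypothetical programs: lower case for P1, upper case for P2.
P1, P2 = "abcdefg", "ABCDEFG"
slide_examples = ["abAcBCDdeE", "ABCDEFabGc", "abcAdBe", "aAbBcCdDeE"]
results = [preserves_program_order(e, [P1, P2]) for e in slide_examples]
```

All four executions listed on the slide pass the check, whereas something like `"baAB"` fails because b is issued before a.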

14
Sequential Consistency
  • Programmers assume SC; it makes it much easier to reason about program behavior
  • Hardware innovations can disrupt the SC model
  • For example, if we assume write buffers, or out-of-order execution, or if we drop ACKs in the coherence protocol, the previous programs yield unexpected outputs

15
Consistency Example - 1
  • Consider a multiprocessor with bus-based snooping cache coherence and a write buffer between CPU and cache

Initially, A = B = 0
  P1                 P2
  A ← 1              B ← 1
  if (B == 0)        if (A == 0)
    Crit.Section       Crit.Section

The programmer expected the above code to implement a lock; because of write buffering, both processors can enter the critical section.
The consistency model lets the programmer know what assumptions they can make about the hardware's reordering capabilities.
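
The claim that write buffering breaks this lock can be checked by brute force. The sketch below enumerates every interleaving of the two programs, first with program order kept (SC) and then with each processor's read allowed to pass its own earlier write. Modeling the write buffer as that single reordering is an assumption, as are all function and variable names.

```python
def interleavings(p, q):
    """Yield every merge of sequences p and q that keeps each one's internal order."""
    if not p or not q:
        yield list(p) + list(q)
        return
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest

def run(schedule):
    """Execute operations one at a time against a single shared memory."""
    mem, reads = {"A": 0, "B": 0}, {}
    for kind, addr, val in schedule:
        if kind == "W":
            mem[addr] = val
        else:
            reads[addr] = mem[addr]
    return reads["B"], reads["A"]  # (value P1 read from B, value P2 read from A)

P1 = [("W", "A", 1), ("R", "B", None)]
P2 = [("W", "B", 1), ("R", "A", None)]

# Sequential consistency: program order is kept inside each processor.
sc_outcomes = {run(s) for s in interleavings(P1, P2)}

# Assumed write-buffer model: a read may reach memory before the processor's
# own earlier write to a different address (write-to-read reordering).
wb_outcomes = set()
for p1 in (P1, P1[::-1]):
    for p2 in (P2, P2[::-1]):
        wb_outcomes |= {run(s) for s in interleavings(p1, p2)}
```

The outcome `(0, 0)` means both processors saw the other's flag as 0 and both entered the critical section. It never appears in `sc_outcomes` but does appear in `wb_outcomes`.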
16
Consistency Example - 2
  P1                 P2
  Data ← 2000        while (Head == 0) { }
  Head ← 1           ... ← Data

Sequential consistency requires program order:
  • the write to Data has to complete before the write to Head can begin
  • the read of Head has to complete before the read of Data can begin
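
The first requirement (the write to Data completes before the write to Head) can be illustrated with a tiny model of P1's writes reaching memory one at a time. This sketch covers only the write side and assumes P2 reads Data immediately after it sees Head equal to 1. The function name and the list-of-writes representation are hypothetical.

```python
def observable_data(write_order):
    """Return the set of Data values P2 could read, given the order in which
    P1's writes become visible in memory."""
    mem = {"Data": 0, "Head": 0}
    seen = set()
    for addr, val in write_order:
        mem[addr] = val
        if mem["Head"] == 1:       # P2's spin loop could exit at this point
            seen.add(mem["Data"])  # ...and P2 would then read Data
    return seen

program_order = [("Data", 2000), ("Head", 1)]
reordered = [("Head", 1), ("Data", 2000)]  # writes made visible out of order

in_order_values = observable_data(program_order)
reordered_values = observable_data(reordered)
```

With the writes in program order P2 can only ever read 2000; once they are reordered, the stale value 0 becomes possible.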
17
Consistency Example - 3
Initially, A = B = 0
  P1          P2                P3
  A ← 1       if (A == 1)       if (B == 1)
                B ← 1             register ← A

Sequential consistency can be had if a process makes sure that everyone has seen an update before that value is read; else, write atomicity is violated
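
A violation of write atomicity can be sketched by giving each processor its own view of memory and letting P1's write to A reach P2 before it reaches P3. The per-processor `view` dictionaries, the `atomic` flag, and the function name are assumptions made for this illustration; real hardware is considerably more involved.

```python
def p3_register(atomic):
    """Return the value P3 loads into its register, or None if P3 skips the load."""
    everyone = ("P1", "P2", "P3")
    view = {p: {"A": 0, "B": 0} for p in everyone}

    def write(addr, val, visible_to):
        for p in visible_to:
            view[p][addr] = val

    # P1: A <- 1. If the write is not atomic, P3 has not seen it yet.
    write("A", 1, everyone if atomic else ("P1", "P2"))

    # P2: if (A == 1) B <- 1, and this write reaches everyone.
    if view["P2"]["A"] == 1:
        write("B", 1, everyone)

    # P3: if (B == 1) register <- A.
    reg = view["P3"]["A"] if view["P3"]["B"] == 1 else None

    if not atomic:
        write("A", 1, ("P3",))  # the delayed update arrives too late
    return reg
```

`p3_register(atomic=True)` yields 1, the only result sequential consistency allows here, while `p3_register(atomic=False)` yields 0 because P3 sees B's new value before A's.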
18
Sequential Consistency
  • A multiprocessor is sequentially consistent if the result of the execution is achievable by maintaining program order within a processor and interleaving accesses by different processors in an arbitrary fashion
  • The multiprocessors in the previous examples are not sequentially consistent
  • Can implement sequential consistency by requiring the following: program order, write serialization, everyone has seen an update before a value is read; very intuitive for the programmer, but extremely slow

19
HW Performance Optimizations
  • Program order is a major constraint; the following try to get around this constraint without violating seq. consistency
  • if a write has been stalled, prefetch the block in exclusive state to reduce traffic when the write happens
  • allow out-of-order reads with the facility to rollback if the ROB detects a violation (detected by re-executing the read later)

20
Relaxed Consistency Models (HW/SW)
  • We want an intuitive programming model (such as sequential consistency) and we want high performance
  • We care about data races and re-ordering constraints for some parts of the program and not for others; hence, we will relax some of the constraints for sequential consistency for most of the program, but enforce them for specific portions of the code
  • Fence instructions are special instructions that require all previous memory accesses to complete before proceeding (sequential consistency)

21
Fences
  P1                      P2
  Region of code          Region of code
  with no races           with no races
  Fence                   Fence
  Acquire_lock            Acquire_lock
  Fence                   Fence
  Racy code               Racy code
  Fence                   Fence
  Release_lock            Release_lock
  Fence                   Fence
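
The effect of a fence can be sketched with a toy store-buffer model: stores sit in a per-core buffer until drained, loads go to memory unless the core has its own pending write to that address, and a fence drains the buffer. The `Core` class, its method names, and the single fixed schedule in `both_enter` are assumptions for illustration, not a description of any particular processor.

```python
class Core:
    """Toy core with a FIFO store buffer in front of a shared memory dict."""

    def __init__(self, mem):
        self.mem = mem
        self.buffer = []  # pending (addr, val) stores, oldest first

    def store(self, addr, val):
        self.buffer.append((addr, val))  # not yet visible to other cores

    def load(self, addr):
        for a, v in reversed(self.buffer):
            if a == addr:
                return v  # forward the core's own most recent pending write
        return self.mem[addr]

    def fence(self):
        for a, v in self.buffer:  # make all earlier stores globally visible
            self.mem[a] = v
        self.buffer.clear()

def both_enter(use_fence):
    """Run the flag-based lock from the earlier example on one bad schedule."""
    mem = {"A": 0, "B": 0}
    p1, p2 = Core(mem), Core(mem)
    p1.store("A", 1)
    p2.store("B", 1)
    if use_fence:
        p1.fence()
        p2.fence()
    r1, r2 = p1.load("B"), p2.load("A")
    p1.fence()
    p2.fence()  # buffers drain eventually either way
    return r1 == 0 and r2 == 0
```

On this schedule `both_enter(False)` is true (both cores read stale zeros), while `both_enter(True)` is false because each store is drained before the following load.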
22
Potential Relaxations
  • Program Order (all refer to different memory locations):
      • Write to Read program order
      • Write to Write program order
      • Read to Read and Read to Write program orders
  • Write Atomicity (refers to same memory location):
      • Read others' write early
  • Write Atomicity and Program Order:
      • Read own write early

23
Relaxations

Relaxation   W→R Order   W→W Order   R→RW Order   Rd others' Wr early   Rd own Wr early
IBM 370          X
TSO              X                                                            X
PC               X                                        X                   X
SC                                                                            X

  • IBM 370: a read can complete before an earlier write to a different address, but a read cannot return the value of a write unless all processors have seen the write
  • SPARC V8 Total Store Ordering (TSO): a read can complete before an earlier write to a different address, but a read cannot return the value of a write by another processor unless all processors have seen the write (it returns the value of own write before others see it)
  • Processor Consistency (PC): a read can complete before an earlier write (by any processor to any memory location) has been made visible to all

24
Performance Comparison
  • Taken from Gharachorloo, Gupta, Hennessy, ASPLOS'91
  • Studies three benchmark programs and three different architectures
  • MP3D: 3-D particle simulator
  • LU: LU-decomposition for dense matrices
  • PTHOR: logic simulator
  • LFC: aggressive, lockup-free caches, write buffer with bypassing
  • RDBYP: only write buffer with bypassing
  • BASIC: no write buffer, no lockup-free caches

25
Performance Comparison

26
Summary
  • Sequential Consistency restricts performance (even more when memory and network latencies increase relative to processor speeds)
  • Relaxed memory models relax different combinations of the five constraints for SC
  • Most commercial systems are not sequentially consistent and rely on the programmer to insert appropriate fence instructions to provide the illusion of SC
