Shared Memory Consistency Protocol Verification against Weak Memory Models: Refinement via Model-checking - PowerPoint PPT Presentation

Title: Shared Memory Consistency Protocol Verification against Weak Memory Models: Refinement via Model-checking


1
Shared Memory Consistency Protocol Verification
against Weak Memory Models: Refinement via
Model-checking
  • Prosenjit Chatterjee,
  • Hemanthkumar Sivaraj,
  • Ganesh Gopalakrishnan
  • School of Computing, University of Utah
  • http://www.cs.utah.edu/formal_verification/

pchatterjee@nvidia.com
{hemanth, ganesh}@cs.utah.edu
Supported by NSF awards CCR 9987516 and
0081406, and an equipment gift from Intel Corporation.
2
Shared memory multiprocessors
Desktop machines
Servers and Supercomputers

dir
dir
3
How is the programmer's view classically
specified?
Logical View
sequential consistency
Processors
(Coherence means per location SC)
Memory
One disallowed scenario
st(b,2) ld(a,0)
st(a,1) ld(b,0)
Initial memory contents 0
cpu
cpu
Peterson? No!
mem
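The disallowed scenario above can be checked by brute force. The following is a minimal sketch, not part of the original slides: it assumes sequential consistency means "some interleaving of the two programs over a single memory", and the names `interleavings` and `run_sc` are hypothetical helpers introduced only for illustration.

```python
def interleavings(p1, p2):
    """Yield every merge of p1 and p2 that preserves each program's own order."""
    if not p1:
        yield list(p2)
        return
    if not p2:
        yield list(p1)
        return
    for rest in interleavings(p1[1:], p2):
        yield [p1[0]] + rest
    for rest in interleavings(p1, p2[1:]):
        yield [p2[0]] + rest


def run_sc(order):
    """Run one interleaving against a single memory; return the loaded values."""
    mem = {"a": 0, "b": 0}  # initial memory contents 0, as on the slide
    loads = {}
    for proc, op, addr, val in order:
        if op == "st":
            mem[addr] = val
        else:
            loads[(proc, addr)] = mem[addr]
    return loads


# The two programs from the slide; load values are left open (None).
P1 = [("P1", "st", "b", 2), ("P1", "ld", "a", None)]
P2 = [("P2", "st", "a", 1), ("P2", "ld", "b", None)]

# Each outcome is (value of ld(a) on P1, value of ld(b) on P2).
outcomes = {
    (r[("P1", "a")], r[("P2", "b")])
    for r in (run_sc(o) for o in interleavings(P1, P2))
}
print(sorted(outcomes))
print((0, 0) in outcomes)
```

Under these assumptions the pair ld(a,0), ld(b,0) never appears among the outcomes, which is why the slide marks it as disallowed under SC.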
4
Growing CPU / Memory performance gap necessitates
weakenings
Bypassing (read back own store before others)
Aggressive load/store reorderings
Strong orderings only at acquires/releases
.
cpu
cpu
cpu
mem

dir
dir
all that and more!
5
Overall Features of Weak Memory Models
  • Support ordinary as well as special loads
    and stores
  • Support fences and synchronization primitives
  • Orderings may even depend on dynamic context
  • => Provide a much larger range of load values

Therefore
  • Writing a formal specification is highly
    non-trivial
  • Writing a spec that supports verification is
    even trickier

6
A variety of highly intricate weak memory models
exist
PowerPC model
Alpha model
Sparc TSO / PSO / RMO
Itanium RC_tso
Java mem model
7
One almost wishes to go back to SC
sequential consistency IS good
  • Simplifies programming
  • Some hardware tricks to hide latencies

It does not seem a realistic goal for now
  • Range of such tricks limited
  • Complexity for end-users is containable

8
The Verification Problem
Given
  • a formal specification of a weak consistency
    model (SPEC)
  • a finite-state model of the shared memory system
    (IMP)

Verify that
  • the executions of IMP ⊆ executions
    allowed by SPEC

Our work enables this checking to be achieved
using finite-state reachability
9
Related Work
For SC
  • Qadeer CAV99, SRC TR 176
  • Condon, Hu, et al. SPAA01
  • Nalumasu et al. CAV98
  • Dist. Computing Special Issue 99
  • MPV Workshop Post FMCAD00

For Weak Models
  • Qadeer MPV workshop
  • Condon et al. HPCA99
  • Ghughal and Gopalakrishnan FMPPTA00

10
Our Emphasis
  • Simple and intuitive SPECs
  • Support a wide range of memory models
  • Support automated (finite-state) verification
  • Avoid backtracking search over SPEC's executions
  • Avoid bloating state-space beyond that of IMP

11
Verification Criterion Illustrated
Show that
P1 st(a,1) ld(a)
P2 ld(a)
st(a,1) ld(a,1) ld(a, 0)
IMP
Implies
Same execution
Same program
Spec
12
Idea: Employ a model-checker to establish
refinement
Executable SPEC
load
store
load
load values agree
load values agree

store
load
load
IMP
  • Must do a non-backtracking search over SPEC's
    executions
  • SPEC must be deterministic with respect to
    recorded events
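The two requirements above can be illustrated with a small lock-step reachability check. This is a hedged sketch, not the authors' Murphi setup: `check_refinement`, `spec_step`, `good_imp` and `stale_imp` are hypothetical names, the memory has a single location, and the toy implementations are bounded to a few steps so the state space stays finite. Because the SPEC is a deterministic function of the recorded event, the search never backtracks over SPEC choices.

```python
from collections import deque


def check_refinement(imp_init, imp_next, spec_init, spec_step):
    """Breadth-first search over (IMP state, SPEC state) pairs.

    imp_next(state) yields (event, next_state) pairs (may be nondeterministic).
    spec_step(state, event) returns the unique next SPEC state, or None if the
    SPEC does not allow the event. Returns None on success, or the offending
    event trace as a tuple.
    """
    start = (imp_init, spec_init)
    seen = {start}
    todo = deque([(start, ())])
    while todo:
        (imp, spec), trace = todo.popleft()
        for event, imp2 in imp_next(imp):
            spec2 = spec_step(spec, event)
            if spec2 is None:
                return trace + (event,)
            pair = (imp2, spec2)
            if pair not in seen:
                seen.add(pair)
                todo.append((pair, trace + (event,)))
    return None


def spec_step(mem, event):
    """Toy SPEC: one location; a load must return the last stored value."""
    op, val = event
    if op == "st":
        return val
    return mem if val == mem else None


def good_imp(state):
    """Toy IMP that reads memory directly; `steps` bounds the run length."""
    mem, steps = state
    if steps == 0:
        return
    yield ("ld", mem), (mem, steps - 1)
    yield ("st", 1), (1, steps - 1)


def stale_imp(state):
    """Toy buggy IMP whose loads read a cache that is never refreshed."""
    mem, cache, steps = state
    if steps == 0:
        return
    yield ("ld", cache), (mem, cache, steps - 1)
    yield ("st", 1), (1, cache, steps - 1)


print(check_refinement((0, 2), good_imp, 0, spec_step))
print(check_refinement((0, 0, 2), stale_imp, 0, spec_step))
```

In this sketch each IMP state is paired with exactly one SPEC state per event trace, which is the sense in which determinism keeps the product from growing through SPEC-side nondeterminism.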

therefore
13
What events do we record? Not just Loads and
Stores!
SPEC Carbon-copy of Imp
st(a,1) ld(a,1) ld(a, 1)
P1 st(a,1) ld(a)
P2 ld(a)
eh?
phew!
st(a,1) ld(a,1) ld(a, 0)
Imp
P1
P2
st(a,1) ld(a,1) ld(a, 1)
st
st
ld
ld
  • st(a,1) drained to M
  • ld(a,1), ld(a,1) read from M

LB
LB
SB
SB
st(a,1) ld(a,1) ld(a, 0)
M
  • st(a,1) in SB; ld(a,1) from SB
  • ld(a,0) from M

14
Use Visibility Order style SPECs
  • -- Already growing in use (Itanium spec,
    Neiger, Condon, ...)
  • -- Helps export internal events to determinize
    SPEC's executions
  • -- Defines Read Values to depend on most
    recent write

Choices revealed..
ld(a,0) st_G(a,1)
SPEC Carbon-copy of Imp
st_L(a,1) ld(a,1) st_G(a,1) ld(a,1)
P1 st(a,1) ld(a)
P2 ld(a)
Imp
st_L(a,1) ld(a,1) ld(a, 0) st_G(a,1)
15
Example of Visibility Order Spec (Condon, HPCA99)
In non-Visibility Order style
  • program order <p
  • memory order <m
  • is in TSO if
  • (Memory order constraints)
  • X <p Y /\ isLD(X) /\ isST(Y) => X <m Y
  • X <p MB <p Y => X <m Y
  • Read value rule
  • Value of LD, X
  • Value of closest store Y
    before or after X in <m
    (local bypassing detail is messy)
In Visibility Order style
  • program order
  • a total order of LD, ST_L, ST_G
  • is in TSO if
  • (Memory order constraints)
  • conditions on split stores
  • Read value rule
  • Value of LD, X
  • -- most recent ST_L, when ST_G is after X
    (local bypassing)
  • -- most recent ST_G, otherwise
    (local bypassing not exercised)
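The visibility-order read-value rule can be sketched as a single pass over one total order of LD, ST_L and ST_G events. This is an illustrative reading of the rule, not the specification from Condon et al.: the function name `check_tso_read_values` and the event encoding `(kind, proc, addr, val)` are assumptions, and the "conditions on split stores" are assumed here to mean that each ST_G follows its ST_L and that a processor's stores become global in FIFO order.

```python
def check_tso_read_values(order, init=0):
    """Check load values in one visibility order of (kind, proc, addr, val).

    kind is "ST_L" (store visible locally), "ST_G" (store visible globally)
    or "LD". Returns True if every load obeys the read-value rule.
    """
    memory = {}   # addr -> value of the most recent ST_G
    pending = {}  # proc -> stores with ST_L seen but ST_G still to come
    for kind, proc, addr, val in order:
        buf = pending.setdefault(proc, [])
        if kind == "ST_L":
            buf.append((addr, val))
        elif kind == "ST_G":
            # Assumed split-store condition: ST_G matches the oldest pending ST_L.
            if not buf or buf[0] != (addr, val):
                return False
            buf.pop(0)
            memory[addr] = val
        else:
            # Local bypassing: own ST_L whose ST_G comes after this load.
            local = [v for a, v in buf if a == addr]
            expected = local[-1] if local else memory.get(addr, init)
            if val != expected:
                return False
    return True


# The two visibility orders shown on the previous slide
# (P1: st(a,1) ld(a); P2: ld(a)).
bypass = [("ST_L", "P1", "a", 1), ("LD", "P1", "a", 1),
          ("LD", "P2", "a", 0), ("ST_G", "P1", "a", 1)]
drained = [("ST_L", "P1", "a", 1), ("LD", "P1", "a", 1),
           ("ST_G", "P1", "a", 1), ("LD", "P2", "a", 1)]
# A hypothetical bad order: P1 misses its own local store.
bad = [("ST_L", "P1", "a", 1), ("LD", "P1", "a", 0),
       ("ST_G", "P1", "a", 1)]

print(check_tso_read_values(bypass), check_tso_read_values(drained),
      check_tso_read_values(bad))
```

Splitting the store is what makes both load outcomes for P2 explainable by one order each, without consulting anything beyond the most recent relevant store event.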

16
Our Contributions
  • Visibility order SPECs for a wide range of mem
    models
  • Built executable SPEC generator prototype
  • (runnable over web)
  • Verification of refinement using Parallel Murphi
  • (ported to MPI at Utah)
  • Verification without bloating IMP's state-space
  • and without backtracking on SPEC's executions
  • Two snoopy-bus protocols modeled after Alpha and
    Itanium
  • Two snoopy protocols where temporal order !=
    visibility order
  • One directory-based protocol (Avalanche
    multiprocessor)

17
Details of our solution: Addressing the large
abstraction distance
IMP
SPEC

dir
dir
18
Approach: Exploit Bug-classification

Inside Directories, Interconnects,
Inside CPU chips
(More design groups have control over this)
(Fewer design groups have control over this)
So develop Intermediate Abstraction
that Retains External Partition
19
The Intermediate Abstraction
Intermediate Abstraction
SPEC
IMP
Retain internal partition
THIS PAPER
FUTURE WORK

Visibility order Read-value rule
dir
dir
Simplify external partition
20
External Partition Replacement Depends on SPEC
Memory Model
Strong
Weak
Weakest
Hybrid
PC PowerPC PRAM
Slow Memory Cache C
Causal C
Itanium Weak C Entry C
Release C
TSO
S C IBM370
PSO
RMO
Alpha
( C means Consistency )
21
Abstraction Method for External Partition
Memory Model   Splitting of store instructions   External Partition
Strong         store unsplit                     single port memory
Weak           store -> local, global            single port memory
Weakest        store -> local, global, global    Memory re-order buffer per processor
Hybrid         store -> local, global, global    Memory re-order buffer per processor
22
Creating the Intermediate Abstraction
CPU1
CPU2
Pipe
Pipe
st
st
ld
ld
RB
RB
SB
SB
Snoopy-bus or Directory-based Memory Subsystem
23
Overall approach
Generate Executable Spec Run it, and gain
understanding
Define Spec
Start
Phase 1
Final Spec
Annotate Imp with events
Design Imp
Phase 2
Annotated Imp
Failure
Verify Impabs
Success
Phase 3
Derive Impabs
Verify against Impabs
Final Imp
24
Verification
Intermediate Abstraction
IMP
25
Runs on 16-CPU Parallel Murphi, ported to MPI at Utah.
Each CPU @ 850 MHz, 256 MB per node (LAN communication).
                            Alpha model w/o Barriers and LL/SC    Itanium w/o weak ld/st Semaphores (RC_tso)
Protocol                    States (M)  Trans (M)  Time (h)       States (M)  Trans (M)  Time (h)
Split Trans Bus             64          470        0.95           111         985        1.75
-- with Scheurich Opt       251         1794       3.4            325         2769       4.8
Multiple Interleaved Bus    255         1820       3.6            773         2686       11
-- with Scheurich           278         1946       3.9            927         3402       12
26
  • Features of Examples
  • Examples with Scheurich's optimization
  • -- Logical order != Temporal order
  • Directory Protocols
  • -- a Migratory directory protocol using PV and
    SPIN
  • found no errors (parallel search not tried)
  • Other directory protocols as well as
  • Itanium (hybrid) memory model soon to be tried

27
Bugs likely to be caught
  • Not just coherence
  • SC violations
  • Write atomicity violations
  • Hybrid memory ordering violations
  • Bugs in internal partition
  • will be caught when
  • intermediate abstraction compared against SPEC

28
How to scale up?
  • Improve parallel model-checker
  • Approximate search (e.g., parallel random-walk)
  • Bounded model-checking (enumerative or SAT)
  • Exploit data independence
  • Try many examples, and refine methodology

29
Conclusions
  • Efficient use of reachability analysis to verify
  • IMP against weak memory model SPEC
  • Applicable to a whole range of weak models
  • Selection of Intermediate Abstraction is
    systematic
  • Annotating Intermediate Abstractions is not hard
  • State explosion problem is not worsened
  • An easy-to-use verification technique that
  • multiprocessor designers can use readily.

30
Extra Slides
31
Visibility Order explained using SC
  • SC executions have a single visibility order, V
  • Stores present in V consistent with prog. order
    (single store order)
  • Loads present in V consistent with prog. order
  • Each load to address A returns value D
  • that the most recent store in V to A wrote

NON-SC
SC
P1
P2
P1
P2
st(b,2) ld(a,0)
st(a,1) ld(b,0)
ld(q,2) ld(p,1)
st(p,1) st(q,2)
st(a,1) ld(b,0) st(b,2) ld(a,0)
st(p,1) ld(q,2) st(q,2) ld(p,1)
whoops!
OK!
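The SC conditions on this slide can be turned into a small checker. This is a minimal sketch under stated assumptions: `is_sc_visibility_order` is a hypothetical name, events are encoded as `(proc, op, addr, val)` with loads carrying the value they observed, and the assignment of the p/q program to processors follows one reading of the flattened slide. The passing order used for the SC example is chosen here for illustration.

```python
from itertools import permutations


def is_sc_visibility_order(order, programs, init=0):
    """Check that `order` is a legal single visibility order V under SC.

    order: list of (proc, op, addr, val).
    programs: dict mapping proc -> list of (op, addr, val) in program order.
    """
    # V restricted to each processor must be exactly that processor's program.
    for proc, prog in programs.items():
        if [e[1:] for e in order if e[0] == proc] != prog:
            return False
    # Each load must return the value of the most recent store in V.
    mem = {}
    for proc, op, addr, val in order:
        if op == "st":
            mem[addr] = val
        elif mem.get(addr, init) != val:
            return False
    return True


def has_sc_order(programs):
    """Try every permutation of the events (fine for four events)."""
    events = [(p,) + e for p, prog in programs.items() for e in prog]
    return any(is_sc_visibility_order(list(v), programs)
               for v in permutations(events))


non_sc = {"P1": [("st", "b", 2), ("ld", "a", 0)],
          "P2": [("st", "a", 1), ("ld", "b", 0)]}
sc = {"P1": [("ld", "q", 2), ("ld", "p", 1)],
      "P2": [("st", "p", 1), ("st", "q", 2)]}

# The attempted order from the slide for the NON-SC execution.
attempt = [("P2", "st", "a", 1), ("P2", "ld", "b", 0),
           ("P1", "st", "b", 2), ("P1", "ld", "a", 0)]
# One order that explains the SC execution (chosen for this sketch).
ok = [("P2", "st", "p", 1), ("P2", "st", "q", 2),
      ("P1", "ld", "q", 2), ("P1", "ld", "p", 1)]

print(is_sc_visibility_order(attempt, non_sc), has_sc_order(non_sc))
print(is_sc_visibility_order(ok, sc), has_sc_order(sc))
```

The attempted order fails at ld(a,0), which comes after st(a,1) in V, and the exhaustive search finds no other order that works for that execution.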
32
Writing visibility order specs for weak memory
models
Can use single or multiple visibility orders (MPV
workshop slides, see http://www.cs.utah.edu/mpv).
Multiple VO needed for some weak mem models.
Single visibility order for TSO
st_L(a,1) ld(a,1) ld(a,0) st_G(a,1) ld(a,1)
P1
P2
Visibility order of P1
..of P3
xxx ld(a,0) ld(a,1)
st(a,1) ld(a,1)
st(p,1) ld(p,1) st(p,2) ld(p,2)
st(p,2) ld(p,2) st(p,1) ld(p,1)
Split stores into Local and Global Single
Global-store Order
Stores kept unsplit
33
  • Always use single Visibility Order
  • Makes specification more intuitive
  • Can annotate Implementation model with
  • coherency events to obtain generated VO
  • Can compare against reliable Spec that
  • encompasses all legal VO using reachability
    analysis

Our main idea
Single visibility order for Itanium, obtained
by splitting every Store into N copies
st_1(p,1) ld(p,1) st_1(p,2) ld(p,2) st_2(p,2)
ld(p,2) st_2(p,1) ld(p,1)
P1
P2
P3
st(p,1) st(p,2)
ld(p,2) ld(p,1)
ld(p,1) ld(p,2)
34
Related Work on Verifying Against Weak Memory
Models
  • Ghughal et al. FMPPTA00
  • -- Extension of Collier's work to weak memory
    models
  • -- Finite-state abstraction of ARCHTESTs to
    detect ordering violations
  • Condon, Hill, Plakal, Sorin et al. HPCA99
  • -- Idea based on Lamport Clocks
  • -- Define Wisconsin TSO ordering for
    execution events
  • -- Assign Lamport Clock values to coherency
    events
  • -- Manual proof that Lamport Ordering (which
    traces causalities, and hence read values)
    implies Wisconsin TSO
  • -- Defines single visibility order idea, but
    shows it only for subsets of TSO and Alpha

Main inspiration for our work
35
What are the observable effects on programs?
ld(a,2) st(b,1)
ld(b,1) st(a,2)
lost atomicity
.
cpu
cpu
mem
st(b,2) ld(a,0)
ld.acq(q,2) ld(p,1)
st(a,1) ld(b,0)
st(p,1) st.rel(q,2)
only certain guarantees on executions
.
cpu
cpu
cpu
cpu
mem
36
The Verification Problem
  • Shared Memory Implementations are very complex
  • Spec (shared memory consistency models) also
    highly non-trivial
  • => Verification engineers face a double-whammy
  • Mini Roadmap
  • Identifying the sources of memory model
    related bugs
  • Related work on verifying against weak memory
    models
  • How to verify against a broad taxonomy of mem
    models

37
Proc
Proc
st
st
ld
ld
RB
RB
SB
SB
Single Port Memory
38
The Verification Problem
  • Shared Memory Implementations are very complex
  • Spec (shared memory consistency models) also
    highly non-trivial
  • => Verification engineers face a double-whammy
  • Mini Roadmap
  • Identifying the sources of memory model
    related bugs
  • Related work on verifying against weak memory
    models
  • How to verify against a broad taxonomy of mem
    models

39
Where are Ordering Relaxations Made?

Inside Directories, Interconnects,
Inside CPU chips
(More design groups have control over this)
(Fewer design groups have control over this)
Techniques that focus on the external partition
can still be quite useful
40
Methodology
Intermediate Abstraction
IMP
  • Annotate Imp protocol with events of visibility
    order
  • -- designer reflects his/her understanding of
    mem model and Imp
  • Replace external partition specific to target
    memory model
  • Annotate intermediate abstraction thus obtained
  • Run reachability, matching every visibility event
    of Imp by one produced by Intermediate Abstraction

41
Taxonomy of memory models, and external
partitions for them (can use these in combination
for hybrid models)
Strong    Write Atomicity, No local bypassing
Weak      Write Atomicity, Local bypassing
Weakest   No Write Atomicity, Coherence
Hybrid    Instructions of many varieties: Fences, Acq / Rel
Pictures of ext partitions as well as brief
explanation (pictorial) of how event-splitting is
done
42
One allowed scenario under a weak memory model
(e.g. Sparc TSO)
st(b,2) ld(a,0)
st(a,1) ld(b,0)
cpu
cpu
mem
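Why TSO allows this scenario can be seen with a small operational model. The sketch below is an assumption-laden illustration, not the SPARC definition: it models TSO as one FIFO store buffer per cpu in front of a single memory, with loads reading the newest matching entry of their own buffer before falling back to memory. The names `tso_outcomes` and `replace` are hypothetical.

```python
from collections import deque

# The two programs from the slide; loads record whatever value they observe.
PROGRAMS = (
    (("st", "b", 2), ("ld", "a")),
    (("st", "a", 1), ("ld", "b")),
)


def replace(tup, i, item):
    """Return a copy of tuple `tup` with position i set to `item`."""
    return tup[:i] + (item,) + tup[i + 1:]


def tso_outcomes(programs, addresses=("a", "b")):
    """Explore every reachable state; return the set of final load results."""
    n = len(programs)
    # State: program counters, store buffers, memory, values loaded per cpu.
    init = ((0,) * n, ((),) * n, tuple((x, 0) for x in addresses), ((),) * n)
    seen, todo, results = {init}, deque([init]), set()
    while todo:
        pcs, bufs, mem, loaded = todo.popleft()
        succs = []
        for i in range(n):
            if bufs[i]:
                # Drain the oldest buffered store of cpu i to memory.
                addr, val = bufs[i][0]
                new_mem = tuple((a, val if a == addr else v) for a, v in mem)
                succs.append((pcs, replace(bufs, i, bufs[i][1:]), new_mem, loaded))
            if pcs[i] < len(programs[i]):
                ins = programs[i][pcs[i]]
                new_pcs = replace(pcs, i, pcs[i] + 1)
                if ins[0] == "st":
                    # A store only enters the local buffer.
                    new_buf = bufs[i] + ((ins[1], ins[2]),)
                    succs.append((new_pcs, replace(bufs, i, new_buf), mem, loaded))
                else:
                    # A load bypasses from the own buffer if possible.
                    own = [v for a, v in bufs[i] if a == ins[1]]
                    val = own[-1] if own else dict(mem)[ins[1]]
                    succs.append((new_pcs, bufs, mem,
                                  replace(loaded, i, loaded[i] + (val,))))
        if not succs:
            # Both programs finished and both buffers drained.
            results.add(loaded)
        for s in succs:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return results


outcomes = tso_outcomes(PROGRAMS)
print(((0,), (0,)) in outcomes)
```

In this model both stores can still sit in their buffers when the loads execute, so ld(a,0) and ld(b,0) occur together, the outcome that a single interleaved memory cannot produce.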