Formal Analysis and Code Generation Support for MPI
1
Formal Analysis and Code Generation Support for
MPI
  • Ganesh Gopalakrishnan
  • Mike Kirby
  • School of Computing and
  • Scientific Computing and Imaging (SCI) Institute
  • University of Utah
  • Salt Lake City, UT, USA

2
We are the Gauss Group, School of
Computing, University of Utah, Salt Lake City, UT
  • Faculty
  • Ganesh Gopalakrishnan (main area FV)
  • Mike Kirby (main area HPC)
  • Students
  • Robert Palmer (PhD)
  • Yu Yang (PhD)
  • Salman Pervez (PhD)
  • Sonjong Hwang (BS/MS)
  • Geoffrey Sawaya (BS)
  • Summer Visitor
  • Igor Melatti
  • Funding Acknowledgement
  • Microsoft HPC Institute
  • Formal Analysis and Code Generation Support
    for MPI
  • (Also supported on a complementary project
    thru NSF grant
  • CSRSMA -- Toward Reliable and Efficient
    Message Passing Software Through Formal Analysis)

3
What is Formal Analysis, and Why do It?
  • The Engineering Approach
  • Model, Analyze, Debug, and Improve software /
    systems
  • Success Stories
  • Windows Device Driver Development Kit
  • Initial ideas from the area of
    predicate abstraction for C programs (Ball,
    Rajamani)
  • Later optimized with powerful analysis engines
    (Ball, Cook)
  • Automated to analyze source trees and run checks
  • 4 years from possibility to reality
  • Similar story in cache coherence / MP design
    (SRC project at Utah)
  • Our vision: make this happen for HPC software!
  • Having found four focal areas, we are going
    after them!
  • Many differences in terms of FV problem
    formulation / solution

4
Focal Areas
  • Formal Modeling of MPI
  • Model Checking MPI programs
  • Verifying Advanced / New MPI Features e.g.,
    One-Sided Communication
  • Parallel Model Checking

5
Organization of the Rest of the Talk
  • Modeling of the MPI Library (overview GG)
  • Model Checking (traditional and in situ)
  • (overview GG)
  • Verifying One-Sided MPI Constructs (overview
    GG)
  • Parallel/Distributed Model Checking (overview
    GG)
  • Verifying Byte-Range Locking (details MK)
  • Parallel/Distributed Model Checking (details
    MK)
  • Future Plans (MK)

6
Reasons for our thrusts
  • Modeling of the MPI Library: MPI is a widely
    used HPC library with COMPLEX and EVOLVING
    semantics (1-sided, threading)
  • In situ Model Checking: Large MPI programs are
    "MPI calls hanging off a program scaffolding";
    hence finite state machine model extraction +
    model checking is ineffective in many cases
  • Verifying One-Sided MPI Constructs: Some of the
    new MPI extensions are poorly understood
  • Parallel Model Checking: Parallelism can benefit
    even the verification process!!
7
Brief Overview of Specific Results
8
1. MPI Library Modeling
  • Informal spec documents do not help answer
    "what ifs"
  • We need specs with which we can calculate
    outcomes
  • A +CAL (Lamport, MSR) Formal Spec for MPI
    (Palmer)
  • Semantics: +CAL -> TLA -> Mathematical Logic /
    Sets
  • Executability: +CAL -> TLC model checker ->
    Scenario Oracle
  • Analyzability: +CAL -> Boolean Propositions ->
    answering more open-ended scenarios with all
    possible answers, for instance
  • Coverage: covered many aspects of MPI-2
  • Presentation: Formal Spec + Scenario Oracle will
    be hyper-linked into MPI HTML Formal Spec
    document

9
2. Model Checking
  • Traditional: extract / analyze control
    skeletons
  • Now developing a finite model checker tailored
    for MPI (M language and model checker)
  • Integration into Visual Studio Environment (in
    progress); written in C
  • In-Situ Model Checking
  • Instrument downscaled MPI programs to record
    states; intelligently schedule state exploration
  • Socket based instrumentation (prototype runs)
  • PMPI based instrumentation (collab with Argonne)

10
3. Verifying One-Sided MPI Constructs
  • MPI's One-Sided Primitives are Poorly Understood
  • bugs in simple algorithms, proposed fixes
  • The Semantics are Non-trivial (Writes / Reads
    into Windows are unordered), like weak memory
    models
  • Pervez Analyzed and Found Bug in Byte-Range
    Locking Protocol Published in EuroPVM 2005 (by
    Thakur and Latham)
  • Proposed Two Fixes (joint work with Thakur,
    Gropp)
  • Details in Mike's talk

11
4. Parallel Model Checking
  • Finite State Verification Models of even
    downscaled MPI programs are VERY large
  • Parallel and Distributed Exploration of these
    FSMs using Clusters Running MPI is attractive!
  • A Model Checker "Eddy" has been developed
  • Two Threads per CPU
  • One Thread for State Generation
  • Another Thread for State send / receive
  • States coalesced into Lines Before Shipping
  • Win32 Threads / MCC Porting Done
  • Gives Linear Speedups (CPU and memory across
    clusters utilized)

12
Details Verification of a Byte-Range Locking
Algorithm
13
Byte-Range lockingusing MPI one-sided
communication
  • One process makes its memory space available for
    communication
  • Global state is stored in this memory space
  • Each process is associated with a flag, start
    and end values stored in array A
  • Pi's flag value is in A[3i] for all i

14
Lock Acquire
lock_acquire (start, end)
 1  val[0] = 1; /* flag */ val[1] = start; val[2] = end;
 2  while (1) {
 3    lock_win
 4    place val in win
 5    get values of other processes from win
 6    unlock_win
 7    for all i, if (Pi conflicts with my range)
 8      conflict = 1;
 9    if (conflict) {
10      val[0] = 0;
11      lock_win
12      place val in win
13      unlock_win
14      MPI_Recv(ANY_SOURCE);
15    }
16    else {
17      /* lock is acquired */
18      break;
19    }
    }
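The conflict test on line 7 is a byte-range overlap check. A minimal Python sketch of that test (the function name and argument layout are ours, not from the algorithm):

```python
def ranges_conflict(start1, end1, start2, end2):
    # Two byte ranges [start1, end1] and [start2, end2] overlap
    # exactly when neither lies entirely before the other.
    return start1 <= end2 and start2 <= end1

# P0 requests [10, 20], P1 requests [15, 30]: they conflict.
print(ranges_conflict(10, 20, 15, 30))  # True
# Disjoint requests [10, 20] and [21, 30] do not.
print(ranges_conflict(10, 20, 21, 30))  # False
```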
15
Lock Release
lock_release (start, end)
  val[0] = 0; /* flag */ val[1] = -1; val[2] = -1;
  lock_win
  place val in win
  get values of other processes from win
  unlock_win
  for all i, if (Pi conflicts with my range)
    MPI_Send(Pi)
16
Error Trace
17
Error Discussion
  • Problem: too many Sends, not enough Recvs
  • Not really a problem
  • Messages are 0 bytes only
  • Sends could be made non-blocking
  • Maybe a problem
  • Even 0 byte messages cause unknown memory leaks,
    not desirable
  • More importantly, if there are unused Sends in
    the system, processes that were supposed to be
    blocked may wake up by consuming these. This ties
    up processor resources and hurts performance!
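The imbalance can be made concrete with a toy count: each releasing process sends one wakeup per conflicting process, but each blocked process consumes only one. A hypothetical sketch (all names are ours):

```python
def leftover_wakeups(conflicts_per_release, blocked_receivers):
    """Count wakeup messages produced by releasing processes
    versus messages actually consumed by blocked processes."""
    sent = sum(conflicts_per_release)   # one MPI_Send per conflict seen
    consumed = blocked_receivers        # one MPI_Recv per blocked process
    return sent - consumed

# Two releases each see 3 conflicting processes, but only 2 are blocked:
print(leftover_wakeups([3, 3], 2))  # 4 unmatched sends linger in the system
```

Those leftover messages are exactly what can wake a process that was supposed to stay blocked.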

18
Some Details Parallel Model Checking The
Eddy-Murphi Model Checker
19
Parallel Model Checking
  • Each computation node owns a portion of the
    state space
  • Each node locally stores and analyzes its own
    states
  • Newly generated states which do not belong to the
    current node are sent to the owner node
  • Standard distributed algorithm may be chosen for
    termination

20
Eddy Algorithm
  • For each node, two threads are used
  • Worker thread analyzes, generates and partitions
    states
  • If there are no states to be visited, it sleeps
  • Communication thread repeatedly sends/receives
    states to/from the other nodes
  • It also handles termination
  • Communication between the two threads
  • Via shared memory
  • Via mutex / signal primitives

21
Worker Thread / Communication Thread (diagram)
  • Worker Thread: take state off the Consumption
    Queue; expand state (get new set of states);
    make decision about set of states (via Hash):
    keep locally or place on the Communication Queue
  • Communication Thread: receive and process
    inbound messages; initiate Isends; check
    completion of Isends
22
The Communication Queue
  • There is one communication queue for each node
  • Each communication queue has N lines and M states
    per line
  • State additions are made (by the worker thread)
    only on one active line
  • The other lines may be already full or empty

23
Eddy-Murphi Performance
24
Eddy-Murphi Performance
25
Summary
  • Complex systems can benefit from formal methods
  • Rich interdisciplinary ground for FM and HPC to
    interact
  • Win-Win scenarios exist

26
Future Plans
  • Publish Formal and Analyzable MPI-2 Spec
  • Develop Formal Property-based Tests for MPI
    Libraries such as MCCS
  • Build Model Checker for MPI Programs that treats
    MPI calls as Primitives (more efficient
    partial-order reductions, thanks to MPI
    Semantics)
  • Large-scale Eddy-Murphi runs on MCCS
  • In-Situ Model Checker for MCCS MPI Programs
  • Release of Tools

27
Software and Publications
  • Eddy Software Available
  • Formal Spec of MPI-2
  • Preliminary stable version will be hyperlinked
    into HTML MPI-2 spec
  • Two workshop papers submitted (Yu, Palmer) to
    Verify06 and RTVA06
  • EuroPVM06 paper (Pervez, Ganesh, Mike, Gropp,
    Thakur) accepted

28
Workshop on Thread Verification (TV06, sponsored
by Microsoft). 2-day program, 13 papers, 5
invited talks. Seattle, WA, Aug 21-22. See
http://www.cs.utah.edu/tv06
29
Extra Slides
30
Our Approach Toward Code Generation
Support for MPI
  • Describe Executable MPI-2 (MPI) Semantic Model
  • Develop MPI Formal Semantics
  • Develop Facility to Calculate Scenario Outcomes
  • from Formal Semantics
  • Develop Test Suites for MPI Libraries Based on
  • Formal Semantics
  • Detect Errors in User MPI Code
  • Deadlocks
  • Resource (e.g. Buffer) Usage Violations
  • Support Code Optimizations
  • Develop Facility to Show Correctness of
    Designer-Introduced Code Transformations

31
The Communication Queue
  • Summing up, this is the evolution of a line's
    status:

WTBA -> Active -> WTBS -> CBS
32
Eddy-Murphi Performance
  • Tuning of the communication queuing mechanism
  • A high number of states per line is required
  • Much better to send many states at a time
  • Not too few lines
  • Or the worker will not be able to submit new
    states

33
Eddy-Murphi Performance
  • Comparison with previous versions of parallel
    Murphi
  • When ported to MPI, old versions of parallel
    Murphi perform worse than serial Murphi
  • Comparison with serial Murphi almost linear
    speedup is expected

34
A Formal Model of MPI Process Interaction
  • Robert Palmer, Mike Kirby, Ganesh Gopalakrishnan

35
Goals
  • Create a human readable and understandable formal
    document to supplement the written standard
  • +CAL
  • ZF set theory + first order logic + weak
    temporal logic (TLA)
  • Sequencing, scope, processes (pseudo-code
    flavored)
  • Model behaviors of MPI to preserve correctness
    properties
  • Deadlock
  • Local assertion violations
  • Capture communication behavior implied by the MPI
    standard
  • Standard/Asynchronous mode point to point
    operations
  • Collective operations
  • Set the formality sufficiently high that
    automated reasoning assistance can be applied
  • Theorem proving and Model checking
  • Useful for discovering difficult to reproduce
    errors in reactive systems
  • Cover an interesting (but still limited) subset
    of the standard
  • Point to point and collective communications that
    transmit data
  • Wait on topology, communicator manipulation etc.
  • Abstract away most of the data passing (i.e.,
    focus on the semantics of communication)
  • Assume static process system

36
Communicator
  • Defines the communication universe for processes
    in MPI
  • Group (Set of processes and a ranking function)
  • Context (Where messages are posted)
  • Virtual Topology
  • Attributes
  • Shared memory based model
  • Communicator objects are globally visible to
    processes in the computation
  • Accessed by an integer handle as in MPICH

(* The set of communicator objects in the computation. *)
variables comm = [i \in (0..(MAX_COMM - 1)) |->
    [p2p |-> [data  |-> 0,
              src   |-> 0,
              dest  |-> 0,
              type  |-> "MPI_INT",
              tag   |-> 1,
              state |-> "vacant"],
     col |-> [root  |-> 0,
              type  |-> "MPI_INT",
              state |-> "vacant",
              processes_in |-> {},
              participants |-> {}],
     group |-> {pr : pr \in (0..(N-1))},
     groupsize |-> N,
     ranking |-> [p \in (0..(N-1)) |-> 0],
     inverseranking |-> [p \in (0..(N-1)) |-> 0],
     lastrank |-> 0]]
37
Point to Point Basis
  • Communication happens with respect to the context
    of a communicator
  • Algorithm for posting a point to point send
    (receive) to the context
  • When the state of the context is vacant the send
    (receive) can be posted directly to the context.
    The sender (receiver) then waits for the state to
    change to transmitting indicating that the
    matching receive has been posted. The sender
    (receiver) then changes the state of the context
    back to vacant and exits.
  • Or when the state of the context is initially
    receive (send) and the message matches the
    message already posted to the context, the sender
    (receiver) changes the state of the context to
    transmitting and exits.

post_send1: either when comm[c].p2p.state = "vacant"
                comm[c].p2p := m;
post_send2:     when comm[c].p2p.state = "transmitting"
                comm[c].p2p.state := "vacant"
            or  when comm[c].p2p.state = "recv"
                  /\ comm[c].p2p.data = m.data
                  /\ comm[c].p2p.src  = m.src
                  /\ comm[c].p2p.dest = m.dest
                  /\ comm[c].p2p.tag  = m.tag
                  /\ comm[c].p2p.type = m.type
                comm[c].p2p.state := "transmitting"
            end either
38
MPI Specified P2P Operations
  • Asynchronous mode send start
  • Enqueue communication requests into a process
    local FIFO queue
  • Consume buffer or check for a match as
    appropriate
  • Asynchronous mode send complete
  • Post messages from the process local FIFO to the
    context of the communicator until the requested
    message completes
  • Release resources as appropriate
  • Standard mode communications
  • Couple corresponding asynchronous start and
    complete

39
Collective Operations
  • Three flavors of collective operations (recall
    data is abstracted away)
  • Require all processes to wait until some
    particular process has entered the communication
  • MPI_Bcast, MPI_Scatter
  • Require at least one process to wait until all
    processes have entered the communication
  • MPI_Gather, MPI_Reduce
  • Require all processes to wait until all processes
    have entered the communication
  • MPI_Barrier, MPI_All_reduce
  • The standard does not require but allows (1) and
    (2) above

40
Type 1 Collective
  • Three states and three transitions are used to
    implement this protocol
  • All processes entering the communication add
    themselves to the participant set. These
    processes then wait until the root process is in
    the participant set. All processes are allowed
    to exit once root has entered the communication.
    The last process to exit resets the collective
    state and participant set.

col_one1: when (comm[c].col.state = "vacant"
                \/ comm[c].col.state = "collective in")
             /\ self \notin comm[c].col.participants
          comm[c] := [comm[c] EXCEPT
              !.col.state = "collective in",
              !.col.participants = @ \cup {self},
              !.col.processes_in = @ \cup {self}]
col_one2: when root \in comm[c].col.participants
          if (comm[c].col.processes_in = {self}
              /\ comm[c].col.participants = comm[c].group)
          then comm[c] := [comm[c] EXCEPT
                   !.col.state = "vacant",
                   !.col.participants = {},
                   !.col.processes_in = {}]
          else comm[c] := [comm[c] EXCEPT
                   !.col.processes_in = @ \ {self}]
          end if
41
Type 2 Collective
  • Three states and four transitions are used to
    implement this protocol
  • If the process entering the communication is not
    root it adds itself to the set of participants
    and exits. If the process is root, it also adds
    itself to the set of participants. The root
    process then waits until the set of participants
    is equal to the group for the communicator. The
    root process is then allowed to exit.

col_two1: either when comm[c].col.state = "vacant"
              comm[c].col.state := "col_two in";
              comm[c].col.participants := {self}
          or when comm[c].col.state = "col_two in"
              comm[c].col.participants :=
                  comm[c].col.participants \cup {self}
          end either
col_two2: if comm[c].ranking[self] = root
          then when comm[c].col.participants = comm[c].group
              comm[c].col.state := "vacant";
              comm[c].col.participants := {}
          end if
42
Type 3 Collective
  • Algorithm requires three states and five
    transitions
  • The first process to post this type of message
    changes the state from vacant to "col3 in".
    Other processes in the communicator are
    collected until all processes have joined the
    communication. The guard at col32 then becomes
    enabled, changing the state to "col3 out". All
    processes then exit the communication. The last
    process to exit returns the state of the
    collective context to vacant.

col31: either when comm[c].col.state = "vacant"
           (* First process in the barrier. *)
           comm[c] := [comm[c] EXCEPT
               !.col.state = "col3 in",
               !.col.participants = @ \cup {self}]
       or when comm[c].col.state = "col3 in"
              /\ self \notin comm[c].col.participants
           comm[c] := [comm[c] EXCEPT
               !.col.participants = @ \cup {self}]
       end either
col32: either when comm[c].col.participants = comm[c].group
           (* First out. *)
           comm[c] := [comm[c] EXCEPT
               !.col.participants = @ \ {self},
               !.col.state = "col3 out"]
       or when comm[c].col.state = "col3 out"
              /\ comm[c].col.participants /= {self}
           (* Middle out. *)
           comm[c] := [comm[c] EXCEPT
               !.col.participants = @ \ {self}]
       or when comm[c].col.state = "col3 out"
              /\ comm[c].col.participants = {self}
           (* Last out. *)
           comm[c] := [comm[c] EXCEPT
               !.col.state = "vacant",
               !.col.participants = {}]
       end either
43
Motivation
  • Software model checking in the presence of
    library calls is hard
  • Model extraction and verification may not scale
  • It may also miss bugs due to modeling
    assumptions
  • In-situ model checking can help
  • Run the instrumented program directly with an
    external scheduler to control the program's
    execution
  • Need to avoid redundant interleavings
  • Need to retain enough central scheduling control

44
Avoid Redundant Interleaving with DPOR [1]
  • Static Partial Order Reduction relies on static
    analysis
  • to yield approximate information about run-time
    behavior
  • pointers => coarse information => limited POR =>
    state explosion
  • Partial-order reduction explores a subset of the
    state space, without sacrificing soundness
  • Dynamic Partial Order Reduction
  • while the model checker explores the program's
    state space, it sees exactly which threads
    access which locations
  • used to simultaneously reduce the state space
    while model-checking

45
An Example of DPOR
46
In-situ model checking concurrent programs
  • 1. Instrument the program to add request/permit
    routines before the synchronization routines /
    accesses to shared objects (this phase can be
    automated)
  • 2. Compile and run the program and have the
    scheduler record the request trace
  • 3. The scheduler generates other possible
    interleavings
  • 4. Run the program again according to those
    interleavings
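Step 3's scheduler can be pictured as enumerating interleavings of the recorded per-process traces, preserving each process's internal event order. A naive sketch with no reduction (names are ours; a real in-situ checker would prune with DPOR):

```python
def interleavings(traces):
    """Yield every interleaving of the per-process event traces,
    keeping each trace's internal order (naive, no reduction)."""
    if all(not t for t in traces):
        yield []
        return
    for i, t in enumerate(traces):
        if t:
            rest = traces[:i] + [t[1:]] + traces[i + 1:]
            for tail in interleavings(rest):
                yield [t[0]] + tail

# Two processes with two events each: C(4, 2) = 6 interleavings.
print(len(list(interleavings([["a1", "a2"], ["b1", "b2"]]))))  # 6
```

Even this tiny case shows why pruning matters: the count grows combinatorially with trace length and process count.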

47
Current Status
  • Implementation underway
  • Efficiency will be measured: naive schedules vs
    DPOR-generated
  • Rules for filtering infeasible executions;
    proving them complete

48
References
  • [1] Cormac Flanagan and Patrice Godefroid,
    "Dynamic Partial-Order Reduction for Model
    Checking Software", POPL'05.

49
Independent transitions [1]
  • B and R are independent transitions if
  • they commute: B;R = R;B
  • neither enables nor disables the other
  • Example: x := 3 and y := 1 are independent
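Commutation can be checked operationally: apply the two transitions in both orders from the same state and compare the results. A sketch using the slide's example (this checks only the commutation condition, not the enable/disable condition; all names are ours):

```python
def commute(t1, t2, state):
    """True if executing t1;t2 and t2;t1 from 'state'
    yield the same final state."""
    s_ab = t2(t1(dict(state)))
    s_ba = t1(t2(dict(state)))
    return s_ab == s_ba

set_x = lambda s: {**s, "x": 3}              # x := 3
set_y = lambda s: {**s, "y": 1}              # y := 1
print(commute(set_x, set_y, {"x": 0, "y": 0}))  # True: independent

set_x_from_y = lambda s: {**s, "x": s["y"]}  # x := y reads what y := 1 writes
print(commute(set_x_from_y, set_y, {"x": 0, "y": 0}))  # False: dependent
```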

50
Byte-Range lockingusing MPI one-sided
communication
  • One process makes its memory space available for
    communication
  • Global state is stored in this memory space
  • Each process is associated with a flag, start
    and end values stored in array A
  • Pi's flag value is in A[3i] for all i

51
Lock Acquire
lock_acquire (start, end)
 1  val[0] = 1; /* flag */ val[1] = start; val[2] = end;
 2  while (1) {
 3    lock_win
 4    place val in win
 5    get values of other processes from win
 6    unlock_win
 7    for all i, if (Pi conflicts with my range)
 8      conflict = 1;
 9    if (conflict) {
10      val[0] = 0;
11      lock_win
12      place val in win
13      unlock_win
14      MPI_Recv(ANY_SOURCE);
15    }
16    else {
17      /* lock is acquired */
18      break;
19    }
    }
52
Lock Release
lock_release (start, end)
  val[0] = 0; /* flag */ val[1] = -1; val[2] = -1;
  lock_win
  place val in win
  get values of other processes from win
  unlock_win
  for all i, if (Pi conflicts with my range)
    MPI_Send(Pi)
53
Error Trace
54
Solution - Picking
  • Main Idea The process about to be blocked (Pi)
    picks who will wake it up (Pj) and indicates so
    by writing to shared memory in lines 11 and 13
  • It is possible that Pj leaves before seeing this
    information. If this is the case, the next
    conflicting process (Pk) must be picked.
  • If no such Pk exists, the lock must be retried.
  • Similarly, if picking Pj causes deadlock, Pk must
    be picked instead.

55
Avoiding Deadlock
  • Once processes declare their intentions globally,
    deadlock can be avoided.
  • For there to be deadlock, a dependency cycle must
    exist.
  • The last process to complete this cycle will know
    and must not do so.
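Once every blocked process publishes whom it picked, a "who waits on whom" graph exists in shared memory, and the process about to close a cycle can detect this with a simple walk. A sketch under our own naming (not the paper's code):

```python
def would_close_cycle(picked, me, target):
    """Return True if 'me' picking 'target' would complete a cycle
    in the wait-for graph 'picked' (process -> process it waits on)."""
    seen = {me}
    cur = target
    while cur is not None:
        if cur in seen:      # the walk returned to a process on the path
            return True
        seen.add(cur)
        cur = picked.get(cur)
    return False

# P0 waits on P1, P1 waits on P2; if P2 now picks P0, we deadlock:
picked = {0: 1, 1: 2}
print(would_close_cycle(picked, me=2, target=0))     # True
print(would_close_cycle(picked, me=2, target=None))  # False
```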

56
Lock Acquire - Picking
  • lock_acquire (start, end)
  • 1  val[0] = 1; /* flag */ val[1] = start;
       val[2] = end; val[3] = -1; /* pick */
  • 2  while(1) {
  • 3    lock_win
  • 4    place val in win
  • 5    get values of other processes from win
  • 6    unlock_win
  • 7    for all i, if (Pi conflicts with my range)
  • 8      conflict = 1; remember Pi
  • 9    if (conflict with Pi) {
  • 10     val[0] = 0; val[3] = Pi;
  • 11     lock_win
  • 12     place val in win
  • 13     unlock_win
  • 14     if (Pi has left || deadlock_possible)
           pick next conflicting Pi; goto 9
  • 15     else MPI_Recv(ANY_SOURCE)
  • 16   }
  • 17   else ...
57
Orthogonal composition of primitives
Can perhaps model MPI this way: Bsndinit, Rsndinit,
Rcvinit, Probe, Test, Wait on the LHS; "nonlocal"
is a problem (need conceptual bits)
Send -> {S, R, B} -> Ssend, Rsend, Bsend -> I ->
Issend, Irsend, Ibsend
Issend and Irsend are global; the other I variants
are not
58
Expensive Resources are Involved!
10k/week on Blue Gene (180 GFLOPS) at IBM's Deep
Computing Lab
136,800 GFLOPS Max
59
HPC Software Development is Inherently Quite
Complex!
  • Understand What is Simulated (the
    Physics, Biology, etc).
  • Develop a Mathematical Model
  • Generate Numerical Discretization of
    Model
  • Solve Numerical Problem
  • Experiment with Serial Solution (gain
    understanding)
  • Develop Parallel Algorithm
  • Check Consistency With Physics etc.
    (energy conservation)
  • Improve Load Balancing
  • Fight Grubby Realities of Libraries,
    Program Complexity,

60
General Reasons for picking our Four Thrusts
  • Natural conclusion of our own
  • study of key issues in the area
  • Close to no previous research in the area
  • Complementary Strengths of PIs
  • Mike (HPC) and Ganesh (FV)
  • Opportunity and luck
  • NSF funding, MS funding
  • Past HPC and FV research at Utah
  • Collaboration with Argonne
  • Enthusiastic Students !!

61
Specific Reasons for our thrusts
  • Modeling of the MPI Library: MPI is a widely
    used HPC library with COMPLEX and EVOLVING
    semantics
  • In situ Model Checking: Large MPI programs are
    "MPI calls hanging off a program scaffolding";
    hence finite state machine model extraction +
    model checking is ineffective in many cases
  • Verifying One-Sided MPI Constructs: Some of the
    new MPI extensions are extremely poorly
    understood
  • Parallel Model Checking: Parallelism can benefit
    even the verification process!!
62
Some Nasty Realities
  • MPI Programming
  • Code optimized to take advantage of relaxed
    send / receive / probe orderings may be buggy
  • Too many MPI Functions (over 200 in MPI-2)
    => misunderstandings, buggy MPI libraries
  • Threads and MPI often used together
  • Thread programming bugs, unexpected interactions
  • MPI libraries vary in the allowed range of
    semantics
  • Code that takes advantage of one library doesn't
    port

63
What is Model Checking based Formal Verification?
  • Analog of "wind-tunnel testing of airplane
    wings" for programs and hardware designs
  • Build a scaled-down model, retaining essential
    features
  • Exhaustively verify the model
  • Experience shows that exhaustive verification of
    a downscaled model is often superior to ad hoc
    testing of full system descriptions, which have
    a HUGE state space
  • Key Advances in Recent Times
  • Very Large Models can be Verified
  • Richer Assertions can be Verified
  • Yet, Little (or no) work in FV for HPC

64
Yet, verification (bug-hunting) is only a small
part of the complex picture!
  • Provide Formal Models that unambiguously specify
    MPI Library Function Semantics
  • Make these Formal Models Runnable
  • Use Formal Specification to Assist MPI Program
    Transformations
  • Help Debug Large MPI Programs and / or MPI
    Library Implementations
  • Study Specific Tricky MPI Constructs and
    Programs (e.g. locking protocols implemented
    using MPI's new 1-sided constructs)
  • Speed up Model Checking by using Multiple
    Threads and MPI Processes

65
Past Work on FV for HPC
  • Avrunin and Siegel have published the following
    results
  • Hand-modeled MPI Library Functions in Promela
  • Identified some issues (bugs?) with the help of
    SPIN used as a model checker
  • Proved that, under certain conditions, showing
    absence of deadlocks in an MPI program using
    synchronous communications guarantees absence of
    deadlocks even after the communications are
    switched back to being async
  • Pioneering work
  • but none recently
  • lots of areas not covered

66
Example 1 Modeling of the MPI Library
67
Variety of bugs that are common in parallel
scientific programs
  • Deadlock
  • Communication Race Conditions
  • Misunderstanding the semantics of MPI procedures
  • Resource related assumptions
  • Incorrectly matched send/receives

68
State of the art in Debugging
  • TotalView
  • Parallel debugger trace visualization
  • Parallel DBX
  • gdb
  • MPICHECK
  • Does some deadlock checking
  • Uses trace analysis

69
Related work
  • Verification of wildcard-free models [Siegel,
    Avrunin, 2005]
  • Deadlock free with length-zero buffers =>
    deadlock free with length > zero buffers
  • SPIN models of MPI programs [Avrunin, Siegel,
    Siegel, 2005] and [Siegel, Mironova, Avrunin,
    Clarke, 2005]
  • Compare serial and parallel versions of
    numerical computations for numerical equivalence

70
The Big Picture
proctype MPI_Send(chan out, int c) { out!c }
proctype MPI_Bsend(chan out, int c) { out!c }
proctype MPI_Isend(chan out, int c) { out!c }
typedef MPI_Status { int MPI_SOURCE;
                     int MPI_TAG;
                     int MPI_ERROR }
MPI Library Model

int y;
active proctype T1() {
  int x;  x = 1;
  if :: x == 0 -> x = 2 :: else fi;
  y = x
}
active proctype T2() {
  int x;  x = 2;
  if :: y == x -> y = 1 :: else -> y = 0 fi;
  assert(y == 0)
}
Program Model

[Diagram: a Compiler / Model Generator produces the
Program Model and Environment Model in the Modeling
Language, alongside the MPI Library Model; an
Abstractor / Refinement loop and Error Simulator
connect these to an MC Server, which farms the
state space out to many MC Clients; a Result
Analyzer reports OK or an error.]
71
Goal
  • Verification / Transformation of MPI programs
  • "It is nice that you may be able to show my
    program does not deadlock, but can you make it
    faster?"
  • Verification of safety properties
  • Automatic optimization through verifiably safe
    transformation (replacing Send with Isend/Wait,
    etc.)

72
Example 2 In Situ Model Checking
73
Motivation
  • Software model checking in the presence of
    library calls is hard
  • Model extraction and verification may not scale
  • It may also miss bugs due to modeling
    assumptions
  • In-situ model checking can help
  • Run the instrumented program directly with an
    external scheduler to control the program's
    execution
  • Need to avoid redundant interleavings
  • Need to retain enough central scheduling control

74
In-situ model checking concurrent programs
  • 1. Instrument the program to add
    request/permit routines before the
    synchronization routines/access to shared objects
    (this phase can be automated)
  • 2. Compile and run the program and have the
    scheduler record the request trace
  • 3. The scheduler generates other possible
    interleavings
  • 4. Run the program again according to those
    interleavings