Title: Scaling Formal Methods Toward Hierarchical Protocols in Shared Memory Processors
1Scaling Formal Methods Toward Hierarchical Protocols in Shared Memory Processors
GRC CADTS Review, Berkeley, March 18, 2008
Presenters: Ganesh Gopalakrishnan and Xiaofang Chen
School of Computing, University of Utah, Salt Lake City, UT 84112
{ganesh, xiachen}_at_cs.utah.edu
http://www.cs.utah.edu/formal_verification
Supported by SRC Contract TJ-1318 (Intel Customization)
2Multicores are the future! Their caches are visibly central
> 80% of chips shipped will be multi-core
(photo courtesy of Intel Corporation.)
3Hierarchical Cache Coherence Protocols will play
a major role in multi-core processors
[Figure: chip-level protocols, with intra-cluster and inter-cluster protocols; labels: mem, dir]
- State space grows multiplicatively across the hierarchy!
- Verification will become harder
4Protocol design happens in the thick of things
(many interfaces, constraints of performance,
power, testability).
From "High-throughput coherence control and hardware messaging in Everest," by Nanda et al., IBM J. R&D 45(2), 2001.
5Future Coherence Protocols
- Cache coherence protocols that are tuned for the contexts in which they are operating can significantly increase performance and reduce power consumption [Liqun Cheng]
  - Producer-consumer sharing pattern-aware protocol [Cheng et al., HPCA07]
    - 21% speedup and 15% reduction in network traffic
  - Interconnect-aware coherence protocols [Cheng et al., ISCA06]
    - Heterogeneous interconnect
    - Improve performance AND reduce power
    - 11% speedup and 22% wire power savings
- Bottom line: protocols are going to get more complex!
6Main Result 1: Hierarchical
Developed a way to reduce the verification complexity of hierarchical (CMP) protocols using assume/guarantee (A/G) reasoning
[Figure: a home cluster and two remote clusters, each with L1 caches, an L2 cache with local directory, and a RAC (intra-cluster level); the clusters connect to the global directory and main memory (inter-cluster level)]
7Main Result 2: Refinement
Developed a way to verify a proposed refinement of ONE unit into its low-level (RTL) implementation
10Main Result 2: Refinement
Developed a way to verify a proposed refinement of ONE unit into its low-level (RTL) implementation
Specification: Murphi
Implementation: HMurphi
11Differences in Modeling: Specs vs. Impls
One step in high-level (home, remote): an atomic guarded command
Multiple steps in low-level (home, remote, router, buf)
12Our Refinement Check
- I is a reachable Impl state
- A multi-step Impl transaction takes I to I'
- The guard for the Spec transition must hold at Spec(I)
- The Spec transition takes Spec(I) to Spec(I')
- Observable vars changed by either must match
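The check on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Muv implementation: the toy message-passing model and the names `spec_guard`, `spec_action`, `impl_transaction`, `lossy_transaction`, `observe`, and `check_refinement` are all hypothetical, and the reachable Impl states are assumed to be supplied by the caller.

```python
# Hypothetical toy model: the Spec delivers a message from home to remote in
# one atomic step; the Impl takes two steps (home -> buf -> remote).

def spec_guard(s):
    """Guard of the Spec transition: home holds a message, remote is empty."""
    return s["home"] is not None and s["remote"] is None

def spec_action(s):
    """Atomic Spec step: move the message from home to remote."""
    return {"home": None, "remote": s["home"]}

def impl_transaction(i):
    """Multi-step Impl transaction; returns the state after the last step."""
    step1 = {"home": None, "buf": i["home"], "remote": i["remote"]}
    step2 = {"home": None, "buf": None, "remote": step1["buf"]}
    return step2

def lossy_transaction(i):
    """A buggy variant that drops the message while it sits in the buffer."""
    return {"home": None, "buf": None, "remote": None}

def observe(i):
    """Project an Impl state onto the variables it shares with the Spec."""
    return {"home": i["home"], "remote": i["remote"]}

def check_refinement(reachable_impl_states, transaction):
    """Check every given start state; return (kind, state) per violation."""
    violations = []
    for i in reachable_impl_states:
        s = observe(i)
        if not spec_guard(s):
            # The Spec transition would not be enabled in this state.
            violations.append(("guard", i))
        elif spec_action(s) != observe(transaction(i)):
            # Spec and Impl disagree on the observable variables.
            violations.append(("mismatch", i))
    return violations

states = [{"home": m, "buf": None, "remote": None} for m in ("req", "ack")]
print(check_refinement(states, impl_transaction))   # correct Impl
print(check_refinement(states, lossy_transaction))  # message-dropping Impl
```

The intermediate state with the message in `buf` is never compared against the Spec; only the start and end of the transaction are, which is what lets a multi-step Impl match a one-step Spec.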
13Workflow of Our Refinement Check
Murphi Spec model + Hardware Murphi Impl model -> Product model in Hardware Murphi
Product model in Hardware Murphi -> (Muv) -> Product model in VHDL
Product model in VHDL -> Property check: check implementation meets specification
14Anticipated Future Result
Develop a way to verify a proposed refinement of the ENTIRE hierarchy
15Anticipated Future Result
Deal with pipelining
Pipelined Interaction
Sequential Interaction
16Anticipated Future Result
Develop ways to tease apart protocols that are blended in, e.g., for power-down or post-silicon observability enhancement
More protocols... do they interfere?
17Basics
- PI: Ganesh Gopalakrishnan
- Industrial Liaisons: Ching Tsun Chou (Intel), Steven M. German (IBM), John W. O'Leary (Intel), Jayanta Bhadra (Freescale), Alper Sen (Freescale), Aseem Maheshwari (TI)
- Primary Student: Xiaofang Chen
- Graduation Date: Writing PhD dissertation; in the market
- Other Students: Yu Yang (PhD), Guodong Li (PhD), Michael DeLisi (BS/MS)
- Anticipated Results
  - Hierarchical: Methodology for Hierarchical (Cache Coherence) Protocol Verification, with Emphasis on Complexity Reduction (was in original SRC proposal)
  - Refinement: Methodology for Expressing and Verifying Refinement of Higher Level Protocol Descriptions (not in original SRC proposal)
18Basics
- Deliverables (Papers, Software, Xiaofang's Dissertation)
  - Hierarchical
    - Methodology for Applying A/G Reasoning for Complexity Reduction
    - Verified Protocol Benchmarks: Inclusive, Non-Inclusive, Snoopy (Large Benchmarks)
    - Automatic Abstraction Tool in support of A/G Reasoning
  - Refinement
    - Muv Language Design (for expressing Designs)
    - Refinement Checking Theory and Methodology
    - Complete Muv tool implementation
19What's Going On
- Accomplishments during the past year
  - Hierarchical
    - Finishing non-inclusive hierarchical protocol verification
    - Developing and verifying a hierarchical protocol with a snoopy first level
21What's Going On
- Accomplishments during the past year (contd.)
  - Refinement
    - HMurphi was fleshed out in great detail
    - Most of Muv was implemented (a large portion during an IBM T.J. Watson internship); joint work with Steven German and Geert Janssen
22What's Going On
- Future directions
  - Hierarchical Refinement
    - Develop ways to verify hierarchies of interacting HMurphi modules
    - Pipelining
  - Teasing out protocols supporting non-functional aspects
    - Power-down protocols
    - Protocols to enhance post-silicon observability
  - Architectural Characterization
    - How do we describe the ISA of future multi-core machines?
    - How do we make sure that this ISA has no hidden inconsistencies?
23What's Going On
- Technology Transfer and Industrial Interactions
  - With Liaisons
- Publications
  - FMCAD 06, 07; HLDVT 07; TECHCON 07 (best session paper award); journal paper (under prep); dissertation (under prep)
- Request to IBM for open-sourcing Muv has been placed
24Overview of Hierarchical
- Given a protocol to verify, create a verification model that models a small number of clusters acting on a single cache line
Verification Model
Home Remote Global directory
Inv P
25 2. Exploit Symmetries
- Model home and the two remotes (one remote, in case of symmetry)
Verification Model
Inv P
26 4. Initial abstraction will be extreme; slowly back off from this extreme
- P1 fails
- Diagnose failure
  - Bug: report to user
  - False alarm
    - Diagnose where guard is overly weak
    - Add strengthening guard
    - Introduce lemma to ensure soundness of strengthening
Inv P2
Inv P1
Inv P3
27Overview of Theory Involved
?
28 3. Create Abstract Models (three models in this example)
Inv P1
Inv P2
Inv P
Inv P3
29Step 1 of Refinement
[Figure: abstract models with invariants Inv P1, Inv P2, Inv P3]
30Step 2 of Refinement
[Figure: abstract models with invariants Inv P1, Inv P2, Inv P3]
31Final Step of Refinement
[Figure: abstract models with invariants Inv P1, Inv P2, Inv P3]
32Detailed Presentation of Refinement
Note: Three examples have been presented in full detail at http://www.cs.utah.edu/formal_verification/muv
34Project Summary Year 2
- Verification of hierarchical cache coherence protocols
  - Non-inclusive multicore benchmark
  - Compositional approach: one level at a time
  - Can reduce the explicit state space by >95%
- Refinement check: protocol RTL Impls vs. Specs
  - Refinement theory and methodology
  - Compositional approach theory
- Publications
  - FMCAD 2007, HLDVT 2007
  - TECHCON 2007 (best session paper award)
35Yearly Summary 2007 - 2008
- Refinement check: protocol RTL Impls vs. Specs
  - A comprehensive tool path
  - Can find bugs in RTL protocols with realistic features
  - A simple pipelined stack example
- Verification of hierarchical cache coherence protocols
  - A snoop multicore protocol benchmark
36A Simple Snoop Multicore Protocol
- Motivation
  - Snoop protocols are commonly used in the 1st level of caches
  - We have applied our approach to directory protocols
  - How about snoop protocols?
37Applying Our Approach
38Refinement Check Spec vs. Impl
Abstraction level
Specification
Cycle accurate RTL level
Model size
39Differences in Execution Specs vs. Impls
Interleaving in HL
Concurrency in LL
40Our Approach of Refinement Check
- Modeling
  - Specification: Murphi
  - Implementation: Hardware Murphi
  - Use transactions in Impl to relate to Spec
- Verification
  - Muv: Hardware Murphi -> synthesizable VHDL
  - Tool: IBM SixthSense and RuleBase
41What Are Transactions?
- Group a multi-step execution in implementations
Spec
Impl
42Outline
- Project background
- Extensions to the tool path
- Experimental results
- Future work
43Tool Path
- Initial efforts from IBM
  - By German and Janssen
  - Hardware Murphi language
  - Muv: Hardware Murphi -> synthesizable VHDL
- Our extensions: enable refinement check
  - Language extensions
  - Muv extensions
44Basic Features of Hardware Murphi vs Murphi
signal s1, s2
s1 <= ...
chooserule rules end
firstrule rules end
transaction rule-1 rule-2 end
45Language Extensions to Hardware Murphi (I)
--include spec.m
- Joint variables correspondence
correspondence
  u1[0..7] v1[1..8]
  u1 v2
end
46Language Extensions to Hardware Murphi (II)
transactionset p1:T1; p2:T2 do transaction ... end
rule:id guard ==> action
ruleset p1:T1; p2:T2 do rule:id ... end
47Language Extensions to Hardware Murphi (III)
<< id.guard() >>
<< id.action() >>
<< id[v1][v2].guard() >>
- Fine-grained assignments for write-write conflicts
var[i] <= data
48How to Annotate an Impl Model with Spec?
impl.m
transaction
  rule-1 g1 ==> a1
  rule-2 g2 ==> a2
end
spec.m
rule:id g ==> a
49How to Annotate an Impl Model with Spec?
impl.m
--include spec.m
correspondence u1 v1 end
transaction
  rule-1 g1 ==> a1
    << id.guard() >>
    << id.action() >>
  rule-2 g2 ==> a2
end
spec.m
rule:id g ==> a
50The Framework of Muv
Constant propagation; rule:id, <<id.guard()>>, ruleset, transactionset
pre-processor
parser
Hardware Murphi model
AST
AST
refinement check analysis
translator
AST
VHDL model
51Our Extensions to Muv
- Language extension support
- Refinement check assertions generation
- Ensure exclusive write to a variable
- Serializability for Spec rules
- Enableness for Spec rules
- Joint variables equivalence when inactive
- Mostly done with static analysis
52Refinement Extensions to Muv (I)
v := d
is translated to
for i: s1..s2 do assert (update_bits[i] = false) end
v := d
for i: s1..s2 do update_bits[i] := true end
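The write-write conflict assertion above can be mimicked in Python as a minimal sketch. The class `TrackedVar`, the exception `WriteConflict`, and the `end_cycle` reset are hypothetical names introduced for illustration; in particular, clearing the update bits at each cycle boundary is an assumption, not something stated on the slide.

```python
class WriteConflict(Exception):
    """Raised when one bit is assigned twice within a cycle."""


class TrackedVar:
    """A bit-vector variable with one update flag per bit."""

    def __init__(self, width):
        self.bits = [0] * width
        self.update_bits = [False] * width

    def assign(self, s1, s2, data):
        """Assign data (a list of bits) to bits s1..s2 inclusive."""
        if len(data) != s2 - s1 + 1:
            raise ValueError("data width does not match s1..s2")
        # Mirrors: for i: s1..s2 do assert (update_bits[i] = false) end
        for i in range(s1, s2 + 1):
            if self.update_bits[i]:
                raise WriteConflict(f"bit {i} already written this cycle")
        # Mirrors: v := d
        self.bits[s1:s2 + 1] = data
        # Mirrors: for i: s1..s2 do update_bits[i] := true end
        for i in range(s1, s2 + 1):
            self.update_bits[i] = True

    def end_cycle(self):
        """Clear the flags (assumed to happen at each cycle boundary)."""
        self.update_bits = [False] * len(self.bits)


v = TrackedVar(8)
v.assign(0, 3, [1, 0, 1, 1])  # disjoint bit ranges in one cycle are fine
v.assign(4, 7, [0, 0, 0, 1])
try:
    v.assign(3, 4, [1, 1])    # overlaps bits already written above
except WriteConflict as err:
    print("conflict:", err)
```

Tracking one flag per bit rather than per variable is what allows two rules to write disjoint slices of the same variable in one cycle without a false alarm.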
53Refinement Extensions to Muv (II)
- Serializability for specification rules
t1
t2
t3
t1
t2
S1
S0
S0
S2
S1
S1
t3
- Obtain read and write sets of variables of each rule
- Analyze read-write dependency
- Check for cycles
54Check for Dependency Cycles
t1
t3
t2
t1
t2
S1
S0
S0
S2
S1
S1
t3
t1: read v1, write v3
t2: read v2, write v1
t3: read v1, write v2
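The cycle check for this example can be sketched in Python. This is an illustration under assumptions, not the analysis implemented in Muv: the function names `dependency_graph` and `find_cycle` are hypothetical, and the edge direction (an edge from a to b when a writes a variable that b reads) is one possible convention.

```python
def dependency_graph(rules):
    """rules maps a rule name to (read_set, write_set)."""
    graph = {name: set() for name in rules}
    for a, (_, writes_a) in rules.items():
        for b, (reads_b, _) in rules.items():
            # Assumed convention: a -> b when a writes something b reads.
            if a != b and writes_a & reads_b:
                graph[a].add(b)
    return graph


def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {n: WHITE for n in graph}
    stack = []

    def visit(n):
        colour[n] = GREY
        stack.append(n)
        for m in sorted(graph[n]):
            if colour[m] == GREY:
                # Back edge: the cycle is the stack suffix starting at m.
                return stack[stack.index(m):]
            if colour[m] == WHITE:
                found = visit(m)
                if found:
                    return found
        stack.pop()
        colour[n] = BLACK
        return None

    for n in sorted(graph):
        if colour[n] == WHITE:
            found = visit(n)
            if found:
                return found
    return None


# Read and write sets taken from the slide.
rules = {
    "t1": ({"v1"}, {"v3"}),
    "t2": ({"v2"}, {"v1"}),
    "t3": ({"v1"}, {"v2"}),
}
print(find_cycle(dependency_graph(rules)))
```

Here t2 and t3 each read what the other writes, so they form a cycle, while t1 only consumes v1 and stays outside it.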
55Refinement Extensions to Muv (III)
- Enableness of specification rules
rule:id guard ==> action
is translated to
bool function id_guard(); void procedure id_action()
<< id.guard() >> << id.action() >>
is translated to
assert id_guard(); id_action()
56Refinement Extensions to Muv (IV)
- Joint variables equivalence when inactive
  - For each joint variable v
  - When all transactions that write to v are inactive
  - v must be equivalent in Impl and Spec
Transaction T1, Transaction T2
Assert inactive(T1) & inactive(T2) => v = v'
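A minimal Python sketch of this assertion follows. The function name `joint_var_violations` and the dictionary-based state representation are hypothetical choices made for illustration only.

```python
def joint_var_violations(impl, spec, writers, active):
    """Return the joint variables that violate the equivalence check.

    impl, spec: dicts mapping each joint variable to its current value
    writers:    dict mapping each joint variable to the set of
                transactions that write it
    active:     set of transactions currently in progress
    """
    violations = []
    for v, txns in writers.items():
        # Only compare v when no transaction that writes it is active.
        if txns.isdisjoint(active) and impl[v] != spec[v]:
            violations.append(v)
    return violations


writers = {"v": {"T1", "T2"}}
impl = {"v": 0}
spec = {"v": 1}
# T1 is mid-flight, so a temporary mismatch on v is tolerated.
print(joint_var_violations(impl, spec, writers, active={"T1"}))
# Both writers are inactive, so the mismatch is reported.
print(joint_var_violations(impl, spec, writers, active=set()))
```

Skipping the comparison while a writer is active is what permits the Impl to lag behind the Spec in the middle of a multi-step transaction.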
57Outline
- Project background
- Extensions to the tool path
- Experimental results
- Future work
58A Driving Protocol Benchmark
[Figure: two nodes connected through a Router; each node has Cache, Dir, and Mem, plus Local, Home, and Remote units with buffers (Buf)]
S. German and G. Janssen, IBM Research Tech Report, 2006
59More Detail of the Cache Example
- Hardware Murphi model
  - 2500 LOC
  - 15 transactionsets
- Generated VHDL
  - 1800 assertions, of which 1600 are write-write conflict check assertions
  - Took 16 min with SixthSense for all assertions
  - Took 13 min w/o write-write conflict checks
60Bugs Found with Refinement Check
- Benchmark satisfies cache coherence already
- Bugs still found
  - Bug 1: router unit loses messages
  - Bug 2: home unit replies twice for one request
  - Bug 3: cache unit gets updated twice from one reply
- Refinement check is an automatic way of constructing such checks
61Model Checking Approaches
- Monolithic
- Straightforward property check
- Compositional
- Divide and conquer
62Compositional Refinement Check
- Reduce the verification complexity
- Basic Techniques
- Abstraction
- Removing details to make verification easier
- Assume guarantee
- A simple form of induction which introduces
assumptions and justifies them
63Experimental Results
- Configurations
- 2 nodes, 2 addresses, SixthSense
[Figure: verification time (axis marks: 30 min, 1-day) vs. datapath width (1-bit, 10-bit) for the monolithic and compositional approaches]
64A Simple 2-Stage Pipelined Stack
- Push: push data, increase counter
- Pop: decrease counter, pop data
pipelined pushes
pipelined pops
overlapped pop push
65Outline
- Project background
- Extensions to the tool path
- Experimental results
- Future work
66Future Work
- Muv-like refinement check for interacting modules
  - RTL modules interacting via communication protocols
  - Interfaces involving buffers and pipelining
- Refinement of initial RTL protocols
  - Power-down issues
  - Post-silicon validation support
  - Runtime verification support
- Safe augmentation of verified protocols
  - Cheap re-verification