A New Reachability Algorithm for Symmetric Multi-processor Architecture

1
A New Reachability Algorithm for Symmetric Multi-processor Architecture
D. Sahoo (Stanford), J. Jain (Fujitsu), S. Iyer (UT-Austin), D. Dill (Stanford)
Formal Equivalence and Assertion-based Verification Workshop 2005
2
Outline
  • Standard Reachability Analysis
  • Multithreaded Reachability
  • Multithreaded Reachability in SMP machines
  • Engineering Issues
  • Results
  • Conclusion and Future Work

3
Related Work
  • Parallel Reachability Analysis
    • Stern and Dill, CAV 97
    • Stornetta and Brewer, DAC 96
    • Yang and O'Hallaron, 97
    • Heyman, Geist, Grumberg, Schuster, CAV 00
    • Garavel, Mateescu, Smarandache, SPIN 01
    • Pixley and Havlicek, 03

4
Reachability using BDD
Burch et al. 91
[Figure: from the initial states I, image computation with the partitioned transition relation (Tr1, ..., Tri, ..., Trn) produces the sets R1, R2, ..., Ri until the least fixed point is reached]
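The least-fixed-point iteration on this slide can be sketched with explicit state sets standing in for BDDs. This is only an illustration, not the BDD-based procedure of Burch et al.: the names `image` and `reachable` and the toy transition relation are assumptions introduced here, and the sketch assumes a disjunctive partitioning (the relation is the union of its parts) because that is easy to show with sets.

```python
def image(states, partitions):
    """Return all successors of `states` under a partitioned transition relation.

    Each partition is a dict mapping a state to the set of its successors.
    The full relation is assumed to be the union of the partitions.
    """
    successors = set()
    for part in partitions:
        for s in states:
            successors |= part.get(s, set())
    return successors


def reachable(initial, partitions):
    """Least fixed point of R = I + Image(R), expanding only the new frontier."""
    reached = set(initial)
    frontier = set(initial)
    while frontier:
        new_states = image(frontier, partitions) - reached
        reached |= new_states
        frontier = new_states
    return reached


# Hypothetical 5-state system split into two partitions.
# State 4 has an outgoing edge but no incoming one.
tr1 = {0: {1}, 1: {2}}
tr2 = {2: {0, 3}, 4: {0}}
print(sorted(reachable({0}, [tr1, tr2])))
```

The loop stops when an image step adds no new states, which is the fixed point the slide refers to.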
5
Partitioned Reachability using POBDD
POBDD: Jain 92; Reachability: Narayan et al. 97
Initial States I
6
Partitioned Reachability using POBDD
POBDD: Jain 92; Reachability: Narayan et al. 97
Initial States I
Local Fixed Point 1
Local Fixed Point 2
7
Partitioned Reachability using POBDD
POBDD: Jain 92; Reachability: Narayan et al. 97
Initial States I
Similarly repeat for other partitions
8
Partitioned Reachability using POBDD
POBDD: Jain 92; Reachability: Narayan et al. 97
Improvements: Iyer et al. 03, Sahoo et al. 04
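The scheme shown on the preceding slides (a local fixed point inside each partition, then repeating for the other partitions) can be sketched with explicit sets. This is an assumption-laden illustration rather than the POBDD algorithm itself: `owner`, `successors`, and `partitioned_reachability` are hypothetical names, and the state space is split by a simple function instead of BDD windows.

```python
def partitioned_reachability(initial, successors, owner, num_parts):
    """Explicit-state sketch of partitioned reachability.

    successors: dict mapping a state to the set of its next states
    owner:      function mapping a state to its partition index
    """
    reached = [set() for _ in range(num_parts)]
    pending = [set() for _ in range(num_parts)]  # states waiting to be explored
    for s in initial:
        pending[owner(s)].add(s)

    while any(pending):
        for p in range(num_parts):
            # Local fixed point: keep exploring inside partition p only.
            frontier = pending[p] - reached[p]
            pending[p] = set()
            while frontier:
                reached[p] |= frontier
                nxt = set()
                for s in frontier:
                    nxt |= successors.get(s, set())
                # Communication: hand over states owned by other partitions.
                for t in nxt:
                    if owner(t) != p:
                        pending[owner(t)].add(t)
                frontier = {t for t in nxt if owner(t) == p} - reached[p]
    return set().union(*reached)


# Hypothetical example: even states live in partition 0, odd states in partition 1.
edges = {0: {1}, 1: {2}, 2: {0, 3}, 4: {0}}
print(sorted(partitioned_reachability({0}, edges, lambda s: s % 2, 2)))
```

The outer loop ends when no partition has received new states, so the union of the local results equals the ordinary reachable set.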
9
Motivation for Multi-threaded Approach
  • Scheduling Problem
  • Increasing availability of powerful SMP machines
  • Multi-threading is a way of achieving real
    parallelism in SMP machines

10
Multi-threaded Reachability (DAC 05)
Naïve parallelization
[Figure: timeline diagram; axis: Time]
  • Advantage
    • Parallel speedup
    • Catches a bug faster than the sequential version
  • Problems
    • Not much parallelism

11
Multi-threaded Reachability (DAC 05)
Early Communication
[Figure: timeline diagram; axis: Time]
  • Advantage
    • Parallel speedup
    • Finishes the reachability analysis faster
    • Catches a bug faster than the naïve version
  • Problems
    • Parallelism could be better

12
Multi-threaded Reachability (DAC 05)
Early Communication and Partial Communication
[Figure: timeline diagram; axis: Time]
  • Advantage
    • Parallel speedup
    • Finishes the reachability analysis faster
    • Catches a bug faster than the previous versions

13
Reachability in SMP Architecture
[Figure: timeline diagram; axis: Time]
  • We find bugs faster!
  • Improved parallelism
  • Better parallel speedup

14
Engineering Issues
  • Thread-safe BDD library
  • Deterministic behavior
  • Smart thread scheduling

15
Sources of Non-determinism
  • Extensive memory-based optimizations
    • Pointer comparisons
    • Hashing based on memory address
  • Solutions
    • Deterministic hashing
    • Deterministic comparisons

Thread 1
Thread 2
p = malloc()
p = malloc()
key = hash(p)
if (p > p1)
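One possible way to make hashing and comparisons independent of memory addresses is to give every node a serial number and use that in place of the pointer. The slide does not say how the authors did it, so the sketch below is an assumption: the `Node` class, its `uid` field, and the hash constants are all hypothetical, and it presumes nodes are created in a reproducible order (for example, one counter per partition).

```python
import itertools


class Node:
    """A BDD-like node identified by a serial number instead of its address."""

    _serial = itertools.count()

    def __init__(self, var, low=None, high=None):
        self.var, self.low, self.high = var, low, high
        self.uid = next(Node._serial)  # assigned in creation order

    def key(self):
        """Hash key built from the variable index and child serial numbers only."""
        low = self.low.uid if self.low is not None else -1
        high = self.high.uid if self.high is not None else -1
        return (self.var * 12582917 + low * 4256249 + high * 741457) % 1000003


def ordered(a, b):
    """Deterministic replacement for a pointer comparison such as `p > p1`."""
    return (a, b) if a.uid <= b.uid else (b, a)
```

Because neither `key` nor `ordered` looks at `id()` or any address, two runs that create nodes in the same order get the same hash buckets and the same operand order.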
16
Sources of Non-determinism
  • Thread synchronization
  • Solutions
    • Synchronization based on a deterministic count
      • Number of ITE operations
      • Number of Sift operations

[Figure: Thread 1 and Thread 2, with image computations labeled Image n and Image n+1]
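Count-based synchronization can be illustrated with threads that exchange data after a fixed number of operations rather than after a time slice. The slide names only the counters (ITE and Sift operations); the barrier scheme, the `run` function, and the arithmetic that stands in for BDD work are assumptions made for this sketch.

```python
import threading


def run(num_threads=2, ops_per_round=100, rounds=5):
    """Workers exchange data after a fixed operation count, not after a time slice."""
    barrier = threading.Barrier(num_threads)
    mailbox = [0] * num_threads  # one slot per worker, written only by its owner
    log = [[] for _ in range(num_threads)]

    def worker(tid):
        local = tid + 1
        for _ in range(rounds):
            for _ in range(ops_per_round):  # stand-in for counted ITE operations
                local = (local * 31 + 7) % 1000003
            mailbox[tid] = local
            barrier.wait()           # every worker has published its value
            received = sum(mailbox)  # all workers read the same snapshot
            barrier.wait()           # every worker has read before the next write
            local = (local + received) % 1000003
            log[tid].append(local)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Since the exchange points depend only on the operation count, repeated calls to `run()` return the same log regardless of how the operating system interleaves the threads.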
17
Smart Thread Scheduling
  • Each processor has its own cache
  • A thread is assigned to a processor
  • The cache fills up with the thread's memory usage
  • The same thread is assigned to a different processor
    after some time
  • A large number of unnecessary cache misses occur when
    the thread uses its previously used memory locations
  • Solutions
    • Bind the thread to a processor
    • Leads to suboptimal throughput if the number of
      threads exceeds the number of processors

[Figure: CPU1 with Cache1 holding 0x07ffd0; CPU2 with Cache2; a lookup of 0x07ffd0 results in a cache miss]
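Binding a thread to a processor can be tried from Python with `os.sched_setaffinity`, which exists only on some platforms (notably Linux, where a pid of 0 refers to the caller). The helper name `pin_to_cpu` is hypothetical, and this is a sketch of the general idea rather than what the authors' tool does.

```python
import os


def pin_to_cpu(cpu_index=0):
    """Try to bind the caller to one CPU; return the CPU id, or None if not possible."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without affinity support
    allowed = sorted(os.sched_getaffinity(0))
    cpu = allowed[cpu_index % len(allowed)]  # wraps around when indexes exceed CPUs
    try:
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return None
    return cpu
```

The modulo wrap means that with more threads than CPUs several threads end up sharing one processor, which is the throughput problem the slide warns about.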
18
BDD Performance: CUDD vs. New
BDD Statistics after Reachability Analysis (Static Order)
                         CUDD                                         New
Ckts    P/F  img  nodes  Mem (MB)  Cache hits  Cache collision  Time  Mem (MB)  Cache hits  Cache collision  Time
bpb     F    10   1.8M   50M       41.0        90.4             18.6  61M       41.0        88.2             26.3
eight   P    47   79K    6.1M      42.9        26.2             0.8   7.5M      42.9        26.2             1.5
fru32   F    2    8K     9.2M      34.0        28.4             7.9   10.9M     34.0        28.9             8.9
idu32   F    1    36K    6.6M      28.8        5.0              4.2   7.8M      28.7        7.7              4.5
usbphy  P    1    90K    6.4M      37.7        16.6             0.7   7.8M      37.7        17.1             0.7
19
BDD Performance: CUDD vs. New
20
Performance: Non-deterministic vs. Deterministic
Verification Time in Sec
Ckts  Non-deterministic  Deterministic
c1    T/O                227
c2    962                917
c3    809                62
c4    903                161
d1    13                 13
d2    24                 30
d3    84                 100
d4    30                 38
d5    13                 37
21
Performance: Cache or Parallelism
Verification Time in Sec
Ckts  Uniprocessor  Sequential in 8-way SMP  Parallel in 8-way SMP
c1    1570          286                      227
d1    125           13                       13
d2    180           39                       30
d3    295           130                      100
d4    176           60                       38
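The numbers on this slide can be split into two ratios: how much faster the sequential run is on the 8-way SMP machine than on the uniprocessor, and how much multi-threading adds on top of that. The function name `speedups` and the labels "machine" and "threads" are introduced here for illustration; the times are the ones listed on the slide.

```python
# Verification times in seconds from this slide:
# (uniprocessor, sequential on 8-way SMP, parallel on 8-way SMP)
times = {
    "c1": (1570, 286, 227),
    "d1": (125, 13, 13),
    "d2": (180, 39, 30),
    "d3": (295, 130, 100),
    "d4": (176, 60, 38),
}


def speedups(uni, seq_smp, par_smp):
    """Return (machine factor, threading factor, overall speedup)."""
    return uni / seq_smp, seq_smp / par_smp, uni / par_smp


for ckt, row in times.items():
    machine, threads, total = speedups(*row)
    print(f"{ckt}: machine {machine:.2f}x, threads {threads:.2f}x, total {total:.2f}x")
```

The overall speedup is the product of the two factors, so a circuit such as d1, whose sequential and parallel SMP times are equal, owes all of its gain to the first factor.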
22
Results on Industrial Circuits
                      Parallel Multi-threaded Approaches
                      Naïve   Early Comm  Early Comm + Partial Comm
Ckt  Vis   Seq POBDD  8 CPUs  8 CPUs      1 CPU  8 CPUs
c1   371   T/O        T/O     T/O         286    227
c2   3346  1789       1564    93          917    917
c3   2540  T/O        T/O     T/O         228    62
c4   2236  2084       1174    161         509    161
d1   6     T/O        T/O     13          13     13
d2   10    11         13      45          39     30
d3   15    21         23      100         130    100
d4   11    T/O        T/O     39          60     38
d5   12    16         15      34          37     37
23
Results on Public Benchmarks
                           Parallel Multi-threaded Approaches
                           Naïve   Early Comm  Early Comm + Partial Comm
Ckt       Vis   Seq POBDD  8 CPUs  8 CPUs      1 CPU  8 CPUs
spprod    891   61         53      93          510    440
am2910    T/O   281        122     204         386    356
palu      273   4          9       8           9      9
S1269b-1  3635  T/O        T/O     59          72     60
S1269b-5  2287  T/O        T/O     55          67     55
blackjck  T/O   1213       470     340         98     70
24
Results: Gantt charts
Real execution traces from our multi-threaded reachability program
25
Conclusion and Future Work
  • Parallelize the Reachability
  • Multi-threaded Reachability
  • Better results
  • Deterministic behavior
  • Future Work
    • Improve the parallelism further
    • Study cache behavior