Title: Functional Test Generation using Property Decompositions for Validation of Pipelined Processors
1 Functional Test Generation using Property Decompositions for Validation of Pipelined Processors
- Heon-Mo Koo, Prabhat Mishra
- Dept. of Computer and Information Science and Engineering, University of Florida, USA
2 Outline
- Introduction
- Related Work
- Test Generation using Model Checking
- Functional Test Generation
- Design and Property Decompositions
- A Case Study
- Conclusion
3 Motivation
- Exponential growth of design complexity
- Deeply pipelined complex microarchitecture
- Logic bugs increase 3-4x per generation
- Up to 70% of design time and resources are spent during functional validation
- Functional validation is a major challenge in microprocessor design
4 Related Work
- Existing validation approaches
- Simulation-based techniques
- Formal methods
- Simulation is the most widely used form for microprocessor validation
- Uses random, pseudorandom, directed tests
- Random/pseudo-random test generation
- e.g., Genesys-Pro, Adir et al., DAC95, Shen et al., DAC99
- Directed Test Generation
- Ur et al., Campenhout et al., DAC99, Iwashita et al., ICCAD94
- Test Generation using Model Checking
- Mishra et al., DATE 2004
5 Processor Validation using Test Programs
- Verifies the functionality of the processor using assembly programs
- Diagram: Test Generator (TestGen), Test Program, Pipelined Processor, Check Result
- Test Program: MOV R1, 011; MOV R2, 010; ADD R3, R1, R2 (expected: R3 = 101)
- Check Result: R3 = 101?
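The check on this slide can be sketched as follows. This is a minimal illustration, assuming a hypothetical reference model `run_program` that supports only MOV (immediate) and ADD on 3-bit registers; it stands in for the real processor and is not part of the original flow.

```python
# Hypothetical reference model: executes a tiny test program and
# returns the final register file. Only MOV (immediate) and ADD are modeled.
def run_program(program, width=3):
    mask = (1 << width) - 1          # keep results within `width` bits
    regs = {}
    for opcode, *operands in program:
        if opcode == "MOV":          # MOV Rd, imm
            dest, imm = operands
            regs[dest] = imm & mask
        elif opcode == "ADD":        # ADD Rd, Rs1, Rs2
            dest, src1, src2 = operands
            regs[dest] = (regs[src1] + regs[src2]) & mask
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    return regs

# Test program from the slide, with its expected result.
test_program = [
    ("MOV", "R1", 0b011),
    ("MOV", "R2", 0b010),
    ("ADD", "R3", "R1", "R2"),
]
expected = {"R3": 0b101}

final_regs = run_program(test_program)
passed = all(final_regs[reg] == value for reg, value in expected.items())
print("PASS" if passed else "FAIL")
```

In a real flow the same program would run on both the design under validation and a golden model, and the two results would be compared.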
6 Test Gen. using Model Checking
- Processor model
- Desired behaviors
- Expressed in temporal logic properties
- Test generation algorithm
- Apply negated version of the property
- MC generates a counterexample
- An example: generate a test to stall the Decode unit
- Inputs to the model checker: the processor model and the negated property "Decode never stalled"
- Counterexample (test program):
Cycle  Opcode  Dest  Src1  Src2
1      NOP
2      ADD     R3    R1    R2
3      SUB     R4    R3    R2
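The idea of obtaining a test from a counterexample can be sketched with a toy explicit-state search. Everything here is an assumption for illustration: the three-instruction alphabet, the `step` function (Decode stalls on a read-after-write dependency with the previous instruction), and the breadth-first search standing in for a real model checker such as SMV.

```python
from collections import deque

# Hypothetical instruction alphabet: (opcode, dest, src1, src2).
INSTRUCTIONS = [
    ("NOP", None, None, None),
    ("ADD", "R3", "R1", "R2"),
    ("SUB", "R4", "R3", "R2"),
]

def step(state, instr):
    """Toy transition: the state is the destination register of the
    instruction issued in the previous cycle. Decode stalls when the
    current instruction reads that register."""
    pending_dest = state
    _, dest, src1, src2 = instr
    stalled = pending_dest is not None and pending_dest in (src1, src2)
    return dest, stalled

def find_counterexample(max_depth=5):
    """Breadth-first search for the shortest input sequence that
    violates the property 'Decode never stalled'."""
    queue = deque([(None, [])])      # (state, instruction trace)
    while queue:
        state, trace = queue.popleft()
        if len(trace) == max_depth:
            continue
        for instr in INSTRUCTIONS:
            next_state, stalled = step(state, instr)
            new_trace = trace + [instr]
            if stalled:
                return new_trace     # the counterexample is the test program
            queue.append((next_state, new_trace))
    return None

test = find_counterexample()
for cycle, (opcode, dest, src1, src2) in enumerate(test, start=1):
    print(cycle, opcode, dest, src1, src2)
```

The search returns the shortest violating sequence in this toy model, so it omits the leading NOP shown in the slide's counterexample.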
7 Test Gen. using Model Checking
- Mishra and Dutt, DATE 2004
- Diagram: Architecture Specification, ADL Specification, Simulator Generation, Simulator, Properties, SMV, Counterexamples, Test Programs (loop label: Not Enough)
- ADL: Architecture Description Language
- Problem: Test generation is very costly or not possible in many scenarios in the presence of complex processors and/or complex properties.
- Approach: Design and property decompositions to reduce the verification complexity
- Reduction of TG time and memory requirement
- Enables test generation in complex scenarios
8 Decomposition Challenges
- Partitioning is not possible in many cases
- Design/property partitions are related
- A partition may generate incorrect result
- Merging of local counterexamples
- P, D ? C
- p1, D ? C1, and p2, D ? C2
- C ? merge (C1, C2)
- Interfaces are not handled properly
- local properties ? global property
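The merging challenge can be made concrete with a small sketch. It assumes, purely for illustration, that a partial counterexample is a map from `(cycle, signal)` to a value; the function name `merge_counterexamples` and the signal names are hypothetical.

```python
def merge_counterexamples(c1, c2):
    """Merge two partial counterexamples given as
    {(cycle, signal): value} maps. Returns (merged, conflicts);
    on a conflict the value from c1 is kept and the clash is reported."""
    merged = dict(c1)
    conflicts = []
    for key, value in c2.items():
        if key in merged and merged[key] != value:
            conflicts.append((key, merged[key], value))
        else:
            merged[key] = value
    return merged, conflicts

# Hypothetical local counterexamples for two sub-properties that both
# constrain the same signal in the same cycle.
c1 = {(5, "decode.opcode"): "LD", (5, "decode.src1"): 0}
c2 = {(5, "decode.opcode"): "DIV", (5, "decode.src2"): 0}

merged, conflicts = merge_counterexamples(c1, c2)
print(conflicts)
```

A non-empty conflict list means the local counterexamples cannot simply be combined into a counterexample for the global property.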
9 Our Test Generation Methodology
- Architecture Specification (ADL Specification)
- 1 Design Decomposition
- 2 Property Decomposition
- 3 Test Generation
- Test Cases
10 Decomposition of Processor Model
- Module-level partition
- Functional unit
- Path-level partition
- Group modules on the same path
- Clustering
- Based on property
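A rough sketch of path-level partitioning and property-based clustering, assuming a hypothetical successor map `PIPELINE` for a DLX-like pipeline (MUL and FADD are shortened to two stages each) and a hypothetical helper `modules_for_property` that clusters the paths touching the modules a property references.

```python
# Hypothetical successor map for a DLX-like pipeline; the MUL and FADD
# pipelines are shortened to keep the example small.
PIPELINE = {
    "Fetch": ["Decode"],
    "Decode": ["IALU", "MUL1", "FADD1", "DIV"],
    "IALU": ["MEM"],
    "MUL1": ["MUL2"], "MUL2": ["MEM"],
    "FADD1": ["FADD2"], "FADD2": ["MEM"],
    "DIV": ["MEM"],
    "MEM": ["WriteBack"],
    "WriteBack": [],
}

def enumerate_paths(graph, node):
    """Path-level partition: every path from `node` to a sink."""
    if not graph[node]:
        return [[node]]
    return [[node] + rest
            for succ in graph[node]
            for rest in enumerate_paths(graph, succ)]

def modules_for_property(paths, referenced):
    """Clustering (assumed rule): union of all paths that contain
    at least one module referenced by the property."""
    cluster = set()
    for path in paths:
        if referenced & set(path):
            cluster.update(path)
    return cluster

paths = enumerate_paths(PIPELINE, "Fetch")
div_cluster = modules_for_property(paths, {"DIV"})
print(len(paths), sorted(div_cluster))
```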
11 Decomposition of Properties
- LTL property consists of
- Temporal operators
- G p: Globally p, p holds in every state (Always)
- F p: Future p, p will hold in a future state (Eventually)
- X p: Next p, p holds in the next state
- p U q: p Until q, p will hold until q holds
- p, q: propositional logic
- Boolean connectives: ∧, ∨, ¬, →
- Negation of properties
- ¬G(p) = F(¬p), ¬F(p) = G(¬p)
- ¬X(p) = X(¬p)
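The negation rules can be applied mechanically. The sketch below assumes a hypothetical tuple encoding of formulas and a helper named `negate`; it covers G, F, X, and, or, and not (the U operator is left out).

```python
# Formulas as nested tuples: ("G", f), ("F", f), ("X", f), ("not", f),
# ("and", f, g), ("or", f, g), or an atomic proposition given as a string.
def negate(f):
    """Return a formula equivalent to NOT f, using the dualities
    NOT G(p) = F(NOT p), NOT F(p) = G(NOT p), NOT X(p) = X(NOT p)
    and De Morgan's laws."""
    if isinstance(f, str):
        return ("not", f)
    op = f[0]
    if op == "not":
        return f[1]                  # double negation
    if op == "G":
        return ("F", negate(f[1]))
    if op == "F":
        return ("G", negate(f[1]))
    if op == "X":
        return ("X", negate(f[1]))
    if op == "and":
        return ("or", negate(f[1]), negate(f[2]))
    if op == "or":
        return ("and", negate(f[1]), negate(f[2]))
    raise ValueError(f"unsupported operator: {op}")

negated = negate(("F", ("and", "p", "q")))
print(negated)
```

Negating an "eventually" property over a conjunction yields an "always" property over a disjunction, which is the shape that decomposes.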
12 Decomposition of Properties
- Decomposable properties
- G(p∧q) = G(p) ∧ G(q)
- F(p∨q) = F(p) ∨ F(q)
- X(p∨q) = X(p) ∨ X(q)
- X(p∧q) = X(p) ∧ X(q)
- Not decomposable properties
- F(p∧q) ≠ F(p) ∧ F(q)
- G(p∨q) ≠ G(p) ∨ G(q)
- Introducing the time step, decomposable
- F((clk=t) ∧ (p ∧ q)) = F((clk=t) ∧ p) ∧ F((clk=t) ∧ q)
- G((clk≠t) ∨ (p∨q)) = G((clk≠t) ∨ p) ∨ G((clk≠t) ∨ q)
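The untimed rules can be sanity-checked by brute force on small traces. This sketch assumes a hypothetical tuple encoding of formulas and evaluates them on lasso-shaped traces (a finite list of states whose last state loops back to an earlier index). A result of `True` only means no distinguishing trace was found up to the length bound; it is not a proof.

```python
from itertools import product

def holds(f, trace, loop, i):
    """Evaluate formula f at position i of a lasso trace whose last
    state loops back to index `loop`."""
    n = len(trace)
    if isinstance(f, str):
        return f in trace[i]
    op = f[0]
    if op == "and":
        return holds(f[1], trace, loop, i) and holds(f[2], trace, loop, i)
    if op == "or":
        return holds(f[1], trace, loop, i) or holds(f[2], trace, loop, i)
    if op == "X":
        nxt = i + 1 if i + 1 < n else loop
        return holds(f[1], trace, loop, nxt)
    # Positions reachable from i: i..n-1 plus the loop part loop..n-1.
    future = range(min(i, loop), n)
    if op == "G":
        return all(holds(f[1], trace, loop, j) for j in future)
    if op == "F":
        return any(holds(f[1], trace, loop, j) for j in future)
    raise ValueError(f"unsupported operator: {op}")

# Each state is the set of atomic propositions that hold in it.
STATES = [frozenset(s) for s in ([], ["p"], ["q"], ["p", "q"])]

def equivalent(f, g, max_len=3):
    """True if f and g agree on every lasso trace up to max_len states."""
    for n in range(1, max_len + 1):
        for trace in product(STATES, repeat=n):
            for loop in range(n):
                if holds(f, trace, loop, 0) != holds(g, trace, loop, 0):
                    return False
    return True

g_and = equivalent(("G", ("and", "p", "q")), ("and", ("G", "p"), ("G", "q")))
f_and = equivalent(("F", ("and", "p", "q")), ("and", ("F", "p"), ("F", "q")))
print(g_and, f_and)
```

The two-state trace with p in the first state and q in the second is enough to separate F(p∧q) from F(p) ∧ F(q), which is why a time step is introduced.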
14 Test Generation: An Example
- Example of Multiple Exception
- Original property
- P = F((clk=7) ∧ (MEM.exception=1) ∧ (IALU.exception=1) ∧ (DIV.exception=1))
- Negated property
- P' = G((clk≠7) ∨ (MEM.exception≠1) ∨ (IALU.exception≠1) ∨ (DIV.exception≠1))
- Decompose into three sub-properties
- P1 = G((clk≠7) ∨ (MEM.exception≠1))
- P2 = G((clk≠7) ∨ (IALU.exception≠1))
- P3 = G((clk≠7) ∨ (DIV.exception≠1))
15 Test Generation: An Example
- Clk 7
- P1 = G((clk≠7) ∨ (MEM.exception≠1))
- P2 = G((clk≠7) ∨ (IALU.exception≠1))
- P3 = G((clk≠7) ∨ (DIV.exception≠1))
- Counterexamples
- CP1: load operation with memory address 0
- CP2: add operation with value 2 for both source operands (2-bit register)
- CP3: divide operation with 2nd src value 0
- Generate parent node property
- CP1 → IALU (P1)
- CP2 and CP3 → Decode (P23)
16 Test Generation: An Example
- Clk 7: same sub-properties P1, P2, P3 and counterexamples CP1, CP2, CP3 as on slide 15
- Generate parent node property
- CP1 → IALU (P1)
- CP2 and CP3 → Decode (P23)
- May cause conflict
17 Test Generation: An Example
- Clk 6
- P1 = G((clk≠6) ∨ (aluOp.opcode≠LD) ∨ (aluOp.src1Val≠0))
- P23 = G((clk≠6) ∨ (decOp0.opc≠ADD) ∨ (decOp0.src1Val≠2) ∨ (decOp0.src2Val≠2) ∨ (decOp3.opc≠DIV) ∨ (decOp3.src2Val≠0))
- Counterexamples
- CP1: load operation with memory addr 0
- CP23: add operation with value 2 for both source operands; divide operation with second source operand value 0
- Generate parent node property
- CP1 → Decode (p1)
- CP23 → Fetch (p23)
Figure: DLX pipeline (PC, Memory, Fetch, Decode, Reg File, IALU, MUL1, MUL2, MUL7, FADD1, FADD2, FADD3, FADD4, DIV, MEM, WriteBack)
18 Test Generation: An Example
- Clk 5
- P1 = G((clk≠5) ∨ (decOp0.opc≠LD) ∨ (decOp0.src1Val≠0))
- P23 = G((clk≠5) ∨ (fetOp0.opc≠ADD) ∨ (fetOp0.src1Val≠2) ∨ (fetOp0.src2Val≠2) ∨ (fetOp3.opc≠DIV) ∨ (fetOp3.src2Val≠0))
- Counterexamples
- CP1: load with memory address 0
- Generate parent node property
- CP1 → Fetch
- CP23 → Primary Input
Figure: DLX pipeline (same diagram as on slide 17)
19 Test Generation: An Example
- Clk 4
- P1 = G((clk≠4) ∨ (fetOp0.opc≠LD) ∨ (fetOp0.src1Val≠0))
- Generate parent node property
- CP1 → Primary Input
Figure: DLX pipeline (same diagram as on slide 17)
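The clock-by-clock walk in this example (Clk 7 down to the primary inputs) can be sketched as a backward propagation over the pipeline. The `PARENT` map and the `propagate` helper are hypothetical names; the sketch only tracks where each sub-property sits at each clock, not the property text or the counterexamples themselves.

```python
# Parent (upstream) node of each pipeline node, following the example.
PARENT = {
    "MEM": "IALU",
    "IALU": "Decode",
    "DIV": "Decode",
    "Decode": "Fetch",
    "Fetch": "PrimaryInput",
}

def propagate(start_clk, placement):
    """placement maps a sub-property name to the node it constrains at
    start_clk. Returns {clk: {node: [sub-properties]}} until every
    sub-property has reached the primary inputs."""
    schedule = {}
    clk, current = start_clk, dict(placement)
    while current:
        by_node = {}
        for prop, node in sorted(current.items()):
            # Sub-properties that meet at the same node are grouped,
            # e.g. P2 and P3 become P23 at Decode.
            by_node.setdefault(node, []).append(prop)
        schedule[clk] = by_node
        # Move each sub-property to its parent node for the previous cycle.
        current = {p: PARENT[n] for p, n in current.items()
                   if n != "PrimaryInput"}
        clk -= 1
    return schedule

schedule = propagate(7, {"P1": "MEM", "P2": "IALU", "P3": "DIV"})
for clk in sorted(schedule, reverse=True):
    print(clk, schedule[clk])
```

Under these assumptions P2 and P3 reach the primary inputs one cycle earlier than P1, matching the slides where CP23 maps to the primary input at Clk 5 and CP1 at Clk 4.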
20 Test Generation: An Example
Figure: pipeline fragment (PC, Memory, Fetch, Reg File, Decode, IALU, MUL1, FADD1, DIV)
21 Comparison of Test Gen. Methods
- Naïve: original properties are applied to whole design
- Existing: module properties are applied to module level design
- Our approach: decompositions of design and properties
- Our approach improves test generation time and memory requirement by an order-of-magnitude
- NA: Not Applicable
22 Conclusions
- Functional validation is a major bottleneck in microprocessor design
- Simulation using random and directed tests
- Model checking as test generation engine
- Capacity restriction poses practical challenges
- Our approach
- Design and property decompositions
- Decompose a temporal logic property
- Apply them to appropriate design partitions
- Merge the intermediate partial counterexamples
- Reduces test generation time and memory requirement
- Future work
- Optimal property and design partitions
- Apply using bounded model checking
24 Decomposition Scenarios
D  P  Comments
0  0  Infeasible for large designs
0  1  Merging counterexamples is not always possible
1  0  Only possible for module level properties
1  1  Merging intermediate partial counterexamples
D: Design, P: Property; 0: No decomposition, 1: Decomposed/partitioned
25 Specification of the DLX Processor
Structure
( ARCHITECTURE_SECTION
  ..........
  (FetchUnit Fetch
    (CAPACITY 4) (TIMING (all 1)) (OPCODES all)
    (LATCHES (OTHER PCLatch) (OUT DLatch))
  )
( PIPELINE_SECTION
  (PIPELINE Fetch Decode Execute MEM WB)
  (Execute (ALTERNATE ALU MUL FADD DIV))
  (FADD (PIPELINE FADD1 .. FADD3 FADD4))
  (DTPATHS
    (TYPE UNI (RF Decode P7 C4 P8) (WB RF P5 C3 P6))
    (TYPE BI (MEM MEMORY P4 C2 P3))
  )
Figure: DLX pipeline structure (PC, Memory, Fetch, Decode, Register File, IALU, MUL1, MUL2, MUL7, FADD1, FADD2, FADD3, FADD4, DIV, MEM, WriteBack)
30 Specification of the DLX Processor
Structure
( ARCHITECTURE_SECTION
  ..........
  (FetchUnit Fetch
    (CAPACITY 4) (TIMING (all 1)) (OPCODES all)
    (LATCHES (OTHER PCLatch) (OUT DLatch))
  )
( PIPELINE_SECTION
  (PIPELINE Fetch Decode Execute MEM WB)
  (Execute (ALTERNATE ALU MUL FADD DIV))
  (FADD (PIPELINE FADD1 .. FADD3 FADD4))
  (DTPATHS
    (TYPE UNI (RF Decode) (WB RF))
    (TYPE BI (MEM MEMORY))
  )
Figure: DLX pipeline structure (PC, Memory, Fetch, Decode, Register File, IALU, MUL1, MUL2, MUL7, FADD1, FADD2, FADD3, FADD4, DIV, MEM, WriteBack)
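Because the specification is written as nested parenthesized lists, the pipeline structure can be read out with a few lines of code. The sketch below is an illustration only: `tokenize` and `parse` are hypothetical helpers, not part of any ADL toolchain, and the input is a balanced fragment adapted from the slide with the elided FADD stages (`..`) kept as a literal token.

```python
def tokenize(text):
    """Split an s-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Parse one s-expression from the token list (consumed in place)."""
    token = tokens.pop(0)
    if token != "(":
        return token
    items = []
    while tokens[0] != ")":
        items.append(parse(tokens))
    tokens.pop(0)                    # drop the closing ")"
    return items

# Balanced fragment adapted from the slide; ".." is not expanded.
ADL = """
(PIPELINE_SECTION
  (PIPELINE Fetch Decode Execute MEM WB)
  (Execute (ALTERNATE ALU MUL FADD DIV))
  (FADD (PIPELINE FADD1 .. FADD3 FADD4))
)
"""

section = parse(tokenize(ADL))
# Top-level pipeline stages, in order.
top_level = next(e[1:] for e in section[1:] if e[0] == "PIPELINE")
# Units that expand into parallel alternatives.
alternates = {e[0]: e[1][1:] for e in section[1:]
              if isinstance(e[1], list) and e[1][0] == "ALTERNATE"}
print(top_level, alternates)
```

A graph built from `top_level` and `alternates` is the kind of structure on which design partitions can then be defined.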
31 Specification of the DLX Processor
Structure
Figure: DLX pipeline structure (PC, Memory, Fetch, Decode, Register File, IALU, MUL1, MUL2, MUL7, FADD1, FADD2, FADD3, FADD4, DIV, MEM, WriteBack)
Behavior
(OPCODE ADD
  (OPERANDS (SRC1 rf) (SRC2 imm) (DEST rf))
  (BEHAVIOR DEST = SRC1 + SRC2)
  (FORMAT )
)
32 Specification of the DLX Processor
Structure
Figure: DLX pipeline structure (same diagram as on slide 31)
Behavior
(OPCODE ADD
  (OPERANDS (SRC1 rf) (SRC2 imm) (DEST rf))
  (BEHAVIOR DEST = SRC1 + SRC2)
  (FORMAT )
)
Mapping
33 Functional Fault Model
- 1. Node Fault: A node does not execute correctly
- active, stalled, exception, flushed
- 2. Edge Fault: An edge does not transfer inst./data correctly
- active, stalled, flushed
- 3. Pipeline Fault: Incorrect execution due to multiple faults
- two simultaneous node/edge faults
Figure: pipeline graph (Fetch, Decode, ALU, AddrCalc, LdSt, MEM, RF, WB)
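The fault model can be enumerated directly from a graph description. In this sketch the node names come from the slide's figure, but the edge list is assumed (the figure's connections did not survive), and treating a pipeline fault as any pair of single faults on two different elements is one possible reading of "two simultaneous node/edge faults".

```python
from itertools import combinations

NODE_STATES = ["active", "stalled", "exception", "flushed"]
EDGE_STATES = ["active", "stalled", "flushed"]

# Node names from the figure; the edges below are assumed for illustration.
NODES = ["Fetch", "Decode", "ALU", "AddrCalc", "LdSt", "MEM", "RF", "WB"]
EDGES = [("Fetch", "Decode"), ("Decode", "ALU"), ("Decode", "AddrCalc"),
         ("AddrCalc", "LdSt"), ("ALU", "WB"), ("LdSt", "WB")]

# One fault per (element, behavior) pair.
node_faults = [("node", n, s) for n in NODES for s in NODE_STATES]
edge_faults = [("edge", e, s) for e in EDGES for s in EDGE_STATES]
single_faults = node_faults + edge_faults

# Pipeline faults: two simultaneous faults on two different elements.
pipeline_faults = [(a, b) for a, b in combinations(single_faults, 2)
                   if a[1] != b[1]]

print(len(node_faults), len(edge_faults), len(pipeline_faults))
```

Each enumerated fault corresponds to one property for which a test would be generated, so the counts give a feel for how quickly the number of properties grows.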
34 Functional Verification of SOC Designs
Chart (Source: Synopsys): growth across 1995, 2001, 2007
- Logic Gates: 1M, 10M, 100M
- Simulation Vectors: 100M, 10B, 1000B
- Engineer Years: 20, 200, 2000
- 71% of SOC re-spins are due to logic bugs
- Source: G. Spirakis, keynote address at DATE 2004
35 Applying Properties at Module-level
- Initialize registers Ain and Bin with values 2 and 3 at cycle 9
- Apply to DIV unit
- assert G((cycle=8) → X((Ain≠2) ∨ (Bin≠3)))
- input assignments: divInst.src1 = 2, divInst.src2 = 3
- Apply to Decode unit
- assert G((cycle=7) → X((divInst.src1≠2) ∨ (divInst.src2≠3)))
- input assignments: oper = DIV R3 R1 R2, RF[1] = 2, RF[2] = 3
- Apply to Fetch unit
- assert G((cycle=6) → X((oper.opcode≠DIV) ∨ (oper.src1≠1) ∨ (oper.src2≠2)))
- input assignments: PC = 5, Memory[5] = DIV R3 R1 R2
36 Test Generation Block Diagram