Title: Modeling Data in Formal Verification: Bits, Bit Vectors, or Words
1. Modeling Data in Formal Verification: Bits, Bit Vectors, or Words
Randal E. Bryant, Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
2. Overview
- Issue
  - How should data be modeled in logical analysis?
  - Verification, synthesis, test generation, security analysis, ...
- Approaches
  - Bits: Every bit is represented individually
  - Words: View each word as arbitrary value
    - E.g., unbounded integers
    - Historic program verification work
  - Bit Vectors: Finite precision words
    - Captures true semantics of hardware and software
3. Bit-Level Modeling
[Figure: Control Logic]
- Represent Every Bit of State Individually
  - Behavior expressed as Boolean next-state functions over current state
  - Historic method for most CAD, testing, and verification tools
    - E.g., model checkers, test generators
4. Bit-Level Modeling in Practice
- Strengths
  - Allows precise modeling of system
  - Well developed technology
    - BDDs & SAT for Boolean reasoning
- Limitations
  - Every state bit introduces two Boolean variables
    - Current state & next state
  - Overly detailed modeling of system functions
    - Don't want to capture full details of FPU
- Making It Work
  - Use extensive abstraction to reduce bit count
  - Hard to abstract functionality
5. Word-Level Abstraction 1: Bits → Integers
[Figure: bits x0, x1, x2, ..., xn-1]
- View Data as Symbolic Words
  - Arbitrary integers
    - No assumptions about size or encoding
    - Classic model for reasoning about software
  - Can store in memories & registers
6. Abstracting Data Bits
[Figure: Control Logic]
7. Word-Level Abstraction 2: Uninterpreted Functions
[Figure: block f]
- For any Block that Transforms or Evaluates Data
  - Replace with generic, unspecified function
  - Only assumed property is functional consistency
    - $a = x \land b = y \Rightarrow f(a, b) = f(x, y)$
8. Abstracting Functions
[Figure: Control Logic; Data Path; Com. Log. 1]
- For Any Block that Transforms Data
  - Replace by uninterpreted function
  - Ignore detailed functionality
  - Conservative approximation of actual system
9. Word-Level Modeling: History
- Historic
  - Used by theorem provers
- More Recently
  - Burch & Dill, CAV '94
    - Verify that pipelined processor has same behavior as unpipelined reference model
    - Use word-level abstractions of data paths and memories
    - Use decision procedure to determine equivalence
  - Bryant, Lahiri, Seshia, CAV '02
    - UCLID verifier
    - Tool for describing & verifying systems at word level
10. Pipeline Verification Example
[Figure: Pipelined Processor; Reference Model]
11. Abstracted Pipeline Verification
[Figure: Pipelined Processor; Reference Model]
12. Experience with Word-Level Modeling
- Powerful Abstraction Tool
  - Allows focus on control of large-scale system
  - Can model systems with very large memories
- Hard to Generate Abstract Model
  - Hand-generated: how to validate?
  - Automatic abstraction: limited success
    - Andraus & Sakallah, DAC 2004
- Realistic Features Break Abstraction
  - E.g., set ALU function to A+0 to pass operand to output
- Desire
  - Should be able to mix detailed bit-level representation with abstracted word-level representation
13. Bit Vectors: Motivating Example 1

int abs(int x) {
    int mask = x >> 31;
    return (x ^ mask) + ~mask + 1;
}

int test_abs(int x) {
    return (x < 0) ? -x : x;
}

- Do these functions produce identical results?
- Strategy
  - Represent and reason about bit-level program behavior
  - Specific to machine word size, integer representations, and operations
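For a fixed word size, one way to answer the question is plain exhaustive enumeration. The sketch below does this for 16-bit words rather than the 32-bit `int` on the slide, models two's complement with `uint16_t` so that wraparound is well defined in C, and assumes the bit-trick version reads `(x ^ mask) + ~mask + 1`. All function names are illustrative.

```c
#include <stdint.h>

/* 16-bit two's-complement words are modeled with uint16_t so that every
   operation wraps modulo 2^16 with fully defined C behavior. */

/* Bit-trick version: mask is all ones when the sign bit is set, else zero.
   This models the arithmetic shift x >> 15 on a 16-bit signed word. */
static uint16_t abs_bits16(uint16_t x) {
    uint16_t mask = (x & 0x8000u) ? 0xFFFFu : 0x0000u;
    return (uint16_t)((x ^ mask) + (uint16_t)~mask + 1u);
}

/* Reference version: negate (modulo 2^16) when the sign bit is set. */
static uint16_t abs_ref16(uint16_t x) {
    return (x & 0x8000u) ? (uint16_t)(0u - x) : x;
}

/* Returns the number of 16-bit inputs on which the two versions differ. */
static unsigned long count_abs_mismatches16(void) {
    unsigned long mismatches = 0;
    for (uint32_t v = 0; v <= 0xFFFFu; v++) {
        if (abs_bits16((uint16_t)v) != abs_ref16((uint16_t)v)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Under this model the two versions agree on every input, including the most negative value 0x8000, which both map to itself. Enumeration stops being practical as the word size or the number of inputs grows, which is where a bit-vector decision procedure comes in.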
14. Motivating Example 3

bit[W] popSpec(bit[W] x) {
    int cnt = 0;
    for (int i = 0; i < W; i++) {
        if (x[i]) cnt++;
    }
    return cnt;
}

bit[W] popSketch(bit[W] x) {
    loop (??) {
        x = (x & ??) + ((x >> ??) & ??);
    }
    return x;
}

- Is there a way to expand the program sketch to make it match the spec?
- Answer
  - W = 16
  - [Solar-Lezama, et al., ASPLOS '06]

x = (x & 0x5555) + ((x >> 1) & 0x5555);
x = (x & 0x3333) + ((x >> 2) & 0x3333);
x = (x & 0x0077) + ((x >> 8) & 0x0077);
x = (x & 0x000f) + ((x >> 4) & 0x000f);
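The completed sketch can be checked against the specification by brute force for W = 16. The following is a minimal C rendering of both, assuming the four lines of the answer read `x = (x & C) + ((x >> S) & C)` with the constants shown on the slide; the function names are illustrative and not taken from the Sketch system.

```c
#include <stdint.h>

/* Specification: count the set bits of a 16-bit word one bit at a time. */
static unsigned pop_spec16(uint16_t x) {
    unsigned cnt = 0;
    for (int i = 0; i < 16; i++) {
        if ((x >> i) & 1u) {
            cnt++;
        }
    }
    return cnt;
}

/* Completed sketch with the four mask/shift constants from the slide. */
static unsigned pop_sketch16(uint16_t x) {
    x = (uint16_t)((x & 0x5555u) + ((x >> 1) & 0x5555u)); /* 2-bit sums */
    x = (uint16_t)((x & 0x3333u) + ((x >> 2) & 0x3333u)); /* 4-bit sums */
    x = (uint16_t)((x & 0x0077u) + ((x >> 8) & 0x0077u)); /* fold high byte */
    x = (uint16_t)((x & 0x000fu) + ((x >> 4) & 0x000fu)); /* fold nibbles */
    return x;
}

/* Returns the number of 16-bit inputs on which sketch and spec differ. */
static unsigned long count_pop_mismatches16(void) {
    unsigned long mismatches = 0;
    for (uint32_t v = 0; v <= 0xFFFFu; v++) {
        if (pop_sketch16((uint16_t)v) != pop_spec16((uint16_t)v)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

The third step folds the high byte into the low byte before the nibbles are combined, an ordering a person would be unlikely to write. It still works because each nibble holds at most 4 after the second step, so the 3-bit masks lose nothing and the nibble sums cannot carry.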
15. Motivating Example 4
[Figure: Sequential Reference Model; Pipelined Microprocessor]
- Is pipelined microprocessor identical to sequential reference model?
- Strategy
  - Represent machine instructions, data, and state as bit vectors
    - Compatible with hardware description language representation
  - Verifier finds abstractions automatically
16. Bit Vector Formulas
- Fixed width data words
- Arithmetic operations
  - Add/subtract/multiply/divide, ...
  - Two's complement, unsigned, ...
- Bit-wise logical operations
  - Bitwise and/or/xor, shift/extract, concatenate
- Predicates
  - ==, <
- Task
  - Is formula satisfiable?
  - E.g., $a > 0 \land a * a < 0$
    - 50000 * 50000 = -1794967296 (on 32-bit machine)
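The witness for this formula can be reproduced directly. The sketch below multiplies modulo 2^32 in unsigned arithmetic, since signed overflow is undefined in C, and then reinterprets the result as a two's-complement value using a 64-bit intermediate. The helper names are illustrative.

```c
#include <stdint.h>

/* Multiply two 32-bit words modulo 2^32, then return the product
   interpreted as a 32-bit two's-complement value. */
static int64_t mul_wrap32(int32_t a, int32_t b) {
    uint64_t wide = (uint64_t)(uint32_t)a * (uint64_t)(uint32_t)b;
    uint32_t product = (uint32_t)wide;              /* keep the low 32 bits */
    if (product & 0x80000000u) {
        return (int64_t)product - 4294967296LL;     /* sign bit set: subtract 2^32 */
    }
    return (int64_t)product;
}

/* Evaluates the formula a > 0 && a * a < 0 under 32-bit semantics. */
static int satisfies_formula32(int32_t a) {
    return a > 0 && mul_wrap32(a, a) < 0;
}
```

With a = 50000 the true product 2500000000 exceeds 2^31 - 1, so the wrapped value is 2500000000 - 2^32 = -1794967296. Over unbounded integers the same formula is unsatisfiable, which is exactly the kind of finite word-size effect the word-level model overlooks.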
17. Decision Procedures
- Core technology for formal reasoning
- Boolean SAT
  - Pure Boolean formula
- SAT Modulo Theories (SMT)
  - Support additional logic fragments
  - Example theories
    - Linear arithmetic over reals or integers
    - Functions with equality
    - Bit vectors
    - Combinations of theories
18. Recent Progress in SAT Solving
19. BV Decision Procedures: Some History
- B.C. (Before Chaff)
  - String operations (concatenate, field extraction)
  - Linear arithmetic with bounds checking
  - Modular arithmetic
- Limitations
  - Cannot handle full range of bit-vector operations
20. BV Decision Procedures: Using SAT
- SAT-Based Bit Blasting
  - Generate Boolean circuit based on bit-level behavior of operations
  - Convert to Conjunctive Normal Form (CNF) and check with best available SAT checker
  - Handles arbitrary operations
- Effective in Many Applications
  - CBMC [Clarke, Kroening, Lerda, TACAS '04]
  - Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV '05]
  - CVC-Lite [Dill, Barrett, Ganesh], Yices [deMoura, et al.]
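To make "Boolean circuit based on bit-level behavior" concrete, the sketch below expands an 8-bit addition into a ripple-carry adder built only from Boolean connectives. This is the kind of circuit that would then be converted to CNF; the CNF conversion and the SAT call are not shown. The word width and all names are assumptions for illustration and do not describe how any of the listed tools is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

#define WIDTH 8  /* word size chosen for the illustration */

/* Unpack a word into individual Boolean signals, least significant first. */
static void to_bits(uint8_t word, bool bits[WIDTH]) {
    for (int i = 0; i < WIDTH; i++) {
        bits[i] = ((word >> i) & 1u) != 0;
    }
}

/* Pack Boolean signals back into a word. */
static uint8_t from_bits(const bool bits[WIDTH]) {
    uint8_t word = 0;
    for (int i = 0; i < WIDTH; i++) {
        if (bits[i]) {
            word = (uint8_t)(word | (1u << i));
        }
    }
    return word;
}

/* Ripple-carry adder written with Boolean operations only.
   The final carry is dropped, giving addition modulo 2^WIDTH. */
static void blast_add(const bool a[WIDTH], const bool b[WIDTH], bool sum[WIDTH]) {
    bool carry = false;
    for (int i = 0; i < WIDTH; i++) {
        bool a_xor_b = a[i] != b[i];
        sum[i] = a_xor_b != carry;
        carry = (a[i] && b[i]) || (a_xor_b && carry);
    }
}

/* Word-level wrapper around the bit-level circuit. */
static uint8_t add_via_bits(uint8_t x, uint8_t y) {
    bool a[WIDTH], b[WIDTH], s[WIDTH];
    to_bits(x, a);
    to_bits(y, b);
    blast_add(a, b, s);
    return from_bits(s);
}

/* Returns the number of operand pairs where the circuit disagrees
   with ordinary addition modulo 2^WIDTH. */
static unsigned long count_add_mismatches(void) {
    unsigned long mismatches = 0;
    for (unsigned x = 0; x < 256; x++) {
        for (unsigned y = 0; y < 256; y++) {
            if (add_via_bits((uint8_t)x, (uint8_t)y) != (uint8_t)(x + y)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

An adder costs a handful of gates per bit. A multiplier built the same way grows roughly quadratically in the word width, which is one reason multiplication is singled out later as a difficult case.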
21. Bit-Vector Challenge
- Is there a better way than bit blasting?
- Requirements
  - Provide same functionality as with bit blasting
  - Find abstractions based on word-level structure
  - Improve on performance of bit blasting
- Observation
  - Must have bit blasting at core
    - Only approach that covers full functionality
  - Want to exploit special cases
    - Formula satisfied by small values
    - Simple algebraic properties imply unsatisfiability
    - Small unsatisfiable core
    - Solvable by modular arithmetic
22. Some Recent Ideas
- Iterative Approximation
  - UCLID [Bryant, Kroening, Ouaknine, Seshia, Strichman, Brady, TACAS '07]
  - Use bit blasting as core technique
  - Apply to simplified versions of formula
  - Successive approximations until solve or show unsatisfiable
- Using Modular Arithmetic
  - STP [Ganesh & Dill, CAV '07]
  - Algebraic techniques to solve special case forms
- Layered Approach
  - MathSat [Bruttomesso, Cimatti, Franzen, Griggio, Hanna, Nadel, Palti, Sebastiani, CAV '07]
  - Use successively more detailed solvers
23. Iterative Approach Background: Approximating Formula
[Figure: original formula $\varphi$]
- Example Approximation Techniques
  - Underapproximating
    - Restrict word-level variables to smaller ranges of values
  - Overapproximating
    - Replace subformula with Boolean variable
24. Starting Iterations
[Figure: $\varphi$ and $\varphi_1^-$]
- Initial Underapproximation
  - (Greatly) restrict ranges of word-level variables
  - Intuition: Satisfiable formula often has small-domain solution
25. First Half of Iteration
[Figure: $\varphi$ and $\varphi_1^-$]
- SAT Result for $\varphi_1^-$
  - Satisfiable
    - Then have found solution for $\varphi$
  - Unsatisfiable
    - Use UNSAT proof to generate overapproximation $\varphi_1^+$
    - (Described later)
26. Second Half of Iteration
[Figure: $\varphi_1^+$, $\varphi$, and $\varphi_1^-$]
- SAT Result for $\varphi_1^+$
  - Unsatisfiable
    - Then have shown $\varphi$ unsatisfiable
  - Satisfiable
    - Solution indicates variable ranges that must be expanded
    - Generate refined underapproximation
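27. Iterative Behavior
[Figure: overapproximations $\varphi_1^+, \varphi_2^+, \ldots, \varphi_k^+$ above $\varphi$; underapproximations $\varphi_1^-, \varphi_2^-, \ldots, \varphi_k^-$ below]
- Underapproximations
  - Successively more precise abstractions of $\varphi$
  - Allow wider variable ranges
- Overapproximations
  - No predictable relation
  - UNSAT proof not unique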
28. Overall Effect
- Soundness
  - Only terminate with solution on underapproximation
  - Only terminate as UNSAT on overapproximation
- Completeness
  - Successive underapproximations approach $\varphi$
  - Finite variable ranges guarantee termination
    - In worst case, get $\varphi_k^- \equiv \varphi$
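The underapproximation half of the loop can be illustrated in a few lines. The sketch below is a deliberately simplified stand-in. Brute-force enumeration replaces the SAT call, every variable is widened by one bit per round instead of being guided by an overapproximation, and UNSAT is only reported once the full 8-bit range has been searched (the worst case named above). The two example formulas and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS 8

/* A formula over two 8-bit words. */
typedef bool (*formula_fn)(uint8_t x, uint8_t y);

/* Hypothetical satisfiable formula: x * y == 143 (mod 256) and x > y. */
static bool product_is_143(uint8_t x, uint8_t y) {
    return (uint8_t)(x * y) == 143 && x > y;
}

/* Hypothetical unsatisfiable formula: x * x == 2 (mod 256). */
static bool square_is_2(uint8_t x, uint8_t y) {
    (void)y;
    return (uint8_t)(x * x) == 2;
}

/* Solve the underapproximation in which both variables are restricted
   to values below 2^bits. Enumeration stands in for the SAT solver. */
static bool solve_restricted(formula_fn phi, int bits, uint8_t *x_out, uint8_t *y_out) {
    unsigned limit = 1u << bits;
    for (unsigned x = 0; x < limit; x++) {
        for (unsigned y = 0; y < limit; y++) {
            if (phi((uint8_t)x, (uint8_t)y)) {
                *x_out = (uint8_t)x;
                *y_out = (uint8_t)y;
                return true;
            }
        }
    }
    return false;
}

/* Widen the ranges until a solution appears. Returns the number of bits
   at which a solution was found, or 0 if the full-width search fails,
   in which case the formula is unsatisfiable. */
static int solve_by_widening(formula_fn phi, uint8_t *x_out, uint8_t *y_out) {
    for (int bits = 1; bits <= WORD_BITS; bits++) {
        if (solve_restricted(phi, bits, x_out, y_out)) {
            return bits;
        }
    }
    return 0;
}
```

For the first formula a solution (x = 13, y = 11) appears once the variables are allowed 4 bits, well before the full range is needed. For the second the loop degenerates into a complete search, which is the case the UNSAT-proof-based overapproximation is meant to cut short.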
29. Generating Overapproximation
- Given
  - Underapproximation $\varphi_1^-$
  - Bit-blasted translation of $\varphi_1^-$ into Boolean formula
  - Proof that Boolean formula unsatisfiable
- Generate
  - Overapproximation $\varphi_1^+$
  - If $\varphi_1^+$ satisfiable, must lead to refined underapproximation
    - Generate $\varphi_2^-$ such that $\varphi_1^- \Rightarrow \varphi_2^- \Rightarrow \varphi$
30. Bit-Vector Formula Structure
- DAG representation to allow shared subformulas
[Figure: DAG for $\varphi$]
31. Structure of Underapproximation
[Figure: DAG for $\varphi^-$]
- Linear complexity translation to CNF
  - Each word-level variable encoded as set of Boolean variables
  - Additional Boolean variables represent subformula values
32. UNSAT Proof
- Subset of clauses that is unsatisfiable
- Clause variables define portion of DAG
- Subgraph that cannot be satisfied with given range constraints
[Figure: example formula DAG over variables x, y, z, w, v, and a]
33. Extracting Circuit from UNSAT Proof
- Subgraph that cannot be satisfied with given range constraints
  - Even when the rest of the graph is replaced with unconstrained variables
[Figure: UNSAT subgraph over x, y, z, and a, with the rest of the graph replaced by variables b1 and b2]
34. Generated Overapproximation
- Remove range constraints on word-level variables
- Creates overapproximation
  - Ignores correlations between values of subformulas
[Figure: $\varphi_1^+$ as the extracted subgraph over x, y, z, and a, with variables b1 and b2]
35. Refinement Property
- Claim
  - $\varphi_1^+$ has no solutions that satisfy $\varphi_1^-$'s range constraints
    - Because $\varphi_1^+$ contains portion of $\varphi_1^-$ that was shown to be unsatisfiable under range constraints
[Figure: $\varphi_1^+$ with the UNSAT subgraph over x, y, z, and a, and variables b1 and b2]
36. Refinement Property (Cont.)
- Consequence
  - Solving $\varphi_1^+$ will expand range of some variables
  - Leading to more exact underapproximation $\varphi_2^-$
[Figure: $\varphi_1^+$ as the subgraph over x, y, z, and a, with variables b1 and b2]
37. Effect of Iteration
[Figure: $\varphi_1^+$, $\varphi$, and $\varphi_1^-$; label: "UNSAT proof: generate overapproximation"]
- Each Complete Iteration
  - Expands ranges of some word-level variables
  - Creates refined underapproximation
38. Approximation Methods
- So Far
  - Range constraints
    - Underapproximate by constraining values of word-level variables
  - Subformula elimination
    - Overapproximate by assuming subformula value arbitrary
- General Requirements
  - Systematic under- and over-approximations
  - Way to connect from one to another
- Goal: Devise Additional Approximation Strategies
39. Function Approximation Example

          x = 0   x = 1   x else
y = 0       0       0       0
y = 1       0       1       x
y else      0       y

- Motivation
  - Multiplication (and division) are difficult cases for SAT
- Prohibit Via Additional Range Constraints
  - Gives underapproximation
  - Restricts values of (possibly intermediate) terms
- Abstract as f(x, y)
  - Overapproximate as uninterpreted function f
  - Value constrained only by functional consistency
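One reading of the table is that a product is handled exactly whenever either operand is 0 or 1, and only the remaining combinations need approximating. The sketch below encodes that reading for 16-bit words. The assumption that the uncovered cell is where `f(x, y)` is substituted is mine, as are the names `mul_special_case`, `mul_abstract`, and the two stand-in functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Exact cases from the table: at least one operand is 0 or 1.
   Returns true and writes the product when one of them applies. */
static bool mul_special_case(uint16_t x, uint16_t y, uint16_t *result) {
    if (x == 0 || y == 0) { *result = 0; return true; }
    if (x == 1)           { *result = y; return true; }
    if (y == 1)           { *result = x; return true; }
    return false;
}

/* Stand-in for the uninterpreted function f: any deterministic function
   of (x, y) is functionally consistent. */
typedef uint16_t (*binary_fn)(uint16_t, uint16_t);

/* Overapproximation: exact on the table cases, f everywhere else. */
static uint16_t mul_abstract(uint16_t x, uint16_t y, binary_fn f) {
    uint16_t result;
    return mul_special_case(x, y, &result) ? result : f(x, y);
}

/* Two example interpretations of f: the true product modulo 2^16,
   and an arbitrary constant function. */
static uint16_t f_exact(uint16_t x, uint16_t y) {
    return (uint16_t)((uint32_t)x * y);
}
static uint16_t f_zero(uint16_t x, uint16_t y) {
    (void)x;
    (void)y;
    return 0;
}

/* Counts operand pairs below `limit` for which a table case applies
   but gives a value different from x * y modulo 2^16. */
static unsigned long count_special_case_errors(unsigned limit) {
    unsigned long errors = 0;
    for (unsigned x = 0; x < limit; x++) {
        for (unsigned y = 0; y < limit; y++) {
            uint16_t r;
            if (mul_special_case((uint16_t)x, (uint16_t)y, &r) &&
                r != f_exact((uint16_t)x, (uint16_t)y)) {
                errors++;
            }
        }
    }
    return errors;
}
```

Restricting a formula to operand values where `mul_special_case` succeeds gives an underapproximation that needs no multiplier circuit at all. Letting `f` range over all functions gives the overapproximation.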
40. Challenges with Iterative Approximation
- Formulating Overall Strategy
  - Which abstractions to apply, when and where
  - How quickly to relax constraints in iterations
    - Which variables to expand and by how much?
    - Too conservative: Each call to SAT solver incurs cost
    - Too lenient: Devolves to complete bit blasting
- Predicting SAT Solver Performance
  - Hard to predict time required by call to SAT solver
  - Will particular abstraction simplify or complicate SAT?
- Combination Especially Difficult
  - Multiple iterations with unpredictable inner loop
41. STP: Linear Equation Solving
- Ganesh & Dill, CAV '07
- Solve linear equations over integers mod $2^w$
- Capture range of solutions with Boolean variables
- Example Problem
  - Variables: 3-bit unsigned integers
    - x = [x2, x1, x0], y = [y2, y1, y0], z = [z2, z1, z0]
  - Linear equations: conjunction of linear constraints
- General Form
  - $A x + b = 0 \pmod{2^w}$

$3x + 4y + 2z = 0 \pmod{8}$
$2x + 2y + 2 = 0 \pmod{8}$
$2x + 4y + 2z = 0 \pmod{8}$
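With only three 3-bit variables, the example system can be solved by enumeration, which gives a reference answer to compare against an algebraic solver. The sketch below assumes the equations read 3x + 4y + 2z = 0, 2x + 2y + 2 = 0, and 2x + 4y + 2z = 0, all modulo 8. It does not reproduce STP's solving technique, and the names are illustrative.

```c
#include <stdbool.h>

#define MODULUS 8u  /* 2^3, for 3-bit unsigned variables */

/* True when (x, y, z) satisfies all three example equations modulo 8. */
static bool satisfies_system(unsigned x, unsigned y, unsigned z) {
    return (3 * x + 4 * y + 2 * z) % MODULUS == 0
        && (2 * x + 2 * y + 2) % MODULUS == 0
        && (2 * x + 4 * y + 2 * z) % MODULUS == 0;
}

/* Enumerates all 3-bit assignments in ascending (x, y, z) order.
   Stores up to `capacity` solutions and returns the total number found. */
static int enumerate_solutions(unsigned solutions[][3], int capacity) {
    int count = 0;
    for (unsigned x = 0; x < MODULUS; x++) {
        for (unsigned y = 0; y < MODULUS; y++) {
            for (unsigned z = 0; z < MODULUS; z++) {
                if (!satisfies_system(x, y, z)) {
                    continue;
                }
                if (count < capacity) {
                    solutions[count][0] = x;
                    solutions[count][1] = y;
                    solutions[count][2] = z;
                }
                count++;
            }
        }
    }
    return count;
}
```

Subtracting the third equation from the first forces x = 0; the second then gives y = 3 (mod 4) and the third gives z = 2 (mod 4). The solution set is therefore x = 0, y in {3, 7}, z in {2, 6}. The high bits of y and z are left free, which is the sense in which a range of solutions can be captured with Boolean variables.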
42. Summary: Modeling Levels
- Bits
  - Limited ability to scale
  - Hard to apply functional abstractions
- Words
  - Allows abstracting data while precisely representing control
  - Overlooks finite word-size effects
- Bit Vectors
  - Realistic semantic model for hardware & software
  - Captures all details of actual operation
    - Detects errors related to overflow and other artifacts of finite representation
  - Can apply abstractions found at word-level
43. Areas of Agreement
- SAT-Based Framework Is Only Logical Choice
  - SAT solvers are good & getting better
- Want to Automatically Exploit Abstractions
  - Function structure
  - Arithmetic properties
    - E.g., associativity, commutativity
  - Arithmetic reductions
    - E.g., LU decomposition
- Base Level Should Be SAT
  - Only semantically complete approach
44. Choices
- Optimize for Special Formula Classes
  - E.g., STP optimized for conjunctions of constraints
  - Common in software verification & testing
- Iterative Abstraction
  - Natural framework for attempting different abstractions
  - Having SAT solver in inner loop makes performance tuning difficult
- Others?
45. Observations
- Bit-Vector Modeling Gaining in Popularity
  - Recognition of importance
  - Benchmarks and competitions
- Just Now Improving on Bit Blasting + SAT
- Lots More Work to be Done