# Modeling Data in Formal Verification: Bits, Bit Vectors, or Words

Slides: 58
Provided by: RandalE9
Transcript and Presenter's Notes


1
Modeling Data in Formal Verification: Bits, Bit Vectors, or Words
Randal E. Bryant, Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
2
Overview
• Issue
• How should data be modeled in formal analysis?
• Verification, test generation, security analysis, ...
• Approaches
• Bits: Every bit is represented individually
• Basis for most CAD, model checking
• Words: View each word as arbitrary value
• E.g., unbounded integers
• Historic program verification work
• Bit Vectors: Finite precision words
• Captures true semantics of hardware and software
• More opportunities for abstraction than with bits

3
Bit-Level Modeling
Control Logic
• Represent Every Bit of State Individually
• Behavior expressed as Boolean next-state function over current state
• Historic method for most CAD, testing, and verification tools
• E.g., model checkers
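
A minimal sketch of what a Boolean next-state function over the current state looks like, using a hypothetical 2-bit counter with an enable input. The counter and the names `next_state`, `s1`, `s0`, and `enable` are assumptions for illustration and do not come from the slides.

```python
def next_state(s1, s0, enable):
    """Next-state functions of a hypothetical 2-bit counter, one per state bit."""
    n0 = s0 ^ enable            # low bit toggles whenever the counter is enabled
    n1 = s1 ^ (s0 & enable)     # high bit toggles on a carry out of the low bit
    return n1, n0

# Walk the state space from 00 with the counter always enabled.
state = (0, 0)
trace = [state]
for _ in range(4):
    state = next_state(*state, enable=1)
    trace.append(state)
print(trace)
```

Each state bit has its own Boolean function of the current state bits and inputs, which is the form a bit-level tool reasons about.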

4
Bit-Level Modeling in Practice
• Strengths
• Allows precise modeling of system
• Well developed technology
• BDDs & SAT for Boolean reasoning
• Limitations
• Every state bit introduces two Boolean variables
• Current state & next state
• Overly detailed modeling of system functions
• Don't want to capture full details of FPU
• Making It Work
• Use extensive abstraction to reduce bit count
• Hard to abstract functionality

5
Word-Level Abstraction 1: Bits → Integers
[Figure: bits x0, x1, x2, ..., xn-1 viewed as a single word]
• View Data as Symbolic Words
• Arbitrary integers
• No assumptions about size or encoding
• Classic model for reasoning about software
• Can store in memories & registers

6
Abstracting Data Bits
Control Logic
7
Word-Level Abstraction 2: Uninterpreted Functions
[Figure: function block f]
• For any Block that Transforms or Evaluates Data
• Replace with generic, unspecified function
• Only assumed property is functional consistency
• $a = x \land b = y \Rightarrow f(a, b) = f(x, y)$
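
A small sketch of functional consistency, assuming a Python stand-in for an uninterpreted function. The class name `UninterpretedFunction`, the 8-bit result width, and the random seed are all illustrative assumptions. Results are arbitrary, but equal arguments always produce equal results.

```python
import random

class UninterpretedFunction:
    """Arbitrary results, but equal arguments always give equal results."""

    def __init__(self, width=8, seed=0):
        self._table = {}                      # remembers the value chosen per argument tuple
        self._rng = random.Random(seed)
        self._width = width

    def __call__(self, *args):
        if args not in self._table:
            self._table[args] = self._rng.randrange(1 << self._width)
        return self._table[args]

f = UninterpretedFunction()
a, b = 3, 5
x, y = 3, 5
same_inputs_agree = f(a, b) == f(x, y)        # a = x and b = y, so results must match
print(same_inputs_agree)
```

Nothing else about `f` may be relied on: a verifier that proves a property with `f` uninterpreted has proved it for every concrete function that could take its place.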

8
Abstracting Functions
[Figure: Control Logic and Data Path with combinational logic blocks (Com. Log.)]
• For Any Block that Transforms Data
• Replace by uninterpreted function
• Ignore detailed functionality
• Conservative approximation of actual system

9
Word-Level Modeling History
• Historic
• Used by theorem provers
• More Recently
• Burch & Dill, CAV '94
• Verify that pipelined processor has same behavior as unpipelined reference model
• Use word-level abstractions of data paths and memories
• Use decision procedure to determine equivalence
• Bryant, Lahiri, Seshia, CAV '02
• UCLID verifier
• Tool for describing & verifying systems at word level

10
Pipeline Verification Example
Pipelined Processor
Reference Model
11
Abstracted Pipeline Verification
Pipelined Processor
Reference Model
12
Experience with Word-Level Modeling
• Powerful Abstraction Tool
• Allows focus on control of large-scale system
• Can model systems with very large memories
• Hard to Generate Abstract Model
• Hand-generated: how to validate?
• Automatic abstraction: limited success
• Andraus & Sakallah, DAC 2004
• Realistic Features Break Abstraction
• E.g., Set ALU function to A+0 to pass operand to output
• Desire
• Should be able to mix detailed bit-level representation with abstracted word-level representation

13
Bit Vectors: Motivating Example 1
int abs(int x) { int mask = x >> 31; return (x ...
int test_abs(int x) { return (x < 0) ? -x : x; }
• Do these functions produce identical results?
• Strategy
• Represent and reason about bit-level program behavior
• Specific to machine word size, integer representations, and operations
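
The transcript cuts off the body of `abs` after `return (x`. The sketch below assumes the common branch-free completion `(x ^ mask) + ~mask + 1`, which is an assumption rather than the slide's confirmed text, and checks it against the conditional version on every value of a small word. The 8-bit width `W` and all helper names are illustrative.

```python
W = 8                               # small word size so every input can be checked
ALL_ONES = (1 << W) - 1

def to_signed(u):
    """Interpret a W-bit pattern as a two's complement integer."""
    return u - (1 << W) if u >> (W - 1) else u

def abs_branch_free(u):
    mask = ALL_ONES if u >> (W - 1) else 0          # models arithmetic x >> (W-1)
    return ((u ^ mask) + (~mask & ALL_ONES) + 1) & ALL_ONES

def abs_conditional(u):
    return (-u) & ALL_ONES if to_signed(u) < 0 else u

mismatches = [u for u in range(1 << W)
              if abs_branch_free(u) != abs_conditional(u)]
print(mismatches)
```

Under these assumptions the two versions agree on every bit pattern, including the most negative value, which both map to itself. That outcome depends on the word size and on two's complement representation, which is the point of reasoning at the bit-vector level.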

14
Motivating Example 2
void fun() { char fmt[16]; fgets(fmt, 16, stdin); fmt[15] = '\0'; printf(fmt); }
• Is there an input string that causes value 234 to ...
• Yes: "a1a2a3a4%230g%n"
• Depends on details of compilation
• But no exploit for buffer size less than 8
• Ganapathy, Seshia, Jha, Reps, Bryant, ICSE '05

15
Motivating Example 3
bit[W] popSpec(bit[W] x) { int cnt = 0; for (int i = 0; i < W; i++) if (x[i]) cnt++; return cnt; }
bit[W] popSketch(bit[W] x) { loop (??) { x = (x & ??) + ((x >> ??) & ??); } return x; }
• Is there a way to expand the program sketch to make it match the spec?
• W = 16
• Solar-Lezama, et al., ASPLOS '06

x = (x & 0x5555) + ((x >> 1) & 0x5555);
x = (x & 0x3333) + ((x >> 2) & 0x3333);
x = (x & 0x0077) + ((x >> 8) & 0x0077);
x = (x & 0x000f) + ((x >> 4) & 0x000f);
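
A quick exhaustive check of the four-step expansion against the specification for W = 16. This is a sketch: the `&` and `+` operators were lost in the transcript and are assumed here, and the Python names `pop_spec` and `pop_sketch` are illustrative.

```python
W = 16
MASK = (1 << W) - 1

def pop_spec(x):
    """Count the one bits of x directly, as in popSpec."""
    return sum((x >> i) & 1 for i in range(W))

def pop_sketch(x):
    """The four-step expansion, with shifts 1, 2, 8, 4 in that order."""
    x = (x & 0x5555) + ((x >> 1) & 0x5555)
    x = (x & 0x3333) + ((x >> 2) & 0x3333)
    x = (x & 0x0077) + ((x >> 8) & 0x0077)
    x = (x & 0x000F) + ((x >> 4) & 0x000F)
    return x & MASK

all_match = all(pop_sketch(x) == pop_spec(x) for x in range(1 << W))
print(all_match)
```

The unusual shift order and the `0x0077` mask still work because each 4-bit field holds a count of at most 4 after the second step, which fits in three bits.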
16
Motivating Example 4
Sequential Reference Model
Pipelined Microprocessor
• Is pipelined microprocessor identical to sequential reference model?
• Strategy
• Represent machine instructions, data, and state as bit vectors
• Compatible with hardware description language representation
• Verifier finds abstractions automatically

17
Bit Vector Formulas
• Fixed width data words
• Arithmetic operations
• Two's complement, unsigned, ...
• Bit-wise logical operations
• Bitwise and/or/xor, shift/extract, concatenate
• Predicates
• ==, <=
• Is formula satisfiable?
• E.g., a > 0 && a*a < 0

50000 * 50000 = -1794967296 (on 32-bit machine)
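
The satisfiability example can be reproduced with a few lines that model 32-bit two's complement wraparound. This is a sketch; the helper `to_signed` and the name `formula` are illustrative assumptions.

```python
W = 32

def to_signed(value):
    """Wrap an unbounded Python integer to a W-bit two's complement value."""
    value &= (1 << W) - 1
    return value - (1 << W) if value >> (W - 1) else value

def formula(a):
    """a > 0 && a*a < 0, evaluated with W-bit arithmetic."""
    return to_signed(a) > 0 and to_signed(a * a) < 0

witness = 50000
product = to_signed(witness * witness)
print(formula(witness), product)
```

Over unbounded integers the formula has no solution, so a word-level model with arbitrary integers would report it unsatisfiable, while the bit-vector model finds the witness.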
18
Decision Procedures
• Core technology for formal reasoning
• Boolean SAT
• Pure Boolean formula
• SAT Modulo Theories (SMT)
• Example theories
• Linear arithmetic over reals or integers
• Functions with equality
• Bit vectors
• Combinations of theories

19
Recent Progress in SAT Solving
20
BV Decision Procedures: Some History
• B.C. (Before Chaff)
• String operations (concatenate, field extraction)
• Linear arithmetic with bounds checking
• Modular arithmetic
• Limitations
• Cannot handle full range of bit-vector operations

21
BV Decision Procedures: Using SAT
• SAT-Based Bit Blasting
• Generate Boolean circuit based on bit-level behavior of operations
• Convert to Conjunctive Normal Form (CNF) and check with best available SAT checker
• Handles arbitrary operations
• Effective in Many Applications
• CBMC: Clarke, Kroening, Lerda, TACAS '04
• Microsoft Cogent + SLAM: Cook, Kroening, Sharygina, CAV '05
• CVC-Lite: Dill, Barrett, Ganesh; Yices: deMoura, et al.
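
A toy illustration of bit blasting, assuming 4-bit unsigned words. Addition and comparison are expanded into Boolean gates over individual bits. Exhaustive enumeration of the Boolean inputs stands in for the CNF conversion and SAT solver, which this sketch does not implement. All function names are illustrative.

```python
from itertools import product

W = 4

def add_bits(a, b):
    """Ripple-carry adder over lists of bools, least significant bit first."""
    out, carry = [], False
    for ai, bi in zip(a, b):
        out.append(ai ^ bi ^ carry)
        carry = (ai and bi) or (carry and (ai ^ bi))
    return out                      # carry out is dropped: arithmetic is mod 2**W

def ult_bits(a, b):
    """Unsigned a < b, scanning from least to most significant bit."""
    lt = False
    for ai, bi in zip(a, b):
        lt = (not ai and bi) or ((ai == bi) and lt)
    return lt

def formula(x):
    """Bit-level circuit for the bit-vector predicate x + 1 < x."""
    one = [True] + [False] * (W - 1)
    return ult_bits(add_bits(x, one), x)

# Brute force over all Boolean assignments, in place of a SAT solver.
models = [x for x in product([False, True], repeat=W) if formula(list(x))]
print(models)
```

The only satisfying assignment sets every bit of `x`, the value where the increment wraps around to zero.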

22
Bit-Vector Challenge
• Is there a better way than bit blasting?
• Requirements
• Provide same functionality as with bit blasting
• Find abstractions based on word-level structure
• Improve on performance of bit blasting
• Observation
• Must have bit blasting at core
• Only approach that covers full functionality
• Want to exploit special cases
• Formula satisfied by small values
• Simple algebraic properties imply unsatisfiability
• Small unsatisfiable core
• Solvable by modular arithmetic

23
Some Recent Ideas
• Iterative Approximation
• UCLID: Bryant, Kroening, Ouaknine, Seshia, ...
• Use bit blasting as core technique
• Apply to simplified versions of formula
• Successive approximations until solve or show unsatisfiable
• Using Modular Arithmetic
• STP: Ganesh & Dill, CAV '07
• Algebraic techniques to solve special case forms
• Layered Approach
• MathSAT: Bruttomesso, Cimatti, Franzen, Griggio, Hanna, Nadel, Palti, Sebastiani, CAV '07
• Use successively more detailed solvers

24
Iterative Approach Background: Approximating Formula
[Figure: original formula $\varphi$]
• Example Approximation Techniques
• Underapproximating
• Restrict word-level variables to smaller ranges of values
• Overapproximating
• Replace subformula with Boolean variable

25
Starting Iterations
[Figure: original formula $\varphi$ and underapproximation $\varphi_1^-$]
• Initial Underapproximation
• (Greatly) restrict ranges of word-level variables
• Intuition: Satisfiable formula often has small-domain solution

26
First Half of Iteration
[Figure: original formula $\varphi$ and underapproximation $\varphi_1^-$]
• SAT Result for $\varphi_1^-$
• Satisfiable
• Then have found solution for $\varphi$
• Unsatisfiable
• Use UNSAT proof to generate overapproximation $\varphi_1^+$
• (Described later)

27
Second Half of Iteration
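[Figure: overapproximation $\varphi_1^+$, original formula $\varphi$, and underapproximation $\varphi_1^-$]
• SAT Result for $\varphi_1^+$
• Unsatisfiable
• Then have shown $\varphi$ unsatisfiable
• Satisfiable
• Solution indicates variable ranges that must be expanded
• Generate refined underapproximation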

28
Iterative Behavior
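[Figure: overapproximations $\varphi_1^+, \varphi_2^+, \ldots, \varphi_k^+$; original formula $\varphi$; underapproximations $\varphi_1^-, \varphi_2^-, \ldots, \varphi_k^-$]
• Underapproximations
• Successively more precise abstractions of $\varphi$
• Allow wider variable ranges
• Overapproximations
• No predictable relation
• UNSAT proof not unique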
29
Overall Effect
• Soundness
• Only terminate with solution on underapproximation
• Only terminate as UNSAT on overapproximation
• Completeness
• Successive underapproximations approach $\varphi$
• Finite variable ranges guarantee termination
• In worst case, get $\varphi_k^- \equiv \varphi$
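
A heavily simplified sketch of the underapproximation half of the loop, assuming a single 16-bit variable. The range is widened uniformly by doubling the bit width, and brute-force search replaces the SAT call. The overapproximation step driven by the UNSAT proof is omitted, so this is not the UCLID algorithm itself. All names are illustrative.

```python
W = 16

def to_signed(value):
    """Wrap an integer to a W-bit two's complement value."""
    value &= (1 << W) - 1
    return value - (1 << W) if value >> (W - 1) else value

def phi(a):
    """Formula under test: a > 0 && a*a < 0 with W-bit arithmetic."""
    return to_signed(a) > 0 and to_signed(a * a) < 0

def solve_by_widening(formula):
    width = 1
    while True:
        # Underapproximation: restrict the variable to 0 <= a < 2**width.
        for a in range(1 << width):
            if formula(a):
                return a, width          # a solution of the restriction solves phi
        if width == W:
            return None, width           # full range searched: phi is unsatisfiable
        width = min(2 * width, W)        # relax the range constraint and retry

solution, width_used = solve_by_widening(phi)
print(solution, width_used)
```

A solution found under a range restriction is always a solution of the original formula, and the loop must stop once the restriction reaches the full word width, mirroring the soundness and completeness arguments on this slide.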

30
Generating Overapproximation
• Given
• Underapproximation $\varphi_1^-$
• Bit-blasted translation of $\varphi_1^-$ into Boolean formula
• Proof that Boolean formula unsatisfiable
• Generate
• Overapproximation $\varphi_1^+$
• If $\varphi_1^+$ satisfiable, must lead to refined underapproximation
• Generate $\varphi_2^-$ such that
• $\varphi_1^- \Rightarrow \varphi_2^- \Rightarrow \varphi$

31
Bit-Vector Formula Structure
• DAG representation to allow shared subformulas

[Figure: DAG for formula $\varphi$]
32
Structure of Underapproximation
[Figure: DAG for underapproximation $\varphi^-$]
• Linear complexity translation to CNF
• Each word-level variable encoded as set of Boolean variables
• Additional Boolean variables represent subformula values

33
Encoding Range Constraints
• Explicit
• View as additional predicates in formula
• Implicit
• Reduce number of variables in encoding
• Constraint Encoding
• $0 \le w < 8$: $w = 0\,0\,\cdots\,0\,w_2 w_1 w_0$
• $-4 \le x < 4$: $x = x_s x_s \cdots x_s\,x_1 x_0$
• Yields smaller SAT encodings

[Figure: range constraints $0 \le w < 8 \land -4 \le x < 4$]
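
A sketch of the implicit encoding, assuming 8-bit words and three free Boolean variables per word. Zero extension yields the unsigned range and sign extension yields the signed range, so no separate range predicate is needed. The width `W` and the function names are illustrative assumptions.

```python
from itertools import product

W = 8

def zero_extended(w0, w1, w2):
    """Word 0 0 ... 0 w2 w1 w0: only the three low bits are free."""
    return w0 | (w1 << 1) | (w2 << 2)

def sign_extended(x0, x1, xs):
    """Word xs xs ... xs x1 x0: every upper bit copies the sign bit xs."""
    value = x0 | (x1 << 1)
    for i in range(2, W):
        value |= xs << i
    return value - (1 << W) if xs else value     # read as two's complement

w_values = sorted(zero_extended(*bits) for bits in product((0, 1), repeat=3))
x_values = sorted(sign_extended(*bits) for bits in product((0, 1), repeat=3))
print(w_values, x_values)
```

Three free bits reach exactly eight values in each case: 0 through 7 for the zero-extended word and -4 through 3 for the sign-extended one.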
34
UNSAT Proof
• Subset of clauses that is unsatisfiable
• Clause variables define portion of DAG
• Subgraph that cannot be satisfied with given range constraints

[Figure: formula DAG with Boolean variable a, connectives ∧ and ∨, and atoms x = y, x + 2z ≥ 1, w & 0xFFFF = x, x % 26 = v]
35
Extracting Circuit from UNSAT Proof
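• Subgraph that cannot be satisfied with given range constraints
• Even when replace rest of graph with unconstrained variables

[Figure: subgraph marked UNSAT (atoms x = y and x + 2z ≥ 1, Boolean variable a, connective ∧); rest of graph replaced by unconstrained Boolean variables b1, b2]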
36
Generated Overapproximation
• Remove range constraints on word-level variables
• Creates overapproximation
• Ignores correlations between values of subformulas

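[Figure: overapproximation $\varphi_1^+$ built from atoms x = y and x + 2z ≥ 1, Boolean variable a, connective ∧, and unconstrained Boolean variables b1, b2]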
37
Refinement Property
• Claim
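• $\varphi_1^+$ has no solutions that satisfy $\varphi_1^-$'s range constraints
• Because $\varphi_1^+$ contains portion of $\varphi_1^-$ that was shown to be unsatisfiable under range constraints

[Figure: overapproximation $\varphi_1^+$ with the subgraph (atoms x = y and x + 2z ≥ 1, Boolean variable a, connective ∧) marked UNSAT; unconstrained Boolean variables b1, b2]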
38
Refinement Property (Cont.)
• Consequence
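• Solving $\varphi_1^+$ will expand range of some variables
• Leading to more exact underapproximation $\varphi_2^-$

[Figure: overapproximation $\varphi_1^+$ built from atoms x = y and x + 2z ≥ 1, Boolean variable a, connective ∧, and unconstrained Boolean variables b1, b2]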
39
Effect of Iteration
[Figure: underapproximation $\varphi_1^-$ of $\varphi$; UNSAT proof used to generate overapproximation $\varphi_1^+$]
• Each Complete Iteration
• Expands ranges of some word-level variables
• Creates refined underapproximation

40
Approximation Methods
• So Far
• Range constraints
• Underapproximate by constraining values of word-level variables
• Subformula elimination
• Overapproximate by assuming subformula value arbitrary
• General Requirements
• Systematic under- and over-approximations
• Way to connect from one to another
• Goal: Devise Additional Approximation Strategies

41
Function Approximation Example
|          | x = 0 | x = 1 | x else |
|----------|-------|-------|--------|
| y = 0    | 0     | 0     | 0      |
| y = 1    | 0     | 1     | x      |
| y else   | 0     | y     |        |

• Motivation
• Multiplication (and division) are difficult cases for SAT
• Prohibit Via Additional Range Constraints
• Gives underapproximation
• Restricts values of (possibly intermediate) terms
• Abstract as f(x, y)
• Overapproximate as uninterpreted function f
• Value constrained only by functional consistency
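
A sketch of the case split in the table, assuming the empty else/else cell is the case handed to the uninterpreted function `f`. That reading, the 8-bit wraparound in `exact`, and the names `mul_approx`, `exact`, and `arbitrary` are assumptions for illustration.

```python
def mul_approx(x, y, f):
    """Multiplication by cases; f stands in for the hard else/else case."""
    if x == 0 or y == 0:
        return 0
    if x == 1:
        return y
    if y == 1:
        return x
    return f(x, y)

# With f set to real (8-bit) multiplication, the approximation is exact.
exact = lambda a, b: (a * b) & 0xFF
agrees = all(mul_approx(x, y, exact) == exact(x, y)
             for x in range(16) for y in range(16))

# With an arbitrary f, only the else/else case is left unconstrained.
arbitrary = lambda a, b: 42
print(agrees, mul_approx(1, 9, arbitrary), mul_approx(5, 6, arbitrary))
```

The easy cases stay precise while the expensive multiplier circuit never has to be bit-blasted, which is why the result over-approximates the original formula.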

42
Results: UCLID BV vs. Bit-blasting
(Results on 2.8 GHz Xeon, 2 GB RAM)
• UCLID always better than bit blasting
• Generally better than other available procedures
• SAT time is the dominating factor

43
Challenges with Iterative Approximation
• Formulating Overall Strategy
• Which abstractions to apply, when and where
• How quickly to relax constraints in iterations
• Which variables to expand and by how much?
• Too conservative: Each call to SAT solver incurs cost
• Too lenient: Devolves to complete bit blasting
• Predicting SAT Solver Performance
• Hard to predict time required by call to SAT solver
• Will particular abstraction simplify or complicate SAT?
• Combination Especially Difficult
• Multiple iterations with unpredictable inner loop

44
STP: Linear Equation Solving
• Ganesh & Dill, CAV '07
• Solve linear equations over integers mod $2^w$
• Capture range of solutions with Boolean Variables
• Example Problem
• Variables: 3-bit unsigned integers
• x = [x2, x1, x0], y = [y2, y1, y0], z = [z2, z1, z0]
• Linear equations: conjunction of linear constraints
• General Form
• $A x = b \bmod 2^w$

$$\begin{aligned}
3x + 4y + 2z &= 0 \bmod 8 \\
2x + 2y + 2 &= 0 \bmod 8 \\
2x + 4y + 2z &= 0 \bmod 8
\end{aligned}$$
45
Solution Method
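• Equations

$$\begin{aligned}
3x + 4y + 2z &= 0 \bmod 8 \\
2x + 2y + 2 &= 0 \bmod 8 \\
2x + 4y + 2z &= 0 \bmod 8
\end{aligned}$$

• Some Number Theory
• Odd number has multiplicative inverse mod $2^w$
• Mod 8: $3^{-1} = 3$
• Additive inverse mod $2^w$: $-x = 2^w - x$
• Mod 8: $-4 = 4$, $-2 = 6$
• Solve first equation for x

$$\begin{aligned}
3 \cdot 3x &= 3 \cdot 4y + 3 \cdot 6z \bmod 8 \\
x &= 4y + 2z \bmod 8
\end{aligned}$$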
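
The number theory used here can be checked directly. This sketch relies on Python's three-argument `pow` with a negative exponent (available from Python 3.8) to compute the modular inverse. The variable names are illustrative.

```python
M = 8

inv3 = pow(3, -1, M)                 # multiplicative inverse of 3 mod 8
neg4, neg2 = (-4) % M, (-2) % M      # additive inverses of 4 and 2 mod 8

# 3x = -4y - 2z = 4y + 6z (mod 8); multiply both sides by inv3.
coeff_y = (inv3 * neg4) % M
coeff_z = (inv3 * neg2) % M

# Confirm x = coeff_y*y + coeff_z*z satisfies the first equation for all y, z.
first_equation_holds = all(
    (3 * (coeff_y * y + coeff_z * z) + 4 * y + 2 * z) % M == 0
    for y in range(M) for z in range(M)
)
print(inv3, neg4, neg2, coeff_y, coeff_z, first_equation_holds)
```

Because 3 is odd, it is invertible mod 8, so solving for x loses no solutions.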
46
Solution Method (cont.)
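• Substitutions

$$\begin{aligned}
2x + 2y + 2 &= 0 \bmod 8 \\
2x + 4y + 2z &= 0 \bmod 8 \\
x &= 4y + 2z \bmod 8 \\
2(4y + 2z) + 2y + 2 &= 0 \bmod 8 \\
2(4y + 2z) + 4y + 2z &= 0 \bmod 8 \\
2y + 4z + 2 &= 0 \bmod 8 \\
4y + 6z &= 0 \bmod 8
\end{aligned}$$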
47
What if All Coefficients Even?
• Result of Substitutions
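• Even numbers do not have multiplicative inverses mod 8
• Observation
• Can divide through and reduce modulus

$$\begin{aligned}
2y + 4z + 2 &= 0 \bmod 8 \\
4y + 6z &= 0 \bmod 8 \\
y + 2z + 1 &= 0 \bmod 4 \\
2y + 3z &= 0 \bmod 4 \\
y &= 2z + 3 \bmod 4 \\
z &= 2 \bmod 4 \\
y &= 3 \bmod 4
\end{aligned}$$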
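
A brute-force check that dividing through by 2 and reducing the modulus from 8 to 4 preserves the solution set over the 3-bit values of y and z. This is a sketch and the set names are illustrative.

```python
pairs = [(y, z) for y in range(8) for z in range(8)]

# Equations before dividing through (mod 8).
mod8 = {(y, z) for y, z in pairs
        if (2 * y + 4 * z + 2) % 8 == 0 and (4 * y + 6 * z) % 8 == 0}

# Equations after dividing every coefficient and the modulus by 2.
mod4 = {(y, z) for y, z in pairs
        if (y + 2 * z + 1) % 4 == 0 and (2 * y + 3 * z) % 4 == 0}

# Final solved form.
solved = {(y, z) for y, z in pairs if y % 4 == 3 and z % 4 == 2}
print(sorted(solved))
```

All three sets coincide, so the reduction step neither adds nor drops solutions.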
48
General Solutions
• Original variables: 3-bit unsigned integers
• x = [x2, x1, x0], y = [y2, y1, y0], z = [z2, z1, z0]
• Solutions
• Constrained variables
• y = [y2, 1, 1], z = [z2, 1, 0]
• Back Substitution
• Constrained variables
• x = [0, 0, 0]

$$\begin{aligned}
y &= 3 \bmod 4 \\
z &= 2 \bmod 4 \\
x &= 4y + 2z \bmod 8 \\
x &= 0 \bmod 8
\end{aligned}$$
49
Linear Equation Solutions
• Equations

$$\begin{aligned}
3x + 4y + 2z &= 0 \bmod 8 \\
2x + 2y + 2 &= 0 \bmod 8 \\
2x + 4y + 2z &= 0 \bmod 8
\end{aligned}$$

• Encoding All Possible Solutions
• x = [0, 0, 0], y = [y2, 1, 1], z = [z2, 1, 0]
• y2, z2 arbitrary Boolean variables
• 4 possible solutions (out of original 512)
• General Principle
• Form of LU decomposition
• Polynomial time algorithm
• Boolean variables in solution to express set of solutions
• Only works when have conjunction of linear constraints
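
As a cross-check, the parametric encoding can be compared with exhaustive enumeration of all 512 assignments. This sketch uses brute force, not STP's solving procedure, and the names are illustrative.

```python
from itertools import product

def satisfies(x, y, z):
    """The three linear equations mod 8."""
    return ((3 * x + 4 * y + 2 * z) % 8 == 0 and
            (2 * x + 2 * y + 2) % 8 == 0 and
            (2 * x + 4 * y + 2 * z) % 8 == 0)

brute_force = {(x, y, z)
               for x, y, z in product(range(8), repeat=3)
               if satisfies(x, y, z)}

# Parametric encoding: x = [0,0,0], y = [y2,1,1], z = [z2,1,0].
encoded = {(0, 4 * y2 + 3, 4 * z2 + 2)
           for y2, z2 in product((0, 1), repeat=2)}
print(sorted(encoded))
```

The two free Boolean variables `y2` and `z2` generate exactly the four solutions that the enumeration finds.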
50
Layered Solver
• Bruttomesso, et al., CAV '07
• Part of MathSAT project
• DPLL(T) Framework
• SAT solver coupled with solver for mathematical theory T
• BV theory solver works with conjunctions of constraints

51
DPLL(T) Formula Structure
[Figure: formula $\varphi$ decomposed into Boolean structure over atoms]
• Atoms
• Predicates applied to bit-vector expressions
• Boolean Variables

52
DPLL(T) Operation
• Actions
• DPLL engine satisfies Boolean portion
• Theory solver determines whether resulting conjunction of atoms is satisfiable

[Figure: truth values assigned to atoms x + 2z ≥ 1, x % 26 = v, w & 0xFFFF ≠ x, x = y, x << 1 > y]
• Solver provides information to DPLL engine to aid search
• Nonchronological backtracking
• Conflict clause generation
• Successful approach for other decision procedures
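
A toy sketch of the division of labor in DPLL(T), assuming 3-bit unsigned variables. Enumeration of atom truth values stands in for the DPLL engine and brute-force search stands in for the bit-vector theory solver. The atoms and the Boolean structure are made up for illustration and are not the formula from the slide. No conflict clauses or backtracking are modeled.

```python
from itertools import product

W = 3
VALUES = range(1 << W)

# Atoms: predicates over 3-bit unsigned x and y (hypothetical).
ATOMS = [
    lambda x, y: x == y,                      # atom 0
    lambda x, y: ((x << 1) & 7) > y,          # atom 1: x << 1 > y
    lambda x, y: ((x + 2) & 7) >= 6,          # atom 2: x + 2 >= 6
]

def skeleton(p0, p1, p2):
    """Boolean structure over the atoms' truth values."""
    return (p0 or p2) and p1

def theory_solve(assignment):
    """Find x, y making each atom take its assigned truth value, if possible."""
    for x, y in product(VALUES, repeat=2):
        if all(atom(x, y) == want for atom, want in zip(ATOMS, assignment)):
            return x, y
    return None                               # conjunction of literals is inconsistent

def lazy_smt():
    for assignment in product([True, False], repeat=len(ATOMS)):
        if skeleton(*assignment):             # Boolean portion satisfied
            model = theory_solve(assignment)  # check the conjunction of atoms
            if model is not None:
                return assignment, model
    return None

result = lazy_smt()
print(result)
```

The first Boolean candidate sets all three atoms true, which no pair of values can satisfy, so the search moves on to the next candidate. A real DPLL(T) engine would turn that conflict into a learned clause instead of enumerating blindly.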

53
MathSAT Layers
• Uses increasingly detailed solver layers
• Only progress if can't find conflict using more abstract rules
• Layers
• Equality with uninterpreted functions
• Treats all bit-level functions and operators as uninterpreted
• Simple handling of concatenations, extractions, and transitivity
• Full solver using linear arithmetic + SAT

54
Summary: Modeling Levels
• Bits
• Limited ability to scale
• Hard to apply functional abstractions
• Words
• Allows abstracting data while precisely representing control
• Overlooks finite word-size effects
• Bit Vectors
• Realistic semantic model for hardware & software
• Captures all details of actual operation
• Detects errors related to overflow and other artifacts of finite representation
• Can apply abstractions found at word-level

55
Areas of Agreement
• SAT-Based Framework Is Only Logical Choice
• SAT solvers are good & getting better
• Want to Automatically Exploit Abstractions
• Function structure
• Arithmetic properties
• E.g., associativity, commutativity
• Arithmetic reductions
• E.g., LU decomposition
• Base Level Should Be SAT
• Only semantically complete approach

56
Choices
• Optimize for Special Formula Classes
• E.g., STP optimized for conjunctions of constraints
• Common in software verification & testing
• Iterative Abstraction
• Natural framework for attempting different abstractions
• Having SAT solver in inner loop makes performance tuning difficult
• DPLL(T) Framework
• Theory solver only deals with conjunctions
• May need to invoke SAT solver in inner loop
• Hard to coordinate outer and inner search procedures
• Others?

57
Observations
• Bit-Vector Modeling Gaining in Popularity
• Recognition of importance
• Benchmarks and competitions
• Just Now Improving on Bit Blasting + SAT
• Lots More Work to be Done