Title: Mobility, Security, and Proof-Carrying Code, Peter Lee, Carnegie Mellon University
1 Mobility, Security, and Proof-Carrying Code
Peter Lee, Carnegie Mellon University
- Lecture 2
- July 11, 2001
- Overview of PCC and Safety Policies
Lipari School on Foundations of Wide Area Network
Programming
2 Key Idea: Explicit Proofs
[Diagram: Code, Certifying Prover, Proof Engine, Proof, Trusted Host]
3 Proof-Carrying Code
[Diagram: Code, Certifying Prover, Proof, Proof Checker]
4 Certifying Compilers
[Diagram: Certifying Compiler, Certifying Prover, Proof Checker]
5 Questions
- What is the interface between the compiler and the prover?
- What is meant by safety?
- Or security, or privacy, or ...?
- How does the proof checker work?
- How does it connect code with proof?
- How are proofs represented?
- For compactness, speed, and simplicity?
- How are proofs generated?
6 Today's Lecture
- Overview of approaches.
- High-level architecture.
- Safety properties.
- Next time: detailed examples, proof generation and checking.
7 Overview of Approaches to Certified Code: TAL, kJava, and PCC
8 Typed Assembly Language (Morrisett, et al., '98)
- Use modern type theory to develop a static type system for machine code.
- Prove decidability of typechecking.
- Prove soundness of the type system.
- Developing such a type system is very hard, but hopefully is done only once.
9 Typed Assembly Language
[Diagram: Type-Directed Compiler, Type Checker, Type Checker]
Somewhat surprisingly, it is possible to capture a practical subset of the x86 architecture in a type system.
10 TAL Example
Source:
  int i = n+1;
  int s = 0;
  while (--i > 0)
    s += i;
Annotated assembly:
  mov eax, ecx
  inc eax
  mov ebx, 0
  jmp test
body: {eax: B4, ebx: B4}
  add ebx, eax
test: {eax: B4, ebx: B4}
  dec eax
  cmp eax, 0
  jg body
In practice, the annotations are much more sophisticated.
11 Novel Features of TAL
- Uses existential types to capture the concept of the current stack frame.
- Type system has been evolving rapidly (including soundness proofs).
- But painful for implementors.
- Limited use of dependent types for optimized code.
12 K Virtual Machine
- Designed to support the CLDC.
- Must fit into < 128KB.
- Must have fast bytecode verification.
- kJava class files must be Java-compatible.
- Divides bytecode verification into two stages.
13 kJava and KVM
[Diagram: kJava Compiler, kJava Preverifier, Verifier]
14 KVM Verification
- Preverification is performed by the code producer.
- Uses global (iterative) analysis to compute the types of stack slots and local vars at every join point.
- Second stage is performed by the class loader.
- Simple linear scan verifies correctness of join-point annotations.
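The second stage can be sketched as a single pass over the code. The following Python sketch is illustrative only and is not the KVM verifier: the instruction encoding (`aload`, `astore`, `invoke`, `goto`, `ifne`, `return` as tuples indexed by position rather than byte offset), the `PARENT` hierarchy, the `Top` marker for uninitialized locals, and the names `verify` and `frame_fits` are all assumptions made for this example. It assumes the preverifier has already attached a (locals, stack) annotation to every join point.

```python
# Hypothetical toy class hierarchy (child -> parent); not the real Java one.
PARENT = {"Long": "Number", "Number": "Object"}

def is_subtype(t, u):
    """True if a value of type t may be used where type u is expected."""
    while t is not None:
        if t == u:
            return True
        t = PARENT.get(t)
    return False

def frame_fits(frame, annotation):
    """A (locals, stack) frame fits an annotation if it matches slot by slot."""
    (locs, stack), (alocs, astack) = frame, annotation
    return (len(locs) == len(alocs) and len(stack) == len(astack)
            and all(is_subtype(t, u)
                    for t, u in zip(locs + stack, alocs + astack)))

def verify(code, annotations, entry_locals):
    """One linear scan: check that every join-point annotation is respected."""
    frame = (tuple(entry_locals), ())
    for pc, ins in enumerate(code):
        if pc in annotations:
            # Falling through into a join point must agree with its annotation.
            if frame is not None and not frame_fits(frame, annotations[pc]):
                return False
            frame = annotations[pc]
        if frame is None:
            return False              # code after a jump needs an annotation
        locs, stack = frame
        op = ins[0]
        if op == "aload":             # push local variable ins[1]
            if locs[ins[1]] == "Top":
                return False          # local is uninitialized
            stack = stack + (locs[ins[1]],)
        elif op == "astore":          # pop into local variable ins[1]
            if not stack:
                return False
            locs = locs[:ins[1]] + (stack[-1],) + locs[ins[1] + 1:]
            stack = stack[:-1]
        elif op == "invoke":          # ("invoke", arg_type, result_type)
            if not stack or not is_subtype(stack[-1], ins[1]):
                return False
            stack = stack[:-1] + (ins[2],)
        elif op in ("goto", "ifne"):
            if op == "ifne":          # conditional branch pops an int
                if not stack or stack[-1] != "int":
                    return False
                stack = stack[:-1]
            target = ins[1]
            if (target not in annotations
                    or not frame_fits((locs, stack), annotations[target])):
                return False
            if op == "goto":
                frame = None          # nothing falls through a goto
                continue
        elif op == "return":
            frame = None
            continue
        else:
            return False              # unknown instruction
        frame = (locs, stack)
    return True

# A loop shaped like "y = x; while (y.intValue() != 0) y = nextValue(y);"
CODE = [("aload", 0), ("astore", 1), ("goto", 6),
        ("aload", 1), ("invoke", "Number", "Number"), ("astore", 1),
        ("aload", 1), ("invoke", "Number", "int"), ("ifne", 3),
        ("return",)]
STACK_MAPS = {3: (("Long", "Number"), ()), 6: (("Long", "Number"), ())}
```

Calling `verify(CODE, STACK_MAPS, ("Long", "Top"))` accepts the program. No fixed-point iteration is needed because the producer already computed the types at each join point; the scan only checks them.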
15 KVM Example (from Frank Yellin)
static void test(Long x) {
  Number y = x;
  while (y.intValue() != 0) {
    y = nextValue(y);
  }
  return y;
}
 0. aload_0
 1. astore_1
 2. goto 10
    Long Number <>
 5. aload_1
 6. invokeStatic nextValue(Number)
 9. astore_1
    Long Number <>
10. aload_1
11. invokeVirtual intValue()
14. ifne 5
17. return
16 KVM Verification
- The second stage verifier is a 10KB program that requires:
- a single scan of the code, and
- < 100 bytes of run-time storage.
- Impressive!
- This is Java verification done right.
17 PCC: Certifying Compilation
[Diagram: Certifying Compiler, Certifying Prover, Proof Checker]
18 PCC Example
ANN_LOCALS(_bcopy__6arrays6Bcopy1AIAI, 3)
.text
.align 4
.globl _bcopy__6arrays6Bcopy1AIAI
_bcopy__6arrays6Bcopy1AIAI:
  cmpl $0, 4(%esp)
  je L6
  movl 4(%esp), %ebx
  movl 4(%ebx), %ecx
  testl %ecx, %ecx
  jg L22
  ret
L22:
  xorl %edx, %edx
  cmpl $0, 8(%esp)
  je L6
  movl 8(%esp), %eax
  movl 4(%eax), %esi
L7:
  ANN_LOOP(INV = {(csubneq ebx 0), (csubneq eax 0), (csubb edx ecx), (of rm mem)},
           MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))
  cmpl %esi, %edx
  jae L13
  movl 8(%ebx, %edx, 4), %edi
  movl %edi, 8(%eax, %edx, 4)
  incl %edx
  cmpl %ecx, %edx
  jl L7
  ret
L13:
  call __Jv_ThrowBadArrayIndex
  ANN_UNREACHABLE
  nop
L6:
  call __Jv_ThrowNullPointer
  ANN_UNREACHABLE
  nop
19 Join-Point Annotations
- All of these approaches to certified code make use of join-point typing annotations to reduce code verification to a simple problem.
- They are essentially the classical loop invariants of the Dijkstra/Hoare program verification approach.
20 Overheads
- In TAL and PCC we observe relatively large annotation sizes (10-20%), sometimes more.
- Unknown for kJava.
- Research question: can we reduce this size?
21 High-Level Architecture
22 High-Level Architecture
[Diagram: Code, Verification condition generator, Checker, Explanation, Agent, Safety policy, Host]
24 The VCGen
- The verification condition generator (VCGen) examines each instruction.
- It essentially encodes the operational semantics of the language.
- It checks some simple properties.
- E.g., direct jumps go to legal addrs.
- It invokes the Checker when dangerous instructions are encountered.
25 The VCGen, cont'd
- Examples of dangerous instructions:
- memory operations
- procedure calls
- procedure returns
- For each such instruction, VCGen creates a verification condition (VC).
- A VC is a logical predicate whose truth implies the instruction is safe.
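A rough sketch of the idea, under heavy simplification: the Python function below symbolically executes straight-line code and emits one VC per memory operation. The tuple instruction encoding, the function name `vcgen`, and the predicate names `saferd`, `safewr`, `sel`, and `upd` are assumptions made for illustration; jumps, calls, and returns, which a real VCGen must also handle, are left out.

```python
def vcgen(code, regs):
    """Scan straight-line code and return a list of (pc, VC) pairs.

    Registers hold symbolic expressions (strings) that describe their
    contents in terms of the initial register values.
    """
    state = {r: f"{r}0" for r in regs}   # e.g. eax initially holds "eax0"
    mem = "mem0"                         # symbolic name of the initial memory
    vcs = []

    def val(operand):
        # A register name denotes its current symbolic value; anything
        # else is treated as a constant.
        return state[operand] if operand in state else str(operand)

    for pc, ins in enumerate(code):
        op = ins[0]
        if op == "mov":                  # ("mov", dst, src)
            state[ins[1]] = val(ins[2])
        elif op == "add":                # ("add", dst, src)
            state[ins[1]] = f"(add {val(ins[1])} {val(ins[2])})"
        elif op == "load":               # ("load", dst, addr_reg): dangerous
            addr = val(ins[2])
            vcs.append((pc, f"(saferd {addr})"))
            state[ins[1]] = f"(sel {mem} {addr})"
        elif op == "store":              # ("store", addr_reg, src): dangerous
            addr, value = val(ins[1]), val(ins[2])
            vcs.append((pc, f"(safewr {addr} {value})"))
            mem = f"(upd {mem} {addr} {value})"
        elif op == "ret":
            break
        else:
            raise ValueError(f"illegal instruction at {pc}: {ins!r}")
    return vcs

EXAMPLE = [("mov", "ecx", "eax"), ("add", "ecx", 8),
           ("load", "edx", "ecx"), ("store", "ecx", "edx"), ("ret",)]
```

For `EXAMPLE` with registers `eax`, `ecx`, and `edx`, the sketch produces two VCs: one stating that reading address `(add eax0 8)` is safe, and one stating that writing the loaded value back to that address is safe. Each VC is phrased over initial register values, which is what lets it be proved once, before the code runs.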
27 The Checker
- When given a VC, the Checker looks in the explanation for its proof.
- If found, it then checks whether the proof is valid.
- The set of allowable VCs and their valid proofs is defined by the safety policy.
28 A Dialog
- VCGen: Scanning code.
- VCGen: Danger! This memory write is safe only if the VC, a predicate P(r) quantified over r, is true under the current assumptions.
- Checker: Looking for a proof of the VC.
- Checker: Found a proof of the VC.
- Checker: Checking with the safety policy to see whether it is a valid proof.
30 The Safety Policy
- The safety policy is defined by an inference system that defines:
- the language of predicates (for VCs),
- the axioms and inference rules for writing valid proofs of VCs,
- specifications (pre/post-conditions) for each entry point in the code.
- Informally, one thinks of the safety policy as defining the constraints on the execution of safe programs.
31 Reference Interpreters
- A reference interpreter (RI) is a standard interpreter extended with instrumentation to check the safety of each instruction before it is executed, and abort execution if anything unsafe is about to happen.
- In other words, an RI is capable only of safe execution.
32 Reference Interpreters, cont'd
- The reference interpreter is never actually implemented.
- The point will be to prove (by using the proof rules given in the safety policy) that execution of the code on the RI never aborts, and thus execution on the real hardware will be identical to execution on the RI.
33 Reference Interpreters, cont'd
- Rule of thumb:
- Any notion of safety that can be enforced by a reference interpreter can be encoded in a PCC safety policy.
34 Example
- Consider a safety policy for x86 code where:
- the code must provide a function whose precondition is that register eax contains a pointer to a float array and ebx contains the array's length.
- the code is allowed to read and write floating-point values into the given array, but nowhere else in the heap memory.
35 Example, cont'd
- As the RI executes, it must perform a special check every time the code attempts to read or write to memory.
- When reading, it must check that the address is within the bounds of the array.
- When writing, it must check that the value being written is actually a floating-point value, and check that the address is within bounds.
36 Example, cont'd
- To do this kind of type-checking, it will be useful for the RI to maintain information about the types of values in the registers.
- E.g., execution of fadd eax, ebx should result in the knowledge that ebx contains a floating-point value, if eax and ebx held floating-point values before execution.
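To make the example concrete, here is a minimal Python sketch of such a reference interpreter for a toy machine. It is only an illustration: the instruction names (`mov`, `addi`, `fload`, `fstore`, `fadd`), the destination-first operand order, the 4-byte element size, the type tags, and the `Unsafe` exception are all assumptions of this sketch, not part of the policy on the slides. Each register carries a type tag alongside its value, and every memory access is checked before it is performed.

```python
class Unsafe(Exception):
    """Raised when the reference interpreter aborts execution."""

def run_ri(code, base, length, mem):
    """Execute `code`, checking each instruction before it runs.

    `mem` maps addresses to floats; the array occupies `length` 4-byte
    slots starting at address `base`.
    """
    # Policy precondition: eax points to the float array, ebx holds its length.
    regs = {"eax": ("ptr", base), "ebx": ("int", length)}

    def get(reg, want):
        tag, value = regs.get(reg, ("undef", None))
        if tag != want:
            raise Unsafe(f"{reg} holds {tag}, expected {want}")
        return value

    def check_addr(addr):
        if not (base <= addr < base + 4 * length and (addr - base) % 4 == 0):
            raise Unsafe(f"address {addr} is outside the array")

    for op, a, b in code:
        if op == "mov":                  # a := b
            if b not in regs:
                raise Unsafe(f"{b} is undefined")
            regs[a] = regs[b]
        elif op == "addi":               # a := a + constant b
            tag, value = regs.get(a, ("undef", None))
            if tag not in ("ptr", "int"):
                raise Unsafe(f"cannot add a constant to {tag}")
            regs[a] = (tag, value + b)
        elif op == "fload":              # a := mem[b], read check
            addr = get(b, "ptr")
            check_addr(addr)
            regs[a] = ("float", mem[addr])
        elif op == "fstore":             # mem[a] := b, write check
            addr = get(a, "ptr")
            value = get(b, "float")      # only floats may be written
            check_addr(addr)
            mem[addr] = value
        elif op == "fadd":               # a := a + b, result is a float
            regs[a] = ("float", get(a, "float") + get(b, "float"))
        else:
            raise Unsafe(f"unknown instruction {op}")
    return mem
```

A program that sums the two elements of a two-element array and stores the result in the second slot runs to completion, while one that reads one slot past the end, or tries to store the integer length into the array, raises `Unsafe`. A PCC proof for this policy would establish ahead of time that none of these checks can fail.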
37 Homework Exercises
- 2. Suppose that we require the code to execute no more than N instructions. Is such a safety property enforceable by an RI?
- 3. Suppose we require the code to terminate eventually. Is such a safety property enforceable by an RI?
38 Reference Interpreters and Safety Policies
- A reference interpreter, if actually implemented, would enforce safety at run-time.
- PCC can be used to enforce the same safety at load-time.
- Essentially, the proofs given with the code attest to the fact that the code will never abort.
39 Operational Semantics
- In terms of operational semantics, the RI defines a safe machine.
- The proofs show that the code always makes progress (or halts normally) in the operational semantics.
- This leads to a standard notion of soundness.
40 Note
- I will avoid formal notation for statements of some key results and theorems.
- See the papers (and especially Necula's PhD thesis) for these details.
41 Examples of Safety Properties
- Memory safety.
- Which addresses are readable / writable, when, and with what values.
- Type safety.
- What values can be stored and used in operations.
- System call safety.
- Which system routines can be called and when.
42 Examples of Safety Policies, cont'd
- Action sequence safety.
- E.g., no network send after reading a file.
- Resource usage safety.
- E.g., instruction counts, stack limits, etc.
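The action-sequence example can be read as a small state machine that a reference interpreter would consult before each action. The sketch below is a hypothetical illustration: the action names `read_file` and `send` and the function `check_actions` are invented here, not taken from any real policy.

```python
def check_actions(actions):
    """Return the index of the first action that violates the policy
    "no network send after reading a file", or None if the trace is safe."""
    has_read_file = False        # the only state the monitor has to remember
    for i, action in enumerate(actions):
        if action == "read_file":
            has_read_file = True
        elif action == "send" and has_read_file:
            return i             # a reference interpreter would abort here
    return None
```

With this monitor, the trace `send, read_file, compute` is safe, while `read_file, compute, send` is rejected at its third action. The policy is a safety property in the sense used here because any violation is witnessed by a finite prefix of the execution.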
43 What Can't Be Enforced?
- Informally:
- Safety properties: Yes.
- No bad thing will happen.
- Liveness properties: Not yet.
- A good thing will eventually happen.
- Information-flow properties: ?
- Confidentiality will be preserved.
44 What Can't Be Enforced?
- Liveness properties currently cannot be enforced by PCC.
- Actually, PCC proofs can express proofs of such properties, but VCGen cannot generate appropriate VCs.
- Conjecture: in practice, safety properties are good enough.
45 Safety Properties Are Good Enough?
- Termination is an example of a liveness property.
- Termination within a specified number of cycles is a safety property.
- In practice, the latter is often more useful than the former.
46 Applets, Not Craplets
47 Architecture
[Diagram: Ginseng, Special J, Code producer, Host]
48 Architecture
[Diagram: Java binary, Native code, Special J, VCGen, Annotations, VC, Axioms, Proof checker, Proof, Code producer, Host]
49 Architecture
[Diagram: Java binary, Native code, Certifying compiler, Annotations, VCGen, VC, Axioms, Proof generator, Proof checker, Proof, Code producer, Host; VCGen, VC, and Axioms each appear twice]
50 Java Virtual Machine
[Diagram: Java Verifier, Checker, Proof-carrying code, JVM, JNI]
51 Show either the Mandelbrot or NBody3D demo.
52 Crypto Test Suite Results (Cedilla Systems)
[Chart; axis: sec]
On average, 72.8% faster than Java, 37.5% faster than Java with a JIT.
53 Java Grande Suite v2.0 (Cedilla Systems)
[Chart; axis: sec]
54 Java Grande Bench Suite (Cedilla Systems)
[Chart; axis: ops]
55 Ginseng
- VCGen: 15KB, roughly similar to a KVM verifier (but with floating-point).
- Checker: 4KB, generic.
- Safety Policy: 19KB, declarative and machine-generated.
- Dynamic loading, cross-platform support: 22KB, some optional.
56 Safety Policy Specifications
58 Ginseng
Ginseng is small and easy to integrate.
59 Safety Policy
- The safety policy gives the inference rules for constructing valid proofs.
- We use a language called LF for this specification, using Pfenning's Elf syntax.
- Much more on this next time.
60 Safety Policy: Sample Rules
/\    : pred -> pred -> pred.
\/    : pred -> pred -> pred.
=>    : pred -> pred -> pred.
all   : (exp -> pred) -> pred.
pf    : pred -> type.
truei : pf true.
andi  : {P:pred} {Q:pred} pf P -> pf Q -> pf (/\ P Q).
andel : {P:pred} {Q:pred} pf (/\ P Q) -> pf P.
ander : {P:pred} {Q:pred} pf (/\ P Q) -> pf Q.
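To see how rules like these drive proof checking, here is a small Python sketch that checks proof terms built from `truei`, `andi`, `andel`, and `ander`. It is a simplified illustration and not an LF type checker: predicates are nested tuples, the `hyp` node and the `hyps` table for assumptions are additions made for this example, and the explicit `P` and `Q` arguments of the LF rules are inferred instead of being supplied.

```python
class InvalidProof(Exception):
    """Raised when a proof term does not follow the rules."""

def infer(proof, hyps):
    """Return the predicate that `proof` establishes, or raise InvalidProof.

    Predicates are "true", an atom such as "A", or ("and", P, Q).
    """
    rule = proof[0]
    if rule == "hyp":                    # ("hyp", name): use an assumption
        if proof[1] not in hyps:
            raise InvalidProof(f"unknown hypothesis {proof[1]}")
        return hyps[proof[1]]
    if rule == "truei":                  # truei : pf true
        return "true"
    if rule == "andi":                   # pf P -> pf Q -> pf (/\ P Q)
        return ("and", infer(proof[1], hyps), infer(proof[2], hyps))
    if rule in ("andel", "ander"):       # pf (/\ P Q) -> pf P  (or pf Q)
        pred = infer(proof[1], hyps)
        if not (isinstance(pred, tuple) and pred[0] == "and"):
            raise InvalidProof(f"{rule} applied to a non-conjunction")
        return pred[1] if rule == "andel" else pred[2]
    raise InvalidProof(f"unknown rule {rule}")

def check(proof, vc, hyps=None):
    """True if `proof` is a valid proof of exactly the predicate `vc`."""
    try:
        return infer(proof, hyps or {}) == vc
    except (InvalidProof, IndexError):
        return False
```

For instance, given the assumption `h` that `A /\ B` holds, the proof `("andi", ("ander", ("hyp", "h")), ("andel", ("hyp", "h")))` checks against the VC `("and", "B", "A")`. The checker itself knows nothing about safety; it only follows whatever rules the policy supplies.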
61 Safety Policy: Some Rules
=      : exp -> exp -> pred.
<>     : exp -> exp -> pred.
>      : exp -> exp -> pred.
eq_le  : {E:exp} {E':exp} pf (csubeq E E') -> pf (csuble E E').
jbool  : exp.
jchar  : exp.
jbyte  : exp.
jshort : exp.
jint   : exp.
of     : exp -> exp -> pred.
faddf  : {E:exp} {E':exp} pf (of E jfloat) -> pf (of E' jfloat) -> pf (of (fadd E E') jfloat).
62 Safety Policy: Sample Rules
aidxi    : {I:exp} {LEN:exp} {SIZE:exp}
           pf (below I LEN) ->
           pf (arridx (add (imul I SIZE) 8) SIZE LEN).
wrArray4 : {M:exp} {A:exp} {T:exp} {OFF:exp} {E:exp}
           pf (of A (jarray T)) ->
           pf (of M mem) ->
           pf (nonnull A) ->
           pf (size T 4) ->
           pf (arridx OFF 4 (sel4 M (add A 4))) ->
           pf (of E T) ->
           pf (safewr4 (add A OFF) E).
63 Homework Exercise 4
- The sample proof rules given here are very Java-specific and compiler-specific.
- This is clearly problematic.
- Why did Necula and Lee do this?
64 Summary
65 Summary
- The Necula/Lee approach to PCC is based on the notion of progress in a safe operational semantics.
- It makes use of a verification-condition generator to extract predicates to be proven.
- The approach seems to cover a wide range of practical problems.
66 Next Time
- Detailed examples, and then the representation of proofs and algorithms for checking them.