Stacks, Heaps and Regions: One Logic to Bind Them

1
Stacks, Heaps and Regions: One Logic to Bind Them
  • David Walker
  • Princeton University
  • SPACE 2004

2
Stacks, Heaps and Regions: One Logic to Bind Them
  • David Walker
  • Princeton University
  • With Amal Ahmed and Limin Jia

3
Certifying Compilers
  • Certifying compilers produce
  • machine code
  • safety proof
  • type safety
  • thread safety
  • memory safety
  • Uses
  • trustworthy mobile code
  • safety-critical systems
  • compiler debugging

(Figure: Source Program → Certifying Compiler → Machine Code + Safety Proof)
4
Certifying Compilers
  • Low-level typing abstractions
  • support diverse source languages
  • support diverse implementation and optimization
    strategies
  • clean interface between compiler and mechanical
    safety checkers

(Figure: Java, C, ML → Transform, Optimize → Low-level Typing Abstractions → Machine Code + Safety Proof (Typing Invariants Encoded))
5
TALx86 Lessons [Morrisett et al.]
  • Checking control-flow safety is fairly easy
  • State and memory management is the hard part
  • new typing algorithms for each new compiler trick
  • machine register state
  • heap memory (pointers, structs, ...)
  • stack memory (stack pointers, stack structs, ...)
  • user-managed memory (more pointers, aliasing
    info, ...)
  • Results
  • complex, ad hoc axioms (type checker less
    trustworthy)
  • repeated work
  • abstractions not generally composable or reusable

6
A Goal for SPACE 20...
  • What we are looking for: a new proof-carrying
    code system / typed assembly language for safe
    memory management
  • More uniform, more general
  • Easier to understand (simpler semantics)
  • Allows reuse and composition of abstractions
  • A promising approach: search for new logics that
    can capture common storage invariants
  • Following Ishtiaq, O'Hearn, Pym, Reynolds and
    others: insights on storage semantics and
    separation logic
  • And Pfenning, the CMU crew and others: logical
    design techniques and work on logical frameworks

7
This Talk
  • What recurring properties of memory do we need to
    reason about in a proof-carrying code system?
  • Internalizing storage properties in a modal
    substructural logic
  • Semantics of formulae
  • Using the logic to describe state in a low-level
    type system (briefly)
  • Related and future work
  • This talk is based on work at TLDI 03 and LICS 03

8
Property 1: Separation
  • The memory for the heap is separate from the
    memory for the stack
  • The register EAX is separate from register EBX
    (and ECX, etc...)
  • In general, memory A is separate from memory B if
    the domain of A does not overlap with the domain
    of B
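
The separation property can be made concrete with a minimal sketch. It assumes a memory is modelled as a Python dict from locations to values; the names `separate`, `stack` and `heap` and the sample addresses are illustrative and do not come from the talk.

```python
def separate(a, b):
    """Memories a and b are separate when their domains do not overlap."""
    return a.keys().isdisjoint(b.keys())

# Hypothetical sample memories.
stack = {74: "ret", 75: 0}
heap = {7: 1, 8: 2, 9: 3}

print(separate(stack, heap))          # True
print(separate(heap, {9: 0, 14: 5}))  # False: both contain location 9
```

Because the domains are disjoint, an update to a location of `stack` cannot touch any location of `heap`, which is the practical payoff of separation.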

(Figure: memory locations ℓ74, ℓ75, ℓ7, ℓ8, ℓ9, ℓ14, ℓ15; labelled areas: stack, heap, EAX, EBX.)
9
Property 1: Separation
  • The importance of separation
  • If memory A is separate from memory B then
    updates to A have no impact on B
  • E.g., updating the stack does not change values in
    the heap
  • E.g., updating EAX does not change the contents of
    EBX
  • E.g., deallocating region r1 has no impact on
    region r2 (if they are separate)
  • Present in
  • Linear type systems
  • TALx86
  • Ishtiaq, O'Hearn and Reynolds: separation logic

10
Property 2: Adjacency
  • A struct is a sequence of adjacent locations
  • An activation record is a sequence of adjacent
    locations
  • A stack is a sequence of adjacent activation
    records
  • In general, A is adjacent to B if the greatest
    location in A is next to the least location in B,
    and A is separate from B
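
A minimal sketch of this definition, assuming integer locations where "next to" means "plus one". The function name `adjacent` and the sample memories are hypothetical, and the behaviour for empty memories is an assumption, since the slide does not cover that case.

```python
def adjacent(a, b):
    """a is adjacent to b: disjoint domains and max(dom a) + 1 == min(dom b).

    Assumption: both memories are non-empty dicts keyed by integers.
    """
    if not a or not b:
        raise ValueError("adjacency is only modelled for non-empty memories")
    return a.keys().isdisjoint(b.keys()) and max(a) + 1 == min(b)

# Hypothetical activation records laid out next to each other.
a1 = {7: "x", 8: "y"}
a2 = {9: "z"}

print(adjacent(a1, a2))  # True
print(adjacent(a2, a1))  # False: adjacency is ordered
```

Unlike separation, adjacency is not symmetric, which is why the corresponding logical connective has no exchange property.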

(Figure: locations ℓ7, ℓ8, ℓ9; activation records a1, a2, rest...; top.)
11
Property 2: Adjacency
  • The importance of adjacency
  • If memory A is adjacent to memory B and we can
    access A then we can access B
  • E.g., using a pointer to the beginning of a
    struct, we can access all of its elements
  • E.g., using a pointer to the top of the stack, we
    can access the items in the current activation
    record
  • Present in
  • TALx86
  • Foundational PCC (Appel et al.)
  • Ordered type systems (Petersen et al.)

12
Property 3: Containment
  • Register EAX can contain an integer value (or a
    pointer value or other kinds of values)
  • A memory location (say, ℓ7) can contain a
    sequence of 32 bits
  • A user-managed memory region may contain a
    collection of memory locations.

(Figure: EAX containing 3; location ℓ7 shown as bits 31 ... 1 0 with states on, on, off; region R7 containing locations ℓ22 and ℓ7 with contents 13 and 7.)
13
Property 3: Containment
  • The importance of containment
  • If A is contained in memory region r and region r
    has property P then A has property P
  • E.g., EAX may contain an integer; if so, we can
    add 3 to the contents of EAX
  • E.g., memory region R1 may contain live data;
    if so, we can dereference pointers into that
    region
  • Present in
  • Tofte and Talpin's region calculus
  • Cardelli, Gardner, Ghelli and Gordon's ambient,
    tree and graph logics
  • TALx86 (registers, static data segment, stack
    and heap)

14
Property 4: Aliasing
  • Two pointers are aliases of one another if they
    point to the same location.
  • Aliasing information is important: if x and y are
    aliases, changing memory at x changes memory at y
  • Present in
  • every system!!
  • TALx86 reasoned about heap aliases and stack
    aliases

(Figure: x and y pointing to the same location, which contains 3 (x = y).)
15
This Talk
  • What recurring properties of memory are
    convenient for reasoning in a proof-carrying code
    system?
  • Internalizing storage properties in a modal
    substructural logic
  • Semantics of formulae
  • Using the logic to describe state in a low-level
    type system
  • Related and future work
  • This talk is based on work at TLDI 03 and LICS 03

16
Preliminaries - Memories
  • A memory is a mapping from locations to values.
  • Each location may have a single successor.
  • Successor relation gives rise to an ordering.
  • Locations may be composite
  • ℓ ::= root | ℓ.n, e.g. .R1.a7 and .R2.a14.b0
    (paths start at the root)
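
One way to picture composite locations is as tuples of names, with a memory as a dict keyed by those tuples. This is only a sketch: the encoding, `ROOT`, `child` and `within` are assumptions made for illustration, and the successor ordering on locations is not modelled.

```python
ROOT = ()  # hypothetical encoding of the root place


def child(place, name):
    """Extend a place with one more component (the place 'place.name')."""
    return place + (name,)


a7 = child(child(ROOT, "R1"), "a7")                # .R1.a7
b0 = child(child(child(ROOT, "R2"), "a14"), "b0")  # .R2.a14.b0
m = {a7: 3, b0: 1}                                 # a memory: location -> value


def within(m, place):
    """Sub-memory of m whose locations lie inside `place`."""
    n = len(place)
    return {loc: v for loc, v in m.items() if loc[:n] == place}


print(within(m, child(ROOT, "R1")))  # {('R1', 'a7'): 3}
```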

(Figure: memory m with locations ℓ9, ℓ6, ℓ5, ℓ16, ℓ7, ℓ17, contents a, 3, 1, and regions r2 and r1.)
17
Formulae
  • Predicates q ::= τ
  • Formulae F ::= q
  • Semantics of formulae given by m ⊨ F @ ℓ
  • F describes memory m, whose contents are located
    in place ℓ (ℓ acts like a constraint on the
    memory)
  • Simplest case
  • m ⊨ τ @ ℓ iff dom(m) = {ℓ} and ⊢ m(ℓ) : τ
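
The simplest case can be checked mechanically. The sketch below assumes memories are dicts and models only two value types; `has_type` and `sat_type` are hypothetical helper names.

```python
def has_type(v, t):
    """Assumed value typing: only 'int' and 'bool' are modelled."""
    if t == "int":
        return isinstance(v, int) and not isinstance(v, bool)
    if t == "bool":
        return isinstance(v, bool)
    raise ValueError(f"unknown type {t!r}")


def sat_type(m, t, loc):
    """m satisfies type t at loc: m is exactly one cell, at loc, holding a t."""
    return set(m) == {loc} and has_type(m[loc], t)


print(sat_type({3: 5}, "int", 3))        # True
print(sat_type({3: 5, 4: 6}, "int", 3))  # False: the memory has an extra cell
print(sat_type({3: True}, "int", 3))     # False: wrong type of value
```

Note that the domain condition is an equality, so the formula describes the whole memory and not just part of it.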

18
Formulae
  • Example
  • m ⊨ int @ ℓ3 if

(Figure: memory m with location ℓ3 containing 5; notice ⊢ m(ℓ3) : int.)
19
Formulae: Separation
  • Predicates q ::= τ
  • Formulae F ::= q | F1 ⊗ F2
  • m ⊨ F1 ⊗ F2 @ ℓ iff there exist disjoint m1 and m2
    such that
  • m1 ⊨ F1 @ ℓ and m2 ⊨ F2 @ ℓ
  • and m = m1 ⊎ m2
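
The "there exist disjoint m1 and m2" clause can be read as a search over all ways of splitting the memory. The sketch below is a brute-force illustration, not the authors' checker: formulae are modelled as Python predicates on dict memories, and `splits`, `cell` and `tensor` are hypothetical names.

```python
from itertools import combinations


def splits(m):
    """Yield every way to divide m into two disjoint parts m1, m2."""
    keys = list(m)
    for r in range(len(keys) + 1):
        for left in combinations(keys, r):
            m1 = {k: m[k] for k in left}
            m2 = {k: v for k, v in m.items() if k not in left}
            yield m1, m2


def cell(loc):
    """Hypothetical atomic formula: exactly one int cell at loc."""
    return lambda m: set(m) == {loc} and isinstance(m[loc], int)


def tensor(f1, f2):
    """Separating conjunction: some split satisfies f1 and f2 respectively."""
    return lambda m: any(f1(m1) and f2(m2) for m1, m2 in splits(m))


print(tensor(cell(3), cell(16))({3: 7, 16: 5}))  # True
print(tensor(cell(3), cell(3))({3: 7}))          # False: one cell cannot be used twice
```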

20
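Formulae: Separation
  • Example
  • m1 ⊨ F1 @ root    m2 ⊨ F2 @ root

(Figure: memories m2 and m1 drawn separately; locations ℓ3, ℓ16, ℓ17, ℓ7, ℓ8, ℓ9 and ℓ3, ℓ16, ℓ5; contents include 7 and r6.)
21
Formulae: Separation
  • Example
  • m1 ⊎ m2 ⊨ F1 ⊗ F2 @ root

(Figure: the combined memory m1 ⊎ m2, with the same locations and contents.)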
22
Formulae: Adjacency
  • Predicates q ::= τ
  • Formulae F ::= q | F1 ⊗ F2 | F1 ∘ F2
  • m ⊨ F1 ∘ F2 @ ℓ iff there exist adjacent (and
    disjoint)
  • m1, m2 such that
  • m1 ⊨ F1 @ ℓ and m2 ⊨ F2 @ ℓ
  • and m = m1 ⊎ m2
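
For integer locations, an adjacent split is always a cut of the sorted locations at a point where the two neighbours are consecutive. The sketch below relies on that observation; it assumes formulae are Python predicates on dict memories, considers only splits with two non-empty halves, and uses the hypothetical names `cell` and `fuse`.

```python
def cell(loc):
    """Hypothetical atomic formula: exactly one int cell at loc."""
    return lambda m: set(m) == {loc} and isinstance(m[loc], int)


def fuse(f1, f2):
    """Adjacent conjunction: m splits into m1 followed directly by m2."""
    def holds(m):
        locs = sorted(m)
        for i in range(1, len(locs)):
            if locs[i - 1] + 1 != locs[i]:
                continue  # the two halves would not be adjacent at this cut
            m1 = {k: m[k] for k in locs[:i]}
            m2 = {k: m[k] for k in locs[i:]}
            if f1(m1) and f2(m2):
                return True
        return False
    return holds


print(fuse(cell(7), cell(8))({7: 1, 8: 2}))  # True
print(fuse(cell(8), cell(7))({7: 1, 8: 2}))  # False: order matters
print(fuse(cell(7), cell(9))({7: 1, 9: 2}))  # False: separate but not adjacent
```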

23
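Formulae: Adjacency
  • Example
  • m1 ⊨ F1 @ root    m2 ⊨ F2 @ root

(Figure: memories m2 and m1 drawn separately; locations ℓ3, ℓ5, ℓ7, ℓ8, ℓ9, ℓ10, ℓ16, ℓ17; contents 7, b, c.)
24
Formulae: Adjacency
  • Example
  • m1 ⊎ m2 ⊨ F1 ∘ F2 @ root

(Figure: the combined memory m1 ⊎ m2, with the same locations and contents.)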
25
Formulae: Containment
  • Predicates q ::= τ
  • Formulae F ::= q | F1 ⊗ F2 | F1 ∘ F2 | n[F]
  • m ⊨ n[F] @ ℓ iff m ⊨ F @ ℓ.n
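
The containment clause simply pushes the name n onto the place at which the inner formula is checked. The sketch below is a small evaluator under stated assumptions: places are tuples of names, formulae are tagged tuples, and `splits`, `has_type` and `sat` are hypothetical helpers covering only value types, containment and separation.

```python
from itertools import combinations


def splits(m):
    """Yield every way to divide m into two disjoint parts."""
    keys = list(m)
    for r in range(len(keys) + 1):
        for left in combinations(keys, r):
            yield ({k: m[k] for k in left},
                   {k: v for k, v in m.items() if k not in left})


def has_type(v, t):
    """Assumed value typing for 'int' and one-character 'char'."""
    if t == "int":
        return isinstance(v, int)
    if t == "char":
        return isinstance(v, str) and len(v) == 1
    raise ValueError(t)


def sat(m, f, place=()):
    """Does memory m satisfy formula f at `place` (default: the root)?"""
    kind = f[0]
    if kind == "type":    # ("type", t): exactly one cell, at `place`
        return set(m) == {place} and has_type(m[place], f[1])
    if kind == "in":      # ("in", n, F): check F one level down, at place.n
        return sat(m, f[2], place + (f[1],))
    if kind == "tensor":  # ("tensor", F1, F2): separation
        return any(sat(m1, f[1], place) and sat(m2, f[2], place)
                   for m1, m2 in splits(m))
    raise ValueError(kind)


m = {("eax",): 5, ("ebx",): "a"}
eax_int = ("in", "eax", ("type", "int"))
ebx_char = ("in", "ebx", ("type", "char"))

print(sat(m, ("tensor", eax_int, ebx_char)))  # True
print(sat(m, eax_int))                        # False: the ebx cell is unaccounted for
```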

26
Formulae - Containment
  • Example
  • m ⊨ eax[int] @ root since m ⊨ int @ .eax
  • since ⊢ m(.eax) : int

(Figure: memory m with eax containing 5.)
27
Formulae - Containment
  • Example
  • m ⊨ eax[int] ⊗ ebx[char] @ root

(Figure: memory m with eax containing 5 and ebx containing a.)
28
Formulae - Containment
  • Example
  • m ⊨ eax[int] ⊗ ebx[char] @ root
  • since m1 ⊨ eax[int] @ root and m2 ⊨ ebx[char] @ root

(Figure: memory m with eax containing 5 and ebx containing a.)
29
Formulae - Containment
  • Example
  • m ⊨ eax[int] ⊗ ebx[char] @ root
  • since m1 ⊨ eax[int] @ root and m2 ⊨ ebx[char] @ root
  • since m1 ⊨ int @ .eax and m2 ⊨ char @ .ebx

(Figure: memory m with eax containing 5 and ebx containing a.)
30
Aliasing
  • Types τ ::= int | bool | S(ℓ) | ...
  • Predicates q ::= τ
  • Formulae F ::= q | F1 ⊗ F2 | F1 ∘ F2 | n[F]
  • ⊢ v : S(ℓ) iff v = ℓ (all values with type
    S(ℓ) are
  • aliases of one another)
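
A minimal sketch of singleton types, assuming locations are represented by strings and a memory by a dict. The encoding `("S", loc)` and the helper `has_type` are hypothetical.

```python
def has_type(v, t):
    """Assumed value typing: 'int', or a singleton type ("S", loc)."""
    if t == "int":
        return isinstance(v, int)
    if isinstance(t, tuple) and t[0] == "S":
        return v == t[1]  # a value has type S(loc) iff it is the location loc
    raise ValueError(t)


# Registers eax and ebx both hold a pointer to location a2.
m = {"eax": "a2", "ebx": "a2", "a2": 7}

print(has_type(m["eax"], ("S", "a2")))  # True
print(has_type(m["ebx"], ("S", "a2")))  # True

m[m["eax"]] = 8     # write through eax ...
print(m[m["ebx"]])  # 8: ... and the change is visible through ebx
```

Since both registers have the same singleton type, the type system knows statically that they are aliases.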

31
Aliasing
  • Example
  • m ⊨ eax[S(.a2)] ⊗ ebx[S(.a2)] ⊗ a2[int] @ root

(Figure: memory m in which eax and ebx both point to a2, which contains 7; eax and ebx are labelled as aliases.)
32
One More Useful Predicate
  • Types τ ::= int | bool | S(ℓ) | ...
  • Predicates q ::= τ | more← | more→
  • Formulae F ::= q | F1 ⊗ F2 | F1 ∘ F2 | n[F] | ...
  • m ⊨ more←    m ⊨ more→

(Figure: two memories m, drawn as runs of adjacent locations (ℓ4 to ℓ9 and ℓ14 to ℓ19) that continue with ". . .")
33
Simple Machine Memory Layout
  • ( more← ∘ ℓhd[τ] ∘ Ftail ∘ Fheap ∘ ℓap[τ] ∘
    more→ )
  • ⊗ r1[τ1] ⊗ r2[τ2] ⊗ . . . ⊗ sp[S(ℓhd)] ⊗ ap[S(ℓap)]

(Figure: memory laid out as more←, Ftail, Fheap, more→, with ℓhd and ℓap marked; registers sp, r1, r2, ap, where sp points to ℓhd and ap points to ℓap.)
34
More logic
  • Predicates q ::= τ | more← | more→
  • Formulae F ::= q | F1 ⊗ F2 | F1 ∘ F2 | n[F]
    | 1 | F1 ⊸ F2
    | F1 & F2 | ⊤ | F1 ⊕ F2 | 0
    | f | ∀b.F | ∃b.F
  • Bindings b ::= ℓ:L | n:N | a:T | f:F
  • m ⊨ 1 iff dom(m) is empty
  • m ⊨ F1 & F2 iff m ⊨ F1 and m ⊨ F2
  • m ⊨ ⊤ (holds for any memory m)
  • ....
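
The three semantic clauses listed above translate directly into predicates on memories. This sketch assumes dict memories and formulae as Python functions; `one`, `top`, `conj` and `cell` are hypothetical names, and the remaining connectives are not modelled.

```python
def one(m):
    """Multiplicative unit: holds only of the empty memory."""
    return len(m) == 0


def top(m):
    """Holds for any memory."""
    return True


def conj(f1, f2):
    """Additive conjunction: the same memory satisfies both formulae."""
    return lambda m: f1(m) and f2(m)


def cell(loc):
    """Hypothetical atomic formula: exactly one int cell at loc."""
    return lambda m: set(m) == {loc} and isinstance(m[loc], int)


print(one({}))                       # True
print(one({3: 5}))                   # False
print(conj(cell(3), top)({3: 5}))    # True
print(conj(cell(3), one)({3: 5}))    # False: the memory is not empty
```

The contrast with separation is that `conj` hands the whole memory to both sides instead of splitting it.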

35
Logical Deduction
  • Judgments have the form Θ ∥ Δ ⊢ F @ ℓ
  • Θ is a variable context: a list of free variables
    and their kinds
  • Δ is a bunched context: trees rather than lists
  • (O'Hearn and Pym, 1999)
  • Δ ::= · | (F @ ℓ) | Δ, Δ | Δ ; Δ

(Figure annotations on the context forms, in order: object at a place; adjacent storage (no exchange property); separate storage (exchange property).)
36
Logical Deduction
  • The natural deduction rules are sound with
    respect to the storage semantics
  • Semantics of contexts: m ⊨ Δ
  • Theorem (Soundness)
  • If m ⊨ Δ and · ∥ Δ ⊢ F @ ℓ then m ⊨ F @ ℓ.

37
This Talk
  • What recurring properties of memory are
    convenient for reasoning in a proof-carrying code
    system?
  • Internalizing storage properties in a modal
    substructural logic
  • Semantics of formulae
  • Using the logic to describe state in a low-level
    type system
  • Related and future work
  • This talk is based on work at TLDI 03 and LICS 03

38
Mini-KAM: Simplified ML Kit Abstract Machine
  • Registers r ::= acc1 | acc2 | sp
  • Values v ::= ....
  • Instructions i ::= immed1(v) | immed2(v) | add
    | sub | push | pop
    | selectStack(i) | storeStack(i)
    | select(i) | store(i)
    | letRgnInf | endRgnInf | alloc(i)

(Figure annotations on the instruction groups: register ops, stack ops, region ops.)
39
Mini-KAM Types
  • Types τ ::= int | S(ℓ) | live | dead
    | (F @ ℓ) → 0
  • Integers: 5 : int
  • Places: ℓ : S(ℓ)
  • Region status: live : live, dead : dead
  • Code locations: c : (F @ ℓ) → 0
  • Means it is safe to jump to c with a memory m
    such that m ⊨ F @ ℓ

40
Mini-KAM: Simplified ML Kit Abstract Machine
  • Mini-KAM Store Hierarchy

(Figure: the store contains acc1, acc2, sp, the stack and regions R1 . . . Rn.)
(Figure formulae: R1[live ⊗ F ⊗ (a[-] ∘ more→)] and st[more← ∘ ak[-] ∘ . . ∘ a1[-] ∘ ⊤].)
(Figure labels: current activation record, description of data in region, region allocation boundary, live region, stack tail, stack area.)
41
Using Formulae in Typing Rules
  • Judgments of the form F @ ℓ can be used to
    describe the pre- and postconditions of
    instructions
  • Instruction typing judgment: Θ ∥ F @ ℓ ⊢ i :
    F' @ ℓ'

42
Using Formulae in Typing Rules
  • Judgment: Θ ∥ F @ ℓ ⊢ i : F' @ ℓ'
  • In J, look up the type of place ℓ.n
  • J(ℓ.n) = F if · ∥ J ⊢ (⊤ ⊗ n[F]) @ ℓ
  • Rule for add instruction
  • premises: (F @ ℓ)(.acc1) = int and (F @ ℓ)(.acc2) =
    int
  • conclusion: Θ ∥ F @ ℓ ⊢ add : F @ ℓ
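
The premises of the add rule amount to two lookups in the state description. The sketch below is a deliberately simplified stand-in: the description is a flat dict from register names to type names rather than a formula, the lookup is a dict access rather than a derivation in the logic, and returning the description unchanged follows the rule as transcribed (an assumption, since annotations may have been lost in the transcript). The name `check_add` is hypothetical.

```python
def check_add(j):
    """Check the premises of the add rule against description j.

    j is a hypothetical stand-in for F @ ℓ: a dict from place names to types.
    Returns the postcondition, assumed here to equal the precondition.
    """
    for reg in ("acc1", "acc2"):
        if j.get(reg) != "int":
            raise TypeError(f"add requires {reg} : int, found {j.get(reg)!r}")
    return dict(j)


pre = {"acc1": "int", "acc2": "int", "sp": "S(.stack.n0)"}
print(check_add(pre) == pre)  # True

try:
    check_add({"acc1": "int", "acc2": "S(.stack.n0)"})
except TypeError as err:
    print("rejected:", err)
```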

43
Using Formulae in Typing Rules
  • Judgment: Θ ∥ J ⊢ i : J' (where J is of the
    form F @ p)
  • Rule (storeStack)
  • premises: J(.sp) = S(.stack.n0) and J(.acc1) = τ
  • conclusion: Θ ∥ J ⊢ storeStack(i) : J[.stack.n0 + i := τ]
  • In J, update the type of place ℓ.n0 + i
  • J[ℓ.n0 + i := τ] = ((F1 ∘ n0[-] ∘ ··· ∘ ni[τ] ∘
    F2) ⊗ F3) @ ℓ if · ∥ J ⊢ ((F1 ∘
    n0[-] ∘ ··· ∘ ni[-] ∘ F2) ⊗ F3) @ ℓ
44
This Talk
  • What recurring properties of memory are
    convenient for reasoning in a proof-carrying code
    system?
  • Internalizing storage properties in a modal
    substructural logic
  • Semantics of formulae
  • Using the logic to describe state in a low-level
    type system
  • Related and future work
  • This talk is based on work at TLDI 03 and LICS 03

45
Related Work
  • Reasoning about adjacency
  • Stack-based TAL (Morrisett et al., 1998)
  • Foundational PCC: reasoning about memory
    allocation (Appel et al.)
  • λord calculus for reasoning about data layout
    at the frontier (Petersen et al., 2003)
  • Reasoning about aliasing
  • Long history . . . singleton types for aliasing
    (Smith, Walker and Morrisett) continue to be useful
  • Spatial logics: separation and/or containment
  • BI, separation logic (Ishtiaq, O'Hearn, Reynolds
    and others, 2000, 2001)
  • Ambient logic (Cardelli and Gordon, 2000)
  • Tree and graph logics (Cardelli, Gardner, Ghelli,
    2002)

46
Lots More Work to Do
  • Add inductive definitions and syntactic rules for
    reasoning about arrays, recursive data structures
  • Investigate encodings for common invariants
  • stack-allocation algorithms
  • region-allocation algorithms
  • aliasing patterns
  • Better understand the connection between modal
    (hybrid) logic and regions

47
Conclusion
  • Described a unified framework for reasoning about
  • Separation
  • Adjacency
  • Containment
  • Aliasing
  • Semantics are sound, simple and uniform
  • Logic forms the basis for a sound and flexible
    low-level type system
  • See TLDI 03 and LICS 03 for details

50
May Alias Formula
  • when two bits of storage (at a1 and a2) may
    alias
  • ∃a1. ∃a2. (a1[int] ⊗ ⊤) & (a2[int] ⊗ ⊤)
  • both memories satisfy the formula
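
The may-alias pattern can be tested on both kinds of memory with a brute-force sketch. Formulae are modelled as Python predicates on dict memories, the quantified names are instantiated by hand, and `splits`, `cell`, `top`, `tensor`, `conj` and `may_alias` are hypothetical names.

```python
from itertools import combinations


def splits(m):
    """Yield every way to divide m into two disjoint parts."""
    keys = list(m)
    for r in range(len(keys) + 1):
        for left in combinations(keys, r):
            yield ({k: m[k] for k in left},
                   {k: v for k, v in m.items() if k not in left})


def cell(loc):
    """Exactly one int cell at loc."""
    return lambda m: set(m) == {loc} and isinstance(m[loc], int)


def top(m):
    return True


def tensor(f1, f2):
    return lambda m: any(f1(m1) and f2(m2) for m1, m2 in splits(m))


def conj(f1, f2):
    return lambda m: f1(m) and f2(m)


def may_alias(a1, a2):
    """There is an int at a1 (plus anything) and an int at a2 (plus anything)."""
    return conj(tensor(cell(a1), top), tensor(cell(a2), top))


print(may_alias("a1", "a2")({"a1": 5, "a2": 7}))  # True: two distinct cells
print(may_alias("a", "a")({"a": 5}))              # True: both names denote one cell
print(may_alias("a1", "a2")({"a1": 5}))           # False: nothing at a2
```

The additive conjunction lets both halves of the formula inspect the same memory, so the two names are free to coincide.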

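(Figure: one memory with distinct locations a1 and a2 containing 5 and 7, and one memory with a single location a containing 5.)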
51
Example: Saving Temporaries on the Stack
  • Code (with describing formulae alongside)
  • (b-stackgrow)(x 2)
  • (b-unpack)(x 2)
  • sub sp,sp,2
  • st sp[0],r1
  • st sp[1],r2
  • <Code for A>
  • ld r1,sp[0]
  • ld r2,sp[1]
  • add sp,sp,2

Fpre:
(more← ∘ ℓ1[a1] ∘ ℓ2[a2] ∘ ℓ[τ] ∘ F1) ⊗
sp[S(ℓ)] ⊗ r1[τ1] ⊗ r2[τ2]
Fpost:
(more← ∘ ℓ1[a1] ∘ ℓ2[a2] ∘ ℓ[τ] ∘ F1) ⊗
sp[S(ℓ1)] ⊗ r1[τ1] ⊗ r2[τ2]
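
The run-time effect of this instruction sequence can be simulated in a few lines. The sketch assumes a stack that grows towards lower addresses, word-indexed offsets from sp, and a body "Code for A" that leaves sp and the two saved slots alone; `save_and_restore` and `clobber` are hypothetical names, and the proof coercions at the top of the listing are not modelled.

```python
def save_and_restore(mem, regs, code_for_a):
    """Simulate: grow the stack, save r1/r2, run A, reload r1/r2, shrink."""
    regs["sp"] -= 2                    # sub sp,sp,2
    mem[regs["sp"] + 0] = regs["r1"]   # store r1 at offset 0 from sp
    mem[regs["sp"] + 1] = regs["r2"]   # store r2 at offset 1 from sp
    code_for_a(mem, regs)              # <Code for A>
    regs["r1"] = mem[regs["sp"] + 0]   # load r1 back
    regs["r2"] = mem[regs["sp"] + 1]   # load r2 back
    regs["sp"] += 2                    # add sp,sp,2


def clobber(mem, regs):
    """Stand-in for A: overwrites both temporaries."""
    regs["r1"] = regs["r2"] = -1


regs = {"sp": 100, "r1": 11, "r2": 22}
mem = {}
save_and_restore(mem, regs, clobber)
print(regs)  # {'sp': 100, 'r1': 11, 'r2': 22}
```

The formulae on the slide describe the same steps statically: the singleton type of sp records which stack cell it points to before and after the stack grows.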
52
Formulae Wrapped in Types
  • Types τ ::= int | S(p) | (F @ p) → 0
  • Informally, c : (F @ p) → 0 means it is safe to
    jump to c with a memory m such that m ⊨ F @ p

53
Motivation: Certifying Compilers
(Figure: Source Program → Certifying Compiler → Safety Proof + Machine Code)
54
Motivation: Certifying Compilers
(Figure: Source Program → Parse, Typecheck → High-level Typed IL → Analysis, Optimization → Medium-level Typed IL → Code Generation → Typed Assembly Language → Assembler → Machine Code; Hints → Prover → Safety Proof. The stages form a type-preserving compiler.)
55
Motivation: Certifying Compilers
(Figure: three source languages (Java, Java, ML), each with its own High TIL and Optimize stage, feeding a shared type-preserving compiler: Medium-level Typed IL → Code Generation → Typed Assembly Language → Assembler → Machine Code; Hints → Prover → Safety Proof.)
56
Motivation: Proof-Carrying Code
  • The Princeton foundational PCC system (Appel et
    al.)
  • Scaling PCC to production compilers and realistic
    languages
  • Some requirements
  • Multiple source languages, single target language
  • Core proof system must be general and flexible
  • support for general language features
  • handle different implementation and optimization
    strategies
  • Trusted computing base should be small
  • to limit security bugs

57
PCC System: Layers of Abstraction
Compiler
High-level typing abstractions
Low-level typing abstractions
Semantics of types
Machine spec
Higher-order logic
58
A Hard Problem (Semantics)
  • Semantics of memory updates and memory reuse
  • Semantic model of ML-style mutable references
    (Ahmed, Appel, Virga, 2002)
  • To handle ML function closures
  • extended model with mutable references to
    (impredicative) polymorphic types (Ahmed, Appel,
    Virga, 2003)
  • To allow memory reuse
  • extended model to support region-based memory
    management

59
Motivation: Certifying Compilers
(Figure: Java, C, ML → High-level Typed IL → Analysis, Optimization → Medium-level Typed IL → Typing abstractions (TAL) → Prover → Machine Code + Safety Proof)
  • Typing abstractions should be general and flexible: support many
  • language features
  • implementation and
  • optimization strategies
60
Typing Abstractions for Memory
  • Reasoning about memory is complicated
  • many different memory management strategies,
    aliasing patterns, data layout possibilities,
    etc.
  • Systems for safe mobile code would benefit from
  • a unified framework for reasoning about a variety
    of invariants
  • convenient abstractions that help structure
    proofs of memory safety

61
Abstractions for Memory?
62
Abstractions for Memory?
(Figure: three certifying-compiler pipelines side by side. Cornell Popcorn / Cyclone: Source → High TIL → Medium TIL → TALx86 → Machine Code + Safety Proof. Cedilla Systems Special J: Source → High TIL → Medium TIL → VCGen, Prover → Machine Code + Safety Proof. Princeton Foundational PCC: Source → High TIL → Medium TIL → LTAL → Prover → Machine Code + Safety Proof.)
63
Abstractions for Memory?
  • Reasoning about memory is complicated
  • many different memory management strategies,
    aliasing patterns, data layout possibilities, etc.

64
Typing Abstractions for Memory?
  • Reasoning about memory is complicated
  • many different memory management strategies,
    aliasing patterns, data layout possibilities, etc.

65
Formulae Wrapped in Types
  • Types τ ::= int | S(p) | (F @ p) → 0
  • Informally, c : (F @ p) → 0 means it is safe to
    jump to c with a memory m such that m ⊨ F @ p

66
Lessons from Typed Assembly Language
  • Lesson 1
  • Much of the type theory designed for higher-level
    languages can be reused to help verify machine
    code.
  • TAL is just the closed, continuation-passing
    style polymorphic lambda calculus ()
  • Lesson 2
  • The hard part is memory management and memory
    safety.

67
One Logic to Bind Them
  • New goals for general-purpose safe memory
    management
  • composable abstractions
  • reusable abstractions
  • orthogonal abstractions
  • comprehensible abstractions
  • A unified composable framework for reasoning
    about
  • separation of objects (memory blocks)
  • adjacency of objects
  • aliasing of pointers
  • containment of one place in another
  • Proof that deduction in our logic is sound with
    respect to the memory model
  • Use logic in a type system for an IL for
    region-based memory management (Mini-KAM) and
    prove that the language is sound

68
This Talk
  • Logical formulae and the memory model
  • Flat memory
  • Hierarchical memory
  • Type system for Mini-KAM (informally)

69
A Logical Approach to Memory Management
  • One logic for reasoning about key storage
    properties
  • separation of objects (memory blocks)
  • adjacency of objects
  • containment of one place in another
  • aliasing of pointers
  • Logic comes with
  • orthogonal connectives to internalize key
    properties
  • syntactic proof rules
  • sound store semantics
  • Logic is incorporated into a typed abstract
    machine
  • safe stack, heap and region-based memory
    management

70
Formulae: Multiplicative Unit
  • Predicates q ::= τ
  • Formulae F ::= q | F1 ⊗ F2 | F1 ∘ F2 | 1
  • m ⊨ 1 iff

(Figure: memory m.)
71
Hierarchical Memories
m
72
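Hierarchical Memories, Paths

(Figure: memory m with regions R1 and R2; R1 contains locations ℓ7, ℓ8, ℓ9 and R2 contains locations ℓ14, ℓ15.)
  • Path/place p ::= root | p.n, e.g.
    .R1.ℓ7 and .R2.ℓ14

73
Hierarchical Memories, Paths

(Figure: memory m, labelled A1, with regions R1 and R2; R1 contains locations ℓ7, ℓ8, ℓ9 and R2 contains locations ℓ14, ℓ15.)
  • Path/place p ::= root | p.n, e.g.
    .R1.ℓ7 and .R2.ℓ14
  • A hierarchical memory is a mapping from paths to
    values.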

74
Formulae: Containment
  • Predicates q ::= τ | more← | more→
  • Formulae F ::= q | F1 ⊗ F2 | F1 ∘ F2 | 1
    | F1 & F2 | ⊤ | F1 ⊕ F2 | 0
    | f | ∀b.F | ∃b.F | n[F]
  • Bindings b ::= p:P | n:N | a:T | f:F

Semantics given by m ⊨ F @ p
75
Formula Semantics: Separation
  • Formulae F ::= F1 ⊗ F2 | n[F]
  • m ⊨ (F1 ⊗ F2) @ p iff there exist disjoint
    m1 and m2 such that
  • m1 ⊨ F1 @ p and m2 ⊨ F2 @ p
  • and m = m1 ⊎ m2

76
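Formula Semantics: Separation
  • Example
  • m1 ⊨ F1 @ root    m2 ⊨ F2 @ root
  • dom(m1) = {.R5.ℓ3}    dom(m2) = {.R5.ℓ4}

(Figure: m1 with region R5 containing ℓ3, which holds 3; m2 with region R5 containing ℓ4, which holds 3.)
77
Formula Semantics: Separation
  • Example
  • m1 ⊨ F1 @ root    m2 ⊨ F2 @ root
  • dom(m1) = {.R5.ℓ3}    dom(m2) = {.R5.ℓ4}
  • m1 ⊎ m2 ⊨ (F1 ⊗ F2) @ root

(Figure: m1 ⊎ m2 with region R5 containing ℓ3 and ℓ4, each holding 3.)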
78
Sample Deductive Rules
  • (hypothesis): Θ ∥ F @ p ⊢ F @ p
  • (n[ ] I): from Θ ∥ Δ ⊢ F @ p.n, conclude
    Θ ∥ Δ ⊢ n[F] @ p
  • (n[ ] E): from Θ ∥ Δ ⊢ n[F] @ p, conclude
    Θ ∥ Δ ⊢ F @ p.n
  • Each connective is defined in terms of judgmental
    concepts only: no dependencies on other
    connectives
  • Simpler to understand and manipulate