Title: Languages of the future: Ωmega, the 701st programming language
1. Languages of the future: Ωmega, the 701st programming language
- Tim Sheard
- Portland State University
- (formerly from OGI/OHSU)
2. What's wrong with today's languages?
- The semantic gap
  - What does the programmer know about the program? How is this expressed?
- The temporal gap
  - Systems are configured with new knowledge at many different times: compile-time, link-time, run-time. How is this expressed?
3. What will languages of the future be like?
- Support reasoning about a program from within the programming language.
- Within the reach of most programmers: no Ph.D. required.
- Support all of today's capabilities but organize them in different ways.
- Separate powerful but risky features from the rest of the program, spell out the obligations needed to control the risk, and ensure that those obligations are met.
- Provide a flexible hierarchy of temporal stages. Track important attributes across stages.
4. How do we get there?
- In small steps, I'm afraid . . .
- Two small contributions
  - Putting the Curry-Howard isomorphism to work for regular programmers
  - Exploiting staged computation
5. Step 1: Putting Curry-Howard to work
- Programming by manipulating proofs of important semantic properties
- What is a proof?
- How do we exploit proofs?
- Ωmega is a new point in the design space, somewhere between
  - a programming language
  - a logic
6. We need something in between the two extremes!
[Figure: languages arranged between the two extremes; labels: Haskell, Python, OCaml, Pascal, Java, C, C]
7. Dimensions
- Formal methods systems
  - Have too few users. We can't solve the world's problems with a handful of users. And, for the most part, the users are thinkers, not hackers.
  - Are used to reason about systems, but aren't designed to really execute programs. For the most part, they don't have rich libraries, I/O, etc.
  - Have a steep learning curve. It takes a Ph.D. to learn to use these tools effectively.
8. Between the concrete and the clouds
- Users: Train more users to use formal systems, or add formal features to lower-level languages so existing programmers can use formal methods.
- Systems: Design practical extensions for formal systems and build robust compilers for them, or add formal extensions to practical languages.
9. [Figure: language spectrum; labels: Haskell, Python, OCaml, Pascal, Java, C, C]
10. Curry-Howard: What is a proof?
[Figure: the numeral 3 asking "Am I odd or even?"]
- Requirements for a legal proof
- Even is always stacked above odd
- Odd is always stacked above even
- The numeral decreases by one in each stack
- Every stack ends with 0
11.
1 = 1 + 0
2 = 1 + 1
3 = 1 + 2
12. Generalized Algebraic Datatypes
- Inductively formed structured data
- Generalizes enumerations and tagged variants
- Types are used to prevent the construction of ill-formed data and to encode constraints
- Pattern matching allows abstract, high-level (yet still efficient) access
- Support the kind of proof construction we're after
13. Integer Indexed Type-Constructors
- Z :: Even 0
- E :: Odd m -> Even (m+1)
- O :: Even m -> Odd (m+1)
- O (E (O Z)) :: Odd (1+1+1+0)
Note: Even and Odd are type constructors indexed by integers
14. GADTs generalize this restriction
- data Tree a
-   = Fork (Tree a) (Tree a)
-   | Node a
-   | Tip
- Fork :: Tree a -> Tree a -> Tree a
- Node :: a -> Tree a
- Tip :: Tree a
- Note: the data declaration introduces values of a new type
Restriction: the range of every constructor matches exactly the type being defined
15. GADT in Ωmega
Zero and Succ encode the natural numbers at the type level
- kind Nat = Zero | Succ Nat
- data Even n
-   = Z where n = Zero
-   | ex m . E (Odd m) where n = Succ m
- data Odd n
-   = ex m . O (Even m) where n = Succ m
Even and Odd are proofs!
16.
- Z :: Even Zero
- E :: Odd m -> Even (Succ m)
- O :: Even m -> Odd (Succ m)
- Note the different ranges in Z, E and O
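The Even/Odd declarations above can be approximated in Haskell. This is a minimal sketch, assuming GHC with the DataKinds and GADTs extensions: Ωmega's `kind` declaration is stood in for by a promoted data type, and the helpers `evenToInt` and `oddToInt` are names introduced only for this illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
module Main where

-- Type-level naturals, standing in for `kind Nat = Zero | Succ Nat`.
data Nat = Zero | Succ Nat

-- Each constructor fixes the index of its result type, so a value of
-- type `Even n` can only be built when n really is even.
data Even (n :: Nat) where
  Z :: Even 'Zero
  E :: Odd m -> Even ('Succ m)

data Odd (n :: Nat) where
  O :: Even m -> Odd ('Succ m)

-- The proof that three is odd, written exactly as on the slide.
threeIsOdd :: Odd ('Succ ('Succ ('Succ 'Zero)))
threeIsOdd = O (E (O Z))

-- Forget the index and recover an ordinary Int by counting constructors.
-- (Hypothetical helpers, not part of the slides.)
evenToInt :: Even n -> Int
evenToInt Z     = 0
evenToInt (E o) = 1 + oddToInt o

oddToInt :: Odd n -> Int
oddToInt (O e) = 1 + evenToInt e

main :: IO ()
main = print (oddToInt threeIsOdd)
```

Changing the signature of `threeIsOdd` to `Even (...)` with the same index makes the module fail to type check, which is the sense in which the value is a proof.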
17. The kind decl introduces new types
- Allow algebraic definitions to define new kinds as well as new types
- Zero and Succ are new types.
- kind Nat = Zero | Succ Nat
- Zero :: Nat
- Succ :: Nat ~> Nat
- Succ Zero :: Nat
18. A hierarchy of values, types, kinds, sorts, ...
[Figure: the hierarchy levels labeled sorts, kinds, types, and values, with Nat, Nat ~> Nat, Int, Zero, Succ, and 5 as example entries]
19. Why remove the restriction?
- The parameter of a type constructor (e.g. the a in T a) says something about the values with type T a
  - phantom types
  - indexed types
- Consider an expression language
- data Exp
-   = Eint Int
-   | Ebool Bool
-   | Eplus Exp Exp
-   | Eless Exp Exp
-   | Eif Exp Exp Exp
What about terms like (Eif (Eint 3) (Eint 0) (Eint 9))?
20. Imagine a type-indexed Term datatype
Note the different range types!
- Int :: Int -> Term Int
- Bool :: Bool -> Term Bool
- Plus :: Term Int -> Term Int -> Term Int
- Less :: Term Int -> Term Int -> Term Bool
- If :: Term Bool -> Term a -> Term a -> Term a
21. Type-indexed Data
- Benefits
- The type system disallows ill-formed Terms like
- (If (Int 3) (Int 0) (Int 9))
- Documentation
- With the right types, such objects act like
proofs
22. Type-indexed Terms
- data Term a
-   = Int Int where a = Int
-   | Bool Bool where a = Bool
-   | Plus (Term Int) (Term Int) where a = Int
-   | Less (Term Int) (Term Int) where a = Bool
-   | If (Term Bool) (Term a) (Term a)
- Int :: forall a . (a = Int) => Int -> Term a
- We can specialize this kind of type to the ones we want
- Int :: Int -> Term Int
- Bool :: Bool -> Term Bool
- Plus :: Term Int -> Term Int -> Term Int
- Less :: Term Int -> Term Int -> Term Bool
- If :: Term Bool -> Term a -> Term a -> Term a
23. Why is (Term a) like a proof?
- A value x of type Term a is like a judgment
  - ⊢ x : a
- The type system ensures that only valid judgments can be constructed. Having a value of type Term a guarantees (i.e. is a proof of) that the term is well typed.
24. Programming
- eval :: Term a -> a
- eval (Int n) = n
- eval (Bool b) = b
- eval (Plus x y) = eval x + eval y
- eval (Less x y) = eval x < eval y
- eval (If x y z) =
-   if (eval x)
-     then (eval y)
-     else (eval z)
25. Problem: Type Checking
- How do we type pattern matching?
- case x of
-   (Int n) -> . . .
-   (Bool b) -> . . .
- What type is x?
26. Type Checking
- eval :: Term a -> a
- eval (Less x y) = eval x < eval y
- Less :: (a = Bool) => Term Int -> Term Int -> Term Bool
- x :: Term Int
- y :: Term Int
- (eval x) :: Int
- (eval y) :: Int
- (eval x < eval y) :: Bool
Assume a = Bool in this context
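The equality-constrained reading of Term, and the way each branch of eval is checked under its own assumption, can be sketched in Haskell. This is an illustration only, assuming GHC with the GADTs extension; `a ~ Int` plays the role of the slides' `where a = Int`, and the constructors are renamed `IntT` and `BoolT` (names chosen here) to keep them visually distinct from the types.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Ordinary data syntax with explicit equality constraints: every
-- constructor has range `Term a`, plus an assumption about `a`.
data Term a
  = (a ~ Int)  => IntT Int
  | (a ~ Bool) => BoolT Bool
  | (a ~ Int)  => Plus (Term Int) (Term Int)
  | (a ~ Bool) => Less (Term Int) (Term Int)
  | If (Term Bool) (Term a) (Term a)

-- Matching on a constructor brings its equality into scope, so each
-- right-hand side is checked with a more specific result type.
eval :: Term a -> a
eval (IntT n)   = n                  -- checked assuming a ~ Int
eval (BoolT b)  = b                  -- checked assuming a ~ Bool
eval (Plus x y) = eval x + eval y    -- checked assuming a ~ Int
eval (Less x y) = eval x < eval y    -- x, y :: Term Int; a ~ Bool
eval (If c t e) = if eval c then eval t else eval e

main :: IO ()
main = do
  print (eval (Plus (IntT 3) (IntT 4)))
  print (eval (If (Less (IntT 3) (IntT 5)) (BoolT False) (BoolT True)))
```

The ill-formed term from the earlier slide, `If (IntT 3) (IntT 0) (IntT 9)`, is rejected at compile time because the condition would need `Int ~ Bool`.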
27. Basic approach
- Data is a parameterized generalized-algebraic datatype
- It is indexed by some semantic property
- New kinds introduce new types that are used as indexes
- Programs use types to maintain semantic properties
- We construct values that are proofs of these properties
- The equality-constrained types make it possible
28. Constructing proofs
- Suppose we want to read a string from the user, and interpret that string as an expression.
- What if the user types in an expression of the wrong type?
- Build a proof that the term is well typed for the context in which we use it
29.
- data Exp = Eint Int | Ebool Bool | Eplus Exp Exp | Eless Exp Exp | Eif Exp Exp Exp
- test :: IO ()
- test =
-   do text <- readln
-      exp :: Exp <- parse text
-      case typCheck exp of
-        Pair Rint x ->
-          print (show (eval x + 2))
-        Pair Rbool y ->
-          if (eval y)
-            then print "True"
-            else print "False"
-        Fail -> error "Ill typed term"
30. Representation Types
- data Rep t
-   = Rint where t = Int
-   | Rbool where t = Bool
- Rep is a representation type. It is a normal first-class value (at run-time) that represents a static (compile-time) type.
- There is a 1-1 correspondence between Rint and Int, and between Rbool and Bool
- If x :: Rep t, then knowing the shape of x determines its type, and knowing its type determines its shape.
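The claim that the shape of a Rep value determines its type can be sketched in Haskell, assuming GHC with GADTs and the standard `Data.Type.Equality` module from base. The functions `defaultOf`, `sameRep`, and `cast` are hypothetical helpers introduced here; they are not part of the slides.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}
module Main where

import Data.Type.Equality ((:~:)(Refl))

-- Run-time values that stand for compile-time types.
data Rep t where
  Rint  :: Rep Int
  Rbool :: Rep Bool

-- Shape determines type: in the Rint branch the result must be an Int.
defaultOf :: Rep t -> t
defaultOf Rint  = 0
defaultOf Rbool = False

-- Comparing two representations yields evidence that the types agree.
sameRep :: Rep a -> Rep b -> Maybe (a :~: b)
sameRep Rint  Rint  = Just Refl
sameRep Rbool Rbool = Just Refl
sameRep _     _     = Nothing

-- A checked cast: it succeeds only when the representations match.
cast :: Rep a -> Rep b -> a -> Maybe b
cast ra rb x = case sameRep ra rb of
  Just Refl -> Just x
  Nothing   -> Nothing

main :: IO ()
main = do
  print (defaultOf Rint)
  print (cast Rint Rint 5)
  print (cast Rint Rbool 5)
```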
31. Untyped Terms and Judgments
- data Exp
-   = Eint Int
-   | Ebool Bool
-   | Eplus Exp Exp
-   | Eless Exp Exp
-   | Eif Exp Exp Exp
- data Judgment
-   = Fail
-   | exists t . Pair (Rep t) (Term t)
32. Constructing a Proof
- typCheck :: Exp -> Judgment
- typCheck (Eint n) = Pair Rint (Int n)
- typCheck (Ebool b) = Pair Rbool (Bool b)
- typCheck (Eplus x y) =
-   case (typCheck x, typCheck y) of
-     (Pair Rint a, Pair Rint b) -> Pair Rint (Plus a b)
-     _ -> Fail
- typCheck (Eless x y) =
-   case (typCheck x, typCheck y) of
-     (Pair Rint a, Pair Rint b) -> Pair Rbool (Less a b)
-     _ -> Fail
- typCheck (Eif x y z) =
-   case (typCheck x, typCheck y, typCheck z) of
-     (Pair Rbool a, Pair Rint b, Pair Rint c)
-       -> Pair Rint (If a b c)
-     (Pair Rbool a, Pair Rbool b, Pair Rbool c)
-       -> Pair Rbool (If a b c)
-     _ -> Fail
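The proof-constructing type checker above can be transcribed into Haskell as a runnable sketch, assuming GHC with the GADTs extension. The existential in Judgment is expressed with GADT syntax, the Term constructors are renamed `IntT` and `BoolT`, and `describe` is a hypothetical driver added for the example in place of the slides' `test`.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Untyped syntax, as a parser would produce it.
data Exp
  = Eint Int
  | Ebool Bool
  | Eplus Exp Exp
  | Eless Exp Exp
  | Eif Exp Exp Exp

-- Typed terms: only well-typed terms can be constructed.
data Term a where
  IntT  :: Int  -> Term Int
  BoolT :: Bool -> Term Bool
  Plus  :: Term Int -> Term Int -> Term Int
  Less  :: Term Int -> Term Int -> Term Bool
  If    :: Term Bool -> Term a -> Term a -> Term a

data Rep t where
  Rint  :: Rep Int
  Rbool :: Rep Bool

-- Either failure, or a typed term packaged with a witness of its type.
data Judgment where
  Fail :: Judgment
  Pair :: Rep t -> Term t -> Judgment

typCheck :: Exp -> Judgment
typCheck (Eint n)  = Pair Rint (IntT n)
typCheck (Ebool b) = Pair Rbool (BoolT b)
typCheck (Eplus x y) =
  case (typCheck x, typCheck y) of
    (Pair Rint a, Pair Rint b) -> Pair Rint (Plus a b)
    _                          -> Fail
typCheck (Eless x y) =
  case (typCheck x, typCheck y) of
    (Pair Rint a, Pair Rint b) -> Pair Rbool (Less a b)
    _                          -> Fail
typCheck (Eif x y z) =
  case (typCheck x, typCheck y, typCheck z) of
    (Pair Rbool a, Pair Rint b,  Pair Rint c)  -> Pair Rint (If a b c)
    (Pair Rbool a, Pair Rbool b, Pair Rbool c) -> Pair Rbool (If a b c)
    _                                          -> Fail

eval :: Term a -> a
eval (IntT n)   = n
eval (BoolT b)  = b
eval (Plus x y) = eval x + eval y
eval (Less x y) = eval x < eval y
eval (If c t e) = if eval c then eval t else eval e

-- Matching on the Rep tells the checker which Show instance applies.
describe :: Exp -> String
describe e = case typCheck e of
  Pair Rint t  -> "Int: " ++ show (eval t)
  Pair Rbool t -> "Bool: " ++ show (eval t)
  Fail         -> "ill typed"

main :: IO ()
main = mapM_ (putStrLn . describe)
  [ Eplus (Eint 1) (Eint 2)
  , Eless (Eint 1) (Eint 2)
  , Eif (Eint 3) (Eint 0) (Eint 9)
  ]
```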
33. Step 2: Using Staging
- Suppose you are writing a document retrieval system.
- The user types in a query, and you want to retrieve all documents that meet the query.
- The query contains information not known until run-time, but which is constant across all accesses in the document base.
- E.g.
- Width Indent < Depth Keyword Naval
34. Width Indent < Depth Keyword Naval
- If Width and Indent are constant across all queries, but Depth and Keyword are fields of each document
- How can we efficiently build an execution engine that translates the user's query (typed as a String) into executable code?
35. Code in Ωmega
- prompt> [| 5 + 5 |]
- [| 5 + 5 |] : Code Int
- prompt> run [| 5 + 5 |]
- 10 : Int
- prompt> let x = [| 23 |]
- x
- prompt> let y = [| 56 - $x |]
- y
- prompt> y
- [| 56 - 23 |] : Code Int
36. Dynamic values
- data Dyn x
-   = Dint Int where x = Int
-   | Dbool Bool where x = Bool
-   | Dyn (Code x)
- dynamize :: Dyn a -> Code a
- dynamize (Dint n) = lift n
- dynamize (Dbool b) = lift b
- dynamize (Dyn x) = x
37. Translation
- trans :: Term a -> (Dyn Int, Dyn Int) -> Dyn a
- trans (Int n) (x,y) = Dint n
- trans (Bool b) (x,y) = Dbool b
- trans X (x,y) = x
- trans Y (x,y) = y
- trans (Plus a b) xy =
-   case (trans a xy, trans b xy) of
-     (Dint m, Dint n) -> Dint (m+n)
-     (m, n) -> Dyn [| $(dynamize m) + $(dynamize n) |]
- trans (If a b c) xy =
-   case trans a xy of
-     (Dbool test) -> if test then trans b xy else trans c xy
-     (Dyn test) ->
-       Dyn [| if $test
-              then $(dynamize (trans b xy))
-              else $(dynamize (trans c xy)) |]
38. Applying the translation
- -- if 3 < 5 then (x + (5 + 2)) else y
- x1 = If (Less (Int 3) (Int 5))
-         (Plus X (Plus (Int 5) (Int 2)))
-         Y
- w term =
-   [| \ x y ->
-      $(dynamize (trans term
-                   (Dyn [| x |], Dyn [| y |]))) |]
- -- w x1
- -- [| \ x y -> x + 7 |] : Code (Int -> Int -> Int)
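The binding-time idea behind trans can be sketched in plain Haskell. This is an illustration under a strong assumption: Haskell has no run-time code brackets like Ωmega's, so `Code` is modeled here as a small expression tree over two inputs x and y rather than as real generated code. The `Code` constructors, `pretty`, `residual`, and the `Less` case of `trans` are all additions made for this sketch.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Stand-in for Code: a residual expression over the inputs x and y.
data Code a where
  LitI  :: Int  -> Code Int
  LitB  :: Bool -> Code Bool
  VarX  :: Code Int
  VarY  :: Code Int
  AddC  :: Code Int -> Code Int -> Code Int
  LessC :: Code Int -> Code Int -> Code Bool
  IfC   :: Code Bool -> Code a -> Code a -> Code a

data Term a where
  IntT  :: Int  -> Term Int
  BoolT :: Bool -> Term Bool
  X     :: Term Int
  Y     :: Term Int
  Plus  :: Term Int -> Term Int -> Term Int
  Less  :: Term Int -> Term Int -> Term Bool
  If    :: Term Bool -> Term a -> Term a -> Term a

-- A value is either known now (Dint, Dbool) or only as residual code.
data Dyn a where
  Dint  :: Int  -> Dyn Int
  Dbool :: Bool -> Dyn Bool
  Dyn   :: Code a -> Dyn a

dynamize :: Dyn a -> Code a
dynamize (Dint n)  = LitI n
dynamize (Dbool b) = LitB b
dynamize (Dyn c)   = c

-- Compute with whatever is static; emit code for the rest.
trans :: Term a -> (Dyn Int, Dyn Int) -> Dyn a
trans (IntT n)  _      = Dint n
trans (BoolT b) _      = Dbool b
trans X         (x, _) = x
trans Y         (_, y) = y
trans (Plus a b) xy =
  case (trans a xy, trans b xy) of
    (Dint m, Dint n) -> Dint (m + n)
    (m, n)           -> Dyn (AddC (dynamize m) (dynamize n))
trans (Less a b) xy =
  case (trans a xy, trans b xy) of
    (Dint m, Dint n) -> Dbool (m < n)
    (m, n)           -> Dyn (LessC (dynamize m) (dynamize n))
trans (If c t e) xy =
  case trans c xy of
    Dbool test -> if test then trans t xy else trans e xy
    Dyn test   -> Dyn (IfC test (dynamize (trans t xy))
                                (dynamize (trans e xy)))

pretty :: Code a -> String
pretty (LitI n)    = show n
pretty (LitB b)    = show b
pretty VarX        = "x"
pretty VarY        = "y"
pretty (AddC a b)  = "(" ++ pretty a ++ " + " ++ pretty b ++ ")"
pretty (LessC a b) = "(" ++ pretty a ++ " < " ++ pretty b ++ ")"
pretty (IfC c t e) =
  "if " ++ pretty c ++ " then " ++ pretty t ++ " else " ++ pretty e

-- Specialize a term with both inputs left dynamic.
residual :: Term a -> Code a
residual term = dynamize (trans term (Dyn VarX, Dyn VarY))

-- if 3 < 5 then (x + (5 + 2)) else y
x1 :: Term Int
x1 = If (Less (IntT 3) (IntT 5)) (Plus X (Plus (IntT 5) (IntT 2))) Y

main :: IO ()
main = putStrLn (pretty (residual x1))
```

The static test `3 < 5` and the static sum `5 + 2` are evaluated during translation, so only the addition involving x survives in the residual expression, matching the `x + 7` result on the slide.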
39. Our Original Goals
- Build heterogeneous meta-programming systems
  - Meta-language ≠ object-language
  - Type system of the meta-language guarantees semantic properties of object-language
- Experiment with Ωmega
  - Finding new uses for the power of the type system
  - Translating existing language-based ideas into Ωmega
    - staged interpreters
    - proof carrying code
    - language-based security
40. Serendipity
- Ωmega's type system is good for statically guaranteeing all sorts of properties.
  - Lists with statically known length
  - Red-Black Trees
  - Binomial Heaps
  - Dynamic Typing
41. Conclusion
- Stating static properties is a good way to think about programming
  - It may lead to more reliable programs
- The compiler should ensure that programs maintain the stated properties
- Generalizing algebraic datatypes makes it all possible
  - Ranges other than T a
  - a becomes an index describing a static property of x :: T a
  - New kinds let a have arbitrary structure
  - Computing over a is sometimes necessary
42. Related Work
- Inductive Families
  - In type theory -- Peter Dybjer
  - Epigram -- Zhaohui Luo, James McKinna, Paul Callaghan, and Conor McBride
- First-class phantom types -- Cheney and Hinze
- Guarded Recursive Data Types
  - Hongwei Xi and his students
  - Guarded Recursive Datatype Constructors
  - A Typeful Approach to Object-Oriented Programming with Multiple Inheritance
  - Meta-Programming through Typeful Code Representation
  - Constraint-based type inference for guarded algebraic data types -- Vincent Simonet and François Pottier
  - A Systematic Translation of Guarded Recursive Data Types to Existential Types -- Martin Sulzmann
  - Polymorphic typed defunctionalization -- Pottier and Gauthier
  - Towards efficient, typed LR parsers -- Pottier and Régis-Gianas
- First Class Type Equality
  - A Lightweight Implementation of Generics and Dynamics -- Hinze and Cheney
  - Typing Dynamic Typing -- Baars and Swierstra
  - Type-safe cast: functional pearl -- Weirich
- Rogue-Sigma-Pi as a meta-language for LF -- Aaron Stump
- Wobbly types: type inference for generalised algebraic data types -- Peyton Jones, Washburn and Weirich
43. Examples we have done
- Typed, staged interpreters
  - For languages with binding, with patterns, algebraic datatypes
- Type preserving transformations
  - Simplify :: Exp t -> Exp t
  - Cps :: Exp t -> Exp (trans t)
- Proof carrying code
- Data Structures
  - Red-Black trees, Binomial Heaps, Static length lists
- Languages with security properties
- Typed self-describing databases, where meta data in the database describes the database schema
- Programs that slip easily between dynamic and statically typed sections. Type-case is easy to encode with no additional mechanism
44. Some other examples
- Typed Lambda Calculus
- A Language with Security Domains
- A Language which enforces an interaction protocol
45. Typed lambda calculus: Exp with type t in environment s
- data V s t
-   = ex m . Z where s = (t,m)
-   | ex m x . S (V m t) where s = (x,m)
- data Exp s t
-   = IntC Int where t = Int
-   | BoolC Bool where t = Bool
-   | Plus (Exp s Int) (Exp s Int) where t = Int
-   | Lteq (Exp s Int) (Exp s Int) where t = Bool
-   | Var (V s t)
- Example type
- Plus :: forall s t . (t = Int) => Exp s Int -> Exp s Int -> Exp s t
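The environment-indexed variables and expressions above can be sketched in Haskell, assuming GHC with the GADTs extension. Environments are modeled as nested pairs ending in `()`, and `lookupVar`, `eval`, and `env0` are names introduced for this illustration.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- V s t: a variable of type t inside an environment of shape s.
data V s t where
  Z :: V (t, m) t              -- the first slot of the environment
  S :: V m t -> V (x, m) t     -- skip one slot, then look further

-- Exp s t: an expression of type t whose variables live in s.
data Exp s t where
  IntC  :: Int  -> Exp s Int
  BoolC :: Bool -> Exp s Bool
  Plus  :: Exp s Int -> Exp s Int -> Exp s Int
  Lteq  :: Exp s Int -> Exp s Int -> Exp s Bool
  Var   :: V s t -> Exp s t

-- Matching on the variable refines s to a pair, so the lookup
-- needs no run-time check and cannot fail.
lookupVar :: V s t -> s -> t
lookupVar Z     (v, _)    = v
lookupVar (S n) (_, rest) = lookupVar n rest

eval :: Exp s t -> s -> t
eval (IntC n)   _   = n
eval (BoolC b)  _   = b
eval (Plus a b) env = eval a env + eval b env
eval (Lteq a b) env = eval a env <= eval b env
eval (Var v)    env = lookupVar v env

-- An environment with an Int in slot 0 and a Bool in slot 1.
env0 :: (Int, (Bool, ()))
env0 = (3, (True, ()))

main :: IO ()
main = do
  print (eval (Plus (Var Z) (IntC 4)) env0)
  print (eval (Var (S Z)) env0)
  print (eval (Lteq (Var Z) (IntC 2)) env0)
```

A term such as `Plus (Var (S Z)) (IntC 1)` does not type check against `env0`, since slot 1 holds a Bool.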
46. Language with Security Domains: Exp with type t in env s in domain d
- kind Domain = High | Low
- data D t
-   = Lo where t = Low
-   | Hi where t = High
- data Dless x y
-   = LH where x = Low, y = High
-   | LL where x = Low, y = Low
-   | HH where x = High, y = High
- data Exp s d t
-   = Int Int where t = Int
-   | Bool Bool where t = Bool
-   | Plus (Exp s d Int) (Exp s d Int) where t = Int
-   | Lteq (Exp s d Int) (Exp s d Int) where t = Bool
-   | forall d2 . Var (V s d2 t) (Dless d2 d)
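A much simplified Haskell sketch of the domain check follows, assuming GHC with DataKinds and GADTs. To stay short it drops the environment index s and replaces the variables V with a hypothetical `Cell` type holding an Int tagged with its domain; `salary`, `counter`, and the two reports are illustrative names only.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
module Main where

data Domain = High | Low

-- Dless x y: evidence that data from domain x may be used in domain y.
-- There is deliberately no constructor of type Dless 'High 'Low.
data Dless (x :: Domain) (y :: Domain) where
  LH :: Dless 'Low 'High
  LL :: Dless 'Low 'Low
  HH :: Dless 'High 'High

-- A stored integer tagged with its security domain (hypothetical).
newtype Cell (d :: Domain) = Cell Int

-- An expression in domain d may read a cell of domain d2 only when
-- accompanied by a proof that d2 is below or equal to d.
data Exp (d :: Domain) t where
  IntE :: Int -> Exp d Int
  Plus :: Exp d Int -> Exp d Int -> Exp d Int
  Var  :: Cell d2 -> Dless d2 d -> Exp d Int

eval :: Exp d t -> t
eval (IntE n)         = n
eval (Plus a b)       = eval a + eval b
eval (Var (Cell n) _) = n

salary :: Cell 'High
salary = Cell 5000

counter :: Cell 'Low
counter = Cell 7

-- High code may read both high and low data.
highReport :: Exp 'High Int
highReport = Plus (Var salary HH) (Var counter LH)

-- Low code may read only low data; `Var salary ...` cannot be
-- completed here because no proof of Dless 'High 'Low exists.
lowReport :: Exp 'Low Int
lowReport = Plus (Var counter LL) (IntE 1)

main :: IO ()
main = do
  print (eval highReport)
  print (eval lowReport)
```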
47. Language with interaction protocol: Command with store St starting in state x, ending in state y
- kind State = Open | Closed
- data V s t
-   = forall st . Z where s = (t,st)
-   | forall st t1 . S (V st t) where s = (t1,st)
- data Com st x y
-   = forall t . Set (V st t) (Exp st t) where x = y
-   | forall a . Seq (Com st x a) (Com st a y)
-   | If (Exp st Bool) (Com st x y) (Com st x y)
-   | While (Exp st Bool) (Com st x y) where x = y
-   | forall t . Declare (Exp st t) (Com (t,st) x y)
-   | Open where x = Closed, y = Open
-   | Close where x = Open, y = Closed
-   | Write (Exp st Int) where x = Open, y = Open
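The protocol indices can be sketched in Haskell with the store and expression language left out, assuming GHC with DataKinds and GADTs. The command constructors are renamed `OpenC` and `CloseC` because Haskell does not allow them to share names with the promoted `Open` and `Closed` states; `written` and `session` are helpers introduced for this example.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
module Main where

data State = Open | Closed

-- Com x y: a command that starts in state x and ends in state y.
data Com (x :: State) (y :: State) where
  OpenC  :: Com 'Closed 'Open
  CloseC :: Com 'Open 'Closed
  Write  :: Int -> Com 'Open 'Open
  Seq    :: Com x a -> Com a y -> Com x y   -- states must line up

-- Collect the values a command would write, in order.
written :: Com x y -> [Int]
written OpenC     = []
written CloseC    = []
written (Write n) = [n]
written (Seq a b) = written a ++ written b

-- A complete session: open, write twice, close.
session :: Com 'Closed 'Closed
session = OpenC `Seq` Write 1 `Seq` Write 2 `Seq` CloseC

-- Rejected by the type checker, because Write needs an open channel:
-- bad = Write 1 `Seq` OpenC

main :: IO ()
main = print (written session)
```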
48. Contributions
- Manipulating strongly-typed object languages in a semantics-preserving manner
- Implementation of Cheney and Hinze's ideas in a functional programming language
- Demonstration
  - Show some practical techniques
  - Logical frameworks ideas translated into everyday programming idioms