The secret life of typecheckers


Slides: 31

Provided by: JoeMor9

Learn more at: http://www.joemorrison.org


Transcript and Presenter's Notes


1
The secret life of typecheckers
2
Introduction

This presentation is modeled on a paper by Luca
Cardelli (Bell Labs, 1985)
A general view of type-checking will be presented
from the perspective of the programming language
designer
We will explore type systems past, present and
future

3
A little history

Type systems have been around longer than
computers
In the 1920s David Hilbert started a program to
formalize mathematics as strings of symbols
manipulated by logic/grammar rules
The idea was to be able to mechanically prove
things
Bertrand Russell understood the problems with
self-reference and approached Hilbert's challenge
by assigning entities to types
Entities of each type are built up from entities
of the preceding type
In 1931 Kurt Gödel proved that consistent systems
of sufficient complexity are incomplete, ending
Hilbert's program
Application to programming languages
Computing involves representing and manipulating
entities as strings of symbols
Problems of representation and self-reference
crop up in numerous ways
We want to mechanically prove things about
programs
Types support this

4
What are types, really?

Types come into play whenever we have a universe
of diverse things with a similar representation
Bits in a computer's memory
XML strings
DNA
If you consider these things in the absence of a
type system, you have an untyped universe
This means there is really only one type (the
memory word, the DNA base pair, etc.)

5
Operations in untyped universes

Any such universe has various operations that can
be performed
Adding and subtracting (bit strings)
Rendering HTML (XML)
Transcription/translation (DNA)
But these operations are only valid on subsets of
the untyped universe
Some XML strings represent HTML documents and
some don't
Some DNA sequences represent valid genes and some
don't
What happens if you blow it?
"Tumbolia, the land of dead hiccups and
extinguished lightbulbs" (Douglas Hofstadter)
"The major purpose of a type system is to avoid
embarrassing questions about representations, and
to forbid situations where these questions might
come up" (Cardelli)

6
Type-checking and programming languages

Type-checking avoids these embarrassing questions
Assigns types to constants, operators, variables,
and functions
Checks that every operation is performed on
inputs of the correct type
Accepts programs that can be proven to have no
type errors
A type-checker reads program code and says "ok"
or "not ok, and here's why"
By comparison:
An interpreter reads program code and executes
the instructions
A compiler reads program code and translates it
into a different representation of the same
program
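
The three steps listed above (assign types, check each operation, accept or reject) can be sketched for a toy expression language. This is only an illustration: the language, the names Type, Kind, Expr, leaf, node and typecheck are all invented here and do not come from the presentation or from Cardelli's paper.

```cpp
#include <memory>
#include <optional>
#include <utility>

// A toy expression language, invented purely for illustration.
enum class Type { Int, Bool };
enum class Kind { IntLit, BoolLit, Add, Less };

struct Expr {
    Kind kind;
    std::shared_ptr<Expr> lhs;  // null for literals
    std::shared_ptr<Expr> rhs;  // null for literals
};
using ExprPtr = std::shared_ptr<Expr>;

inline ExprPtr leaf(Kind k) {
    return std::make_shared<Expr>(Expr{k, nullptr, nullptr});
}
inline ExprPtr node(Kind k, ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{k, std::move(l), std::move(r)});
}

// Says "ok" by returning the expression's type and "not ok" by returning
// nullopt. Nothing is executed and nothing is translated.
inline std::optional<Type> typecheck(const ExprPtr& e) {
    switch (e->kind) {
        case Kind::IntLit:  return Type::Int;
        case Kind::BoolLit: return Type::Bool;
        case Kind::Add:
        case Kind::Less: {
            // Both operators require int operands.
            const auto l = typecheck(e->lhs);
            const auto r = typecheck(e->rhs);
            if (l != Type::Int || r != Type::Int) return std::nullopt;
            return e->kind == Kind::Add ? Type::Int : Type::Bool;
        }
    }
    return std::nullopt;  // not reached for well-formed trees
}
```

A real checker would also report why a program was rejected; this sketch only returns the verdict.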

7
Type systems in programming languages

The term type system refers to the range of types
that can be assigned to variables and values
Base types: int, float, double, etc.
User-defined types (e.g. classes, parameterized
types, etc.)
Type systems are somewhat arbitrary, and inspired
largely by the typical instruction sets of modern
computers
You can create different type systems for the
same language that are more or less expressive
Inexpressive type systems are frustrating: they
either accept too many erroneous programs, or
forbid too many correct ones
Expressive type systems are more precise,
rejecting as many erroneous programs as possible
and accepting a greater percentage of correct ones

8
Expressiveness and abstract data types

Imagine a type system that supports only the
types int and object
Now you're compiling this function:
    int foo (object o) {
        return o.bar();
    }
Does the type system say yes or no?
If yes, we're overly permissive: the
type-checker doesn't know whether the bar
method is really available
If no, we're overly restrictive
The type system needs to be more expressive: it
needs to include separate types for each class,
etc.
Expressiveness means having a rich language of
types enabling the type-checker to determine with
the greatest possible precision whether it should
accept programs or not
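
As a rough C++ illustration of the point above: once each class has its own static type, the checker can decide precisely whether bar is available. Widget, Gadget and has_bar are hypothetical names introduced for this sketch only.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical classes; only Widget provides bar().
struct Widget {
    constexpr int bar() const { return 42; }
};
struct Gadget {};

// The parameter type names a specific class, so the call to bar() can be
// verified before the program ever runs.
constexpr int foo(const Widget& w) { return w.bar(); }

// Compile-time query: does T have a callable bar() member?
template <typename T, typename = void>
struct has_bar : std::false_type {};

template <typename T>
struct has_bar<T, std::void_t<decltype(std::declval<const T&>().bar())>>
    : std::true_type {};

static_assert(has_bar<Widget>::value, "Widget has bar()");
static_assert(!has_bar<Gadget>::value, "Gadget has no bar()");
```

Passing a Gadget to foo is rejected at compile time, while a Widget is accepted; neither verdict is possible with only int and object.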

9
Inexpressive type systems
(Diagram labels: typesafe programs; programs with type errors)
Rejects too many typesafe programs
Accepts too many unsafe programs
10
Expressive type systems
(Diagram labels: typesafe programs; programs with type errors)
Does not accept any unsafe programs, and does
accept most typesafe programs
11
Polymorphism and type inference

Polymorphism gives type-checkers an even bigger
headache
Requires a major increase in expressiveness
What is the type of a generic List class?
What is the type of a generic Sort function?
Type checking is simplified by having programmers
annotate programs with type information
However this gets painful as the type system
becomes expressive
The solution is type inference: let the computer
figure out all the types
The goal of type-checking research: maximize the
expressiveness of type systems while minimizing
the need for programmers to annotate programs
with complex type information

12
Examples

The best way to explore the subtleties of type
systems is to work through examples
Let's try a few

13
Subtyping

class Base {};
class Derived : public Base {};
void main(char[] args) {
    Base b = new Derived ();
    Derived d = b;
}
Is this typesafe?
Should it be accepted by the compiler?
If you add a dynamic cast (i.e. add further
annotations to help the compiler), will the
compiler add a runtime check? Should it?
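
One way the dynamic-cast question plays out in C++, as a sketch only: the slide's pseudocode is adapted to pointers, Base is given a virtual destructor so that run-time type information is available, and as_derived is a hypothetical helper name.

```cpp
#include <type_traits>

// A virtual member is needed so dynamic_cast can inspect the dynamic type.
struct Base {
    virtual ~Base() = default;
};
struct Derived : Base {};

// Downcast with a run-time check: yields nullptr when b does not actually
// point at a Derived.
inline Derived* as_derived(Base* b) {
    return dynamic_cast<Derived*>(b);
}

// Statically, the implicit conversion only goes one way.
static_assert(std::is_convertible<Derived*, Base*>::value,
              "upcast is implicit");
static_assert(!std::is_convertible<Base*, Derived*>::value,
              "downcast needs an explicit cast");
```

Here the unannotated downcast is rejected at compile time, and the annotated one is accepted at the price of a check performed while the program runs.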

14
Apples and oranges

// from one header file
struct Apple {
    int x;
};
void appleProcessingService (Apple a);
// from another header file
struct Orange {
    int x;
};
// source file
void main(char[] args) {
    appleProcessingService (new Apple());
    appleProcessingService (new Orange ());
}

15
How about this one?

// from header file, US edition of software
struct Apple {
    int x;
};
void appleProcessingService (Apple a);
// from header file, French edition of software
struct Pomme {
    int x;
};
// source file
void main(char[] args) {
    appleProcessingService (new Apple());
    appleProcessingService (new Pomme());
}
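
The two fruit examples contrast matching types by name with matching them by structure. A sketch of both policies in C++ follows; the signatures are adapted from the slides (taking a reference and returning the field), and anyFruitService is a hypothetical addition.

```cpp
#include <type_traits>

struct Apple  { int x; };
struct Orange { int x; };  // same layout as Apple, different name

// Nominal policy: only a type named Apple is accepted.
constexpr int appleProcessingService(const Apple& a) { return a.x; }

// Structural policy: anything that has a member x is accepted.
template <typename Fruit>
constexpr int anyFruitService(const Fruit& f) { return f.x; }

static_assert(!std::is_same<Apple, Orange>::value,
              "identical layout does not make the types equal");
static_assert(!std::is_convertible<Orange, Apple>::value,
              "an Orange is not silently usable as an Apple");
```

Under the nominal policy the Orange call is rejected, which also rejects Pomme even though it was presumably meant to be the same thing; the structural policy accepts all three.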

16
Math expressions

void main (char[] args) {
    int x = 123;
    int y = 234;
    int z = x / y;
}
Is this typesafe? Should it be accepted by the
compiler?
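
For reference, this is what C++ does with the same expression; the variable q is added here for comparison and is not part of the slide.

```cpp
constexpr int x = 123;
constexpr int y = 234;
constexpr int z = x / y;                          // int / int truncates
constexpr double q = static_cast<double>(x) / y;  // roughly 0.5256

static_assert(z == 0, "the fractional part is discarded");
```

The program is accepted and is typesafe in the checker's sense, yet the result is probably not what the author intended.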

17
Wouldn't it be cool if

We had a rational datatype?
void main (char[] args) {
    int w = 123;
    int x = 234;
    rational y = w / x;
    rational z = w + 0.5;
}
Any problems here?
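
One possible problem, sketched in C++ with a hypothetical Rational struct (not a standard type): if w / x is typed as an int division, the fraction is lost before the conversion to rational ever happens.

```cpp
// Hypothetical exact-fraction type, for illustration only.
struct Rational {
    int num;
    int den;
};

// Keeps both parts of the fraction.
constexpr Rational make_rational(int n, int d) { return Rational{n, d}; }

// Converts an int that has already been computed: the denominator is 1.
constexpr Rational from_int(int n) { return Rational{n, 1}; }

constexpr int w = 123;
constexpr int x = 234;
constexpr Rational lossy = from_int(w / x);      // int division first: 0/1
constexpr Rational exact = make_rational(w, x);  // 123/234
```

Whether the division is typed from its operands or from the declared type of the result is a type-system design decision, which is the kind of subtlety the slide is probing.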

18
What kind of error is this?

void main (char[] args) {
    int x = 1;
    int y = 0;
    int z = x / y;
}
Could type systems help us here?

19
What if we introduced

A nonzero datatype?
Say the compiler requires the divisor to be of
type nonzero
void main (char[] args) {
    int x = 1;
    nonzero y = 0;
    int z = x / y;
}
Good idea? Or not?

20
Fibonacci strikes back

Is this typesafe? Could a type-checker prove it?
nonzero fib (int x) {
    if (x < 2)
        return 1;
    else
        return fib(x-1) + fib(x-2);
}
How about this?
int inputAndParseNumberFromUser ();
void main (char[] args) {
    nonzero x = inputAndParseNumberFromUser ();
}
Options?
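
One option among several, sketched in C++: make the conversion from int to nonzero an explicit, checked step, so the run-time test happens once at the boundary and division itself needs no check. NonZero, make_nonzero and divide are hypothetical names for this sketch.

```cpp
#include <optional>

// A value of this type is never zero, because the only way to obtain one
// is through make_nonzero.
class NonZero {
public:
    constexpr int value() const { return v_; }
    friend constexpr std::optional<NonZero> make_nonzero(int v);

private:
    constexpr explicit NonZero(int v) : v_(v) {}
    int v_;
};

// The single place where zero is tested for.
constexpr std::optional<NonZero> make_nonzero(int v) {
    if (v == 0) return std::nullopt;
    return NonZero(v);
}

// The divisor's type already guarantees that the division is defined.
constexpr int divide(int numerator, NonZero divisor) {
    return numerator / divisor.value();
}
```

This does not prove that fib returns a nonzero value; it only moves the question to the point where an int of unknown origin enters the typed world.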

21
User-constructed types

Data abstraction implies the ability for
programmers to create new types
How do we express the type of variable foo in
this example?
struct {
    int x;
    float y;
} foo;
Type theorists usually write the type something
like this:
(int, float)
The type of an array of integers would be
int[]
An array of arrays of integers would be
int[][]
The type of a function with an int argument
returning a float would be
int → float

22
User-constructed types

The operators (), [], and → are type constructors
They take types as arguments and define new types
Once you have type constructors, your type system
can contain as many types as you like
The type-checker has to cope with all of this,
providing a syntax for programmers to write all
these types if necessary
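
For comparison, C++ library templates can play the role of these type constructors. The alias names below are invented for this sketch, and the mapping to the slide's notation is approximate.

```cpp
#include <functional>
#include <tuple>
#include <type_traits>
#include <vector>

using Pair       = std::tuple<int, float>;         // (int, float)
using IntArray   = std::vector<int>;               // int[]
using IntMatrix  = std::vector<std::vector<int>>;  // int[][]
using IntToFloat = std::function<float(int)>;      // int → float

// Constructors take types as arguments, so they compose without limit:
// an array of functions from pairs to arrays of int.
using Composite = std::vector<std::function<IntArray(Pair)>>;

static_assert(std::is_same<IntMatrix, std::vector<IntArray>>::value,
              "applying the array constructor twice");
```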

23
Polymorphism

When introducing polymorphic constructs into the
language, type constructors are not enough
Type of the Length function for arrays of
integers
int[] → int
Type of the Length function for arrays of
anything
forall (T) T[] → int
Introducing polymorphic types into a type system
is analogous to introducing functions into a
programming language
The above type could also be written
forall (U) U[] → int
U is a type variable and forall provides type
abstraction
Use of forall is called universal
quantification because any type can be plugged in
to U
Polymorphic types can be specialized
type V = forall (U) U[] → int
type W = V<string>
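
A C++ function template is a rough counterpart of the universally quantified type above. In this sketch std::vector stands in for the array type, and length, StringLength and string_length are names chosen here.

```cpp
#include <string>
#include <vector>

// forall (T) T[] → int: one definition, usable at every element type T.
template <typename T>
int length(const std::vector<T>& xs) {
    return static_cast<int>(xs.size());
}

// Specialization, in the spirit of "type W = V<string>": plugging in a
// type for T yields an ordinary, non-polymorphic function type.
using StringLength = int (*)(const std::vector<std::string>&);
const StringLength string_length = &length<std::string>;
```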

24
Why have type notation?

Why do we feel the need to write out these
complicated types?
If you're writing a function, you only need to
write the types of the return value and
arguments, not the function itself
Two reasons:
If you're programming with higher-order functions
(which we'll be doing more of in the future) it's
helpful to write these types
These functions do have types, regardless of
whether we're writing them out; it would be nice
to have a standard notation

25
Bounded quantification

Bounded quantification is the idea that only
some types can be plugged in to U
For example, if you had a Length function which
could only be used on arrays of different kinds
of numbers, you could write
type T = forall (U : U < number) U[] → int
U is constrained to be a number (or subtype
thereof)
But what if you do this?
type V = T<string>
Is that a type mismatch?
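
A bound of this kind can be approximated in C++ with std::enable_if. The sketch assumes that "number" means a built-in arithmetic type; numeric_length and accepts are hypothetical names.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Bound: U must be a built-in arithmetic type.
template <typename U,
          typename = typename std::enable_if<
              std::is_arithmetic<U>::value>::type>
int numeric_length(const std::vector<U>& xs) {
    return static_cast<int>(xs.size());
}

// Compile-time probe: may U be plugged in?
template <typename U, typename = void>
struct accepts : std::false_type {};

template <typename U>
struct accepts<U, decltype(void(numeric_length(
                      std::declval<const std::vector<U>&>())))>
    : std::true_type {};

static_assert(accepts<int>::value, "int satisfies the bound");
static_assert(!accepts<std::string>::value, "string violates the bound");
```

In this encoding the string case is rejected before the program runs, by a check that operates on types rather than on values.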

26
Types and kinds

In the spirit of Russell, computer scientists
generally like to keep these levels separate.
Higher-level types which ensure correctness of
types are called kinds
This level of checking is referred to as kind
checking
There are countless papers floating around with
titles like
"Is type a type?" and "A new programming
language with type type"
They are exploring the question of whether a type
system can operate on itself, or whether levels
should be kept separate.

27
C++ bounded quantification question

template <typename T>
class Copier {
    T myStruct;
public:
    void copy () { myStruct.x = myStruct.y; }
};
struct IntPair { int x, y; };
struct FloatPair { float x, y; };
struct BogusPair { float x; char y; };
void main(char[] args) {
    Copier<IntPair> cip;
    cip.copy();
    Copier<FloatPair> cfp;
    cfp.copy();
}
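
A sketch of what is going on underneath: the template parameter T carries no declared bound, so the requirement that T has fields x and y is only discovered when copy is instantiated. NoFields, copyable_fields and demo are hypothetical additions, and myStruct is value-initialized here to keep the sketch well defined.

```cpp
#include <type_traits>
#include <utility>

template <typename T>
class Copier {
    T myStruct{};
public:
    void copy() { myStruct.x = myStruct.y; }
};

struct IntPair { int x, y; };
struct NoFields {};  // has neither x nor y

// The requirement that copy() silently places on T, written as a probe.
template <typename T, typename = void>
struct copyable_fields : std::false_type {};

template <typename T>
struct copyable_fields<
    T, decltype(void(std::declval<T&>().x = std::declval<T&>().y))>
    : std::true_type {};

inline void demo() {
    Copier<IntPair> cip;
    cip.copy();            // accepted
    Copier<NoFields> bad;  // also accepted: copy() is never instantiated
    (void)bad;
    // bad.copy();         // would be rejected, with an error inside Copier
}
```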

28
Existential quantification

The type of a function that takes an array of T
and returns an integer, for some single type T
exists(T) T[] → int
At this point we have implicitly defined a type T
We know nothing about type T, except that
a function of the above type could take a list of
them and return an int
T is intuitively a little like a class
It is a type, and we don't know anything about
how it works, but we know a way in which we can
use it
Universal and bounded quantification provide the
theoretical basis for parameterized types
Existential quantification provides the
theoretical basis for information hiding
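
The information-hiding reading can be approximated in C++ by type erasure. In this sketch the hypothetical class Measurable packages an array of some T together with a function on it, and clients can use the package without ever learning what T is.

```cpp
#include <functional>
#include <utility>
#include <vector>

class Measurable {
public:
    // T and F are known only while the package is being built.
    template <typename T, typename F>
    Measurable(std::vector<T> items, F f)
        : run_([items = std::move(items), f = std::move(f)]() {
              return f(items);
          }) {}

    // The only way clients can use the hidden T.
    int measure() const { return run_(); }

private:
    std::function<int()> run_;  // T appears nowhere in the stored type
};
```

Two packages built from different element types have the same static type, Measurable, which is what lets the element type stay hidden.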

29
Type inference

As type systems become more complicated, it
becomes more burdensome for programmers to write
out types
Would you write expressions like this?
forall (U : U < number) U[] → int
myFunction ()
No, you would just avoid higher-order functions
The solution is type inference

30
Type inference

Allows programmers to omit type declarations and
have the compiler infer them
Promises all of the expressiveness of dynamic
languages, but with static type safety
Research in this area has come a long way but
there are still valid, type-safe programs which
type inference engines cannot handle
Rudimentary type inference (local variables) is
coming in .NET 3.0
Given that type theory experts like Simon
Peyton Jones are at Microsoft we can expect to
see this area of .NET evolve rapidly
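
Local type inference of this rudimentary kind also exists in C++ as auto. The sketch below uses a hypothetical function, sum_of_squares, in which no local variable carries a written type.

```cpp
#include <type_traits>
#include <vector>

inline int sum_of_squares(const std::vector<int>& xs) {
    auto square = [](auto v) { return v * v; };  // generic lambda
    auto total = 0;                              // inferred as int
    for (auto x : xs) {                          // inferred as int
        total += square(x);
    }
    return total;
}

// decltype exposes what the compiler inferred.
static_assert(std::is_same<decltype(1 + 2.5), double>::value,
              "int + double is inferred as double");
```

Everything is still checked statically; only the annotations have been left out.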

31
Ideas for the future

Continue improving type system expressiveness and
type inference engines
How about having the type-checker interact with
the programmer? e.g.
Can I assume this will always be an odd number?
Can I assume that no instances of this class are
constructed outside of this source tree?
How about monitoring running programs to generate
better type annotations for use in future
compilations?
How about a graphical interface for creating and
manipulating type information?

32
Conclusion

Type-checking is not a simple, tidy field
It's a matter of tradeoffs and judgment
More expressiveness means that programming
languages can become more powerful and
polymorphic without compromising type safety
However, more expressiveness means more pain for
programmers
Working with higher-order functions is great, but
not if you have to type 10 lines of type
declarations for each line of code
Type inference is a promising solution
The holy grail is to provide all the power of dynamic
languages like Lisp, Python, and Ruby with the
type safety of C and no need to write a single
type declaration

33
The dreaded monad question

Parameterized interface (two type parameters)
Any monad type m<a,b> must support methods
>>= (bind): m<a> → (a → m<b>) → m<b>
>> (sequence): m<a> → m<b> → m<b>
return: a → m<a>
fail: String → m<a>
A kind of type framework that can represent
sequencing, non-determinism, and other concepts
Sequencing
writeFile "testFile.txt" "Hello File" >> putStr
"Hello World"
The parameterized type Nullable also turns out to
be a monad
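
To make the Nullable remark concrete, here is a sketch that uses std::optional as a stand-in for Nullable. The names unit, bind_opt and half are hypothetical; unit plays the role of return, and bind_opt the role of bind.

```cpp
#include <optional>

// "return": wrap a plain value.
template <typename A>
constexpr std::optional<A> unit(A a) {
    return std::optional<A>(a);
}

// Bind: run f on the contained value, or propagate emptiness.
template <typename A, typename F>
constexpr auto bind_opt(const std::optional<A>& m, F f) -> decltype(f(*m)) {
    if (!m) return std::nullopt;
    return f(*m);
}

// A step that can fail: only even numbers can be halved exactly.
constexpr std::optional<int> half(int n) {
    if (n % 2 != 0) return std::nullopt;
    return n / 2;
}
```

Chaining bind_opt sequences the steps and stops at the first failure, so the caller never has to write the intermediate null checks.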