The secret life of typecheckers

Provided by: JoeMor9

1
The secret life of typecheckers
2
Introduction
  • This presentation is modeled on a paper by Luca
    Cardelli (Bell Labs, 1985)
  • A general view of type-checking will be presented
    from the perspective of the programming language
    designer
  • We will explore type systems past, present and
    future

3
A little history
  • Type systems have been around longer than
    computers
  • In the 1920s David Hilbert started a program to
    formalize mathematics as strings of symbols
    manipulated by logic/grammar rules
  • Idea was to be able to mechanically prove
    things
  • Bertrand Russell had already recognized the problems
    with self-reference and, in his earlier theory of
    types, addressed them by assigning entities to types
  • Entities of each type are built up from entities
    of the preceding type
  • In 1931 Kurt Gödel proved that any consistent formal
    system expressive enough to encode arithmetic is
    incomplete, ending Hilbert's program
  • Application to programming languages
  • Computing involves representing and manipulating
    entities as strings of symbols
  • Problems of representation and self-reference
    crop up in numerous ways
  • We want to mechanically prove things about
    programs
  • Types support this

4
What are types, really?
  • Types come into play whenever we have a universe
    of diverse things with a similar representation
  • Bits in a computer's memory
  • XML strings
  • DNA
  • If you consider these things in the absence of a
    type system, you have an untyped universe
  • This means there is really only one type (the
    memory word, the DNA base pair, etc.)

5
Operations in untyped universes
  • Any such universe has various operations that can
    be performed
  • Adding and subtracting (bit strings)
  • Rendering HTML (XML)
  • Transcription/translation (DNA)
  • But these operations are only valid on subsets of
    the untyped universe
  • Some XML strings represent HTML documents and
    some don't
  • Some DNA sequences represent valid genes and some
    don't
  • What happens if you blow it?
  • "Tumbolia, the land of dead hiccups and
    extinguished lightbulbs" (Douglas Hofstadter)
  • The major purpose of a type system is to avoid
    embarrassing questions about representations, and
    to forbid situations where these questions might
    come up (Cardelli)

6
Type-checking and programming languages
  • Type-checking avoids these embarrassing questions
  • Assigns types to constants, operators, variables,
    and functions
  • Checks that every operation is performed on
    inputs of the correct type
  • Accepts programs that can be proven to have no
    type errors
  • Type-checker reads program code and says "ok"
    or "not ok, and here's why"
  • By comparison:
  • An interpreter reads program code and executes
    the instructions
  • A compiler reads program code and translates it
    into a different representation of the same
    program

7
Type systems in programming languages
  • The term type system refers to the range of types
    that can be assigned to variables and values
  • Base types: int, float, double, etc.
  • User-defined types (e.g. classes, parameterized
    types, etc.)
  • Type systems are somewhat arbitrary, and inspired
    largely by the typical instruction sets of modern
    computers
  • You can create different type systems for the
    same language that are more or less expressive
  • Inexpressive type systems are frustrating: they
    either accept too many erroneous programs, or
    forbid too many correct ones
  • Expressive type systems are more precise,
    rejecting as many erroneous programs as possible
    and accepting a greater percentage of correct ones

8
Expressiveness and abstract data types
  • Imagine a type system that supports only the
    types int and object
  • Now you're compiling this function:
  • int foo (object o) { return o.bar(); }
  • Does the type system say yes or no?
  • If yes, we're overly permissive: the
    type-checker doesn't know whether the bar
    method is really available
  • If no, we're overly restrictive
  • The type system needs to be more expressive: it
    needs to include separate types for each class,
    etc.
  • Expressiveness means having a rich language of
    types enabling the type-checker to determine with
    the greatest possible precision whether it should
    accept programs or not
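The point about separate types per class can be sketched in C++. This is only an illustration: the classes `Widget` and `Gadget` and the trait `has_bar` are hypothetical names, not part of the presentation. Once each class has its own type, the checker can decide precisely whether `bar()` is available.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical classes for illustration; neither appears in the slides.
struct Widget { int bar() const { return 42; } };
struct Gadget { };  // deliberately has no bar()

// With a distinct type per class, the checker can see that bar() exists.
int foo(const Widget& o) { return o.bar(); }

// Compile-time query: does calling bar() on a T type-check?
template <typename T, typename = void>
struct has_bar : std::false_type {};

template <typename T>
struct has_bar<T, decltype(void(std::declval<const T&>().bar()))>
    : std::true_type {};
```

Calling `foo` with a `Gadget` is rejected at compile time, and `has_bar<Gadget>::value` is false, so the checker is neither overly permissive nor overly restrictive here.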

9
Inexpressive type systems
[Figure: typesafe programs versus programs with type errors. An inexpressive type system rejects too many typesafe programs and accepts too many unsafe programs.]
10
Expressive type systems
[Figure: typesafe programs versus programs with type errors. An expressive type system does not accept any unsafe programs, and does accept most typesafe programs.]
11
Polymorphism and type inference
  • Polymorphism gives type-checkers an even bigger
    headache
  • Requires a major increase in expressiveness
  • What is the type of a generic List class?
  • What is the type of a generic Sort function?
  • Type checking is simplified by having programmers
    annotate programs with type information
  • However this gets painful as the type system
    becomes expressive
  • Solution is type inference: let the computer
    figure out all the types
  • The goal of type-checking research: maximize the
    expressiveness of type systems while minimizing
    the need for programmers to annotate programs
    with complex type information

12
Examples
  • The best way to explore the subtleties of type
    systems is to work through examples
  • Let's try a few

13
Subtyping
  • class Base {};
  • class Derived : public Base {};
  • void main(char* args[]) {
  •   Base* b = new Derived ();
  •   Derived* d = b;
  • }
  • Is this typesafe?
  • Should it be accepted by the compiler?
  • If you add a dynamic cast (i.e. add further
    annotations to help the compiler), will the
    compiler add a runtime check? Should it?
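One way to explore the dynamic cast question is the sketch below. The class `Other` and the function `extra_or_minus_one` are hypothetical additions for illustration. In C++ the upcast is accepted implicitly, while the downcast needs an explicit `dynamic_cast`, which inserts a runtime check.

```cpp
struct Base { virtual ~Base() = default; };  // a virtual member makes Base polymorphic
struct Derived : Base { int extra = 7; };
struct Other : Base { };                     // hypothetical sibling class

// Upcast (Derived* to Base*): always safe, accepted implicitly.
// Downcast (Base* to Derived*): needs an explicit cast. dynamic_cast checks
// at run time and yields nullptr when the object is not actually a Derived.
int extra_or_minus_one(Base* b) {
    if (Derived* d = dynamic_cast<Derived*>(b)) {
        return d->extra;
    }
    return -1;
}
```

Without the cast, `Derived* d = b;` is rejected by the compiler even when `b` happens to point at a `Derived`, which is the conservative answer to the question on the slide.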

14
Apples and oranges
  • // from one header file
  • struct Apple { int x; };
  • void appleProcessingService (Apple* a);
  • // from another header file
  • struct Orange { int x; };
  • // source file
  • void main(char* args[]) {
  •   appleProcessingService (new Apple());
  •   appleProcessingService (new Orange ());
  • }

15
How about this one?
  • // from header file, US edition of software
  • struct Apple { int x; };
  • void appleProcessingService (Apple* a);
  • // from header file, French edition of software
  • struct Pomme { int x; };
  • // source file
  • void main(char* args[]) {
  •   appleProcessingService (new Apple());
  •   appleProcessingService (new Pomme());
  • }
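The two examples above contrast checking by name with checking by structure. A rough C++ sketch of both styles follows. The template `processAnythingWithX` is a hypothetical name, and the services take references rather than pointers to keep the sketch short.

```cpp
// Same layout, different names, as in the slides' examples.
struct Apple  { int x; };
struct Orange { int x; };

// Nominal check: only the type named Apple is accepted.
// Passing an Orange here is rejected at compile time.
int appleProcessingService(const Apple& a) { return a.x; }

// Structural check: any type with a member x convertible to int is accepted.
// The name processAnythingWithX is hypothetical.
template <typename T>
int processAnythingWithX(const T& t) { return t.x; }
```

Under the nominal rule an `Orange` and a `Pomme` are both rejected. Under the structural rule both are accepted, which is arguably what you want for `Pomme` and not for `Orange`.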

16
Math expressions
  • void main (char* args[]) {
  •   int x = 123;
  •   int y = 234;
  •   int z = x / y;
  • }
  • Is this typesafe? Should it be accepted by the
    compiler?

17
Wouldn't it be cool if
  • We had a rational datatype?
  • void main (char* args[]) {
  •   int w = 123;
  •   int x = 234;
  •   rational y = w / x;
  •   rational z = w * 0.5;
  • }
  • Any problems here?
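One problem can be made concrete: if `w / x` is typed as integer division, the result is already truncated before it reaches the rational variable. The sketch below assumes a hand-rolled `Rational` struct. All names in it are illustrative, and the denominator is assumed to be nonzero.

```cpp
// A minimal rational type; Rational and the helpers are illustrative names.
struct Rational {
    long num;
    long den;  // assumed nonzero; kept positive after normalization
};

long gcd_of(long a, long b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) { long t = a % b; a = b; b = t; }
    return a;
}

// Builds n/d in lowest terms. The caller must not pass d == 0.
Rational make_rational(long n, long d) {
    long g = gcd_of(n, d);
    if (g == 0) g = 1;  // only reachable when n == 0 and d == 0
    if (d < 0) { n = -n; d = -d; }
    return Rational{n / g, d / g};
}

Rational multiply(Rational a, Rational b) {
    return make_rational(a.num * b.num, a.den * b.den);
}
```

Here `123 / 234` on ints is `0`, while `make_rational(123, 234)` keeps the value as 41/78. The `w * 0.5` line raises a second question: the literal `0.5` is a floating-point value, so the type system must decide whether and how it converts to a rational. The sketch sidesteps that by writing one half as `make_rational(1, 2)`.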

18
What kind of error is this?
  • void main (char* args[]) {
  •   int x = 1;
  •   int y = 0;
  •   int z = x / y;
  • }
  • Could type systems help us here?

19
What if we introduced
  • A nonzero datatype?
  • Say the compiler requires the divisor to be of
    type nonzero
  • void main (char* args[]) {
  •   int x = 1;
  •   nonzero y = 0;
  •   int z = x / y;
  • }
  • Good idea? Or not?

20
Fibonacci strikes back
  • Is this typesafe? Could a type-checker prove it?
  • nonzero fib (int x) {
  •   if (x < 2)
  •     return 1;
  •   else
  •     return fib(x-1) + fib(x-2);
  • }
  • How about this?
  • int inputAndParseNumberFromUser ();
  • void main (char* args[]) {
  •   nonzero x = inputAndParseNumberFromUser ();
  • }
  • Options?
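One possible option, sketched here under assumptions rather than taken from the presentation, is to perform the zero check once at the boundary and carry the result in the type. The class `NonZero`, its `make` factory and `divide` are hypothetical names, and `std::optional` requires C++17.

```cpp
#include <optional>

// NonZero is a hypothetical wrapper: the only way to obtain one is through
// make(), which performs the zero check once, at the boundary.
class NonZero {
    int value_;
    explicit NonZero(int v) : value_(v) {}
public:
    static std::optional<NonZero> make(int v) {
        if (v == 0) return std::nullopt;
        return NonZero(v);
    }
    int get() const { return value_; }
};

// The divisor's type guarantees it is not zero, so no zero check is needed here.
int divide(int numerator, NonZero divisor) {
    return numerator / divisor.get();
}
```

A value read from the user would go through `NonZero::make`, and the caller has to handle the empty case explicitly. The type-checker then proves only that the check happened, not that a function like `fib` never returns zero.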

21
User-constructed types
  • Data abstraction implies the ability for
    programmers to create new types
  • How do we express the type of variable foo in
    this example?
  • struct {
  •   int x;
  •   float y;
  • } foo;
  • Type theorists usually write the type something
    like this:
  • (int, float)
  • The type of an array of integers would be
  • int[]
  • An array of arrays of integers would be
  • int[][]
  • The type of a function with an int argument
    returning a float would be
  • int → float

22
User-constructed types
  • The operators (), [], and → are type constructors
  • They take types as arguments and define new types
  • Once you have type constructors, your type system
    can contain as many types as you like
  • Type-checker has to cope with all of this,
    providing a syntax for programmers to write all
    these types if necessary
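For comparison, here is roughly how the same constructed types might be spelled in C++, using standard library templates as the type constructors. The alias names and the function `half` are hypothetical, chosen only for this sketch.

```cpp
#include <functional>
#include <type_traits>
#include <utility>
#include <vector>

// Each alias applies a type constructor to existing types to name a new type.
using Foo        = std::pair<int, float>;       // (int, float)
using IntArray   = std::vector<int>;            // int[]
using IntMatrix  = std::vector<IntArray>;       // int[][]
using IntToFloat = std::function<float(int)>;   // int -> float

// Constructors compose without limit, for example (int[] -> float)[].
using Nested = std::vector<std::function<float(IntArray)>>;

// A function whose type is int -> float.
inline float half(int n) { return n / 2.0f; }
```

Because the constructors nest freely, the set of types is unbounded, and the syntax for writing them grows accordingly.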

23
Polymorphism
  • When introducing polymorphic constructs into the
    language, type constructors are not enough
  • Type of the Length function for arrays of
    integers
  • int[] → int
  • Type of the Length function for arrays of
    anything
  • forall (T) T[] → int
  • Introducing polymorphic types into a type system
    is analogous to introducing functions into a
    programming language
  • The above type could also be written
  • forall (U) U[] → int
  • U is a type variable and forall provides type
    abstraction
  • Use of forall is called universal
    quantification because any type can be plugged in
    to U
  • Polymorphic types can be specialized
  • type V = forall (U) U[] → int
  • type W = V<string>
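A C++ function template gives a rough analogue of this universally quantified type. This is a sketch, with `std::vector` standing in for the array type and with the hypothetical names `length` and `LengthOfStrings`.

```cpp
#include <string>
#include <vector>

// forall (T) T[] -> int : one definition, usable at every element type T.
template <typename T>
int length(const std::vector<T>& items) {
    return static_cast<int>(items.size());
}

// Specializing the polymorphic type at string, like V<string> on the slide.
// The alias name LengthOfStrings is hypothetical.
using LengthOfStrings = int (*)(const std::vector<std::string>&);
const LengthOfStrings lengthOfStrings = &length<std::string>;
```

The template parameter `T` plays the role of the type variable, and naming `length<std::string>` plugs a concrete type into it.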

24
Why have type notation?
  • Why do we feel the need to write out these
    complicated types?
  • If you're writing a function, you only need to
    write the types of the return value and
    arguments, not the function itself
  • Two reasons
  • If you're programming with higher order functions
    (which we'll be doing more of in the future) it's
    helpful to write these types
  • These functions do have types, regardless of
    whether we're writing them out; it would be nice
    to have a standard notation

25
Bounded quantification
  • Bounded quantification is the idea that only
    some types can be plugged in to U
  • For example, if you had a Length function which
    could only be used on arrays of different kinds
    of numbers, you could write
  • type T = forall (U : U < number) U[] → int
  • U is constrained to be a number (or subtype
    thereof)
  • But what if you do this?
  • type V = T<string>
  • Is that a type mismatch?
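In C++ the bound can be approximated with `std::enable_if`. In this sketch `std::is_arithmetic` stands in for the slide's "number" bound, which is an assumption, and `numericLength` and `accepts` are hypothetical names.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// forall (U : U < number) U[] -> int, approximated with std::is_arithmetic
// standing in for the slide's "number" bound (an assumption of this sketch).
template <typename U,
          typename = typename std::enable_if<std::is_arithmetic<U>::value>::type>
int numericLength(const std::vector<U>& items) {
    return static_cast<int>(items.size());
}

// Compile-time probe: does numericLength accept a vector of U?
template <typename U, typename = void>
struct accepts : std::false_type {};

template <typename U>
struct accepts<U, decltype(void(numericLength(
                      std::declval<const std::vector<U>&>())))>
    : std::true_type {};
```

Plugging in `std::string` violates the bound, so `accepts<std::string>::value` is false. The mismatch is between a type and a constraint on types, not between a value and a type.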

26
Types and kinds
  • In the spirit of Russell, computer scientists
    generally like to keep these levels separate.
  • Higher-level types which ensure correctness of
    types are called kinds
  • This level of checking is referred to as kind
    checking
  • There are countless papers floating around with
    titles like
  • "Is type a type?" and "A new programming
    language with type:type"
  • They are exploring the question of whether a type
    system can operate on itself, or whether levels
    should be kept separate.

27
C++ bounded quantification question
  • template <typename T>
  • class Copier {
  •   T myStruct;
  • public:
  •   void copy () { myStruct.x = myStruct.y; }
  • };
  • struct IntPair { int x, y; };
  • struct FloatPair { float x, y; };
  • struct BogusPair { float x; char* y; };
  • void main(char* args[]) {
  •   Copier<IntPair> cip;
  •   cip.copy();
  •   Copier<FloatPair> cfp;
  •   cfp.copy();
  • }
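The bound on `T` in this example is implicit: it exists only as the body of `copy()`. One way to state it explicitly is sketched below. It assumes that `BogusPair`'s `y` member is a `char*`, and the trait `has_assignable_xy` is a hypothetical name.

```cpp
#include <type_traits>
#include <utility>

struct IntPair   { int x, y; };
struct FloatPair { float x, y; };
struct BogusPair { float x; char* y; };  // assumption: y is a char pointer

// Trait expressing the implicit bound: "T has members x and y, and y can be
// assigned to x". The name has_assignable_xy is hypothetical.
template <typename T, typename = void>
struct has_assignable_xy : std::false_type {};

template <typename T>
struct has_assignable_xy<
    T, decltype(void(std::declval<T&>().x = std::declval<T&>().y))>
    : std::true_type {};

template <typename T>
class Copier {
    static_assert(has_assignable_xy<T>::value,
                  "Copier<T> requires T::x = T::y to type-check");
    T myStruct{};
public:
    void copy() { myStruct.x = myStruct.y; }
};
```

With the `static_assert`, `Copier<BogusPair>` is rejected as soon as it is instantiated, with a message naming the violated bound, rather than only when `copy()` is first used.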

28
Existential quantification
  • The type of a function that takes an array of T
    and returns an integer, for some single type T
  • exists(T) T[] → int
  • At this point we have implicitly defined a type T
  • We know nothing about type T, except that a
    function of the above type could take a list of
    them and return an int
  • T is intuitively a little like a class
  • It is a type, and we don't know anything about
    how it works, but we know a way in which we can
    use it
  • Universal and bounded quantification provide the
    theoretical basis for parameterized types
  • Existential quantification provides the
    theoretical basis for information hiding
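Type erasure in C++ is often described as an encoding of existential types, and it gives a feel for the information-hiding claim. The sketch below is one such encoding under that assumption. The class name `IntFromHiddenArray` is hypothetical.

```cpp
#include <functional>
#include <string>
#include <vector>

// A value of type "exists(T) T[] -> int": the element type T is fixed when the
// package is built, then hidden. Clients can only run the packaged function.
// The name IntFromHiddenArray is hypothetical.
class IntFromHiddenArray {
    std::function<int()> run_;  // closes over the hidden T[] and its function
public:
    template <typename T, typename F>
    IntFromHiddenArray(std::vector<T> items, F f)
        : run_([items, f]() { return f(items); }) {}

    int run() const { return run_(); }
};
```

Two packages built from a `std::vector<int>` and a `std::vector<std::string>` have the same type, and client code cannot recover which `T` is inside. It knows only the one way it is allowed to use it.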

29
Type inference
  • As type systems become more complicated, it
    becomes more burdensome for programmers to write
    out types
  • Would you write expressions like this?
  • forall (U : U < number) U[] → int
    myFunction ()
  • No, you would just avoid higher order functions
  • The solution is type inference

30
Type inference
  • Allows programmers to omit type declarations and
    have the compiler infer them
  • Promises all of the expressiveness of dynamic
    languages, but with static type safety
  • Research in this area has come a long way but
    there are still valid, type-safe programs which
    type inference engines cannot handle
  • Rudimentary type inference (local variables) is
    coming in C# 3.0
  • Given that type theory experts like Simon
    Peyton Jones are at Microsoft we can expect to
    see this area of .NET evolve rapidly
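Local-variable inference of this rudimentary kind can be illustrated in C++ with `auto`, used here as an analogue rather than the .NET feature itself. The function name `sum_first_two` is hypothetical.

```cpp
#include <type_traits>
#include <vector>

// Local type inference: the compiler infers each declared type from the
// initializer, so no annotation is written.
inline int sum_first_two() {
    std::vector<int> values{10, 20, 30};
    auto first = values[0];        // inferred as int
    auto it = values.begin() + 1;  // inferred as std::vector<int>::iterator
    static_assert(std::is_same<decltype(first), int>::value, "first is int");
    static_assert(std::is_same<decltype(it), std::vector<int>::iterator>::value,
                  "it is an iterator");
    return first + *it;
}
```

The inferred types are still checked statically. The programmer just no longer has to spell them out.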

31
Ideas for the future
  • Continue improving type system expressiveness and
    type inference engines
  • How about having the type-checker interact with
    the programmer? e.g.
  • Can I assume this will always be an odd number?
  • Can I assume that no instances of this class are
    constructed outside of this source tree?
  • How about monitoring running programs to generate
    better type annotations for use in future
    compilations?
  • How about a graphical interface for creating and
    manipulating type information?

32
Conclusion
  • Type-checking is not a simple, tidy field
  • It's a matter of tradeoffs and judgment
  • More expressiveness means that programming
    languages can become more powerful and
    polymorphic without compromising type safety
  • However, more expressiveness means more pain for
    programmers
  • Working with higher-order functions is great, but
    not if you have to type 10 lines of type
    declarations for each line of code
  • Type inference is a promising solution
  • Holy grail is to provide all the power of dynamic
    languages like Lisp, Python, and Ruby with the
    type safety of C and no need to write a single
    type declaration

33
The dreaded monad question
  • Parameterized interface (two type parameters)
  • Any monad type m<a,b> must support methods
  • >>= (bind): m<a> → (a → m<b>) → m<b>
  • >> (sequence): m<a> → m<b> → m<b>
  • return: a → m<a>
  • fail: String → m<a>
  • A kind of type framework that can represent
    sequencing, non-determinism, and other concepts
  • Sequencing
  • writeFile "testFile.txt" "Hello File" >> putStr
    "Hello World"
  • The parameterized type Nullable also turns out to
    be a monad
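To make the Nullable claim concrete, here is a minimal sketch of `return` and bind for a hand-written `Nullable`. It is not the .NET `Nullable` API. The names `unit`, `none`, `bind_nullable` and `safeDivide` are all hypothetical.

```cpp
// A minimal Nullable in the spirit of the slide; all names here are
// illustrative and this is not the .NET Nullable API.
template <typename T>
struct Nullable {
    bool hasValue;
    T value;
};

// return : a -> m<a>
template <typename T>
Nullable<T> unit(T v) { return Nullable<T>{true, v}; }

template <typename T>
Nullable<T> none() { return Nullable<T>{false, T()}; }

// bind : m<a> -> (a -> m<b>) -> m<b>
// Runs f only when a value is present; otherwise the "null" propagates.
template <typename A, typename F>
auto bind_nullable(const Nullable<A>& m, F f) -> decltype(f(m.value)) {
    using Result = decltype(f(m.value));
    if (!m.hasValue) return Result{false, {}};
    return f(m.value);
}

// Example step that can fail: division yields "null" on a zero divisor.
Nullable<int> safeDivide(int a, int b) {
    return b == 0 ? none<int>() : unit(a / b);
}
```

Chaining steps through `bind_nullable` sequences them and stops at the first missing value, which is the sequencing behavior the slide attributes to monads.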