Scrap your boilerplate: generic programming in Haskell

1
Scrap your boilerplate: generic programming in Haskell
  • Ralf Lämmel, Vrije University
  • Simon Peyton Jones, Microsoft Research

2
The problem: boilerplate code
[Figure: a company tree. Node labels: Company; Dept "Research"; Dept "Production"; Manager (twice); Dept "Devt"; Dept "Manuf"; Employee; Bill (15k); Fred (10k).]
Find all people in the tree and increase their salary by 10%
3
The problem: boilerplate code
  • data Company = C [Dept]
  • data Dept = D Name Manager [SubUnit]
  • data SubUnit = PU Employee | DU Dept
  • data Employee = E Person Salary
  • data Person = P Name Address
  • data Salary = S Float
  • type Manager = Employee
  • type Name = String
  • type Address = String

incSal :: Float -> Company -> Company
4
The problem: boilerplate code
  • incSal :: Float -> Company -> Company
  • incSal k (C ds) = C (map (incD k) ds)
  • incD :: Float -> Dept -> Dept
  • incD k (D n m us) = D n (incE k m) (map (incU k) us)
  • incU :: Float -> SubUnit -> SubUnit
  • incU k (PU e) = PU (incE k e)
  • incU k (DU d) = DU (incD k d)
  • incE :: Float -> Employee -> Employee
  • incE k (E p s) = E p (incS k s)
  • incS :: Float -> Salary -> Salary
  • incS k (S f) = S (k*f)


5
Boilerplate is bad
  • Boilerplate is tedious to write
  • Boilerplate is fragile: it needs to be changed when
    the data type changes (schema evolution)
  • Boilerplate obscures the key bits of code

6
Getting rid of boilerplate
  • Use an un-typed language, with a fixed collection
    of data types
  • Convert to a universal type and write (untyped)
    traversals over that
  • Use reflection to query types and traverse
    child nodes

7
Getting rid of boilerplate
  • Generic (aka polytypic) programming: define a
    function by induction over the (structure of the)
    type of its argument
  • PhD required. Elegant only for totally generic
    functions (read, show, equality)

generic inc<t> :: Float -> t -> t
inc<1> k Unit = Unit
inc<a+b> k (Inl x) = Inl (inc<a> k x)
inc<a+b> k (Inr y) = Inr (inc<b> k y)
inc<a*b> k (x, y) = (inc<a> k x, inc<b> k y)
8
Our solution
  • Generic programming for the rest of us
  • Typed language
  • Works for arbitrary data types: parameterised,
    mutually recursive, nested...
  • No encoding to/from some other type
  • Very modest language support
  • Elegant application of Haskell's type classes

9
Our solution
  • incSal :: Float -> Company -> Company
  • incSal k = everywhere (mkT (incS k))
  • incS :: Float -> Salary -> Salary
  • incS k (S f) = S (k*f)

10
Two ingredients
  • incSal :: Float -> Company -> Company
  • incSal k = everywhere (mkT (incS k))
  • incS :: Float -> Salary -> Salary
  • incS k (S f) = S (k*f)

1. Build the function to apply to every node, from incS
2. Apply a function to every node in the tree
11
Type classes
member :: a -> [a] -> Bool
member x [] = False
member x (y:ys) | x==y = True
                | otherwise = member x ys
No! member is not truly polymorphic: it does not
work for any type a, only for those on which
equality is defined.
12
Type classes
member :: Eq a => a -> [a] -> Bool
member x [] = False
member x (y:ys) | x==y = True
                | otherwise = member x ys
The class constraint "Eq a" says that member only
works on types that belong to class Eq.
13
Type classes
class Eq a where
  (==) :: a -> a -> Bool
instance Eq Int where
  (==) i1 i2 = eqInt i1 i2
instance (Eq a) => Eq [a] where
  (==) [] [] = True
  (==) (x:xs) (y:ys) = (x == y) && (xs == ys)
  (==) xs ys = False
member :: Eq a => a -> [a] -> Bool
member x [] = False
member x (y:ys) | x==y = True
                | otherwise = member x ys
14
Implementing type classes
data Eq a = MkEq (a->a->Bool)
eq (MkEq e) = e
dEqInt :: Eq Int
dEqInt = MkEq eqInt
dEqList :: Eq a -> Eq [a]
dEqList (MkEq e) = MkEq el
  where el [] [] = True
        el (x:xs) (y:ys) = x `e` y && xs `el` ys
        el xs ys = False
member :: Eq a -> a -> [a] -> Bool
member d x [] = False
member d x (y:ys) | eq d x y = True
                  | otherwise = member d x ys
Class witnessed by a dictionary of methods
Instance declarations create dictionaries
Overloaded functions take extra dictionary
parameter(s)
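The dictionary translation on this slide can be tried directly in ordinary Haskell. The following is a minimal sketch, not what GHC literally generates: the dictionary type is renamed EqD (an assumed name, chosen so it does not clash with the Prelude's Eq class), and the built-in (==) on Int stands in for eqInt.

```haskell
-- A dictionary of methods for equality at type a.
newtype EqD a = MkEq (a -> a -> Bool)

-- Method selector: pull the equality function out of the dictionary.
eq :: EqD a -> a -> a -> Bool
eq (MkEq e) = e

-- "instance Eq Int" becomes a dictionary value.
dEqInt :: EqD Int
dEqInt = MkEq (==)  -- built-in Int equality stands in for eqInt

-- "instance Eq a => Eq [a]" becomes a function from dictionaries to dictionaries.
dEqList :: EqD a -> EqD [a]
dEqList (MkEq e) = MkEq el
  where
    el []     []     = True
    el (x:xs) (y:ys) = x `e` y && xs `el` ys
    el _      _      = False

-- The overloaded function takes the dictionary as an explicit extra argument.
member :: EqD a -> a -> [a] -> Bool
member _ _ [] = False
member d x (y:ys)
  | eq d x y  = True
  | otherwise = member d x ys
```

For example, member (dEqList dEqInt) [1,2] [[1],[1,2]] builds the list dictionary from the Int dictionary before searching.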
15
Ingredient 1: type extension
  • (mkT f) is a function that:
  • behaves just like f on arguments whose type is
    compatible with f's,
  • behaves like the identity function on all other
    arguments
  • So applying (mkT (incS k)) to all nodes in the
    tree will do what we want.

16
Type safe cast
cast :: (Typeable a, Typeable b) => a -> Maybe b
ghci> (cast 'a') :: Maybe Char
Just 'a'
ghci> (cast 'a') :: Maybe Bool
Nothing
ghci> (cast True) :: Maybe Bool
Just True
17
Type extension
mkT :: (Typeable a, Typeable b) => (a->a) -> (b->b)
mkT f = case cast f of
          Just g -> g
          Nothing -> id
ghci> (mkT not) True
False
ghci> (mkT not) 'a'
'a'
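The mkT definition on this slide runs as written against the cast exported by Data.Typeable in GHC's base library. A minimal sketch:

```haskell
import Data.Typeable (Typeable, cast)

-- Lift a type-specific transformation to one that accepts any Typeable value.
mkT :: (Typeable a, Typeable b) => (a -> a) -> (b -> b)
mkT f = case cast f of
  Just g  -> g   -- a and b coincide, so f can be used at type b
  Nothing -> id  -- otherwise leave the argument unchanged
```

Note that the cast is applied to the function f itself, not to its argument: it succeeds exactly when the type a -> a is the same as b -> b.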
18
Implementing cast
An Int, perhaps
data TypeRep
instance Eq TypeRep
mkRep :: String -> [TypeRep] -> TypeRep
class Typeable a where
  typeOf :: a -> TypeRep
instance Typeable Int where
  typeOf i = mkRep "Int" []
Guaranteed not to evaluate its argument
19
Implementing cast
class Typeable a where
  typeOf :: a -> TypeRep
instance (Typeable a, Typeable b) => Typeable (a,b) where
  typeOf p = mkRep "(,)" [ta,tb]
    where ta = typeOf (fst p)
          tb = typeOf (snd p)
20
Implementing cast
cast :: (Typeable a, Typeable b) => a -> Maybe b
cast x = r
  where r = if typeOf x == typeOf (get r)
              then Just (unsafeCoerce x)
              else Nothing
        get :: Maybe a -> a
        get x = undefined
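The same trick can be reproduced with the typeOf that Data.Typeable provides today. This is a sketch under that assumption, with the function renamed castSketch (a hypothetical name) so it does not shadow the library's own cast, which is implemented differently in current base.

```haskell
import Data.Typeable (Typeable, typeOf)
import Unsafe.Coerce (unsafeCoerce)

-- Compare the run-time type representations of the argument and of the
-- requested result; coerce only when they are equal.
castSketch :: (Typeable a, Typeable b) => a -> Maybe b
castSketch x = r
  where
    -- r refers to itself only to name the result type: typeOf never
    -- evaluates its argument, so (get r) is never forced.
    r = if typeOf x == typeOf (get r)
          then Just (unsafeCoerce x)
          else Nothing

    get :: Maybe c -> c
    get _ = undefined
```

The unsafeCoerce is only reached after the two TypeReps have been found equal, which is why the slide that follows can call cast sound as long as Typeable instances are generated by the compiler.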
21
Implementing cast
  • In GHC:
  • Typeable instances are generated automatically by
    the compiler for any data type
  • The definition of cast is in a library
  • Then cast is sound
  • Bottom line: cast is best thought of as a
    language extension, but it is an easy one to
    implement. All the hard work is done by type
    classes

22
Two ingredients
  • incSal :: Float -> Company -> Company
  • incSal k = everywhere (mkT (incS k))
  • incS :: Float -> Salary -> Salary
  • incS k (S f) = S (k*f)

1. Build the function to apply to every node, from incS
2. Apply a function to every node in the tree
23
Ingredient 2: traversal
  • Step 1: implement one-layer traversal
  • Step 2: extend one-layer traversal to recursive
    traversal of the entire tree

24
One-layer traversal
  • class Typeable a => Data a where
  •   gmapT :: (forall b. Data b => b -> b) -> a -> a
  • instance Data Int where
  •   gmapT f x = x
  • instance (Data a, Data b) => Data (a,b) where
  •   gmapT f (x,y) = (f x, f y)

(gmapT f x) applies f to the IMMEDIATE CHILDREN
of x
25
One-layer traversal
  • class Typeable a => Data a where
  •   gmapT :: (forall b. Data b => b -> b) -> a -> a
  • instance (Data a) => Data [a] where
  •   gmapT f [] = []
  •   gmapT f (x:xs) = f x : f xs -- !!!

gmapT's argument is a polymorphic function, so
gmapT has a rank-2 type
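To see what "immediate children" means in practice, here is a self-contained sketch that declares its own small Data class with the instances from these two slides. It deliberately does not import the real Data.Data class, and the helper bump is a hypothetical generic function written for this illustration.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Typeable (Typeable, cast)

-- A cut-down version of the class on the slide (not the one in Data.Data).
class Typeable a => Data a where
  gmapT :: (forall b. Data b => b -> b) -> a -> a

instance Data Int where
  gmapT _ x = x                 -- an Int has no children

instance (Data a, Data b) => Data (a, b) where
  gmapT f (x, y) = (f x, f y)   -- both components are children

instance Data a => Data [a] where
  gmapT _ []     = []
  gmapT f (x:xs) = f x : f xs   -- the head AND the tail are the children

-- Hypothetical generic function: add one to Ints, leave everything else alone.
bump :: Data b => b -> b
bump x = case cast ((+ 1) :: Int -> Int) of
  Just g  -> g x
  Nothing -> x
```

With these definitions, gmapT bump [1, 2, 3 :: Int] gives [2, 2, 3]: the head is an Int and is bumped, while the tail is a child of type [Int], which bump leaves unchanged. Reaching the remaining elements needs the recursive traversal of Step 2.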
26
Step 2: Now traversals are easy!
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)
27
Many different traversals!
everywhere, everywhere' :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x) -- Bottom up
everywhere' f x = gmapT (everywhere' f) (f x) -- Top down
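Putting the two ingredients together, the salary example can be run against the Data class and gmapT that ship in GHC's base library (Data.Data), with instances obtained by deriving. This is a sketch under those assumptions: mkT and the two traversals are defined locally as on the slides, and the sample company genCom, including its addresses, is made up for illustration.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, gmapT)
import Data.Typeable (Typeable, cast)

data Company  = C [Dept]                 deriving (Show, Eq, Data)
data Dept     = D Name Manager [SubUnit] deriving (Show, Eq, Data)
data SubUnit  = PU Employee | DU Dept    deriving (Show, Eq, Data)
data Employee = E Person Salary          deriving (Show, Eq, Data)
data Person   = P Name Address           deriving (Show, Eq, Data)
data Salary   = S Float                  deriving (Show, Eq, Data)
type Manager  = Employee
type Name     = String
type Address  = String

mkT :: (Typeable a, Typeable b) => (a -> a) -> (b -> b)
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

-- Bottom up: transform the children first, then the node itself.
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)

-- Top down: transform the node first, then its (new) children.
everywhere' :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere' f x = gmapT (everywhere' f) (f x)

incS :: Float -> Salary -> Salary
incS k (S f) = S (k * f)

incSal :: Float -> Company -> Company
incSal k = everywhere (mkT (incS k))

-- Hypothetical sample data; names follow the figure, addresses are invented.
genCom :: Company
genCom = C [D "Research" (E (P "Bill" "Oxford") (S 15000))
              [PU (E (P "Fred" "London") (S 10000))]]
```

As on the slides, incS multiplies by k, so a 10% raise would be incSal 1.1 genCom. For this particular transformation the bottom-up and top-down traversals give the same result, because incS never creates or removes Salary nodes.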
28
More perspicuous types
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere :: (forall b. Data b => b -> b) -> (forall a. Data a => a -> a)
type GenericT = forall a. Data a => a -> a
everywhere :: GenericT -> GenericT
Aha!
29
What is "really going on"?
  • inc :: Data t => Float -> t -> t
  • The magic of type classes passes an extra
    argument to inc that contains:
  • The function gmapT
  • The function typeOf
  • A call of (mkT incS), done at every node in the tree,
    entails a comparison of the TypeRep returned by
    the passed-in typeOf with a fixed TypeRep for
    Salary: this is precisely a dynamic type check

30
Summary so far
  • Solution consists of:
  • A little user-written code
  • Mechanically generated instances for Typeable and
    Data for each data type
  • A library of combinators (cast, mkT, everywhere,
    etc)
  • Language support:
  • cast
  • rank-2 types
  • Efficiency is so-so (factor of 2-3 with no effort)

31
Summary so far
  • Robust to data type evolution
  • Works easily for weird data types

data Rose a = MkR a [Rose a]
instance (Data a) => Data (Rose a) where
  gmapT f (MkR x rs) = MkR (f x) (f rs)
data Flip a b = Nil | Cons a (Flip b a)
-- Etc...
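As a quick check that nothing special is needed for such types, the Rose tree can be given a derived instance of the Data class from GHC's base library and traversed with the same combinators. A minimal sketch, with mkT and everywhere defined locally as on the earlier slides and incAll a hypothetical example function:

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, gmapT)
import Data.Typeable (Typeable, cast)

-- The instance is derived rather than written by hand as on the slide.
data Rose a = MkR a [Rose a] deriving (Show, Eq, Data)

mkT :: (Typeable a, Typeable b) => (a -> a) -> (b -> b)
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)

-- Hypothetical example: add one to every Int label in the tree.
incAll :: Rose Int -> Rose Int
incAll = everywhere (mkT ((+ 1) :: Int -> Int))
```

The traversal passes through the list of subtrees without any list-specific code, because lists already have a Data instance.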
32
Generalisations
  • With this same language support, we can do much
    more
  • generic queries
  • generic monadic operations
  • generic folds
  • generic zips (e.g. equality)

33
Generic queries
  • Add up the salaries of all the employees in the
    tree

salaryBill :: Company -> Float
salaryBill = everything (+) (0 `mkQ` billS)
billS :: Salary -> Float
billS (S f) = f
1. Build the function to apply to every node,
from billS
2. Apply the function to every node in the tree,
and combine results with (+)
34
Type extension again
mkQ :: (Typeable a, Typeable b) => d -> (b->d) -> a -> d
(d `mkQ` q) a = case cast a of
                  Just b -> q b
                  Nothing -> d
ghci> (22 `mkQ` ord) 'a'
97
ghci> (22 `mkQ` ord) True
22
Apply 'q' if its type fits, otherwise return 'd'
ord :: Char -> Int
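The query version of type extension also runs unchanged on top of Data.Typeable's cast. A minimal sketch reproducing the two ghci interactions above:

```haskell
import Data.Char (ord)
import Data.Typeable (Typeable, cast)

-- Apply the query q when the argument has the type q expects;
-- otherwise fall back to the default d.
mkQ :: (Typeable a, Typeable b) => d -> (b -> d) -> a -> d
(d `mkQ` q) a = case cast a of
  Just b  -> q b
  Nothing -> d

charCode :: Int
charCode = (22 `mkQ` ord) 'a'    -- the argument is a Char, so ord applies: 97

fallback :: Int
fallback = (22 `mkQ` ord) True   -- the argument is a Bool, so the default is used: 22
```

Unlike mkT, the cast here is applied to the argument rather than to the function, since the result type d is fixed in advance.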
35
Traversal again
class Typeable a => Data a where
  gmapT :: (forall b. Data b => b -> b) -> a -> a
  gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]
Apply a function to all children of this node,
and collect the results in a list
36
Traversal again
class Typeable a => Data a where
  gmapT :: (forall b. Data b => b -> b) -> a -> a
  gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]
instance Data Int where
  gmapQ f x = []
instance (Data a, Data b) => Data (a,b) where
  gmapQ f (x,y) = f x : f y : []
37
The query traversal
everything :: Data a => (r->r->r) -> (forall b. Data b => b -> r) -> a -> r
everything k f x = foldl k (f x) (gmapQ (everything k f) x)
Note that foldr vs foldl is in the traversal, not
gmapQ
38
Looking for one result
  • By making the result type be (Maybe r), we can
    find the first (or last) satisfying value
    (thanks to laziness)

findDept :: String -> Company -> Maybe Dept
findDept s = everything orElse (Nothing `mkQ` findD s)
findD :: String -> Dept -> Maybe Dept
findD s d@(D s' _ _) = if s==s' then Just d
                       else Nothing
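The query slides can be assembled into one runnable sketch on top of the gmapQ that GHC's base library exports from Data.Data. The definitions of mkQ, everything, salaryBill and findDept follow the slides; orElse is written out here as an assumption about what the slide intends (take the first Just), and the sample company genCom, including its addresses, is invented for illustration.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, gmapQ)
import Data.Typeable (Typeable, cast)

data Company  = C [Dept]                 deriving (Show, Eq, Data)
data Dept     = D Name Manager [SubUnit] deriving (Show, Eq, Data)
data SubUnit  = PU Employee | DU Dept    deriving (Show, Eq, Data)
data Employee = E Person Salary          deriving (Show, Eq, Data)
data Person   = P Name Address           deriving (Show, Eq, Data)
data Salary   = S Float                  deriving (Show, Eq, Data)
type Manager  = Employee
type Name     = String
type Address  = String

mkQ :: (Typeable a, Typeable b) => d -> (b -> d) -> a -> d
(d `mkQ` q) a = maybe d q (cast a)

-- Query every node and combine the results with k, left to right.
everything :: Data a => (r -> r -> r) -> (forall b. Data b => b -> r) -> a -> r
everything k f x = foldl k (f x) (gmapQ (everything k f) x)

salaryBill :: Company -> Float
salaryBill = everything (+) (0 `mkQ` billS)

billS :: Salary -> Float
billS (S f) = f

-- Assumed definition: prefer the left result when it is a Just.
orElse :: Maybe a -> Maybe a -> Maybe a
orElse (Just x) _ = Just x
orElse Nothing  y = y

findDept :: String -> Company -> Maybe Dept
findDept s = everything orElse (Nothing `mkQ` findD s)

findD :: String -> Dept -> Maybe Dept
findD s d@(D s' _ _) = if s == s' then Just d else Nothing

-- Hypothetical sample data; names follow the figure, addresses are invented.
devt :: Dept
devt = D "Devt" (E (P "Fred" "London") (S 10000)) []

genCom :: Company
genCom = C [D "Research" (E (P "Bill" "Oxford") (S 15000)) [DU devt]]
```

Here salaryBill genCom adds the two salaries to give 25000, and findDept "Devt" genCom returns the nested department, while a name that does not occur yields Nothing.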
39
Monadic transforms
class Typeable a => Data a where
  gmapT :: (forall b. Data b => b -> b) -> a -> a
  gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]
  gmapM :: Monad m => (forall b. Data b => b -> m b) -> a -> m a
  • Uh oh! Where do we stop?

40
Where do we stop?
  • Happily, we can generalise all three gmaps into
    one

data Employee = E Person Salary
instance Data Employee where
  gfoldl k z (E p s) = (z E `k` p) `k` s
  • We can define gmapT, gmapQ, gmapM in terms of
    (suitably parameterised) gfoldl
  • The type of gfoldl hurts the brain (but the
    definitions are all easy)

41
Where do we stop?
class Typeable a => Data a where
  gfoldl :: (forall a b. Data a => c (a -> b) -> a -> c b)
         -> (forall g. g -> c g)
         -> a -> c a
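To make the "suitably parameterised" remark concrete, here is a sketch that recovers gmapT and gmapQ from the gfoldl exported by Data.Data in GHC's base library, by choosing different type constructors for c. The primed names, the Q newtype and the local mkT helper are this sketch's own; base defines its versions somewhat differently.

```haskell
{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
import Data.Data (Data, gfoldl, showConstr, toConstr)
import Data.Functor.Identity (Identity (..))
import Data.Typeable (Typeable, cast)

-- gmapT: take c to be Identity, and rebuild the constructor application
-- with f applied to each child.
gmapT' :: Data a => (forall b. Data b => b -> b) -> a -> a
gmapT' f x = runIdentity (gfoldl k Identity x)
  where
    k :: Data d => Identity (d -> e) -> d -> Identity e
    k (Identity c) y = Identity (c (f y))

-- gmapQ: take c to be a constant functor that only accumulates results.
-- The second type parameter is a phantom required by gfoldl's type.
newtype Q r a = Q [r]

gmapQ' :: forall a r. Data a => (forall b. Data b => b -> r) -> a -> [r]
gmapQ' f x = case gfoldl k z x of
  Q rs -> reverse rs   -- results were consed on, so restore left-to-right order
  where
    k :: Data d => Q r (d -> e) -> d -> Q r e
    k (Q rs) y = Q (f y : rs)
    z :: g -> Q r g
    z _ = Q []

-- Type extension helper, as on the earlier slide, used in the examples below.
mkT :: (Typeable a, Typeable b) => (a -> a) -> (b -> b)
mkT f = case cast f of
  Just g  -> g
  Nothing -> id
```

For instance, gmapT' (mkT not) (Just True, True) changes only the second component, because Just True is a child of type Maybe Bool, and gmapQ' (showConstr . toConstr) (Just True, False) lists the constructor names of the two children. A gmapM can be obtained the same way by taking c to be the monad itself.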
42
But we still can't do show!
  • Want show :: Data a => a -> String

show :: Data a => a -> String
show t = ??? ++ concat (gmapQ show t)
show the children and concatenate the results
But how to show the constructor?
43
Add more to class Data
class Data a where
  toConstr :: a -> Constr
data Constr -- abstract
conString :: Constr -> String
conFixity :: Constr -> Fixity
  • Very like typeOf :: Typeable a => a -> TypeRep,
    except only for data types, not functions

44
So here is show
show :: Data a => a -> String
show t = conString (toConstr t) ++ concat (gmapQ show t)
  • Simple refinements to deal with parentheses,
    infix constructors etc
  • toConstr on a primitive type (like Int) yields a
    Constr whose conString displays the value

45
Further generic functions
  • read :: Data a => String -> a
  • toBin :: Data a => a -> [Bit]
  • fromBin :: Data a => [Bit] -> a
  • testGen :: Data a => RandomGen -> a

class Data a where
  toConstr :: a -> Constr
  fromConstr :: Constr -> a
  dataTypeOf :: a -> DataType
data DataType -- Abstract
stringCon :: DataType -> String -> Maybe Constr
indexCon :: DataType -> Int -> Constr
dataTypeCons :: DataType -> [Constr]
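The constructor-level reflection on this slide is what makes read-like functions possible. In current GHC the corresponding operations in Data.Data appear under slightly different names: this sketch assumes readConstr and dataTypeConstrs play the roles of stringCon and dataTypeCons. The helpers constructorNames and fromName are hypothetical, and fromName is only meaningful for constructors without fields.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Data (Data, dataTypeConstrs, dataTypeOf, fromConstr, readConstr, showConstr)

-- List the constructor names of the argument's data type.
-- The argument itself is not inspected beyond its type.
constructorNames :: Data a => a -> [String]
constructorNames x = map showConstr (dataTypeConstrs (dataTypeOf x))

-- Build a value from a constructor name, if the type has such a constructor.
-- Only sensible for nullary constructors: fields would be left undefined.
fromName :: forall a. Data a => String -> Maybe a
fromName s = fromConstr <$> readConstr (dataTypeOf (undefined :: a)) s
```

For example, constructorNames True lists "False" and "True", and fromName "True" :: Maybe Bool yields Just True, which is the first step of a generic read.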
46
Conclusions
  • Simple, elegant
  • Modest language extensions
  • Rank-2 types
  • Auto-generation of Typeable, Data instances
  • Fully implemented in GHC
  • Shortcomings
  • Stop conditions
  • Types are a bit uninformative

Paper: http://research.microsoft.com/~simonpj