Title: Java Generics Robert Corky Cartwright Rice University 19 Jan 2005
1 Java Generics
Robert Corky Cartwright, Rice University, 19 Jan 2005
2 Motivation
- In 1998, Java represented a quantum leap forward in mainstream programming technology. How could the PL community make it better?
- Enriching the data model and associated type system: adding genericity, in the terminology of OO language designers.
3 Rules of the Game for Extending Java
- Upward compatibility: old program binaries behave as before (excluding programs that make extensive use of reflection).
- No changes to the JVM (except libraries).
- Interoperability between old and new code.
- Extension through revised compiler (javac) and class loader.
- Coherence with the existing language design.
4 Potential changes
- Moving primitive types into the object type hierarchy (not done correctly in C#).
- Parametric polymorphism for classes, interfaces, and methods.
- Full first-class genericity: allow type variables as superclasses, supporting abstraction with respect to the superclass of a class definition (OO mixins).
5 Blueprint for Extending Java
- Design a coherent extension of the existing language.
- Implement the extension without changing the JVM, including run-time libraries (except additions).
- Extensions must be supported entirely by the source language compiler (javac) and extensions to the class loader.
- Ensure that the overhead is low.
- Key tricks:
  - A source program may generate extra class files (precedent: inner classes).
  - Class files may be augmented by new attributes.
6 Adding Genericity (Parametric Polymorphism)
- What is parametric polymorphism? The parameterization of types, e.g., adding a parameter to the List type so that List<Integer> designates a list containing only integers.
- In Java, coherence is a challenging problem:
  - Array types are already generic with co-variant subtyping (Integer[] is a subtype of Number[]).
  - Co-variant subtyping conflicts with flexible static type checking (updates are contra-variant).
  - Supporting generic run-time types requires significant new execution machinery.
  - Container classes should easily migrate to corresponding generic classes, e.g., Vector -> Vector<T>.
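The conflict between co-variant array subtyping and updates can be seen in a few lines of ordinary Java. The sketch below is only an illustration; the class name CovarianceDemo and its methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class CovarianceDemo {
    /** Returns true if the JVM rejects storing a Double through a Number[] view of an Integer[]. */
    static boolean arrayStoreFails() {
        Integer[] ints = new Integer[1];
        Number[] nums = ints;                  // accepted: Integer[] is a subtype of Number[]
        try {
            nums[0] = Double.valueOf(3.14);    // type-checks statically, fails the run-time store check
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    /** The generic counterpart: List<Integer> holds only integers, with no casts at use sites. */
    static int sum(List<Integer> xs) {
        int total = 0;
        for (int x : xs) total += x;
        return total;
    }

    public static void main(String[] args) {
        List<Integer> xs = new ArrayList<Integer>();
        xs.add(1);
        xs.add(2);
        // List<Number> ys = xs;               // rejected by javac: generic types are not co-variant
        System.out.println(arrayStoreFails());
        System.out.println(sum(xs));
    }
}
```

The array update is only caught at run time, which is why arrays need a per-store check; the generic list avoids the problem by rejecting the co-variant assignment statically.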
7 Generics in Java 1.5 (Odersky, Wadler, et al.)
- Any class or method can be parameterized by type, which introduces type variables just as lambda introduces data variables in Scheme.
  - class C<T> { /* T can be used almost anywhere an ordinary type is used */ }
  - class D { <T> T first(List<T>) /* The scope of T is the method definition */ }
- Each type parameter has an upper bound (Object by default) specified by an extends clause, e.g., class E<T extends Number>.
- Type parameters are non-variantly subtyped, e.g., Vector<Number> is unrelated to Vector<Integer>.
- Parametric classes and methods are implemented using type erasure: every reference to a generic type variable is replaced by its bound. All of the instantiations of a parametric (generic) class are implemented by a single erased class. Similarly, all of the instantiations of a polymorphic method are implemented by a single erased method.
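The points on this slide can be sketched in a small program. The names ErasureBasics, Total, and sameErasedClass are hypothetical and exist only for this illustration; first mirrors the method signature shown on the slide.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Vector;

class ErasureBasics {
    /** Polymorphic method: the scope of T is this method definition. */
    static <T> T first(List<T> xs) {
        return xs.get(0);
    }

    /** Bounded type parameter: inside the class, T may be used as a Number. */
    static class Total<T extends Number> {
        private double sum = 0.0;
        void add(T x) { sum += x.doubleValue(); }
        double sum() { return sum; }
    }

    /** Both instantiations of Vector are implemented by one erased class. */
    static boolean sameErasedClass() {
        Vector<Integer> vi = new Vector<Integer>();
        Vector<Number> vn = new Vector<Number>();
        // Vector<Number> bad = vi;   // rejected by javac: the two types are unrelated
        return vi.getClass() == vn.getClass();
    }

    public static void main(String[] args) {
        System.out.println(first(Arrays.asList("a", "b")));
        Total<Integer> t = new Total<Integer>();
        t.add(2);
        t.add(3);
        System.out.println(t.sum());
        System.out.println(sameErasedClass());
    }
}
```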
8 Understanding Type Erasure
- In essence, type erasure translates parameterized code to the standard idiom used to simulate genericity in ordinary Java, e.g., Vector<Integer> -> Vector, augmented by casts where required; these generated casts never fail.
- Technical complication: the compiler must generate bridge methods to connect parametric and erased signatures for a method. The parametric signature appears in byte code when a class A extends an instantiated generic class B<E>, e.g.,
  class Environment extends Vector<Binding> { public Binding elementAt(int i) { ... } }
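The bridge method for the Environment example can be observed with reflection. This is a minimal sketch: the body of Binding, the wrapper class BridgeDemo, and the helper methods are assumptions made for the illustration; only the Environment declaration comes from the slide.

```java
import java.lang.reflect.Method;
import java.util.Vector;

class BridgeDemo {
    /** Hypothetical element type; the slide does not show its definition. */
    static class Binding {
        final String name;
        Binding(String name) { this.name = name; }
    }

    static class Environment extends Vector<Binding> {
        @Override
        public Binding elementAt(int i) { return super.elementAt(i); }
    }

    /** Counts the elementAt methods declared in Environment that are (or are not) bridges. */
    static int countElementAt(boolean bridge) {
        int n = 0;
        for (Method m : Environment.class.getDeclaredMethods()) {
            if (m.getName().equals("elementAt") && m.isBridge() == bridge) n++;
        }
        return n;
    }

    /** A call through the superclass type uses the erased signature and reaches the override via the bridge. */
    static String lookupThroughSupertype() {
        Environment env = new Environment();
        env.add(new Binding("x"));
        Vector<Binding> v = env;
        return v.elementAt(0).name;
    }

    public static void main(String[] args) {
        System.out.println(countElementAt(false));
        System.out.println(countElementAt(true));
        System.out.println(lookupThroughSupertype());
    }
}
```

Environment declares the parametric signature Binding elementAt(int); javac adds a synthetic method with the erased signature Object elementAt(int) that forwards to it.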
9 What Java 5.0 Generics Omit
- Absence of run-time types is inconsistent with naked type parameters and the built-in array type:
  - new T() and array creations such as new T[n] are all invalid.
- Absence of run-time types is inconsistent with the run-time type tests provided by Java:
  - instanceof Vector<T> is invalid.
  - Casts to (Vector<T>) and (T) cannot be checked at run time (javac accepts them only with an unchecked warning).
- Exception types cannot be parametric.
- Per-class-instantiation static fields are not an option.
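A short sketch of what the missing run-time types mean in practice. The class OmissionsDemo and its methods are hypothetical names for this illustration; the forms rejected by javac appear only in comments.

```java
import java.util.Vector;

class OmissionsDemo {
    /** The cast to Vector<T> is erased to a cast to Vector, so T is never checked. */
    @SuppressWarnings("unchecked")
    static <T> Vector<T> castTo(Object o) {
        // if (o instanceof Vector<T>) ...   // rejected by javac
        // T fresh = new T();                // rejected by javac
        return (Vector<T>) o;
    }

    /** Returns true if the type error surfaces only later, at a compiler-generated cast. */
    static boolean errorDetectedLate() {
        Vector<String> strings = new Vector<String>();
        strings.add("hello");
        Vector<Integer> ints = castTo(strings);   // succeeds: nothing checks <Integer> here
        try {
            Integer i = ints.get(0);               // the generated cast to Integer fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(errorDetectedLate());
    }
}
```

Generated casts never fail in code that compiles without unchecked warnings; the failure above is possible only because the unchecked cast was suppressed.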
10 Do Run-time Generic Types Matter?
Yes. It is awkward to code around the absence of:
- Isolated parametric allocation (hacked APIs): new T(), new T[n], ...
- Parametric casts (JSR14): (T) ..., (T[]) ..., (Vector<T>) ...
- Instantiated casts (cloning, integration of legacy code): (Vector<Integer>) ..., (List<Number>) ...
- Per-class-instantiation static fields (singletons!)
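One common way to code around these gaps in plain Java 5 is to pass an explicit Class<T> object. The sketch below is one such workaround, not an API from the talk; TokenWorkaround and its method names are invented for the illustration.

```java
import java.lang.reflect.Array;

class TokenWorkaround<T> {
    private final Class<T> type;   // explicit run-time stand-in for the erased T

    TokenWorkaround(Class<T> type) { this.type = type; }

    /** Stands in for "new T()"; assumes T has an accessible no-argument constructor. */
    T create() {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stands in for "new T[n]". */
    @SuppressWarnings("unchecked")
    T[] newArray(int n) {
        return (T[]) Array.newInstance(type, n);
    }

    /** Stands in for "o instanceof T". */
    boolean isInstance(Object o) { return type.isInstance(o); }

    /** Stands in for a checked "(T) o". */
    T cast(Object o) { return type.cast(o); }

    public static void main(String[] args) {
        TokenWorkaround<StringBuilder> w = new TokenWorkaround<StringBuilder>(StringBuilder.class);
        System.out.println(w.create().length());
        System.out.println(w.newArray(3).getClass().getSimpleName());
        System.out.println(w.isInstance("text"));
    }
}
```

Every client has to thread the extra argument through its constructors, which is the kind of API distortion the slide refers to.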
11 Co-variant Wildcard Types
- New form of parameterized type that allows a wildcard (?) as a type argument in a parameterized type, e.g., Vector<?>.
- Every usage of the wildcard operator has an upper bound (Object by default).
- Contra-variant form is analogous but rarely used.
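A brief sketch of both wildcard forms. WildcardDemo, sum, and fill are hypothetical names for this illustration.

```java
import java.util.Vector;

class WildcardDemo {
    /** Co-variant form: accepts Vector<Integer>, Vector<Double>, Vector<Number>, ... */
    static double sum(Vector<? extends Number> xs) {
        double total = 0.0;
        for (Number x : xs) total += x.doubleValue();
        // xs.add(Integer.valueOf(1));   // rejected by javac: the element type is unknown
        return total;
    }

    /** Contra-variant form: accepts any Vector that can hold Integers. */
    static void fill(Vector<? super Integer> sink, int n) {
        for (int i = 0; i < n; i++) sink.add(i);
    }

    public static void main(String[] args) {
        Vector<Number> numbers = new Vector<Number>();
        fill(numbers, 4);                  // adds 0, 1, 2, 3
        System.out.println(sum(numbers));
    }
}
```

Reads are safe through the co-variant view and writes are safe through the contra-variant view, which matches the earlier observation that updates are contra-variant.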
12 More General Approach: NextGen (Allen, Cartwright, and Steele)
- Supports exactly the same extension syntax as Java 1.5, less the restrictions.
- All types are available at run-time for casting and instanceof tests.
- Lightweight homogeneous implementation (code shared across parametric instantiations).
- Performance of prototype compiler is encouraging.
13 NextGen Implementation Strategy
Augment the GJ implementation, which relies on type erasure:
- Use lightweight instantiation classes (generated on demand) to specify run-time types.
- Replace type-dependent operations in base classes by abstract methods (snippets) and override them in instantiation classes.
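The two bullets above can be pictured with a hand-written simulation in ordinary Java. This is not the code that NextGen generates: the class names (Box, BoxOfString, BoxOfInteger) and snippet names (isT, newArrayOfT) are assumptions chosen for the illustration, and a real implementation would create the instantiation classes on demand.

```java
class SnippetSketch {
    /** Plays the role of the erased base class shared by every instantiation of a generic Box<T>. */
    static abstract class Box {
        protected final Object value;
        Box(Object value) { this.value = value; }

        // Snippets: the type-dependent operations, left abstract in the shared code.
        abstract boolean isT(Object o);         // stands in for "o instanceof T"
        abstract Object[] newArrayOfT(int n);   // stands in for "new T[n]"

        // Shared code calls the snippets instead of using T directly.
        Object[] toArray() {
            Object[] a = newArrayOfT(1);
            a[0] = value;
            return a;
        }
    }

    /** Lightweight instantiation class standing in for Box<String>. */
    static final class BoxOfString extends Box {
        BoxOfString(String v) { super(v); }
        boolean isT(Object o) { return o instanceof String; }
        Object[] newArrayOfT(int n) { return new String[n]; }
    }

    /** Lightweight instantiation class standing in for Box<Integer>. */
    static final class BoxOfInteger extends Box {
        BoxOfInteger(Integer v) { super(v); }
        boolean isT(Object o) { return o instanceof Integer; }
        Object[] newArrayOfT(int n) { return new Integer[n]; }
    }

    /** "b instanceof Box<String>" becomes a test against the instantiation class. */
    static boolean isBoxOfString(Box b) { return b instanceof BoxOfString; }

    public static void main(String[] args) {
        Box b = new BoxOfString("a");
        System.out.println(b.toArray().getClass().getSimpleName());
        System.out.println(isBoxOfString(b));
        System.out.println(isBoxOfString(new BoxOfInteger(1)));
    }
}
```

The instantiation classes contain only the snippets, so nearly all code stays in the shared base class.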
14 Observations
- Performance difference between different JVMs is much greater than the difference between GJ, NextGen, and Java.
- Implementation tuning of the JIT can eliminate essentially all of the performance penalty through code specialization and method inlining.
- Specialization provides an opportunity for performance gains! Explicit generic type information provides guidance on how code should be specialized.
15 Beyond NextGen
- Object inlining of boxed primitive types (easy to do with new wrapper classes).
- Full genericity: using parameterized types anywhere that they are sensible. The only significant restriction on the use of generic types in NextGen is a type variable as superclass:
  - class C<T implements I> extends T
16 Why Mixins
- Mixins allow programs to abstract directly over uniform class extensions; the decorator pattern is the Java workaround for this limitation.
  - class AddScrollBar<T implements Window> extends T implements ScrollableWindow { ... }
- Mixins provide the machinery for defining components within the language as generic classes.
  - class Module<B> { static class A extends B { ... } }
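For comparison, here is the decorator workaround in ordinary Java. The interface members (draw, scrollTo) and the class PlainWindow are assumptions invented for this sketch; the slide names only Window, ScrollableWindow, and AddScrollBar.

```java
class DecoratorDemo {
    interface Window { String draw(); }

    interface ScrollableWindow extends Window { void scrollTo(int position); }

    static class PlainWindow implements Window {
        public String draw() { return "window"; }
    }

    /** Decorator: wraps a Window and forwards to it, instead of extending the wrapped class. */
    static class AddScrollBar implements ScrollableWindow {
        private final Window inner;
        private int position = 0;

        AddScrollBar(Window inner) { this.inner = inner; }

        public String draw() { return inner.draw() + "+scrollbar@" + position; }
        public void scrollTo(int position) { this.position = position; }
    }

    public static void main(String[] args) {
        AddScrollBar w = new AddScrollBar(new PlainWindow());
        w.scrollTo(3);
        System.out.println(w.draw());
        Object o = w;
        System.out.println(o instanceof PlainWindow);   // the decorator is not a PlainWindow
    }
}
```

Every Window method must be forwarded by hand, and the result is not a subtype of the wrapped class; the mixin AddScrollBar<PlainWindow> would extend PlainWindow and avoid both problems.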
17 Semantics of Mixins: Two Options
- Raw macro-expansion (C++ templates)
  - Performed on demand (lazily) by class loader
  - Lacking in hygiene
- Hygienic macro-expansion
  - Methods in superclass argument are renamed to avoid accidental overriding
- Example:
  class AddHiddenProperty<T implements Widget> extends T implements Hideable {
    private boolean isHidden = false;
    public boolean isHidden() { return isHidden; }
    public void setHidden(boolean b) { isHidden = b; }
  }
- What if T already contains the method public boolean isHidden()?
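The hazard raised by the closing question can be reproduced by expanding the mixin by hand in ordinary Java. LegacyWidget and its render method are hypothetical, as is the expanded class name; the three mixin members are taken from the example above.

```java
class AccidentalOverrideDemo {
    /** A hypothetical argument class that already uses isHidden() with its own meaning. */
    static class LegacyWidget {
        private final boolean offScreen = true;

        public boolean isHidden() { return offScreen; }   // "hidden" here means "scrolled out of view"

        public String render() { return isHidden() ? "" : "[widget]"; }
    }

    /** Hand-written raw expansion of AddHiddenProperty<LegacyWidget>. */
    static class AddHiddenPropertyOfLegacyWidget extends LegacyWidget {
        private boolean isHidden = false;

        public boolean isHidden() { return isHidden; }    // silently overrides LegacyWidget.isHidden()

        public void setHidden(boolean b) { isHidden = b; }
    }

    public static void main(String[] args) {
        System.out.println("[" + new LegacyWidget().render() + "]");
        System.out.println("[" + new AddHiddenPropertyOfLegacyWidget().render() + "]");
    }
}
```

The inherited render method now consults the mixin's isHidden(), so the behavior of the superclass changes without any error. Hygienic expansion would rename one of the two methods so that render keeps calling the original.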
18 Type Checking for Mixins
- Hygienic formulation is straightforward: the legality of a mixin application only depends on whether the type arguments satisfy their specified bounds.
- Non-hygienic case is more difficult: a type argument may contain a method that conflicts with a method introduced in a mixin. It is doubtful that these constraints can be checked by a class compiler (like javac) because type arguments can flow across a program via type application.
19 Challenging issue in mapping mixin genericity onto the JVM
- Constraints:
  - Compatible with existing Java binaries.
  - Must enforce mixin hygiene by systematically renaming some methods in the class loader to avoid accidental overriding.
  - Extension of the existing NextGen implementation.
20 Strategy
- In the class loader, rename all methods m in all classes by prefixing them with the mangled name of the class in which they are introduced.
- Example: method name value in class interp.Interp becomes interpInterpvalue
- Complication: bridge methods for interface methods must forward method dispatches on interface types to the corresponding methods in classes.
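A rough sketch of the renaming rule, written with reflection rather than inside a class loader. Everything here is an assumption for illustration: the names Mangler, mangle, and introducingClass are invented, and the separator parameter is hypothetical because the example above shows the name pieces simply run together. The sketch only walks superclasses and ignores interfaces, which is exactly where the complication noted above arises.

```java
class Mangler {
    /** Joins the pieces of the introducing class name and the method name with a separator. */
    static String mangle(String introducingClass, String method, String separator) {
        return introducingClass.replace(".", separator) + separator + method;
    }

    /** Finds the topmost superclass that declares the method, i.e. the class that introduces it. */
    static Class<?> introducingClass(Class<?> c, String name, Class<?>... parameterTypes) {
        Class<?> found = null;
        for (Class<?> k = c; k != null; k = k.getSuperclass()) {
            try {
                k.getDeclaredMethod(name, parameterTypes);
                found = k;   // keep climbing: a superclass may have introduced it earlier
            } catch (NoSuchMethodException e) {
                // not declared at this level
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(mangle("interp.Interp", "value", ""));
        Class<?> owner = introducingClass(java.util.ArrayList.class, "toString");
        System.out.println(mangle(owner.getName(), "toString", "$"));
    }
}
```

Because an overriding method receives the prefix of the class that introduced it, overrides keep matching, while an unrelated method with the same name in a mixin gets a different prefix.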
21 Implementation Subtleties
- Same method signature may appear in different interfaces, e.g.,
  - interface I { void next(); }
  - interface J { void next(); }
  - class C implements I { ... }
  - class Foo<T> extends T implements J { ... }
- Consider Foo<C>:
  - Implements both I.next() and J.next()
  - Must include forwarding methods for both.
- Solution: the class loader prefixes names of methods in interfaces by the name of the interface where they are introduced. This extension will also enable us to support multiple per-interface definitions for a given method signature in a class in Java source.
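The effect of the interface prefixing can be imitated by writing the renamed methods by hand. This is a simulation under assumed names: the prefix format I$next / J$next, the counters, and the class FooOfC (a manual expansion of Foo<C>) are invented for the sketch, and the real renaming would happen in the class loader, not in source.

```java
class PrefixDemo {
    interface I { void I$next(); }   // next() as introduced by I, after renaming

    interface J { void J$next(); }   // next() as introduced by J, after renaming

    static class C implements I {
        int iCalls = 0;
        public void I$next() { iCalls++; }
    }

    /** Hand-written expansion of Foo<C>: extends C and implements J. */
    static class FooOfC extends C implements J {
        int jCalls = 0;
        public void J$next() { jCalls++; }
    }

    public static void main(String[] args) {
        FooOfC f = new FooOfC();
        I asI = f;
        J asJ = f;
        asI.I$next();
        asJ.J$next();
        asJ.J$next();
        System.out.println(f.iCalls + " " + f.jCalls);
    }
}
```

With unprefixed names, the single next() in the expanded class would have to serve both interfaces; with prefixes, the two definitions coexist and each interface dispatches to its own.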
22 Implementation Subtleties (cont.)
- In principle, several different instantiations of the same generic interface could be implemented by a class in a source program. Java 5.0 and NextGen disallow these programs because we cannot distinguish the methods after erasure. It rarely happens in practice, but it is a corner case that we must handle. (Failure on class loading is not a very satisfactory solution.)
- Question: can we eliminate this restriction in our extension of NextGen to support mixins?
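The restriction is easy to run into in Java 5. In the sketch below, the interface Source and the class Both are hypothetical; the rejected declaration is shown only as a comment, followed by one possible way to code around it in today's Java.

```java
class MultiInstantiationDemo {
    interface Source<T> { T get(); }

    // Rejected by javac: both instantiations erase to the same interface, and both
    // get() methods erase to "Object get()".
    //
    // class Both implements Source<String>, Source<Integer> { ... }

    /** Workaround: expose one view object per instantiation instead of implementing both. */
    static class Both {
        Source<String> asStringSource() {
            return new Source<String>() {
                public String get() { return "one"; }
            };
        }

        Source<Integer> asIntegerSource() {
            return new Source<Integer>() {
                public Integer get() { return 1; }
            };
        }
    }

    public static void main(String[] args) {
        Both b = new Both();
        System.out.println(b.asStringSource().get());
        System.out.println(b.asIntegerSource().get());
    }
}
```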
23 Supporting Multiple Instantiations of the Same Interface
- We can modify the NextGen compiler to use
instantiated interfaces instead of erased
interfaces in both type declarations and the
prefixing of method names in interfaces. This
approach introduces some extra code because it
significantly reduces code sharing. Every
interface method call with a receiver type that
is an instantiated interface with a free type
variable must be implemented by a snippet (since
the snippet code will call different methods for
different receiver types).
24 Status of Project
- Beta release of NextGen compiler should be available within the next month from www.cs.rice.edu/javaplt
- Based on Sun Java 1.5 compiler. Distribution is binary only at this point.
- Prototype of MixGen (NextGen + Mixins) will be ready for internal testing by the end of Spring.
- Beta release of MixGen during the summer.