as "raw types" and type argument inference for generic method calls). Nonetheless, it has been a useful tool in clarifying our thought, and led to the discovery and fix of at least one bug in the GJ compiler. Because the model is small, it is easy to contemplate further extensions, and we have begun the work of adding raw types to the model; so far, this has revealed at least one corner of the design that was underspecified.

Our main goal in designing FJ was to make a proof of type soundness ("well-typed programs don't get stuck") as concise as possible, while still capturing the essence of the soundness argument for the full Java language. Any language feature that made the soundness proof longer without making it significantly different was a candidate for omission. As in previous studies of type soundness in Java, we don't treat advanced features such as concurrency, inner classes, and reflection. Other Java features omitted from FJ include assignment, interfaces, overloading, messages to super, null pointers, base types (int, bool, etc.), abstract method declarations, shadowing of superclass fields by subclass fields, access control (public, private, etc.), and exceptions. The features of Java that we do model include mutually recursive class definitions, object creation, field access, method invocation, method override, method recursion through this, subtyping, and casting.

One key simplification in FJ is the omission of assignment. We assume that an object's fields are initialized by its constructor and never changed afterwards. This restricts FJ to a "functional" fragment of Java, in which many common Java idioms, such as use of enumerations, cannot be represented. Nonetheless, this fragment is computationally complete (it is easy to encode the lambda calculus into it), and is large enough to include many useful programs (many of the programs in Felleisen and Friedman's Java text [12] use a purely functional style). Moreover, most of the tricky typing issues in both Java and GJ are independent of assignment. An important exception is that the type inference algorithm for generic method invocation in GJ has some twists imposed on it by the need to maintain soundness in the presence of assignment. This paper treats a simplified version of GJ without type inference.

The remainder of this paper is organized as follows. Section 2 introduces the main ideas of Featherweight Java, presents its syntax, type rules, and reduction rules, and sketches a type soundness proof. Section 3 extends Featherweight Java to Featherweight GJ, which includes generic classes and methods. Section 4 presents an erasure map from FGJ to FJ, modeling the techniques used to compile GJ into Java. Section 5 discusses related work, and Section 6 concludes.

2 Featherweight Java

In FJ, a program consists of a collection of class definitions plus an expression to be evaluated. (This expression corresponds to the body of the main method in Java.) Here are some typical class definitions in FJ.

    class Pair extends Object {
      Object fst;
      Object snd;
      Pair(Object fst, Object snd) {
        super(); this.fst=fst; this.snd=snd;
      }
      Pair setfst(Object newfst) {
        return new Pair(newfst, this.snd);
      }
    }

    class A extends Object {
      A() { super(); }
    }

    class B extends Object {
      B() { super(); }
    }

For the sake of syntactic regularity, we always include the supertype (even when it is Object), we always write out the constructor (even for the trivial classes A and B), and we always write the receiver for a field access (as in this.snd) or a method invocation. Constructors always take the same stylized form: there is one parameter for each field, with the same name as the field; the super constructor is invoked on the fields of the supertype; and the remaining fields are initialized to the corresponding parameters. Here the supertype is always Object, which has no fields, so the invocations of super have no arguments. Constructors are the only place where super or = appears in an FJ program. Since FJ provides no side-effecting operations, a method body always consists of return followed by an expression, as in the body of setfst().

In the context of the above definitions, the expression

    new Pair(new A(), new B()).setfst(new B())

evaluates to the expression

    new Pair(new B(), new B()).

There are five forms of expression in FJ. Here, new A(), new B(), and new Pair(e1,e2) are object constructors, and e3.setfst(e4) is a method invocation. In the body of setfst, the expression this.snd is a field access, and the occurrences of newfst and this are variables. FJ differs from Java in that this is an ordinary variable rather than a special keyword.

The remaining form of expression is a cast. The expression

    ((Pair)new Pair(new Pair(new A(), new B()), new A()).fst).snd

evaluates to the expression

    new B().

Here, ((Pair)e7), where e7 is new Pair(...).fst, is a cast. The cast is required, because e7 is a field access to fst, which is declared to contain an Object, whereas the next field access, to snd, is only valid on a Pair. At run time, it is checked whether the Object stored in the fst field is a Pair (and in this case the check succeeds).

In Java, one may prefix a field or parameter declaration with the keyword final to indicate that it may not be assigned to, and all parameters accessed from an inner class must be declared final. Since FJ contains no assignment and no inner classes, it matters little whether or not final appears, so we omit it for brevity.
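Because FJ is a subset of Java, the class definitions and the two example expressions above can be run as ordinary Java. The following is a minimal sketch under that reading; the driver class FjPairDemo and its method names are hypothetical and not part of the paper, and the fields are marked final only to make the no-assignment discipline explicit.

```java
// FJ classes from the text, written as ordinary Java. Fields are final to
// mirror FJ's rule that fields are set once by the constructor.
class Pair extends Object {
    final Object fst;
    final Object snd;
    Pair(Object fst, Object snd) {
        super(); this.fst = fst; this.snd = snd;
    }
    Pair setfst(Object newfst) {
        return new Pair(newfst, this.snd);
    }
}

class A extends Object {
    A() { super(); }
}

class B extends Object {
    B() { super(); }
}

// Hypothetical driver (not in the paper) for the two example expressions.
class FjPairDemo {
    // new Pair(new A(), new B()).setfst(new B())
    static Pair firstExample() {
        return new Pair(new A(), new B()).setfst(new B());
    }

    // ((Pair)new Pair(new Pair(new A(), new B()), new A()).fst).snd
    static Object secondExample() {
        return ((Pair) new Pair(new Pair(new A(), new B()), new A()).fst).snd;
    }

    public static void main(String[] args) {
        Pair p = firstExample();
        // Both components of the first result are B objects.
        System.out.println(p.fst instanceof B && p.snd instanceof B);
        // The second example yields a B object after the cast succeeds.
        System.out.println(secondExample() instanceof B);
    }
}
```

Java evaluates these expressions call-by-value, which is one of the evaluation orders permitted by FJ's reduction relation.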
Dropping side effects has a pleasant side effect: evaluation can be easily formalized entirely within the syntax of FJ, with no additional mechanisms for modeling the heap. Moreover, in the absence of side effects, the order in which expressions are evaluated does not affect the final outcome, so we can define the operational semantics of FJ straightforwardly using a nondeterministic small-step reduction relation, following long-standing tradition in the lambda calculus. Of course, Java's call-by-value evaluation strategy is subsumed by this more general relation, so the soundness properties we prove for reduction will hold for Java's evaluation strategy as a special case.

There are three basic computation rules: one for field access, one for method invocation, and one for casts. Recall that, in the lambda calculus, the beta-reduction rule for applications assumes that the function is first simplified to a lambda abstraction. Similarly, in FJ the reduction rules assume the object operated upon is first simplified to a new expression. Thus, just as the slogan for the lambda calculus is "everything is a function," here the slogan is "everything is an object."

Here is the rule for field access in action:

    new Pair(new A(), new B()).snd → new B()

Because of the stylized form for object constructors, we know that the constructor has one parameter for each field, in the same order that the fields are declared. Here the fields are fst and snd, and an access to the snd field selects the second parameter.

Here is the rule for method invocation in action (/ denotes substitution):

    new Pair(new A(), new B()).setfst(new B())
      → [new B()/newfst, new Pair(new A(), new B())/this] new Pair(newfst, this.snd)
      i.e., new Pair(new B(), new Pair(new A(), new B()).snd)

The receiver of the invocation is the object new Pair(new A(), new B()), so we look up the setfst method in the Pair class, where we find that it has formal parameter newfst and body new Pair(newfst, this.snd). The invocation reduces to the body with the formal parameter replaced by the actual, and the special variable this replaced by the receiver object. This is similar to the beta rule of the lambda calculus, $(\lambda x.e_0)\,e_1 \to [e_1/x]e_0$. The key differences are the fact that the class of the receiver determines where to look for the body (supporting method override), and the substitution of the receiver for this (supporting "recursion through self"). Readers familiar with Abadi and Cardelli's Object Calculus will see a strong similarity to their $\varsigma$ reduction rule [1]. In FJ, as in the lambda calculus and the pure Abadi-Cardelli calculus, if a formal parameter appears more than once in the body this may lead to duplication of the actual, but since there are no side effects this causes no problems.

Here is the rule for a cast in action:

    (Pair)new Pair(new A(), new B()) → new Pair(new A(), new B())

Once the subject of the cast is reduced to an object, it is easy to check that the class of the constructor is a subclass of the target of the cast. If so, as is the case here, then the reduction removes the cast. If not, as in the expression (A)new B(), then no rule applies and the computation is stuck, denoting a run-time error.

There are three ways in which a computation may get stuck: an attempt to access a field not declared for the class, an attempt to invoke a method not declared for the class ("message not understood"), or an attempt to cast to something other than a superclass of the class. We will prove that the first two of these never happen in well-typed programs, and the third never happens in well-typed programs that contain no downcasts (or "stupid casts", a technicality explained below).

As usual, we allow reductions to apply to any subexpression of an expression. Here is a computation for the second example expression; each step is labeled with the kind of subexpression that is reduced.

    ((Pair)new Pair(new Pair(new A(), new B()), new A()).fst).snd
      → ((Pair)new Pair(new A(), new B())).snd      (field access .fst)
      → new Pair(new A(), new B()).snd              (cast)
      → new B()                                     (field access .snd)

We will prove a type soundness result for FJ: if an expression e reduces to expression e', and if e is well typed, then e' is also well typed and its type is a subtype of the type of e.

With this informal introduction in mind, we may now proceed to a formal definition of FJ.

2.1 Syntax

The syntax, typing rules, and computation rules for FJ are given in Figure 1, with a few auxiliary functions in Figure 2.

The metavariables A, B, C, D, and E range over class names; f and g range over field names; m ranges over method names; x ranges over parameter names; d and e range over expressions; CL ranges over class declarations; K ranges over constructor declarations; and M ranges over method declarations. We write $\overline{f}$ as shorthand for $f_1,\ldots,f_n$ (and similarly for $\overline{C}$, $\overline{x}$, $\overline{e}$, etc.) and write $\overline{M}$ as shorthand for $M_1 \ldots M_n$ (with no commas). We write the empty sequence as $\bullet$ and denote concatenation of sequences using a comma. The length of a sequence $\overline{x}$ is written $\#(\overline{x})$. We abbreviate operations on pairs of sequences in the obvious way, writing "$\overline{C}\ \overline{f}$" as shorthand for "$C_1\ f_1,\ldots,C_n\ f_n$", and similarly "$\overline{C}\ \overline{f}$;" as shorthand for "$C_1\ f_1$; ... $C_n\ f_n$;", and "this.$\overline{f}$=$\overline{f}$;" as shorthand for "this.$f_1$=$f_1$; ... ;this.$f_n$=$f_n$;". Sequences of field declarations, parameter names, and method declarations are assumed to contain no duplicate names.

A class table CT is a mapping from class names C to class declarations CL. A program is a pair (CT, e) of a class table and an expression. To lighten the notation in what follows, we always assume a fixed class table CT.

The abstract syntax of FJ class declarations, constructor declarations, method declarations, and expressions is given at the top left of Figure 1. As in Java, we assume that casts bind less tightly than other forms of
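Syntax:

\[
\begin{array}{rcl}
CL & ::= & \texttt{class } C \texttt{ extends } C\ \texttt{\{}\overline{C}\ \overline{f}\texttt{;}\ K\ \overline{M}\texttt{\}} \\
K & ::= & C\texttt{(}\overline{C}\ \overline{f}\texttt{)}\ \texttt{\{super(}\overline{f}\texttt{); this.}\overline{f}\texttt{ = }\overline{f}\texttt{;\}} \\
M & ::= & C\ m\texttt{(}\overline{C}\ \overline{x}\texttt{)}\ \texttt{\{return } e\texttt{;\}} \\
e & ::= & x \;\mid\; e.f \;\mid\; e.m(\overline{e}) \;\mid\; \texttt{new } C(\overline{e}) \;\mid\; (C)e
\end{array}
\]

Subtyping:

\[
C <: C
\qquad
\frac{C <: D \qquad D <: E}{C <: E}
\qquad
\frac{CT(C) = \texttt{class } C \texttt{ extends } D\ \texttt{\{...\}}}{C <: D}
\]

Computation:

\[
\frac{\mathit{fields}(C) = \overline{C}\ \overline{f}}{(\texttt{new } C(\overline{e})).f_i \to e_i} \quad \text{(R-Field)}
\]

\[
\frac{\mathit{mbody}(m, C) = (\overline{x}, e_0)}{(\texttt{new } C(\overline{e})).m(\overline{d}) \to [\overline{d}/\overline{x},\ \texttt{new } C(\overline{e})/\texttt{this}]\,e_0} \quad \text{(R-Invk)}
\]

\[
\frac{C <: D}{(D)(\texttt{new } C(\overline{e})) \to \texttt{new } C(\overline{e})} \quad \text{(R-Cast)}
\]

Expression typing:

\[
\Gamma \vdash x \in \Gamma(x) \quad \text{(T-Var)}
\]

\[
\frac{\Gamma \vdash e_0 \in C_0 \qquad \mathit{fields}(C_0) = \overline{C}\ \overline{f}}{\Gamma \vdash e_0.f_i \in C_i} \quad \text{(T-Field)}
\]

\[
\frac{\Gamma \vdash e_0 \in C_0 \qquad \mathit{mtype}(m, C_0) = \overline{D} \to C \qquad \Gamma \vdash \overline{e} \in \overline{C} \qquad \overline{C} <: \overline{D}}{\Gamma \vdash e_0.m(\overline{e}) \in C} \quad \text{(T-Invk)}
\]

\[
\frac{\mathit{fields}(C) = \overline{D}\ \overline{f} \qquad \Gamma \vdash \overline{e} \in \overline{C} \qquad \overline{C} <: \overline{D}}{\Gamma \vdash \texttt{new } C(\overline{e}) \in C} \quad \text{(T-New)}
\]

\[
\frac{\Gamma \vdash e_0 \in D \qquad D <: C}{\Gamma \vdash (C)e_0 \in C} \quad \text{(T-UCast)}
\]

\[
\frac{\Gamma \vdash e_0 \in D \qquad C <: D \qquad C \neq D}{\Gamma \vdash (C)e_0 \in C} \quad \text{(T-DCast)}
\]

\[
\frac{\Gamma \vdash e_0 \in D \qquad C \not<: D \qquad D \not<: C \qquad \textit{stupid warning}}{\Gamma \vdash (C)e_0 \in C} \quad \text{(T-SCast)}
\]

Method typing:

\[
\frac{\overline{x} : \overline{C},\ \texttt{this} : C \vdash e_0 \in E_0 \qquad E_0 <: C_0 \qquad CT(C) = \texttt{class } C \texttt{ extends } D\ \texttt{\{...\}} \qquad \mathit{override}(m, D, \overline{C} \to C_0)}{C_0\ m(\overline{C}\ \overline{x})\ \texttt{\{return } e_0\texttt{;\}}\ \text{OK IN } C}
\]

Class typing:

\[
\frac{K = C(\overline{D}\ \overline{g},\ \overline{C}\ \overline{f})\ \texttt{\{super(}\overline{g}\texttt{); this.}\overline{f}\texttt{ = }\overline{f}\texttt{;\}} \qquad \mathit{fields}(D) = \overline{D}\ \overline{g} \qquad \overline{M}\ \text{OK IN } C}{\texttt{class } C \texttt{ extends } D\ \texttt{\{}\overline{C}\ \overline{f}\texttt{;}\ K\ \overline{M}\texttt{\}}\ \text{OK}}
\]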
expression. We assume that the set of variables includes the special variable this, but that this is never used as the name of an argument to a method.

Every class has a superclass, declared with extends. This raises a question: what is the superclass of the Object class? There are various ways to deal with this issue; the simplest one that we have found is to take Object as a distinguished class name whose definition does not appear in the class table. The auxiliary functions that look up fields and method declarations in the class table are equipped with special cases for Object that return the empty sequence of fields and the empty set of methods. (In full Java, the class Object does have several methods. We ignore these in FJ.)

By looking at the class table, we can read off the subtype relation between classes. We write C <: D when C is a subtype of D, i.e., subtyping is the reflexive and transitive closure of the immediate subclass relation given by the extends clauses in CT. Formally, it is defined in Figure 1.

The given class table is assumed to satisfy some sanity conditions: (1) CT(C) = class C... for every $C \in \mathit{dom}(CT)$; (2) $\texttt{Object} \notin \mathit{dom}(CT)$; (3) for every class name C (except Object) appearing anywhere in CT, we have $C \in \mathit{dom}(CT)$; and (4) there are no cycles in the subtype relation induced by CT, that is, the <: relation is antisymmetric.

For the typing and reduction rules, we need a few auxiliary definitions, given in Figure 2. The fields of a class C, written $\mathit{fields}(C)$, is a sequence $\overline{C}\ \overline{f}$ pairing the class of a field with its name, for all the fields declared in class C and all of its superclasses. The type of the method m in class C, written $\mathit{mtype}(m, C)$, is a pair, written $\overline{B} \to B$, of a sequence of argument types $\overline{B}$ and a result type B. Similarly, the body of the method m in class C, written $\mathit{mbody}(m, C)$, is a pair, written $(\overline{x}, e)$, of a sequence of parameters $\overline{x}$ and an expression e. The predicate $\mathit{override}(m, D, \overline{C} \to C_0)$ judges if a method m with argument types $\overline{C}$ and a result type $C_0$ may be defined in a subclass of D. In case of overriding, if a method with the same name is declared in the superclass then it must have the same type.

2.2 Typing

The typing rules for expressions, method declarations, and class declarations are in the right column of Figure 1. An environment $\Gamma$ is a finite mapping from variables to types, written $\overline{x}:\overline{C}$.

The typing judgment for expressions has the form $\Gamma \vdash e \in C$, read "in the environment $\Gamma$, expression e has type C." The typing rules are syntax directed, with one rule for each form of expression, save that there are three rules for casts. The typing rules for constructors and method invocations check that each actual parameter has a type that is a subtype of the corresponding formal. We abbreviate typing judgments on sequences in the obvious way, writing $\Gamma \vdash \overline{e} \in \overline{C}$ as shorthand for $\Gamma \vdash e_1 \in C_1, \ldots, \Gamma \vdash e_n \in C_n$ and writing $\overline{C} <: \overline{D}$ as shorthand for $C_1 <: D_1, \ldots, C_n <: D_n$.

One technical innovation in FJ is the introduction of "stupid" casts. There are three rules for type casts: in an upcast the subject is a subclass of the target, in a downcast the target is a subclass of the subject, and in a stupid cast the target is unrelated to the subject. The Java compiler rejects as ill typed an expression containing a stupid cast, but we must allow stupid casts in FJ if we are to formulate type soundness as a subject reduction theorem for a small-step semantics. This is because a sensible expression may be reduced to one containing a stupid cast. For example, consider the following, which uses classes A and B as defined in the previous section:

    (A)(Object)new B() → (A)new B()

We indicate the special nature of stupid casts by including the hypothesis stupid warning in the type rule for stupid casts (T-SCast); an FJ typing corresponds to a legal Java typing only if it does not contain this rule.
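The three-way classification of casts can be illustrated with a small program. The following is a minimal sketch, assuming the class table is reduced to a map from each class name to its declared superclass; the names CastKinds, SUPER, isSubtype, and classify are hypothetical and not part of the paper.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a class table reduced to its extends clauses, plus the subtype
// check and the three-way classification of casts described in the text.
class CastKinds {
    // Maps each class name to its declared superclass. Object is not a key,
    // matching the convention that Object does not appear in the class table.
    static final Map<String, String> SUPER = new HashMap<>();
    static {
        SUPER.put("Pair", "Object");
        SUPER.put("A", "Object");
        SUPER.put("B", "Object");
    }

    // C <: D: walk the extends chain upward from c (reflexive and transitive).
    // Terminates because the class table is assumed to be acyclic.
    static boolean isSubtype(String c, String d) {
        for (String k = c; k != null; k = SUPER.get(k)) {
            if (k.equals(d)) return true;
        }
        return false;
    }

    // Classify the cast (target)e, where e has static type subject.
    static String classify(String target, String subject) {
        if (isSubtype(subject, target)) return "upcast";    // T-UCast
        if (isSubtype(target, subject)) return "downcast";  // T-DCast (target != subject here)
        return "stupid";                                    // T-SCast
    }

    public static void main(String[] args) {
        System.out.println(classify("Object", "B")); // inner cast of (A)(Object)new B()
        System.out.println(classify("A", "Object")); // outer cast before reduction
        System.out.println(classify("A", "B"));      // outer cast after reduction
    }
}
```

On the example from the text, the outer cast (A) is a downcast while its subject has type Object, and becomes a stupid cast once the inner upcast has been reduced away and the subject has type B.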
(Stupid casts were omitted from Classic Java [14], causing its published proof of type soundness to be incorrect; this error was discovered independently by ourselves and the Classic Java authors.)

The typing judgment for method declarations has the form M OK IN C, read "method declaration M is ok if it occurs in class C." It uses the expression typing judgment on the body of the method, where the free variables are the parameters of the method with their declared types, plus the special variable this with type C.

The typing judgment for class declarations has the form CL OK, read "class declaration CL is ok." It checks that the constructor applies super to the fields of the superclass and initializes the fields declared in this class, and that each method declaration in the class is ok.

The type of an expression may depend on the type of any methods it invokes, and the type of a method depends on the type of an expression (its body), so it behooves us to check that there is no ill-defined circularity here. Indeed there is none: the circle is broken because the type of each method is explicitly declared. It is possible to load and use the class table before all the classes in it are checked, so long as each class is eventually checked.

2.3 Computation

The reduction relation is of the form $e \to e'$, read "expression e reduces to expression e' in one step." We write $\to^*$ for the reflexive and transitive closure of $\to$.

The reduction rules are given in the bottom left column of Figure 1. There are three reduction rules, one for field access, one for method invocation, and one for casting. These were already explained in the introduction to this section. We write $[\overline{d}/\overline{x}, e/y]e_0$ for the result of replacing $x_1$ by $d_1$, ..., $x_n$ by $d_n$, and y by e in expression $e_0$.

The reduction rules may be applied at any point in an expression, so we also need the obvious congruence rules (if $e \to e'$ then $e.f \to e'.f$, and the like), which we omit here.

2.4 Properties

Formal definitions are fun, but the proof of the pudding is in... well, the proof. If our definitions are sensible, we should be able to prove a type soundness result, which relates typing to computation. Indeed we can prove such a result: if a term is well typed and it reduces to a second term, then the second term is well typed, and furthermore its type is a subtype of the type of the first term.

2.4.1 Theorem [Subject Reduction]: If $\Gamma \vdash e \in C$ and $e \to e'$, then $\Gamma \vdash e' \in C'$ for some $C' <: C$.

Proof sketch: The main property required in the [...] the final rule used in the derivation is T-DCast. Suppose the type of $e_0$ is $C_0$ and $C <: C_0$. By the induction hypothesis, $\Gamma \vdash [\overline{d}/\overline{x}]e_0 \in D_0$ for some $D_0 <: C_0$. But, since $D_0$ and C may or may not be in the subtype relation, the derivation of $\Gamma \vdash (C)[\overline{d}/\overline{x}]e_0 \in C$ may involve a stupid warning. On the other hand, if $(C)e_0$ is derived using T-UCast, then $(C)[\overline{d}/\overline{x}]e_0$ will also be an upcast.

The theorem itself is now proved by induction on the derivation of $e \to e'$, with a case analysis on the last rule used. The case for R-Invk is easy, using the lemma above. Other base cases are also straightforward, as are most of the induction steps. The only interesting case is the congruence rule for casting, that is, the case where $(C)e \to (C)e'$ is derived using $e \to e'$. Using an argument similar to the term substitution lemma above, we see that a downcast expression may be reduced to a stupid cast and an upcast expression will always be reduced to an upcast.

We can also show that if a program is well typed, then the only way it can get stuck is if it reaches a point where it cannot perform a downcast.

2.4.2 Theorem [Progress]: Suppose e is a well-typed expression.
(1) If e includes new $C_0(\overline{e})$.f as a subexpression, then $\mathit{fields}(C_0) = \overline{T}\ \overline{f}$ and $f \in \overline{f}$.
(2) If e includes new $C_0(\overline{e})$.m($\overline{d}$) as a subexpression, then $\mathit{mbody}(m, C_0) = (\overline{x}, e_0)$ and $\#(\overline{x}) = \#(\overline{d})$.

To state a similar property for casts, we say that an expression e is safe in $\Gamma$ if the type derivations of the underlying CT and $\Gamma \vdash e \in C$ contain no downcasts or stupid casts (uses of rules T-DCast or T-SCast). In other words, a safe program includes only upcasts. Then we see that a safe expression always reduces to another safe expression, and, moreover, typecasts in a safe expression will never fail, as shown in the following pair of theorems.

2.4.3 Theorem [Reduction preserves safety]: If e is safe in $\Gamma$ and $e \to e'$, then e' is safe in $\Gamma$.

2.4.4 Theorem [Progress of safe programs]: Suppose e is safe in $\Gamma$. If e has (C)new $C_0(\overline{e})$ as a subexpression, then $C_0 <: C$.

3 Featherweight GJ

Just as GJ adds generic types to Java, Featherweight GJ (or FGJ, for short) adds generic types to FJ. Here is the class definition for pairs in FJ, rewritten with generic type parameters in FGJ.

    class Pair<X extends Object, Y extends Object> extends Object {
      X fst;
[...] Z is a parameter of the setfst method. Each type parameter has a bound; here X, Y, and Z are each bounded by Object.

In the context of the above definitions, the expression

    new Pair<A,B>(new A(), new B()).setfst<B>(new B())

evaluates to the expression

    new Pair<B,B>(new B(), new B())

If we were being extraordinarily pedantic, we would write A<> and B<> instead of A and B, but we allow the latter as an abbreviation for the former in order that FJ is a proper subset of FGJ.

In GJ, type parameters to generic method invocations are inferred. Thus, in GJ the expression above would be written

    new Pair<A,B>(new A(), new B()).setfst(new B())

with no <B> in the invocation of setfst. So while FJ is a subset of Java, FGJ is not quite a subset of GJ. We regard FGJ as an intermediate language, the form that would result after type parameters have been inferred. While parameter inference is an important aspect of GJ, we chose in FGJ to concentrate on modeling other aspects of GJ.

The bound of a type variable may not be a type variable, but may be a type expression involving type variables, and may be recursive (or even, if there are several bounds, mutually recursive). For example, if C<X> and D<Y> are classes with one parameter each, one may have bounds such as <X extends C<X>> or even <X extends C<Y>, Y extends D<X>>. For more on bounds, including examples of the utility of recursive bounds, see the GJ paper [7].

GJ and FGJ are intended to support either of two implementation styles. They may be implemented directly, augmenting the run-time system to carry information about type parameters, or they may be implemented by erasure, removing all information about type parameters at run-time. This section explores the first style, giving a direct semantics for FGJ that maintains type parameters, and proving a type soundness theorem. Section 4 explores the second style, giving an erasure mapping from FGJ into FJ and showing a correspondence between reductions on FGJ expressions and reductions on FJ expressions. The second style corresponds to the current implementation of GJ, which compiles GJ into the Java Virtual Machine (JVM), which of course maintains no information about type parameters at run-time; the first style would correspond to using an augmented JVM that maintains information about type parameters.

[...] similarly for $\overline{T}$, $\overline{N}$, etc.), and assume sequences of type variables contain no duplicate names.

The abstract syntax of FGJ is given at the top left of Figure 3. We allow C<> and m<> to be abbreviated as C and m, respectively.

As before, we assume a fixed class table CT, which is a mapping from class names C to class declarations CL, obeying the same sanity conditions as given previously.

3.2 Typing

A type environment $\Delta$ is a finite mapping from type variables to nonvariable types, written $\overline{X} <: \overline{N}$, that takes each type variable to its bound.

Bounds of types

We write $\mathit{bound}_\Delta(T)$ for the upper bound of T in $\Delta$, as defined in Figure 4. Unlike calculi such as $F_{\le}$ [9], this promotion relation does not need to be defined recursively: the bound of a type variable is always a nonvariable type.

Subtyping

The subtyping relation is defined in the left column of Figure 3. As before, subtyping is the reflexive and transitive closure of the $\triangleleft$ relation (where $\triangleleft$ abbreviates extends). Type parameters are invariant with regard to subtyping (for reasons explained in the GJ paper), so $\overline{T} <: \overline{U}$ does not imply $C\langle\overline{T}\rangle <: C\langle\overline{U}\rangle$.

Well-formed types

If the declaration of a class C begins class $C\langle\overline{X} \triangleleft \overline{N}\rangle$, then a type like $C\langle\overline{T}\rangle$ is well formed only if substituting $\overline{T}$ for $\overline{X}$ respects the bounds $\overline{N}$, that is if $\overline{T} <: [\overline{T}/\overline{X}]\overline{N}$. We write $\Delta \vdash T\ \text{ok}$ if type T is well-formed in context $\Delta$. The rules for well-formed types appear in Figure 3. Note that we perform a simultaneous substitution, so any variable in $\overline{X}$ may appear in $\overline{N}$, permitting recursion and mutual recursion between variables and bounds.

A type environment $\Delta$ is well formed if $\Delta \vdash \Delta(X)\ \text{ok}$ for all X in $\mathit{dom}(\Delta)$. We also say that an environment $\Gamma$ is well formed with respect to $\Delta$, written $\Delta \vdash \Gamma\ \text{ok}$, if $\Delta \vdash \Gamma(x)\ \text{ok}$ for all x in $\mathit{dom}(\Gamma)$.

Field and method lookup

For the typing and reduction rules, we need a few auxiliary definitions, given in Figure 4; these are fairly straightforward adaptations of the lookup rules given previously. The fields of a nonvariable type N, written $\mathit{fields}(N)$, are a sequence of corresponding types and field names, $\overline{T}\ \overline{f}$. The type of the method invocation m
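(In the rules below, $\triangleleft$ abbreviates extends and $\uparrow$ abbreviates return.)

Syntax:

\[
\begin{array}{rcl}
CL & ::= & \texttt{class } C\langle\overline{X} \triangleleft \overline{N}\rangle \triangleleft N\ \texttt{\{}\overline{T}\ \overline{f}\texttt{;}\ K\ \overline{M}\texttt{\}} \\
K & ::= & C\texttt{(}\overline{T}\ \overline{f}\texttt{)}\ \texttt{\{super(}\overline{f}\texttt{); this.}\overline{f}\texttt{ = }\overline{f}\texttt{;\}}
\end{array}
\]

Well-formed types:

\[
\Delta \vdash \texttt{Object}\ \text{ok}
\qquad
\frac{X \in \mathit{dom}(\Delta)}{\Delta \vdash X\ \text{ok}}
\qquad
\frac{CT(C) = \texttt{class } C\langle\overline{X} \triangleleft \overline{N}\rangle \triangleleft N\ \texttt{\{...\}} \qquad \Delta \vdash \overline{T}\ \text{ok} \qquad \Delta \vdash \overline{T} <: [\overline{T}/\overline{X}]\overline{N}}{\Delta \vdash C\langle\overline{T}\rangle\ \text{ok}}
\]

Computation:

\[
\frac{\mathit{fields}(N) = \overline{T}\ \overline{f}}{(\texttt{new } N(\overline{e})).f_i \to e_i}
\]

\[
\frac{\mathit{mbody}(m\langle\overline{V}\rangle, N) = (\overline{x}, e_0)}{(\texttt{new } N(\overline{e})).m\langle\overline{V}\rangle(\overline{d}) \to [\overline{d}/\overline{x},\ \texttt{new } N(\overline{e})/\texttt{this}]\,e_0}
\]

\[
\frac{\emptyset \vdash N <: O}{(O)(\texttt{new } N(\overline{e})) \to \texttt{new } N(\overline{e})}
\]

Expression typing:

\[
\Delta; \Gamma \vdash x \in \Gamma(x)
\]

\[
\frac{\Delta; \Gamma \vdash e_0 \in T_0 \qquad \Delta \vdash N\ \text{ok} \qquad \Delta \vdash \mathit{bound}_\Delta(T_0) \not<: N \qquad \Delta \vdash N \not<: \mathit{bound}_\Delta(T_0) \qquad \textit{stupid warning}}{\Delta; \Gamma \vdash (N)e_0 \in N}
\]

Method typing:

\[
\frac{\begin{array}{c}
\Delta = \overline{X} <: \overline{N},\ \overline{Y} <: \overline{O} \qquad \Delta \vdash \overline{T}\ \text{ok} \qquad \Delta \vdash T\ \text{ok} \qquad \Delta \vdash \overline{O}\ \text{ok} \\
\Delta;\ \overline{x} : \overline{T},\ \texttt{this} : C\langle\overline{X}\rangle \vdash e_0 \in S \qquad \Delta \vdash S <: T \\
CT(C) = \texttt{class } C\langle\overline{X} \triangleleft \overline{N}\rangle \triangleleft N\ \texttt{\{...\}} \qquad \mathit{override}(m, N, \langle\overline{Z} \triangleleft \overline{P}\rangle\overline{U} \to U)
\end{array}}{\langle\overline{Y} \triangleleft \overline{O}\rangle\ T\ m(\overline{T}\ \overline{x})\ \texttt{\{}{\uparrow} e_0\texttt{;\}}\ \text{OK IN } C\langle\overline{X} \triangleleft \overline{N}\rangle}
\]

Class typing:

\[
\frac{\begin{array}{c}
\overline{X} <: \overline{N} \vdash \overline{N}\ \text{ok} \qquad \overline{X} <: \overline{N} \vdash N\ \text{ok} \qquad \overline{X} <: \overline{N} \vdash \overline{T}\ \text{ok} \\
\mathit{fields}(N) = \overline{U}\ \overline{g} \qquad \overline{M}\ \text{OK IN } C\langle\overline{X} \triangleleft \overline{N}\rangle \qquad K = C(\overline{U}\ \overline{g},\ \overline{T}\ \overline{f})\ \texttt{\{super(}\overline{g}\texttt{); this.}\overline{f}\texttt{ = }\overline{f}\texttt{;\}}
\end{array}}{\texttt{class } C\langle\overline{X} \triangleleft \overline{N}\rangle \triangleleft N\ \texttt{\{}\overline{T}\ \overline{f}\texttt{;}\ K\ \overline{M}\texttt{\}}\ \text{OK}}
\]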
at nonvariable type N, written $\mathit{mtype}(m, N)$, is a type of the form $\langle\overline{X} \triangleleft \overline{N}\rangle\overline{U} \to U$. Similarly, the body of the method invocation m at nonvariable type N with type parameters $\overline{V}$, written $\mathit{mbody}(m\langle\overline{V}\rangle, N)$, is a pair, written $(\overline{x}, e)$, of a sequence of parameters $\overline{x}$ and an expression e.

Typing rules

Typing rules for expressions, methods, and classes appear in Figure 3.

The typing judgment for expressions is of form $\Delta; \Gamma \vdash e \in T$, read as "in the type environment $\Delta$ and the environment $\Gamma$, e has type T." Most of the subtleties are in the field and method lookup relations that we have already seen; the typing rules themselves are straightforward.

In the rule GT-DCast, the last premise ensures that the result of the cast will be the same at run time, no matter whether we use the high-level (type-passing) reduction rules defined later in this section or the erasure semantics considered in Section 4. For example, suppose we have defined:

    class List<X ◁ Object> ◁ Object { ... }
    class LinkedList<X ◁ Object> ◁ List<X> { ... }

Now, if o has type Object, then the cast (List<C>)o is not permitted. (If, at run time, o is bound to new List<D>(), then the cast would fail in the type-passing semantics but succeed in the erasure semantics, since (List<C>)o erases to (List)o while both new List<C>() and new List<D>() erase to new List().) On the other hand, if cl has type List<C>, then the cast (LinkedList<C>)cl is permitted, since the type-passing and erased versions of the cast are guaranteed to either both succeed or both fail.

The typing rule for methods contains one additional subtlety. In FGJ (and GJ), unlike in FJ (and Java), covariant subtyping of method results is allowed. That is, the result type of a method may be a subtype of the result type of the corresponding method in the superclass, although the bounds of type variables and the argument types must be identical (modulo renaming of type variables).

As before, a class table is ok if all its class definitions are ok.

3.3 Reduction

The operational semantics of FGJ programs is only a little more complicated than what we had in FJ. The rules appear in Figure 3.

3.4 Properties

FGJ programs enjoy subject reduction and progress properties exactly like programs in FJ (2.4.1 and 2.4.2). The basic structures of the proofs are similar to those of Theorems 2.4.1 and 2.4.2. For subject reduction, however, since we now have parametric polymorphism combined with subtyping, we need a few more lemmas. The main lemmas required are a term substitution lemma as before, plus similar lemmas about the preservation of subtyping and typing under type substitution. (Readers familiar with proofs of subject reduction for typed lambda-calculi like $F_{\le}$ [9] will notice many similarities.) We begin with the three substitution lemmas, which are proved by straightforward induction on a derivation of $\Delta \vdash S <: T$ or $\Delta; \Gamma \vdash e \in T$.
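The lemmas that follow are stated in terms of the simultaneous type substitution $[\overline{U}/\overline{X}]T$. As an illustration only, here is a minimal sketch of that operation on FGJ types (a type is either a variable X or a nonvariable type C<T...>); the names Type, TypeVar, ClassType, and TypeSubstDemo are hypothetical and not part of the paper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of FGJ types T ::= X | C<T...> and the substitution [U/X]T.
interface Type {
    Type subst(Map<String, Type> s);
}

// A type variable X.
final class TypeVar implements Type {
    final String name;
    TypeVar(String name) { this.name = name; }
    public Type subst(Map<String, Type> s) {
        // Replace the variable if the substitution mentions it. The result is
        // not substituted again, so the substitution is simultaneous.
        return s.containsKey(name) ? s.get(name) : this;
    }
    @Override public String toString() { return name; }
}

// A nonvariable type C<T1,...,Tn>.
final class ClassType implements Type {
    final String name;
    final List<Type> args;
    ClassType(String name, Type... args) {
        this.name = name;
        this.args = List.of(args);
    }
    public Type subst(Map<String, Type> s) {
        // Substitute inside each type argument, keeping the class name.
        List<Type> out = new ArrayList<>();
        for (Type t : args) out.add(t.subst(s));
        return new ClassType(name, out.toArray(new Type[0]));
    }
    @Override public String toString() {
        if (args.isEmpty()) return name; // C<> is abbreviated as C
        StringBuilder sb = new StringBuilder(name).append('<');
        for (int i = 0; i < args.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(args.get(i));
        }
        return sb.append('>').toString();
    }
}

class TypeSubstDemo {
    // Computes [A/X] Pair<X, List<X>>.
    static String example() {
        Type t = new ClassType("Pair", new TypeVar("X"),
                               new ClassType("List", new TypeVar("X")));
        return t.subst(Map.of("X", new ClassType("A"))).toString();
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints Pair<A,List<A>>
    }
}
```

The sketch covers only the syntactic operation; it does not check that the substituted types respect their bounds, which is what the side conditions of the lemmas require.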
3.4.1 Lemma: [Type substitution preserves sub- FGJ is backward compatible with FJ. Intuitively,
typing] If 1 ; X<:N; 2 ` S <: T and 1 ` U <: [U=X]N this means that an implementation of FGJ can be used
with 1 ` U ok, and none of X appearing in 1 , then to typecheck and execute FJ programs without changing
1 ; [U=X]2 ` [U=X]S <: [U=X]T. their meaning. We can show that a well-typed FJ pro-
gram is always a well-typed FGJ program and that FJ
3.4.2 Lemma: [Type substitution preserves typ- and FGJ reduction correspond. (Note that it isn't quite
ing] If 1 ; X<:N; 2 ; ` e 2 T and 1 ` U <: [U=X]N the case that the well-typedness of an FJ program under
where 2 ` U ok and none of X appears in 1 , then the FGJ rules implies its well-typedness in FJ, because
1 ; [U=X]2 ; [U=X] ` [U=X]e 2 S for some S such that FGJ allows covariant overriding and FJ does not.) In
1 ; [U=X]2 ` S <: [U=X]T. the statement of the theorem, we use !FJ and !FGJ
3.4.3 Lemma [Term substitution preserves typing]: If $\Delta; \Gamma, \bar{x} : \bar{T} \vdash e \in T$ and $\Delta; \Gamma \vdash \bar{d} \in \bar{S}$ where $\Delta \vdash \bar{S} <: \bar{T}$, then $\Delta; \Gamma \vdash [\bar{d}/\bar{x}]e \in S$ for some $S$ such that $\Delta \vdash S <: T$.

3.4.4 Theorem [Subject reduction]: If $\Delta; \Gamma \vdash e \in T$ and $e \to e'$, then $\Delta; \Gamma \vdash e' \in T'$, for some $T'$ such that $\Delta \vdash T' <: T$.

Proof sketch: By induction on the derivation of $e \to e'$, with a case analysis on the reduction rule used. We show in detail just the base case where $e$ is a method invocation. From the premises of the rule GR-Invk, we have

$$\begin{aligned}
e &= \mathtt{new}\ N(\bar{e}).m{<}\bar{V}{>}(\bar{d}) \\
mbody(m{<}\bar{V}{>}, N) &= (\bar{x}, e_0) \\
e' &= [\bar{d}/\bar{x},\ \mathtt{new}\ N(\bar{e})/\mathtt{this}]\,e_0.
\end{aligned}$$

By the rules GT-Invk and GT-New, we also have

$$\begin{aligned}
&\Delta; \Gamma \vdash \mathtt{new}\ N(\bar{e}) \in N \\
&mtype(m, bound_\Delta(N)) = {<}\bar{Y} \triangleleft \bar{O}{>}\,\bar{U} \to U \\
&\Delta \vdash \bar{V} <: [\bar{V}/\bar{Y}]\bar{O} \\
&\Delta \vdash \bar{V}\ \text{ok} \\
&\Delta; \Gamma \vdash \bar{d} \in \bar{S} \\
&\Delta \vdash \bar{S} <: [\bar{V}/\bar{Y}]\bar{U} \\
&T = [\bar{V}/\bar{Y}]U.
\end{aligned}$$

By examining the derivation of $mtype(m, bound_\Delta(N))$, we can find a supertype $C{<}\bar{T}{>}$ of $N$ where

$$\begin{aligned}
&\bar{Y} <: \bar{O};\ \bar{x} : \bar{U},\ \mathtt{this} : C{<}\bar{T}{>} \vdash e_0 \in S \\
&\bar{Y} <: \bar{O} \vdash S <: U
\end{aligned}$$

and none of the $\bar{Y}$ appear in $\bar{T}$. Now, by Lemma 3.4.2,

$$\Delta;\ \bar{x} : [\bar{V}/\bar{Y}]\bar{U},\ \mathtt{this} : C{<}\bar{T}{>} \vdash e_0 \in [\bar{V}/\bar{Y}]S.$$

From this, a straightforward weakening lemma (not shown here), plus Lemma 3.4.3 and Lemma 3.4.1, gives $\Delta; \Gamma \vdash e' \in S'$ for some $S'$ such that $\Delta \vdash S' <: [\bar{V}/\bar{Y}]S$. Letting $T' = S'$ finishes the case, since $\Delta \vdash S' <: [\bar{V}/\bar{Y}]U$ by S-Trans.

3.4.5 Theorem [Progress]: Suppose $e$ is a well-typed expression.
(1) If $e$ includes $\mathtt{new}\ N_0(\bar{e}).f$ as a subexpression, then $fields(N_0) = \bar{T}\ \bar{f}$ and $f \in \bar{f}$.
(2) If $e$ includes $\mathtt{new}\ N_0(\bar{e}).m{<}\bar{V}{>}(\bar{d})$ as a subexpression, then $mbody(m{<}\bar{V}{>}, N_0) = (\bar{x}, e_0)$ and $\#(\bar{x}) = \#(\bar{d})$.

We write $\to_{\mathrm{FJ}}$ and $\to_{\mathrm{FGJ}}$ to show which set of reduction rules is used.

3.4.6 Theorem [Backward compatibility]: If an FJ program $(e, CT)$ is well typed under the typing rules of FJ, then it is also well typed under the rules of FGJ. Moreover, for all FJ programs $e$ and $e'$ (whether well typed or not), $e \to_{\mathrm{FJ}} e'$ iff $e \to_{\mathrm{FGJ}} e'$.

Proof: The first half is shown by straightforward induction on the derivation of $\Gamma \vdash e \in C$ (using FJ typing rules), followed by an analysis of the rules GT-Method and GT-Class. In the second half, both directions are shown by induction on a derivation of the reduction relation, with a case analysis on the last rule used.

4 Compiling FGJ to FJ

We now explore the second implementation style for GJ and FGJ. The current GJ compiler works by translation into the standard JVM, which maintains no information about type parameters at run-time. We model this compilation in our framework by an erasure translation from FGJ into FJ. We show that this translation maps well-typed FGJ programs into well-typed FJ programs, and that the behavior of a program in FGJ matches (in a suitable sense) the behavior of its erasure under the FJ reduction rules.

A program is erased by replacing types with their erasures, inserting downcasts where required. A type is erased by removing type parameters, and replacing type variables with the erasure of their bounds. For example, the class Pair<X,Y> in the previous section erases to the following:

    class Pair extends Object {
      Object fst;
      Object snd;
      Pair(Object fst, Object snd) {
        super(); this.fst=fst; this.snd=snd;
      }
      Pair setfst(Object newfst) {
        return new Pair(newfst, this.snd);
      }
    }

Similarly, the field selection

    new Pair<A,B>(new A(), new B()).snd

erases to

    (B)new Pair(new A(), new B()).snd

where the added downcast (B) recovers type information of the original program. We call such downcasts inserted by erasure synthetic.
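The Pair example can also be tried out in ordinary Java. The following is a minimal sketch, not the GJ compiler's actual output: the erased class is written by hand, the wrapper class `ErasedPairDemo` and the method `erasedSnd` are names made up for illustration, and the fields are initialized by assignment, which full Java allows but FJ does not model.

```java
public class ErasedPairDemo {
    public static class A { }
    public static class B { }

    // Hand-written erasure of Pair<X,Y>: fields and parameters become Object.
    public static class Pair {
        public final Object fst;
        public final Object snd;

        public Pair(Object fst, Object snd) {
            this.fst = fst;
            this.snd = snd;
        }

        public Pair setfst(Object newfst) {
            return new Pair(newfst, this.snd);
        }
    }

    // Erasure of the FGJ expression: new Pair<A,B>(new A(), new B()).snd
    public static B erasedSnd() {
        // The field snd has static type Object after erasure, so a
        // synthetic downcast to B is needed to recover the original type.
        return (B) new Pair(new A(), new B()).snd;
    }

    public static void main(String[] args) {
        System.out.println(erasedSnd() instanceof B);
    }
}
```

Without the cast, the return statement in `erasedSnd` would not type-check, since an `Object` cannot be returned where a `B` is expected. This is the role the synthetic downcast plays.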
4.1 Erasure of Types

To erase a type, we remove any type parameters and replace type variables with the erasure of their bounds. Write $|T|_\Delta$ for the erasure of type $T$ with respect to type environment $\Delta$:

$$|T|_\Delta = C \quad \text{where } bound_\Delta(T) = C{<}\bar{T}{>}.$$

4.2 Field and Method Lookup

In FGJ (and GJ), a subclass may extend an instantiated superclass. This means that, unlike in FJ (and Java), the types of the fields and the methods in the subclass may not be identical to the types in the superclass. In order to specify a type-preserving erasure from FGJ to FJ, it is necessary to define additional auxiliary functions that look up the type of a field or method in the highest superclass in which it is defined.

For example, we previously defined the generic class Pair<X,Y>. We may declare a specialized subclass PairOfA as a subclass of the instantiation Pair<A,A>, which instantiates both X and Y to a given class A.

    class PairOfA extends Pair<A,A> {
      PairOfA(A fst, A snd) {
        super(fst, snd);
      }
      ...
    }

The erasure of PairOfA is:

    class PairOfA extends Pair {
      PairOfA(Object fst, Object snd) {
        super(fst, snd);
      }
      Pair setfst(Object newfst) {
        return new PairOfA(newfst, this.fst);
      }
    }

The maximum method type of $m$ in $C$, written $mtypemax(m, C)$, is defined as follows:

$$\frac{CT(C) = \mathtt{class}\ C{<}\bar{X} \triangleleft \bar{N}{>} \triangleleft D{<}\bar{U}{>}\ \{\ldots\} \qquad {<}\bar{Y} \triangleleft \bar{O}{>}\,\bar{T} \to T = mtype(m, D{<}\bar{U}{>})}{mtypemax(m, C) = mtypemax(m, D)}$$

$$\frac{\begin{array}{c} CT(C) = \mathtt{class}\ C{<}\bar{X} \triangleleft \bar{N}{>} \triangleleft D{<}\bar{U}{>}\ \{\ldots\} \qquad mtype(m, D{<}\bar{U}{>})\ \text{undefined} \\ {<}\bar{Y} \triangleleft \bar{O}{>}\,\bar{T} \to T = mtype(m, C{<}\bar{X}{>}) \qquad \Delta = \bar{X} <: \bar{N},\ \bar{Y} <: \bar{O} \end{array}}{mtypemax(m, C) = |\bar{T}|_\Delta \to |T|_\Delta}$$

We also need a way to look up the maximum type of a given field. If $fieldsmax(C) = \bar{D}\ \bar{f}$ then we set $fieldsmax(C)(f_i) = D_i$.

4.3 Erasure of Expressions

The erasure of an expression depends on the typing of that expression, since the types are used to determine which downcasts to insert. The erasure rules are optimized to omit casts when it is trivially safe to do so; this happens when the maximum type is equal to the erased type.

Write $|e|_{\Delta,\Gamma}$ for the erasure of a well-typed expression $e$ with respect to environment $\Gamma$ and type environment $\Delta$:

$$|x|_{\Delta,\Gamma} = x$$

$$\frac{\begin{array}{c} \Delta; \Gamma \vdash e_0.m{<}\bar{V}{>}(\bar{e}) \in T \qquad \Delta; \Gamma \vdash e_0 \in T_0 \\ mtypemax(m, |T_0|_\Delta) = \bar{C} \to D \qquad D = |T|_\Delta \end{array}}{|e_0.m{<}\bar{V}{>}(\bar{e})|_{\Delta,\Gamma} = |e_0|_{\Delta,\Gamma}.m(|\bar{e}|_{\Delta,\Gamma})}$$
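The lookup performed by mtypemax can be sketched in Java as a walk up the class hierarchy. This is a simplified model under stated assumptions, not the paper's definition verbatim: class names are plain strings, each class-table entry already stores its declared method signatures in erased form as strings, and the names `MtypeMaxSketch`, `ClassInfo`, `CT`, and `defines` are hypothetical. The signature recorded for `setfst` in `PairOfA` is likewise assumed for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class MtypeMaxSketch {
    // Hypothetical class-table entry: the superclass name and, for each method
    // declared in the class, its signature erased under the class's own bounds.
    static final class ClassInfo {
        final String superName;
        final Map<String, String> declared;

        ClassInfo(String superName, Map<String, String> declared) {
            this.superName = superName;
            this.declared = declared;
        }
    }

    static final Map<String, ClassInfo> CT = new HashMap<>();
    static {
        Map<String, String> pair = new HashMap<>();
        pair.put("setfst", "Object -> Pair");
        CT.put("Pair", new ClassInfo("Object", pair));

        Map<String, String> pairOfA = new HashMap<>();
        pairOfA.put("setfst", "A -> PairOfA"); // assumed for illustration
        CT.put("PairOfA", new ClassInfo("Pair", pairOfA));
    }

    // True when m is declared in c or in one of its superclasses,
    // i.e. when mtype(m, c) would be defined.
    public static boolean defines(String m, String c) {
        ClassInfo info = CT.get(c);
        if (info == null) {
            return false; // Object (or an unknown class) declares no methods here
        }
        return info.declared.containsKey(m) || defines(m, info.superName);
    }

    // First rule: if the superclass defines m, defer to it.
    // Second rule: otherwise use the erased signature declared in c itself.
    public static String mtypemax(String m, String c) {
        ClassInfo info = CT.get(c);
        if (info == null || !defines(m, c)) {
            throw new IllegalArgumentException(m + " is not defined in " + c);
        }
        if (defines(m, info.superName)) {
            return mtypemax(m, info.superName);
        }
        return info.declared.get(m);
    }

    public static void main(String[] args) {
        System.out.println(mtypemax("setfst", "PairOfA"));
    }
}
```

In this model the lookup for `setfst` in `PairOfA` defers to `Pair`, which is why the erased method in `PairOfA` takes an `Object` and returns a `Pair`.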
(In GJ, the actual erasure is somewhat more complex, involving the introduction of bridge methods, so that one ends up with two overloaded methods: one with the maximum type, and one with the instantiated type. We don't model that extra complexity here, because it depends on overloading of method names, which is not modeled in FJ.)

The erasure of constructors and classes is:

$$|C(\bar{U}\ \bar{g},\ \bar{T}\ \bar{f})\ \{\mathtt{super}(\bar{g});\ \mathtt{this}.\bar{f} = \bar{f};\}|_C = C(fieldsmax(C))\ \{\mathtt{super}(\bar{g});\ \mathtt{this}.\bar{f} = \bar{f};\}$$

with the erasure of a class taken with respect to $\Delta = \bar{X} <: \bar{N}$.

Having defined erasure, we may investigate some of its properties. First, a well-typed FGJ program erases to a well-typed FJ program, as expected:

4.5.1 Theorem [Erasure preserves typing]: If an FGJ class table $CT$ is ok and $\Delta; \Gamma \vdash e \in T$, then $|\Gamma|_\Delta \vdash |e|_{\Delta,\Gamma} \in |T|_\Delta$ and $|CT|$ is ok using FJ rules.

... casts can persist for a while in the FJ expression, although we expect those casts will eventually turn out to be upcasts when $a$ reduces to a new expression.

In the example above, an FJ expression $d$ reduced from $|e|_{\Delta,\Gamma}$ had more synthetic casts than $|e'|_{\Delta,\Gamma}$. However, this is not always the case: $d$ may have fewer casts than $|e'|_{\Delta,\Gamma}$ when the reduction step involves method invocation. Consider the following class and its erasure:

    class C<X extends Object> extends Object {
      X f;
      C(X f) { this.f = f; }
      C<X> m() { return new C<X>(this.f); }
    }

Now consider the FGJ expression

    e = new C<A>(new A()).m()

and its erasure

$$|e|_{\Delta,\Gamma} = \mathtt{new\ C(new\ A()).m()}.$$
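For reference, the erasure of class C can be written out by hand in ordinary Java. The sketch below assumes the erasure rules apply as described in Section 4 (the type parameter X is replaced by its bound Object); the wrapper class `ErasedCDemo` and the method `erasedE` are names made up for illustration, and this is not claimed to be the listing from the original text.

```java
public class ErasedCDemo {
    public static class A { }

    // Hand-written erasure of: class C<X extends Object> extends Object
    public static class C {
        public final Object f;

        public C(Object f) {
            this.f = f;
        }

        // The body new C<X>(this.f) erases to new C(this.f). No cast is
        // inserted, because the field type Object matches the parameter type.
        public C m() {
            return new C(this.f);
        }
    }

    // Erasure of the FGJ expression: new C<A>(new A()).m()
    public static C erasedE() {
        return new C(new A()).m();
    }

    public static void main(String[] args) {
        System.out.println(erasedE().f instanceof A);
    }
}
```

The erased expression contains no synthetic casts, and the erased body of `m` contains none either, since every type involved erases to the maximum type.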