The imperative core JavaI of Java
Here we de_ne the basic model execJavaI , which de_nes the semantics of the sequential imperative core of Java with statements (appearing in method bodies) and expressions (appearing in statements) over the primitive types of Java. Each machine is started with an arbitrary but _xed Java program it has to interpret. Later we describe the signature of JavaI and the static semantics of the input programs. We explain the form in which these programs are supposed to appear to the ASM, namely as annotated abstract syntax trees, resulting from parsing and elaboration. For future use in the proofs we also list the constraints which are imposed on the syntax of programs and on the types of the constructs appearing in them. We de_ne the ASM rules for the dynamic semantics of JavaI programs. Static semantics of Java
The primitive types of JavaI are: boolean, byte, short, int, long, float, double, char. Types are denoted by capital letters A, B, C. The types byte, short, int, long and char are called integral types. The types float and double are called oating point types. Numeric types are integral types or oating point types. External representations of values of primitive type are called literals. Table below contains examples of literals.
A binary relation _ is de_ned between primitive types. In terms of the JLS, the relation A _ B means that there exists an identity conversion or a widening primitive conversion from A to B. In traditional terms, the relation A _ B means that A is a subtype of B, i.e., each value of type A can be used as a value of type B. In some cases, however, information may be lost. For example, if a 64-bit value of type long is converted to a 32-bit value of type float, then some precision may be lost.
De_nition The relation _ is the least relation on the set of primitive types which is reexive, transitive and has the following properties:
byte _ short _ int _ long _ float _ double; char _ int:
Reexive means that A _ A for each primitive type A. Transitive means that, if A _ B and B _ C, then A _ C.
The primitive types of Java
The syntax of the imperative core of Java is de_ned in Fig. below. It can also be viewed as de_ning corresponding domains (also called universes) of JavaI. Although in our ASMs we will extend some of these domains by a small number of auxiliary constructs which do not appear in the syntax of Java, we use the names of Java constructs also as names for the corresponding extended ASM universes.
Usually we denote domains by words beginning with a capital letter and write dom for elements of Dom, i.e., assuming without further mentioning that dom 2 Dom.
Fig. below uses universes which represent basic syntactic constructs of Java, namely:
Exp . . . . . expressions, Lit . . . . . . literals,
Asgn . . . . assignments, Loc . . . . . . local variables,
Stm . . . . . statements, Uop . . . . . unary operators,
Block . . . . blocks, Bop . . . . . binary operators,
Bstm . . . . block statements, Lab . . . . . . labels.
Local variables and labels are identi_ers, sequences of letters and digits starting with a letter. Java programs are written in the Unicode 16-bit character set and so letters and digits may be drawn from the entire Unicode character set.
The unary operators Uop are listed in Table below and the binary operators Bop in Table below. The function `max' in the column `Result type' of the two tables denotes the maximum of types with respect to the subtype relation _. Although the set of primitive types is not linearly ordered by the relation _, the maximum always exists for the special cases in Tables below.
The type cast operator `(B)' is considered as a unary operator in Table below provided that B is a primitive type. Binary operators are associated to the left: e1 bop e2 bop e3 is read as (e1 bop e2) bop e3.
Syntax of JavaI
A block statement `Aloc;' is called a local variable declaration. The local variable loc is declared to be of type A. The scope of the declaration of loc consists of the statements following the variable declaration in the block.
Consider the following declaration:
f: : : Aloc; bstm1 : : : bstmng
Then the scope of the local variable loc are the statements bstm1; : : : ; bstmn.
Constraint . A JavaI block must satisfy the following constraints:
- If a variable loc is used in an expression, then it is in the scope of a declaration of loc.
- A local variable declaration of loc is not in the scope of another declaration of loc, i.e., there are no hidden variables.
- A labeled statement `lab : stm' does not contain proper substatements with the same label lab.
- A jump statement `break lab;' occurs as a substatement of a labeled statement `lab : stm' with label lab.
- A jump statement `continue lab;' occurs as a substatement of a labeled loop statement `lab : while (exp) stm' with labellab.
- Every local variable must be initialized before it is used (rule of de_nite assignment).
In the following code fragment, for example, the compiler is able to conclude that the variable x is initialized before it is used:
if (0 < z)
x = 2;
x = 3;
y = x * x;
Unary operators for operands of primitive type
Binary operators for operands of primitive type
Type constraints for expressions of JavaI
Type checking of JavaI
Positions in the given program are denoted by small Greek letters _, _, , etc. The reader can think of positions either as positions in the source code of the program or as positions in an abstract syntax tree. Positions are displayed as superscripts, for example, as in _exp or in _stm. The set of positions of the given program is considered as a universe of the superuniverse of the ASM for JavaI. The universe of positions is called Pos.
Java programs are statically typed. The compiler has to verify that a program is well-typed. As a result of parsing and elaboration the parse tree of the program is annotated with type information. In the ASM, the annotation is modeled by a static function T which assigns to each position _ in an expression a type. The function T has to satisfy the constraints in Table below. Moreover, if an expression _exp is the test expression in an if or a while statement, then T (_) must be boolean.
During compilation and type checking explicit unary conversion operators (type casts) are inserted at places where they are necessary. For example, an explicit type cast is inserted in an assignment loc = exp, if the type of exp is di_erent from the declared type of loc. The assignment is replaced by loc = (A)exp, where A is the declared type of loc. In this way it is ensured that during run-time a variable of declared primitive type A always holds a value of type A. Table below contains the additional type constraints which the abstract syntax tree must satisfy after parsing and elaboration.
Type constraints after introduction of primitive type casts
Vocabulary of JavaI
The input program body, an annotated abstract syntax tree, is the initial value of the to be computed term restbody, a dynamic function which contains a) the still to be executed part of the current program (method body), b) the already computed and still needed expression values, and c) information about the current abruption or the successful termination of the computation. We view program execution as a walk through the annotated abstract syntax tree: at each position, the corresponding phrase, originally an expression or a statement, is evaluated or executed and then the control ow proceeds to the next phrase (at the same or the next position). We denote the current position by a 0-ary dynamic function
which represents an abstract program counter, pointing to the current expression or statement to be executed. We view restbody as a function which initially assigns to each position a phrase:
restbody: Pos ! Phrase
The expression restbody=pos denotes the currently to be computed subterm of restbody at pos. It will eventually be replaced by the computed value (element of Val ) of the subexpression, or by a reason for abruption (element of Abr), or by the constant Norm, which we use to denote successful termination (also called normal completion) of a statement. The universe Phrase therefore contains semi-evaluated expressions and statements which at certain positions may contain values, reasons for abruptions or Norm.
The set Abr serves to distinguish the normal termination of the control ow from its possible abruptions, which in JavaI can happen due to the restricted jump statements `break Lab;' and `continue Lab;'. It contains elements of form Break(lab) and Continue(lab) which indicate reasons for abrupt completion of a statement in JavaI. The notion of abruption will be extended in JavaC and JavaE to include also return and exception values.
The current value of the local variables is kept in the local environment function (association between local variables and their values)
type Locals = Map(Loc;Val )
which is updated upon execution of assignment statements or as side e_ect of expression evaluation. The universe Val , de_ned by type Val = boolean j byte j short j int j long j float j double j char contains the primitive values of Java: booleans, integers in speci_c ranges, and oating point numbers according to IEEE 754. For simplicity, we identify the booleans of Java with the corresponding ASM values True and False, and often abbreviate bool = True to bool .
This concludes the de_nition of the core signature restbody; pos; locals for the states of JavaI. Auxiliary state components and additions pertaining only to some special constructs will be presented in the corresponding sections