Académique Documents
Professionnel Documents
Culture Documents
OVERVIEW
Need for Intermediate Representation Types of Intermediate Representations
For k languages and n target architectures how many compilers we should have programmed?
C SPARC
Pascal
HP PA
k*n
FORTRAN
x86
C++
IBM PPC
C
Pascal
IR
SPARC
HP PA
FORTRAN
x86
IBM PPC
C++
k+n
Intermediate Representation
Machine and Language independent version of original source code. Increased abstraction Cleaner separation between the front end and back end Adds possibilities for re-targeting/cross-compilation. Supports compiler optimizations.
Trees, includes parse trees and (abstract) syntax trees Linear representation, three-address code
Intermediate representations are categorized according to where they fall between a high-level language and machine code.
IRs close to high-level language are called high-level IRs IRs close to assembly are called low-level IRs
Loop-structure, array subscripts, if-then-else statements etc. They tend to reflect the source language they are compiling more than low-level IRs.
Medium-level IRs are independent of both the source language and the target machine. Low-level IRs
Converts the array subscripts into explicit addresses and offsets. Tend to reflect the target architecture very closely, often machine-dependent.
An abstract syntax tree is a tree representation of the source code used as an IR where
Intermediate nodes may be collapsed Grouping units can be dispensed with etc,
Each node represents a piece of the program structure and the node will have references to its children subtrees. Grouping is done is such a way that the structure obtained drives the semantic processing and code generation.
EXAMPLE
EXAMPLE
EXAMPLE
In a syntax tree, there is only one path from root to each leaf of a tree.
When using trees as intermediate representations, it is often the case that some subtrees are duplicated. A logical optimization is to share the common sub-tree.
A syntax tree with more than one path from start symbol to terminals is called Directed Acyclic Graph (DAG).
They are hard to construct internally, but provide an obvious savings in space. They also highlight equivalent sections of code which will be used in optimizations like computing the needed result once and saving it, rather than re-generating it several times.
EXAMPLE
is:
EXAMPLE
THREE-ADDRESS CODES
Instructions
Assignment instruction Assignment instruction
Copy instruction
Unconditional jump
x := y
goto L
Conditional jump
Procedural call
if x relop y goto L
param x1 param x2 .. param xn call p, n
EXAMPLE 1
a := b * -c + b * -c
t1 := -c t2 := b * t1 t3 := -c t4 := b * t3 t5 := t2 + t4 a := t5
EXAMPLE 2
(1) if c == 0 goto 3 (2) goto 7 (3) if c < 20 goto 5 (4) goto 9 (5) c = c + 2 (6) goto 3 (7) t2 = n * n (8) c = t2 + 2 (9) (1) if c == 0 goto 5 (2) t2 = n * n (3) c = t2 + 2 (4) goto 8 (5) if c >= 20 goto 8 (6) c = c + 2 (7) goto 5 (8)
EXAMPLE 3
QUADRUPLES
Quadruple op y, z, result
Unary Operation
result := op y
op y, , result
Procedure call
// calls f(x+1,y) add x,1,t param t1,, param y,, call f,2,
Indexed Operation
EXAMPLE
Operator
Arg1
Arg2
Result
Code
0
1 2
uminus
sub uminus
c
b c t1
t1
t2 t3
t1=-c
t2=b*t1 t3=-c
3
4 5
mul
add mov
b
t2 t5
t3
t4
t4
t5 a
t4=b*t3
t5=t2+t4 a=t5
EXAMPLE
01: 02: 03: 04: 05: 06: 07: 08: mov add mov lt jmpf add mov mod 1,,x x,10,t1 t1,,y x,y,t2 t2,,17 x,1,t3 t3,,x x,2,t4 09: 10: 11: 12: 13: 14: 15: 16: 17: eq jmpf add mov jmp sub mov jmp t4,1,t5 t5,,14 y,1,t6 t6,,y ,,16 y,2,t7 t7,,y ,,4
if (x%2==1) then
y:=y+1;
else y:=y-2;
}
TRIPLES
A triple has only three fields, op, arg1 and arg2. Instead of a result field we can use pointers to the triple structure itself The DAG and triple representations are equivalent.
INDIRECT TRIPLES
Indirect triples consist of a listing of pointers to triples, rather than listing of triples themselves. Useful for an optimizing compiler
Can move instructions by reordering the instruction list, without affecting the triples themselves.
COMPARISON
Quadruples
direct access of the location for temporaries easier for optimization space efficiency
Triples
Indirect Triples
Front end analyzes the source code and creates an intermediate representation.
Details of the source language are confined to front end. Ensures that operators are applied to compatible operands.
Details of target machine are confined to back end.