
The Tiger compiler

Slides from Ivan Pribela,
modified by Nalinadevi Kadiresan

Contents

The Tiger language
Course structure
Course sequence
Object-oriented Tiger language
Functional Tiger language
Existing course adaptation to Tiger
Tiger compiler
The Tiger language
Tiger programming language
is a simple but nontrivial language
belongs to the Algol family, with nested scopes
heap-allocated records with implicit pointers
arrays, integer and string variables
a few simple structured control constructs
Easily modified to be
a functional programming language
object-oriented
The Tiger language
Lexical issues
Identifiers
sequence of letters, digits and underscores,
starting with letter
Comments
starting with /* and ending with */
can appear between any two tokens
can be nested
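The lexical rules above can be sketched in a few lines. This is a minimal illustration, not the course's lexer: the names IDENT and skip_comment are assumptions made for the example, and nesting is handled by counting comment depth.

```python
import re

# Identifier rule from the slide: a letter, then letters, digits, underscores.
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def skip_comment(text, start):
    """Return the index just past the comment that opens at text[start].

    Assumes text[start:start + 2] == "/*". Nested comments are handled
    by counting depth; an unterminated comment raises ValueError.
    """
    depth = 0
    i = start
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/":
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated comment")
```

A plain regular expression cannot match nested comments, which is why lexer generators usually handle them with an explicit counter like the one here.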
Declarations
decs → { dec }
dec → tydec | vardec | fundec
Declaration sequence
a sequence of type, value
and function declarations
no punctuation separates or
terminates individual
declarations
Data Types
tydec → type id = ty
ty → id | { tyfld } | array of id
tyfld → ε | id : type-id { , id : type-id }
Built-in types
int and string
can be redefined
Type equality
by name
Mutually recursive
types must be declared in a consecutive sequence
type list = {hd: int, tl: list}
Field name reusability
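Type equality by name can be illustrated with a small sketch. The RecordType class, same_type function and type_env table are hypothetical names chosen for this example; the point is only that two record declarations with identical fields are still different types, while a plain alias names the same type.

```python
class RecordType:
    """A record type; every declaration creates a new, distinct type."""

    def __init__(self, fields=None):
        # fields maps a field name to its type. It can be filled in later,
        # which is what a recursive type such as list needs.
        self.fields = fields if fields is not None else {}


def same_type(t1, t2):
    # Name equivalence: two types are equal only if they come from the
    # same declaration, never because their fields happen to match.
    return t1 is t2


INT = "int"

# type a = {x: int}    type b = {x: int}    type c = a
type_env = {}
type_env["a"] = RecordType({"x": INT})
type_env["b"] = RecordType({"x": INT})
type_env["c"] = type_env["a"]   # c is another name for a

# type list = {hd: int, tl: list} refers to itself.
type_env["list"] = RecordType()
type_env["list"].fields = {"hd": INT, "tl": type_env["list"]}
```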

Variables
vardec → var id := exp | var id : type-id := exp
Variable type
in short form, type of the
expression is used
in long form, given type and
type of expression must
match
if the expression is nil, the long form
must be used
Variable lasts until end of
scope
Functions
Parameters
all parameters are passed by
value
Mutually recursive
declared in consecutive
sequence
fundec → function id ( tyfld ) = exp | function id ( tyfld ) : type-id = exp
Scope rules
Variables
let ... vardec ... in exp end
Parameters
function id ( ... id1 : id2 ... ) = exp
Nested scopes
access to a variable in outer scopes is permitted
Types
let ... tydec ... in exp end
Scope rules
Functions
let ... fundec ... in exp end
Name spaces
two name spaces: one for types, one for variables and functions
Local redeclarations
object can be hidden in a smaller scope
mutually recursive objects must have different
names
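The scope rules above map naturally onto a stack of tables. The following is a minimal sketch under the assumption that each name space is a separate environment; the Env class and its method names are invented for illustration.

```python
class Env:
    """A stack of scopes; bindings in inner scopes hide outer ones."""

    def __init__(self):
        self.scopes = [{}]

    def begin_scope(self):
        # Entering let ... in ... end or a function body.
        self.scopes.append({})

    def end_scope(self):
        # Leaving the scope discards its bindings.
        self.scopes.pop()

    def bind(self, name, value):
        self.scopes[-1][name] = value

    def lookup(self, name):
        # Search from the innermost scope outward.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise KeyError(name)


# Two name spaces: a type and a variable may share the same name.
types = Env()
values = Env()
```

Because types and values live in separate tables, declaring a type a and a variable a in the same scope does not cause a clash.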
Values
L-values
location whose value may be
read or assigned
variables, procedure
parameters, fields of records
and elements of the array
lvalue → id | lvalue . id | lvalue [ exp ]
Expressions
L-value
evaluates to the location contents
Valueless expressions
procedure calls, assignment, if-then, while,
break, and sometimes if-then-else
Nil
expression nil denotes a value nil
when used, it must have a type determined
Sequencing
sequence of expressions (exp1; exp2; ...; expn)
Expressions
No value
empty sequence
let expression with empty in...end
Integer literal
sequence of digits
String literal
sequence of 0 or more printable characters between quotes
escape sequences: \\ \n \t \ddd \" \f...f\
Negation
Function call
has value of function result, or produces no value
Operations
Arithmetic operators
+ - * /
Comparison
= < > <= >= <>
produces: 0 for false, 1 for true
Boolean operators
& |
0 is considered false, any nonzero value is true
Records and arrays
Record creation
type-id { id = exp { , id = exp} }
Array creation
type-id [ exp1 ] of exp2
Assignment and Extent
record and array assignment is by
reference
records and arrays have infinite extent
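Reference assignment and infinite extent can be demonstrated with Python lists and dictionaries, which happen to behave the same way. This is only an analogy; the variable and function names are illustrative.

```python
# Plays the role of: var a := intArray [3] of 0
a = [0] * 3

# b := a copies the reference, not the elements.
b = a
b[0] = 42          # the change is visible through a as well


def make_cell():
    # The record is created inside the function but outlives the call:
    # records and arrays have infinite extent.
    cell = {"hd": 1, "tl": None}
    return cell


kept = make_cell()
```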
Statements
If-then-else
if exp1 then exp2 [ else exp3 ]
exp2 and exp3 must be the same type
While loop
while exp1 do exp2
For loop
for id := exp1 to exp2 do exp3
Statements
Break
terminates evaluation of nearest while or
for
Let
let decs in expseq end
evaluates decs
binds types, variables and functions
result (if any) is the result of last
expression
Parentheses
Standard library


function size (s: string): int

function substring (
s: string,
first: int, n: int
): string

function concat (
s1: string,
s2: string
): string


function print (s: string)
function flush ()
function getchar (): string


function ord (s: string): int
function chr (i: int): string


function not (i: int): int
function exit (i: int)
Sample Tiger programs

let
  function do_nothing1(a: int,
                       b: string): int = (
    do_nothing2(a + 1);
    0
  )
  function do_nothing2(d: int):
                       string = (
    do_nothing1(d, "str");
    " "
  )
in
  do_nothing1(0, "str2")
end

let
var a := 0
in
for i := 0 to 100 do (
a := a + 1;
()
)
end

Topic dependency
Target language
CISC
few registers (16, 8, or 6)
registers divided into classes
some operations available
only on certain registers
arithmetic operations on
registers and memory
two-address instructions

various addressing modes
variable length instructions
instructions with side effects
RISC
32 registers
only one class of
integer/pointer registers
arithmetic operations only
between registers
three-address instructions of
the form r1 ← r2 ⊕ r3
load and store only with
M[reg+const] addressing
every instruction is 32 bits long
one result or effect per
instruction
Course sequence
1. Introduction
2. Lexical analysis
3. Parsing
4. Abstract syntax
5. Semantic analysis
6. Activation records
7. Intermediate code
8. Blocks and traces
9. Instruction selection
10. Liveness analysis
11. Register allocation
12. Putting it all together
Course sequence
Phases
each phase is described in
one section
some compilers combine
parsing and semantic analysis
others put instruction
selection much later
simple compilers omit control
and dataflow analysis
1. Introduction
Modules and interfaces
large software is much
easier to understand
and to implement
Tools and software
context-free grammars
regular expressions
Data structures
intermediate representations
tables, trees
2. Lexical analysis
Transforms program text
reads program text
outputs sequence of tokens
Algorithm
generated from lexical
specification
JLex lexer generator
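A lexical specification is essentially a list of regular expressions, one per token class. The sketch below uses Python's re module in place of JLex, so it is an illustration of the idea and not the generated lexer; TOKEN_SPEC, KEYWORDS and tokenize are names assumed for the example, and string literals and comments are left out to keep it short.

```python
import re

# One (token class, regular expression) pair per rule; earlier rules win.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z][A-Za-z0-9_]*"),
    ("ASSIGN", r":="),
    ("OP", r"<>|<=|>=|[-+*/=<>&|]"),
    ("PUNCT", r"[(){}\[\],;:.]"),
    ("SKIP", r"\s+"),
]
KEYWORDS = {
    "array", "break", "do", "else", "end", "for", "function", "if", "in",
    "let", "nil", "of", "then", "to", "type", "var", "while",
}
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(text):
    """Turn program text into a list of (token class, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue                      # whitespace produces no token
        if kind == "ID" and lexeme in KEYWORDS:
            kind = lexeme.upper()         # reserved words get their own class
        tokens.append((kind, lexeme))
    return tokens
```

For example, tokenize("var a := 0") yields the classes VAR, ID, ASSIGN and INT in that order.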
3. Parsing
Checks program syntax
detects errors in order of
tokens
Parsing algorithm
LALR(1) parsing
CUP parser generator
4. Abstract syntax
Improves modularity
syntax analysis is separated
from semantic analysis
Semantic actions
during parsing
produce abstract parse tree
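To show what "semantic actions produce an abstract parse tree" means, here is a small hand-written recursive-descent parser. Note that this is a different technique from the LALR(1) parsing that CUP generates; it is used here only because it fits in a few lines. The tuple-based tree shape and the function name parse_expr are assumptions for the sketch, which covers only +, *, parentheses, identifiers and integers.

```python
import re


def parse_expr(text):
    """Parse an arithmetic expression into nested tuples (an abstract tree)."""
    tokens = re.findall(r"\d+|[A-Za-z][A-Za-z0-9_]*|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        token = advance()
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected )")
            return node
        if token in {"+", "*", ")"}:
            raise SyntaxError(f"unexpected token {token!r}")
        if token.isdigit():
            return ("Int", int(token))      # semantic action: build a leaf
        return ("Var", token)

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("Op", "*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("Op", "+", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

The tree keeps only what later phases need: parentheses and operator precedence are reflected in the shape of the tree and no longer appear as tokens.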
5. Semantic analysis
Checks program semantics
reports scope and type
errors
Actions
builds symbol tables
performs scope analysis
checks types
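A type checker walks the abstract syntax and consults the symbol table. The sketch below is deliberately tiny and makes several assumptions: trees are nested tuples, the symbol table is a plain dictionary from variable names to type names, and operators are restricted to integer operands. It enforces the rule that both branches of if-then-else must have the same type.

```python
def type_of(exp, env):
    """Return the type name of exp, or raise TypeError on a semantic error."""
    kind = exp[0]
    if kind == "Int":
        return "int"
    if kind == "String":
        return "string"
    if kind == "Var":
        if exp[1] not in env:
            raise TypeError(f"undefined variable {exp[1]}")   # scope error
        return env[exp[1]]
    if kind == "Op":
        _, op, left, right = exp
        if type_of(left, env) != "int" or type_of(right, env) != "int":
            raise TypeError(f"operands of {op} must be int")
        return "int"
    if kind == "If":
        _, test, then_branch, else_branch = exp
        if type_of(test, env) != "int":
            raise TypeError("condition must be int")
        then_type = type_of(then_branch, env)
        else_type = type_of(else_branch, env)
        if then_type != else_type:
            raise TypeError("then and else branches differ in type")
        return then_type
    raise ValueError(f"unknown node {kind}")
```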
6. Activation records
Functions, local variables
several invocations of the
same function may coexist
each invocation has its own
instances of local variables
Stack frames
local variables, parameters
return address, temporaries
static and dynamic links
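The idea that every invocation owns its local variables, and that static links give access to enclosing scopes, can be simulated directly. In this sketch the Frame class, the factorial function and the variable unit are all invented for illustration, and lookup searches by name, whereas a real compiler would follow a number of static links known at compile time.

```python
class Frame:
    """One activation record: locals plus a link to the enclosing frame."""

    def __init__(self, function, static_link=None):
        self.function = function
        self.static_link = static_link   # frame of the enclosing function
        self.locals = {}

    def lookup(self, name):
        # Follow static links outward until the variable is found.
        frame = self
        while frame is not None:
            if name in frame.locals:
                return frame.locals[name]
            frame = frame.static_link
        raise NameError(name)


def factorial(n, enclosing, trace):
    frame = Frame("factorial", static_link=enclosing)  # new record per call
    frame.locals["n"] = n
    trace.append(frame)
    if n == 0:
        return frame.lookup("unit")        # found in the outer frame
    rest = factorial(n - 1, enclosing, trace)
    return frame.locals["n"] * rest        # this call's n is still intact


outer = Frame("main")
outer.locals["unit"] = 1
frames = []
result = factorial(3, outer, frames)
```

After the call, frames holds four distinct records whose n values are 3, 2, 1 and 0, one per invocation.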
7. Intermediate code
Allows portability
only N front ends and M back
ends are needed, not N × M compilers
Abstract machine language
can express target machine
operations
independent of details of the
source language
represented by simple
expression trees
Intermediate code

a + b * 4


if a = b
then break
else x:= 5
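As an illustration of tree-shaped intermediate code, the expression a + b * 4 can be built from a handful of node types. The node names follow common textbook usage (CONST, TEMP, BINOP, MEM), but the frame offsets 8 and 12 and the helper frame_var are assumptions made for this sketch; it also assumes a and b are local variables stored in the frame.

```python
from collections import namedtuple

CONST = namedtuple("CONST", "value")
TEMP = namedtuple("TEMP", "name")
BINOP = namedtuple("BINOP", "op left right")
MEM = namedtuple("MEM", "addr")


def frame_var(offset):
    # A local variable lives at a fixed offset from the frame pointer.
    return MEM(BINOP("PLUS", TEMP("fp"), CONST(offset)))


A_OFFSET, B_OFFSET = 8, 12   # hypothetical frame layout

# a + b * 4
tree = BINOP("PLUS",
             frame_var(A_OFFSET),
             BINOP("MUL", frame_var(B_OFFSET), CONST(4)))


def evaluate(node, temps, memory):
    """Interpret a tree, given register contents and a memory dictionary."""
    if isinstance(node, CONST):
        return node.value
    if isinstance(node, TEMP):
        return temps[node.name]
    if isinstance(node, MEM):
        return memory[evaluate(node.addr, temps, memory)]
    if isinstance(node, BINOP):
        left = evaluate(node.left, temps, memory)
        right = evaluate(node.right, temps, memory)
        if node.op == "PLUS":
            return left + right
        if node.op == "MUL":
            return left * right
        raise ValueError(f"unknown operator {node.op}")
    raise TypeError(f"unknown node {node!r}")
```

With fp = 100, a = 3 stored at address 108 and b = 5 at address 112, evaluating the tree gives 23.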
8. Blocks and traces
Basic blocks
begins with a label
ends with a jump
contains no other labels or jumps
Traces
blocks can be arranged in
any order
arrange them so that most jumps are
immediately followed by their target label
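Splitting a statement list into basic blocks is a single left-to-right scan. The sketch below assumes statements are tuples whose first element is a tag such as "LABEL", "JUMP" or "CJUMP"; the invented label names (_L1, ...) and the final jump to "done" are conventions chosen for this example.

```python
def basic_blocks(stmts):
    """Split a flat list of statements into basic blocks.

    Every returned block starts with a LABEL and ends with a JUMP or CJUMP.
    """
    blocks = []
    current = []
    counter = 0
    for stmt in stmts:
        if stmt[0] == "LABEL":
            if current:
                # Control falls into this label: close the previous block
                # with an explicit jump to it.
                current.append(("JUMP", stmt[1]))
                blocks.append(current)
            current = [stmt]
        else:
            if not current:
                # A block must begin with a label, so invent one.
                counter += 1
                current = [("LABEL", f"_L{counter}")]
            current.append(stmt)
            if stmt[0] in ("JUMP", "CJUMP"):
                blocks.append(current)
                current = []
    if current:
        current.append(("JUMP", "done"))
        blocks.append(current)
    return blocks
```

Once every block is self-contained in this way, the blocks can be reordered freely, which is what trace construction relies on.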
9. Instruction selection
Allows portability
finding appropriate machine
instructions to implement IR
Tree patterns
one pattern represents one
instruction
instruction selection is tiling
of IR tree
Instruction selection

LOAD r1 ← M[fp + a]
ADDI r2 ← r0 + 4
MUL r2 ← ri × r2
ADD r1 ← r1 + r2
LOAD r2 ← M[fp + x]
STORE M[r1 + 0] ← r2

LOAD r1 ← M[fp + a]
ADDI r2 ← r0 + 4
MUL r2 ← ri × r2
ADD r1 ← r1 + r2
ADDI r2 ← fp + x
MOVE M[r1] ← M[r2]
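Tiling can be sketched with a greedy top-down strategy often called maximal munch: at each node, pick the largest pattern that fits, then recurse into what is left. The code below is a simplified illustration, not the course's code generator; the tuple encoding of trees, the temporary names t1, t2, ... and the numeric offset 8 standing for variable a are assumptions.

```python
import itertools


def select(tree):
    """Cover an expression tree with instruction tiles, largest tile first."""
    code = []
    fresh = (f"t{n}" for n in itertools.count(1))

    def binary(e, mnemonic, symbol):
        left = munch(e[1])
        right = munch(e[2])
        r = next(fresh)
        code.append(f"{mnemonic} {r} <- {left} {symbol} {right}")
        return r

    def munch(e):
        kind = e[0]
        if kind == "TEMP":
            return e[1]
        if kind == "CONST":
            r = next(fresh)
            code.append(f"ADDI {r} <- r0 + {e[1]}")
            return r
        if kind == "MEM":
            addr = e[1]
            if addr[0] == "PLUS" and addr[2][0] == "CONST":
                # Large tile: MEM(PLUS(e1, CONST c)) is a single LOAD.
                base = munch(addr[1])
                offset = addr[2][1]
            else:
                base = munch(addr)
                offset = 0
            r = next(fresh)
            code.append(f"LOAD {r} <- M[{base} + {offset}]")
            return r
        if kind == "PLUS":
            if e[2][0] == "CONST":
                # PLUS(e1, CONST c) is a single ADDI.
                left = munch(e[1])
                r = next(fresh)
                code.append(f"ADDI {r} <- {left} + {e[2][1]}")
                return r
            return binary(e, "ADD", "+")
        if kind == "MUL":
            return binary(e, "MUL", "*")
        raise ValueError(f"no tile for {kind}")

    result = munch(tree)
    return code, result


# Load of a[i]: MEM(PLUS(MEM(PLUS(fp, 8)), MUL(ri, 4)))
a_i = ("MEM", ("PLUS",
               ("MEM", ("PLUS", ("TEMP", "fp"), ("CONST", 8))),
               ("MUL", ("TEMP", "ri"), ("CONST", 4))))
instructions, where = select(a_i)
```

For this tree the sketch emits five instructions: a LOAD of a, an ADDI for the constant 4, a MUL, an ADD and a final LOAD.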
10. Liveness analysis
Detects needed values
determines which variable
will be needed in the future
Problem
IR has an unbounded number of
temporaries
target machine has a limited
number of registers
Solution
Control and dataflow graph
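Liveness is usually computed by iterating two dataflow equations over the control-flow graph until nothing changes: in[n] = use[n] ∪ (out[n] − def[n]) and out[n] = the union of in[s] over all successors s. The sketch below assumes the graph is given as a dictionary from node labels to (use, def, successors); the six-statement example program in the comments is made up for illustration.

```python
def liveness(nodes):
    """Compute live-in and live-out sets by fixed-point iteration.

    nodes maps a label to (use set, def set, list of successor labels).
    """
    live_in = {n: set() for n in nodes}
    live_out = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n, (use, defs, succs) in nodes.items():
            out = set()
            for s in succs:
                out |= live_in[s]
            new_in = use | (out - defs)
            if out != live_out[n] or new_in != live_in[n]:
                live_out[n], live_in[n] = out, new_in
                changed = True
    return live_in, live_out


# 1: a := 0        2: b := a + 1    3: c := c + b
# 4: a := b * 2    5: if a < N goto 2    6: return c
graph = {
    1: (set(), {"a"}, [2]),
    2: ({"a"}, {"b"}, [3]),
    3: ({"b", "c"}, {"c"}, [4]),
    4: ({"b"}, {"a"}, [5]),
    5: ({"a"}, set(), [2, 6]),
    6: ({"c"}, set(), []),
}
live_in, live_out = liveness(graph)
```

In this example a and b are never live at the same time, so they could share a register, while c is live throughout the loop.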
11. Register allocation
Assigns registers
links temporaries with
registers
Interference graph
is created from examination
of control and dataflow graph
is colored for registers to be
assigned
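Coloring an interference graph can be sketched with the classic simplify-and-select scheme: repeatedly remove a low-degree node and push it on a stack, then pop the nodes and give each one a color not used by its neighbors. This is a bare-bones illustration with assumed names (allocate, and None to mark a temporary that would have to be spilled); coalescing and real spill handling are left out.

```python
def allocate(graph, k):
    """Assign one of k registers (0 .. k-1) to each temporary.

    graph maps a temporary to the set of temporaries it interferes with.
    Temporaries that cannot be colored are mapped to None.
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack = []
    while work:
        # Simplify: remove the node with the fewest remaining neighbors.
        node = min(work, key=lambda n: (len(work[n]), n))
        stack.append(node)
        for neighbor in work[node]:
            work[neighbor].discard(node)
        del work[node]

    assignment = {}
    while stack:
        # Select: rebuild the graph in reverse order, coloring as we go.
        node = stack.pop()
        used = {assignment[m] for m in graph[node]
                if assignment.get(m) is not None}
        free = [c for c in range(k) if c not in used]
        assignment[node] = free[0] if free else None
    return assignment


# a and b each interfere with c, but not with each other.
interference = {"a": {"c"}, "b": {"c"}, "c": {"a", "b"}}
registers = allocate(interference, 2)
```

With two registers available, a and b end up in the same register and c in the other.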
12. Putting it all together
Properties
nested functions
missing structured values
tree intermediate
representations
register allocation
Remains
list all registers
procedure entry / exit
implement strings
Optimizations
17. Dataflow analysis
18. Loop optimizations
19. Static single assignment form
20. Pipelining, scheduling
21. Memory hierarchies
Optimizing compiler
transforms programs to
improve efficiency
uses dataflow analysis
Algorithms
Static single assignment form
use pipelining if available
utilize cache
13. Garbage collection
Algorithms
mark and sweep
reference counts
copying collection
generational collection
Incremental collection
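Of the algorithms listed, mark and sweep is the easiest to sketch. The heap is modeled here as a dictionary from object ids to the ids they point to, which is an assumption made for the example; a real collector works on raw memory and record descriptors.

```python
def mark_and_sweep(heap, roots):
    """Free every object that is unreachable from the roots.

    heap maps an object id to the list of object ids it points to.
    The heap is modified in place; the set of freed ids is returned.
    """
    # Mark phase: depth-first search from the roots.
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked or obj not in heap:
            continue
        marked.add(obj)
        stack.extend(heap[obj])

    # Sweep phase: everything left unmarked is garbage.
    garbage = set(heap) - marked
    for obj in garbage:
        del heap[obj]
    return garbage


# Objects 3 and 4 point at each other but nothing reachable points to them.
heap = {1: [2], 2: [1], 3: [4], 4: [3], 5: []}
freed = mark_and_sweep(heap, roots=[1])
```

The unreachable cycle 3 ↔ 4 is reclaimed here, which is exactly the case that plain reference counting cannot handle.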
Language modifications
14. Object-oriented languages
15. Functional languages
16. Polymorphic types
Object-oriented Tiger language
Object-oriented principles
Information hiding
a useful software principle
a module provides values of a
given type
only that module knows their
representation
Extension
inheritance

Object-Tiger
Tiger can easily be made
object-oriented

Program example

let

  var start := 10

  class Vehicle extends Object {
    var position := start
    method move (int x) = (
      position := position + x
    )
  }

  class Truck extends Vehicle {
    method move (int x) =
      if x <= 80 then
        position := position + x
  }

  var t := new Truck
  var v: Vehicle := t

in
  t.move(50);
  v.move(100)
end
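The effect of method overriding in this example can be checked with a direct Python rendering. This is only an analogy in another language, with class and variable names copied from the Object-Tiger program; it shows that the method is chosen by the object's class (Truck), not by the declared type of the variable (Vehicle).

```python
START = 10


class Vehicle:
    def __init__(self):
        self.position = START

    def move(self, x):
        self.position = self.position + x


class Truck(Vehicle):
    # Overrides move: a truck refuses to move more than 80 at once.
    def move(self, x):
        if x <= 80:
            self.position = self.position + x


t = Truck()
v = t            # v is used as a Vehicle but still refers to the truck

t.move(50)       # accepted: position becomes 60
v.move(100)      # dispatches to Truck.move and is ignored
```

After both calls the position is 60: the second call goes to the overriding method even though v has type Vehicle.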
Classes in Object-Tiger
Added class definitions
No multiple inheritance
Allows method override
dec → classdec
classdec → class class-id extends class-id { { classfield } }
classfield → vardec | method
method → method id ( tyfld ) = exp | method id ( tyfld ) : type-id = exp
Expressions in Object-Tiger
exp → new class-id | lvalue . id ( ) | lvalue . id ( exp { , exp } )
Added object construction
And method invocation

Functional Tiger language
Functional languages
Equational reasoning
if f(x) = a this time, then
f(x) must be a next time
Imperative languages
functions have side effects
x can be changed between
calls
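The difference between the two styles shows up as soon as a function is called twice with the same argument. In this sketch pure_f and make_impure are hypothetical names; the impure version keeps a hidden call counter as its side effect.

```python
def pure_f(x):
    # Depends only on its argument: f(x) is the same every time.
    return x * 2


def make_impure():
    state = {"calls": 0}

    def f(x):
        # Side effect: hidden state changes between calls.
        state["calls"] += 1
        return x * 2 + state["calls"]

    return f


impure_f = make_impure()
same = pure_f(3) == pure_f(3)            # equational reasoning holds
different = impure_f(3) != impure_f(3)   # and fails here
```

With the pure function, any occurrence of pure_f(3) can be replaced by its value; with the impure one, that substitution changes the program's behavior.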
Functions in Fun-Tiger
ty → ty -> ty | ( ty { , ty } ) -> ty | ( ) -> ty

exp → exp ( exp { , exp } ) | exp ( )
Added function type
function types are treated like
other data types
any expression can be called
PureFun-Tiger
Modifications
no assignment statement
no while and for
no compound statements