
Brief Overview

Also see document from Education Board of ACM SIGPLAN (Motivation on web page)
Main Questions in PL
Q1: Is this a valid program?
Compile-time and run-time checking (in 6341: attribute
grammars and type systems)
Q2: What is this program supposed to do?
Precise language semantics (in 6341: operational
semantics and axiomatic semantics)
Q3: How do we execute this program correctly and
efficiently?
Implementation of compilers and interpreters (in 6341:
projects to build an interpreter; attribute grammars for
code generation in a compiler)
Why Study Foundations of PL?
Understand your tools better
Compilers, interpreters, virtual machines, code
checking tools, debuggers, assemblers, linkers
Write your own languages, compilers, analyzers, ...
Happens more often than you'd think
To fix bugs & make programs fast, often you need to understand what's happening under the hood

Most importantly: PL are the foundations of software; we need to be clear on what they mean and how to support their users with useful tools
Example: Inside a Compiler

[Figure: stages of a compiler, annotated with the course topics relevant to each stage]
Regular grammars & context-free grammars (expected background)
Attribute grammars, type checking, PL semantics
PL semantics
Attribute Grammars

Pagan Ch. 2.1, 2.2, 2.3, 3.2


Stansifer Ch. 2.2, 2.3
Slonneger and Kurtz Ch 3.1, 3.2
Formal Languages
Theoretical basis for the design and
implementation of programming languages
Alphabet: finite set T of symbols
String: finite sequence of symbols
Empty string ε
T* - set of all strings over T (incl. ε)
T+ - set of all non-empty strings over T
Language: set of strings L ⊆ T*

Grammars
G = (N, T, S, P)
Finite set of non-terminal symbols N
Finite set of terminal symbols T (this is our alphabet)
Starting non-terminal symbol S ∈ N
Finite set of productions P
Goal: define a language L ⊆ T*
Production: x → y
x: non-empty sequence of terminals and non-terminals
y: possibly-empty sequence of terminals/non-terminals
Applying a production: uxv ⇒ uyv

Languages and Grammars
Derivation of a string
w1 ⇒ w2 ⇒ ... ⇒ wn; denoted w1 ⇒* wn
Language generated by a grammar
L(G) = { w ∈ T* | S ⇒* w }
Traditional classification of languages and grammars
Regular ⊂ Context-free ⊂ Context-sensitive ⊂ Unrestricted

Use in Compilers and Interpreters
stream of characters: w,h,i,l,e,(,a,1,5,>,b,b,),d,o,...
    ↓ Lexical Analyzer (uses a regular grammar)
stream of tokens: keyword[while],leftparen,id[a15],op[>],id[bb],rightparen,keyword[do],...
    ↓ Parser (uses a context-free grammar)
parse tree (each token is a leaf in the parse tree)
    ↓ more components
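
The lexical-analysis stage above can be sketched with regular expressions. This is only an illustration in Python, not the course's implementation: the token kinds mirror the slide, while `TOKEN_SPEC`, `MASTER`, and `tokenize` are hypothetical names, and the set of keywords and operators is an assumption.

```python
import re

# Token kinds from the slide. Order matters: keywords are tried before identifiers.
TOKEN_SPEC = [
    ("keyword",    r"\b(?:while|do)\b"),
    ("id",         r"[A-Za-z][A-Za-z0-9]*"),
    ("op",         r"[<>=]"),
    ("leftparen",  r"\("),
    ("rightparen", r"\)"),
    ("skip",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Turn a character stream into a list of (kind, lexeme) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "skip":          # whitespace produces no token
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("while(a15>bb)do"))
```

Each named group plays the role of one regular-grammar rule; the parser would then consume the resulting token list rather than raw characters.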
Context-Free Languages
Strict superset of regular languages
Example: L = { a^n b^n | n > 0 } is context-free but not regular
Generated by a context-free grammar
Each production: A → w
A is a non-terminal, w is a (possibly empty) sequence of terminals and non-terminals
BNF: alternative notation for context-free grammars
Backus-Naur form: John Backus and Peter Naur, for
ALGOL60 (both have received the ACM Turing Award)
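
As a small illustration of how a context-free grammar generates the a^n b^n language mentioned above, the sketch below replays a derivation step by step. The productions S → ab | aSb are an assumption (the slides do not give a grammar for this language), and `derive` is a hypothetical helper name.

```python
def derive(n):
    """Return the sentential forms of a derivation of a^n b^n (n > 0),
    using the assumed productions S -> ab | aSb."""
    if n < 1:
        raise ValueError("n must be positive")
    forms = ["S"]
    for _ in range(n - 1):
        forms.append(forms[-1].replace("S", "aSb"))   # apply S -> aSb
    forms.append(forms[-1].replace("S", "ab"))        # finish with S -> ab
    return forms

print(" => ".join(derive(3)))   # S => aSb => aaSbb => aaabbb
```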

BNF Example
<stmt> ::= while <exp> do <stmt>
         | if <exp> then <stmt>
         | if <exp> then <stmt> else <stmt>
         | <exp> := <exp>
         | <id> ( <exprs> )
<exprs> ::= <exp> | <exprs> , <exp>

Note: if there are several productions <X> ::= α, <X> ::= β, ..., for convenience we write them as a single production <X> ::= α | β | ...
We say, for example, "the second production alternative"
Derivation Tree
Also called parse tree or concrete syntax tree
Leaf nodes: terminals
Inner nodes: non-terminals
Root: starting non-terminal of the grammar
Describes a particular way to derive a string
Leaf nodes from left to right are the string

Example of a Parse Tree
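<expr> ::= <term> | <expr> + <term>
<term> ::= x | y | z | (<expr>)

Parse tree for (x+y)+z:

<expr>
├── <expr>
│   └── <term>
│       ├── (
│       ├── <expr>
│       │   ├── <expr>
│       │   │   └── <term>
│       │   │       └── x
│       │   ├── +
│       │   └── <term>
│       │       └── y
│       └── )
├── +
└── <term>
    └── z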
Ambiguous Grammar
For some string, there are multiple parse trees
An ambiguous grammar
Gives more freedom to the compiler writer: e.g., for
code optimizations (several possible translations)
Allows under-specification of irrelevant details
Has to be disambiguated when we build a real parser
To remove ambiguity
add non-terminals, or
add operator precedence and associativity, or
use an attribute grammar (more later ...)

Elimination of Ambiguity
<expr> ::= <expr> + <expr> | <expr> * <expr> | ( <expr> ) | id

1. Prove that this grammar is ambiguous
2. Create an equivalent non-ambiguous grammar with the appropriate precedence and associativity
   * has higher precedence than +
   both are left-associative

Example: parse tree(s) for a + b * (c+d) * e for the two versions of the grammar
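
One common way to encode the precedence and associativity requirements is to add the non-terminals <term> and <factor>. The sketch below assumes that layered grammar (it is one possible answer to the exercise, not necessarily the intended one) and parses with it; `parse` and its nested helpers are hypothetical names, and the result is a nested tuple rather than a full parse tree.

```python
import re

def parse(text):
    """Parse with the assumed unambiguous grammar
         <expr>   ::= <expr> + <term>   | <term>
         <term>   ::= <term> * <factor> | <factor>
         <factor> ::= ( <expr> ) | id
    and return a nested tuple (op, left, right) or an identifier string."""
    # Characters outside the token set are silently ignored in this sketch.
    tokens = re.findall(r"[A-Za-z]\w*|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():
        node = term()
        while peek() == "+":          # looping builds left-associative trees
            advance()
            node = ("+", node, term())
        return node

    def term():
        node = factor()
        while peek() == "*":          # * is handled one level below +
            advance()
            node = ("*", node, factor())
        return node

    def factor():
        tok = peek()
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok is None or tok in "+*)":
            raise SyntaxError(f"unexpected token {tok!r}")
        return advance()

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("a + b * (c+d) * e"))
# ('+', 'a', ('*', ('*', 'b', ('+', 'c', 'd')), 'e'))
```

Because each string now has exactly one tree, the multiplication chain groups to the left and binds tighter than the addition.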

Abstract Syntax Trees (AST)
A simplified version of a concrete syntax tree, without loss of information [will not use in this course]
<procedure> ::= proc id ( <formals> ) { <body> }

Parse tree:
<procedure>
├── proc
├── id(p34)
├── (
├── <formals>
├── )
├── {
├── <body>
└── }

AST:
<procedure> [p34]
├── <formals>
└── <body>

AST for a + b * (c+d) * e?
Use of Context-Free Grammars
Syntax of a programming language
e.g. Java: Chapter 19 of the language specification (JLS)
defines a grammar
Terminals: identifiers, keywords, literals, separators,
operators
Starting non-terminal: CompilationUnit
Implementation of a parser in a compiler
e.g. the JLS grammar (Ch. 19) is used by the parser
inside the javac compiler

Limitations of Context-Free Grammars
Cannot represent semantics
Example: every variable used in a statement should be
declared earlier in the code or the use of a variable
should conform to its type (type checking)
Need to allow only programs that satisfy certain
context-sensitive conditions
An example of a context: prior declaration of x
exists and it is int x;
Cannot generate things other than parse trees
Example: what if we wanted to generate assembly code
for the given program?

Attribute Grammars
Generalization of context-free grammars
Used for semantic checking and other compile-time
analyses
e.g. type checking in a compiler
Used for translation
e.g. parse tree → assembly code
Implicitly represents a traversal of the parse tree and
the computation of information during traversal

Structure of an Attribute Grammar
Underlying context-free grammar
For a terminal or non-terminal: some attributes
For each attribute: type of possible values
e.g., integer, string, map(string → list(integer))
Each attribute is either synthesized or inherited (but
not both)
Set of evaluation rules for each production
Set of boolean conditions for attribute values

Example
L = { a^n b^n c^n | n > 0 }; not a context-free language
BNF
<start> ::= <A><B><C>
<A> ::= a | a<A>
<B> ::= b | b<B>
<C> ::= c | c<C>
Attributes
Na: associated with <A>
Nb: associated with <B>
Nc: associated with <C>
Type of possible values for Na, Nb, Nc: integer values

Example
Evaluation rules (similar for <B>, <C>)
<A> ::= a
<A>.Na := 1
| a<A>2
<A>.Na := 1 + <A>2.Na
Conditions
<start> ::= <A><B><C>
Cond: <A>.Na = <B>.Nb = <C>.Nc
a string belongs to the language defined by this attribute grammar
if and only if the parse tree satisfies the condition
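
The evaluation rules and the condition above can be mimicked in a few lines of Python. This is a minimal sketch, assuming the input is a plain string; `count` and `in_language` are hypothetical names, and the recursion in `count` follows the two production alternatives for <A> (and likewise <B>, <C>).

```python
def count(symbol, text, pos):
    """Synthesized attribute N for <A>, <B>, or <C>:
         <X> ::= s        gives N = 1
         <X> ::= s <X>2   gives N = 1 + <X>2.N
    Returns (N, position after the run), or None if no symbol is present."""
    if pos >= len(text) or text[pos] != symbol:
        return None
    rest = count(symbol, text, pos + 1)
    if rest is None:
        return 1, pos + 1              # <X> ::= s
    n, end = rest
    return 1 + n, end                  # <X> ::= s <X>2

def in_language(text):
    """<start> ::= <A><B><C> with Cond: Na = Nb = Nc."""
    pos, counts = 0, []
    for symbol in "abc":
        result = count(symbol, text, pos)
        if result is None:
            return False               # no parse tree for the underlying BNF
        n, pos = result
        counts.append(n)
    if pos != len(text):
        return False                   # trailing input: not derivable from <start>
    na, nb, nc = counts
    return na == nb == nc              # the Cond attached to <start>

print(in_language("aabbcc"), in_language("aabbc"))   # True False
```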

Parse Tree
<start>  Cond:true
├── <A>  Na:2
│   ├── a
│   └── <A>  Na:1
│       └── a
├── <B>  Nb:2
│   ├── b
│   └── <B>  Nb:1
│       └── b
└── <C>  Nc:2
    ├── c
    └── <C>  Nc:1
        └── c
Parse Tree for an Attribute Grammar
Valid tree for the underlying BNF
Each node has (attribute,value) pairs
One pair for each attribute associated with the node
Some nodes have boolean conditions
If there is a corresponding Cond: rule
Valid parse tree
Attribute values are consistent with the evaluation rules
All boolean conditions are true

Comments
If non-terminal X has an attribute A, each
occurrence of X in the parse tree must have a value
for A. The evaluation rules should define exactly
one value for A for a particular X node.
Attributes are not like program variables; cannot have:
<X>.A := 1 + <X>.A
In rules/conditions, can only refer to attributes of
non-terminals and terminals in the current
production alternative
Cannot look at grandparents, grandchildren, etc.

Synthesized vs. Inherited Attributes
Each non-terminal X: disjoint sets of synthesized
attributes and inherited attributes
An attribute A for X cannot be both

For each synthesized attribute A: each production alternative in X ::= ... should have exactly one evaluation rule for X.A

For each inherited attribute A: each occurrence of X in the right-hand side of a production Y ::= ... X ... X ... X ... should have exactly one evaluation rule for X.A
Synthesized vs. Inherited Attributes
Synthesized attributes convey information about
the subtree rooted at the node
Inherited attributes convey context conditions
E.g., information about variable declarations that
have appeared earlier in the program
The starting non-terminal does not have inherited
attributes
For simplicity: assume each terminal symbol has
one attribute lexval with a pre-defined value
The lexical analyzer sets these values (e.g., int value
for a token representing an integer constant)
Example (revisited)
<start> ::= <A><B><C>
    <B>.expNb := <A>.Na
    <C>.expNc := <A>.Na
<A> ::= a
    <A>.Na := 1
  | a<A>2
    <A>.Na := 1 + <A>2.Na
<B> ::= b                          (similarly for <C>)
    Cond: <B>.expNb = 1
  | b<B>2
    <B>2.expNb := <B>.expNb - 1

Na is synthesized, expNb and expNc are inherited
Example: Binary Numbers
Context-free grammar
<B> ::= <D>
<B> ::= <D><B>
<D> ::= 0
<D> ::= 1

Goal: compute the value of the binary number


Needed, for example, in compilers during code
translation

BNF Parse Tree for Input 1010
B
├── D
│   └── 1
└── B
    ├── D
    │   └── 0
    └── B
        ├── D
        │   └── 1
        └── B
            └── D
                └── 0

Define integer attributes
<B>: synthesized val
<B>: synthesized pos
<D>: inherited pow
<D>: synthesized val
Example: Binary Numbers
<B> ::= <D>
    <B>.pos := 1
    <B>.val := <D>.val
    <D>.pow := 0
<B>1 ::= <D><B>2
    <B>1.pos := <B>2.pos + 1
    <B>1.val := <B>2.val + <D>.val
    <D>.pow := <B>2.pos
<D> ::= 0
    <D>.val := 0
<D> ::= 1
    <D>.val := 2^<D>.pow
Evaluated Parse Tree
B  pos:4 val:10
├── D  pow:3 val:8
│   └── 1
└── B  pos:3 val:2
    ├── D  pow:2 val:0
    │   └── 0
    └── B  pos:2 val:2
        ├── D  pow:1 val:2
        │   └── 1
        └── B  pos:1 val:0
            └── D  pow:0 val:0
                └── 0
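
The attribute values in the tree above can be reproduced with a short recursive sketch. It assumes the rule for <D> ::= 1 is <D>.val := 2^<D>.pow, which is what the evaluated tree implies (pow:3 gives val:8); `evaluate` is a hypothetical name, and the input is taken as a bit string instead of a parse tree.

```python
def evaluate(bits):
    """Evaluate the attributes for <B> ::= <D> | <D><B>2 on a bit string.
    Returns (pos, val) of the root <B>."""
    if not bits or any(b not in "01" for b in bits):
        raise ValueError("not a binary number")
    head, tail = bits[0], bits[1:]
    if not tail:                                # <B> ::= <D>
        pos, pow_, rest_val = 1, 0, 0
    else:                                       # <B>1 ::= <D><B>2
        sub_pos, rest_val = evaluate(tail)      # <B>2.pos, <B>2.val
        pos, pow_ = sub_pos + 1, sub_pos        # <B>1.pos, <D>.pow
    d_val = 2 ** pow_ if head == "1" else 0     # <D>.val
    return pos, rest_val + d_val                # <B>.val

print(evaluate("1010"))   # (4, 10)
```

Note how pow flows downward (inherited, computed from the sibling's pos) while pos and val flow upward (synthesized).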
Example: Interpreter for Expression Language
Problem: evaluate an expression
<S> ::= <E>
<E> ::= 0 | 1 | <I> | (<E> + <E>) |
let <I> = <E> in <E> end
<I> ::= id
Attributes
<I>: synthesized string name
<E>: synthesized integer val,
inherited map env (short for environment)
type of env is map(string → int)
id: synthesized string lexval represents the string that
the lexical analysis puts inside token id
Attribute Grammar
<S> ::= <E>
    <E>.env := ∅ (the empty map)
<E> ::= 0
    <E>.val := 0
<E> ::= 1
    <E>.val := 1
<E> ::= <I>
    Cond: containsKey(<E>.env, <I>.name)
    <E>.val := lookup(<E>.env, <I>.name)
<I> ::= id
    <I>.name := id.lexval

containsKey and lookup are helper functions
Attribute Grammar
<E>1 ::= (<E>2 + <E>3)
    <E>1.val := <E>2.val + <E>3.val
    <E>2.env := <E>1.env
    <E>3.env := <E>1.env
<E>1 ::= let <I> = <E>2 in <E>3 end
    <E>2.env := <E>1.env
    <E>3.env := update(<E>1.env,<I>.name,<E>2.val)
    <E>1.val := <E>3.val

update creates a fresh environment; does not affect <E>1.env; if the name already exists, replaces the value
Example
Evaluation of let x = 1 in (x+x) end

S
└── E  env:∅ val:2
    ├── let
    ├── I
    │   └── id[x]
    ├── =
    ├── E  env:∅ val:1
    │   └── 1
    ├── in
    ├── E  env:x=1 val:2
    │   ├── (
    │   ├── E  env:x=1 val:1
    │   │   └── I
    │   │       └── id[x]
    │   ├── +
    │   ├── E  env:x=1 val:1
    │   │   └── I
    │   │       └── id[x]
    │   └── )
    └── end

Try this at home: let x = 1 in let x = (x+1) in (x+2) end end
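
The attribute grammar for the expression language translates almost directly into a recursive evaluator. The sketch below is an illustration under assumptions: expressions are given as nested Python tuples instead of parse trees, a `dict` stands in for the map env, and `eval_expr` and `eval_program` are hypothetical names.

```python
def eval_expr(e, env):
    """Compute <E>.val given the inherited attribute <E>.env (a dict)."""
    if isinstance(e, int):                       # <E> ::= 0 | 1
        if e not in (0, 1):
            raise ValueError(f"only the constants 0 and 1 exist: {e!r}")
        return e
    if isinstance(e, str):                       # <E> ::= <I>
        if e not in env:                         # Cond: containsKey(env, name)
            raise NameError(f"undefined identifier {e!r}")
        return env[e]                            # lookup(env, name)
    if isinstance(e, tuple) and len(e) == 3 and e[0] == "+":
        _, left, right = e                       # <E>1 ::= (<E>2 + <E>3)
        return eval_expr(left, env) + eval_expr(right, env)
    if isinstance(e, tuple) and len(e) == 4 and e[0] == "let":
        _, name, bound, body = e                 # let <I> = <E>2 in <E>3 end
        # update(...): build a fresh map, leaving env itself untouched
        new_env = {**env, name: eval_expr(bound, env)}
        return eval_expr(body, new_env)
    raise ValueError(f"not an expression: {e!r}")

def eval_program(e):
    return eval_expr(e, {})                      # <S> ::= <E>, empty environment

# let x = 1 in (x+x) end
print(eval_program(("let", "x", 1, ("+", "x", "x"))))   # 2
```

Passing `env` as a parameter corresponds to the inherited attribute, and the return value corresponds to the synthesized attribute val.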
More than Context-Free Power
L = { a^n b^n c^n | n > 0 } is not a context-free language
Unlike the context-free language L = { a^n b^n | n > 0 }, here we need explicit counting
L = { wcw | w ∈ {a,b}* }
The flavor of checking whether identifiers are declared before their uses
Cannot be done with a context-free grammar; requires context-sensitive conditions
Syntax analysis (i.e., parser), which is based on a context-free grammar, cannot handle context-sensitive semantic properties
Complex Evaluation Rules
<X>.A := ... could be rather complex: e.g., with helper functions, conditional expressions, etc.
Example:
<X>.A := if (<Y>.B = <Z>.C) then f1(<Y>.D) else f2(<Z>.E)
Must be if-then-else; cannot be if-then. Why?
Helper functions f1 and f2 can use basic algorithms and
local data structures, but cannot have global effects
Can only use attributes of non-terminals and terminals
that appear in this production alternative

Attribute Evaluation: Dependence Graph
<X>.A := <Y>.B + <Z>.C
Since the value of <X>.A depends on <Y>.B:
X.A ← Y.B   dependence edge (Y.B must be evaluated before X.A)
Since the value of <X>.A depends on <Z>.C:
X.A ← Z.C   dependence edge
<X>1.A := <X>2.A   (two different X nodes in the parse tree)
Since the value of <X>1.A depends on <X>2.A:
X1.A ← X2.A   dependence edge
Algorithm for Attribute Evaluation
Given a parse tree with attributes attached to tree
nodes, how do we compute the attribute values?

Step 1: find evaluation order of attributes


a) Build dependence graph where a node is a pair
(parse tree node, attribute)
b) Complain about cycles in the graph: cannot evaluate
c) Topologically sort the graph

Step 2: evaluate the attributes in sorted order
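
The two steps above can be sketched with Python's standard `graphlib` module (Python 3.9+). This is an illustration, not the course's algorithm as implemented anywhere: `evaluate_attributes` and the `rules` encoding are hypothetical, with each attribute instance named by a string such as "B2.pos".

```python
from graphlib import TopologicalSorter, CycleError

def evaluate_attributes(rules):
    """rules maps an attribute instance (e.g. "B2.pos") to a pair
    (dependencies, function). The function receives the dependency
    values in order and returns the attribute value."""
    # Step 1a: dependence graph, node -> set of nodes it depends on
    graph = {attr: set(deps) for attr, (deps, _) in rules.items()}
    try:
        # Step 1c: topological sort (dependencies come first)
        order = list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # Step 1b: complain about cycles
        raise ValueError(f"cyclic attribute dependencies: {err.args[1]}")
    values = {}
    for attr in order:                           # Step 2: evaluate in order
        deps, fn = rules[attr]
        values[attr] = fn(*(values[d] for d in deps))
    return values

# Binary number "10": B1 ::= D1 B2, B2 ::= D2, with D1 = 1 and D2 = 0
rules = {
    "B2.pos": ((), lambda: 1),
    "D2.pow": ((), lambda: 0),
    "D2.val": (("D2.pow",), lambda p: 0),
    "B2.val": (("D2.val",), lambda v: v),
    "B1.pos": (("B2.pos",), lambda p: p + 1),
    "D1.pow": (("B2.pos",), lambda p: p),
    "D1.val": (("D1.pow",), lambda p: 2 ** p),
    "B1.val": (("B2.val", "D1.val"), lambda b, d: b + d),
}
print(evaluate_attributes(rules)["B1.val"])   # 2
```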

Example: Binary Numbers
<B> ::= <D>
    <B>.pos := 1
    <B>.val := <D>.val
    <D>.pow := 0
<B>1 ::= <D><B>2
    <B>1.pos := <B>2.pos + 1
    <B>1.val := <B>2.val + <D>.val
    <D>.pow := <B>2.pos
<D> ::= 0
    <D>.val := 0
<D> ::= 1
    <D>.val := 2^<D>.pow
Dependence Graph for Binary Numbers

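[Figure: dependence graph drawn over the parse tree for 1010; the edges follow the evaluation rules. Graph nodes per parse tree node:]
B1: pos, val
    D1: pow, val   (bit 1)
    B2: pos, val
        D2: pow, val   (bit 0)
        B3: pos, val
            D3: pow, val   (bit 1)
            B4: pos, val
                D4: pow, val   (bit 0)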
Sort the Graph
Topological sort: x is smaller than y if x → y (y depends on x)

D4.pow, B4.pos, D4.val, B4.val,
B3.pos, D3.pow, D3.val, B3.val,
B2.pos, D2.pow, D2.val, B2.val,
B1.pos, D1.pow, D1.val, B1.val
Cycles
The notion of topological sort only makes sense
for directed acyclic graphs
Cycles in the dependence graph mean we have recursive dependencies
There are approaches to solve meaningful recursive
systems of equations (e.g., fixed-point computation)
But, in this course we will disallow cycles
No cyclic dependencies in your solutions in exams and
homeworks

Context-Sensitive Conditions in PL
Condition: ids that are used must have been declared
<prog> ::= <block>
<block> ::= begin <declist> ; <stmtlist> end
How to ensure that in <stmtlist> we only use variables declared in <declist>?
Using synthesized attributes:
<block> ::= ...
    Cond: <stmtlist>.used-ids ⊆ <declist>.declared-ids
Using inherited attributes: (inherited attr. allowed-ids for <stmtlist>)
<block> ::= ...
    <stmtlist>.allowed-ids := <declist>.declared-ids
Context-Sensitive Conditions in PL
Problem: nested blocks; for example, in Java we can have
{ int x,y; x = 1; y=2;
{ int y,z; y = 3; z = x+y; } // correct; uses the innermost y
x = y+z; } // incorrect
Consider
<block> ::= begin <declist> ; <stmtlist> end
Suppose we have another <block> inside <stmtlist>. How do we know
which variables are declared in the outer block and which ones are
declared in the inner block? Solution: use a stack
<block> ::= begin <declist> ; <stmtlist> end
<stmtlist>.allowed-ids-stack :=
push-on-top-of-stack(<declist>.declared-ids,<block>.allowed-ids-stack)
Long Example 1: Type Checking
Given program with declarations, check types of
variables (integer and boolean only)
Do not allow duplicate declarations in the same block:
e.g. int x; int x; or int y; bool y;
For type checking inside a nested block, use the
innermost variable declarations
In general, type checking is a form of semantic
checking that a compiler will perform after parsing,
on the parse tree (or, more likely, on the AST)
We will create an attribute grammar that (1) defines
when a program is well-typed, and (2) implicitly defines
a type checking algorithm
Data Structure
Use a stack of symbol tables
symbol table: set of pairs (name,type)

Build a symbol table for the declarations in a begin-end block
as a synthesized attribute tbl which is a map: map(string → { INT, BOOL })

Propagate a stack of symbol tables to statements and expressions
propagate downwards on the parse tree as an inherited attribute tbl-stack (a stack of maps)
Type Checking
Propagate downwards a stack of symbol tables
begin
    bool i;
    int j;
    begin
        int x;
        int i;
        x := i + j;
    end
end

tbl-stack at x := i + j (top of stack first):
[
  {("x",INT), ("i",INT)},
  {("i",BOOL), ("j",INT)}
]
Context-Free Grammar Part 1
<prog> ::= <block>
<block> ::= begin <declist> ; <stmtlist> end
<declist> ::= <decl> | <decl> ; <declist>
<decl> ::= int id | bool id
<stmtlist> ::= <stmt> | <stmt> ; <stmtlist>
<stmt> ::= <assign> | <block> |
           if <boolexp> then <stmtlist> else <stmtlist>
Context-Free Grammar Part 2
<assign> ::= id := <boolexp> |
             id := <intexp>
<boolexp> ::= true | false | id
<intexp> ::= const | id |
             <intexp> + <intexp>

The <intexp> grammar is ambiguous, but it doesn't matter because we have already done the parsing; using an ambiguous context-free grammar here makes the corresponding attribute grammar much simpler
Attribute Grammar Top Level
<prog> ::= <block>
    <block>.tbl-stack := emptystack
<block> ::= begin <declist> ; <stmtlist> end
    <stmtlist>.tbl-stack := push(<declist>.tbl, <block>.tbl-stack)      (push creates a fresh stack)
<stmtlist>1 ::= <stmt>
    <stmt>.tbl-stack := <stmtlist>1.tbl-stack
  | <stmt> ; <stmtlist>2
    <stmt>.tbl-stack := <stmtlist>1.tbl-stack
    <stmtlist>2.tbl-stack := <stmtlist>1.tbl-stack
Declarations: Synthesized Attribute tbl
<declist>1 ::= <decl>
    <declist>1.tbl := <decl>.tbl
  | <decl> ; <declist>2
    <declist>1.tbl := <decl>.tbl ∪ <declist>2.tbl
    Cond: ids(<decl>.tbl) ∩ ids(<declist>2.tbl) = ∅

ids is a helper function that returns the set of all names in a table; used to check for duplicates: e.g. int x; int x; in the same <declist>
Declarations
<decl> ::= int id
    <decl>.tbl := { (id.lexval, INT) }      (table with one pair)
  | bool id
    <decl>.tbl := { (id.lexval, BOOL) }     (table with one pair)

lexval is a built-in attribute for terminal symbols
Shows the value of the token from the lexical analysis stage of a compiler/interpreter (in this case, the string name of the variable)
Statements
<stmt> ::= <assign>
<assign>.tbl-stack := <stmt>.tbl-stack
| <block>
<block>.tbl-stack:= <stmt>.tbl-stack
| if <boolexp> then <stmtlist>1 else <stmtlist>2
<boolexp>.tbl-stack := <stmt>.tbl-stack
<stmtlist>1.tbl-stack := <stmt>.tbl-stack
<stmtlist>2.tbl-stack := <stmt>.tbl-stack

Statements
<assign> ::= id := <intexp>
    <intexp>.tbl-stack := <assign>.tbl-stack
    Cond: typeOf(id.lexval,<assign>.tbl-stack) = INT
  | id := <boolexp>
    <boolexp>.tbl-stack := <assign>.tbl-stack
    Cond: typeOf(id.lexval,<assign>.tbl-stack) = BOOL

typeOf: returns the innermost type on the stack
Expressions
<intexp>1 ::= const
| id
Cond: typeOf(id.lexval,<intexp>1.tbl-stack) = INT
| <intexp>2 + <intexp>3
<intexp>2.tbl-stack := <intexp>1.tbl-stack
<intexp>3.tbl-stack := <intexp>1.tbl-stack
<boolexp> ::= true
| false
| id
Cond: typeOf(id.lexval,<boolexp>.tbl-stack) = BOOL
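
The type-checking rules can be sketched as a small recursive checker. This is a simplification under stated assumptions, not the attribute grammar itself: programs are nested Python tuples, every function name here is hypothetical, and instead of separate <intexp> and <boolexp> non-terminals the sketch computes a type for each expression and compares it with the declared type.

```python
INT, BOOL = "INT", "BOOL"

def type_of(name, tbl_stack):
    """typeOf: innermost declared type; tbl_stack[0] is the top of the stack."""
    for tbl in tbl_stack:
        if name in tbl:
            return tbl[name]
    raise TypeError(f"undeclared identifier {name!r}")

def build_tbl(decls):
    """Synthesized attribute tbl of <declist>, with the no-duplicates Cond."""
    tbl = {}
    for typ, name in decls:
        if name in tbl:
            raise TypeError(f"duplicate declaration of {name!r}")
        tbl[name] = typ
    return tbl

def expr_type(e, tbl_stack):
    if isinstance(e, bool):                      # true | false (test before int)
        return BOOL
    if isinstance(e, int):                       # const
        return INT
    if isinstance(e, str):                       # id
        return type_of(e, tbl_stack)
    _, left, right = e                           # ("+", e1, e2)
    if expr_type(left, tbl_stack) != INT or expr_type(right, tbl_stack) != INT:
        raise TypeError("operands of + must be INT")
    return INT

def check_block(block, tbl_stack=()):
    decls, stmts = block
    # push(<declist>.tbl, <block>.tbl-stack): a fresh stack, old one untouched
    check_stmts(stmts, (build_tbl(decls),) + tuple(tbl_stack))

def check_stmts(stmts, tbl_stack):
    for stmt in stmts:
        if stmt[0] == "assign":
            _, name, e = stmt
            if type_of(name, tbl_stack) != expr_type(e, tbl_stack):
                raise TypeError(f"type mismatch in assignment to {name!r}")
        elif stmt[0] == "block":
            check_block(stmt[1], tbl_stack)
        elif stmt[0] == "if":
            _, cond, then_stmts, else_stmts = stmt
            if expr_type(cond, tbl_stack) != BOOL:
                raise TypeError("if condition must be BOOL")
            check_stmts(then_stmts, tbl_stack)
            check_stmts(else_stmts, tbl_stack)
        else:
            raise ValueError(f"unknown statement {stmt!r}")

# The nested-block program from the "Type Checking" slide
inner = ([(INT, "x"), (INT, "i")], [("assign", "x", ("+", "i", "j"))])
outer = ([(BOOL, "i"), (INT, "j")], [("block", inner)])
check_block(outer)    # passes: the innermost i is INT
```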

Allow Function Declarations and Calls
<decl> ::= int id | bool id |
fun id ( <formalslist> ) = <block>

<intexp> ::= const
           | id
           | <intexp> + <intexp>
           | id ( <actualslist> )

Check parameter passing and function bodies?
Details of <formalslist> and <actualslist> not shown
For simplicity, assume that functions return integer values
Declarations
<decl> ::= int id
<decl>.tbl := { (id.lexval, INT) }
| bool id
<decl>.tbl := { (id.lexval, BOOL) }
| fun id ( <formalslist> ) = <block>
<decl>.tbl := { (id.lexval, FUN(<formalslist>.types)) }
<formalslist>.types is a synthesized attribute: list of INT/BOOL; details not shown

Need function types: e.g. FUN(INT,INT,BOOL)
More Info Needed to Check Function Bodies
Declarations: add inherited attribute tbl-stack
<block> ::= begin <declist> ; <stmtlist> end
<stmtlist>.tbl-stack :=
push(<declist>.tbl, <block>.tbl-stack)
<declist>.tbl-stack :=
push(<declist>.tbl, <block>.tbl-stack)

We first gather all info for all declarations appearing in <declist>, and then we use them to check the blocks (i.e., function bodies) embedded in <declist>
Declarations
<declist>1 ::= <decl>
<declist>1.tbl := <decl>.tbl
<decl>.tbl-stack := <declist>1.tbl-stack
| <decl> ; <declist>2
<declist>1.tbl := <decl>.tbl ∪ <declist>2.tbl
<decl>.tbl-stack := <declist>1.tbl-stack
<declist>2.tbl-stack := <declist>1.tbl-stack
Cond: ids(<decl>.tbl) ∩ ids(<declist>2.tbl) = ∅
Declarations Revisited
<decl> ::= int id
<decl>.tbl := { (id.lexval, INT) }
| bool id
<decl>.tbl := { (id.lexval, BOOL) }
| fun id ( <formalslist> ) = <block>
<decl>.tbl := { (id.lexval, FUN(<formalslist>.types)) }
<block>.tbl-stack := push(<formalslist>.tbl,<decl>.tbl-stack)
<formalslist>.tbl is a set of (id, type); details not shown; same as <declist>.tbl
Function Call
<intexp> ::= id ( <actualslist> )
Cond: typeOf(id.lexval,<intexp>.tbl-stack) = FUN(...)
<actualslist>.expectedTypes := paramTypes(typeOf(id.lexval,<intexp>.tbl-stack))
<actualslist>.tbl-stack := <intexp>.tbl-stack

<actualslist> is a list of <intexp> and <boolexp>; expectedTypes is a list of INT/BOOL
expectedTypes is compared against the actual parameters; details not shown; will be developed in a homework
Type-Checking Function Bodies
Is the following code legal?
begin
fun f (int i) =
begin
... g(5) ...
end;
fun g (int j) =
begin
... f(8)
end;
end
Related question: is recursion legal?

Long Example 2: Code Generation
Given: parse tree for a simple program (after type
checking)
Goal: translate to assembly code
The evaluation rules of the attribute grammar
generate the assembly code

Note: in a real compiler, the parse tree (or AST) will be translated to a machine-independent simplified representation (e.g., three-address code) which is then optimized and translated to machine-specific assembly code. Details in CSE 5343.
Simple Imperative Language (IMP)
<c>1 ::= skip | <assign> | <c>2 ; <c>3
| if <be> then <c>2 else <c>3
| while <be> do <c>2
<assign> ::= id := <ae>
<ae>1 ::= id | const | <ae>2 + <ae>3
| <ae>2 - <ae>3 | <ae>2 * <ae>3
<be>1 ::= true | false
| <ae>1 = <ae>2 | <ae>1 < <ae>2
| ¬<be>2 | <be>2 ∧ <be>3
| <be>2 ∨ <be>3
Artificial Assembly Language
Processor with a single register (accumulator)
LOAD x: copy the value of memory location
(variable) x into the accumulator
LOAD const: set the value of the accumulator to an
integer constant
STO x: write accumulator to memory location
(variable) x

Assembly Language
ADD x: add the value of memory location x to the
content of the accumulator
The result stays in the accumulator
ADD const: add constant to accumulator
BR L: branch to label L (i.e., this is a goto)
BZ L: if accumulator is zero, branch to label L
L: NOP: label L, associated with a no-operation
instruction NOP

Code Generation Strategy
Synthesized attribute code contains a sequence
of instructions: concatenation of subsequences
from its children, plus new instructions

<ae>1 ::= <ae>2 + <ae>3


<ae>1.code :=
    <ae>2.code        (leaves its result in the accumulator)
    "STO T1"          (temporary variable T1)
    <ae>3.code
    "ADD T1"
Code Generation Strategy
<c>1 ::= if <be> then <c>2 else <c>3
<c>1.code :=
<be>.code
"BZ L1"
<c>2.code
"BR L2"
"L1: NOP"
<c>3.code
"L2: NOP"

Problems
T1 cannot be used in <ae>3.code
Need to generate new temporary names
Labels L1 and L2 cannot be used in <c>2.code,
<c>3.code, or elsewhere
Need to generate label names
Keep counter for temporary names
Inherited attribute temp (integer)
Keep counters for label names
Inherited attribute labin, synthesized attribute labout
(integers)

Code Generation for Assignments
<assign> ::= id := <ae>
<ae>.temp := 1
<assign>.code :=
<ae>.code
"STO" id.lexval

For recursive languages, we can have multiple run-time instances of id.lexval; need to manage them through a call stack, and to issue STO for the latest instance on the stack
Code Generation for Expressions
<ae>1 ::= const
<ae>1.code := "LOAD" const.lexval
| id
<ae>1.code := "LOAD" id.lexval
| <ae>2 + <ae>3
<ae>2.temp := <ae>1.temp
<ae>3.temp := <ae>1.temp + 1
<ae>1.code :=
<ae>2.code
"STO T" <ae>1.temp
<ae>3.code
"ADD T" <ae>1.temp
Examples
x := y+z
    LOAD y
    STO T1
    LOAD z
    ADD T1
    STO x
  After compiler optimizations:
    LOAD y
    ADD z
    STO x

x := a*b + c*d
    LOAD a
    STO T1
    LOAD b
    MUL T1
    STO T1
    LOAD c
    STO T2
    LOAD d
    MUL T2
    ADD T1
    STO x

x := a*b*c + d*e*f + p*q*r
    ?
    STO x
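
The expression rules with the inherited temp counter can be sketched as a recursive generator. The names `gen_ae`, `gen_assign`, and `OPCODES` are hypothetical, expressions are nested tuples instead of parse trees, and MUL is assumed to behave like ADD (it appears in the examples but is not defined in the instruction list).

```python
OPCODES = {"+": "ADD", "*": "MUL"}   # MUL is assumed to mirror ADD

def gen_ae(ae, temp=1):
    """Synthesized attribute code for an arithmetic expression.
    ae is an int (const), a str (id), or a tuple (op, ae2, ae3).
    temp is the inherited attribute: index of the next free temporary."""
    if isinstance(ae, (int, str)):
        return [f"LOAD {ae}"]                    # const.lexval or id.lexval
    op, left, right = ae
    if op not in OPCODES:
        raise ValueError(f"unsupported operator {op!r}")
    return (gen_ae(left, temp)                   # <ae>2.temp := <ae>1.temp
            + [f"STO T{temp}"]
            + gen_ae(right, temp + 1)            # <ae>3.temp := <ae>1.temp + 1
            + [f"{OPCODES[op]} T{temp}"])

def gen_assign(name, ae):
    """<assign> ::= id := <ae>, with <ae>.temp := 1."""
    return gen_ae(ae, 1) + [f"STO {name}"]

print("\n".join(gen_assign("x", ("+", ("*", "a", "b"), ("*", "c", "d")))))
```

Running it on x := a*b + c*d reproduces the instruction sequence listed in the second example; the same function can be used to explore the third one.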
Additional Questions
Q1: Can we exchange the order of operands: e.g.,
<ae>3.code
"STO T" <ae>1.temp
<ae>2.code
"ADD T" <ae>1.temp
Q2: Is such exchange allowed for all operators?
Q3: Changes to <ae>2.temp and <ae>3.temp?
Q4: How do we minimize the total number of
temporary variables?

Code Generation for Statements
<prog> ::= <c>
<prog>.code := <c>.code
<c>.labin := 1
<c>1 ::= <c>2 ; <c>3
<c>1.code :=
<c>2.code
<c>3.code
<c>2.labin := <c>1.labin
<c>3.labin := <c>2.labout
<c>1.labout := <c>3.labout
Code Generation for Statements
<c> ::= skip
<c>.code := "NOP"
<c>.labout := <c>.labin
| <assign>
<c>.code := <assign>.code
<c>.labout := <c>.labin

Code Generation for Statements
<c>1 ::= if <be> then <c>2 else <c>3
<c>2.labin := <c>1.labin + 2
<c>3.labin := <c>2.labout
<c>1.labout := <c>3.labout
<c>1.code := <be>.code
"BZ L" <c>1.labin
<c>2.code
"BR L" (<c>1.labin+1)
"L" <c>1.labin ": NOP"
<c>3.code
"L" (<c>1.labin+1) ": NOP"
Code Generation for Statements
<c>1 ::= while <be> do <c>2
<c>2.labin := <c>1.labin + 2
<c>1.labout := <c>2.labout
<c>1.code :=
"L" <c>1.labin ": NOP"
<be>.code
"BZ L" (<c>1.labin+1)
<c>2.code
"BR L" <c>1.labin
"L" (<c>1.labin+1) ": NOP"

Example for Code Generation
Source program
if x = 42 then
    if a = b then
        y := 1
    else
        y := 2
else
    y := 3
Example for Code Generation
Generated code for outer if statement
# code for x=42 comparison
BZ L1
# code for inner if
BR L2
L1: NOP
# code for y := 3
L2: NOP

Inner If Statement
# code for x=42 comparison
BZ L1
# code for a=b comparison
BZ L3
# code for y := 1
BR L4
L3: NOP
# code for y := 2
L4: NOP
BR L2
L1: NOP
# code for y := 3
L2: NOP

Code for Assignments
# code for x=42 comparison
BZ L1
# code for a=b comparison
BZ L3
LOAD 1
STO y
BR L4
L3: NOP
LOAD 2
STO y
L4: NOP
BR L2
L1: NOP
LOAD 3
STO y
L2: NOP
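
The label-threading rules (labin inherited, labout synthesized) can be sketched as a function that returns both the code and the next free label number. Everything here is illustrative: `gen_c` is a hypothetical name, commands are nested tuples, assignments are limited to a constant or identifier on the right-hand side, and conditions are opaque strings emitted as comment placeholders because the slides do not give code for <be>.

```python
def gen_c(c, labin):
    """Return (code, labout) for a command, given the inherited counter labin.
    Commands: ("skip",), ("assign", id, const_or_id), ("seq", c2, c3),
    ("if", be, c2, c3), ("while", be, c2)."""
    kind = c[0]
    if kind == "skip":
        return ["NOP"], labin                    # labout := labin
    if kind == "assign":
        _, name, value = c
        return [f"LOAD {value}", f"STO {name}"], labin
    if kind == "seq":
        code2, lab2 = gen_c(c[1], labin)         # <c>2.labin := <c>1.labin
        code3, lab3 = gen_c(c[2], lab2)          # <c>3.labin := <c>2.labout
        return code2 + code3, lab3
    if kind == "if":
        _, be, c2, c3 = c
        code2, lab2 = gen_c(c2, labin + 2)       # labels labin and labin+1 are taken
        code3, lab3 = gen_c(c3, lab2)
        code = ([f"# code for {be}", f"BZ L{labin}"] + code2
                + [f"BR L{labin + 1}", f"L{labin}: NOP"] + code3
                + [f"L{labin + 1}: NOP"])
        return code, lab3
    if kind == "while":
        _, be, c2 = c
        code2, lab2 = gen_c(c2, labin + 2)
        code = ([f"L{labin}: NOP", f"# code for {be}", f"BZ L{labin + 1}"]
                + code2 + [f"BR L{labin}", f"L{labin + 1}: NOP"])
        return code, lab2
    raise ValueError(f"unknown command {c!r}")

program = ("if", "x=42 comparison",
           ("if", "a=b comparison", ("assign", "y", 1), ("assign", "y", 2)),
           ("assign", "y", 3))
code, labout = gen_c(program, 1)                 # <c>.labin := 1
print("\n".join(code))
```

For the nested-if source program, the printed listing matches the one above line for line, with L1/L2 used by the outer statement and L3/L4 by the inner one.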

Summary: Attribute Grammars
Useful for expressing arbitrary cycle-free traversals
over context-free parse trees
Synthesized and inherited attributes
Conditions to reject invalid parse trees
Evaluation order depends on attribute dependencies
Uses: type checking and code generation
Basic data structures (sets, maps, etc.) can be used
The evaluation rules can call helper functions
but the functions cannot have global effects
(side effects)
