CS1352 – Principles of Compiler Design Lect/CSE,APCE
Unit – I
Compiler :
A compiler is a program that reads a program written in one language (the source
language, such as C or C++) and translates it into an equivalent program in another
language (the target language, such as machine language). The compiler also reports to
its user the presence of errors in the source program.
Error Message
Classification of Compiler :
Software Tools :
Many software tools that manipulate source programs first perform some kind of
analysis. Some examples of such tools include:
Structure Editors :
A structure editor takes as input a sequence of commands to build a
source program.
The structure editor not only performs the text-creation and
modification functions of an ordinary text editor, but it also analyzes
the program text, putting an appropriate hierarchical structure on the
source program.
Example: supplying the matching do for a while, or finding the end that matches a begin.
Pretty printers :
A pretty printer analyzes a program and prints it in such a way that the
structure of the program becomes clearly visible.
Static Checkers :
A static checker reads a program, analyzes it, and attempts to discover
potential bugs without running the program.
Anna University – B.E -VI Sem CSE D. Jagadeesan, M.Tech., MISTE.,
Interpreters :
An interpreter executes a program written in a high-level language (BASIC,
etc.) directly, performing the operations the source program implies,
rather than first translating it into assembly or machine language.
Interpreters are frequently used to execute command languages, since
each operator executed in a command language is usually an
invocation of a complex routine such as an editor or compiler.
The analysis portion in each of the following examples is similar to
that of a conventional compiler.
Text formatters.
Silicon Compiler.
Query interpreters.
Parse tree for position := initial + rate * 60 (children are indented under their parent):
assignment
    identifier: position
    :=
    expression
        expression
            identifier: initial
        +
        expression
            expression
                identifier: rate
            *
            expression
                number: 60
Semantic analysis :
This phase checks the source program for semantic errors and
gathers type information for the subsequent code generation phase.
An important component of semantic analysis is type checking.
Example : int to real conversion.
Subtree for rate * 60 after type checking (children are indented under their parent):
expression
    expression
        identifier: rate
    *
    expression
        inttoreal
            number: 60
Phases of Compiler:
A compiler operates in phases, each of which transforms the source program from
one representation to another.
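Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer →
Intermediate Code Generator → Code Optimizer → Code Generator → Target Program
(Symbol table management and the error handler interact with all of these phases.)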
Syntax analysis:
      :=
     /  \
  id1    +
        / \
     id2   *
          / \
       id3   60
Semantic analysis :
This phase checks the source program for semantic errors and
gathers type information for the subsequent code generation phase.
An important component of semantic analysis is type checking.
      :=
     /  \
  id1    +
        / \
     id2   *
          / \
       id3   inttoreal
                |
                60
Code Optimization:
Error Handler:
The syntax analysis phase can detect errors where the token
stream violates the structure rules of the language.
The code optimizer, while doing control flow analysis, may detect that
certain statements can never be reached.
Write down the output of each phase for the expression position := initial + rate * 60
Source Program
position := initial + rate * 60
Lexical Analyzer
id1 := id2 + id3 * 60
Syntax Analyzer
      :=
     /  \
  id1    +
        / \
     id2   *
          / \
       id3   60
Semantic Analyzer
      :=
     /  \
  id1    +
        / \
     id2   *
          / \
       id3   inttoreal
                |
                60
(Symbol Table Management and the Error Handler interact with every phase.)
Intermediate Code Generator
Code Optimizer
Code Generator
Target Program
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
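As a rough illustration of the intermediate code generation step for this example, the sketch below walks the tree produced by semantic analysis and emits three-address statements. The tuple encoding of the tree, the function name gen_three_address, and the temporary names temp1, temp2, ... are assumptions made for the example.

```python
from itertools import count

def gen_three_address(tree):
    """Walk a tuple-encoded tree and return a list of three-address statements.

    Hypothetical encoding: a leaf is an identifier name or a constant,
    ("inttoreal", operand) is a conversion, (op, left, right) is a binary
    operation, and the root is (":=", target, expression).
    """
    code = []
    temps = count(1)

    def new_temp():
        return f"temp{next(temps)}"

    def walk(node):
        if isinstance(node, (str, int, float)):   # leaf: use its text directly
            return str(node)
        if node[0] == "inttoreal":
            operand = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    _, target, expression = tree
    code.append(f"{target} := {walk(expression)}")
    return code

tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", 60))))
for statement in gen_three_address(tree):
    print(statement)
# temp1 := inttoreal(60)
# temp2 := id3 * temp1
# temp3 := id2 + temp2
# id1 := temp3
```

A code optimizer could then fold the conversion into the constant 60.0 and drop the extra temporaries, which is consistent with the target program above using #60.0 directly.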
Preprocessor → Source Program → Compiler → Target Assembly Program →
Assembler → Relocatable Machine Code → Loader/Link-editor
(The loader/link-editor also takes the library and relocatable object files as input.)
Compiler :
It converts the source program, written in a high-level language (HLL), into a target
program in a low-level language (LLL).
Assembler :
It converts an assembly language (LLL) program into machine code.
Grouping of Compiler :
A symbol table is a data structure containing a record for each identifier,
with fields for the attributes of the identifier.
When an identifier in the source program is detected by the lexical analyzer, the
identifier is entered into the symbol table.
Parser Generators:
These produce syntax analyzers, normally from input that is based on a
context-free grammar (CFG). In early compilers, syntax analysis consumed not only a
large fraction of the running time of a compiler, but also a large fraction of the
intellectual effort of writing a compiler.
Eg: PIC, EQM
Scanner Generator:
These automatically generate lexical analyzers, normally from a
specification based on regular expressions. The basic organization of the resulting
lexical analyzer is in effect a finite automaton.
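To make the finite automaton idea concrete, here is a minimal table-driven sketch. The state names, the two character classes, and the function classify are assumptions chosen for illustration; a generated scanner would build a much larger table from the regular expressions in its specification.

```python
# Transition table of a small deterministic finite automaton (DFA).
# Keys are (state, character class); a missing key means "no transition".
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
    ("start", "digit"): "in_num",
    ("in_num", "digit"): "in_num",
}
# Accepting states and the token each one recognizes.
ACCEPTING = {"in_id": "id", "in_num": "num"}

def char_class(ch):
    """Map a character to the class used in the transition table."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

def classify(lexeme):
    """Run the DFA over the whole lexeme; return its token name or None."""
    state = "start"
    for ch in lexeme:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:          # no transition: the lexeme is rejected
            return None
    return ACCEPTING.get(state)    # None if we stopped in a non-accepting state

print(classify("rate"))   # id
print(classify("60"))     # num
print(classify("6x"))     # None
```

Keeping the automaton as data (a table) rather than as nested if statements is what lets a tool generate it mechanically from a specification.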
Dataflow Engines:
Much of the information needed to perform good code optimization
involves “dataflow analysis”, the gathering of information about how values are
transmitted from one part of a program to every other part.
[Figure: the source program feeds the lexical analyzer, which passes tokens to the parser; the parser requests each one with “get next token”; both use symbol table management.]
The main task of the lexical analyzer is to read the input characters and produce as
output a sequence of tokens that the parser uses for syntax analysis.
On receiving a “get next token” command from the parser, the lexical analyzer
reads input characters until it can identify the next token.
FUNCTIONS:
1. It produces the stream of tokens.
2. It eliminates blanks and comments.
3. It generates the symbol table, which stores information about the identifiers and
constants encountered in the input.
4. It keeps track of line numbers.
5. It reports errors encountered while recognizing the tokens.
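The five functions listed above can be seen together in a small regular-expression-based scanner. This is a sketch under stated assumptions: the token names, the C-style comment syntax, and the names TOKEN_SPEC and tokenize are invented for the example, and only identifiers are entered in the symbol table here.

```python
import re

# Assumed token specification; order matters (comments before the "/" operator).
TOKEN_SPEC = [
    ("comment", r"/\*.*?\*/"),      # assumed C-style comments
    ("newline", r"\n"),
    ("blank",   r"[ \t]+"),
    ("num",     r"\d+"),
    ("id",      r"[A-Za-z_]\w*"),
    ("assign",  r":="),
    ("op",      r"[+\-*/]"),
    ("error",   r"."),              # any other single character
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def tokenize(source):
    """Return (tokens, symbol_table, errors) for the given source text."""
    symbol_table = {}               # identifier lexeme -> entry number (1-based)
    tokens, errors = [], []
    line = 1
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "newline":
            line += 1                               # function 4: line numbers
        elif kind == "comment":
            line += lexeme.count("\n")              # function 2: drop comments
        elif kind == "blank":
            continue                                # function 2: drop blanks
        elif kind == "error":
            errors.append(f"line {line}: unexpected character {lexeme!r}")  # function 5
        elif kind == "id":
            entry = symbol_table.setdefault(lexeme, len(symbol_table) + 1)  # function 3
            tokens.append(("id", entry))
        else:
            tokens.append((kind, lexeme))           # function 1: token stream
    return tokens, symbol_table, errors

tokens, table, errors = tokenize("position := initial + rate * 60")
print(tokens)
print(table)   # {'position': 1, 'initial': 2, 'rate': 3}
```

The identifiers come out as (id, 1), (id, 2) and (id, 3), which corresponds to the id1, id2, id3 notation used in the phase-by-phase example.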
There are several reasons for separating the analysis phase of compiling into
lexical analysis and parsing.
Simpler design.
Compiler efficiency is improved.
Compiler portability is enhanced.
TOKEN:
PATTERN:
A set of strings in the input for which the same token is produced as
output. This set of strings is described by a rule called a pattern associated with
the token.
LEXEME:
INPUT BUFFERING :
During the analysis, the scanner scans the input string from left to right, one
character at a time, to identify tokens. It uses two pointers for this analysis:
1. Begin pointer (bp), to keep track of the first character of each token.
2. Forward pointer (fp), to scan ahead until the end of the token is found.
f l o a t a , b ; a = A + 2 ;
fp
Steps in Scanning the Input:
1. Initially, both the begin pointer and the forward pointer point to the first
character of the lexeme.
2. The fp scans the buffer until a match with a token pattern is found.
3. Once the end of the lexeme is found (at a space or a delimiter), the fp marks the
right end of the lexeme.
bp
f l o a t a , b ; a = A + 2 ;
fp
4. After processing the lexeme, both pointers are set to point to the character
immediately after the lexeme.
bp
f l o a t a , b ; a = A + 2 ;
fp
5. This procedure is repeated for the entire source program.
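The five steps can be sketched as follows, using the same input string as the diagrams. The delimiter set and the function name scan are assumptions made for this example; a real scanner would decide where a lexeme ends by pattern matching rather than a fixed set of characters.

```python
DELIMITERS = set(" ,;=+")   # assumed delimiter set for this example

def scan(buffer):
    """Split buffer into lexemes using a begin pointer (bp) and a forward pointer (fp)."""
    lexemes = []
    bp = fp = 0                         # step 1: both pointers at the first character
    while bp < len(buffer):
        if buffer[bp] == " ":           # blanks between lexemes are skipped
            bp += 1
            fp = bp
            continue
        if buffer[bp] in DELIMITERS:    # a delimiter is a one-character lexeme
            fp = bp + 1
        else:
            # step 2: fp advances until a space or delimiter is reached
            while fp < len(buffer) and buffer[fp] not in DELIMITERS:
                fp += 1
        lexemes.append(buffer[bp:fp])   # step 3: fp marks the right end of the lexeme
        bp = fp                         # step 4: both pointers sit just past the lexeme
    return lexemes                      # step 5: the loop repeats to the end of input

print(scan("float a, b; a = A + 2;"))
# ['float', 'a', ',', 'b', ';', 'a', '=', 'A', '+', '2', ';']
```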
The first N characters of the input string are read into the buffer. When the fp
reaches the end of the buffer, the buffer is refilled with the next N
characters.
Drawbacks:
The problem with this implementation is that when a token is longer than
N characters, this scheme fails to produce the token.
f L o a t eof a , b ; a = a + 2 eof
fp
The first N characters are read into the first half of the buffer. If fewer
than N characters remain, a special character called EOF is inserted after
the last one to mark the end of the input.
When the pointer reaches the end of the first half, the second half is
loaded with the next N characters of the same program.
When the pointer reaches the end of the second half, the first
half is loaded with the next N characters of the input.
fp := fp + 1;
if character at fp = eof then
begin
    if fp is at the end of the first half then
    begin
        Load second half;
        fp := fp + 1;
    end
    else if fp is at the end of the second half then
    begin
        Load first half;
        Set fp to the first character of the first half;
    end
    else
        Terminate lexical analysis;
end
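A runnable sketch of this two-half scheme is given below. It makes several assumptions that are not in the notes: the class name TwoHalfBuffer, a NUL character standing in for eof, a sentinel slot after each half, and a deliberately tiny N so that reloads happen often. It tracks only the forward pointer and ignores the begin pointer.

```python
import io

EOF = "\0"   # sentinel character (assumption: it never occurs in source text)
N = 4        # tiny half size, chosen only so the example reloads often

class TwoHalfBuffer:
    def __init__(self, stream):
        self.stream = stream
        # Layout: [first half][sentinel][second half][sentinel]
        self.buf = [EOF] * (2 * N + 2)
        self._load(0)
        self.fp = -1                      # advance() moves fp before reading

    def _load(self, start):
        """Fill the half starting at index start; pad a short read with EOF."""
        chunk = self.stream.read(N)
        for i in range(N):
            self.buf[start + i] = chunk[i] if i < len(chunk) else EOF

    def advance(self):
        """Move fp one step; return the character there, or None at end of input."""
        self.fp += 1
        if self.buf[self.fp] == EOF:
            if self.fp == N:              # end of first half
                self._load(N + 1)         # load second half
                self.fp += 1
            elif self.fp == 2 * N + 1:    # end of second half
                self._load(0)             # load first half
                self.fp = 0
            else:
                return None               # eof inside a half: terminate
            if self.buf[self.fp] == EOF:  # the newly loaded half was empty
                return None
        return self.buf[self.fp]

def read_all(text):
    """Drive the buffer to the end of input and return everything it yielded."""
    buffer = TwoHalfBuffer(io.StringIO(text))
    return "".join(iter(buffer.advance, None))

print(read_all("float a,b;"))   # float a,b;
```

The benefit of the sentinel is that the common case needs only one comparison per character (is it eof?); the position checks run only when a sentinel is actually seen.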
1. Finite Automata
2. DFA
3. NFA
4. Regular Expression
5. Converting R.E into NFA
6. Converting NFA with ε-moves into NFA and DFA
7. Minimization of DFA.