Introduction to Compilers
Source Program (input) -> COMPILER -> Target Program (output)
Analysis Part
(Source Program to Intermediate Code)
Synthesis Part
(Intermediate Code to Target Code)
PHASES OF COMPILER
Source Program
Analysis Phase:
    Lexical Analyzer
    Syntax Analyzer
    Semantic Analyzer
    Intermediate Code Generator
Synthesis Phase:
    Code Optimizer
    Code Generator
All phases interact with:
    Symbol table management
    Error detection and handling
1. Lexical Analysis
(Scanning or Tokenization)
2. Syntax Analysis
(Parsing)
[Figure: parsing the source string]
Intermediate code generation
t1 := int_to_float(10)
t2 := rate * t1
t3 := count + t2
total := t3
Code Optimization
Improves the intermediate code so that the generated target code executes faster.
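As a rough illustration of what the optimizer might do with the intermediate code above, the Python sketch below folds the int_to_float conversion of the literal 10 into the constant 10.0 and removes the final copy through t3. The function name optimize and the list-of-strings representation are assumptions made for this example, not part of the slides, and the copy removal is only safe when the temporary is not used anywhere else.

```python
import re

def optimize(code):
    """Fold int_to_float on integer literals, then drop a trailing copy.

    Hypothetical helper: 'code' is a list of three-address statements
    written as "target := expression" with space-separated operands.
    """
    consts = {}  # temporaries known to hold a constant value
    out = []
    for line in code:
        target, expr = [part.strip() for part in line.split(":=")]
        match = re.fullmatch(r"int_to_float\s*\((\d+)\)", expr)
        if match:
            # Do the conversion at compile time: "10" becomes "10.0".
            consts[target] = str(float(match.group(1)))
            continue
        # Substitute known constants for the temporaries that hold them.
        operands = [consts.get(tok, tok) for tok in expr.split()]
        out.append((target, " ".join(operands)))
    # Copy elimination: "t := e" followed by "x := t" becomes "x := e".
    # Assumes t is not used anywhere else.
    if len(out) >= 2 and out[-1][1] == out[-2][0]:
        out[-2:] = [(out[-1][0], out[-2][1])]
    return [f"{target} := {expr}" for target, expr in out]

code = [
    "t1 := int_to_float(10)",
    "t2 := rate * t1",
    "t3 := count + t2",
    "total := t3",
]
optimized = optimize(code)
for statement in optimized:
    print(statement)
```

Under these assumptions the four statements shrink to two: t2 := rate * 10.0 and total := count + t2.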
Code Generation
Intermediate code is translated into a sequence of machine instructions:
MOV rate, R1
MUL #10.0, R1
MOV count, R2
ADD R2, R1
MOV R1, total
Input String -> Lexical Analyzer -> tokens -> Parser -> Syntax tree
1. It produces a stream of tokens.
2. It eliminates blanks and comments.
3. It generates the symbol table.
4. It keeps track of line numbers.
5. It reports errors encountered while generating the tokens.
Tokens, Patterns, and Lexemes:
Tokens: Categories of input strings, e.g. identifiers, keywords, constants.
Patterns: Sets of rules that describe the tokens, written as regular expressions (RE).
Lexemes: Sequences of characters in the source program that match the pattern of a token, e.g. int, i, num, ans, choice.
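The relationship between tokens, patterns, and lexemes can be sketched with a small regular-expression scanner in Python. The token names, the keyword set, and the comment syntax below are assumptions chosen for illustration. Each entry pairs a token category with its pattern, and every match yields a lexeme. The sketch also skips blanks and comments, tracks line numbers, and reports unexpected characters. It does not build a symbol table.

```python
import re

KEYWORDS = {"int", "float", "if", "else", "while"}  # assumed keyword set

# (token category, pattern) pairs; earlier entries take priority.
TOKEN_SPEC = [
    ("COMMENT",    r"/\*.*?\*/|//[^\n]*"),
    ("NEWLINE",    r"\n"),
    ("SKIP",       r"[ \t]+"),
    ("CONSTANT",   r"\d+(?:\.\d+)?"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"!=|==|<=|>=|[-+*/=<>]"),
    ("PUNCT",      r"[;,(){}]"),
    ("MISMATCH",   r"."),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def tokenize(source):
    """Return (tokens, errors); each token is (type, lexeme, line)."""
    tokens, errors, line = [], [], 1
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1
        elif kind == "COMMENT":
            line += lexeme.count("\n")  # block comments may span lines
        elif kind == "SKIP":
            pass                        # blanks are discarded
        elif kind == "MISMATCH":
            errors.append(f"line {line}: unexpected character {lexeme!r}")
        else:
            if kind == "IDENTIFIER" and lexeme in KEYWORDS:
                kind = "KEYWORD"
            tokens.append((kind, lexeme, line))
    return tokens, errors

tokens, errors = tokenize("int i, j; // counters\ni = i + 1;")
for token in tokens:
    print(token)
```

Here the lexeme int matches the identifier pattern but is reclassified as a keyword, while i and j remain identifiers and 1 is a constant.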
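Input Buffering: the input string is stored in a buffer and scanned by the lexical analyzer (LA).
Buffer contents: int i, j; i = i + 1;
bp (begin pointer) marks the start of the current lexeme.
fp (forward pointer) moves ahead to find the end of the lexeme.
Together, bp and fp keep track of the portion of the input scanned.
Two schemes of buffering:
1. One-buffer scheme
2. Two-buffer scheme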
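The way bp and fp move over the buffer can be sketched as follows. This is a simplified Python model of the one-buffer scheme that holds the whole input in a single string. The function name scan_lexemes and the rule that any non-alphanumeric character forms a one-character lexeme are assumptions for this example.

```python
def scan_lexemes(buffer):
    """Split the buffer into lexemes using a begin and a forward pointer."""
    lexemes = []
    bp = 0                      # begin pointer: start of the current lexeme
    n = len(buffer)
    while bp < n:
        if buffer[bp].isspace():
            bp += 1             # skip blanks between lexemes
            continue
        fp = bp                 # forward pointer starts at the begin pointer
        if buffer[fp].isalnum() or buffer[fp] == "_":
            # Advance fp to the end of the identifier, keyword, or number.
            while fp < n and (buffer[fp].isalnum() or buffer[fp] == "_"):
                fp += 1
        else:
            fp += 1             # single-character operator or punctuation
        lexemes.append(buffer[bp:fp])
        bp = fp                 # the next lexeme begins where this one ended
    return lexemes

lexemes = scan_lexemes("int i, j; i = i + 1;")
print(lexemes)
```

Each lexeme is the slice of the buffer between bp and fp. A two-buffer scheme would apply the same pointer logic across two halves that are reloaded alternately, which this sketch does not model.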
Recognition of Tokens :
A token is represented by a pair: (token type, token value)
[Table: sample tokens listed by Type and Value. Types shown: identifier, constant. Values shown include a, i, !=, (, ), +, =, -]