JOHNPAUL C I
WHY COMPILER??
INTRODUCTION
TRANSLATION
SCANNING
COMPILER CONSTRUCTION
Johnpaul C I
Compiler Design (IS0426)
August 2015
Why Compiler....
Introduction
Translation Process
Scanning
Text Book:
Compiler Construction: Principles and Practice
by Kenneth C Louden, 1997 edition
Reference Books:
Compilers: Principles, Techniques, and Tools, by
Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman
Compiler Construction, by Niklaus Wirth
In the late 1940s, after John von Neumann's stored-program concept,
binary machine language was initially used to make the
machine perform actions.
It is not human readable and not at all
user-friendly.
Then assembly language was used (e.g. MOV AX,
BX).
It also has its drawbacks.
Lastly came high-level languages, which are
human friendly but not machine friendly.
Compilers originated to make them
machine friendly.
High-level languages: human-friendly languages
Like the languages human beings use to communicate,
high-level languages have their own
origin, grammar, and usage rules.
Noam Chomsky's research on grammars provides the
framework used to describe these high-level
languages.
He classified grammars, and the languages they
generate, into four basic groups, each recognised
by its own kind of machine:
- Type 0: REL, Recursively Enumerable Language (Turing machine)
- Type 1: CSL, Context-Sensitive Language (linear bounded automaton)
- Type 2: CFL, Context-Free Language (pushdown automaton, PDA)
- Type 3: RL, Regular Language (finite automaton)
History of Compiler Programs...
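Each of these languages has its own
associated grammar:
- Type 0: REL, generated by a REG (recursively enumerable, or unrestricted, grammar)
- Type 1: CSL, generated by a CSG (context-sensitive grammar)
- Type 2: CFL, generated by a CFG (context-free grammar)
- Type 3: RL, generated by an RG (regular grammar)
The language class of interest to us is Type 2
(context-free languages).
Since the syntax of most programming languages
falls in this group, CFGs will be
part of our exploration.
Regular expressions are one way of representing the
Type 3 (regular) languages, which describe the tokens
of a programming language.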
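To make the difference between Type 3 and Type 2 concrete, here is a minimal sketch. The identifier pattern and both function names are illustrative assumptions, not taken from the slides. A regular expression is enough to recognise an identifier, whereas nested parentheses need a stack, which is the extra memory a pushdown automaton has over a finite automaton.

```python
import re

# Type 3 (regular): a letter followed by letters or digits.
# This pattern is an assumed example of an identifier rule.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_identifier(text):
    """Return True if the whole string matches the identifier pattern."""
    return IDENTIFIER.fullmatch(text) is not None


def is_balanced(text):
    """Type 2 (context-free): check nested parentheses with a stack."""
    stack = []
    for ch in text:
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            if not stack:
                return False  # a ")" with no matching "("
            stack.pop()
    return not stack  # every "(" must have been closed


print(is_identifier("count1"))
print(is_identifier("1count"))
print(is_balanced("((a+b)*c)"))
print(is_balanced("(a+b))"))
```

No regular expression can check balanced nesting to arbitrary depth, which is why the syntax of a programming language is described with a CFG while its tokens are described with regular expressions.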
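Programs related to compilers:
Interpreters: Python, Lisp
Assemblers: as
Linkers: combine separately compiled object files with library code
Loaders: memory allocation etc.
Preprocessors: expand the source code as much
as possible before compilation
Debuggers: gdb
Profilers: report run-time behaviour such as memory usage
Project managers: manage and monitor
projects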
Scanner -> lexical analysis -> tokens
Parser -> parse tree creation -> syntax tree
Semantic analysis -> meaning of statements ->
annotated tree
Source code optimizer -> modifies the
source code -> optimized source code
Code generator -> generates assembly
language instructions -> assembly
language (target code)
Target code optimizer -> optimizes the
assembly language -> optimized target code
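The flow from one phase to the next can be sketched for a toy language of integers with + and *. This is only an illustration under stated assumptions: the function names, the tuple-based tree, and the PUSH/ADD/MUL stack-machine instructions are all hypothetical. Semantic analysis and both optimizers are left out, and there is no error handling.

```python
import re


def scan(source):
    """Scanner: characters -> tokens (integers and the symbols + * ( ))."""
    return re.findall(r"\d+|[+*()]", source)


def parse(tokens):
    """Parser: tokens -> syntax tree built from nested tuples."""
    pos = 0

    def factor():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            pos += 1  # skip the closing ")"
            return node
        return ("num", int(tok))

    def term():
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("+", node, term())
        return node

    return expr()


def generate(tree):
    """Code generator: syntax tree -> instructions for a made-up stack machine."""
    if tree[0] == "num":
        return [f"PUSH {tree[1]}"]
    op = "ADD" if tree[0] == "+" else "MUL"
    return generate(tree[1]) + generate(tree[2]) + [op]


tokens = scan("2 + 3 * 4")
tree = parse(tokens)
code = generate(tree)
print(tokens)
print(tree)
print(code)
```

Each function consumes exactly what the previous one produces, which is the point of the phase structure: the parser never sees characters, and the code generator never sees tokens.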
The basic mechanism of tokenisation is pattern
matching.
The most common token classifications are:
- Reserved words: IF, ELSE, WHILE, DO, ...
- Special symbols: PLUS, MINUS, MUL, DIV, ...
- Multiple strings: NUM, ID (each token stands for many different strings)
The string value of the IF token is "if"; similarly, that of THEN is
"then", and so on.
To avoid confusing a token's name with its string
value, the strings if, then, else, etc. are known
as lexemes.
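The three classifications can be sketched as a small lookup that maps a lexeme to its token name. The tables, the two patterns, and the function name `classify` are assumptions chosen for illustration. Reserved words and special symbols each have one fixed lexeme, while NUM and ID are recognised by pattern matching.

```python
import re

# One fixed lexeme per token.
RESERVED = {"if": "IF", "then": "THEN", "else": "ELSE", "while": "WHILE", "do": "DO"}
SYMBOLS = {"+": "PLUS", "-": "MINUS", "*": "MUL", "/": "DIV"}

# Many lexemes per token, described by a pattern.
NUM_PATTERN = re.compile(r"[0-9]+")
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def classify(lexeme):
    """Return the token name for a lexeme, checking reserved words before ID."""
    if lexeme in RESERVED:
        return RESERVED[lexeme]
    if lexeme in SYMBOLS:
        return SYMBOLS[lexeme]
    if NUM_PATTERN.fullmatch(lexeme):
        return "NUM"
    if ID_PATTERN.fullmatch(lexeme):
        return "ID"
    raise ValueError(f"unrecognised lexeme: {lexeme!r}")


for lexeme in ["if", "+", "5655", "count"]:
    print(lexeme, "->", classify(lexeme))
```

Reserved words are checked first because a lexeme such as if also matches the ID pattern.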
Any value associated with a token is called
an attribute.
For instance, suppose we type a number, say 5655.
That character sequence is the string "5655".
Apart from that string value, it also has a
numeric value of five thousand six hundred and
fifty-five.
If we type +, it has the character value "+" and
the operational meaning of addition.
These string values and numeric values are
known as attributes.
A token can have any number of attributes.
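One possible way to carry attributes alongside a token is a small record type. The `Token` class, its field names, and the helper `make_token` are hypothetical, a sketch rather than a prescribed layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    kind: str                     # token name, e.g. NUM or PLUS
    lexeme: str                   # string value attribute
    number: Optional[int] = None  # numeric value attribute, set only for NUM


def make_token(kind, lexeme):
    """Build a token, computing the numeric attribute when the token is a NUM."""
    number = int(lexeme) if kind == "NUM" else None
    return Token(kind, lexeme, number)


num = make_token("NUM", "5655")
plus = make_token("PLUS", "+")
print(num)
print(plus)
```

Here the NUM token carries both the string "5655" and the integer 5655, while the PLUS token carries only its string value.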
gettoken()....skip...gettoken()....gettoken()
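The gettoken() ... skip ... gettoken() sequence can be read as a scanner that is asked for one token at a time and skips the blanks in between. A minimal sketch, assuming a hypothetical `Scanner` class whose `get_token` method returns the next lexeme, or None at the end of the input:

```python
class Scanner:
    def __init__(self, source):
        self.source = source
        self.pos = 0  # index of the next unread character

    def get_token(self):
        """Skip white space, then return the next lexeme (None when input ends)."""
        # skip: blanks, tabs and newlines separate tokens but are not tokens
        while self.pos < len(self.source) and self.source[self.pos].isspace():
            self.pos += 1
        if self.pos == len(self.source):
            return None
        start = self.pos
        if self.source[start].isalnum():
            # a run of letters and digits forms one lexeme
            while self.pos < len(self.source) and self.source[self.pos].isalnum():
                self.pos += 1
        else:
            # any other character is treated as a one-character symbol
            self.pos += 1
        return self.source[start:self.pos]


scanner = Scanner("a  + 42")
lexemes = []
tok = scanner.get_token()
while tok is not None:
    lexemes.append(tok)
    tok = scanner.get_token()
print(lexemes)
```

The scanner keeps its own position in the input, so the caller only ever asks for the next token and never handles white space itself.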
Scanning...
Scanning gives rise to tokens.
It is made possible by pattern matching.
Regular expressions are used to specify the
patterns.
To give life to regular-expression-based
pattern matching, there must be a finite
automaton.
The finite automaton tells the compiler designer
where to use conditions and, for each input,
which state to go to next.
Finally, the designer converts the finite automaton
into a scanning program.
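The last step, turning a finite automaton into a scanning program, might look like the following table-driven sketch. The state names, the transition table, and the function `match_token` are assumptions made for illustration. The automaton accepts identifiers (a letter followed by letters or digits) and numbers (one or more digits).

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# (current state, input class) -> next state; a missing entry means "stop".
TRANSITIONS = {
    ("START", "letter"): "IN_ID",
    ("START", "digit"): "IN_NUM",
    ("IN_ID", "letter"): "IN_ID",
    ("IN_ID", "digit"): "IN_ID",
    ("IN_NUM", "digit"): "IN_NUM",
}

# Accepting states and the token each one produces.
ACCEPTING = {"IN_ID": "ID", "IN_NUM": "NUM"}


def match_token(text, start=0):
    """Run the automaton from `start`; return (token, lexeme) or None."""
    state = "START"
    pos = start
    while pos < len(text):
        next_state = TRANSITIONS.get((state, char_class(text[pos])))
        if next_state is None:
            break  # no transition on this input: the lexeme ends here
        state = next_state
        pos += 1
    if state in ACCEPTING:
        return ACCEPTING[state], text[start:pos]
    return None


print(match_token("x25+1"))
print(match_token("5655"))
print(match_token("+"))
```

Each entry in the table corresponds to one arrow of the automaton, so adding a new token class means adding states and arrows rather than rewriting the loop.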
THANK YOU.