
DEPARTMENT OF INFORMATION TECHNOLOGY

Indore – Dewas Bypass Road, Indore-452001


Approved By AICTE & Affiliated to RGPV Bhopal

Laboratory – Manual
For Academic Session Jan-June 2019

COMPILER DESIGN

Student Name : Shivam Panwar

Enrollment No. :- 0821IT161007

Batch :- 2016-2020

Department :- Information Technology

Year/Semester :- VI-semester(III-year)
INDEX

Serial No.  Topic                                                              Date of submission  Teacher's Remark

1           Study of LEX Tools
2           Study of YACC Tools
3           Develop a lexical analyzer to recognize a few patterns
4           Develop LL(1) parser
5           Develop an operator precedence parser
6           Develop a recursive descent parser
7           Write an algorithm to convert NFA to DFA
8           Write an algorithm to minimize a DFA
9           Write an algorithm to check whether a string belongs to a grammar
10          To study object-oriented Compiler

EXPERIMENT NO. 1

Aim :- To study the LEX Tools.

Lex is a tool used in the lexical analysis phase to recognize tokens described
by regular expressions. The Lex tool is itself a compiler: it translates a Lex
specification into a C program. Lex is designed to generate scanners, also
known as tokenizers, which recognize lexical patterns in text. The name is
short for "lexical analyzer generator." It is intended primarily for Unix-based
systems. The code for Lex was originally developed by Mike Lesk and Eric
Schmidt.

Lex can perform simple transformations by itself, but its main purpose is to
facilitate lexical analysis: the processing of character sequences such as
source code to produce symbol sequences called tokens, which are used as input
to other programs such as parsers. Lex can be used with a parser generator to
perform lexical analysis. It is easy, for example, to interface Lex with Yacc,
a program that generates parser code in the C programming language.

The original Lex is proprietary, but open source alternatives are available.
The best known is Flex, whose name stands for "fast lexical analyzer
generator".

Lexical analysis is the first phase of a compiler. It takes the modified source
code produced by the language preprocessors, written in the form of sentences.
The lexical analyzer breaks this text into a series of tokens, discarding any
whitespace and comments in the source code.
If the lexical analyzer finds an invalid token, it generates an error. The
lexical analyzer works closely with the syntax analyzer. It reads character
streams from the source code, checks for legal tokens, and passes the tokens to
the syntax analyzer on demand.
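
The behaviour described above can be sketched as a small hand-written scanner in C. This is only an illustration, not code produced by Lex: the names TokenKind, Token and next_token are assumptions made for this example, and the token classes are limited to identifiers, numbers and the single-character operators + - * / % =.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token classes used only in this sketch (hypothetical names). */
typedef enum { TOK_IDENT, TOK_NUMBER, TOK_OPERATOR, TOK_INVALID, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* first character of the lexeme */
    size_t length;      /* number of characters in the lexeme */
} Token;

/* Return the next token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor)
{
    const char *p = *cursor;
    Token tok;

    /* Discard whitespace and C-style block comments before the token. */
    for (;;) {
        while (isspace((unsigned char)*p))
            p++;
        if (p[0] == '/' && p[1] == '*') {
            p += 2;
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p != '\0')
                p += 2;          /* step over the closing marker */
            continue;
        }
        break;
    }

    tok.start = p;
    if (*p == '\0') {
        tok.kind = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
        tok.kind = TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        tok.kind = TOK_NUMBER;
    } else if (strchr("+-*/%=", *p) != NULL) {
        p++;
        tok.kind = TOK_OPERATOR;
    } else {
        p++;                     /* illegal character: reported as an error token */
        tok.kind = TOK_INVALID;
    }
    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

A parser would call next_token repeatedly, each time it needs another token, until it receives TOK_END. This mirrors the way the syntax analyzer demands tokens from the lexical analyzer.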

Structure of Lex Programs

A Lex program has the following form:

declarations

%%

translation rules

%%

auxiliary functions

Declarations: This section includes declarations of variables, constants and
regular definitions.

Translation rules: This section contains regular expressions and the code
segments (actions) attached to them.

Auxiliary functions: This section holds additional functions which are used in
the actions. These functions can be compiled separately and loaded with the
lexical analyzer.
EXPERIMENT NO. 2

Aim :- To study the YACC Tools.

Yacc (Yet Another Compiler-Compiler) is a computer program for the Unix
operating system developed by Stephen C. Johnson. It is a Look-Ahead LR (LALR)
parser generator: it generates the parser, the part of a compiler that tries to
make syntactic sense of the source code, from an analytic grammar written in a
notation similar to Backus–Naur Form (BNF). Yacc is supplied as a standard
utility on BSD and AT&T Unix. GNU-based Linux distributions include Bison, a
forward-compatible Yacc replacement.

Yacc is the standard parser generator for the Unix operating system. It
generates code for the parser in the C programming language. The name is
usually rendered in lowercase but is occasionally seen as YACC or Yacc. The
original version of yacc was written by Stephen Johnson at American Telephone
and Telegraph (AT&T). Versions of yacc have since been written for use with
Ada, Java and several other programming languages.

YACC is a specialized compiler that generates source code for part of the
compiler you are trying to construct. Specifically, YACC is a parser generator,
a software tool that automates a portion of compiler development, namely
building the parser. The source code it generates is an LALR parser.
YACC with lex tools

YACC – Yet Another Compiler-Compiler (Bison) is a parser generator for LALR(1)
grammars.
▫ Given a description of the grammar, it generates a C source for the parser.
▫ The input is a file that contains the grammar description in a formalism
  similar to the BNF (Backus-Naur Form) notation for language specification.
▫ Nonterminal symbols are lowercase identifiers: expr, stmt
▫ Terminal symbols are uppercase identifiers or single characters:
  INTEGER, FLOAT, IF, WHILE, ';', '.'
▫ Grammar rules (production rules): expr: expr '+' expr | expr '*' expr;
  which corresponds to E → E+E | E*E.
EXPERIMENT NO. 3

Aim :- To develop a lexical analyzer to recognize a few patterns.

#include<iostream>
#include<fstream>
#include<cstdio>
#include<cstdlib>
#include<cstring>
#include<cctype>

using namespace std;

// Returns 1 if buffer holds one of the 32 C keywords, 0 otherwise.
int isKeyword(const char buffer[]){
    const char keywords[32][10] = {"auto","break","case","char","const","continue","default",
                                   "do","double","else","enum","extern","float","for","goto",
                                   "if","int","long","register","return","short","signed",
                                   "sizeof","static","struct","switch","typedef","union",
                                   "unsigned","void","volatile","while"};
    for(int i = 0; i < 32; ++i){
        if(strcmp(keywords[i], buffer) == 0)
            return 1;
    }
    return 0;
}

// Prints the word collected in buffer (if any) and resets its length j.
void flushWord(char buffer[], int &j){
    if(j == 0)
        return;
    buffer[j] = '\0';
    j = 0;
    if(isKeyword(buffer) == 1)
        cout<<buffer<<" is keyword\n";
    else
        cout<<buffer<<" is identifier\n";
}

int main(){
    char buffer[15], operators[] = "+-*/%=";
    ifstream fin("program.txt");
    int c, i, j = 0;

    if(!fin.is_open()){
        cout<<"error while opening the file\n";
        exit(1);
    }

    // get() returns EOF at end of file; testing it in the loop condition
    // avoids processing one spurious character after the last read.
    while((c = fin.get()) != EOF){
        char ch = (char)c;

        if(isalnum((unsigned char)ch)){
            if(j < 14)              // leave room for the terminating '\0'
                buffer[j++] = ch;
            continue;
        }

        // Any non-alphanumeric character ends the current word.
        flushWord(buffer, j);

        for(i = 0; i < 6; ++i){
            if(ch == operators[i])
                cout<<ch<<" is operator\n";
        }
    }

    flushWord(buffer, j);           // word that runs up to the end of the file
    fin.close();

    return 0;
}

Output

EXPERIMENT NO. 4
Aim :- Develop an LL(1) Parser

#include<stdio.h>
#include<ctype.h>
#include<string.h>
void followfirst(char, int, int);
void follow(char c);
void findfirst(char, int, int);
int count, n = 0;
char calc_first[10][100];
char calc_follow[10][100];
int m = 0;
char production[10][10];
char f[10], first[10];
int k;
char ck;
int e;
int main(int argc, char **argv)
{ int jm = 0;
int km = 0;
int i, choice;
char c, ch;
count = 8;
strcpy(production[0], "E=TR");
strcpy(production[1], "R=+TR");
strcpy(production[2], "R=#");
strcpy(production[3], "T=FY");
strcpy(production[4], "Y=*FY");
strcpy(production[5], "Y=#");
strcpy(production[6], "F=(E)");
strcpy(production[7], "F=i");
int kay;
char done[count];
int ptr = -1;
for(k = 0; k < count; k++) {
for(kay = 0; kay < 100; kay++) {
calc_first[k][kay] = '!'; } }
int point1 = 0, point2, xxx;
for(k = 0; k < count; k++)
{ c = production[k][0];
point2 = 0;
xxx = 0;
for(kay = 0; kay <= ptr; kay++)
if(c == done[kay])
xxx = 1;
if (xxx == 1)
continue;
findfirst(c, 0, 0);
ptr += 1;
done[ptr] = c;
printf("\n First(%c) = { ", c);
calc_first[point1][point2++] = c;
for(i = 0 + jm; i < n; i++) {
int lark = 0, chk = 0;
for(lark = 0; lark < point2; lark++) {
if (first[i] == calc_first[point1][lark])
{ chk = 1;
break; } }
if(chk == 0)
{ printf("%c, ", first[i]);
calc_first[point1][point2++] = first[i]; } }
printf("}\n");
jm = n;
point1++; }
printf("\n");
printf("-----------------------------------------------\n\n");
char donee[count];
ptr = -1;
for(k = 0; k < count; k++) {
for(kay = 0; kay < 100; kay++) {
calc_follow[k][kay] = '!';
}}
point1 = 0;
int land = 0;
for(e = 0; e < count; e++)
{ ck = production[e][0];
point2 = 0;
xxx = 0;
for(kay = 0; kay <= ptr; kay++)
if(ck == donee[kay])
xxx = 1;
if (xxx == 1)
continue;
land += 1;
follow(ck);
ptr += 1;
donee[ptr] = ck;
printf(" Follow(%c) = { ", ck);
calc_follow[point1][point2++] = ck;
for(i = 0 + km; i < m; i++) {
int lark = 0, chk = 0;
for(lark = 0; lark < point2; lark++)
{if (f[i] == calc_follow[point1][lark])
{chk = 1;
break; } }
if(chk == 0)
{ printf("%c, ", f[i]);
calc_follow[point1][point2++] = f[i]; } }
printf(" }\n\n");
km = m;
point1++; } }
void follow(char c)
{ int i, j;
if(production[0][0] == c) {
f[m++] = '$';
}
for(i = 0; i < 10; i++)
{ for(j = 2;j < 10; j++)
{ if(production[i][j] == c)
{ if(production[i][j+1] != '\0')
{ followfirst(production[i][j+1], i, (j+2)); }
if(production[i][j+1]=='\0' && c!=production[i][0])
{ follow(production[i][0]);
}}}}}
void findfirst(char c, int q1, int q2)
{ int j;
if(!(isupper(c))) {
first[n++] = c;
}
for(j = 0; j < count; j++)
{
if(production[j][0] == c)
{ if(production[j][2] == '#')
{
if(production[q1][q2] == '\0')
first[n++] = '#';
else if(production[q1][q2] != '\0'
&& (q1 != 0 || q2 != 0)) {
findfirst(production[q1][q2], q1, (q2+1));
}
else
first[n++] = '#';
}
else if(!isupper(production[j][2]))
{first[n++] = production[j][2];}
else
{findfirst(production[j][2], j, 3); } } } }
void followfirst(char c, int c1, int c2)
{ int k;
if(!(isupper(c)))
f[m++] = c;
else
{
int i = 0, j = 1;
for(i = 0; i < count; i++)
{
if(calc_first[i][0] == c)
break;
}
while(calc_first[i][j] != '!')
{
if(calc_first[i][j] != '#')
{
f[m++] = calc_first[i][j];
}
else
{
if(production[c1][c2] == '\0')
{
follow(production[c1][0]);
}
else
{
followfirst(production[c1][c2], c1, c2+1);
}
} j++; } } }
Output :

First(E)= { (, i, } Follow(E) = { $, ), }

First(R)= { +, #, } Follow(R) = { $, ), }

First(T)= { (, i, } Follow(T) = { +, $, ), }

First(Y)= { *, #, } Follow(Y) = { +, $, ), }

First(F)= { (, i, } Follow(F) = { *, +, $, ), }
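
The program above prints the FIRST and FOLLOW sets only. One way to use them is a table-driven LL(1) parser. The sketch below hard-codes the parsing table that these sets give for the same grammar (E=TR, R=+TR|#, T=FY, Y=*FY|#, F=(E)|i). The function names table_entry and ll1_parse are assumptions made for this example and are not part of the program above; the epsilon production # is represented by an empty string, and the input is assumed to end with $.

```c
#include <stddef.h>
#include <string.h>

/* Right-hand side selected by the LL(1) table for nonterminal nt on
   lookahead a. "" stands for the epsilon production (#), NULL for an
   error entry. Entries follow the FIRST and FOLLOW sets listed above. */
static const char *table_entry(char nt, char a)
{
    switch (nt) {
    case 'E': return (a == '(' || a == 'i') ? "TR" : NULL;
    case 'R': return (a == '+') ? "+TR"
                   : (a == ')' || a == '$') ? "" : NULL;
    case 'T': return (a == '(' || a == 'i') ? "FY" : NULL;
    case 'Y': return (a == '*') ? "*FY"
                   : (a == '+' || a == ')' || a == '$') ? "" : NULL;
    case 'F': return (a == '(') ? "(E)" : (a == 'i') ? "i" : NULL;
    default:  return NULL;
    }
}

/* Return 1 if input (which must end with '$') is derivable from E, else 0. */
int ll1_parse(const char *input)
{
    char stack[128];
    size_t top = 0;
    size_t pos = 0;

    stack[top++] = '$';          /* bottom-of-stack marker */
    stack[top++] = 'E';          /* start symbol */

    while (top > 0) {
        char x = stack[--top];   /* symbol on top of the stack */
        char a = input[pos];     /* current lookahead */

        if (x == '$')
            return a == '$' && input[pos + 1] == '\0';

        if (x == 'E' || x == 'R' || x == 'T' || x == 'Y' || x == 'F') {
            const char *rhs = table_entry(x, a);
            size_t len, k;

            if (rhs == NULL)
                return 0;        /* empty table cell: syntax error */
            len = strlen(rhs);
            if (top + len > sizeof stack)
                return 0;        /* input nests too deeply for this sketch */
            for (k = len; k > 0; k--)
                stack[top++] = rhs[k - 1];   /* push RHS in reverse order */
        } else {
            if (x != a)
                return 0;        /* terminal mismatch */
            pos++;               /* terminal matched: consume it */
        }
    }
    return 0;
}
```

For example, ll1_parse("i+i*i$") returns 1, while ll1_parse("i+*i$") returns 0 because the table has no entry for T on lookahead *.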

EXPERIMENT NO. 5

Aim :- Develop an operator precedence parser (including construction of the precedence table).

#include<stdio.h>
#include<string.h>

int main(){
    /* opt[i][j] holds the relation between terminal ter[i] on the stack and
       terminal ter[j] in the input; only the first character typed is kept. */
    char stack[40],ip[20],opt[10][10],ter[11],cell[16];
    int i,j,k,n,top=0,col,row;

    memset(stack,0,sizeof stack);
    memset(ip,0,sizeof ip);
    memset(opt,0,sizeof opt);
    memset(ter,0,sizeof ter);

    printf("Enter the no.of terminals :\n");
    if(scanf("%d",&n)!=1 || n<1 || n>10)
        return 1;
    printf("\nEnter the terminals :\n");
    if(scanf("%10s",ter)!=1 || (int)strlen(ter)!=n)
        return 1;
    printf("\nEnter the table values :\n");
    for(i=0;i<n;i++)
    {
        for(j=0;j<n;j++)
        {
            printf("Enter the value for %c %c:",ter[i],ter[j]);
            if(scanf("%15s",cell)!=1)
                return 1;
            opt[i][j]=cell[0];
        }
    }
    printf("\n**** OPERATOR PRECEDENCE TABLE ****\n");
    for(i=0;i<n;i++)
    {
        printf("\t%c",ter[i]);
    }
    printf("\n");
    for(i=0;i<n;i++){
        printf("\n%c",ter[i]);
        for(j=0;j<n;j++){printf("\t%c",opt[i][j]);}
    }
    stack[top]='$';
    printf("\nEnter the input string:");
    if(scanf("%19s",ip)!=1)
        return 1;
    i=0;
    printf("\nSTACK\t\t\tINPUT STRING\t\t\tACTION\n");
    printf("\n%s\t\t\t%s\t\t\t",stack,ip);
    while(i<=(int)strlen(ip))
    {
        col=row=-1;
        for(k=0;k<n;k++)
        {
            if(stack[top]==ter[k])
                col=k;
            if(ip[i]==ter[k])
                row=k;
        }
        if((stack[top]=='$')&&(ip[i]=='$')){
            printf("String is accepted\n");
            break;
        }
        /* A symbol missing from the table (for example an input that does
           not end with $) cannot be parsed. */
        if(col<0 || row<0){
            printf("\nString is not accepted");
            break;
        }
        if((opt[col][row]=='<') ||(opt[col][row]=='='))
        {
            stack[++top]=opt[col][row];
            stack[++top]=ip[i];
            printf("Shift %c",ip[i]);
            i++;
        }
        else if(opt[col][row]=='>')
        {
            /* Pop the handle back to, and including, its '<' marker. */
            while(top>0 && stack[top]!='<'){--top;}
            if(top>0)
                top=top-1;
            printf("Reduce");
        }
        else
        {
            printf("\nString is not accepted");
            break;
        }
        printf("\n");
        for(k=0;k<=top;k++)
        {
            printf("%c",stack[k]);
        }
        printf("\t\t\t");
        for(k=i;k<(int)strlen(ip);k++){
            printf("%c",ip[k]);
        }
        printf("\t\t\t");
    }
    return 0;
}

Output :

Enter the value for * *:>
Enter the value for * $:>
Enter the value for $ i:<
Enter the value for $ +:<
Enter the value for $ *:<
Enter the value for $ $:accept

**** OPERATOR PRECEDENCE TABLE ****

        i       +       *       $
i       e       >       >       >
+       <       >       <       >
*       <       >       >       >
$       <       <       <       a

Enter the input string:
i*i

STACK       INPUT STRING       ACTION

$           i*i                Shift i
$<i         *i                 Reduce
$           *i                 Shift *
$<*         i                  Shift i
$<*<i
String is not accepted

EXPERIMENT NO. 6

Aim :- Develop a recursive descent parser

// Recursive Descent Parser
#include <stdio.h>
#include <string.h>
#include <ctype.h>

char input[100];
char prod[512][16];     /* productions applied, in order */
int pos=-1,l,st=-1;
int error=0;            /* set when no rule matches the current symbol */

void E();
void T();
void F();
void advance();
void Td();
void Ed();

/* Store one production in the trace, ignoring it if the trace is full. */
void record(const char *p)
{
    if(st<511)
        strcpy(prod[++st],p);
}

/* Move to the next input symbol; never move past the end marker. */
void advance()
{
    if(pos<l)
        pos++;
}

void E()
{
    record("E->TE'");
    T();
    Ed();
}

void Ed()
{
    if(input[pos]=='+')
    {
        record("E'->+TE'");
        advance();
        T();
        Ed();
    }
    else if(input[pos]=='-')
    {
        record("E'->-TE'");
        advance();
        T();
        Ed();
    }
    else
    {
        record("E'->null");
    }
}

void T()
{
    record("T->FT'");
    F();
    Td();
}

void Td()
{
    if(input[pos]=='*')
    {
        record("T'->*FT'");
        advance();
        F();
        Td();
    }
    else if(input[pos]=='/')
    {
        record("T'->/FT'");
        advance();
        F();
        Td();
    }
    else
    {
        record("T'->null");
    }
}

void F()
{
    if(isalpha((unsigned char)input[pos]))
    {
        record("F->id");        /* an identifier is a single letter */
        advance();
    }
    else if(isdigit((unsigned char)input[pos]))
    {
        record("F->num");       /* a number is a single digit */
        advance();
    }
    else if(input[pos]=='(')
    {
        record("F->(E)");
        advance();
        E();
        if(input[pos]==')')
            advance();
        else
            error=1;            /* missing closing parenthesis */
    }
    else
    {
        error=1;                /* no factor can start with this symbol */
    }
}

int main()
{
    int i;
    printf("Enter Input String ");
    if(scanf("%98s",input)!=1)
        return 1;
    l=strlen(input);
    input[l]='$';               /* end marker */
    advance();
    E();
    if(pos==l && !error)
    {
        printf("String Accepted\n");
        for(i=0;i<=st;i++)
        {
            printf("%s\n",prod[i]);
        }
    }
    else
    {
        printf("String rejected\n");
    }
    return 0;
}

OUTPUT:

EXPERIMENT NO. 7

Aim :- Write an algorithm to convert NFA to DFA

Problem Statement
Let X = (Qx, ∑, δx, q0, Fx) be an NDFA which accepts the language L(X). We have to design
an equivalent DFA Y = (Qy, ∑, δy, q0, Fy) such that L(Y) = L(X). The following procedure
converts the NDFA to its equivalent DFA −

Algorithm
Input − An NDFA

Output − An equivalent DFA


Step 1 − Create the state table of the given NDFA.

Step 2 − Create a blank state table for the equivalent DFA, with one column for each input
symbol.

Step 3 − Mark the start state of the DFA as q0 (the same as in the NDFA).

Step 4 − For the current DFA state, which is a set of NDFA states {Q0, Q1,... , Qn}, find for
each input symbol the union of the states that its members can move to. That union is the
next DFA state under that symbol.

Step 5 − Each time a new DFA state appears under the input symbol columns, apply step 4 to
it. When no new state appears, go to step 6.

Step 6 − The DFA states which contain any of the final states of the NDFA are the final
states of the equivalent DFA.

Example

Let us consider the NDFA shown in the figure below.


Q      δ(q,0)         δ(q,1)

a      {a,b,c,d,e}    {d,e}
b      {c}            {e}
c      ∅              {b}
d      {e}            ∅
e      ∅              ∅

Using the above algorithm, we find its equivalent DFA. The state table of the DFA is shown
below.

Q              δ(q,0)         δ(q,1)

[a]            [a,b,c,d,e]    [d,e]
[a,b,c,d,e]    [a,b,c,d,e]    [b,d,e]
[d,e]          [e]            ∅
[b,d,e]        [c,e]          [e]
[e]            ∅              ∅
[c,e]          ∅              [b]
[b]            [c]            [e]
[c]            ∅              [b]
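
The conversion can be sketched in C as a worklist over sets of NDFA states. This is a minimal illustration, not part of the original record: it encodes each set of states as a bitmask (bit 0 = a, bit 1 = b, and so on), it assumes that a is the start state, and the names nfa, move and subset_construction are chosen only for this example.

```c
#define N_STATES 5    /* NDFA states a..e are numbered 0..4 */
#define N_SYMS   2    /* input alphabet {0, 1} */
#define MAX_DFA  32   /* at most 2^5 subsets of NDFA states */

/* NDFA transition table from the example; every entry is a set of states
   stored as a bitmask (a = 0x01, b = 0x02, c = 0x04, d = 0x08, e = 0x10). */
static const unsigned nfa[N_STATES][N_SYMS] = {
    /* a */ { 0x1F, 0x18 },   /* {a,b,c,d,e}, {d,e} */
    /* b */ { 0x04, 0x10 },   /* {c},         {e}   */
    /* c */ { 0x00, 0x02 },   /* empty,       {b}   */
    /* d */ { 0x10, 0x00 },   /* {e},         empty */
    /* e */ { 0x00, 0x00 }    /* empty,       empty */
};

/* Step 4: union of the moves of every NDFA state in subset on symbol sym. */
static unsigned move(unsigned subset, int sym)
{
    unsigned result = 0;
    int q;

    for (q = 0; q < N_STATES; q++)
        if (subset & (1u << q))
            result |= nfa[q][sym];
    return result;
}

/* Fill dfa_states[] and dfa_trans[][] starting from the subset start.
   Returns the number of non-empty DFA states that were discovered. */
int subset_construction(unsigned start,
                        unsigned dfa_states[MAX_DFA],
                        unsigned dfa_trans[MAX_DFA][N_SYMS])
{
    int count = 0;
    int i, j, s;

    dfa_states[count++] = start;
    for (i = 0; i < count; i++) {            /* step 5: process each new state */
        for (s = 0; s < N_SYMS; s++) {
            unsigned target = move(dfa_states[i], s);

            dfa_trans[i][s] = target;
            if (target == 0)
                continue;                    /* empty set: nothing to add */
            for (j = 0; j < count; j++)
                if (dfa_states[j] == target)
                    break;
            if (j == count)
                dfa_states[count++] = target;   /* a new DFA state */
        }
    }
    return count;
}
```

Called with the start subset 0x01 (the set {a}), the function discovers the eight states of the table above in the same order. A DFA state is final when its bitmask shares a bit with the set of final NDFA states (step 6); that set is not listed in the table, so the sketch leaves it out.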

The state diagram of the DFA is as follows −


EXPERIMENT NO. 8

Aim :- Write an algorithm to minimize a DFA

DFA minimization stands for converting a given DFA to its equivalent DFA with minimum
number of states.

Minimization of DFA
Suppose there is a DFA D = < Q, Σ, q0, δ, F > which recognizes a language L. Then the
minimized DFA D' = < Q', Σ, q0, δ', F' > can be constructed for language L as:
Step 1: Divide Q (the set of states) into two sets. One set contains all final states and
the other set contains all non-final states. This partition is called P0.
Step 2: Initialize k = 1.
Step 3: Find Pk by partitioning the sets of Pk-1. In each set of Pk-1, take all possible
pairs of states. Two states are distinguishable if, on some input symbol, they move to
states in different sets of Pk-1. If two states of a set are distinguishable, split them
into different sets in Pk.
Step 4: If Pk = Pk-1 (no change in partition), stop. Otherwise increment k and repeat
step 3.
Step 5: Merge all states of one set into one state. The number of states in the minimized
DFA is equal to the number of sets in Pk.
Example
Consider the following DFA shown in figure.

Step 1. P0 will have two sets of states. One set will contain q1, q2, q4 which are final states
of DFA and another set will contain remaining states. So P0 = { { q1, q2, q4 }, { q0, q3, q5
} }.
Step 2. To calculate P1, we will check whether sets of partition P0 can be partitioned or
not:
i) For set { q1, q2, q4 } :
δ ( q1, 0 ) = δ ( q2, 0 ) = q2 and δ ( q1, 1 ) = δ ( q2, 1 ) = q5, So q1 and q2 are not
distinguishable.
Similarly, δ ( q1, 0 ) = δ ( q4, 0 ) = q2 and δ ( q1, 1 ) = δ ( q4, 1 ) = q5, So q1 and q4 are
not distinguishable.
Since, q1 and q2 are not distinguishable and q1 and q4 are also not distinguishable, So
q2 and q4 are not distinguishable. So, { q1, q2, q4 } set will not be partitioned in P1.
ii) For set { q0, q3, q5 } :
δ ( q0, 0 ) = q3 and δ ( q3, 0 ) = q0
δ ( q0, 1) = q1 and δ( q3, 1 ) = q4
Moves of q0 and q3 on input symbol 0 are q3 and q0 respectively, which are in the same set
in partition P0. Similarly, moves of q0 and q3 on input symbol 1 are q1 and q4, which are
in the same set in partition P0. So, q0 and q3 are not distinguishable.
δ ( q0, 0 ) = q3 and δ ( q5, 0 ) = q5 and δ ( q0, 1 ) = q1 and δ ( q5, 1 ) = q5
Moves of q0 and q5 on input symbol 1 are q1 and q5 respectively, which are in different
sets in partition P0. So, q0 and q5 are distinguishable. So, set { q0, q3, q5 } will be
partitioned into { q0, q3 } and { q5 }. So,
P1 = { { q1, q2, q4 }, { q0, q3}, { q5 } }

To calculate P2, we will check whether sets of partition P1 can be partitioned or not:
iii)For set { q1, q2, q4 } :
δ ( q1, 0 ) = δ ( q2, 0 ) = q2 and δ ( q1, 1 ) = δ ( q2, 1 ) = q5, So q1 and q2 are not
distinguishable.
Similarly, δ ( q1, 0 ) = δ ( q4, 0 ) = q2 and δ ( q1, 1 ) = δ ( q4, 1 ) = q5, So q1 and q4 are
not distinguishable.
Since, q1 and q2 are not distinguishable and q1 and q4 are also not distinguishable, So
q2 and q4 are not distinguishable. So, { q1, q2, q4 } set will not be partitioned in P2.
iv)For set { q0, q3 } :
δ ( q0, 0 ) = q3 and δ ( q3, 0 ) = q0
δ ( q0, 1 ) = q1 and δ ( q3, 1 ) = q4
Moves of q0 and q3 on input symbol 0 are q3 and q0 respectively, which are in the same set
in partition P1. Similarly, moves of q0 and q3 on input symbol 1 are q1 and q4, which are
in the same set in partition P1. So, q0 and q3 are not distinguishable.
v) For set { q5 }:
Since we have only one state in this set, it can’t be further partitioned. So,
P2 = { { q1, q2, q4 }, { q0, q3 }, { q5 } }
Since P1 = P2, this is the final partition. Partition P2 means that states q1, q2 and q4
are merged into one. Similarly, q0 and q3 are merged into one. The minimized DFA
corresponding to the DFA of Figure 1 is shown in Figure 2.
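
The partitioning procedure can be sketched in C as follows. This is a minimal illustration under stated assumptions: the transition table delta and the final-state flags are read off the δ values quoted in the worked example, and the names delta, is_final and minimize are chosen only for this sketch.

```c
#define N_STATES 6    /* q0..q5 */
#define N_SYMS   2    /* input alphabet {0, 1} */

/* delta[q][symbol], taken from the transitions quoted in the example. */
static const int delta[N_STATES][N_SYMS] = {
    { 3, 1 },   /* q0 */
    { 2, 5 },   /* q1 */
    { 2, 5 },   /* q2 */
    { 0, 4 },   /* q3 */
    { 2, 5 },   /* q4 */
    { 5, 5 }    /* q5 */
};

/* q1, q2 and q4 are the final states. */
static const int is_final[N_STATES] = { 0, 1, 1, 0, 1, 0 };

/* Write the set number of every state into block[] and return the number
   of sets in the final partition. */
int minimize(int block[N_STATES])
{
    int next[N_STATES];
    int n_blocks = 0;      /* forces at least one refinement pass */
    int n_next, changed, q, r, s;

    /* P0: non-final states in set 0, final states in set 1. */
    for (q = 0; q < N_STATES; q++)
        block[q] = is_final[q];

    do {
        n_next = 0;
        for (q = 0; q < N_STATES; q++) {
            next[q] = -1;
            /* Look for an earlier state that q cannot be distinguished from. */
            for (r = 0; r < q; r++) {
                int same = (block[r] == block[q]);

                for (s = 0; same && s < N_SYMS; s++)
                    if (block[delta[r][s]] != block[delta[q][s]])
                        same = 0;
                if (same) {
                    next[q] = next[r];
                    break;
                }
            }
            if (next[q] == -1)
                next[q] = n_next++;      /* q starts a new set in Pk */
        }
        /* Refinement only splits sets, so an unchanged count means Pk = Pk-1. */
        changed = (n_next != n_blocks);
        for (q = 0; q < N_STATES; q++)
            block[q] = next[q];
        n_blocks = n_next;
    } while (changed);

    return n_blocks;
}
```

For this DFA, minimize returns 3 and places q0 and q3 in one set, q1, q2 and q4 in another, and q5 alone, which agrees with partition P2 above.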
EXPERIMENT NO. 9

Aim :- Write a program to check whether a given string belongs to the grammar or not

#include<stdio.h>
#include<string.h>

int main() {
    char string[50];
    int count=0,a_count=0,b_count=0;

    printf("The grammar is: S->aS, S->Sb, S->ab\n");
    printf("Enter the string to be checked:\n");
    if(fgets(string,sizeof string,stdin)==NULL)
        return 1;
    string[strcspn(string,"\n")]='\0';   /* drop the newline kept by fgets */

    /* The grammar generates one or more a's followed by one or more b's. */
    while(string[count]=='a') {
        a_count++;
        count++;
    }
    while(string[count]=='b') {
        b_count++;
        count++;
    }

    /* Accept only if both runs are non-empty and nothing follows them. */
    if(a_count>=1 && b_count>=1 && string[count]=='\0')
        printf("String accepted....!!!!");
    else
        printf("The string does not belong to the specified grammar");
    return 0;
}

Output

The grammar is: S->aS, S->Sb, S->ab
Enter the string to be checked:
aab
String accepted....!!!!
EXPERIMENT NO. 10

Aim :- To study an object-oriented compiler

1. Object-Oriented Compiler Construction
