
TERM PAPER

SUB: SYSTEM SOFTWARE
Topic: SOURCE TO SOURCE COMPILER

Submitted By
CHETAN ANAND

RD1801A08
Reg.No-10808811

Submitted To
Ms. Neha Tewari
TABLE OF CONTENTS
1. COMPILER
• Structure of compiler
• One-pass vs. multi-pass compilers
2. SOURCE TO SOURCE COMPILER
• Definition
• Purpose
• Symbolic Manipulation
• Analyses and Transformations
Compiler
A compiler is a computer program or set of programs that
transforms source code written in a programming language
into another computer language known as object code. The
most common reason for wanting to transform source code is
to create an executable program.

The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower-level language, e.g., assembly language or machine code. If the compiled program can only run on a computer whose CPU or operating system is different from the one on which the compiler runs, the compiler is known as a cross-compiler. A program that translates from a low-level language to a higher-level one is a decompiler. A program that translates between high-level languages is usually called a language translator, source-to-source translator, or language converter. A language rewriter is usually a program that translates the form of expressions without a change of language.

A compiler is likely to perform many or all of the following operations:

1. lexical analysis
2. preprocessing
3. parsing
4. semantic analysis
5. code generation
6. code optimization.
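As a sketch of the first of these phases, a minimal lexical analyzer might split an expression into number and operator tokens. The token names and the `next_token` helper below are illustrative, not taken from any particular compiler:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Token kinds for a toy expression language (illustrative names). */
enum kind { TOK_NUM, TOK_OP, TOK_END };

struct token { enum kind kind; const char *start; int len; };

/* Scan the next token starting at *p, advancing the cursor past it. */
struct token next_token(const char **p)
{
    while (isspace((unsigned char)**p)) (*p)++;
    struct token t = { TOK_END, *p, 0 };
    if (**p == '\0') return t;
    if (isdigit((unsigned char)**p)) {
        t.kind = TOK_NUM;
        while (isdigit((unsigned char)**p)) { (*p)++; t.len++; }
    } else {
        t.kind = TOK_OP;
        (*p)++; t.len = 1;
    }
    return t;
}
```

For the input "12 + 34" this scanner yields a number token "12", an operator token "+", a number token "34", and then an end-of-input token; the parser would consume this stream next.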

Program faults caused by incorrect compiler behavior can be very difficult to track down and work around, and compiler implementors invest a lot of time ensuring the correctness of their software.

The term compiler-compiler is sometimes used to refer to a parser generator, a tool often used to help create the lexer and parser.
Structure of compiler

Compilers bridge source programs in high-level languages with the underlying hardware. A compiler must:

1) recognize the legitimacy of programs,

2) generate correct and efficient code,

3) manage run-time organization,

4) format output according to assembler or linker conventions.

A compiler consists of three main parts: frontend, middle-end, and backend.

• Frontend checks whether the program is correctly written in terms of the programming language syntax and semantics. Here legal and illegal programs are recognized. Errors are reported, if any, in a useful way. Type checking is also performed by collecting type information. The frontend generates an intermediate representation (IR) for the middle-end. This part is well understood and much of it has been automated; efficient algorithms exist, typically running in O(n) or O(n log n).

• Middle-end is where the optimizations for performance take place. Typical transformations are removal of useless or unreachable code, discovering and propagating constant values, relocation of a computation to a less frequently executed place (e.g., out of a loop), and specializing a computation based on its context. The middle-end generates IR for the following backend. Most optimization efforts are focused on this part.

• Backend is responsible for translating the IR into the target assembly code. Target instructions are chosen for each IR instruction, and registers are allocated to variables. The backend utilizes the hardware by figuring out how to keep parallel functional units busy, filling delay slots, and so on. Although many of the underlying optimization problems are NP-hard, heuristic techniques are well developed.
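The constant propagation and useless-code removal mentioned above can be sketched on a tiny expression tree. The node layout here is invented for illustration and is not any real compiler's IR:

```c
#include <assert.h>
#include <stdlib.h>

/* A tiny expression IR: either a constant leaf or a binary '+' / '*'. */
struct expr {
    char op;             /* 0 for a constant leaf, otherwise '+' or '*' */
    long val;            /* meaningful only for leaves */
    struct expr *lhs, *rhs;
};

static struct expr *leaf(long v)
{
    struct expr *e = calloc(1, sizeof *e);
    e->val = v;
    return e;
}

static struct expr *node(char op, struct expr *l, struct expr *r)
{
    struct expr *e = calloc(1, sizeof *e);
    e->op = op; e->lhs = l; e->rhs = r;
    return e;
}

/* Fold constant subtrees bottom-up, as a middle-end pass might. */
static struct expr *fold(struct expr *e)
{
    if (e->op == 0) return e;
    e->lhs = fold(e->lhs);
    e->rhs = fold(e->rhs);
    if (e->lhs->op == 0 && e->rhs->op == 0) {
        long v = (e->op == '+') ? e->lhs->val + e->rhs->val
                                : e->lhs->val * e->rhs->val;
        return leaf(v);  /* the operand nodes leak here; a real pass would free them */
    }
    return e;
}
```

Folding the tree for (2+3)*4 collapses it to a single constant leaf, so the backend never has to emit the addition or multiplication at all.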
One-pass versus multi-pass compilers
Classifying compilers by number of passes has its background in the hardware resource limitations of computers. Compiling involves performing a great deal of work, and early computers did not have enough memory to contain one program that did all of it. So compilers were split up into smaller programs, each of which made a pass over the source, performing some of the required analysis and translation.

The ability to compile in a single pass has classically been seen as a benefit because it simplifies the job of writing a compiler, and one-pass compilers generally compile faster than multi-pass compilers. Thus, partly driven by the resource limitations of early systems, many early languages were specifically designed so that they could be compiled in a single pass (e.g., Pascal).

In some cases the design of a language feature may require a compiler to perform more than one pass over the source. For instance, consider a declaration appearing on line 20 of the source which affects the translation of a statement appearing on line 10. In this case, the first pass needs to gather information about declarations appearing after the statements that they affect, with the actual translation happening during a subsequent pass.
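The situation just described, where a later definition affects an earlier statement, can be sketched with a two-pass resolver for forward jump labels, a classic instance of the same problem. The instruction format below is invented for illustration:

```c
#include <assert.h>
#include <string.h>

/* A toy program: each instruction either defines a label ('D') or
 * jumps to one ('J'). A one-pass translator could not resolve a
 * jump to a label defined later in the program. */
struct instr { char kind; char label; };

/* Pass 1 records the position of every label definition;
 * pass 2 resolves each jump to the recorded position. */
static void resolve(const struct instr *prog, int n, int *targets)
{
    int pos_of[26];
    memset(pos_of, -1, sizeof pos_of);       /* -1 = label not yet seen */
    for (int i = 0; i < n; i++)              /* pass 1 */
        if (prog[i].kind == 'D')
            pos_of[prog[i].label - 'A'] = i;
    for (int i = 0; i < n; i++)              /* pass 2 */
        targets[i] = (prog[i].kind == 'J')
                   ? pos_of[prog[i].label - 'A'] : -1;
}
```

In the program {jump B, define A, define B}, the jump at position 0 resolves to position 2 even though B is defined after it, which a strict single pass could not do without backpatching.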
The disadvantage of compiling in a single pass is that it is not possible to
perform many of the sophisticated optimizations needed to generate high
quality code. It can be difficult to count exactly how many passes an
optimizing compiler makes. For instance, different phases of optimization
may analyse one expression many times but only analyse another
expression once.

Splitting a compiler up into small programs is a technique used by researchers interested in producing provably correct compilers. Proving the correctness of a set of small programs often requires less effort than proving the correctness of a larger, single, equivalent program.

While the typical multi-pass compiler outputs machine code from its final
pass, there are several other types:
• A "source-to-source compiler" is a type of compiler that takes a high-level language as its input and outputs a high-level language. For example, an automatic parallelizing compiler will frequently take in a high-level language program as an input and then transform the code and annotate it with parallel code annotations or language constructs.
• Stage compiler, which compiles to the assembly language of a theoretical machine, like some Prolog implementations.
  • This Prolog machine is also known as the Warren Abstract Machine (or WAM).
  • Bytecode compilers for Java, Python, and many more are also a subtype of this.
• Just-in-time compiler, used by Smalltalk and Java systems, and also by Microsoft .NET's Common Intermediate Language (CIL)

Source-to-source compiler

Definition
A source-to-source compiler is a type of compiler that takes a high-level programming language as its input and outputs a high-level language. For example, an automatic parallelizing compiler will frequently take in a high-level language program as an input and then transform the code and annotate it with parallel code annotations (e.g., OpenMP) or language constructs (e.g., Fortran's DOALL statements).
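As a sketch of such an annotation, a parallelizing source-to-source compiler might turn a plain loop into one carrying an OpenMP pragma. The function name here is illustrative; without -fopenmp the pragma is simply ignored and the loop runs serially:

```c
#include <assert.h>

/* Sum an array; the pragma is the kind of annotation a parallelizing
 * source-to-source compiler could insert after proving the loop's
 * iterations independent of one another. */
double annotated_sum(const double *a, int n)
{
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

The reduction clause tells the OpenMP runtime to give each thread a private partial sum and combine them at the end, so the annotated program computes the same result as the original serial loop.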

Purpose
Purpose of source-to-source-compiling is translating legacy
code to use the next version of the underlying programming
language or an API that breaks backward compatibility.
Examples of this are seen in Python 3000 or Qt's qt3to4 tool.

The Low Level Virtual Machine (LLVM) can translate from any language supported by gcc 4.2.1 (Ada, C, C++, Fortran, Java, Objective-C, or Objective-C++) to any of C, C++, or MSIL, by way of the -march option of the llc command in the llvm-gcc toolchain.

E.g.:
% llvm-g++ x.cpp -o program
% llc -march=c program.bc -o program.c
% cc program.c

% llvm-g++ x.cpp -o program
% llc -march=msil program.bc -o program.msil

Symbolic Manipulation
Like its predecessor Polaris [1, 2], a key feature of Cetus is its ability to reason about the represented program in symbolic terms. For example, program analysis and optimization techniques at the source level often require the expressions in the program to be in a simplified form. A specific example is a data dependence test, which collects the coefficients of affine subscript expressions to interface with the underlying data dependence solvers. Cetus supports such expression manipulations by providing a set of utilities that simplify and normalize symbolic expressions.

1+2*a+4-a → 5+a (folding)
a*(b+c) → a*b+a*c (distribution)
(a*2)/(8*c) → a/(4*c) (division)
(1-a)<(b+2) → (1+a+b)>0 (normalization)
a && 0 && b → 0 (short-circuit evaluation)
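These rewrites preserve the value of the expression, and the arithmetic ones can be spot-checked for sample operand values. The check below is just a sanity test, not how Cetus verifies its rules:

```c
#include <assert.h>

/* Each pair of functions returns the left- and right-hand side of
 * one of the rewrite rules listed above. */
static int    fold_lhs(int a)                { return 1 + 2*a + 4 - a; }
static int    fold_rhs(int a)                { return 5 + a; }
static int    dist_lhs(int a, int b, int c)  { return a * (b + c); }
static int    dist_rhs(int a, int b, int c)  { return a*b + a*c; }
static double div_lhs(double a, double c)    { return (a*2) / (8*c); }
static double div_rhs(double a, double c)    { return a / (4*c); }
static int    norm_lhs(int a, int b)         { return (1 - a) < (b + 2); }
static int    norm_rhs(int a, int b)         { return (1 + a + b) > 0; }
```

For instance, with a = 3 both sides of the folding rule evaluate to 8, and with a = 6, c = 3 both sides of the division rule evaluate to 0.5.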

ANALYSES AND TRANSFORMATIONS

Automatically instrumenting source programs is a commonly used compiler capability. The basic loop instrumenter shown below inserts timing calls around each for loop in the given input program. Here we assume that the function call cetus_tic(id) stores the current time (via gettimeofday, for example) with the given integer tag, and cetus_toc(id) computes and stores the time difference since the last call of cetus_tic(id). At the end of an application run, basic runtime statistics on the instrumented code sections can be generated from this information.

class Instrumenter
{
  ...
  public void instrumentLoops(Program p)
  {
    DepthFirstIterator iter = new DepthFirstIterator(p);
    int loop_number = 0;
    while ( iter.hasNext() ) {
      Object obj = iter.next();
      if ( obj instanceof ForLoop )
        insertTimingCalls((ForLoop) obj, loop_number++);
    }
  }

  private void insertTimingCalls(ForLoop loop, int number)
  {
    FunctionCall tic, toc;
    tic = new FunctionCall(new Identifier("cetus_tic"));
    toc = new FunctionCall(new Identifier("cetus_toc"));
    tic.addArgument(new IntegerLiteral(number));
    toc.addArgument(new IntegerLiteral(number));
    CompoundStatement parent = (CompoundStatement) loop.getParent();
    parent.addStatementBefore(loop, new ExpressionStatement(tic));
    parent.addStatementAfter(loop, new ExpressionStatement(toc));
  }
}
The loop instrumenter shown above inserts timing calls around each for loop in the given input program. Applied to the function in (a) below, it produces the instrumented version that follows.

int foo(void)
{
  int i;
  double t, s, a[100];
  for ( i = 0; i < 50; ++i )
  {
    t = a[i];
    a[i+50] = t + (a[i] + a[i+50])/2.0;
    s = s + 2*a[i];
  }
  return 0;
}
(a)
int foo()
{
  int i;
  double t, s, a[100];
  cetus_tic(0);
  for (i = 0; i < 50; ++i)
  {
    t = a[i];
    a[(i+50)] = (t + ((a[i] + a[(i+50)])/2.0));
    s = (s + (2*a[i]));
  }
  cetus_toc(0);
  return 0;
}
(b)
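The text above treats cetus_tic and cetus_toc as given. A minimal C sketch consistent with that description might look as follows; the table size and the choice of gettimeofday are assumptions, not part of Cetus:

```c
#include <assert.h>
#include <sys/time.h>

#define MAX_SECTIONS 64

static double start_time[MAX_SECTIONS];
static double elapsed[MAX_SECTIONS];

/* Current wall-clock time in seconds, via gettimeofday. */
static double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, 0);
    return tv.tv_sec + tv.tv_usec * 1e-6;
}

/* Store the current time under the given integer tag. */
void cetus_tic(int id) { start_time[id] = now_seconds(); }

/* Accumulate the time elapsed since the last cetus_tic(id). */
void cetus_toc(int id) { elapsed[id] += now_seconds() - start_time[id]; }
```

At program exit, the elapsed[] table holds the total time spent in each instrumented loop, from which per-section statistics can be reported.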

Original code
int a, b, c;
a = b;

a = b + c;

Modified code
int a0, b0, a1, b1, c0, c1;
a0 = b0;
a1 = b1;
if (b0 != b1)
  error();

a0 = b0 + c0;
a1 = b1 + c1;
if ((b0 != b1) || (c0 != c1))
  error();

Original code
int res, a;
res = search(a);

int search(int p)
{
  int q;
  q = p + 1;
  return(1);
}

Modified code
int res0, res1, a0, a1;
search(a0, a1, &res0, &res1);

void search(int p0, int p1, int *r0, int *r1)
{
  int q0, q1;
  q0 = p0 + 1;
  q1 = p1 + 1;
  if (p0 != p1)
    error();

  *r0 = 1;
  *r1 = 1;
  return;
}
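The modified code in these two examples duplicates each variable and computation and calls error() when the two copies diverge. As a runnable sketch of the same pattern, the duplicated statement a = b + c could be packaged as a checked operation; the function name and error counter here are illustrative:

```c
#include <assert.h>

static int error_count = 0;
static void error(void) { error_count++; }

/* Perform the computation on both copies of the inputs and flag
 * divergence, mirroring the duplicated statements shown above. */
int checked_add(int b0, int b1, int c0, int c1, int *a0, int *a1)
{
    *a0 = b0 + c0;
    *a1 = b1 + c1;
    if ((b0 != b1) || (c0 != c1))
        error();
    return error_count;
}
```

When the two copies agree, both results are produced and no error is recorded; if a fault corrupts one copy, the comparison catches the mismatch.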
