Software is of two types:
Systems software: The computer cannot work without it. Ex: operating systems, assemblers, compilers, loaders, macro processors.
Applications software: Packages that are developed using the systems software. Ex: accounts package, salary system package, etc.
Without systems software, the computer is like a dead machine. A compiler is a program that takes a source program in a high-level language as input and produces its equivalent assembly-level language program as output.
Literal Table: It is created by the lexical analyzer to describe all literals used in the source program. There is one entry for each literal, consisting of a value, a number of attributes, an address denoting the location of the literal at execution (filled in by a later phase), and other information. Attributes such as data type or precision can be deduced from the literal itself and are filled in by the lexical analyzer. Format of literal table:
Identifier table: This is also created by the lexical analyzer to describe all identifiers used in the source program. There is one entry for each identifier. The lexical analyzer creates an entry in the table and places the name of the identifier into that entry. Since in many languages an identifier name may be from 1 to 31 characters long, the lexical phase may instead enter a pointer in the identifier table for efficiency of storage. The pointer points to the name in a separate table of names. Later phases fill in the data attributes and address of each identifier. Format of identifier table:
Uniform symbol table: This is created by the lexical analyzer to represent the program as a string of tokens rather than of individual characters. (Spaces and comments in the source code are not represented as uniform symbols and therefore are not used by further phases.) There is one uniform symbol for every token in the program. Each entry in the uniform symbol table contains the identification of the table of which the token is a member and its index within that table. Format of UST:
Terminal Table: It is a permanent database that has an entry for each terminal symbol (Ex: arithmetic operators, keywords, special symbols like #, {, }, etc). Each entry consists of the terminal symbol, an indication of its classification (K: keyword, P: operator, or B: break character), and its precedence (used in later phases).
Source Program: Original form of the program, which appears to the compiler as a string of characters.
Example: Let us consider a COBOL statement:
COMPUTE XYZ = (A + B - 10)
For the above statement we construct the databases as follows:
Terminal Table
SNO   TERMINAL   INDICATOR   BREAKCHAR
1     PERFORM    K           NO
2     GOTO       K           NO
3     COMPUTE    K           NO
4     ..
5     ..
..    ..
10    BLANK
11    +          P
12    -          P
13    *          P
14    /          P
15    =          P
16    (          S
17    )          S
18    ..         ..
..    ..         ..
Literal Table
SNO   LITERAL
1     10

Identifier Table
SNO   IDENTIFIER
1     XYZ
2     A
3     B

Uniform Symbol (US) Table
SNO   US    INDEX
1     TER   3
2     ID    1
3     TER   15
4     TER   16
5     ID    2
6     TER   11
7     ID    3
8     TER   12
9     LIT   1
10    TER   17
Note:
1. Each token is entered only once in its corresponding table, but it may appear more than once in the UST.
2. There is no need to enter BLANKS in the UST.
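The construction of these tables can be sketched in a few lines of Python. This is only an illustration, not the method of any particular compiler: the function name `lexical_phase`, the regular expression, and the partial `TERMINALS` dictionary are assumptions made for the sketch, with terminal indices copied from the example above.

```python
import re

# Partial terminal table (assumed for this sketch); indices follow the example.
TERMINALS = {"PERFORM": 1, "GOTO": 2, "COMPUTE": 3,
             "+": 11, "-": 12, "*": 13, "/": 14, "=": 15, "(": 16, ")": 17}

# Names, unsigned integer literals, and single-character operators.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*|\d+|[-+*/=()]")

def lexical_phase(source):
    """Return (uniform symbol table, identifier table, literal table)."""
    identifiers, literals, ust = [], [], []
    # findall skips blanks, so they never reach the UST.
    for token in TOKEN_RE.findall(source):
        if token in TERMINALS:
            ust.append(("TER", TERMINALS[token]))
        elif token.isdigit():
            if token not in literals:          # enter each literal only once
                literals.append(token)
            ust.append(("LIT", literals.index(token) + 1))
        else:
            if token not in identifiers:       # enter each identifier only once
                identifiers.append(token)
            ust.append(("ID", identifiers.index(token) + 1))
    return ust, identifiers, literals

ust, ids, lits = lexical_phase("COMPUTE XYZ = (A + B - 10)")
for sno, (table, index) in enumerate(ust, start=1):
    print(sno, table, index)
```

Run on the COMPUTE statement, the sketch yields the same ten uniform symbols as the US table above, with XYZ, A, B in the identifier table and 10 in the literal table.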
(b) Matrix form
Here, we represent any statement in the matrix form.
Ex: A = B + C - D
LNO   Op. Code   Opr1   Opr2
1     +          B      C
2     -          (1)    D
3     =          A      (2)
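One way to produce such a matrix from an assignment statement is a small precedence-based parser. The sketch below is an assumption-laden illustration: the function name `to_matrix`, the `PRECEDENCE` table, and the restriction to +, -, *, / and parentheses are choices made here, not taken from the text. It may also emit rows in a different order than a hand-built table when several parenthesised groups appear.

```python
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}   # assumed operator priorities

def to_matrix(statement):
    """Translate 'target = expression' into (op, opr1, opr2) rows."""
    tokens = re.findall(r"[A-Za-z]\w*|\d+|[-+*/=()]", statement)
    target, equals, *expr = tokens
    assert equals == "=", "expected an assignment"
    rows, pos = [], 0

    def emit(op, left, right):
        rows.append((op, left, right))
        return f"({len(rows)})"          # reference to the row just created

    def primary():
        nonlocal pos
        token = expr[pos]
        pos += 1
        if token == "(":
            value = expression(1)
            pos += 1                      # skip the closing ")"
            return value
        return token

    def expression(min_prec):
        nonlocal pos
        left = primary()
        while pos < len(expr) and PRECEDENCE.get(expr[pos], 0) >= min_prec:
            op = expr[pos]
            pos += 1
            right = expression(PRECEDENCE[op] + 1)   # left-associative
            left = emit(op, left, right)
        return left

    emit("=", target, expression(1))
    return rows

for lno, row in enumerate(to_matrix("A = B + C - D"), start=1):
    print(lno, *row)
```

For A = B + C - D the rows come out as + B C, then - (1) D, then = A (2), matching the table above.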
Ex: X = (B + C) D + (P/Q)
LNO   Op. Code   Opr1   Opr2
1     +          B      C
2     /          P      Q
3     ?          (1)    D
4     +          (3)    (2)
5     =          X      (4)
1. Common sub-expression elimination
Let us consider the statement:
X = (C + D) + Q ^ P (D + C - 10)
Matrix table (only the Op. Code column is preserved)
First table:    +  +  ^  +  =
Second table:   +  ^  +  =
Steps to optimize the table:
(i) Arrange all the operands in ascending order for commutative Op. Codes.
(ii) Identify rows having common sub-expressions.
(iii) Eliminate all such rows except one.
(iv) Modify the matrix form accordingly.
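The four steps can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `eliminate_common_subexpressions`, the row format `(op, opr1, opr2)` with `(n)` as a row reference, and the input statement X = (C + D) * (D + C) are all hypothetical and chosen only for the example. The sketch also assumes that no variable changes value between the duplicate rows.

```python
COMMUTATIVE = {"+", "*"}

def eliminate_common_subexpressions(rows):
    """rows: list of (op, opr1, opr2); '(n)' refers to 1-based row n."""
    seen = {}        # normalised row -> reference to the row that is kept
    renumber = {}    # old row reference -> new row reference
    result = []
    for number, (op, a, b) in enumerate(rows, start=1):
        # Step (iv): follow references to rows that moved or were removed.
        a, b = renumber.get(a, a), renumber.get(b, b)
        # Step (i): put operands of commutative op codes in ascending order.
        if op in COMMUTATIVE:
            a, b = sorted((a, b))
        key = (op, a, b)
        # Steps (ii) and (iii): a repeated row is dropped, keeping the first.
        if op != "=" and key in seen:
            renumber[f"({number})"] = seen[key]
            continue
        result.append(key)
        reference = f"({len(result)})"
        seen[key] = reference
        renumber[f"({number})"] = reference
    return result

# Hypothetical input: X = (C + D) * (D + C)
matrix = [("+", "C", "D"), ("+", "D", "C"), ("*", "(1)", "(2)"), ("=", "X", "(3)")]
for lno, row in enumerate(eliminate_common_subexpressions(matrix), start=1):
    print(lno, *row)
```

Sorting the operands is what lets D + C be recognised as the same sub-expression as C + D; without step (i) the two rows would not compare equal.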
2. Compile time compute technique
The compiler is able to carry out simple arithmetic itself.
Ex: X = 2 * 3 / A
Matrix table
LNO   Op. Code   Opr1   Opr2
1     *          2      3
2     /          (1)    A
3     =          X      (2)

After computing 2 * 3 at compile time:
LNO   Op. Code   Opr1   Opr2
1     /          6      A
2     =          X      (1)
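The reduction from the first table to the second can be sketched as below. The function name `fold_constants` and the row format are assumptions for illustration. Only +, - and * on unsigned integer literals are folded here; division is left alone because its compile-time result depends on the source language's arithmetic rules.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(rows):
    """rows: list of (op, opr1, opr2); '(n)' refers to 1-based row n."""
    values = {}      # old row reference -> folded constant or new reference
    result = []
    for number, (op, a, b) in enumerate(rows, start=1):
        # Substitute results of earlier rows wherever they are referenced.
        a, b = values.get(a, a), values.get(b, b)
        if op in FOLDABLE and a.isdigit() and b.isdigit():
            # Both operands are numbers: compute now and drop the row.
            values[f"({number})"] = str(FOLDABLE[op](int(a), int(b)))
            continue
        result.append((op, a, b))
        values[f"({number})"] = f"({len(result)})"
    return result

# X = 2 * 3 / A
matrix = [("*", "2", "3"), ("/", "(1)", "A"), ("=", "X", "(2)")]
for lno, row in enumerate(fold_constants(matrix), start=1):
    print(lno, *row)
```

For X = 2 * 3 / A this leaves the two rows / 6 A and = X (1).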
The aim is to do such arithmetic once, at compile time, rather than every time the program runs. For that:
If both operands are scalars, compute the value and substitute the result at the necessary places.
Eliminate the lines in which both operands are numbers (or scalars).
3. Moving invariant computation outside the loop
We have to find whether each variable is variant, invariant, or partially variant within the loop. A computation whose operands are all invariant can be moved outside the loop.
4. Boolean expression optimization
Ex: IF <COND> THEN <SL1> ELSE <SL2>
We can apply the shortcut methods of Boolean expressions to simplify conditional statements in any high-level language program. In this way we can optimize the intermediate code and save both time and space. Where possible, it is better to write a condition using only ORs or only ANDs.
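The shortcut idea for a condition built only from ORs, or only from ANDs, can be illustrated with a short sketch. The helper names `evaluate_or` and `evaluate_and` are hypothetical; each returns the truth value together with the number of operands that actually had to be tested.

```python
def evaluate_or(conditions):
    """Test callables left to right; stop at the first one that is true."""
    tested = 0
    for condition in conditions:
        tested += 1
        if condition():
            return True, tested
    return False, tested

def evaluate_and(conditions):
    """Test callables left to right; stop at the first one that is false."""
    tested = 0
    for condition in conditions:
        tested += 1
        if not condition():
            return False, tested
    return True, tested

a, b = 5, 0
# IF a > 0 OR b > 0: the first operand decides, the second is never tested.
print(evaluate_or([lambda: a > 0, lambda: b > 0]))        # (True, 1)
# IF b != 0 AND a / b > 1: the division is never attempted.
print(evaluate_and([lambda: b != 0, lambda: a / b > 1]))  # (False, 1)
```

The saving in time comes from the operands that are skipped; the saving in space comes from not needing a temporary for the full Boolean result.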
To get the ALC from ICO, we first read the database.
ICO database
LNO   Op. Code   Opr1   Opr2
1     +          A      B
2     ?          C      D
..    ..         ..     ..

For a row with Op. Code +, the steps are:
Read the operands.
Execute the ADD routine.
Store the result somewhere else.
The routines developed in assembly language are maintained in a separate database.
Assembly language database (ALDB)
Op. Code   Routine
+          L  1, Op1
           A  1, Op2
           ST 1, MX
-          L  1, Op1
           S  1, Op2
           ST 1, MX
*          L  1, Op1
           M  1, Op2
           ST 1, MX
/          L  1, Op1
           D  1, Op2
           ST 1, MX
=          L  1, Op2
           ST 1, Op1
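Looking up a routine in the ALDB and filling in its operands can be sketched as template expansion. The dictionary layout, the function name `generate`, and the two-row input are assumptions made for this illustration; the instruction templates themselves are the ones listed above, with MX replaced by the name of the row's temporary.

```python
# One template list per op code, taken from the ALDB above.
ALDB = {
    "+": ["L 1, {op1}", "A 1, {op2}", "ST 1, {mx}"],
    "-": ["L 1, {op1}", "S 1, {op2}", "ST 1, {mx}"],
    "*": ["L 1, {op1}", "M 1, {op2}", "ST 1, {mx}"],
    "/": ["L 1, {op1}", "D 1, {op2}", "ST 1, {mx}"],
    "=": ["L 1, {op2}", "ST 1, {op1}"],
}

def generate(rows):
    """rows: list of (temporary name, op code, opr1, opr2)."""
    code = []
    for name, op, op1, op2 in rows:
        for template in ALDB[op]:
            # Templates that do not mention {mx} simply ignore it.
            code.append(template.format(op1=op1, op2=op2, mx=name))
    return code

# Hypothetical input for X = A + B
rows = [("M1", "+", "A", "B"), ("M2", "=", "X", "M1")]
for line in generate(rows):
    print(line)
```

Because every routine is expanded independently, the output contains a store to M1 immediately followed by a load from M1, which is the kind of redundancy a later optimization pass removes.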
SNO   Op. Code   Opr1   Opr2
M1    +          A      B
M2    ?          M1     C
M3    =          X      M2
The other and better way is discussed through an example.
Intermediate code
SNO   Op. Code   Opr1   Opr2
M1    *          A      B
M2    *          C      D
M3    +          M1     M2
M4    =          X      M3
Table-1: ALC for the above Intermediate Code
    L  1, A
    M  1, B
    ST 1, M1
    L  1, C
    M  1, D
    ST 1, M2
    L  1, M1
    A  1, M2
    ST 1, M3
    L  1, M3
    ST 1, X
The steps for optimization are:
(i) Eliminate consecutive store and load instructions if they deal with the same operand. In Table-2, the product C * D is also left in register 1 and M1 is added to it directly; this is valid because addition is commutative, and it removes the store to M2 and the reload of M1.
Table-2: Reduced ALC from Table-1
    L  1, A
    M  1, B
    ST 1, M1
    L  1, C
    M  1, D
    A  1, M1
    ST 1, X
(ii) Try to use RR type (register-to-register) instructions in place of RM type (register-to-memory) instructions. RR type instructions are much faster than RM type instructions.
    L  1, A
    M  1, B
    ST 1, 3
    L  1, C
    M  1, D
    A  1, 3
    ST 1, X
Note: Here, 3 is another register, which is used in place of M1.
(iii) If possible, use one or more additional registers as general purpose registers. By doing so we can eliminate more instructions.
    L  1, A
    M  1, B
    L  2, C
    M  2, D
    A  1, 2
    ST 1, X
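Step (i) above, removing a store that is immediately followed by a load of the same operand into the same register, can be sketched as a single pass over the instruction list. The function name `remove_store_load_pairs` and the tuple format `(mnemonic, register, operand)` are assumptions for illustration. The sketch also assumes that the stored temporary is not read again later; it does not perform the further reordering that takes Table-1 all the way to Table-2.

```python
def remove_store_load_pairs(code):
    """Drop each 'ST r, X' that is immediately followed by 'L r, X'."""
    result = []
    i = 0
    while i < len(code):
        current = code[i]
        following = code[i + 1] if i + 1 < len(code) else None
        if (following is not None
                and current[0] == "ST" and following[0] == "L"
                and current[1:] == following[1:]):
            i += 2                      # skip both instructions of the pair
            continue
        result.append(current)
        i += 1
    return result

# Table-1 from the text, one (mnemonic, register, operand) tuple per line.
table_1 = [("L", "1", "A"), ("M", "1", "B"), ("ST", "1", "M1"),
           ("L", "1", "C"), ("M", "1", "D"), ("ST", "1", "M2"),
           ("L", "1", "M1"), ("A", "1", "M2"), ("ST", "1", "M3"),
           ("L", "1", "M3"), ("ST", "1", "X")]

for mnemonic, register, operand in remove_store_load_pairs(table_1):
    print(f"{mnemonic} {register}, {operand}")
```

On Table-1 this removes only the ST 1, M3 / L 1, M3 pair, leaving nine instructions.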