
LOVELY PROFESSIONAL UNIVERSITY

Course code: CAP 607
Course name: System Software
Course instructor: Mam Nitu Sharma
Date of submission: 27/02/2012
Student's roll no: RD1801A49
Section no: D1801

Declaration:
I declare that this synopsis is my individual work, as mentioned below. I have not copied from any other student's work or from any other source except where due acknowledgement is made explicitly in the text, nor has any part been typed for me by another person.

Signature: Harjinder Singh

Project Synopsis

Name(s) of Student(s): - Harjinder Singh

Registration Number(s):- 10802114

Project Undertaken: - Syntax directed compiler

What is a compiler?

A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code). The most common reason for wanting to transform source code is to create an executable program.

The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower-level language (e.g., assembly language or machine code). If the compiled program can run on a computer whose CPU or operating system is different from the one on which the compiler runs, the compiler is known as a cross-compiler. A program that translates from a low-level language to a higher-level one is a decompiler. A program that translates between high-level languages is usually called a language translator, source-to-source translator, or language converter. A language rewriter is usually a program that translates the form of expressions without a change of language.

Syntax directed compiler

A syntax directed compiler is a general-purpose compiler that can serve a family of languages: the syntactic rules for language analysis are supplied as data, typically in tabular form, rather than being built into a specific parsing algorithm for a particular language. It is also known as a syntax-oriented compiler.

Overview

Syntax-directed translation fundamentally works by adding actions to the productions in a context-free grammar. Actions are steps or procedures that will be carried out when that production is used in a derivation. A grammar specification embedded with actions to be performed is called a syntax-directed translation scheme[2] (sometimes simply called a "translation scheme").

Each symbol in the grammar can have an attribute, which is a value that is to be associated with the symbol. Common attributes include a variable type, the value of an expression, etc. Given a symbol X with an attribute t, that attribute is referred to as X.t. Thus, given actions and attributes, the grammar can be used for translating strings from its language by applying the actions and carrying information through each symbol's attribute.
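As an illustration of actions attached to productions, here is a minimal sketch of a translation scheme that turns single-digit infix sums and differences into postfix. The grammar (expr -> term rest; rest -> + term rest | - term rest | empty; term -> digit) and the function name to_postfix are assumptions chosen for this example; they do not come from the project itself.

```python
def to_postfix(text):
    """Translate e.g. "9-5+2" to postfix by running actions during parsing."""
    out = []   # output produced by the actions
    pos = 0    # index of the next unread character

    def term():
        # term -> digit { emit digit }
        nonlocal pos
        if pos < len(text) and text[pos].isdigit():
            out.append(text[pos])      # the action attached to this production
            pos += 1
        else:
            raise SyntaxError(f"digit expected at position {pos}")

    def rest():
        # rest -> op term { emit op } rest | empty, written as a loop
        nonlocal pos
        while pos < len(text) and text[pos] in "+-":
            op = text[pos]
            pos += 1
            term()
            out.append(op)             # the action runs after the operand is parsed

    term()
    rest()
    if pos != len(text):
        raise SyntaxError(f"unexpected character at position {pos}")
    return "".join(out)


print(to_postfix("9-5+2"))  # 95-2+
```

Each action fires exactly when its production is used, so the output is produced as a side effect of parsing and no explicit parse tree is built.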

Introduction: Syntax Directed Translation

There are two notations for associating semantic rules with productions: syntax directed definitions and translation schemes. Conceptually, with both syntax directed definitions and translation schemes, we parse the input token stream, build the parse tree, and then traverse the tree as needed to evaluate the semantic rules at the parse tree nodes. Evaluation of the semantic rules may generate code, save information in a symbol table, issue error messages, or perform any other activities. The translation of the token stream is the result obtained by evaluating the semantic rules.

SYNTAX DIRECTED DEFINITIONS

A syntax directed definition is a generalization of a context-free grammar in which each grammar symbol has an associated set of attributes, partitioned into two subsets called the synthesized and inherited attributes of that grammar symbol. An attribute can represent anything we choose: a string, a number, a type, a memory location, or whatever. The value of an attribute at a parse tree node is defined by a semantic rule associated with the production used at that node. The value of a synthesized attribute at a node is computed from the values of attributes at the children of that node in the parse tree; the value of an inherited attribute is computed from the values of attributes at the siblings and parent of that node.

Semantic rules set up dependencies between attributes that will be represented by a graph. From the dependency graph, we derive an evaluation order for the semantic rules. Evaluation of the semantic rules defines the values of the attributes at the nodes in the parse tree for the input string. A parse tree showing the values of attributes at each node is called an annotated parse tree. The process of computing the attributes at the nodes is called annotating or decorating the parse tree.
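Deriving an evaluation order from a dependency graph amounts to a topological sort. The sketch below assumes Python 3.9 or later (for the standard graphlib module) and uses hypothetical attribute instance names for the parse tree of 3*5; it is one possible way to organise the evaluation, not a prescribed one.

```python
from graphlib import TopologicalSorter

# Each attribute instance maps to (attributes it depends on, semantic rule).
# The names (digit1, F1, ...) are hypothetical labels for parse tree nodes.
rules = {
    "digit1.lexval": ((), lambda: 3),
    "digit2.lexval": ((), lambda: 5),
    "F1.val": (("digit1.lexval",), lambda v: v),
    "T1.val": (("F1.val",), lambda v: v),
    "F2.val": (("digit2.lexval",), lambda v: v),
    "T.val": (("T1.val", "F2.val"), lambda a, b: a * b),
}

# Dependency graph: attribute -> set of attributes that must be known first.
graph = {attr: set(deps) for attr, (deps, _) in rules.items()}

# Any topological order of the graph is a valid evaluation order.
order = list(TopologicalSorter(graph).static_order())

values = {}
for attr in order:
    deps, rule = rules[attr]
    values[attr] = rule(*(values[d] for d in deps))

print(values["T.val"])  # 15
```

If the dependency graph contained a cycle, TopologicalSorter would raise graphlib.CycleError, which corresponds to a definition with no valid evaluation order.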
Form of a Syntax Directed Definition

In a syntax directed definition, each grammar production A -> α has associated with it a set of semantic rules of the form b := f(c1, c2, .., ck), where f is a function, and either

1. b is a synthesized attribute of A and c1, c2, .., ck are attributes belonging to the grammar symbols of the production, or
2. b is an inherited attribute of one of the grammar symbols on the right side of the production, and c1, c2, .., ck are attributes belonging to the grammar symbols of the production.

In either case, we say that the attribute b depends on attributes c1, c2, .., ck. An attribute grammar is a syntax directed definition in which the functions in semantic rules cannot have side effects.

Synthesized Attributes

A syntax directed definition that uses synthesized attributes exclusively is said to be an S-attributed definition. A parse tree for an S-attributed definition can always be annotated by evaluating the semantic rules for the attributes at each node bottom up, from the leaves to the root.

EXAMPLE: The figure contains an annotated parse tree for the input 3*5+4n. The output, printed at the root of the tree, is the value of E.val at the first child of the root. To see how attribute values are computed, consider the leftmost, bottommost interior node, which corresponds to the use of the production F -> digit. The corresponding semantic rule, F.val := digit.lexval, defines the attribute F.val at that node to have the value 3 because the value of digit.lexval at the child of this node is 3. Similarly, at the parent of this F node, the attribute T.val has the value 3. Now consider the node for the production T -> T*F. The value of the attribute T.val at this node is defined by

PRODUCTION       SEMANTIC RULE
T -> T1 * F      T.val := T1.val * F.val

When we apply the semantic rule at this node, T1.val has the value 3 from the left child and F.val the value 5 from the right child. Thus, T.val acquires the value 15 at this node. The rule associated with the production for the starting nonterminal, L -> E n, prints the value of the expression generated by E.

Inherited Attributes

An inherited attribute is one whose value at a node in a parse tree is defined in terms of attributes at the parent and/or siblings of that node. Inherited attributes are convenient for expressing the dependence of a programming language construct on the context in which it appears. For example, we can use an inherited attribute to keep track of whether an identifier appears on the left or the right side of an assignment in order to decide whether the address or the value of the identifier is needed. Although it is always possible to rewrite a syntax directed definition to use only synthesized attributes, it is often more natural to use syntax directed definitions with inherited attributes.

Syntax-Directed Definition Example

Production       Semantic Rules
L -> E return    print(E.val)
E -> E1 + T      E.val = E1.val + T.val
E -> T           E.val = T.val
T -> T1 * F      T.val = T1.val * F.val
T -> F           T.val = F.val
F -> ( E )       F.val = E.val
F -> digit       F.val = digit.lexval
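The semantic rules of this example can be evaluated bottom up on a parse tree, from the leaves to the root. The following is a minimal sketch for the input 3*5+4; the Node class, the RULES table, and the hand-built tree are assumptions made for illustration (a real compiler would obtain the tree from its parser).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    prod: str                                  # production used at this node
    kids: list = field(default_factory=list)   # children that carry attributes
    lexval: int = 0                            # set by the lexer for digit leaves
    val: int = 0                               # synthesized attribute, filled in later


# One semantic rule per production; k holds the val attributes of the children.
RULES = {
    "L -> E return": lambda k: k[0],
    "E -> E + T": lambda k: k[0] + k[1],
    "E -> T": lambda k: k[0],
    "T -> T * F": lambda k: k[0] * k[1],
    "T -> F": lambda k: k[0],
    "F -> ( E )": lambda k: k[0],
    "F -> digit": lambda k: k[0],
}


def annotate(node):
    """Post-order walk: children are evaluated before their parent."""
    if node.prod == "digit":
        node.val = node.lexval
    else:
        node.val = RULES[node.prod]([annotate(kid) for kid in node.kids])
    return node.val


def F(n):
    # Helper that builds the subtree for F -> digit
    return Node("F -> digit", [Node("digit", lexval=n)])


# Parse tree for 3*5+4, built by hand
t = Node("T -> T * F", [Node("T -> F", [F(3)]), F(5)])
e = Node("E -> E + T", [Node("E -> T", [t]), Node("T -> F", [F(4)])])
root = Node("L -> E return", [e])

print(annotate(root))  # 19
```

After the walk, every node carries its val attribute, so the tree is an annotated parse tree in the sense described earlier; for instance, t.val is 15.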

Symbols E, T, and F are associated with a synthesized attribute val. The token digit has a synthesized attribute lexval (it is assumed that it is evaluated by the lexical analyzer).

Syntax-Directed Definition Example 2

Production       Semantic Rules
E -> E1 + T      E.loc = newtemp(), E.code = E1.code || T.code || add E1.loc,T.loc,E.loc
E -> T           E.loc = T.loc, E.code = T.code
T -> T1 * F      T.loc = newtemp(), T.code = T1.code || F.code || mult T1.loc,F.loc,T.loc
T -> F           T.loc = F.loc, T.code = F.code
F -> ( E )       F.loc = E.loc, F.code = E.code
F -> id          F.loc = id.name, F.code = "" (the empty string)

Symbols E, T, and F are associated with synthesized attributes loc and code.

The token id has a synthesized attribute name (it is assumed that it is evaluated by the lexical analyzer). It is assumed that || is the string concatenation operator.
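The loc and code attributes of this second example can be computed while parsing. The sketch below is one hedged way to do it: it uses a hand-written recursive-descent parser in which the left-recursive productions are replaced by loops (an assumption of this sketch, not part of the definition above), code is kept as a list of instruction strings so that || becomes list concatenation, and newtemp is assumed to produce names t1, t2, and so on.

```python
import itertools
import re


def translate(source):
    """Return (loc, code) for an expression over identifiers, +, * and ()."""
    # Identifiers and the four punctuation tokens; other characters are skipped.
    tokens = re.findall(r"[A-Za-z_]\w*|[+*()]", source)
    pos = 0
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_E():
        loc, code = parse_T()                      # E -> T
        while peek() == "+":                       # E -> E1 + T
            advance()
            t_loc, t_code = parse_T()
            new = newtemp()
            code = code + t_code + [f"add {loc},{t_loc},{new}"]
            loc = new
        return loc, code

    def parse_T():
        loc, code = parse_F()                      # T -> F
        while peek() == "*":                       # T -> T1 * F
            advance()
            f_loc, f_code = parse_F()
            new = newtemp()
            code = code + f_code + [f"mult {loc},{f_loc},{new}"]
            loc = new
        return loc, code

    def parse_F():
        tok = peek()
        if tok == "(":                             # F -> ( E )
            advance()
            loc, code = parse_E()
            if peek() != ")":
                raise SyntaxError("')' expected")
            advance()
            return loc, code
        if tok is not None and (tok[0].isalpha() or tok[0] == "_"):
            return advance(), []                   # F -> id: loc is the name, code is empty
        raise SyntaxError("identifier or '(' expected")

    loc, code = parse_E()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return loc, code


loc, code = translate("a+b*c")
print(loc)    # t2
print(code)   # ['mult b,c,t1', 'add a,t1,t2']
```

The multiplication is emitted first because T.code for b*c is complete before the enclosing sum is reduced, which mirrors the order E1.code || T.code || add in the semantic rule.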

Syntax-Directed Definition: Inherited Attributes

Production       Semantic Rules
D -> T L         L.in = T.type
T -> int         T.type = integer
T -> real        T.type = real
L -> L1 id       L1.in = L.in, addtype(id.entry, L.in)
L -> id          addtype(id.entry, L.in)
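The inherited attribute in of this definition is passed down the tree, which maps naturally onto a function parameter. A minimal sketch, assuming the parse tree is given as nested tuples and the symbol table is a plain dictionary (both are illustrative choices, as is the function name declare):

```python
def declare(d_node):
    """Evaluate D -> T L on a tuple-encoded parse tree and return the symbol table."""
    symtab = {}

    def addtype(entry, typ):
        symtab[entry] = typ

    def eval_T(keyword):
        # T -> int | real: T.type is synthesized from the keyword
        return {"int": "integer", "real": "real"}[keyword]

    def eval_L(l_node, inherited):
        # l_node is ("L", child L node or None, identifier name)
        _, child, name = l_node
        if child is not None:
            eval_L(child, inherited)    # L1.in = L.in
        addtype(name, inherited)        # addtype(id.entry, L.in)

    _, keyword, l_node = d_node
    eval_L(l_node, eval_T(keyword))     # L.in = T.type
    return symtab


# Parse tree for the declaration "real a b c"
tree = ("D", "real", ("L", ("L", ("L", None, "a"), "b"), "c"))
print(declare(tree))  # {'a': 'real', 'b': 'real', 'c': 'real'}
```

The type flows sideways from T to L and then downwards through the chain of L nodes, which is exactly the parent-and-sibling dependency that distinguishes inherited from synthesized attributes.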

Symbol T is associated with a synthesized attribute type. Symbol L is associated with an inherited attribute in.

Syntax Trees

Syntax trees decouple translation from parsing. A syntax tree is an intermediate representation of the compiler's input. The example procedures mknode and mkleaf build the tree, and the synthesized attribute nptr (a pointer) carries the result.

PRODUCTION       SEMANTIC RULE
E -> E1 + T      E.nptr = mknode(+, E1.nptr, T.nptr)
E -> E1 - T      E.nptr = mknode(-, E1.nptr, T.nptr)
E -> T           E.nptr = T.nptr
T -> ( E )       T.nptr = E.nptr
T -> id          T.nptr = mkleaf(id, id.lexval)
T -> num         T.nptr = mkleaf(num, num.val)
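To make the role of mknode and mkleaf concrete, the sketch below builds the syntax tree for the expression a - 4 + c with the sequence of calls a bottom-up parser would make. The Leaf and Interior classes, the show helper, and the sample expression are assumptions introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    kind: str        # "id" or "num"
    value: object    # identifier name or numeric value


@dataclass
class Interior:
    op: str          # operator label, "+" or "-"
    left: object
    right: object


def mkleaf(kind, value):
    return Leaf(kind, value)


def mknode(op, left, right):
    return Interior(op, left, right)


# Calls in the order the reductions happen for a - 4 + c
p1 = mkleaf("id", "a")        # T -> id
p2 = mkleaf("num", 4)         # T -> num
p3 = mknode("-", p1, p2)      # E -> E1 - T
p4 = mkleaf("id", "c")        # T -> id
p5 = mknode("+", p3, p4)      # E -> E1 + T


def show(n):
    """Render the tree with explicit parentheses to expose its shape."""
    if isinstance(n, Leaf):
        return str(n.value)
    return f"({show(n.left)} {n.op} {show(n.right)})"


print(show(p5))  # ((a - 4) + c)
```

The productions E -> T and T -> ( E ) only copy nptr, so they need no call of their own; the parentheses in the input leave no node in the tree.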

S-Attributed Definitions

Syntax-directed definitions are used to specify syntax-directed translations. Creating a translator for an arbitrary syntax-directed definition can be difficult. We would like to evaluate the semantic rules during parsing (i.e. in a single pass, we parse and also evaluate the semantic rules). We will look at two sub-classes of syntax-directed definitions:

S-Attributed Definitions: only synthesized attributes are used in the syntax-directed definitions.
L-Attributed Definitions: in addition to synthesized attributes, we may also use inherited attributes in a restricted fashion.

For both S-Attributed Definitions and L-Attributed Definitions, the semantic rules can be evaluated in a single pass during parsing. Implementations of S-Attributed Definitions are a little easier than implementations of L-Attributed Definitions.

Bottom-Up Evaluation of S-Attributed Definitions

We put the values of the synthesized attributes of the grammar symbols into a parallel stack. When an entry of the parser stack holds a grammar symbol X (terminal or nonterminal), the corresponding entry in the parallel stack holds the synthesized attribute(s) of the symbol X. We evaluate the values of the attributes during reductions. For example, for

A -> XYZ         A.a = f(X.x, Y.y, Z.z)

where all attributes are synthesized, the reduction computes A.a from the attribute entries of X, Y, and Z.

[Figure: the parser stack and the parallel stack, with the top of the stack marked; after the reduction the top entry of the parallel stack holds A.a.]

Bottom-Up Evaluation Example

At each shift of digit, we also push digit.lexval into val-stack. In the val-stack column, - marks an empty slot.

stack            val-stack  input    action    semantic rule
0                           5+3*4r   s6        d.lexval(5) into val-stack
0d6              5          +3*4r    F->d      F.val=d.lexval, do nothing
0F4              5          +3*4r    T->F      T.val=F.val, do nothing
0T3              5          +3*4r    E->T      E.val=T.val, do nothing
0E2              5          +3*4r    s8        push empty slot into val-stack
0E2+8            5-         3*4r     s6        d.lexval(3) into val-stack
0E2+8d6          5-3        *4r      F->d      F.val=d.lexval, do nothing
0E2+8F4          5-3        *4r      T->F      T.val=F.val, do nothing
0E2+8T11         5-3        *4r      s9        push empty slot into val-stack
0E2+8T11*9       5-3-       4r       s6        d.lexval(4) into val-stack
0E2+8T11*9d6     5-3-4      r        F->d      F.val=d.lexval, do nothing
0E2+8T11*9F12    5-3-4      r        T->T*F    T.val=T1.val*F.val
0E2+8T11         5-12       r        E->E+T    E.val=E1.val+T.val
0E2              17         r        s7        push empty slot into val-stack
0E2r7            17-        $        L->Er     print(17), pop empty slot from val-stack
0L1              17         $        acc
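The val-stack manipulation in this trace can be sketched on its own. The code below does not implement an LR parser or its tables; it only replays the shift and reduce actions listed above for the input 5+3*4r and shows what happens to the value stack. The helper name reduce_vals and the use of None for an empty slot are assumptions of this sketch.

```python
EMPTY = None      # placeholder slot for symbols without an attribute (+, *, r)
val_stack = []    # parallel stack of synthesized attribute values
printed = []      # what the rule for L -> E r would print


def reduce_vals(rhs_len, rule):
    """Replace the slots of the right-hand-side symbols by the rule's result."""
    args = val_stack[-rhs_len:]
    del val_stack[-rhs_len:]
    val_stack.append(rule(*args))


def print_rule(e_val, _slot):
    # L -> E r: print E.val; the slot pushed for r is dropped by the reduction
    printed.append(e_val)
    return e_val


val_stack.append(5)                               # shift d, push d.lexval = 5
reduce_vals(1, lambda d: d)                       # F -> d
reduce_vals(1, lambda f: f)                       # T -> F
reduce_vals(1, lambda t: t)                       # E -> T
val_stack.append(EMPTY)                           # shift +
val_stack.append(3)                               # shift d, push d.lexval = 3
reduce_vals(1, lambda d: d)                       # F -> d
reduce_vals(1, lambda f: f)                       # T -> F
val_stack.append(EMPTY)                           # shift *
val_stack.append(4)                               # shift d, push d.lexval = 4
reduce_vals(1, lambda d: d)                       # F -> d
reduce_vals(3, lambda t1, _op, f: t1 * f)         # T -> T * F
reduce_vals(3, lambda e1, _op, t: e1 + t)         # E -> E + T
val_stack.append(EMPTY)                           # shift r
reduce_vals(2, print_rule)                        # L -> E r

print(printed)     # [17]
print(val_stack)   # [17]
```

The single-symbol reductions leave the stack unchanged, which is why the trace labels them "do nothing"; only the reductions with three symbols on the right side compute a new value.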

Conclusion

Automated compiler generation is by no means a new concept. Much research in the area has taken place, but we have yet to see any significant products. There is a constant trade-off between generality and efficiency; while the former leads to inefficient theorem-prover-style solutions, the latter leads to a dressed-up language with the emphasis still on the user. In order to get anywhere useful, compromises have to be made. It is simply not feasible to be general enough to cope with the most advanced type systems, nor to be as efficient as a modern, optimised compiler.

In my work on SemCom, I wanted to see how far complete automation is possible. To this end, it has been a great success. By imposing a few simple constraints, such as syntax-directed typing, it is possible to automatically generate a lexer and parser for the target language, in addition to an interpreter. Furthermore, generality is not greatly sacrificed, since SemCom can deal with polymorphic type systems, and both big- and small-step semantics, including non-determinism. Testing semantics of a subset of ML, and of Milner's CCS, I investigated various effects; for example call-by-value versus call-by-name, and the consequences of non-value-restricted let-polymorphism.

Automated compiler generation is hard, but the majority of languages are relatively simple. In fact, many are custom scripting languages, geared towards a specific application, and are often interpreted. This sort of domain could benefit a great deal from automation, and I feel that SemCom illustrates the feasibility of this. As a solution, it is not complete, but it serves to show that such tools are not as far away as one might think.
