
Formal Language and Automata Theory (CS21004)

Course Coverage
Class: CSE 2nd Year

4th January, 2010 (2 hours): Tutorial-1 + Alphabet Σ, strings over Σ: Σ^0 = {ε}, Σ^n = {x : x = a_1 a_2 … a_n, a_i ∈ Σ, 1 ≤ i ≤ n}, Σ* = ∪_{n∈ℕ} Σ^n, Σ+ = Σ* \ {ε}; (Σ*, concatenation, ε) is a monoid. A language L over the alphabet Σ is a subset of Σ*, L ⊆ Σ*. The size of Σ* is countably infinite, so the collection of all languages over Σ, 2^{Σ*}, is uncountably infinite. So not every language can have a finite description.

5th January, 2010 (1 hour): No set is equinumerous to its power-set (Cantor). The proof is by reductio ad absurdum (reduction to a contradiction). A one-to-one map from A to 2^A is easy to get: a ↦ {a}. Let A be a set and suppose there is an onto map (surjection) f from A to 2^A. We consider the set B = {x ∈ A : x ∉ f(x)} ∈ 2^A. As f : A → 2^A is a surjection, there is an element a_0 ∈ A so that f(a_0) = ∅, but then a_0 ∉ f(a_0) = ∅. So a_0 ∈ B, and B is a non-empty subset of A. So there is an element a_1 ∈ A such that f(a_1) = B. Does a_1 ∈ f(a_1) = B? This leads to a contradiction: if a_1 ∈ f(a_1) = B, then a_1 ∉ B; if a_1 ∉ f(a_1) = B, then a_1 ∈ B; i.e. a_1 ∈ f(a_1) = B if and only if a_1 ∉ f(a_1) = B. So no surjection is possible and the set 2^A is more numerous than A.
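The diagonal argument above can be checked exhaustively on a small finite set. The following Python sketch (the names `powerset` and `diagonal_set` are illustrative, not from the notes) tries every map f : A → 2^A for a three-element A and confirms that the set B = {x : x ∉ f(x)} is always missed by f:

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

def diagonal_set(A, f):
    """Cantor's B = {x in A : x not in f(x)} for a map f : A -> 2^A."""
    return frozenset(x for x in A if x not in f(x))

A = {0, 1, 2}
subsets = powerset(A)           # 8 subsets, so 8^3 = 512 candidate maps
for f0 in subsets:
    for f1 in subsets:
        for f2 in subsets:
            f = {0: f0, 1: f1, 2: f2}.__getitem__
            B = diagonal_set(A, f)
            assert B not in (f0, f1, f2)   # B witnesses that f is not onto
print("no map from A to its power set is onto")
```

The assertion inside the loop is exactly the a_1 argument: if B were in the image, say B = f(a), then a ∈ B if and only if a ∉ B.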

Decision problems from different areas of computing can be mapped to decision problems in formal language. REACHABLE = {⟨G, s, d⟩ : G is a directed graph and the destination node d is reachable from the source node s}.

Let Σ be an alphabet. 2^{Σ*} is the collection of languages over Σ. We know that both ⟨Σ*, conc, ε⟩ and ⟨2^{Σ*}, conc, {ε}⟩ are monoids. Let L, L1, L2 ∈ 2^{Σ*}; the set operations L1 ∪ L2, L1 ∩ L2, L1 \ L2 are defined as usual. Concatenation: L1 L2 = {x : ∃y ∈ L1 ∃z ∈ L2 so that x = yz}, L^0 = {ε}, L^n = L L^{n−1}, n > 0. Kleene closure/star: L* = ∪_{n≥0} L^n, L+ = ∪_{n≥1} L^n.

6th January, 2010 (1 hour): Right quotient and right derivative: L1\L2 = {x : ∃y ∈ L2, xy ∈ L1}, ∂ʳ_y(L) = {x : xy ∈ L} = L\{y}. Left quotient and left derivative: L2/L1 = {y : ∃x ∈ L2, xy ∈ L1}, ∂ˡ_x(L) = {y : xy ∈ L} = {x}/L. Reverse or mirror image: ε^R = ε, (ax)^R = x^R a, L^R = {x^R : x ∈ L}. Substitution and homomorphism: let Σ_a be an alphabet for each a ∈ Σ and L_a be a language over Σ_a. The map σ(a) = L_a for all a ∈ Σ induces a map σ : Σ* → 2^{(∪_a Σ_a)*} so that σ(ε) = {ε} and σ(ax) = σ(a)σ(x) [in other words σ(xy) = σ(x)σ(y)]. The map σ is called a substitution. It is ε-free if no L_a has ε in it. A substitution is a homomorphism if |L_a| = 1 for all a ∈ Σ. Finite description of languages — phrase structure grammar or type-0 grammar.
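For finite languages these operations can be sketched directly in Python as operations on sets of strings (a toy illustration; the function names are made up for this note, and only the finite case is covered):

```python
def concat(L1, L2):
    """Concatenation: L1 L2 = {yz : y in L1, z in L2}."""
    return {y + z for y in L1 for z in L2}

def power(L, n):
    """L^0 = {eps}, L^n = L L^(n-1)."""
    result = {""}
    for _ in range(n):
        result = concat(L, result)
    return result

def right_quotient(L1, L2):
    """{x : there is y in L2 with xy in L1}."""
    return {w[:len(w) - len(y)] for w in L1 for y in L2 if w.endswith(y)}

def reverse(L):
    """Mirror image of every string in L."""
    return {w[::-1] for w in L}

L1 = {"ab", "abb"}
L2 = {"b"}
print(concat(L1, L2))          # {'abb', 'abbb'}
print(right_quotient(L1, L2))  # {'a', 'ab'}
print(reverse(L1))             # {'ba', 'bba'}
```

The Kleene star itself cannot be computed this way, being an infinite union; only its finite approximations ∪_{n≤N} L^n can.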

INVMAT = {⟨M⟩ : M is an invertible matrix over the rationals}.

EULERPATH = {⟨G⟩ : there is an Eulerian walk in the undirected graph G}.

PRIME = {n ∈ ℕ : n is a prime}.
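Each of these languages is just a membership question. For instance, membership in PRIME can be decided by a short trial-division routine (a sketch added for illustration, not part of the original notes):

```python
def in_prime(n: int) -> bool:
    """Decide membership of n in PRIME = {n in N : n is a prime}."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:          # trial division up to sqrt(n)
        if n % d == 0:
            return False
        d += 1
    return True

print([n for n in range(20) if in_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```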

G = (N, Σ, P, S), where
i. N is a finite set of variables or non-terminals,
ii. Σ is a finite set of object language symbols called constants or terminals,
iii. S ∈ N is a special symbol called the start symbol or axiom,

iv. P is a finite subset of ((N ∪ Σ)* N (N ∪ Σ)*) × (N ∪ Σ)*, called the production or transformation or rewriting rules.

An element (α, β) ∈ P is such that α = uAv, where u, v ∈ (N ∪ Σ)* and A ∈ N; there must be a non-terminal in the first component of a production rule. The ordered pair (α, β) of the production rule is written as α → β.

11th January, 2010 (2 hours):
i. Phrase-Structure Grammar (PSG): as defined earlier. This is also called an unrestricted grammar or type-0 grammar.
ii. Context-Sensitive Grammar (CSG): each production rule is of the form αAβ → αuβ, where α, β ∈ (N ∪ Σ)*, A ∈ N, u ∈ (N ∪ Σ)+, i.e. one non-terminal from the left side of the production rule is replaced by a non-null string to form the right side of the production. This is also called a type-1 grammar.
iii. Length-Increasing Grammar (LIG): in each production rule the length of the right-side string is not shorter than the length of the left-side string, i.e. if u → v ∈ P, then |u| ≤ |v|. It is clear that any context-sensitive grammar is a length-increasing grammar. But it can also be proved that for every length-increasing grammar there is an equivalent context-sensitive grammar.
iv. Context-Free Grammar (CFG): each production rule is of the form A → α, where A ∈ N and α ∈ (N ∪ Σ)*. Replacement of a non-terminal does not depend on the context. This is also called a type-2 grammar.
v. Right-Linear Grammar: each production rule is of one of the two forms A → xB or A → x, where A, B ∈ N and x ∈ Σ*. Without loss of power we can take x ∈ Σ ∪ {ε}. This is also called a type-3 grammar or regular grammar.

Given a grammar G = (N, Σ, P, S), we define the binary relation one-step derivation (⇒_G) on the set (N ∪ Σ)*. If x = αuβ and y = αvβ are two strings of (N ∪ Σ)* and u → v ∈ P, we say that x derives or produces y in the grammar G in one step, and write x ⇒_G y. We shall drop G from ⇒_G if there is no scope of confusion. The reflexive-transitive closure of the one-step derivation relation gives the notion of derivation in any finite number of steps (including 0), ⇒*. We shall often drop the * and abuse the notation ⇒ for both.
Sentential form and sentence: Given a grammar G, any string that can be derived from the start symbol S in a finite number of steps is a sentential form: if S ⇒* u, then u is a sentential form. It is a sentence if it is a string of Σ*.

Language: Given a grammar G = (N, Σ, P, S), the language generated by the grammar, or language described by the grammar, is the collection of all sentences: L(G) = {x ∈ Σ* : S ⇒* x}. The language of a context-sensitive grammar (CSG) is called a context-sensitive language (CSL); the language of a length-increasing grammar (LIG) is also a CSL (as the grammars are equivalent). The language of a context-free grammar (CFG) is called a context-free language (CFL). The language of a right-linear grammar is called a regular set or a regular language.

Example 1. Following is a length-increasing grammar for the language L = {a^n b^{2n} c^n : n ≥ 1}. G1 = ({S, B}, {a, b, c}, P, S), the production rules are

S → aSBBc
S → abbc
cB → Bc
bB → bb

The grammar is not context-sensitive due to the presence of the rule cB → Bc. We replace it by three context-sensitive rules and get a context-sensitive grammar for the same language. In doing so we first replace the terminal c by a new non-terminal D:

S → aSBBD
S → abbD
DB → DE
DE → BE
BE → BD
bB → bb
D → c

Following is a context-free grammar for the language L = {x : |x|_a = |x|_b}. G2 = ({S}, {a, b}, P, S), the production rules are

S → aSb
S → bSa
S → SS
S → ε

Following is a right-linear grammar; what is its language? G3 = ({S, A}, {a, b}, P, S), the production rules are

S → aA
S → bS
S → ε
A → aS
A → bA
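The derivation relation ⇒ can be animated for the right-linear grammar G3 with a small breadth-first search. The Python sketch below assumes upper-case letters stand for non-terminals; the pruning by the length of the terminal prefix is an implementation device, not part of the grammar. The generated strings suggest an answer to the question about G3's language:

```python
from collections import deque

# Right-linear grammar G3: S -> aA | bS | eps ; A -> aS | bA
RULES = {"S": ["aA", "bS", ""], "A": ["aS", "bA"]}

def language(start, max_len):
    """Terminal strings of length <= max_len generated by the grammar."""
    out = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form and form[-1].isupper():        # trailing non-terminal
            head, nt = form[:-1], form[-1]
            for rhs in RULES[nt]:              # one-step derivation
                new = head + rhs
                # prune forms whose terminal prefix is already too long
                if len(new.rstrip("SA")) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
        else:
            out.add(form)                      # a sentence
    return out

L = language("S", 4)
# every generated string has an even number of a's
assert all(w.count("a") % 2 == 0 for w in L)
print(sorted(L, key=len))
```

The non-terminal S remembers "even number of a's so far" and A remembers "odd", so L(G3) is the set of strings over {a, b} with an even number of a's.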

12th January, 2010 (1 hour): Tutorial II.

13th January, 2010 (1 hour): We first prove that for every length-increasing grammar G there is a context-sensitive grammar G′ so that they are equivalent, i.e. L(G) = L(G′). Without any loss of generality we take the rules of the LIG in one of the following forms: A → a, where A ∈ N and a ∈ Σ, or A_1 A_2 … A_m → B_1 B_2 … B_n, where A_1, …, A_m, B_1, …, B_n ∈ N and m ≤ n. We have replaced every terminal a in the productions by a new non-terminal A′ and added a production A′ → a.

Example 2. Consider the grammar G1 = ({S, B}, {a, b, c}, P, S), where the production rules P are

S → aSBBc
S → abbc
cB → Bc
bB → bb

The transformed grammar is G1′ = ({S, B, A′, B′, C′}, {a, b, c}, P′, S), where the production rules are

S → A′SBBC′
S → A′B′B′C′
C′B → BC′
B′B → B′B′
A′ → a
B′ → b
C′ → c

The rules of the first type, and the second-type rules with m = 1, are context-sensitive rules. So we are interested in the second type of rule with m ≥ 2. We replace A_1 A_2 … A_m → B_1 B_2 … B_n by the following set of 2m rules:

A_1 A_2 … A_m → C_1 A_2 … A_m
C_1 A_2 … A_m → C_1 C_2 A_3 … A_m
⋮
C_1 C_2 … C_{m−1} A_m → C_1 C_2 … C_{m−1} C_m B_{m+1} … B_n
C_1 C_2 … C_{m−1} C_m B_{m+1} … B_n → B_1 C_2 … C_{m−1} C_m B_{m+1} … B_n
B_1 C_2 … C_{m−1} C_m B_{m+1} … B_n → B_1 B_2 C_3 … C_m B_{m+1} … B_n
⋮
B_1 B_2 … B_{m−1} C_m B_{m+1} … B_n → B_1 B_2 … B_{m−1} B_m B_{m+1} … B_n

All these rules are context-sensitive in nature (each replaces a single symbol within its context).

Soundness and Completeness: Given a language L and a grammar G we have to establish that L = L(G). There are two parts to the process: we have to prove that the grammar does not generate any string outside L, i.e. L(G) ⊆ L — the grammar is sound; and that every string of the language is generated by the grammar, L ⊆ L(G) — the grammar is complete.
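The 2m-rule replacement can be generated mechanically. A hypothetical Python sketch (`cs_rules` is an illustrative name; symbols are strings, a rule is a pair of symbol lists) that produces the two phases — first replace each A_i by a fresh marker C_i left to right, then replace each C_i by B_i:

```python
def cs_rules(A, B):
    """Replace A1...Am -> B1...Bn (2 <= m <= n) by 2m context-sensitive
    rules via fresh markers C1...Cm, as in the LIG-to-CSG construction."""
    m, n = len(A), len(B)
    assert 2 <= m <= n
    C = [f"C{i+1}" for i in range(m)]
    rules = []
    # Phase 1: replace Ai by Ci left to right; the last step also
    # appends B(m+1)...Bn, so lengths never decrease.
    for i in range(m):
        lhs = C[:i] + A[i:]
        rhs = C[:i+1] + A[i+1:] if i < m - 1 else C + B[m:]
        rules.append((lhs, rhs))
    # Phase 2: replace Ci by Bi left to right.
    for i in range(m):
        lhs = B[:i] + C[i:] + B[m:]
        rhs = B[:i+1] + C[i+1:] + B[m:]
        rules.append((lhs, rhs))
    return rules

for lhs, rhs in cs_rules(["A1", "A2"], ["B1", "B2", "B3"]):
    print(" ".join(lhs), "->", " ".join(rhs))
```

Each emitted rule rewrites exactly one symbol inside an unchanged context, which is why every rule is context-sensitive.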

18th January, 2010 (2 hours): No class due to the death of Jyoti Basu.

19th January, 2010 (1 hour): Rooted tree, parse or derivation tree. Ambiguously derived string and ambiguous grammar. Inherently ambiguous language. Simplification of a CFG — removal of useless symbols.

25th January, 2010 (2 hours): 1 hour tutorial + elimination of ε-productions and elimination of unit-productions. Deterministic finite automaton (DFA) — M = (Q, Σ, δ, s, F), state transition function δ : Q × Σ → Q, extended to δ* : Q × Σ* → Q, state transition diagram, state transition table, string accepted by M, language of M: L(M) = {x ∈ Σ* : δ*(s, x) ∈ F}.

25th January, 2010 (1.5 hours) (compensation for 18th): Examples of DFA. Non-deterministic finite automaton (NFA) — N = (Q, Σ, δ, s, F), state transition function δ : Q × Σ → 2^Q, extended to δ* : 2^Q × Σ* → 2^Q with δ(P, a) = ∪_{q∈P} δ(q, a), where P ⊆ Q. Equivalence of DFA and NFA — subset construction.

27th January, 2010 (1 hour): Subset construction, NFA with ε-transitions and its equivalence with NFA without ε-transitions (not done properly).

1st February, 2010 (2 hours): 1 hour tutorial + NFA with ε-transitions, equivalence of NFA with ε-transitions and NFA without ε-transitions, regular expression and its language.

2nd February, 2010 (1 hour): ε-NFA from a regular expression; L_x, the derivative of L with respect to x — if L is regular then so is L_x. Unique solution of X = AX + B when ε ∉ A: X = A*B. Regular expression from a DFA — solution of a set of simultaneous equations.
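The subset construction mentioned above can be sketched in Python (a minimal version without ε-transitions; all names here are illustrative). DFA states are frozensets of NFA states, built on demand from the start set:

```python
from collections import deque

def subset_construction(alphabet, delta, start, finals):
    """Build a DFA from an NFA; delta maps (state, symbol) to a set of states."""
    start_set = frozenset([start])
    dfa_delta, dfa_finals = {}, set()
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        P = queue.popleft()
        if P & finals:                 # P is final if it holds an NFA final state
            dfa_finals.add(P)
        for a in alphabet:
            Q = frozenset(q for p in P for q in delta.get((p, a), set()))
            dfa_delta[(P, a)] = Q
            if Q not in seen:
                seen.add(Q)
                queue.append(Q)
    return seen, dfa_delta, start_set, dfa_finals

# NFA for strings over {0, 1} ending in "01": s --0--> p --1--> q
nfa_delta = {("s", "0"): {"s", "p"}, ("s", "1"): {"s"}, ("p", "1"): {"q"}}
dstates, ddelta, dstart, dfinals = subset_construction("01", nfa_delta, "s", {"q"})
print(len(dstates), "reachable DFA states")
```

Only the reachable subsets are constructed, so the DFA here has 3 states rather than the worst-case 2^3 = 8.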

3rd February, 2010 (1 hour): Regular expression from DFA using state equations. Closure properties.

8th February, 2010 (2 hours): 1 hour tutorial + closure properties of regular languages: closure under boolean operations, concatenation, Kleene-star, reversal, homomorphism, inverse homomorphism.

9th February, 2010 (1 hour): Closure properties, pumping theorem — proving a language non-regular, decidability results.
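Closure under intersection (one of the boolean operations) is usually shown by the product construction: run both DFAs in parallel on pairs of states. A Python sketch, under the assumption that a DFA is a triple (transition table, start state, set of final states) — the representation is ours, not from the notes:

```python
def product_dfa(d1, d2):
    """Product of two DFAs; accepts the intersection of their languages."""
    (t1, s1, f1), (t2, s2, f2) = d1, d2
    delta = {((p, q), a): (t1[(p, a)], t2[(q, b)])
             for (p, a) in t1 for (q, b) in t2 if a == b}
    finals = {(p, q) for p in f1 for q in f2}
    return delta, (s1, s2), finals

def accepts(dfa, w):
    delta, state, finals = dfa
    for c in w:
        state = delta[(state, c)]
    return state in finals

# DFA 1: even number of a's; DFA 2: length divisible by 3 (alphabet {a, b})
even_a = ({("e", "a"): "o", ("e", "b"): "e",
           ("o", "a"): "e", ("o", "b"): "o"}, "e", {"e"})
len_mod3 = ({(i, c): (i + 1) % 3 for i in range(3) for c in "ab"}, 0, {0})
both = product_dfa(even_a, len_mod3)
print(accepts(both, "aab"))   # True: two a's, length 3
print(accepts(both, "ab"))    # False: length 2
```

Replacing `finals` by pairs where either component is final gives union; the same product idea underlies the later result that the intersection of a CFL and a regular language is a CFL.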

10th February, 2010 (1 hour): Myhill-Nerode theorem — identification of regular languages as a union of equivalence classes of a right-invariant equivalence relation of finite index. Regular languages — a countable boolean algebra. Given a finite state transition diagram with k states on an alphabet Σ, we can define 2^k DFAs (the set of final states may be any subset of the k states). These 2^k languages form a boolean algebra.

15th February, 2010 (2 hours): 1 hour tutorial + minimisation of DFA, minimisation algorithm and equivalence of two DFAs, finite automata with output — Moore and Mealy machines.

16th February, 2010 (1 hour) — ??? — definition of a PDA.

2nd March, 2010 (1 hour): Definition of a PDA — acceptance of a string by empty stack of a PDA M, the language N(M); acceptance of a string at a final state by a PDA M, the language T(M). A language L = N(M1) for a PDA M1 if and only if L = T(M2) for some PDA M2. The equivalence is not true in the case of DPDA. Every CFL L is accepted by a PDA M — a one-state PDA simulates the left-most derivation.

3rd March, 2010 (1 hour): Any regular set L is accepted by a DPDA in final state. The regular language {0}* is accepted by a DPDA in final state, but is not accepted by any DPDA in empty stack. If L = N(M1) for a DPDA M1, then there is a DPDA M2 so that L = T(M2). But the reverse is not true. If a language is accepted by a PDA, then it is a CFL — example from the PDA of {a^n b^n : n ≥ 1}.

8th March, 2010 (2 hours): 1 hour tutorial + pumping lemma for CFLs, {a^n b^n c^n : n ≥ 1} is not a CFL, substitution of a language; the collection of context-free languages is closed under substitution, finite union, concatenation, Kleene closure and homomorphism; the collection of CFLs is not closed under intersection.

9th March, 2010 (1 hour): Closure under substitution. (?)

10th March, 2010 (1 hour): The class of context-free languages is closed under inverse homomorphism.
Decision problems of context-free languages — language is empty, language is finite, language is infinite, x ∈ L(G) — CYK algorithm.

15th March, 2010 (2 hours): 1 hour tutorial + the intersection of a CFL and a regular language is a CFL; Turing machine — as an acceptor, as a computer and as an enumerator.

16th March, 2010 (1 hour): Turing machine — formal description. If L is a CFL over the one-letter alphabet {a}, then L is regular. Proof: if L is finite then L is regular. So we assume that L is infinite and that the CFL pumping constant is k. We partition L = L1 ∪ L2, where L1 = {x ∈ L : |x| < k} and L2 = {x ∈ L : |x| ≥ k}. L1, being finite, is regular. We shall prove that L2 is also regular. Let w ∈ L with |w| ≥ k; by the pumping lemma we can write w = uvxyz such that
i. |vy| > 0,
ii. |vxy| ≤ k,
iii. for all i ≥ 0, u v^i x y^i z ∈ L.
The monoid {a}* is commutative, so (iii) implies that for all i ≥ 0, uxz(vy)^i ∈ L. If |vy| = p, then for all i ≥ 0, uxz·vy·(vy)^i = w(a^p)^i ∈ L. Let ℓ = k!. As 0 < p ≤ k, p divides ℓ, so a^ℓ = (a^p)^{ℓ/p} and, for all m ≥ 0, w(a^ℓ)^m = w(a^p)^{mℓ/p} ∈ L. So w ∈ L and |w| ≥ k implies that for all m ≥ 0, w(a^ℓ)^m ∈ L.

We see that each w ∈ L with |w| ≥ k is an element of the set a^{k+i}(a^ℓ)* for some i, 0 ≤ i < ℓ, i.e. L2 ⊆ ∪_{0≤i<ℓ} a^{k+i}(a^ℓ)*. Let w_i be the least element of the set L ∩ a^{k+i}(a^ℓ)* (for those i where the intersection is non-empty); then for all m ≥ 0, w_i(a^ℓ)^m ∈ L, and each such element belongs to a^{k+i}(a^ℓ)*, as w_i = a^{k+i}(a^ℓ)^{m_i} for some m_i. So all these elements, starting from w_i, can be represented by the regular expression w_i(a^ℓ)*. We also claim that there is no other element of a^{k+i}(a^ℓ)* belonging to L: if there were some such element w_i′ = a^{k+i}(a^ℓ)^{l_i}, then l_i > m_i, as w_i is the least element. Let l_i − m_i = d; then w_i′ = a^{k+i}(a^ℓ)^{m_i+d} = w_i(a^ℓ)^d, belonging to the chain of w_i. So we conclude that w_i(a^ℓ)* = L ∩ a^{k+i}(a^ℓ)*, and L2 = (w_0 + w_1 + ⋯ + w_{ℓ−1})(a^ℓ)* (taking only the non-empty chains) is a regular language.
17th March, 2010 (1 hour): Design of a DTM; remembering information in a state — a state may be an n-tuple, e.g. (q, a) and (q, b); a tape symbol may be an n-tuple and one component may be modified, e.g. (a, b, a) is changed to (a, b, b). Equivalence of singly-infinite tape and doubly-infinite tape Turing machines.

22nd March, 2010 (2 hours): 1 hour tutorial + equivalence of singly-infinite DTM and doubly-infinite DTM. Parikh's theorem.

23rd March, 2010 (1 hour): Multi-tape DTM, non-deterministic Turing machine, their equivalence with DTM. Recursively enumerable and recursive languages.

24th March, 2010 (1 hour): A language is Turing-recognisable if and only if it is generated by an unrestricted grammar.

29th March, 2010 (2 hours): 1 hour tutorial + continuation of the equivalence of Turing machines and unrestricted grammars. The collection of recursive sets is a countable Boolean algebra. Any DTM over Σ = {0, 1} can be simulated by a DTM with tape symbols Γ = {0, 1, ⊔}, where ⊔ is the blank symbol.

30th March, 2010 (1 hour): Encoding of a DTM over {0, 1} with tape symbols {0, 1, ⊔}. A DTM may be viewed as a binary numeral of a natural number. Not every binary representation of a natural number encodes a DTM; we define such a numeral to be the code of a DTM recognising the empty set. Let M_1, M_2, … be an enumeration of DTMs, where M_i is the DTM whose binary representation is i. Let x_1, x_2, … be the enumeration of strings over {0, 1}. We define the diagonal language L_d = {x_i : M_i does not accept x_i}. We claim that L_d is not recursively enumerable. If it is, then there is a DTM T_d recognising L_d; T_d is some M_i in the enumeration. But that leads to a contradiction, as M_i then accepts x_i if and only if M_i does not accept x_i. So L_d is not Turing-recognisable, i.e. not recursively enumerable. There is a Universal Turing Machine U that takes the encoding ⟨M⟩ of a DTM (including itself) and an input x to M, and simulates the behaviour of M on x.
Let the language recognised by U be L_u = {⟨M, x⟩ : M is a DTM that accepts x}. We claim that the complement of L_d, L̄_d = {x_i : M_i accepts x_i}, is recursively enumerable but not recursive. It is not recursive, as that would make L_d recursive; but we know that L_d is not even recursively enumerable. The following machine recognises L̄_d.

M_d: Input: x

1. Enumerate the strings over {0, 1}, x_1, x_2, …, and compare each enumerated string with x. Stop when they are equal; let x = x_j.

2. Consider the binary representation of j, ⟨j⟩. If it is not a valid encoding of a DTM, reject x. (An invalid binary string represents a DTM whose language is empty, so it does not accept x = x_j.)

3. If ⟨j⟩ is a valid machine, run the universal machine U on input ⟨M_j, x_j⟩.

4. If U reaches the final state, i.e. M_j reaches the final state on x_j, then accept x.

5. If U reaches a non-final state and halts, then M_d also halts at a non-final state and rejects x.

6. If the simulation goes into an infinite loop, M_d does the same.

It is clear that the language recognised by M_d is L̄_d. The language L_u of a Universal TM is certainly recursively enumerable. But it cannot be recursive, as a decider for L_u would give a decider for L̄_d (in the construction of M_d, replace U by this decider), and that would make L_d also recursive — but we have already proved otherwise. This is called problem reduction: we reduce the decision problem of L̄_d to the decision problem of L_u. As L̄_d is known to be undecidable, so is L_u. Again, the complement L̄_u = {⟨M, x⟩ : M does not accept x} cannot be recursively enumerable, as that would make both L_u and L̄_u recursive. So we have two languages, L̄_d and L_u, that are recursively enumerable, and their complements, L_d and L̄_u, which are not even recursively enumerable.

Problem reduction is a method of converting a decision problem of a language A to a decision problem of a language B, so that a solution to the decision problem of B yields a solution to the decision problem of A. As an example, consider the construction of the machine M_d above: in a sense it reduces the decision problem of L̄_d (A) to the decision problem of L_u (B).

31st March, 2010 (1 hour):
