Theory of Computation
CSE ( CET2008B)
TY BTech
Unit III
CFG and PDA
Unit- III
Context Free Grammar(CFG) and Pushdown
Automata
Course Objectives & Course Outcomes
Course Objectives:
•To understand the basics of automata theory and its operations.
•To understand problem classification and problem solving by machines.
•To study computing machines by describing, classifying and comparing different types
of computational models.
•To understand the fundamentals of decidability and computational complexity.
Course Outcomes:
• After completion of this course students will be able:
•To construct finite state machines to solve problems in computing.
•To write mathematical expressions and syntax verification for the formal languages.
•To construct and analyze Push Down Automata and Turing Machine for formal
languages.
•To express the understanding of decidability and complexity.
Text Books & Reference Books
• Text Books
•John C. Martin, Introduction to Language and Theory of Computation, TMH,
3rdEdition, ISBN: 978-0-07-066048-9.
•Vivek Kulkarni, Theory of Computation, Oxford University Press, ISBN-13:
978-0-19-808458-7.
• Reference Books
•K.L.P Mishra,N. Chandrasekaran,Theory of Computer Science (Automata, Languages
and Computation), Prentice Hall India, 2nd Edition.
•Michael Sipser, Introduction to the Theory of Computation,CENGAGE Learning, 3rd
Edition, ISBN:13:978-81-315-2529-6.
•Daniel Cohen, Introduction to Computer Theory, Wiley India, 2nd Edition, ISBN:
9788126513345.
•Kavi Mahesh, Theory of Computation: A Problem Solving Approach, 1st Edition,
Wiley-India, ISBN: 978-81-265-3311-4.
Introduction
•Finite Automata accept all regular languages and only regular languages
•Many simple languages are non regular:
- {anbn : n = 0, 1, 2, …}
- {w : w a is palindrome}
and there is no finite automata that accepts them.
• Context-Free Languages are a larger class of languages that encompasses
all regular languages and many others, including the two above.
Context-Free Languages and Regular
Languages
Context-Free Languages
Regular Languages
Formal Definition of a Grammar
Grammar:
Set of
variables
Set of Start Set of
terminal variable productions
symbols
Language for the Grammar
Grammar:
S🡪aSb
S🡪є
Language of the grammar:
A Convenient Notation
We write:
for zero or more derivation steps
Instead of:
Generalizing:
Trivially:
Formal Definition-CFG
Definition: A context-free grammar is a 4-tuple (V, T, P, S),
where:
• V is a set (each element in V is called nonterminal)
• T is an alphabet (each character in T is called terminal)
• P, the set of rules, is a subset of V × (T ∪ V)*
If (α,β) ∈ P, we write production α 🡪 β
β is called a sentential form
• S, the start symbol, is one of the symbols in V
Example of Context-Free Grammar
productions
P={S🡪aSb,
S🡪є}
start variable
variables
terminals
Context-Free Language
A language is context-free
if there is a context-free grammar
with
Context-Free Language-Example 1
is a context-free language
since context-free grammar :
{S🡪aSb | є}
generates
Context-Free Language – Example 2
Context-free grammar :
{S🡪aSa | bSb | є}
Example derivations:
Palindromes of even length
Derivation and Parse Trees
Consider the following example grammar
with 5 productions:
S🡪AB A🡪aaA|ϵ B🡪Bb|ϵ
Leftmost and Rightmost Derivation
Consider the following example grammar
with 5 productions:
S🡪AB A🡪aaA|ϵ B🡪Bb|ϵ
Leftmost derivation: for the string aab
Rightmost derivation: for the string aab
Leftmost and Rightmost Derivation
S🡪AB A🡪aaA|ϵ B🡪Bb|ϵ
Leftmost derivation:
Rightmost derivation:
Give same
derivation tree
ϵ ϵ
Context Free Language for a CFG
Find the CFL generated for the given CFG.
1. S🡪aSb | ab
L(G)={anbn | n>=1}
2. S🡪aB|bA
A🡪a|aS|bAA
B🡪b|bS|aBB
L(G)={x|x containing equal no of a’s and b’s}
CFG for a CFL
1. Write the grammar for generating strings over
Σ={a}, containing any(zero or more) number of a’s.
S 🡪 a | aS | ϵ
2. Find the CFG represented by the RE (a+b)*
S 🡪 aS | bS | ϵ
3. Write the grammar for all strings consisting of a’s
and b’s with atleast 2 a’s.
S🡪AaAaA
A🡪aA | bA | ϵ
Ambiguous Grammar
A context-free grammar is ambiguous
if there is a string which has:
two different derivation trees
or
two leftmost derivations
(Two different derivation trees give two
different leftmost derivations and vice-versa)
Ambiguous Grammar – An Example
string has two derivation trees
Ambiguous Grammar – An Example
string has two leftmost derivations
Removing Ambiguity
Equivalent
Ambiguous
Non-Ambiguous
Grammar
Grammar
generates the same
language
Unique derivation
tree and leftmost
derivation
For the string
a+a*a
Simplification of Context Free Grammar
A CFG G can be simplified as:
1. Each variable and terminal of the CFG should appear in the
derivation if at least one word in the L(G).
2. There should not be any production of the form A🡪B where A and
B are both non-terminals.
Simplification techniques are:
1.Removal of Useless Symbols
2.Removal of Unit Productions
3.Elimination of ∈ production
25
Removal of Useless symbol
Ex.1 Consider the following grammar:
G={ (S,A),(1,0),P,S}
Where P consists of the following productions:
S🡪1 0|0 S 1|1 S 0|A|S S
Simplify the grammar by removing the useless symbols if any.
Remove S🡪A, as A is useless symbol.
S🡪1 0|0 S 1|1 S 0|S S
26
Removal of Useless symbol
Ex.2 Consider the following grammar:
G={ (S,A,B),(a),(1,0),P,S}
Where P consists of the following productions:
S🡪 A B | a
A🡪a
Simplify the grammar by removing the useless symbols if any.
B is useless, Remove B
S🡪A | a
A🡪 a
27
Removal of Unit Production
• A production rule of the form A🡪B where A and B are both
non-terminals is called a unit production.
• All the other productions (including ∈ productions) are non-unit
productions.
Ex.1 Consider the following grammar:
G={ (A,B),(a,b),P,A}
Where P consists of :
A🡪B, B🡪a | b
Simplify the grammar by removing the unit productions if any.
On eliminating unit production A🡪B
A🡪a|b
28
Removal of Unit Production
Ex.2Consider the following grammar:
S🡪 a | X b | a Y a | b | a a
X 🡪Y
Y🡪b|X
Simplify the grammar by removing the unit productions if any.
Simplified grammar is:
S🡪a | Yb | aYa | b | aa
Y🡪 b
29
Removal of ∈ - Production
A production of the form A🡪 ∈ where A is non-terminal is known as
∈ - Production.
Ex.1 Consider the following grammar,
S🡪 a S a | b S b | ∈
Simplify the grammar by eliminating ∈ - Productions if any.
Simplified grammar is G’:
S 🡪 aSa | bSb | aa | bb
G’=G-ϵ rules.
L(G’)=L(G)-{ϵ}
30
Removal of ∈ - Production
Ex.2 Consider the following grammar,
S 🡪a | X b | a Y a
X🡪Y | ∈
Y🡪b | X
Simplify the grammar by eliminating ∈ - Productions if any.
Simplified grammar is:
S 🡪 a | X b | a Y a | b | aa
X🡪 Y
Y🡪 b | X
31
Chomsky Hierarchy
1. Type 0 grammar (Unrestricted Grammar)
2. Type 1 grammar (Context Sensitive Grammar)
3. Type 2 grammar (Context Free Grammar)
4. Type 3 grammar (Regular Grammar)
Chomsky Hierarchy
• Type 3 grammar:
It is also called as regular grammar.
A 🡪α
Recognized by FSM.
Two types of Regular Grammar are:
1. Left Linear Grammar 2. Right Linear Grammar
• Type 2 grammar:
It is also called as context free grammar.
A 🡪 α where A is NT and α is sentential form.
Start symbol of the Grammar can also appear on RHS.
Recognized by PDA(Pushdown Automata)
Chomsky Hierarchy
• Type 1 grammar:
It is context sensitive grammar or context dependent.
1. α 🡪 β where length of β is at least as much as the length of α
except S🡪 ∈.
2. The rule S🡪 ∈ is allowed only if start symbol S does not appear
on RHS.
3. Productions are of the form
α1Aα2 🡪α1 β α2 (β ≠ ∈)
Recognized by TM(Turing Machine).
34
Chomsky Hierarchy
• Type 0 grammar:
It is unrestricted grammar that is no restriction on production.
α 🡪 β (α ≠ ∈)
Recognized by TM(Turing machines)
These languages are known as Recursively Enumerable Languages.
35
Normal Forms
There are certain standard ways of writing CFG.
They satisfy certain restrictions on the productions in the CFG.
Then the G is said to be in Normal forms.
1. Chomsky Normal Form (CNF)
2. Greibach Normal Form (GNF)
Chomsky Normal Form(CNF)
A CFG is in CNF if every production is of the form
A🡪 a, where A is NT and a is T
A 🡪 BC where A,B and C are NTs
S 🡪 ∈ is in G if ∈ belongs to L(G).
When ∈ is in L(G),
we assume that S does not appear on the RHS of any production.
e.g. G is :
S🡪AB| ∈
A🡪 a
B🡪b
Construction of G in CNF
Step 1. Elimination of null productions and unit productions using
previous method.
Let the G is (V,T,P,S)
Step 2. Elimination of terminals on RHS.
Step 3. Restricting the number of variables on RHS.
Reduce to CNF
1. Convert to CNF, S🡪aSa | bSb | a | b | aa | bb
Adding A🡪a and B🡪b we can rewrite the grammar as
S🡪ASA|BSB|a|b|AA|BB
A🡪a, B🡪b
Only S🡪ASA and S🡪BSB are not in CNF
For S🡪ASA we can write
S🡪AR1, R1 🡪SA
For S🡪BSB we can write
S🡪BR2, R2🡪SB
Grammar in CNF is
S🡪AR1|BR2|a|b|AA|BB
A🡪a, B🡪b
R1🡪SA, R2🡪SB
Reduce to CNF
2. Convert to CNF,
S🡪bA|aB, A🡪bAA|aS|a, B🡪aBB|bS|b
Add R1🡪a, R2🡪 b
Rewriting the grammar,
S🡪 R2 A| R1 B,
A🡪 R2 AA| R1 S|a,
B🡪R1 BB| R2 S|b
R1🡪a, R2🡪b
A🡪R2 AA and B🡪 R1 BB are not in CNF
Converted grammar in CNF is
S🡪R2 A| R1 B,
A🡪R2 R3| R1 S|a, R3🡪AA
B🡪R1 R4 | R2 S|b, R4 🡪BB
R1🡪a, R2🡪b
Greibach Normal Form
A context free grammar is said to be in Greibach
Normal Form if all productions are in the following
form:
A → αX
• A is a non terminal symbols
• α is a terminal symbol
• X is a sequence of non terminal
symbols.
It may be empty.
Closure properties of CFL
CFLs are closed under:
Union
Concatenation
Kleene closure operator
Substitution
Reversal
CFLs are not closed under:
Intersection
Difference
Complementation
42
Push Down Automata
Context-Free Languages
Context-Free Pushdown
Grammars Automata
stack
automaton
Pushdown Automaton - PDA
Input String
Stack
States
Initial Stack Symbol
Stack Stack
stack
top
head
bottom special symbol
Appears at time 0
Formal definition of PDA
The PDA is as:
A =(Q,Σ, Г ,δ,q0,Z0,F)
Where
Q : A finite set of states
Σ : A finite set of input symbols
Г : A finite stack alphabet or pushdown symbols
δ : the transition function Q X(Σ U {ϵ } ) X Г to the set of
finite subsets of Q X Г*
q0: the start state
Z0 : the start symbol(pushdown symbol)
F : the set of accepting state or final states
Transition Function
δ :The transition function is a triple δ(q,a,x)
where
1. q is a state in Q
2. a is either an input symbol in Σ or a =ϵ, the empty string,
3. x is a stack symbol, that is a member of Г.
The output of δ is a finite set of pairs (p,ɤ )
Where
p is the new state and
ɤ is the string of stack symbols that replaces x at the top of the
stack.
Instantaneous Description of a PDA
The configuration of a PDA by a triple (q,w,ɤ) where
1. q is the state
2. w is the remaining input
3. ɤ is the stack contents
we show the top of the stack at the left end of ɤ and
the bottom at the right end.
Such a triple is called an Instantaneous Description
or ID of a PDA.
δ(q, a, b) = (q, c)
stack
top Replac
e
δ(q, a, b) = (q, cb)
stack
top Pus
h
δ(q, a, b) = (q, ϵ)
stack
top Po
p
δ(q, a, b) = (q, b)
stack
top No
Change
A PDA Example
A =(Q,Σ, Г ,δ,q0,Z0,F)
Q={q0, q1, q2}, Σ={a,b}, Г={Z0, a}, F={q2}, δ as below
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 a a (q0,aa)
3 q0 b a (q1, ∈)
4 q1 b a (q1,∈)
5 q1 ∈ Z0 (q2, Z0)
Acceptance by PDA using final state
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 a a (q0,aa)
3 q0 b a (q1, ∈)
4 q1 b a (q1,∈)
5 q1 ∈ Z0 (q2, Z0)
For the string aabb
(q0, aabb , Z0) String aabb is
Ⱶ (q0, abb , aZ0) accepted, as final
Ⱶ (q0, bb , aaZ0) state q2 is
Ⱶ (q1, b , aZ0) reached on
Ⱶ (q1, ∈ , Z0) reading string
Ⱶ (q2,∈ , Z0) aabb completely
Rejection by PDA
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 a a (q0,aa)
3 q0 b a (q1, ∈)
4 q1 b a (q1,∈)
5 q1 ∈ Z0 (q2, Z0)
For the string aabbb
(q0, aabbb , Z0) String aabbb is rejected as q1 is not
Ⱶ (q0, abbb , aZ0) final state and string aabbb is not
Ⱶ (q0, bbb , aaZ0) read completely.
Ⱶ (q1, bb , aZ0)
Ⱶ (q1, b , Z0)
Acceptance by PDA using null store or empty store
or empty stack
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 a a (q0,aa)
3 q0 b a (q1, ∈)
4 q1 b a (q1,∈)
5 q1 ∈ Z0 (q1, ∈)
For the string aabb
(q0, aabb , Z0) String aabb is
Ⱶ (q0, abb , aZ0) accepted, as
Ⱶ (q0, bb , aaZ0) stack is empty on
Ⱶ (q1, b , aZ0) reading string
Ⱶ (q1, ∈ , Z0) aabb completely
Ⱶ (q1,∈ , ∈)
Rejection by PDA
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 a a (q0,aa)
3 q0 b a (q1, ∈)
4 q1 b a (q1,∈)
5 q1 ∈ Z0 (q1, ∈)
For the string aabbb
(q0, aabbb , Z0) String aabbb is rejected as string
Ⱶ (q0, abbb , aZ0) aabbb is not read completely and
Ⱶ (q0, bbb , aaZ0) stack is not empty.
Ⱶ (q1, bb , aZ0)
Ⱶ (q1, b , Z0)
Acceptance by PDA
Acceptance of input strings by PDA can be defined in terms of final
states or in terms of PDS(pushdown store).
Let A= (Q,Σ, Г ,δ,q0,Z0,F) be a PDA.
The set accepted by final state is defined by
T(A) ={w ϵ Σ* | (q0,w,Z0 ) Ⱶ * (qf , ʌ, α ) for some qf ϵ F and α ϵ Г*}
The set accepted by null store(or empty store)is defined by
N(A) = {w ϵ Σ* | (q0,w,Z0 ) Ⱶ * (q , ʌ, ʌ ) for some q ϵ Q }
n n
PDA for L={a b | n>0 }
Logic:
W
Let
A= (Q, Σ, Г , δ, q0, Z0, F)
Q={q0,q1},
Σ={a,b},
Г={a,Z0},
F={q1}
is a PDA.
δ (Transition Function) by Final State is
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 a a (q0,aa)
3 q0 b a (q1, ∈)
4 q1 b a (q1,∈)
5 q1 ∈ Z0 (q1, Z0)
Acceptance of a string aabb
(q0, aabb , Z0)
Ⱶ (q0, abb , aZ0)
Ⱶ (q0, bb , aaZ0)
Ⱶ (q1, b , aZ0)
Ⱶ (q1, ∈ , Z0)
δ (Transition Function) by Empty stack or Empty
store or Null stack is
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 a a (q0,aa)
3 q0 b a (q1, ∈)
4 q1 b a (q1∈)
5 q1 ∈ Z0 (q1, ∈)
Acceptance of a string aabb
(q0, aabb , Z0)
Ⱶ (q0, abb , aZ0)
Ⱶ (q0, bb , aaZ0)
Ⱶ (q1, b , aZ0)
Ⱶ (q1, ∈ , Z0)
Ⱶ (q1, ∈ , ∈)
Ex: PDA to accept language of palindromes with the
marker. i.e. L={xcxr | x ∈ {a,b}*}
Let
A= (Q, Σ, Г , δ, q0, Z0, F) is a PDA.
where
Q={q0,q1,qf},
Σ={a,b,c},
Г={a,b,Z0},
F={qf}
δ (Transition Function) by Final State is
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 b Z0 (q0,bZ0)
3 q0 a a (q0,aa)
4 q0 b b (q0,bb)
5 q0 a b (q0, ab)
6 q0 b a (q0, ba)
7 q0 c Z0 (q1,Z0)
8 q0 c a (q1,a)
9 q0 c b (q1,b)
10 q1 a a (q1, ∈)
11 q1 b b (q1, ∈)
12 q1 ∈ Z0 (qf, Z0)
δ (Transition Function) by Empty stack or Empty
store or Null stack is
Move no State input stack Move
symbol
1 q0 a Z0 (q0,aZ0)
2 q0 b Z0 (q0,bZ0)
3 q0 a a (q0,aa)
4 q0 b b (q0,bb)
5 q0 a b (q0, ab)
6 q0 b a (q0, ba)
7 q0 c Z0 (q1,Z0)
8 q0 c a (q1,a)
9 q0 c b (q1,b)
10 q1 a a (q1, ∈)
11 q1 b b (q1, ∈)
12 q1 ∈ Z0 (q1, ∈ )
Examples for practice
1. PDA for L={anb2n | n>0}
2. PDA for L={anbncmdm | n, m>0 }
3. PDA for L={anbmcm | m, n>0}
66
Deterministic and non-deterministic PDA
DPDA:
transition function is :
Q X Σ X Г 🡪 Q X Г*
e.g. δ(q,a,Z) is either empty or a singleton.
δ(q,a,Z) ≠ Ø
NPDA:
Q X Σ U {ϵ } X Г 🡪 finite subsets of Q X Г*
e.g. δ(q,a,Z) = {(p1,ɤ1) ,(p2, ɤ2)…..(pm, ɤm)}
DPDA and NPDA
NPDA and DPDA
• For every NPDA, there may not exist an equivalent DPDA.
• The NPDA can accept any CFL, while DPDA is a special case of
NPDA that accepts only a subset of the CFLs accepted by the
NPDA.
• Thus, DPDA is less powerful than NPDA.
NPDA to accept language of palindromes without the
marker.
Let
A= (Q, Σ, Г , δ, q0, Z0, F) is a PDA.
where
Q={q0,q1,qf},
Σ={a,b},
Г={a,b,Z0},
F={qf}
NPDA to accept language of all palindrome strings
Move no State input stack symbol Move
1 q0 a Z0 {(q0,aZ0), (q1, Z0)}
2 q0 b Z0 {(q0,bZ0), (q1, Z0)}
3 q0 a a {(q0,aa),(q1,a)}
4 q0 b a {(q0,ba), (q1,a)}
5 q0 a b {(q0, ab), (q1,b)}
6 q0 b b {(q0, bb), (q1,b)}
7 q0 ∈ Z0 {(q1,Z0)}
8 q0 ∈ a {(q1,a)}
9 q0 ∈ b {(q1,b)}
10 q1 a a {(q1, ∈)}
11 q1 b b {(q1, ∈)}
12 q1 ∈ Z0 {(qf, Z0)}
CFG to PDA
Theorem: If L is a CFL then we can construct a PDA A accepting L by empty
store ie. L=N(A).
Proof: We construct A by making use of productions in G.
Let L=L(G) where G=(V, T, P, S) is a CFG.
We construct PDA A as
A= (Q, Σ, Г , δ, q, Z0,F)
where Σ=T
Г is (V U T)
Z0 = S
F=Ф
δ is defined as
R1 : δ(q, ∈, A) = {(q, α ) | A🡪 α is in P}
R2 : δ(q, a, a) = {(q, ∈ ) } for every a in Σ.
CFG to PDA
1 .Construct a PDA for the CFG
S🡪 0BB
B🡪0S | 1S |0
Test whether 0104 is in N(A).
We construct PDA A as
A = (Q, Σ, Г , δ, q, Z0,F)
Q = {q}
Σ = {0,1} Move State input stack symbol Move
Г = {S, B, 0, 1} no
Z0= S 1 q ∈ S {(q,0BB)}
F=Ф 2 q ∈ B {(q,0S), (q,1S), (q,0)}
3 q 0 0 {(q, ∈)}
4 q 1 1 {(q,∈)}
CFG to PDA
1 .Construct a PDA for the CFG
S🡪 0BB
B🡪0S | 1S |0
Test whether 0104 is in N(A).
2. Convert the grammar
S🡪aSb | A
A🡪bSa |S| ∈
To a PDA that accepts the same language by empty stack.
PDA to CFG
For the given PDA A = ({q0,q1}, {0,1}, {X, Zo}, δ, q0, Zo)
with δ as:
δ(q0,0,Z0) = (q0,XZo) (1a)
δ(q0,0,X) = (q0,XX) (1b)
δ(q0,1,X) = (q1, ∈) (2a)
δ(q1,1,X) = (q1, ∈) (2b)
δ(q1, ∈,X) = (q1, ∈) (2c)
δ(q1, ∈,Zo) = (q1, ∈) (2d)
We convert into CFG as follows
Σ = {0,1} (same as sigma from the PDA)
1. Define Start Symbol "S".
2. [pXq] where p& q are states in Q and X is from Г
V = (S, [q0,X,q0], [q0,X,q1], [q1,X,q0], [q1,X,q1], [q0,Zo,q0], [q0,Zo,q1],
[q1,Zo,q0], [q1,Zo,q1])
PDA to CFG
For the productions the rules are
1. For starting symbol S we write productions S 🡪 [q0,Zo,p]
E.g.
S 🡪 [q0,Zo,q0]
S 🡪 [q0,Zo,q1]
2. Now, We will derive forward, only building productions for variables that can be
derived.
Let δ(q0,a,X) = (r, y1y2y3y4…yk)
Then (q0 X rk) 🡪a [r y1 r1], [r1 y2 r2], [r2 y3 r3], ….. [rk-1 yk rk],
For all states r1,r2,……rk
δ(q0,a,X) = (r, ∈) then [q0 X r] 🡪 a
δ(q0, ∈,X) = (r, ∈) then [q0 X r] 🡪 ∈
PDA to CFG
Example:
PDA to CFG
Solution:
Applications
Context Free Grammar (CFG) is of great practical importance.
It is used for the following purposes-
• For defining programming languages
• For parsing the program by constructing a syntax tree
• For translation of programming languages
• For describing arithmetic expressions
• For construction of compilers
Thank You…!!!
80