Formal Languages and Automata
CH5 Simplification of Context-free Grammars and
Normal Forms
Learning Objectives
At the conclusion of the chapter, the student will be able to:
Simplify a context-free grammar by removing useless productions
Simplify a context-free grammar by removing -productions
Simplify a context-free grammar by removing unit-productions
Determine whether or not a context-free grammar is in Chomsky normal
form
Transform a context-free grammar into an equivalent grammar in Chomsky
normal form
Determine whether or not a context-free grammar is in Greibach normal
form
Transform a context-free grammar into an equivalent grammar in Greibach
normal form
2
Outline
1 Methods for Transforming Grammars
2 Two Important Normal Form
3 A Membership Algorithm for CFGs*
3
Theorem 6.1
Let G = (V, T, S, P) be a context-free grammar. Suppose that P contains a production of the form
A → x1Bx2
Assume that A and B are different variables and that
B → y1|y2| … |yn
is the set of all productions in P which have B as the left side.
Let Ĝ=(V, T, S, Ṗ) be the grammar in which Ṗ is constructed by deleting
A → x1Bx2
from P, and adding to it
A → x1y1x2|x1y2x2| … |x1ynx2
Then
L(Ĝ) = L(G)
4
Example 6.1
Consider G with following productions
A → a | aaA | abBc
B → abbA | b
Using the suggested substitution for the variable B, we get the grammar Ĝ
A → a | aaA | ababbAc | abbc
5
Useful Substitution Rules
Rule 1: Remove Nullable Variables
Rule 2: Remove Unit-Productions
Rule 3: Remove Useless Variables
6
Nullable Variables
− production : A→
Nullable Variable: A
7
Example 6.4
{a b : n 1}
n n
S → aS1b S → aS1b | ab
S1 → aS1b | S1 → aS1b | ab
8
Example 6.5
Find a CFG without λ-productions equivalent to the grammar G:
Grammar G
S → ABaC | BaC | AaC | ABa | aC | Aa | Ba | a
S → ABaC
A → B | C | BC
A → BC
B→b
B →b|
C→D
C → D|
D→d
D→d
A,B, and C are nullable variables
9
Unit-Productions
Unit Production: A→ B
(a single variable in both sides)
10
Removing Unit Productions
Observation:
A→ A
Is removed immediately
11
Example 6.6
Remove all unit-productions from
S → Aa | B
B → A | bb
A → a | bc | B
dependency graph
12
Example 6.6 S → Aa | B
B → A | bb
A → a | bc | B
S → Aa S → a | bc | bb S → a | bc | bb | Aa
B → bb B → a | bc B → a | bb | bc
A → a | bc A → bb A → a | bb | bc
Non-unit production New rules
dependency graph
13
Useless Productions
S → aSb
S →
S→A
A → aA Useless Production
Some derivations never terminate...
S A aA aaA aa aA
14
Another grammar:
S→A
A → aA
A→
B → bA Useless Production
Not reachable from S!
15
contains only
In general:
terminals
if S xAy w
w L(G )
then variable A is useful
otherwise, variable A is useless
16
A production A → x is useless
iff any of its variables is useless
S → aSb
S → Productions
Variables S→A useless
useless A → aA useless
useless B→C useless
useless C→D useless
17
Removing Useless Productions
Example 6.3:
Eliminate useless symbols and productions from
the grammar below:
S → aS | A | C
A→a
B → aa
C → aCb
First: find all variables that can produce
strings with only terminals
S → aS | A | C { A, B}
A→a
S → A
B → aa
C → aCb { A, B, S }
19
Keep only the variables
that produce terminal symbols: { A, B, S }
(the rest variables are useless)
S → aS | A | C
A→a S → aS | A
B → aa A→a
C → aCb B → aa
Remove useless productions
20
Second: Find all variables
reachable fromS
Dependency graph
•Vertex labeled with variable
•Edge (A, B) exists iff a production form
A → xBy
S → aS | A
A→a
S A B
B → aa
not
reachable
21
Keep only the variables
reachable from S
(the rest variables are useless)
Final Grammar
S → aS | A
S → aS | A
A→a
A→a
B → aa
Remove useless productions
22
Theorem 6.5
Let L be a CFL that does not contain λ. Then there exists a CFG that generates L
and that does not have any useless-, unit-, or λ-production.
S0 → S | λ
Which one needs to be removed first?
Remove all undesirable productions using the following sequence of steps:
Step 1: Remove λ-productions
Step 2: Remove unit-productions
Step 3: Remove useless-productions
23
Outline
1 Methods for Transforming Grammars
2 Two Important Normal Form
3 A Membership Algorithm for CFGs*
24
Chomsky Normal Form (CNF)
Each productions has form:
A → BC or A→a
variable variable terminal
Example 6.7
S → AS S → AS
S →a S → AAS
A → SA A → SA
A→b A → aa
Chomsky Not Chomsky
Normal Form Normal Form
26
Example 6.8
Convert the grammar with following productions to CNF:
S → ABa
A → aab
B → Ac
27
Introduce variables for terminals: Ta , Tb , Tc
S → ABTa
S → ABa A → TaTaTb
A → aab B → ATc
B → Ac Ta → a
Tb → b
Tc → c
28
Introduce intermediate variable: V1
S → AV1
S → ABTa
V1 → BTa
A → TaTaTb
A → TaTaTb
B → ATc
B → ATc
Ta → a
Ta → a
Tb → b
Tb → b
Tc → c
Tc → c
29
Introduce intermediate variable: V2
S → AV1
S → AV1 V1 → BTa
V1 → BTa A → TaV2
A → TaTaTb V2 → TaTb
B → ATc B → ATc
Ta → a Ta → a
Tb → b Tb → b
Tc → c Tc → c
30
S → AV1
Final grammar in Chomsky Normal Form:
V1 → BTa
A → TaV2
Initial grammar V2 → TaTb
S → ABa B → ATc
A → aab Ta → a
B → Ac Tb → b
Tc → c
31
Theorem 6.6
From any context-free grammar
(which doesn’t produce )
not in Chomsky Normal Form
we can obtain:
An equivalent grammar
in Chomsky Normal Form
32
The Procedure
First remove:
Nullable variables
Unit productions
33
Then, for every symbol a :
Add production Ta → a
In productions: replace a with Ta
New variable: Ta
34
Replace any production A → C1C2 Cn
with A → C1V1
V1 → C2V2
Vn−2 → Cn−1Cn
New intermediate variables: V1, V2 , ,Vn−2
35
Theorem: For any context-free grammar
(which doesn’t produce )
there is an equivalent grammar
in Chomsky Normal Form
36
Observations
• Chomsky normal forms are good
for parsing and proving theorems
• It is very easy to find the Chomsky normal
form for any context-free grammar
37
Greibach Normal Form
All productions have form:
A → a V1V2 Vk k 0
symbol variables
Prof. of UCLA(CS)
38
Examples:
S → cAB
S → abSb
A → aA | bB | b
S → aa
B→b
Greibach Not Greibach
Normal Form Normal Form
39
Example 6.9:
S → AB S → aAB | bBB | bB
A → aA | bB | b A → aA | bB | b
B→b B→b
40
Example 6.10:
S → aTb STb
S → abSb S → aTa
S → aa Ta → a
Tb → b
Greibach
Normal Form
41
Theorem 6.7:
For any context-free grammar
(which doesn’t produce )
there is an equivalent grammar
in Greibach Normal Form
42
Observations
• Greibach normal forms are very good
for parsing
recursive descent parsers
• It is hard to find the Greibach normal
form of any context-free grammar
43
Outline
1 Methods for Transforming Grammars
2 Two Important Normal Form
3 A Membership Algorithm for CFGs*
44
Membership Question:
for context-free grammar G
find if string w L(G )
Membership Algorithms: Parsers
• Exhaustive search parser: O(P2|w|+1)
• CYK parsing algorithm: O(|w|3)
45
The CYK Parser
J. Cocke
D. H. Younger
T. Kasami
The CYK Membership Algorithm
Input:
• Grammar G in Chomsky Normal Form
• String w
Output:
find if w L(G )
47
The Algorithm
Input example:
• Grammar G : S → AB
A → BB
A→a
B → AB
B→b
• String w: aabbb
48
aabbb (V15 )
a a b b b
aa ab bb bb
aab abb bbb
aabb abbb
aabbb
49
S → AB
A → BB a a b b b
A A B B B
A→a
aa ab bb bb
B → AB
B→b aab abb bbb
aabb abbb
aabbb
50
S → AB
a a b b b
A → BB A A B B B
A→a aa ab bb bb
B → AB S,B A A
aab abb bbb
B→b
aabb abbb
aabbb
51
S → AB a a b b b
A → BB A A B B B
A→a aa ab bb bb
B → AB S,B A A
B→b aab abb bbb
S,B A S,B
aabb abbb
A S,B
aabbb
S,B
52
Therefore: aabbb L(G )
3
Time Complexity: | w|
Observation: The CYK algorithm can be
easily converted to a parser
(bottom up parser)
53