0% found this document useful (0 votes)
20 views275 pages

TSR - Class Cd-Unit 2

The document outlines the course content for 'Compiler Design' at the School of Computing, detailing the roles and phases of compilers, including lexical and syntax analysis. It covers various parsing techniques, error handling, and context-free grammars, along with examples and notational conventions. The course is taught by Dr. T Saju Raj and is part of the Computer Science & Engineering program, offering 3 credits.

Uploaded by

amjayasoorya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
20 views275 pages

TSR - Class Cd-Unit 2

The document outlines the course content for 'Compiler Design' at the School of Computing, detailing the roles and phases of compilers, including lexical and syntax analysis. It covers various parsing techniques, error handling, and context-free grammars, along with examples and notational conventions. The course is taught by Dr. T Saju Raj and is part of the Computer Science & Engineering program, offering 3 credits.

Uploaded by

amjayasoorya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 275

School of Computing

Department of Computer Science & Engineering

10211CS107 – COMPILER DESIGN


Category : Program Core
Credit : 3

Course Handling Faculty :


Dr. T SAJU RAJ
Associate Professor
COURSE CONTENT
UNIT I Introduction to Compilers 9

Compilers, Analysis of the Source Program, The Phases of a Compiler, Cousins of


the Compiler, The Grouping of Phases, Compiler-Construction Tools. LEXICAL
ANALYSIS: Need and role of lexical analyzer-Lexical errors, Input Buffering -
Specification of Tokens, Recognition of Tokens, Design of a Lexical Analyzer
Generator.

1/28/2025 Dr.T SAJU RAJ Compiler Design 2


COURSE CONTENT
UNIT II Syntax Analysis 9

Need and role of the parser- Context Free Grammars-Top Down parsing –
Recursive Descent Parser - Predictive Parser - LL (1) Parser -Shift Reduce Parser -
LR Parser - LR (0) item - Construction of SLR Parsing table -Introduction to LALR
Parser, YACC- Design of a syntax analyser for a sample language

1/28/2025 Dr.T SAJU RAJ Compiler Design 3


Phases of Compiler

Tokens

Parse Tree

Annotated parse tree

Three Address Code

Optimized Code

Machine Code

Target Program
1/28/2025 Department of Computer Science and Engineering 4
Phases of Compiler

Error Controller
Symbol Table

Record
Information
(name,Functi Handle Errors
on,Interfaces
,etc)

1/28/2025
Target Program
Department of Computer Science and Engineering 5
Where Are We?

• Source code: if (b==0) a = “Hi”;

Lexical Analysis

• Token Stream: if( b == 0) a = “Hi”;

Syntactic Analysis

• Abstract Syntax Tree if


• (AST) Semantic Analysis
== = ;

b 0 a “Hi”
Do tokens conform to the language syntax?

1/28/2025 Department of Computer Science and Engineering 6


Roles of the Lexical analyzer
➢ Read input characters from the source program

➢ Helps to identify token into the symbol table

➢ Removes white spaces and comments from the source program

➢ Correlates error messages with the source program-Lexical errors

and Project
Management
(SEPM)

1/28/2025 Department of Computer Science and Engineering 7


Syntax Analysis

• Syntax analysis recognizes the syntactic structure of


the programming language and transforms a string
of tokens into a tree of tokens and syntactic
categories
• Parser is the program that performs syntax analysis

1/28/2025 Department of Computer Science and Engineering 8


Outline

• Role of parser
• Context free grammars
• Top down parsing
• Bottom up parsing
• Parser generators

1/28/2025 Department of Computer Science and Engineering 9


The role of parser

token
Source Lexical Parse tree Rest of Front Intermediate
Parser
program Analyzer End representation
getNext
Token

Symbol
table

1/28/2025 Department of Computer Science and Engineering 10


Parsing

• A.K.A. Syntax Analysis


– Recognize sentences in a language.
– Discover the structure of a document/program.
– Construct (implicitly or explicitly) a tree (called as a
parse tree) to represent the structure.
– The above tree is used later to guide translation.

1/28/2025 Department of Computer Science and Engineering 11


Parsing During Compilation
regular
expressions errors

lexical token rest of intermediate


source parser parse
representation
program analyzer get next tree front end
token

symbol
• Collecting token
table information
• Perform type checking
• uses a grammar to check structure of tokens • Intermediate code
generation
• produces a parse tree
• syntactic errors and recovery
• recognize correct syntax
• report errors
1/28/2025 Department of Computer Science and Engineering 12
Parsing Responsibilities
Syntax Error Identification / Handling
Recall typical error types:

• Syntactic : Omission, wrong order of tokens


• if ((x<1) & (y>5)))

• Semantic Incompatible Types, Unidentified ID’s


• if ((x<1) & (y>5.0)))

• Logical : Infinite Loop/ recursive call


• Should be <= not <

• Lexical : Misspellings
• if x<1 then y = 5:

1/28/2025 Department of Computer Science and Engineering 13


Error Reporting is Not a Trivial Task
• Difficult to generate clear and accurate error messages.
Example
function foo () {
...
if (...) {
...
} else {
...
Missing } here

...
}
<eof> Not detected until here
Example
int myVarr;
...
x = myVar; Misspelled ID here
...
Not detected until here
1/28/2025 Department of Computer Science and Engineering 14
Error Recovery
• After first error recovered
– Compiler must go on!
• Restore to some state and process the rest of the input

• Error-Correcting Compilers
– Issue an error message
– Fix the problem
– Produce an executable

Example
Error on line 23: “myVarr” undefined.
“myVar” was used.

1/28/2025 Department of Computer Science and Engineering 15


Error Recovery Trigger More Errors!
• Inadequate recovery may introduce more errors
– Those were not programmers errors

• Example:
int myVar flag ;
...
Declaration of flag is discarded
x := flag;
...
... Variable flag is undefined
while (flag==0)
... Variable flag is undefined

Too many Error message may be obscuring


– May bury the real message
– Remedy: allow 1 message per token or per statement
• Quit after a maximum (e.g. 100) number of errors
1/28/2025
• Department of Computer Science and Engineering 16
Panic Mode – E.R.A
• Discard tokens until we see a “synchronizing” token.

Example

Skip to next occurrence of


} end ;
Resume by parsing the next statement

• The key...
– Good set of synchronizing tokens
– Knowing what to do then
• Advantage
– Simple to implement Commonly used
– Does not go into infinite loop

• Disadvantage
– May skip over large sections of source with some errors
1/28/2025 Department of Computer Science and Engineering 17
Phrase-Level Recovery - ERA
• Compiler corrects the program
by deleting or inserting tokens
...so it can proceed to parse from where it was.

Example

while (x==4) y:= a + b

Insert do to fix the statement

• The key...
Don’t get into an infinite loop

1/28/2025 Department of Computer Science and Engineering 18


Context Free Grammars (CFG)
• A context free grammar is a formal model that consists of:
• Terminals
Keywords
Token Classes
Punctuation
• Non-terminals
Any symbol appearing on the lefthand side of any rule
• Start Symbol
Usually the non-terminal on the lefthand side of the first rule
• Rules (or “Productions”)
BNF: Backus-Naur Form / Backus-Normal Form
Stmt ::= if Expr then Stmt else Stmt

1/28/2025 Department of Computer Science and Engineering 19


Rule Alternative Notations

1/28/2025 Department of Computer Science and Engineering 20


Context Free Grammars : A First Look
assign_stmt → id := expr ;
expr → expr operator term
expr → term
term → id
term → real
term → integer
operator → +
operator → -

Derivation: A sequence of grammar rule applications and


substitutions that transform a starting non-term into a sequence
of terminals / tokens.

1/28/2025 Department of Computer Science and Engineering 21


Derivation
Let’s derive: id := id + real – integer ; using production:

assign_stmt → id := expr ; assign_stmt


→ id := expr ;
expr → expr operator term
→id := expr operator term;
expr → expr operator term
→id := expr operator term operator term;
expr → term
→ id := term operator term operator term;
term → id
→ id := id operator term operator term;
operator → +
→ id := id + term operator term;
term → real
→ id := id + real operator term;
operator → -
→ id := id + real - term;
term → integer → id := id + real - integer;

Department of Computer Science and Engineering 22


Arithmetic Expressions
expr → expr op expr
expr → ( expr )
expr → - expr
expr → id
op → + 9 Production rules
op → -
op → *
op → /
op → 
Terminals: id + - * /  ( )
Nonterminals: expr, op
Start symbol: expr
1/28/2025 Department of Computer Science and Engineering 23
Notational Conventions
• Terminals
– Lower-case letters early in the alphabet: a, b, c
– Operator symbols: +, -
– Punctuations symbols: parentheses, comma
– Boldface strings: id or if

• NonTerminals:
– Upper-case letters early in the alphabet: A, B, C
– The letter S (start symbol)
– Lower-case italic names: expr or stmt

• Upper-case letters late in the alphabet, such as X, Y, Z,


represent either nonterminals or terminals.
• Lower-case letters late in the alphabet, such as u, v, …, z,
represent strings of terminals.
1/28/2025 Department of Computer Science and Engineering 24
Notational Conventions
• Lower-case Greek letters, such as , , , represent strings of
grammar symbols. Thus A→  indicates that there is a single
nonterminal A on the left side of the production and a string of
grammar symbols  to the right of the arrow.

• If A→ 1, A→ 2, …., A→ k are all productions with A on the


left, we may write A→ 1| 2| …. | k

• Unless otherwise started, the left side of the first production is


the start symbol.

E → E A E | ( E ) | -E | id
A→+|-| *| / |
1/28/2025 Department of Computer Science and Engineering 25
Derivations

Doesn’t contain nonterminals

1/28/2025 Department of Computer Science and Engineering 26


Derivation

1/28/2025 Department of Computer Science and Engineering 27


Derivation

1/28/2025 Department of Computer Science and Engineering 28


Leftmost Derivation

1/28/2025 Department of Computer Science and Engineering 29


Rightmost Derivation

1/28/2025 Department of Computer Science and Engineering 30


Parse Tree

1/28/2025 Department of Computer Science and Engineering 31


Parse Tree

1/28/2025 Department of Computer Science and Engineering 32


Parse Tree

1/28/2025 Department of Computer Science and Engineering 33


Parse Tree

1/28/2025 Department of Computer Science and Engineering 34


Ambiguous Grammar

1/28/2025 Department of Computer Science and Engineering 35


Ambiguous Grammar

• More than one Parse Tree for some sentence.


– The grammar for a programming language may be
ambiguous
– Need to modify it for parsing.

• Also: Grammar may be left recursive.


• Need to modify it for parsing.

1/28/2025 Department of Computer Science and Engineering 36


Elimination of Ambiguity

• Ambiguous

• A Grammar is ambiguous if there are multiple parse


trees for the same sentence.
• Disambiguation

• Express Preference for one parse tree over others


– Add disambiguating rule into thegrammar

1/28/2025 Department of Computer Science and Engineering 37


Parser

Parsing means determining the syntactic


structure of an expression

and Project
Management
(SEPM)

1/28/2025 Department of Computer Science and Engineering 38


Example of Parsing

 X=(A-(D*F))

1/28/2025 Department of Computer Science and Engineering 39


Types of Parser

and Project
Management
(SEPM)

1/28/2025 Department of Computer Science and Engineering 40


Top Down Parser Vs Bottom Up Parser

Top-down Bottom-up
Parsing Parsing

Initiates from
Initiates from Leaves
Root

Production is used to derive


Starts from the token and then go
and check the similarity in the
to the start symbol.
string.

Type of derivation Leftmost Type of derivation Rightmost


derivation derivation
and Project
Management
(SEPM)

1/28/2025 Department of Computer Science and Engineering 41


Top-down parsers

 Top-down parsing expands a parse tree from the start


symbol to the leaves
 Always expand the leftmost non-terminal

id + id * id
1/28/2025 Department of Computer Science and Engineering 42
Top-down parsers

Apply E-> E+E E

id + id * id
1/28/2025 Department of Computer Science and Engineering 43
Top-down parsers

Apply E-> id E

id + id * id
1/28/2025 Department of Computer Science and Engineering 44
Top-down parsers

Apply E-> E*E E

E
E E

id + id * id
1/28/2025 Department of Computer Science and Engineering 45
Top-down parsers

Apply E-> id E

E
E E

id + id * id
1/28/2025 Department of Computer Science and Engineering 46
Top-down parsers

Apply E-> id E

E
E E

id + id * id
1/28/2025 Department of Computer Science and Engineering 47
Bottom-up Parser

 Start at the leaves and grow toward root

 And just as efficient


 Builds on ideas in top-down parsing
 Preferred method in practice

 Also called LR parsing


 L means that tokens are read left to right
 R means that it constructs a rightmost derivation -->

1/28/2025 Department of Computer Science and Engineering 48


Bottom-up Parser

1/28/2025 Department of Computer Science and Engineering 49


Example of Bottom-up Parser
14

int + (int) + (int)

int + ( int ) + ( int )

1/28/2025 Department of Computer Science and Engineering 50


Example of Bottom-up Parser
15

int + (int) + (int)


E + (int) + (int)

int + ( int ) + ( int )

1/28/2025 Department of Computer Science and Engineering 51


Example of Bottom-up Parser
16

int + (int) + (int)


E + (int) + (int)
E + (E) + (int)

E E

int + ( int ) + ( int )

1/28/2025 Department of Computer Science and Engineering 52


Example of Bottom-up Parser
17

int + (int) + (int)


E + (int) + (int)
E + (E) + (int)
E + (int)

E E

int + ( int ) + ( int )

1/28/2025 Department of Computer Science and Engineering 53


Example of Bottom-up Parse
18

int + (int) + (int)


E + (int) + (int)
E + (E) + (int)
E + (int)
E + (E)
E

E E E

int + ( int ) + ( int )

1/28/2025 Department of Computer Science and Engineering 54


Example of Bottom-up Parse
19

int + (int) + (int) E


E + (int) + (int)
E + (E) + (int)
E + (int)
E + (E)
E E

E E E

int + ( int ) + ( int )

1/28/2025 Department of Computer Science and Engineering 55


Classification of Parsers
Parsers
Top Down Bottom Up Parsers
Parsers (Shift Reduce Parsers)

with full without Operator Precedence


LR Parsers
Backtracking Backtracking Parsers

Brute Force Recursive


Method LR (0)
Descent Parsers

Non recursive Descent Simple LR or


Parsers or LL(1) SLR(1)

Canonical LR or
CLR(1) or LR(1)

LALR(1)
1/28/2025 Department of Computer Science and Engineering 56
TOP DOWN PARSING

➢ Find a left-most derivation


➢ Find (build) a parse tree
➢ St a r t building from the root and work down…
➢ As we search for a derivation
➢ Must make choices:
➢ Which rule to use
➢ Where to use it

➢ May r u n into problems!!

18

1/28/2025 Department of Computer Science and Engineering 57


TOP-DOWN PARSING
Recursive-Descent Parsing
Backtracking is needed (If a choice of a production rule
does not work, we backtrack to try other alternatives.)
It is a general parsing technique, but not widely used.
Not efficient
Predictive Parsing
No backtracking
Efficient
Needs a special form of grammars (LL(1) grammars).
Recursive Predictive Parsing is a special form of
Recursive Descent parsing without backtracking.
Non-Recursive (Table Driven) Predictive Parser is also
known as LL(1) parser.
19

1/28/2025 Department of Computer Science and Engineering 58


EXAMPLE
Consider the grammar S → cAd
A→ab|a and the input string w = cad.
To construct a parse tree for this string using top- down approach,
Step 1:
Initially create a tree with single node labeled S. An input pointer points to
‘c’, the first symbol of w. Expand the tree with the production of S.

c A d
28-01-2025 Department of Computer Science and Engineering 59
EXAMPLE
Step2:
➢ The leftmost leaf ‘c’ matches the first symbol of w, so advance the
input pointer to the second symbol of w ‘a’ and consider the next leaf
‘A’.
➢ Expand A using the first alternative.

c A d

a b

28-01-2025 Department of Computer Science and Engineering 60


EXAMPLE

Step3:

➢ The second symbol ‘a’ of w also matches with second leaf of tree.

➢ So advance the input pointer to third symbol of w ‘d’.

➢ But the third leaf of tree is b which does not match with the input

symbol d.

➢ Hence discard the chosen production and reset the pointer to second

position. This is called backtracking.

28-01-2025 Department of Computer Science and Engineering 61


EXAMPLE
Step4:
Now try the second alternative for A.
S

c A d

Now we can halt and announce the successful completion of parsing.

28-01-2025 Department of Computer Science and Engineering 62


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 63


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 64


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 65


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 66


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 67


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 68


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 69


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 70


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 71


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 72


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 73


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 74


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 75


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 76


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 77


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

1/28/2025 Department of Computer Science and Engineering 78


R E C U R S I V E D E S C E N T PA R S I N G
(BACKTRACKING)

36
Successfully parsed!!
1/28/2025 Department of Computer Science and Engineering 79
R E C U R S I V E D E S C E N T PA R S I N G - Algorithm

A recursive-descent parsing program consists of a set of


procedures – one for each non-terminal
Execution begins with the procedure for the s ta r t symbol
Announces success if the procedure body scans the entire input

void A(){
for (j=1 to t){ /* assume there is t number of A-productions */
Choose a A-production, Aj→X1X2…Xk;
for (i=1 to k){
if (Xi is a non-terminal)
call procedure Xi();
else if (Xi equals the current input symbol a)
advance the input to the next symbol;
else backtrack in input and reset the pointer
}
} 37
}
1/28/2025 Department of Computer Science and Engineering 80
PREDICTIVE PARSING

8
1

1/28/2025 Department of Computer Science and Engineering 81


PREDICTIVE PARSING
 LL(1) Grammars Left Recursion- X
 Can do predictive parsing Non determinism- X
 Can select the right rule
 Looking at only the next 1 input symbol
 First L : Left to Right Scanning
 Second L: Leftmost derivation
 1 : one input symbol look-ahead for predictive decision

 LL(k) Grammars
 Can do predictive parsing
 Can select the right rule
 Looking at only the next k input symbols

 Techniques to modify the grammar:


 Left Factoring
 Removal of Left Recursion

 LL(k) Language 5
 Can be described with an LL(k) grammar
1/28/2025 Department of Computer Science and Engineering 82
FIRST FUNCTION

FIRST(E) = {(, id}


FIRST(E) = {+, }
FIRST(T) = {(, id}
• FIRST (E) = FIRST(T) = {*, }
• FIRST (E’) = FIRST(F) = {(, id}}
• FIRST (T) = 44
• FIRST (T’) =
1/28/2025 • FIRST (F) =Department of Computer Science and Engineering 83
COMPUTING FOLLOW(A)

1. If A is s t a r t symbol, put $ in FOLLOW(A)


2. Productions of the form B →  A ,
Add FIRST() – {} to FOLLOW(A)
3. Productions of the form B →  A or
B →  A  wh ere  * 
Add FOLLOW(B) to FOLLOW(A)

1/28/2025 Department of Computer Science and Engineering 84


EXAMPLE
FOLLOW(E) = {$}
FOLLOW(E) =
FOLLOW(T) =
FOLLOW(T) =
FOLLOW(F) =

FIRST(E) = {(, id} Using rule #1


FIRST(E) = {+, }
FIRST(T) = {(, id} 1. If A is start symbol, put $ in FOLLOW(A)
FIRST(T) = {*, }
FIRST(F) = {(, id}} 51
Assume the first non-terminal is the start symbol
1/28/2025 Department of Computer Science and Engineering 85
EXAMPLE
FOLLOW(E) = {$, )}
FOLLOW(E) =
FOLLOW(T) = {+}
FOLLOW(T) =
FOLLOW(F) = {*}

FIRST(E) = {(, id}


FIRST(E) = {+, } Using rule #2
FIRST(T) = {(, id} 2. Productions of the form B →  A ,
Add FIRST() – {} to FOLLOW(A)
FIRST(T) = {*, }
FIRST(F) = {(, id}}
52

1/28/2025 Department of Computer Science and Engineering 86


EXAMPLE
FOLLOW(E) = {$, )}
FOLLOW(E) = FOLLOW(E)
= {$, )}
FOLLOW(T) = {+}  FOLLOW(E)
= {+, $, )}
FOLLOW(T) = FOLLOW(T)
= {+, $, )}
FIRST(E) = {(, id}
FOLLOW(F) = {*}  FOLLOW(T)
FIRST(E) = {+, } = {*, +, $, )}
FIRST(T) = {(, id}
Using rule #3
FIRST(T) = {*, }
3. Productions of the form B →  A or
FIRST(F) = {(, id}}
B →  A  where  * 
53
Add FOLLOW(B) to FOLLOW(A)
1/28/2025 Department of Computer Science and Engineering 87
EXAMPLE
 S → ( A) | 
 A→TE
 E→&TE|
 T→(A)|a|b|c
 FOLLOW(S) ={$}
 FIRST(T) ={(,a,b,c}  FOLLOW(A) = { )}
 FIRST(E) = {&, }  FOLLOW(E) = FOLLOW(A) = { )}
 FIRST(A) ={(,a,b,c}  FOLLOW(T) = FIRST(E)  FOLLOW(E)
 FIRST(S) = {(,}
= {&, )}

55

1/28/2025 Department of Computer Science and Engineering 88


EXAMPLE


S→aSe|B  FIRST(C) =
FOLLOW(C) =
 B→bBCf|C  FIRST(B) =
 C→cCg|d|   FIRST(S) = FOLLOW(B) =

FOLLOW(S) = {$}

Assume the first non-terminal is the start symbol


1. If A is start symbol, put $ in FOLLOW(A)
2. Productions of the form B →  A , Add FIRST() – {} to FOLLOW(A)
3. Productions of the form B →  A or B →  A  where  * 
Add FOLLOW(B) to FOLLOW(A)
56

1/28/2025 Department of Computer Science and Engineering 89


EXAMPLE

 S→aSe|B  FOLLOW(C) =
 B→bBCf|C {f,g}  FOLLOW(B)
 C→cCg|d| 
= {c,d,e,f,g,$}
 FIRST(C) = {c,d,}  FOLLOW(B) =
 FIRST(B) = {b,c,d,} {c,d}  FOLLOW(S)
 FIRST(S) = {a,b,c,d,}
= {c,d,e,$}
 FOLLOW(S) = {$, e }

57

1/28/2025 Department of Computer Science and Engineering 90


FOLLOW EXAMPLE
 S→aSe|B  FOLLOW(C) =
 B→bBCf|C
 C→cCg|d|   FOLLOW(B) =

 FIRST(C) =  FOLLOW(S) = {$}


 FIRST(B) = Assume the first non-terminal is the start symbol
 FIRST(S) = 1. If A is start symbol, put $ in FOLLOW(A)
2. Productions of the form B →  A
, Add FIRST() – {} to
FOLLOW(A)
3. Productions of the form B→  A or
B →  A  where  * 
Add FOLLOW(B) to FOLLOW(A) 2

1/28/2025 Department of Computer Science and Engineering 91


FOLLOW EXAMPLE

 S→aSe|B  FOLLOW(C) =
 B→bBCf|C {f,g}  FOLLOW(B)
 C→cCg|d| 
= {c,d,e,f,g,$}
 FIRST(C) = {c,d,}  FOLLOW(B) =
 FIRST(B) = {b,c,d,} {c,d}  FOLLOW(S)
 FIRST(S) = {a,b,c,d,}
= {c,d,e,$}
 FOLLOW(S) = {$, e }

9
2

1/28/2025 Department of Computer Science and Engineering 92


FOLLOW EXAMPLE
 S→ aSe|B  FOLLOW(C) =
 B→ bBCf|C {f,g}  FOLLOW(B)
 C → c C g | d |
= {c,d,e,f,g,$}
 FIRST(C) = {c,d,}  FOLLOW(B) =
 FIRST(B) = {b,c,d,} {c,d}  FOLLOW(S)
 FIRST(S) = {a,b,c,d,}
= {c,d,e,$}
 FOLLOW(S) = {$, e }

9
3

1/28/2025 Department of Computer Science and Engineering 93


PREDICTIVE PARSING

9
4

1/28/2025 Department of Computer Science and Engineering 94


PREDICTIVE PARSING
 LL(1) Grammars
 Can do predictive parsing
 Can select the right rule
 Looking at only the next 1 input symbol
 First L : Left to Right Scanning
 Second L: Leftmost derivation
 1 : one input symbol look-ahead for predictive decision

 LL(k) Grammars
 Can do predictive parsing
 Can select the right rule
 Looking at only the next k input symbols

 Techniques to modify the grammar:


 Left Factoring
 Removal of Left Recursion

 LL(k) Language 5
 Can be described with an LL(k) grammar
1/28/2025 Department of Computer Science and Engineering 95
TABLE DRIVEN PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 96


PARSE TABLE CONSTRUCTION

1/28/2025 Department of Computer Science and Engineering 97


TABLE DRIVEN PREDICTIVE
PARSING

1/28/2025 Department of Computer Science and Engineering 98


TABLE DRIVEN PREDICTIVE
PARSING

1/28/2025 Department of Computer Science and Engineering 99


FOLLOW (DO IT YOURSELF)

DIY!!!!!!!!!!!

First & Follow of the Given G

1
0
0
1/28/2025 Department of Computer Science and Engineering 100
FIRST AND FOLLOW

1/28/2025 Department of Computer Science and Engineering 101


FIRST AND FOLLOW

1/28/2025 Department of Computer Science and Engineering 102


FIRST AND FOLLOW

1/28/2025 Department of Computer Science and Engineering 103


FIRST AND FOLLOW

1/28/2025 Department of Computer Science and Engineering 104


FIRST AND FOLLOW

1/28/2025 Department of Computer Science and Engineering 105


FIRST AND FOLLOW

1/28/2025 Department of Computer Science and Engineering 106


FIRST AND FOLLOW

1/28/2025 Department of Computer Science and Engineering 107


FIRST AND FOLLOW

1/28/2025 Department of Computer Science and Engineering 108


PREDICTIVE PARSING ALGORITHM

1/28/2025 Department of Computer Science and Engineering 109


PREDICTIVE PARSING ALGORITHM

Production Rule First Follow

E → TE’ { (, id } { $, )}

E’ → +TE’ | ϵ { +, ϵ } {$, ) }

T → FT’ {(, id } {+, $, )}

T’→*FT’ | ϵ { *, ϵ} {+, $, )}

F → (E)| id { (, id} { *, +, $, )}

1/28/2025 Department of Computer Science and Engineering 110


PREDICTIVE PARSING

Production First Follow


Rule

E → TE’ { (, id } { $, )}

E’ → +TE’ | ϵ { +, ϵ } {$, ) }

T → FT’ {(, id } {+, $, )}

T’→*FT’ | ϵ { *, ϵ} {+, $, )}

F → € | id { (, id} { *, +, $, )}

1/28/2025 Department of Computer Science and Engineering 111


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 112


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 113


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 114


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 115


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 116


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 117


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 118


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 119


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 120


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 121


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 122


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 123


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 124


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 125


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 126


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 127


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 128


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 129


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 130


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 131


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 132


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 133


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 134


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 135


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 136


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 137


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 138


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 139


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 140


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 141


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 142


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 143


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 144


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 145


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 146


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 147


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 148


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 149


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 150


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 151


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 152


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 153


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 154


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 155


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 156


PREDICTIVE PARSING

1/28/2025 Department of Computer Science and Engineering 157


RECONSTRUCTING THE PARSE TREE

1/28/2025 Department of Computer Science and Engineering 158


RECONSTRUCTING THE PARSE TREE

1/28/2025 Department of Computer Science and Engineering 159


RECONSTRUCTING THE PARSE TREE

1/28/2025 Department of Computer Science and Engineering 160


EXAMPLE: THE “DANGLING ELSE”

GRAMMAR

1/28/2025 Department of Computer Science and Engineering 161


EXAMPLE: THE “DANGLING ELSE”

GRAMMAR

1/28/2025 Department of Computer Science and Engineering 162


EXAMPLE: THE “DANGLING ELSE”
GRAMMAR

1/28/2025 Department of Computer Science and Engineering 163


EXAMPLE: THE “DANGLING ELSE”

1/28/2025 Department of Computer Science and Engineering 164


EXAMPLE: THE “DANGLING ELSE”

1/28/2025 Department of Computer Science and Engineering 165


EXAMPLE: THE “DANGLING ELSE”

1/28/2025 Department of Computer Science and Engineering 166


EXAMPLE: THE “DANGLING ELSE”

1/28/2025 Department of Computer Science and Engineering 167


EXAMPLE: THE “DANGLING ELSE”

1/28/2025 Department of Computer Science and Engineering 168


EXAMPLE: THE “DANGLING ELSE”

1/28/2025 Department of Computer Science and Engineering 169


EXAMPLE: THE “DANGLING ELSE”

1/28/2025 Department of Computer Science and Engineering 170


EXAMPLE: THE “DANGLING ELSE”

1/28/2025 Department of Computer Science and Engineering 171


LL(1) GRAMMAR
LL(1) Grammar
 LL(1) grammars
 Are never ambiguous.
 Will never have left recursion.

 Furthermore...
 If we are looking for an “A” and the next symbol is
“b”,
 Then only one production must be possible
 Although elimination of left recursion and left
factoring is easy.
 Some grammar will never be a LL(1) grammar. 72

1/28/2025 Department of Computer Science and Engineering 172


PROPERTIES OF LL(1) GRAMMAR

 A grammar G is LL(1) if an only if whenever A


→   are two distinct productions of G the
following conditions hold:
1. For no terminal a do both  and  derive strings
beginning with a.
2. At most one of  and  can derive the empty string.
3. If then  *   then  does not derive any string
beginning with a terminal in FOLLOW(A).

1/28/2025 Department of Computer Science and Engineering 173


ERROR RECOVERY
When Do Errors Occur? Recall Predictive Parser Function:

a + b $ Input

Stack X Predictive Parsing Output


Program
Y
Z
Parsing Table
$ M[A,a]

1. If X is a terminal and it doesn’t match input.


2. If M[X, Input] is empty – No allowable actions 75

1/28/2025 Department of Computer Science and Engineering 174


ERROR RECOVERY

76

1/28/2025 Department of Computer Science and Engineering 175


ERROR RECOVERY: SKIP INPUT SYMBOLS

1/28/2025 Department of Computer Science and Engineering 176


ERROR RECOVERY: POP THE STACK

78

1/28/2025 Department of Computer Science and Engineering 177


PREDICTIVE PARSING

Production First Follow


Rule

E → TE’ { (, id } { $, )}

E’ → +TE’ | ϵ { +, ϵ } {$, ) }

T → FT’ {(, id } {+, $, )}

T’→*FT’ | ϵ { *, ϵ} {+, $, )}

F → € | id { (, id} { *, +, $, )}

synch synch

synch synch synch

synch synch synch synch

1/28/2025 Department of Computer Science and Engineering 178


ERROR RECOVERY: PANIC MODE

1/28/2025 Department of Computer Science and Engineering 179


ERROR RECOVERY - TABLE ENTRIES
5. Phrase Mode Error Recovery

1/28/2025 Department of Computer Science and Engineering 180


BOTTOM-UP PARSING

1/28/2025 Department of Computer Science and Engineering 181


RIGHTMOST DERIVATION IN REVERSE

1/28/2025 Department of Computer Science and Engineering 182


RIGHTMOST DERIVATION IN REVERSE

LR parsing corresponds to rightmost derivation in reverse


1/28/2025 Department of Computer Science and Engineering 183
REDUCTION

• A reduction step replaces a specific substring (matching the body of a production)

• Reduction is the opposite of derivation


• Bottom up parsing is a process of reducing a string  to the start symbol S of the
grammar

1/28/2025 Department of Computer Science and Engineering 184


HANDLE

1/28/2025 Department of Computer Science and Engineering 185


HANDLE PRUNING
Grammar
Reduction of id + id * id
to start symbol E

Rightmost Derivation for


id + id * id

1/28/2025 Department of Computer Science and Engineering 186


SHIFT-REDUCE PARSING

1/28/2025 Department of Computer Science and Engineering 187


SHIFT-REDUCE PARSING

• It takes the given input string and builds a parse tree


• Starting from the bottom at the leaves.
• And growing the tree towards the top to the root.

1/28/2025 Department of Computer Science and Engineering 188


SHIFT-REDUCE PARSING

Data Structures-

Two data structures are required to implement a shift-


reduce parser
- A Stack is required to hold the grammar symbols.
- An Input buffer is required to hold the string to be
parsed.

1/28/2025 Department of Computer Science and Engineering 189


SHIFT-REDUCE PARSING
Working

Initially, shift-reduce parser is present in the following


configuration where-
-- Stack contains only the $ symbol.
-- Input buffer contains the input string with $ at its end.

1/28/2025 Department of Computer Science and Engineering 190


SHIFT-REDUCE PARSING

The parser works by


- Moving the input symbols on the top of the stack.
- Until a handle β appears on the top of the stack.

The parser keeps on repeating this cycle until


- An error is detected.
- Or stack is left with only the start symbol and the input buffer
becomes empty.

1/28/2025 Department of Computer Science and Engineering 191


SHIFT-REDUCE PARSING

After achieving this configuration,


- The parser stops / halts.
- It reports the successful completion of parsing.

1/28/2025 Department of Computer Science and Engineering 192


SHIFT-REDUCE PARSING
Possible Actions : A shift-reduce parser can possibly make
the following four actions-

1. Shift-

In a shift action,
The next symbol is shifted onto the top of the stack.

2. Reduce-

In a reduce action,
The handle appearing on the stack top is replaced with the
appropriate non-terminal symbol.
1/28/2025 Department of Computer Science and Engineering 193
SHIFT-REDUCE PARSING
3. Accept
In an accept action,
The parser reports the successful completion of parsing.

4. Error

In this state,
The parser becomes confused and is not able to make any
decision.
It can neither perform shift action nor reduce action nor accept
action.

1/28/2025 Department of Computer Science and Engineering 194


SHIFT-REDUCE PARSING

Rules To Remember

- It is important to remember the following rules while


performing the shift-reduce action-
- If the priority of incoming operator is more than the priority of
in stack operator, then shift action is performed.
- If the priority of incoming operator is same or less than the
priority of in stack operator, then reduce action is performed.

1/28/2025 Department of Computer Science and Engineering 195


SHIFT-REDUCE PARSING
Consider the following grammar-
E→E–E
E→ExE
E → id
Parse the input string id – id x id using a shift-reduce parser.

Solution-

The priority order is: id > x > –


1/28/2025 Department of Computer Science and Engineering 196
SHIFT-REDUCE PARSING
E→E–E
E→ExE
E → id
Stack Input Buffer Parsing Action
$ id – id x id $ Shift
$ id – id x id $ Reduce E → id
$E – id x id $ Shift
$E– id x id $ Shift
$ E – id x id $ Reduce E → id
$E–E x id $ Shift
$E–Ex id $ Shift
$ E – E x id $ Reduce E → id
$E–ExE $ Reduce E → E x E
$E–E $ Reduce E → E – E
$E $ Accept

1/28/2025 Department of Computer Science and Engineering 197


SHIFT-REDUCE PARSING
S→aTRe Remaining input: abbcde
T→Tbc|b
R→d

Rightmost derivation:
S ➔ aTRe
➔ aTde
➔ a Tb c de
➔abbcde

1/28/2025 Department of Computer Science and Engineering 198


SHIFT-REDUCE PARSING
S→aTRe Remaining input: bcde
T→Tbc|b
R→d

➔Shift a, Shift b

a b Rightmost derivation:
S ➔ aTRe
➔ aTde
➔ a Tb c de
➔abbcde

1/28/2025 Department of Computer Science and Engineering 199


SHIFT-REDUCE PARSING
S→aTRe Remaining input: bcde
T→Tbc|b
R→d

➔Shift a, Shift b
➔Reduce T → b
T

a b
Rightmost derivation:
S ➔ aTRe
➔ aTde
➔ a T b c de
➔ ab b c de

1/28/2025 Department of Computer Science and Engineering 200


SHIFT-REDUCE PARSING
S→aTRe Remaining input: de
T→Tbc|b
R→d

➔Shift a, Shift b
➔Reduce T → b
T
➔Shift b, Shift c
a b b c
Rightmost derivation:
S ➔ aTRe
➔ aTde
➔ a T b c de
➔ ab b c de

1/28/2025 Department of Computer Science and Engineering 201


SHIFT-REDUCE PARSING
S→aTRe Remaining input: de
T→Tbc|b
R→d

➔Shift a, Shift b T
➔Reduce T → b
T
➔Shift b, Shift c
➔Reduce T → T b c
a b b c
Rightmost derivation:
S ➔ aTRe
➔ aTde
➔ a T b c de
➔ ab b c de

1/28/2025 Department of Computer Science and Engineering 202


SHIFT-REDUCE PARSING
•S → a T R e T Remaining input: e

→Tbc|b R
→d T

T
• ➔Shift a, Shift b
• ➔Reduce T → b a b b c d
Rightmost derivation:
• ➔Shift b, Shift c S ➔ aTRe
• ➔Reduce T → T b c ➔ aTde

• ➔Shift d ➔ a T b c de
➔ ab b c de

1/28/2025 Department of Computer Science and Engineering 203


SHIFT-REDUCE PARSING
•S → a T R e T Remaining input: e

→Tbc|b R
→d T

T R
• ➔Shift a, Shift b
• ➔Reduce T → b a b b c d
Rightmost derivation:
• ➔Shift b, Shift c S ➔ aTRe
• ➔Reduce T → T b c ➔ aTde

• ➔Shift d ➔ a T b c de
➔ ab b c de
• ➔Reduce R → d

1/28/2025 Department of Computer Science and Engineering 204


SHIFT-REDUCE PARSING
•S → a T R e T Remaining input:

→Tbc|b R
→d T

T R
• ➔Shift a, Shift b
• ➔Reduce T → b a b b c d e
Rightmost derivation:
• ➔Shift b, Shift c S ➔ aTRe
• ➔Reduce T → T b c ➔ aTde

• ➔Shift d ➔ a T b c de
➔ ab b c de
• ➔Reduce R → d
• ➔Shift e

1/28/2025 Department of Computer Science and Engineering 205


SHIFT-REDUCE PARSING
S→aTRe Remaining input:
T→Tbc|b
R→d S

➔Shift a, Shift b T
➔Reduce T → b
T R
➔Shift b, Shift c
➔Reduce T → T b c
➔Shift d a b b c d e
Rightmost derivation:
➔Reduce R → d S ➔ aTRe
➔Shift e
➔ aTde
➔Reduce S → a T R e ➔ a T b c de
➔ ab b c de

1/28/2025 Department of Computer Science and Engineering 206


SHIFT-REDUCE PARSING - EXAMPLE
Consider the grammar:

Stack Input Action

$ id1 + id2$ shift


$id1 + id2$ reduce 6

$F + id2$ reduce4
$T + id2$ reduce2
$E + id2$ shift
$E + id2$ shift
$E + id2 reduce6
$E + F reduce 4
$E + T reduce 1
$E accept

1/28/2025 Department of Computer Science and Engineering 207


SHIFT-REDUCE PARSING
• Handle will always appear on Top of stack, never
inside
• Possible forms of two successive steps in any
rightmost derivation
STACK INPUT
• CASE 1:
S
$ yz$
A
After Reducing the
yz$
B handle
  y $B
S * Az  Byz 
yz z Shifting from Input z$
rm rm rm
$By
Reduce the handle

$A z$

1/28/2025 Department of Computer Science and Engineering 208


SHIFT-REDUCE PARSING
STACK INPUT
• Case 2: $ xyz$
S After Reducing the handle
B A
$B xyz$
  x y z
Shifting from Input
S * BxAz  Bxyz 
xyz $Bxy z$
rm rm rm
Reducing the handle
$BxA
z$

1/28/2025 Department of Computer Science and Engineering 209


SHIFT-REDUCE CONFLICT
1. stmt → id(parameter_list)
2. stmt → expr:=expr
3. parameter_list → parameter_list,
parameter
4. parameter_list → parameter
5. parameter_list → id
6. expr → id(expr_list)
7. expr → id
8. expr_list → expr_list, expr
9. expr_list → expr

STACK INPUT
….id ( id , id ) …$

 We can’t decide which production will be used to


reduce id?

1/28/2025 Department of Computer Science and Engineering 210


SHIFT-REDUCE PARSERS

1. Operator-Precedence Parser
– simple, but only a small class of grammars.
CFG
LR SLR LALR
2. LR-Parsers

– covers wide range of grammars.


• SLR – simple LR parser
• LR – most general LR parser
• LALR – intermediate LR parser (lookhead LR parser)
– SLR, LR and LALR work same, only their parsing tables are different.

1/28/2025 Department of Computer Science and Engineering 211


LR PARSERS

LR parsing is attractive because:


– LR parsing is most general non-backtracking shift-reduce
parsing,
yet it is still efficient.
– The class of grammars that can be parsed using LR methods is
a proper superset of the class of grammars that can be parsed
with predictive parsers.
LL(1)-Grammars  LR(1)-Grammars
– An LR-parser can detect a syntactic error as soon as it is possible
to do so a left-to-right scan of the input.
– LR parsers can be constructed to recognize virtually all programming
language constructs for which CFG grammars canbe written

Drawback of LR method:
– Too much work to construct LR parser by hand
• Fortunately tools (LR parsers generators) are available

1/28/2025 Department of Computer Science and Engineering 212


LL VS. LR

 LR (shift reduce) is more powerful than LL


(predictive parsing)

 Can detect a syntactic error as soon as possible.

 LR is difficult to do by hand (unlike LL)

1/28/2025 Department of Computer Science and Engineering 213


LR PARSING ALGORITHM

input a1 ... ai ... an $


stack
Sm
Xm
Sm-1 LR Parsing
Xm-1 Algorithm Output
.
.
S1 Action Table Goto Table
X1 terminals and $ non-terminal
S0 s four different s
t actions t each item is
a a a state number
t t
e e
s s

1/28/2025 Department of Computer Science and Engineering 214


CONFIGURATION OF LR PARSING

( So X1 S1 ... Xm Sm, ai ai+1 ... an $ )

Stack Rest of Input

• Sm and ai decides the parser action by consulting the


parsing action table. (Initial Stack
contains just So)

• A configuration of a LR parsing represents the right


sentential form:
X1 ... Xm ai ai+1 ... an $

1/28/2025 Department of Computer Science and Engineering 215


ACTIONS OF A LR-PARSER

1. shift s -- shifts the next input symbol and the state s onto the stack

( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) € ( So X1


S1 ... Xm Sm ai s, ai+1 ... an $ )

2. reduce A→ (or rN where N is a production number)


– pop 2|| (=r) items from the stack;

– then push Aand s where s=goto[sm-r,A]


( So X1 S1 ... Xm Sm, ai ai+1 ... an $ ) € ( So X1 S1 ... Xm-r
Sm-r A s, ai ... an $ )
– Output is the reducing production reduce A→

Accept – Parsing successfully completed

Error -- Parser detected an error (an empty entry in the action table)

1/28/2025 Department of Computer Science and Engineering 216


ACTIONS OF A LR-PARSER
Reduce Action

• pop 2||(=r)items from the stack;

• let us assume that  = Y1Y2...Yr


• then push A and s where s=goto[sm-r,A]
( So X1 S1 ... Xm-r Sm-r Y1 Sm-r ...Yr Sm, ai ai+1 ... an $ )
€ ( So X1 S1 ... Xm-r Sm-r A s, ai ... an $ )

• In fact, Y1Y2...Yr is a handle.


X1 ... Xm-r A ai ... an $ c X1 ... Xm Y1...Yr ai ai+1 ...
an $

1/28/2025 Department of Computer Science and Engineering 217


LR PARSER STACK(S)

1/28/2025 Department of Computer Science and Engineering 218


LR PARSER STACK(S)

1/28/2025 Department of Computer Science and Engineering 219


CONSTRUCTING SLR PARSING TABLES – LR(0) ITEM

• An item indicates how much of a production we have seen at a given point in the
parsing process
• For Example the item A → X • YZ
– We have already seen on the input a string derivable from X
– We hope to see a string derivable from YZ
• For Example the item A → • XYZ
– We hope to see a string derivable from XYZ
• For Example the item A → XYZ •
– We have already seen on the input a string derivable from XYZ
– It is possibly time to reduce XYZ toA

• Special Case:
Rule: A →  yields only one item
A→•

1/28/2025 Department of Computer Science and Engineering 220


CONSTRUCTING SLR PARSING TABLES

• A collection of sets of LR(0) items (the canonical LR(0) collection) is


the basis for constructing SLR parsers.

• Canonical LR(0) collection provides the basis of constructing a


DFA called LR(0) automaton
– This DFA is used to make parsing decisions

• Each state of LR(0) automaton represents a set of


items in the canonical LR(0) collection

• To construct the canonical LR(0) collection for a grammar


– Augmented Grammar
– CLOSURE function
– GOTO function
1/28/2025 Department of Computer Science and Engineering 221
CONSTRUCTING SLR PARSING TABLES – LR(0) ITEM

• An LR(0) item of a grammar G is a production of G a dot at the some position of the


right side.
A → • aBb
• Ex: A → aBb
A → a • Bb
A → aB • b
Possible LR(0) Items: A → aBb •
(four different
possibility)

• Sets of LR(0) items will be the states of action and goto table of the SLR parser.
– States represent sets of “items”
• LR parser makes shift-reduce decision by maintaining states to keep track of where we
are in a parsing process

1/28/2025 Department of Computer Science and Engineering 222


GRAMMAR AUGMENTATION

1/28/2025 Department of Computer Science and Engineering 223


THE CLOSURE OPERATION

• If I is a set of LR(0) items for a grammar G, then closure(I) is the set of


LR(0) items constructed from I by the two rules:

1. Initially, every LR(0) item in I is added to closure(I).

2. If A → .B is in closure(I) and


B→ is a production rule of G;
then B→. will be in the closure(I).
We will apply this rule until no more new LR(0) items can be
added to closure(I).

1/28/2025 Department of Computer Science and Engineering 224


CONSTRUCTION OF THE CANONICAL LR(0) COLLECTION

• To create the SLR parsing tables for a grammar G, we will


create the canonical LR(0) collection of the grammar G’.

• Algorithm:
C is { closure({S'→ • S}) }
repeat the followings until no more set of LR(0) items can
be added to C.
for each I in C and each grammar symbol X
if GOTO(I,X) is not empty and not in C
add GOTO(I,X) to C

• GOTO function is a DFA on the sets in C.

1/28/2025 Department of Computer Science and Engineering 225


CONSTRUCTION OF THE CANONICAL LR(0) COLLECTION

I0: E’ → .E I1: E’ → E. I6: E → E+.T I9: E → E+T.


E → .E+T E → E.+T T → .T*F T → T.*F
E → .T T → .F
T → .T*F I2: E → T. F → .(E) I10: T → T*F.
T → T.*F F → .id

T → .F
F → .(E)
F → .id I3: T → F. I7: T → T*.F I11: F → (E).
F → .(E)
I4: F → (.E) F → .id
E → .E+T
E → .T I8: F → (E.)
T → .T*F E → E.+T
I5: F → id. T → .F
F → .id F → .(E)

1/28/2025 Department of Computer Science and Engineering 226


CONSTRUCTING SLR PARSING TABLE

1. Construct the canonical collection of sets of LR(0) items for G’.


C{I0,...,In}

2. Create the parsing action table as follows:


 If a is a terminal, A→.a in Ii and goto(Ii,a)=Ij then action[i,a] is shift j.
 If A→. is in Ii , then action[i,a] is reduce A→
for all a in FOLLOW(A) where AS’.
 If S’→S. is in Ii , thenaction[i,$] is accept.
 If any conflicting actions generated by these rules, the grammar is
not SLR(1).
3. Create the parsing goto table
 for all non-terminals A,
if goto(Ii,A)=Ij then goto[i,A]=j
4. All entries not defined by (2) and (3) are errors.
5. Initial state of the parser contains S’→.S
1/28/2025 Department of Computer Science and Engineering 227
PARSING TABLES OF EXPRESSION GRAMMAR

Action Table Goto Table

state id + * ( ) $ E T F
0 s5 s4 1 2 3
1 s6 acc
2 r2 s7 r2 r2
3 r4 r4 r4 r4
4 s5 s4 8 2 3
5 r6 r6 r6 r6
6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
11 r5 r5 r5 r5

1/28/2025 Department of Computer Science and Engineering 228


PARSING TABLES OF EXPRESSION GRAMMAR

Action Table Goto Table

state id + * ( ) $ E T F
0 s5 s4 1 2 3
Key to Notation
1 s6 acc

S4=“Shift input symbol and 2 r2 s7 r2 r2


push state 4” 3 r4 r4 r4 r4
R5= “Reduce by rule 5” 4 s5 s4 8 2 3
Acc=Accept 5 r6 r6 r6 r6
(blank)=Syntax Error 6 s5 s4 9 3
7 s5 s4 10
8 s6 s11
9 r1 s7 r1 r1
10 r3 r3 r3 r3
11 r5 r5 r5 r5

1/28/2025 Department of Computer Science and Engineering 229


1/28/2025 Dr.T SAJU RAJ 230
EXAMPLE LR PARSE: (ID+ID)*ID

1/28/2025 Department of Computer Science and Engineering 231


EXAMPLE LR PARSE: (ID+ID)*ID

1/28/2025 Department of Computer Science and Engineering 232


EXAMPLE LR PARSE: (ID+ID)*ID

1/28/2025 Department of Computer Science and Engineering 233


ACTIONS OF A (S)LR-PARSER -- EXAMPLE
stack input action output
0 id*id+id$ shift 5
0id5 *id+id$ reduce by F→id F→id
0F3 *id+id$ reduce by T→F T→F
0T2 *id+id shift 7
0T2*7 $ shift 5
id+id
$
0T2*7id5 +id$ reduce by F→id F→id
0T2*7F10 +id$ reduce by T→T*F T→T*F
0T2 +id$ reduce by E→T E→T
0E1 +i shift 6
0E1+6 d$ shift 5
0E1+6id5 id$ reduce by F→id F→id
$
0E1+6F3 $ reduce by T→F T→F
0E1+6T9 $ reduce by E→E+T E→E+
1/28/2025 Department of Computer Science and Engineering 234
LR PARSING ALGORITHM

1/28/2025 Department of Computer Science and Engineering 235


SLR GRAMMAR: REVIEW

 An LR parser using SLR parsing tables for a


grammar G is called as the SLR parser for G.

 If a grammar G has an SLR parsing table, it is called SLR grammar (or SLR
grammar in short).

 Every SLR grammar is unambiguous, but every unambiguous grammar is not


a SLR grammar.

 If the SLR parsing table of a grammar G has a conflict, we say that that
grammar is not SLR grammar.

1/28/2025 Department of Computer Science and Engineering 236


CONFLICT EXAMPLE

1/28/2025 Department of Computer Science and Engineering 237


CONFLICT EXAMPLE

1/28/2025 Department of Computer Science and Engineering 238


CONFLICT EXAMPLE

1/28/2025 Department of Computer Science and Engineering 239


TASK 01

1. Derive the (i) LL(1) parse table, (ii) LR(0)


automation, (iii) LR(0) parse table of following
grammar:
S-> SS+
S-> SS*
S-> a

2. Parse the following string using (i) LL (1) parse


table, (ii) LR (0) parse table:
“aa+a*a+”
2

1/28/2025 Department of Computer Science and Engineering 240


LR(1) ITEM

 To avoid some of invalid reductions, the states need to carry more


information.

 Extra information is put into a state by including a terminal symbol


as a second component in an item.

 A LR(1) item is:


.
A →  ,a where a is the look-head of the LR(1) item
(a is a terminal or end-marker.)

1/28/2025 Department of Computer Science and Engineering 241


LR(1) ITEM

1/28/2025 Department of Computer Science and Engineering 242


INTUITION BEHIND LR (1) ITEMS

1/28/2025 Department of Computer Science and Engineering 243


INTUITION BEHIND LR (1) ITEMS

1/28/2025 Department of Computer Science and Engineering 244


THE CLOSURE FUNCTION

1/28/2025 Department of Computer Science and Engineering 245


THE CLOSURE FUNCTION

1/28/2025 Department of Computer Science and Engineering 246


THE GOTO FUNCTION

1/28/2025 Department of Computer Science and Engineering 247


1/28/2025 Dr.T SAJU RAJ 248
LR (1) EXAMPLE

10

1/28/2025 Department of Computer Science and Engineering 249


LR (1) EXAMPLE

10

1/28/2025 Department of Computer Science and Engineering 250


LR (1) EXAMPLE

10

1/28/2025 Department of Computer Science and Engineering 251


LR (1) EXAMPLE

10

1/28/2025 Department of Computer Science and Engineering 252


LR (1) EXAMPLE

10

1/28/2025 Department of Computer Science and Engineering 253


CONSTRUCTION OF LR(1) PARSING

1. Construct the canonical collection of sets of LR(1) items for


G’. C{I0,...,In}
2. Create the parsing action table as follows
• .
If a is a terminal, A→ a ,b in Ii and goto(Ii,a)=Ij then action[i,a] is
shift j.
• .
If A→ ,a is in Ii , then action[i,a] is reduce A→ where AS’.
• .
If S’→S ,$ is in Ii , then action[i,$] is accept.
• If any conflicting actions generated by these rules, the grammar is not
LR(1).

3. Create the parsing goto table


• for all non-terminals A, if goto(Ii,A)=Ij then goto[i,A]=j

4. All entries not defined by (2) and (3) are errors.


12
5. Initial state of the parser contains S’→.S,$
1/28/2025 Department of Computer Science and Engineering 254
LALR Parsing Tables
• LALR stands for LookAhead LR.

• LALR parsers are often used in practice because


LALR parsing tables are smaller than LR(1) parsing
tables.
• The number of states in SLR and LALR parsing tables
for a grammar G are equal.
• But LALR parsers recognize more grammars than
SLR parsers.
• Astate of LALR parser will be again a set of LR(1)
items.

1/28/2025 Department of Computer Science and Engineering 255


CREATING LALR PARSING TABLES

Canonical LR(1) Parser € LALR Parser


shrink # of states

• This shrink process may introduce a reduce/reduce


conflict in the resulting LALR parser (so the grammar
is NOT LALR)

• But, this shrink process does not produce a


shift/reduce conflict.

1/28/2025 Department of Computer Science and Engineering 256


Reduce/Reduce Conflict
• But, we may introduce a reduce/reduce conflict during the
shrink process for the creation of the states of a LALR parser.

.
I1 : A →  ,a I2: A →  ,b.
.
B →  ,b B →  ,c .

.
I12: A →  ,a/b € reduce/reduce conflict
.
B →  ,b/c

1/28/2025 Department of Computer Science and Engineering 257


1/28/2025 Dr.T SAJU RAJ 258
Canonical LR or CLR(1) Parsing table

1/28/2025 Department of Computer Science and Engineering 259


1/28/2025 Dr.T SAJU RAJ 260
Merging of States

1/28/2025 Department of Computer Science and Engineering 261


Merging of States (contd.)

1/28/2025 Department of Computer Science and Engineering 262


Canonical Collection of LR(1) items

1/28/2025 Department of Computer Science and Engineering 263


Automatic construction of parsers (YACC)

• Two classical tools for compilers:


– Lex: A Lexical Analyzer Generator
– Yacc: “Yet Another Compiler Compiler” (Parser Generator)
• Lex creates programs that scan your tokens one by one.
• Yacc takes a grammar (sentence structure) and generates a
parser.

Lexical Rules Grammar Rules

Lex Yacc

Input yylex() yyparse() Parsed Input


1/28/2025 Department of Computer Science and Engineering 264
YACC specifications.
• Lex and Yacc generate C code for your analyzer & parser.

Lexical Rules Grammar Rules

C code Lex C code Yacc

Parsed
Input yylex() yyparse()
token Input
char C code stream C code
stream
LexicalAnalyzer Parser
(Tokenizer)

1/28/2025 Department of Computer Science and Engineering 265


Automatic construction of parsers (YACC)

• Often, instead of the standard Lex and Yacc,


Flex and Bison are used:
– Flex: A fast lexical analyzer
– (GNU) Bison: A drop-in replacement for (backwards
compatible with) Yacc
• Byacc is Berkeley implementation of Yacc (so it
is Yacc).

1/28/2025 Department of Computer Science and Engineering 106 266


Automatic construction of parsers (YACC)

• Yacc is not a new tool, and yet, it is still used in many


projects.
• Yacc syntax is similar to Lex/Flex at the top level.
• Lex/Flex rules were regular expression – action pairs.
• Yacc rules are grammar rule – action pairs.

declarations
%%
rules
%%
programs

1/28/2025 Department of Computer Science and Engineering 107 267


Automatic construction of parsers (YACC)

1/28/2025 Department of Computer Science and Engineering 108 268


Automatic construction of parsers (YACC)

• The Translation Rules Part

1/28/2025 Department of Computer Science and Engineering 110 269


Automatic construction of parsers (YACC)

1/28/2025 Department of Computer Science and Engineering 115 270


Design of a syntax analyzer

yacc –d bas.y # create y.tab.h, y.tab.c


lex bas.l # create lex.yy.c
cc lex.yy.c y.tab.c –o bas.exe # compile/link

1/28/2025 Department of Computer Science and Engineering 116 271


Design of a syntax analyzer

1/28/2025 Department of Computer Science and Engineering 120 272


Design of a syntax analyzer

1/28/2025 Department of Computer Science and Engineering 122 273


Design of a syntax analyzer

1/28/2025 Department of Computer Science and Engineering 127 274


Thank You

Department of Computer Science and Engineering

You might also like