Name: Neil Kapadia, Shaikh Shams Aaliya
Roll No: 7, 14
STD: MSC-I
Subject: Design and Implementation of Modern Compilers
A project on topic
“LR Parser”
COMPILER
A compiler translates the code written in one language to some other language
without changing the meaning of the program. It is also expected that a compiler
should make the target code efficient and optimized in terms of time and space.
Compiler design principles provide an in-depth view of translation and
optimization process. Compiler design covers basic translation mechanism and
error detection & recovery. It includes lexical, syntax, and semantic analysis as
front end, and code generation and optimization as back-end.
TYPES OF COMPILER
Syntax analyzers follow production rules defined by means of context-free
grammar. The way the production rules are implemented (derivation) divides
parsing into two types : top-down parsing and bottom-up parsing.
BOTTOM UP
Bottom-up parsing starts from the leaf nodes of a tree and works in upward
direction till it reaches the root node. Here, we start from a sentence and then
apply production rules in reverse manner in order to reach the start symbol. The
image given below depicts the bottom-up parsers available.
Shift-Reduce Parsing
Shift-reduce parsing uses two unique steps for bottom-up parsing. These steps are
known as shift-step and reduce-step.
Shift step: The shift step refers to the advancement of the input pointer to
the next input symbol, which is called the shifted symbol. This symbol is
pushed onto the stack. The shifted symbol is treated as a single node of the
parse tree.
Reduce step : When the parser finds a complete grammar rule (RHS) and
replaces it to (LHS), it is known as reduce-step. This occurs when the top of
the stack contains a handle. To reduce, a POP function is performed on the
stack which pops off the handle and replaces it with LHS non-terminal
symbol.
Bottom up parsing is also known as shift-reduce parsing.
Bottom up parsing is used to construct a parse tree for an input string.
In the bottom up parsing, the parsing starts with the input symbol and
construct the parse tree up to the start symbol by tracing out the rightmost
derivations of string in reverse.
Example
Production
Parse Tree representation of input string "id * id" is as follows:
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. Table Driven LR Parsing
a. LR( 1 )
b. SLR( 1 )
c. CLR ( 1 )
d. LALR( 1 )
LR Parser
The LR parser is a non-recursive, shift-reduce, bottom-up parser. It uses a wide
class of context-free grammar which makes it the most efficient syntax analysis
technique. LR parsers are also known as LR(k) parsers, where L stands for left-to-
right scanning of the input stream; R stands for the construction of right-most
derivation in reverse, and k denotes the number of lookahead symbols to make
decisions.
There are three widely used algorithms available for constructing an LR parser:
SLR(1) – Simple LR Parser:
o Works on smallest class of grammar
o Few number of states, hence very small table
o Simple and fast construction
LR(1) – LR Parser:
o Works on complete set of LR(1) Grammar
o Generates large table and large number of states
o Slow construction
LALR(1) – Look-Ahead LR Parser:
o Works on intermediate size of grammar
o Number of states are same as in SLR(1)
LR Parsing Algorithm
Here we describe a skeleton algorithm of an LR parser:
LL vs. LR
LR algorithm:
The LR algorithm requires stack, input, output and parsing table. In all type of LR
parsing, input, output and stack are same but parsing table is different.
Input buffer is used to indicate end of input and it contains the string to be parsed
followed by a $ Symbol.
A stack is used to contain a sequence of grammar symbols with a $ at the bottom
of the stack.
Parsing table is a two dimensional array. It contains two parts: Action part and Go
To part.
LR (1) Parsing
Various steps involved in the LR (1) Parsing:
o For the given input string write a context free grammar.
o Check the ambiguity of the grammar.
o Add Augment production in the given grammar.
o Create Canonical collection of LR (0) items.
o Draw a data flow diagram (DFA).
o Construct a LR (1) parsing table.
Augment Grammar
Augmented grammar G` will be generated if we add one more production in the
given grammar G. It helps the parser to identify when to stop the parsing and
announce the acceptance of the input.
Example
Given grammar
The Augment grammar G` is represented by
Canonical Collection of LR(0) items
An LR (0) item is a production G with dot at some position on the right side of the
production.
LR(0) items is useful to indicate that how much of the input has been scanned up
to a given point in the process of parsing.
In the LR (0), we place the reduce node in the entire row.
Example:
Given grammar:
I0 State:
Add Augment production to the I0 State and Compute the Closure
I0 = Closure (S` → •S)
Add all productions starting with S in to I0 State because "•" is followed by the
non-terminal. So, the I0 State becomes
I0 = S` → •S
S → •AA
Add all productions starting with "A" in modified I0 State because "•" is followed
by the non-terminal. So, the I0 State becomes.
I0= S` → •S
S → •AA
A → •aA
A → •b
I1= Go to (I0, S) = closure (S` → S•) = S` → S•
Here, the Production is reduced so close the State.
I1= S` → S•
I2= Go to (I0, A) = closure (S → A•A)
Add all productions starting with A in to I2 State because "•" is followed by the
non-terminal. So, the I2 State becomes
I2 =S→A•A
A → •aA
A → •b
Go to (I2,a) = Closure (A → a•A) = (same as I3)
Go to (I2, b) = Closure (A → b•) = (same as I4)
I3= Go to (I0,a) = Closure (A → a•A)
Add productions starting with A in I3.
A → a•A
A → •aA
A → •b
Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)
I4= Go to (I0, b) = closure (A → b•) = A → b•
I5= Go to (I2, A) = Closure (S → AA•) = SA → A•
I6= Go to (I3, A) = Closure (A → aA•) = A → aA•
Drawing DFA:
LR(0) Table
o If a state is going to some other state on a terminal then it correspond to a
shift move.
o If a state is going to some other state on a variable then it correspond to go to
move.
o If a state contain the final item in the particular row then write the reduce
node completely.
Explanation:
o I0 on S is going to I1 so write it as 1.
o I0 on A is going to I2 so write it as 2.
o I2 on A is going to I5 so write it as 5.
o I3 on A is going to I6 so write it as 6.
o I0, I2and I3on a are going to I3 so write it as S3 which means that shift 3.
o I0, I2 and I3 on b are going to I4 so write it as S4 which means that shift 4.
o I4, I5 and I6 all states contains the final item because they contain • in the
right most end. So rate the production as production number.
Productions are numbered as follows:
o I1 contains the final item which drives(S` → S•), so action {I1, $} = Accept.
o I4 contains the final item which drives A → b• and that production
corresponds to the production number 3 so write it as r3 in the entire row.
o I5 contains the final item which drives S → AA• and that production
corresponds to the production number 1 so write it as r1 in the entire row.
o I6 contains the final item which drives A → aA• and that production
corresponds to the production number 2 so write it as r2 in the entire row.