Bottom-Up Parsing
A bottom-up parse
corresponds to the
construction of a parse tree
for an input string
beginning at the leaves
(the bottom) and working
A bottom-up parse of the token string id * id with
up towards the root (the respect to the following expression grammar:
top).
CD / Module-III / Jasaswi 1 17 March 2024
Rightmost Derivation of e
CD / Module-III / Jasaswi 2 17 March 2024
Bottom-up Parsing: Reverse of rightmost derivation
Bottom-up Parsing is the reverse of rightmost derivation.
CD / Module-III / Jasaswi 3 17 March 2024
Reductions
Bottom-up parsing reduces a string to the start symbol
• At each reduction step, a chosen substring that is the RHS (or body) of a
production is replaced by the LHS (or head) nonterminal.
Key decision: When to reduce and about what production to apply, as the parse
proceeds.
CD / Module-III / Jasaswi 4 17 March 2024
Reductions: Example
The following figure illustrate a sequence of reductions of the token string id * id
of the given expression grammar.
The reductions will be discussed in terms of the sequence of strings id * id, F * id,
T * id, T * F, T, E.
The strings in this sequence are formed from the roots of all the subtrees in the
snapshots.
CD / Module-III / Jasaswi 5 17 March 2024
Handle
It is a substring that matches the body of a production.
• Reducing the handle is one step in the reverse of the rightmost derivation
CD / Module-III / Jasaswi 6 17 March 2024
Handle: Formal Definition
If then production in
the position following is a handle of .
For convenience, we refer to the body
rather than as a handle.
String right of a handle must contain only
terminals.
If a grammar is unambiguous, then every
right-sentential form of the grammar has
exactly one handle.
If grammar is ambiguous, then there can be
more than one rightmost derivation of .
CD / Module-III / Jasaswi 7 17 March 2024
Handle Pruning
A rightmost derivation in reverse can be obtained by "handle pruning”.
We start with a string of terminals w to be parsed. If w is a sentence of the
grammar at hand, then let , where , is the nth right-sentential form of
some as yet unknown rightmost derivation.
To reconstruct this derivation in reverse order, we locate the handle in and
replace by the head of the relevant production , to obtain the previous
right-sentential form .
We then repeat this process. That is, we locate the handle in and
reduce this handle to obtain the right-sentential form .
If by continuing this process we produce a right-sentential form consisting only of
the start symbol S, then we halt and announce successful completion of parsing.
CD / Module-III / Jasaswi 8 17 March 2024
Shift Reduce Parsing
Shift-Reduce Parsing
It is a type of bottom-up parsing in which a stack holds grammar symbols and an
input buffer holds the rest of the string to be parsed.
Bottom of the stack and end of input are represented by $.
Convention: The top of the stack is shown on the right rather than on the left as
we did for top-down parsing.
Shift-Reduce Parsing works with two primary actions, shift and reduce
• Other obvious actions are accept and error.
During a left-to-right scan of the input string, the parser shifts zero or more input
symbols onto the stack, until it is ready to reduce a string P of grammar symbols
on top of the stack.
It then reduces to the head of the appropriate production. The parser repeats
this cycle until it has detected an error or until the stack contains the start symbol
and the input is empty.
CD / Module-III / Jasaswi 10 17 March 2024
Shift-Reduce Parsing Actions
Shift: shift the next input symbol onto the top of the stack.
Reduce: The right end of the string to be reduced must be at the top of the stack.
Locate the left end of the string within the stack and decide with what nonterminal
to replace the string.
Accept: Announce successful completion of parsing.
Error: Discover a syntax error and call an error recovery routine.
CD / Module-III / Jasaswi 11 17 March 2024
Shift-Reduce Parsing: Example
CD / Module-III / Jasaswi 12 17 March 2024
Handle on Top of the Stack
Is the following scenario possible?
CD / Module-III / Jasaswi 13 17 March 2024
Possible Choices in Rightmost Derivation
Case (1) Case (2)
NOTE:
• In both cases, after making a reduction the parser had to shift zero or more
symbols to get the next handle onto the stack.
• It never had to go into the stack to find the handle.
CD / Module-III / Jasaswi 14 17 March 2024
Shift-Reduce Actions Conflicts
Shift: shift the next input symbol from the right string onto the top of the stack.
Reduce: identify a string on top of the stack that is the body of a production, and
replace the body with the head.
How do you decide when to shift and when to reduce?
General shift-reduce technique:
• If there is no handle on the stack, then shift
• If there is a handle on the stack, then reduce
Bottom up parsing is essentially the process of detecting handles and reducing
them.
Different bottom-up parsers differ in the way they detect handles.
CD / Module-III / Jasaswi 15 17 March 2024
Challenges in Bottom-up Parsing
CD / Module-III / Jasaswi 16 17 March 2024
Shift-Reduce Conflict: Example 1
CD / Module-III / Jasaswi 17 17 March 2024
Shift-Reduce Conflict: Example 2
NOTE:
• shift-reduce parsing can be adapted to parse certain ambiguous grammars, such as
the if-then-else grammar above.
• If we resolve the shift/reduce conflict on else in favor of shifting, the parser will behave
as we expect, associating each else with the previous unmatched then.
CD / Module-III / Jasaswi 18 17 March 2024
Reduce-Reduce Conflict: Example 1
CD / Module-III / Jasaswi 19 17 March 2024
LR Parsing
LR(k) Parsing
Popular bottom-up parsing scheme based on a concept called LR(k) parsing
• L is for left-to-right scan of input
• R is for reverse of rightmost derivation
• k is the number of input symbols of lookahead that are used in making parsing
decisions.
The cases k = 0 or k = 1 are of practical interest
When (k) is omitted, k is assumed to be 1.
LR parsers are table-driven, like the non-recursive LL parser
LR grammar is one for which we can construct an LR parsing table
CD / Module-III / Jasaswi 21 17 March 2024
Why LR Parsers?
LR parsers can be constructed to recognize all programming-language
constructs for which CFGs can be written.
Most general non-backtracking shift-reduce parsing method.
An LR parser can detect a syntactic error as soon as it possible to do so on a left-
to-right scan
The class of grammars that can be passed using LR methods is a proper
superset of the class of grammars that can be parsed with predictive or LR
methods.
• LL(k) parsing predicts which production to use having seen only the first k
tokens of the right-hand side.
• LR(k) parsing can decide after it has seen input tokens corresponding to the
entire right-hand side of the production.
CD / Module-III / Jasaswi 22 17 March 2024
Block Diagram of LR Parser
CD / Module-III / Jasaswi 23 17 March 2024
LR Parsing
Remember the basic question: when to shift and when to reduce!
Information is encoded in a DFA constructed using canonical LR(0) collection
1. Augmented grammar with new start symbol and rule
2. Define helper functions CLOSURE() and GOTO()
CD / Module-III / Jasaswi 24 17 March 2024
Items and the LR(0) Automaton
The basic question in LR parsing: when to shift and when to reduce!
An LR parser makes shift-reduce decisions by maintaining states to keep track of
where we are in a parse.
States represent sets of "items."
An LR(0) item (item for short) of a grammar G is a production of G with a dot at
some position of the body.
An item indicates how much of a production we have seen at a given point in the
parsing process.
• Symbols on the left of “•” are already on the stack
• Symbols on the right of “•” are expected in the input
Thus, production A XYZ yields the four items shown in the table.
The production A generates only one item A .
CD / Module-III / Jasaswi 25 17 March 2024
LR(0) Item
One collection of sets of LR(0) items, called the canonical LR(0) collection,
provides the basis for constructing a deterministic finite automaton that is used to
make parsing decisions. Such an automaton is called an LR(0) automaton.
Each state of the LR(0) automaton represents a set of items in the canonical LR(0)
collection.
To construct the canonical LR(0) collection for a grammar, we define an
augmented grammar and two functions, CLOSURE and GOTO.
If G is a grammar with start symbol S, then G', the augmented grammar for G, is G
with a new start symbol S' and production S' S.
• The purpose of this new starting production is to indicate to the parser when it
should stop parsing and announce acceptance of the input.
• Acceptance occurs when and only when the parser is about to reduce by S'
S.
CD / Module-III / Jasaswi 26 17 March 2024
CLOSURE of Item Sets
Idea: Algorithm:
Let be a set of items for a grammar
CLOSURE( ) is the set of items constructed
from by the following two rules:
1. Add every item in to CLOSURE( )
2. If → is in CLOSURE( ) and →
is a production, then add → to
CLOSURE( ) if it is not already there.
Apply this rule until no more new items can
be added to CLOSURE( ).
CD / Module-III / Jasaswi 27 17 March 2024
Closure of Item Sets: Example
Consider the following expression Suppose
grammar:
The augmented grammar of the above
CFG is:
CD / Module-III / Jasaswi 28 17 March 2024
Kernel and Nonkernel Items
If one B-production is added to the CLOSURE(I) with the dot at the left end, then
all B-productions will be similarly added to the closure.
We divide all the sets of items of interest into two classes:
• Kernel items: the initial item, , and all items whose dots are not at the
left end.
• Non-kernel items: all items with their dots at the left end, except for
CD / Module-III / Jasaswi 29 17 March 2024
GOTO Function
Suppose is a set of items and is a grammar symbol.
GOTO( , ) is the CLOSURE of set all items such that
is in
• If is a set of items for some valid prefix , then GOTO( , ) is set of valid
items for prefix .
Intuitively, GOTO( , ) defines the transitions in the LR(0) automaton
• GOTO( , ) gives the transition from state under input
CD / Module-III / Jasaswi 30 17 March 2024
GOTO Function: Example
Suppose
CD / Module-III / Jasaswi 31 17 March 2024
Algorithm to construct Canonical Collection of Sets of LR(0) Items
CD / Module-III / Jasaswi 32 17 March 2024
Canonical Collection of Sets of LR(0) Items
The augmented
expression grammar:
𝑬 →𝑬
𝑬→𝑬+𝑻|𝑻
𝑻→𝑻∗𝑭|𝑭
𝑭 → 𝑬 | 𝒊𝒅
CD / Module-III / Jasaswi 33 17 March 2024
LR(0) automaton for the expression grammar
Given Expression grammar:
The augmented Expression grammar:
CD / Module-III / Jasaswi 34 17 March 2024
LR(0) Automaton
An LR parser makes shift-reduce decisions by maintaining states
Canonical LR(0) collection is used for constructing a DFA for parsing
States represent sets of LR(0) items in the canonical LR(0) collection
• Start state is CLOSURE( ), where ′ is the start symbol of the
augmented grammar.
• State refers to the state corresponding to the set of items .
CD / Module-III / Jasaswi 35 17 March 2024
Use of LR(0) Automaton
How can LR(0) automata help with shift-reduce decisions?
Suppose string of grammar symbols takes the automaton from start state 0 to
state 𝑗
• Shift on next input symbol if 𝑗 has a transition on
• Otherwise, reduce
Items in state 𝑗 help decide which production to use
The LR-parsing algorithm uses its stack to keep track of states as well as
grammar symbols.
The grammar symbol can be recovered from the state, so the stack holds states.
CD / Module-III / Jasaswi 36 17 March 2024
Structure of the LR Parsing Table
Assume 𝑖 is top of the stack and 𝑖 is the current input symbol.
The parsing table consists of two parts: a parsing-action function ACTION and a goto
function GOTO.
1. The ACTION function takes as arguments a state i and a terminal a (or $, the input
endmarker). The value of can have one of four forms:
a) Shift , where is a state. The action taken by the parser effectively shifts input
to the stack, but uses state to represent .
b) Reduce . The action of the parser effectively reduces on the top of the
stack to head .
c) Accept. The parser accepts the input and finishes parsing.
d) Error. The parser discovers an error in its input and takes some corrective action.
2. We extend the function, defined on sets of items, to states: if
then also maps a state and a nonterminal to state .
CD / Module-III / Jasaswi 37 17 March 2024
Constructing LR(0) Parsing Table
1. Construct LR(0) canonical collection ={ } for grammar ′
2. State is constructed from 𝑖
a) If [ → • ] is in 𝑖 and GOTO( 𝑖 , ) = 𝑗 , then set ACTION[ , ] = “Shift ”
b) If [ → •] is in 𝑖 , then set ACTION[ , ] = “Reduce → ” for all
c) If [ ′ → •] is in 𝑖 , then set ACTION[ , $] = “Accept”
3. If GOTO( 𝑖 , )= 𝑗 , then GOTO[ , ] =
4. All entries left undefined are “errors”.
CD / Module-III / Jasaswi 38 17 March 2024
LR(0) Parsing Table
CD / Module-III / Jasaswi 39 17 March 2024
Shift-Reduce Parser with LR(0) Automaton
The following table illustrates the actions of a shift-
reduce parser on input id * id, using the LR(0)
automaton shown here .
A stack is used to hold states.
The column “SYMBOLS” shows the grammar symbols
corresponding to the states on the stack.
At line (1), the stack holds the start state 0 of the
automaton; the corresponding symbol is the bottom-of-
stack marker $.
CD / Module-III / Jasaswi 40 17 March 2024
Shift-Reduce Parser with LR(0) Automaton
CD / Module-III / Jasaswi 41 17 March 2024
Viable Prefix
Viable prefixes are the prefixes of right sentential form that do not extend beyond
the end of its handle.
• Let is right sentential form, where is the handle then viable prefixes are
any prefix of except .
A viable prefix is a prefix of a right sentential form that can appear on the stack of
a shift-reduce parser.
• is a viable prefix if such that is a right sentential form.
id is a prefix of a right sentential form, but it can never appear on the stack
• Always reduce by → id before shifting
• Not all prefixes of a right sentential form can appear on the stack
There is no error as long as the parser has viable prefixes on the stack
CD / Module-III / Jasaswi 42 17 March 2024
Example of a Viable Prefix
CD / Module-III / Jasaswi 43 17 March 2024
Challenges with LR(0) Parsing
An LR(0) parser works only if each state with a reduce action has only one
possible reduce action and no shift action.
Takes shift/reduce decisions without any lookahead token
• Lacks the power to parse programming language grammars
CD / Module-III / Jasaswi 44 17 March 2024
Exercise - 1
1. Construct the LR(0) Parsing Table for the following grammar:
𝐴 → 𝑎𝐴 | 𝑏
Simulate the working of the parsing table with the string “aaab” from the given grammar.
CD / Module-III / Jasaswi 45 17 March 2024
Simple LR Parsing
Simple LR Parsing / SLR(1)
SLR is simple LR.
It is the smallest class of grammar having few number of states.
SLR is very easy to construct and is similar to LR parsing.
The only difference between SLR parser and LR(0) parser is that in LR(0)
parsing table, there’s a chance of ‘shift reduced’ conflict because we are
entering ‘reduce’ corresponding to all terminal states.
We can solve this problem by entering ‘reduce’ corresponding to FOLLOW of
LHS of production in the terminating state.
This is called SLR(1) collection of items.
CD / Module-III / Jasaswi 47 17 March 2024
Block Diagram of LR Parser
CD / Module-III / Jasaswi 48 17 March 2024
LR Parsing Algorithm
The parser driver is same for all LR parsers
• Only the parsing table changes across parsers
A shift-reduce parser shifts a symbol, and an LR parser shifts a state
By construction, all transitions to state is for the same symbol
• Each state, except the start state, has a unique grammar symbol associated
with it
CD / Module-III / Jasaswi 49 17 March 2024
SLR(1) Parsing
Extends LR(0) parser to eliminate a few conflicts
• Uses LR(0) items and LR(0) automaton
For each reduction , look at the next symbol
∗
Apply reduction only if or and
CD / Module-III / Jasaswi 50 17 March 2024
Steps for constructing SLR parsing
Write the augmented grammar
Draw the Canonical collection of LR(0) items / DFD
Number the productions
Find FOLLOW of LHS of production
Construct the Parse Table
• Defining 2 functions: GOTO[list of terminals] and ACTION[list of non-
terminals] in the parsing table.
Stack Implementation
Construct the Parse Tree
CD / Module-III / Jasaswi 51 17 March 2024
Constructing SLR-Parsing Tables
INPUT: An augmented grammar 𝐺’
OUTPUT: The SLR-parsing table functions ACTION and GOTO for 𝐺’.
METHOD:
1. Construct 𝐶 = {𝐼 , 𝐼 , . . . , 𝐼 }, the collection of sets of LR(0) items for 𝐺’.
2. State 𝑖 is constructed from 𝐼 . The parsing actions for state 𝑖 are determined as follows:
a) If 𝐴 → 𝛼 𝑎𝛽 is in 𝐼 and 𝐺𝑂𝑇𝑂 𝐼 , 𝑎 = 𝐼 , then set 𝐴𝐶𝑇𝐼𝑂𝑁[𝑖, 𝑎] to "shift j." Here 𝑎 must be a terminal.
b) If 𝐴 → 𝛼 is in 𝐼 ,, then set 𝐴𝐶𝑇𝐼𝑂𝑁[𝑖, 𝑎] to "reduce 𝐴 → 𝛼 " for all 𝑎 in 𝐹𝑂𝐿𝐿𝑂𝑊 𝐴 ; here 𝐴 may not be
𝑆’.
c) If 𝑆’ → 𝑆 is in 𝐼 , then set 𝐴𝐶𝑇𝐼𝑂𝑁[𝑖, $] to "accept ."
If any conflicting actions result from the above rules, we say the grammar is not SLR(1). The algorithm
fails to produce a parser in this case.
3. The goto transitions for state 𝒊 are constructed for all nonterminals 𝑨 using the rule: If 𝐺𝑂𝑇𝑂 𝐼 , 𝐴 = 𝐼 , then
𝐺𝑂𝑇𝑂 𝑖, 𝐴 = 𝑗.
4. All entries not defined by rules (2) and (3) are made "error.“
5. The initial state of the parser is the one constructed from the set of items containing [𝑆 → 𝑆].
CD / Module-III / Jasaswi 52 17 March 2024
SLR Parsing Table for Expression Grammar
SYMBOL FIRST FOLLOW
E (, id +, $, )
T (, id *, +, $, )
F (, id *, +, $, )
CD / Module-III / Jasaswi 53 17 March 2024
LR Parser Configurations
A LR parser configuration is a pair
Left half is stack content (top on the right), and right half
is the remaining input
Configuration represents the right sentential form where
is the grammar symbol represented by state .
CD / Module-III / Jasaswi 54 17 March 2024
Behavior of the LR Parser
The next move of the parser from the configuration is
determined by reading , the current input symbol, and ,, the state on top of the
stack, and then consulting the entry , in the parsing action table.
The configurations resulting after each of the four types of move are as follows:
1. If , = shift , the parser executes a shift move. It shifts the next state
onto the stack, and enters the new configuration
2. If , = reduce , the parser executes a reduce move and enters
the new configuration where and
3. If , = accept, parsing is successful and completed
4. If , = error, parsing has discovered an error and calls an error
recovery routine.
CD / Module-III / Jasaswi 55 17 March 2024
LR-parsing Algorithm
INPUT: An input string 𝑤 and Let 𝑎 be the first symbol of 𝑤$;
an LR-parsing table with while(1) { /* repeat forever */
functions 𝐴𝐶𝑇𝐼𝑂𝑁 and GOTO
let 𝑠 be the state on top of the stack;
for a grammar G.
if (𝐴𝐶𝑇𝐼𝑂𝑁[𝑠, 𝑎] = shift t ) {
OUTPUT: If w is in L(G), the
reduction steps of a bottom-up push 𝑡 onto the stack;
parse for w; otherwise, an let 𝑎 be the next input symbol;
error indication. } else if (𝐴𝐶𝑇𝐼𝑂𝑁[𝑠, 𝑎] = reduce 𝐴 → 𝛽) {
METHOD: Initially, the parser pop |𝛽| symbols off the stack;
has 𝑠0 on its stack, where 𝑠0 is let state 𝑡 now be on top of the stack;
the initial state, and 𝑤$ in the
push 𝐺𝑂𝑇𝑂[𝑡, 𝐴] onto the stack;
input buffer.
output the production 𝐴 → 𝛽;
} else if (𝐴𝐶𝑇𝐼𝑂𝑁[𝑆, 𝑎] = accept) break; /* parsing is done */
else call error-recovery routine;
}
CD / Module-III / Jasaswi 56 17 March 2024
Moves of an LR Parser on id id + id
CD / Module-III / Jasaswi 57 17 March 2024
Limitation of SLR Parsing
If an SLR parse table for a grammar does not have multiple entries in any cell
then the grammar is unambiguous.
Every SLR(1) grammar is unambiguous, but there are unambiguous grammars
that are not SLR(1).
Example:
CD / Module-III / Jasaswi 58 17 March 2024
Limitation of SLR Parsing: Example
With the following productions check that the grammar is not SLR.
S→L=R
S→R
L →* R
L → id
R→L
Step 1: First of all, convert it into augmented grammar G′ and number the productions
(0) S′ → S
(1) S → L = R
(2) S → R
(3) L →∗ R
(4) L → id
(5) R → L
CD / Module-III / Jasaswi 59 17 March 2024
Limitation of SLR Parsing: Example – contd…
Step2 − Find closure and goto function to construct LR (0) items.
I1 I9
accept
S
I0
L = R
I2 I6
id
R
L
I3 *
I8
L
*
(0) S′ → S
* I4 R
(1) S → L = R
S I7 (2) S → R
id id (3) L →∗ R
(4) L → id
I5
(5) R → L
CD / Module-III / Jasaswi 60 17 March 2024
Limitation of SLR Parsing: Example – contd…
Step 3: First Find FOLLOW of LHS of production
FOLLOW(S) = $
FOLLOW(L)={=,$}
FOLLOW(R)={=,$}
Step 4: Construction of Parsing Table
(0) S′ → S
• The following table display precisely that (1) S → L = R
there are multiple entries in Action [2, =] = (2) S → R
s6 or r5.
(3) L →∗ R
• Therefore, there is shift | Reduce conflict
on Entry Action [2, =]. This shows that (4) L → id
given grammar is not SLR (1). (5) R → L
CD / Module-III / Jasaswi 61 17 March 2024
SLR Parsing: Example 2
Construct LR parsing table for the given context-free grammar
S AA
A aA | b
STEP1 – Find augmented grammar and number them
The augmented grammar of the given grammar is:
1. S’ . S [0th production]
2. S . A A [1st production]
3. A . a A [2nd production]
4. A.b [3rd production]
STEP2 – Find LR(0) collection of items
STEP3 – Find FOLLOW of LHS of production
FOLLOW(S)={$}
FOLLOW(A)={a, b, $}
CD / Module-III / Jasaswi 62 17 March 2024
SLR Parsing: Example 2 contd…
STEP 4- Defining 2 functions: GOTO (list of non-terminals) and ACTION (list of terminals) in the
parsing table. Below is the SLR parsing table.
1. S’ . S [0th production]
2. S . A A [1st production]
3. A . a A [2nd production]
4. A.b [3rd production]
CD / Module-III / Jasaswi 63 17 March 2024
Difference between LR(0) and SLR(1)
Handling of Lookahead Symbols:
• LR(0): LR(0) parsers have no lookahead symbols. They make parsing decisions based solely on the
current input symbol. This lack of lookahead information can lead to more conflicts, including "reduce-
reduce" conflicts.
• SLR(1): SLR(1) parsers use one-symbol lookahead to make parsing decisions. The additional
lookahead helps in resolving some of the conflicts present in LR(0) parsing, especially "reduce-reduce"
conflicts.
Conflict Resolution:
• LR(0): LR(0) parsing tables may have more conflicts, and it is common to encounter "reduce-reduce"
conflicts. The absence of lookahead information makes it less powerful in resolving ambiguities.
• SLR(1): SLR(1) parsers attempt to address some conflicts present in LR(0) parsing. By considering one-
symbol lookahead, SLR(1) parsing tables are more expressive and can resolve certain conflicts, making
it more powerful than LR(0) but less powerful than some other LR parsing techniques.
SLR(1) parsing aims to address some conflicts, especially "reduce-reduce" conflicts, by using one-symbol
lookahead information. However, it's important to note that SLR(1) parsing may not eliminate all possible
"reduce-reduce" conflicts that can arise in the context of certain grammars.
CD / Module-III / Jasaswi 64 17 March 2024
Exercise - 2
1. Construct the SLR(1) Parsing Table for the following grammar:
𝑆 → 𝐴𝑎𝐴𝑏 | 𝐵𝑏𝐵𝑎
𝐴→𝜖
𝐵→𝜖
Simulate the working of the parsing table with the string “aaab” from the given grammar.
2. Show that the following grammar is SLR(1) but not LR(0).
𝐸 →𝑇+𝐸|𝑇
𝑇 → 𝑖𝑑
3. Construct the LR(0) and SLR(1) Parsing Table for the following grammar:
𝑆 → (𝐿) | 𝑎
𝐿 → 𝐿, 𝑆 | 𝑆
CD / Module-III / Jasaswi 65 17 March 2024
Canonical LR Parsing
The "canonical-LR(CLR)" makes full use of the lookahead symbol(s).
It uses a large set of items, called the LR(1) items.
LR(1) Item
An LR(1) item of a CFG is a string of the form , with as one
symbol lookahead.
• is a production in , and
The lookahead has no effect in an item of the form where .
If , reduce only if next input symbol is
Set of possible terminals will always be a subset of FOLLOW( ), but can be a
proper subset.
CD / Module-III / Jasaswi 67 17 March 2024
LR(1) Item
An LR(1) item is valid for a viable prefix
if there is a derivation
where
1. , and
2. is the first symbol in , or, and
CD / Module-III / Jasaswi 68 17 March 2024
Constructing LR(1) Sets of Items
CD / Module-III / Jasaswi 69 17 March 2024
Constructing LR(1) Sets of Items
CD / Module-III / Jasaswi 70 17 March 2024
Example Construction of LR(1) Items
CD / Module-III / Jasaswi 71 17 March 2024
LR(1) Automaton
CD / Module-III / Jasaswi 72 17 March 2024
Constructing LR(1) Parsing Table
1. Construct LR(1) canonical collection ={ }
2. State of the parser is constructed from 𝑖
a) If is in 𝑖 and 𝑖 , then set
“Shift ”
b) If is in 𝑖, then set “Reduce → ”
c) If is in 𝑖 , then set ACTION[ , $] = “Accept”
3. If , then
4. Initial state of the parser is constructed from the set of items containing
CD / Module-III / Jasaswi 73 17 March 2024
Canonical LR(1) Parsing Table
1. Construct LR(1) canonical collection 𝐶 = {𝐼 , 𝐼 , . . . , 𝐼 }
2. State 𝑖 of the parser is constructed from 𝐼𝑖
a) If [𝐴 → 𝛼 • 𝑎𝛽, 𝑏] is in 𝐼𝑖 and 𝐺𝑂𝑇𝑂(𝐼𝑖 , 𝑎) = 𝐼𝑗 , then set
𝐴𝐶𝑇𝐼𝑂𝑁[𝑖, 𝑎] = “Shift 𝑗”
b) If [𝐴 → 𝛼 •, 𝑎] is in 𝐼𝑖, 𝐴 ≠ 𝑆 then set 𝐴𝐶𝑇𝐼𝑂𝑁[𝑖, 𝑎] = “Reduce 𝐴 → 𝛼 •”
c) If [𝑆 ′ → 𝑆 •, $] is in 𝐼𝑖 , then set ACTION[𝑖, $] = “Accept”
3. If 𝐺𝑂𝑇𝑂(𝐼𝑖 , 𝐴) = 𝐼𝑗 , then 𝐺𝑂𝑇𝑂[𝑖, 𝐴] = 𝑗
4. Initial state of the parser is constructed from the set of items containing [𝑆 ′ → • 𝑆, $]
CD / Module-III / Jasaswi 74 17 March 2024
Moves of a CLR Parser on
CD / Module-III / Jasaswi 75 17 March 2024
Canonical LR(1) Parsing
If the parsing table has no multiple-defined cells, then the corresponding
grammar is LR(1).
Every SLR(1) grammar is an LR(1) grammar.
• Canonical LR parser may have more states than SLR.
CD / Module-III / Jasaswi 76 17 March 2024
LALR Parsing
The "lookahead-LR (LALR)" method is based on the LR(0) sets of items, and has
many fewer states than typical parsers based on the LR(1) items.
By carefully introducing lookaheads into the LR(0) items, we can handle many
more grammars with the LALR method than with the SLR method, and build
parsing tables that are no bigger than the SLR tables.
LALR is the method of choice in most situations.
Lookahead LR (LALR) Parsing
CLR(1) parser has a large number of states.
Lookahead LR (LALR) parser
• Merge sets of LR(1) items that have the same core (set of LR(0) items, i.e.,
first component)
• LALR parsers have fewer states, same as SLR
LALR parser is used in many parser generators (e.g., Yacc and Bison).
LALR parsing can handle a larger class of grammars, making it more versatile in
practical compiler construction.
LALR parsers typically have fewer shift-reduce conflicts compared to SLR
parsers.
LALR parser may produce a reduce/reduce conflict.
CD / Module-III / Jasaswi 78 17 March 2024
Example Construction of LR(1) Items
CD / Module-III / Jasaswi 79 17 March 2024
Construction of LALR Parsing Table
INPUT: An augmented grammar 𝐺′.
OUTPUT: The LALR parsing-table functions ACTION and GOTO for 𝐺′.
METHOD:
1. Construct 𝐶 = {𝐼 , 𝐼 , … , 𝐼 }, the collection of sets of LR(1) items.
2. For each core present among the set of LR(1) items, find all sets having that core, and replace
these sets by their union.
3. Let 𝐶 = {𝐽 , 𝐽 , … , 𝐽 } be the resulting sets of LR(1) items. The parsing actions for state i are
constructed from 𝐽 in the same manner as the construction of canonical-LR parsing tables. If
there is a parsing action conflict, the algorithm fails to produce a parser, and the grammar is
said not to be LALR(1).
4. The GOTO table is constructed as follows. If 𝐽 is the union of one or more sets of LR(1) items,
that is, 𝐽 = 𝐼 ∩ 𝐼 ∩ . . . ∩ 𝐼 , then the cores of 𝐺𝑂𝑇𝑂 𝐼 , 𝑋 , 𝐺𝑂𝑇𝑂 𝐼 , 𝑋 , …, 𝐺𝑂𝑇𝑂 𝐼 , 𝑋 are
the same, since 𝐼 , 𝐼 , … , 𝐼 all have the same core. Let K be the union of all sets of items having
the same core as 𝐺𝑂𝑇𝑂 𝐼 , 𝑋 . Then 𝐺𝑂𝑇𝑂 𝐽, 𝑋 = 𝐾.
CD / Module-III / Jasaswi 80 17 March 2024
LALR Grammar
If there are no parsing action conflicts, then the grammar is LALR(1)
CD / Module-III / Jasaswi 81 17 March 2024
Final LR(1) Items for LALR Parser
CD / Module-III / Jasaswi 82 17 March 2024
LALR Parsing Table
CD / Module-III / Jasaswi 83 17 March 2024
Moves of a LALR Parser on
CLR Parser Table
LALR Parser Table
CD / Module-III / Jasaswi 84 17 March 2024
Notes on LALR Parsing Table
LALR parser behaves like the CLR parser excepting difference in stack states.
Merging LR(1) items can never produce shift/reduce conflicts.
• Suppose there is a shift-reduce conflict on lookahead due to items
and
• But merged state was formed from states with same cores, which implies
and must have already been in the same state,
for some value of
Merging items may produce reduce/reduce conflicts
CD / Module-III / Jasaswi 85 17 March 2024
Reduce-Reduce Conflicts due to Merging
{ 𝐴 → 𝑐 , 𝑑/𝑒 , B → 𝑐 , 𝑑/𝑒 }
CD / Module-III / Jasaswi 86 17 March 2024
Dealing with Errors with LALR Parsing
Consider an erroneous
input ccd
CLR parser will not even
reduce before reporting an
error
SLR and LALR parsers may
reduce several times before
reporting an error, but will
never shift an erroneous
input symbol onto the stack
CD / Module-III / Jasaswi 87 17 March 2024
Exercise - 3
1. Construct the CLR(1) and LALR(1) Parsing Table for the following grammar:
𝑆 → 𝑎𝐴𝑑 𝑏𝐵𝑑 𝑎𝐵𝑒 | 𝑏𝐴e
𝐴→𝑐
𝐵→𝑐
2. Construct the CLR(1) and LALR(1) Parsing Table for the following grammar:
𝑆 →𝐿 =𝑅|𝑅
𝐿 → ∗ 𝑅 | 𝑖𝑑
𝑅→𝐿
CD / Module-III / Jasaswi 88 17 March 2024
Comparison across LR Parsing Techniques
SLR(1) = LR(0) items + FOLLOW
SLR(1) parsers can parse a larger number of grammars than LR(0)
Any grammar that can be parsed by an LR(0) parser can be parsed by an SLR(1)
parser.
SLR(1) ≤ LALR(1) ≤ LR(1)
SLR(k) ≤ LALR(k) ≤ LR(k)
LL(k) ≤ LR(k)
Ambiguous grammars are not LR
Bottom-up parsing is a more powerful technique compared to topdown parsing
• LR grammars can handle left recursion
• Detects errors as soon as possible, and allows for better error recovery
Automated parser generators such as Yacc and Bison implement LALR parsing
CD / Module-III / Jasaswi 89 17 March 2024