0% found this document useful (0 votes)
17 views107 pages

CD Unit-2

This document covers syntax analysis in compiler design, focusing on the role of parsers, context-free grammars, and various parsing techniques. It discusses error handling strategies, including panic mode and phrase-level recovery, as well as the distinctions between top-down and bottom-up parsing methods. Key concepts such as LL(1) grammars, recursive descent parsing, and error recovery strategies are also addressed.

Uploaded by

yannaprasanth
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views107 pages

CD Unit-2

This document covers syntax analysis in compiler design, focusing on the role of parsers, context-free grammars, and various parsing techniques. It discusses error handling strategies, including panic mode and phrase-level recovery, as well as the distinctions between top-down and bottom-up parsing methods. Key concepts such as LL(1) grammars, recursive descent parsing, and error recovery strategies are also addressed.

Uploaded by

yannaprasanth
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 107

UNIT-2

• Syntax Analysis:
• Introduction: The Role of the parser – Representative
Grammars – Syntax Error Handling – Error Recovery
Strategies.
• Context Free Grammars: The formal definition of a CFG –
Notational Conventions – Derivations – Parse trees and
derivations – Ambiguity
• Writing Grammar: Lexical Versus Syntax Analysis –
Eliminating Ambiguity – Elimination of Left Recursion – Left
Factoring
• Top Down Parsing: Recursive Descent Parsing-FIRST and
FOLLOW - LL(1) Grammars – Non recursive Predictive
Parsing- Error Recovery in Predictive Parsing.
The Role of the parser
The Role of the parser
• The parser obtains a string of tokens from lexical
analyzer.
• It then verifies whether the input string can be
generated from the grammar of the source language.
• It has to report any syntactical errors and recover from
commonly occurring error to continue the processing
the remainder of the program
• The parser constructs a parse tree and passes it to the
rest of the compiler for further processing.
• It should also report syntactical errors in an easily
understandable way to the user.
Categorization of parsers
Representative Grammars

Useful for Bottom up


parsing. Belongs to LR grammar Class.

After removing left recursion used


for top down parsing

useful for illustrating


handling ambiguities
during parsing
Common programming errors can occur at many different levels
Lexical analysis Vs syntax analysis
Context Free Grammars
• The formal definition of a CFG
• Notational Conventions
• Derivations
• Parse trees and derivations
• Ambiguity
The formal definition of a CFG
Example 1:

Example 2:
Notational Conventions
Derivations
Derivations
LEFT MOST DERIVATION
RIGHT MOST DERIVATION
EXAMPLE
Assignment 1
Removing left recursion: Example 1
• S-> Sca/b/da
• A->Aac/d/bc

S->bS’/daS’
S’->caS’/€

A->dA’/bcA’
A’->acA’/€
Example 2
• E->EZM/M/2M
• M->MUF/F

• E->ME’/2ME’
• E’->ZME’/€

• M->FM’
• M’->UFM’/€
Example 3
• S->S;S/id:=E/print(L)
• S->id:=ES’/print(L)S’
• S’->;SS’/€
• E->E+E/id/num/(S,E)
E->idE’/numE’/(S,E)E’
E’->+EE’/€
• L->E/L,E
L->EL’
L’->,EL’/€
Remove left factoring/left recursion
• D->X/X,D
• X->id/id[c]
• S->T/T;S
• A->id=E/id(E)=E
• F->C/id/id[E]
• C->G/GC
• S->iEtS/iEtSeS
Types of parsing techniques
• Top down parsing and bottom up parsing

• Top down parsing:


– Recursive Descent parsing(or) brute force method
– Predictive parsing(or) non recursive descent
parsing
Recursive Descent parsing : string “cad”
Examples for recursive descent parsing
• S->abA derive string “ab”
• A->cd/c/€

• S->abA
• A->bc/b derive string”abb”
Predictive parsing
• This is also called as recursive predictive parsing
• No need of backtracking
• To construct predictive parsing, we need to compute
the First and Follow for all Non terminals
Properties of LL(1)
• No ambiguous or left recursive grammar can be LL(1).
• Grammar G is LL(1) iff whenever A| are two
distinct productions of G and:
– For no terminal a do both  and  derive strings beginning
with a.
FIRST()FIRST()=
– At most one of  and  can derive the empty string.
– If , the  does not derive any string beginning with a
terminal in FOLLOW(A).
FIRST(FOLLOW(A))FIRST(FOLLOW(A))=
*
Error Recovery in Predictive Parsing

• The parser design should be able to provide an error


message (an error message which depicts as much
possible information as it can ).
• It should be recovering from that error case, and it
should be able to continue parsing with the rest of
the input.
4 Types of error recovery strategies
Panic mode error recovery
• Panic-mode error recovery says that all the
input symbols are skipped until a
synchronizing token is found from the string.
• What is the synchronizing token?
• The terminal symbols which lie in follow set of
non-terminal they can be used as a
synchronizing token set for that non-terminal
How to select synchronizing set?
• Place all symbols in FOLLOW(A) into the
synchronizing set for non terminal A. If we skip
tokens until an element of FOLLOW(A) is seen and
pop A from the stack, it likely that parsing can
continue.
• We might add keywords that begins statements to
the synchronizing sets for the non terminals
generating expressions.
• If a non terminal can generate the empty string, then
the production deriving  can be used as a default. This
may postpone some error detection, but cannot cause
an error to be missed. This approach reduces the
number of non terminals that have to be considered
during error recovery.
• If a terminal on top of stack cannot be matched, a
simple idea is to pop the terminal, issue a message
saying that the terminal was inserted.
Example: error recovery
“synch” indicating synchronizing tokens
obtained from FOLLOW set of the
nonterminal in question.
If the parser looks up entry
M[A,a] and finds that it is blank, the input
symbol a is skipped.
If the entry is synch, the the
nonterminal on top of the stack is popped.
If a token on top of the stack
does not match the input symbol, then we
pop the token from the stack.
Example: error recovery (II)
Example 2
Phrase-level recovery
• Phrase level recovery is implemented by filling in the
blank entries in the predictive parsing table with pointers
to error routines.
• At each unfilled entry in the parsing table, it is filled by a
pointer to a special error routine that will take care of
that error case specifically.
• These error routines can be of different types like :
– change, insert, or delete input symbols.
– issue appropriate error messages.
– pop items from the stack.
• We should be careful while we design these error
routines because we may put the parser into an infinite loop
Error productions
• If a user has knowledge of common errors that can be
encountered then, these errors can be incorporated by
augmenting the grammar with error productions that
generate erroneous constructs.
• If this is used then, during parsing appropriate error messages
can be generated and parsing can be continued.
Global corrections
• The parser examines the whole program and tries to
find out the closest match for it which is error-free.
• The closest match program has less number of
insertions, deletions, and changes of tokens to
recover from erroneous input.
• Due to high time and space complexity, this method
is not implemented practically.
1. Universal parsers:-
– These parsers perform parsing. Cocke - Kalami -
Younger (CKY) algorithm. It uses the Chomsky
Normal Form (CNF) of the CFG.
– It is in efficient and not used in Commercial
Compilers.
2. Top down parsers: -
• These parsers build parse trees starting from the root
node and work up to the Leaves.
• These parsers Satisfy LL grammars.
• These Scanners scan from left to right and use left-
most derivation.
• These parsers are Categorized into
– back track parsing.
– Predictive parsing
– In back track parsing, if a sequence of erroneous
expansions are made and a mismatch is discovered,
the input pointer rolled back to the initial position
and all effects of parsing are undone.
– In predictive parsing, using current non terminal to
be expanded and the next input symbol, the parser
makes a decision of which production needs to be
expanded.
– Two ways of implementing predictive parsers:
– Recursive descent parsing have Procedure for
Parsing each non-terminal using the recursive nature
of The procedures.
– Table driven parsing. determines the next production
to be applied by using a parsing table. Indexed by
the Current non-terminal and the next input symbol.
3. Bottom up Parsers:

• There Parsers build parse trees starting from the


leaves and work up to the root node.
• They satisfy LR grammars.
• They scan from left to right and use right most
derivation.
• Here the input is reduced to the start symbol.
• These parsers are implemented using stacks.
• Here the next input symbol is shifted onto the stack
• Then the top few elements are popped out of the stack
if a match occurs to right side of the production. It is
then reduced to the left side of the production.
• Bottom up parsing is also called as shift reduce
parsing.
• These passers are categorized into
– Operator precedence parsers
• In operator precedence parsers, the choice of
whether to shift the next symbol. Or reduce the
processed input is decided using a precedence
relation table.
• In LR Parsers, passing table derived out of
grammar is used for deciding whether to shift the
next input Symbol or reduce the processed input.
These parsers are categorized as
– Simple LR (SLR)
– Canonical LR (CLR) and
– Look Ahead LR (LALR) based on the way the
parser tables are derived.

You might also like