0% found this document useful (0 votes)
141 views36 pages

Matrusri Engineering College: Subject Name

The document discusses compiler construction and outlines the course objectives, outcomes, and content. It covers topics like context free grammars, parsing, abstract syntax trees, lexical analysis, error handling, derivations, parse trees, ambiguity, and elimination of left recursion. The goals are to introduce language translation pipelines, scanning, parsing strategies, semantic analysis, symbol tables, intermediate code generation, and program analysis techniques for code optimization.

Uploaded by

Sai Krishna
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
141 views36 pages

Matrusri Engineering College: Subject Name

The document discusses compiler construction and outlines the course objectives, outcomes, and content. It covers topics like context free grammars, parsing, abstract syntax trees, lexical analysis, error handling, derivations, parse trees, ambiguity, and elimination of left recursion. The goals are to introduce language translation pipelines, scanning, parsing strategies, semantic analysis, symbol tables, intermediate code generation, and program analysis techniques for code optimization.

Uploaded by

Sai Krishna
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 36

MATRUSRI

ENGINEERING COLLEGE

MATRUSRI ENGINEERING COLLEGE


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

SUBJECT NAME: Compiler Construction

FACULTY NAME: L. Raghavendar Raju, Assistant Professor

25-10-2020 Compiler Construction 1


MATRUSRI
ENGINEERING COLLEGE

Compiler Construction

COURSE OBJECTIVES:
➢ To introduce the steps in language translation pipeline and
runtime data structures used in translation
➢ To learn about Scanning (lexical analysis) process using
regular expressions and use of LEX to generate scanner
➢ To introduce different Parsing strategies including top-
down (e.g., recursive descent, Early parsing, or LL) and
bottom-up (e.g., backtracking or LR) techniques
➢ Describe semantic analyses using an attribute grammar
➢ To learn how to build symbol tables and generate
intermediate code.
➢ To introduce techniques of program analysis and code
25-10-2020
optimization Compiler Construction 2
MATRUSRI
ENGINEERING COLLEGE

Compiler Construction

COURSE OUTCOMES:
After completing this course, the student will be able to
1. Create lexical rules and grammars for a given language
2.Generate scanners and parsers from declarative
specifications.
3. Describe an abstract syntax tree for a small language.
4. Use program analysis techniques for code optimization
5. Develop the compiler for a subset of a given language

25-10-2020 Compiler Construction 3


MATRUSRI
ENGINEERING COLLEGE

UNIT-II
Context Free Grammars & Parsing: The parsing process
Context free grammars
Parse tree &Abstract syntax trees,
EBNF and syntax diagrams, and Properties of CFLs.
Top Down Parsing: Recursive descent parsing, LL (1) parsing,
First and follow sets, Recursive descent parser, and
Error recovery in top down parsers.

OUTCOMES:
1. Generate scanners and parsers from declarative specifications.
2. Describe an abstract syntax tree for a small language.

25-10-2020 Compiler Construction 4


MATRUSRI
ENGINEERING COLLEGE

MODULE-I

CONTENTS:
Context Free Grammars & Parsing:
The parsing process, Context free grammars,
Parse tree &Abstract syntax trees,
EBNF and syntax diagrams, and Properties of CFLs.

OUTCOMES:
Generate scanners and parsers from declarative specifications

25-10-2020 Compiler Construction 5


MATRUSRI
ENGINEERING COLLEGE

The role of parser

token
Source Lexical Parse tree Rest of Front Intermediate
Parser
program Analyzer End representation
getNext
Token

Symbol
table

25-10-2020 Compiler Construction 6


MATRUSRI
ENGINEERING COLLEGE

Uses of grammars
E -> E + T | T
T -> T * F | F
F -> (E) | id

E -> TE’
E’ -> +TE’ | Ɛ
T -> FT’
T’ -> *FT’ | Ɛ
F -> (E) | id

25-10-2020 Compiler Construction 7


MATRUSRI
ENGINEERING COLLEGE

Error handling
• Common programming errors
• Lexical errors
• Syntactic errors
• Semantic errors
• Lexical errors
• Error handler goals
• Report the presence of errors clearly and accurately
• Recover from each error quickly enough to detect subsequent errors
• Add minimal overhead to the processing of correct progrms

25-10-2020 Compiler Construction 8


MATRUSRI
ENGINEERING COLLEGE

Error-recover strategies
• Panic mode recovery
• Discard input symbol one at a time until one of designated set of
synchronization tokens is found
• Phrase level recovery
• Replacing a prefix of remaining input by some string that allows the parser to
continue
• Error productions
• Augment the grammar with productions that generate the erroneous constructs
• Global correction
• Choosing minimal sequence of changes to obtain a globally least-cost
correction

25-10-2020 Compiler Construction 9


MATRUSRI
ENGINEERING COLLEGE

Context free grammars


• Terminals
• Nonterminals expression -> expression + term
• Start symbol expression -> expression – term
• productions expression -> term
term -> term * factor
term -> term / factor
term -> factor
factor -> (expression)
factor -> id

25-10-2020 Compiler Construction 10


MATRUSRI
ENGINEERING COLLEGE

Derivations
Rightmost and
leftmost
derivations
Productions are treated • E -> E + E | E * E | -E |
as rewriting rules to (E) | id
generate a string • Derivations for –(id+id)
• E => -E => -(E) => -
(E+E) => -(id+E)=>-
(id+id)

25-10-2020 Compiler Construction 11


MATRUSRI
ENGINEERING COLLEGE

Parse trees

• -(id+id)
• E => -E => -(E) => -(E+E) => -(id+E)=>-
(id+id)

25-10-2020 Compiler Construction 12


MATRUSRI
ENGINEERING COLLEGE

Ambiguity
• For some strings there exist more than one parse tree
• Or more than one leftmost derivation
• Or more than one rightmost derivation
• Example: id+id*id

25-10-2020 Compiler Construction 13


MATRUSRI
ENGINEERING COLLEGE

Elimination of ambiguity

25-10-2020 Compiler Construction 14


MATRUSRI
ENGINEERING COLLEGE

Elimination of ambiguity (cont.)

• Idea:
• A statement appearing between a then and an else must be matched

25-10-2020 Compiler Construction 15


MATRUSRI
ENGINEERING COLLEGE

Elimination of left recursion


• A grammar is left recursive if it has a non-terminal A such that there
is a derivation A=> Aα +

• Top down parsing methods cant handle left-recursive grammars


• A simple rule for direct left recursion elimination:
• For a rule like:
• A -> A α|β
• We may replace it with
• A -> β A’
• A’ -> α A’ | ɛ

25-10-2020 Compiler Construction 16


MATRUSRI
ENGINEERING COLLEGE

Left recursion elimination (cont.)


• There are cases like following
• S -> Aa | b
• A -> Ac | Sd | ɛ
• Left recursion elimination algorithm:
• Arrange the nonterminals in some order A1,A2,…,An.
• For (each i from 1 to n) {
• For (each j from 1 to i-1) {
• Replace each production of the form Ai-> Aj γ by the production Ai -> δ1 γ | δ2 γ | …
|δk γ where Aj-> δ1 | δ2 | … |δk are all current Aj productions
• }
• Eliminate left recursion among the Ai-productions
• }

25-10-2020 Compiler Construction 17


MATRUSRI
ENGINEERING COLLEGE

Left factoring
• Left factoring is a grammar transformation that is useful for producing
a grammar suitable for predictive or top-down parsing.
• Consider following grammar:
• Stmt -> if expr then stmt else stmt
• | if expr then stmt
• On seeing input if it is not clear for the parser which production to use
• We can easily perform left factoring:
• If we have A->αβ1 | αβ2 then we replace it with
• A -> αA’
• A’ -> β1 | β2

25-10-2020 Compiler Construction 18


MATRUSRI
ENGINEERING COLLEGE

Left factoring (cont.)


• Algorithm
• For each non-terminal A, find the longest prefix α common to two or more of
its alternatives. If α<> ɛ, then replace all of A-productions A->αβ1 |αβ2 | … |
αβn | γ by
• A -> αA’ | γ
• A’ -> β1 |β2 | … | βn
• Example:
• S -> I E t S | i E t S e S | a
• E -> b

25-10-2020 Compiler Construction 19


MATRUSRI
ENGINEERING COLLEGE

MODULE-II

CONTENTS:
Top Down Parsing: Recursive descent parsing,
LL (1) parsing,
First and follow sets,
Recursive descent parser, and
Error recovery in top down parsers.

OUTCOMES:
Describe the parsing techniques for given languages

25-10-2020 Compiler Construction 20


MATRUSRI
ENGINEERING COLLEGE

Introduction
• A Top-down parser tries to create a parse tree from the root
towards the leafs scanning input from left to right
• It can be also viewed as finding a leftmost derivation for an
input string
• Example: id+id*id

E -> TE’ E
lm
E
lm
E
lm
E
lm
E
lm
E

E’ -> +TE’ | Ɛ T E’ T E’ T E’ T E’ T E’
T -> FT’
T’ -> *FT’ | Ɛ F T’ F T’ F T’ F T’ + T E’

F -> (E) | id id id Ɛ id Ɛ

25-10-2020 Compiler Construction 21


MATRUSRI
ENGINEERING COLLEGE

Recursive descent parsing


• Consists of a set of procedures, one for each nonterminal
• Execution begins with the procedure for start symbol
• A typical procedure for a non-terminal

void A() {
choose an A-production, A->X1X2..Xk
for (i=1 to k) {
if (Xi is a nonterminal
call procedure Xi();
else if (Xi equals the current input symbol a)
advance the input to the next symbol;
else /* an error has occurred */
}
25-10-2020 } Compiler Construction 22
MATRUSRI
ENGINEERING COLLEGE

Recursive descent parsing (cont)


• General recursive descent may require backtracking
• The previous code needs to be modified to allow
backtracking
• In general form it cant choose an A-production easily.
• So we need to try all alternatives
• If one failed the input pointer needs to be reset and another
alternative should be tried
• Recursive descent parsers cant be used for left-recursive
grammars

25-10-2020 Compiler Construction 23


MATRUSRI
ENGINEERING COLLEGE

Example

S->cAd
A->ab | a Input: cad

S S S

c A d c A d c A d

a b a

25-10-2020 Compiler Construction 24


MATRUSRI
ENGINEERING COLLEGE

First and Follow

• First() is set of terminals that begins strings derived from


*
• If α=>ɛ then is also in First(ɛ)
• In predictive parsing when we have A-> α|β, if First(α) and First(β) are
disjoint sets then we can select appropriate A-production by looking at the
next input
• Follow(A), for any nonterminal A, is set of terminals a that can appear
immediately after A in* some sentential form
• If we have S => αAaβ for some αand βthen a is in Follow(A)
• If A can be the rightmost symbol in some sentential form, then $ is in
Follow(A)
25-10-2020 Compiler Construction 25
MATRUSRI
ENGINEERING COLLEGE

Computing First
• To compute First(X) for all grammar symbols X, apply following rules
until no more* terminals or ɛ can be added to any First set:
1. If X is a terminal then First(X) = {X}.
2. If X is a nonterminal and X->Y1Y2…Yk is a production for some k>=1,
then place a in First(X) if for some i a is in First(Yi) and ɛ is in all of
First(Y1),…,First(Yi-1) that is Y1…Yi-1 => ɛ. if ɛ is in First(Yj) for j=1,…,k
then add ɛ to First(X).
3. If X-> ɛ is a production then add ɛ to First(X)
*
• Example!

25-10-2020 Compiler Construction 26


MATRUSRI
ENGINEERING COLLEGE

Computing follow
• To compute First(A) for all nonterminals A, apply following rules until
nothing can be added to any follow set:
1. Place $ in Follow(S) where S is the start symbol
2. If there is a production A-> αBβ then everything in First(β) except ɛ is in
Follow(B).
3. If there is a production A->B or a production A->αBβ where First(β)
contains ɛ, then everything in Follow(A) is in Follow(B)
• Example!

25-10-2020 Compiler Construction 27


MATRUSRI
ENGINEERING COLLEGE

LL(1) Grammars
• Predictive parsers are those recursive descent parsers needing no
backtracking
• Grammars for which we can create predictive parsers are called LL(1)
• The first L means scanning input from left to right
• The second L means leftmost derivation
• And 1 stands for using one input symbol for lookahead
• A grammar G is LL(1) if and only if whenever A-> α|β are two distinct
productions of G, the following conditions hold:
• For no terminal a do α and β both derive strings beginning with a
• At most one of* α or βcan derive empty string
• If α=> ɛ then β does not derive any string beginning with a terminal in Follow(A).
25-10-2020 Compiler Construction 28
MATRUSRI
ENGINEERING COLLEGE

Construction of predictive parsing table

• For each production A->α in grammar do the following:


1. For each terminal a in First(α) add A-> in M[A,a]
2. If ɛ is in First(α), then for each terminal b in Follow(A) add A-> ɛ to M[A,b].
If ɛ is in First(α) and $ is in Follow(A), add A-> ɛ to M[A,$] as well
• If after performing the above, there is no production in M[A,a] then
set M[A,a] to error

25-10-2020 Compiler Construction 29


MATRUSRI
ENGINEERING COLLEGE

First
Example Follow
F {(,id} {+, *, ), $}
E -> TE’ {(,id} {+, ), $}
E’ -> +TE’ | Ɛ T
E {(,id} {), $}
T -> FT’
T’ -> *FT’ | Ɛ E’ {+,ɛ} {), $}
T’ {*,ɛ} {+, ), $}
F -> (E) | id
Input Symbol
Non -
terminal id + * ( ) $
E E -> TE’ E -> TE’

E’ E’ -> +TE’ E’ -> Ɛ E’ -> Ɛ

T T -> FT’ T -> FT’

T’ T’ -> Ɛ T’ -> *FT’ T’ -> Ɛ T’ -> Ɛ

25-10-2020 Compiler Construction 30


F F -> id F -> (E)
MATRUSRI
ENGINEERING COLLEGE

Another example
S -> iEtSS’ | a
S’ -> eS | Ɛ
E -> b

Input Symbol
Non -
terminal a b e i t $
S S -> a S -> iEtSS’

S’ S’ -> Ɛ S’ -> Ɛ
S’ -> eS
E E -> b

25-10-2020 Compiler Construction 31


MATRUSRI
ENGINEERING COLLEGE

Non-recursive predicting parsing

a + b $

Predictive
parsing output
stack X
Y program
Z
$
Parsing
Table
25-10-2020
M
Compiler Construction 32
MATRUSRI
ENGINEERING COLLEGE

Predictive parsing algorithm


Set ip point to the first symbol of w;
Set X to the top stack symbol;
While (X<>$) { /* stack is not empty */
if (X is a) pop the stack and advance ip;
else if (X is a terminal) error();
else if (M[X,a] is an error entry) error();
else if (M[X,a] = X->Y1Y2..Yk) {
output the production X->Y1Y2..Yk;
pop the stack;
push Yk,…,Y2,Y1 on to the stack with Y1 on top;
}
set X to the top stack symbol;
}

25-10-2020 Compiler Construction 33


MATRUSRI
ENGINEERING COLLEGE

Example
• id+id*id$

Matched Stack Input Action


E$ id+id*id$

25-10-2020 Compiler Construction 34


MATRUSRI
ENGINEERING COLLEGE

Error recovery in predictive parsing


• Panic mode
• Place all symbols in Follow(A) into synchronization set for nonterminal A: skip tokens until an
element of Follow(A) is seen and pop A from stack.
• Add to the synchronization set of lower level construct the symbols that begin higher level
constructs
• Add symbols in First(A) to the synchronization set of nonterminal A
• If a nonterminal can generate the empty string then the production deriving can be used as a
default
• If a terminal on top of the stack cannot be matched, pop the terminal, issue a message saying
that the terminal was insterted

25-10-2020 Compiler Construction 35


MATRUSRI
ENGINEERING COLLEGE

Non - Input Symbol


Example terminal
E
id
E -> TE’
+ * ( )
E -> TE’ synch
$
synch

E’ E’ -> +TE’ E’ -> Ɛ E’ -> Ɛ

T -> FT’ synch T -> FT’ synch synch


T
T’ -> Ɛ T’ -> *FT’ T’ -> Ɛ T’ -> Ɛ
T’

F F -> id synch synch F -> (E) synch synch

Stack Input Action


E$ )id*+id$ Error, Skip )
E$ id*+id$ id is in First(E)
TE’$ id*+id$
FT’E’$ id*+id$
idT’E’$ id*+id$
T’E’$ *+id$
*FT’E’$ *+id$
FT’E’$ +id$ Error, M[F,+]=synch
T’E’$ +id$ F has been poped

25-10-2020 Compiler Construction 36

You might also like