Compiler Design Unit 1, 2
● Cross Compiler – a compiler that runs on a machine ‘A’ and produces code for
another machine ‘B’. It is capable of creating code for a platform other than the
one on which the compiler itself is running.
● Source-to-source Compiler (also called a transcompiler or transpiler) – a
compiler that translates source code written in one programming language into
source code of another programming language.
● Assembler – For every platform (Hardware + OS) we have a separate assembler;
assemblers are not universal. The output of an assembler is called an object file.
It translates assembly language to machine code.
● Relocatable Machine Code – code that can be loaded at any memory address
and run. The addresses within the program are arranged so that they remain
valid when the program is moved.
● Loader/Linker – It converts relocatable code into absolute code and tries to run
the program, resulting in a running program or an error message (or sometimes
both). The linker combines a variety of object files into a single executable file;
the loader then loads it into memory and executes it.
Analysis Phase – An intermediate representation is created from the given source code:
1. Lexical Analyzer
2. Syntax Analyzer
3. Semantic Analyzer
● The Lexical analyzer divides the program into “tokens”, the Syntax analyzer
recognizes “sentences” in the program using the syntax of the language, and the
Semantic analyzer checks the static semantics of each construct.
Synthesis Phase – An equivalent target program is created from the intermediate
representation. It has three parts:
1. Intermediate Code Generator
2. Code Optimizer
3. Code Generator
The Intermediate Code Generator generates “abstract” code, the Code Optimizer
optimizes the abstract code, and finally the Code Generator translates the abstract
intermediate code into specific machine instructions.
Compiler construction tools
The compiler writer can use some specialized tools that help in implementing various
phases of a compiler. These tools assist in the creation of an entire compiler or its
parts. Some commonly used compiler construction tools include:
1. Parser Generator –
It produces syntax analyzers (parsers) from an input based on a grammatical
description of the programming language, i.e. a context-free grammar. It is
useful because the syntax analysis phase is highly complex and consumes a
lot of effort and time when written by hand.
Example: Yacc, Bison
2. Scanner Generator –
It generates lexical analyzers from an input that consists of regular-expression
descriptions of the tokens of a language. It generates a finite automaton to
recognize the regular expressions (a tiny automaton sketch follows after this list).
Example: Lex
3. Syntax Directed Translation Engines –
They generate intermediate code in three-address format from an input that
consists of a parse tree. These engines have routines to traverse the parse tree
and produce the intermediate code; each node of the parse tree is associated
with one or more translations.
4. Automatic Code Generators –
They generate machine language for a target machine. Each operation of the
intermediate language is translated into the target machine language using a
collection of rules. A template-matching process is used: an intermediate-language
statement is replaced by its equivalent machine-language statement using
templates.
5. Data-Flow Analysis Engines –
They are used in code optimization. Data-flow analysis is a key part of code
optimization that gathers information about the values that flow from one part
of a program to another.
6. Compiler Construction Toolkits –
They provide an integrated set of routines that aid in building compiler
components or in the construction of various phases of a compiler.
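As a tiny illustration of item 2 above, here is a hand-coded sketch in C of the kind of
finite automaton a scanner generator emits for the regular expression [0-9]+. The
encoding is an assumption made for illustration; generated scanners are table-driven
and handle many token classes at once.

#include <stdio.h>

int main(void) {
    const char *input = "273";
    int state = 0;                      /* 0 = start, 1 = accepting, -1 = dead */
    for (const char *p = input; *p != '\0'; p++) {
        if (*p >= '0' && *p <= '9')
            state = 1;                  /* a digit reaches/stays in the accepting state */
        else {
            state = -1;                 /* any other character rejects the input */
            break;
        }
    }
    printf("\"%s\" %s [0-9]+\n", input,
           state == 1 ? "matches" : "does not match");
    return 0;
}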
We basically have two phases of compilers, namely Analysis phase and
Synthesis phase. Analysis phase creates an intermediate representation from the
given source code. Synthesis phase creates an equivalent target program from
the intermediate representation.
Symbol Table – It is a data structure used and maintained by the compiler that stores
all the identifiers' names along with their types. It helps the compiler function
smoothly by finding identifiers quickly (a minimal sketch follows).
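A minimal sketch in C of such a symbol table, kept as a linear list with hypothetical
field sizes (real compilers typically use hash tables for fast lookup):

#include <stdio.h>
#include <string.h>

struct symbol {
    char name[32];                      /* identifier's name */
    char type[16];                      /* identifier's type */
};

static struct symbol table[100];        /* fixed-size table for the sketch */
static int count = 0;

static void insert(const char *name, const char *type) {
    strcpy(table[count].name, name);
    strcpy(table[count].type, type);
    count++;
}

/* lookup is what lets the compiler find identifiers quickly */
static struct symbol *lookup(const char *name) {
    for (int i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

int main(void) {
    insert("a", "int");
    insert("b", "float");
    struct symbol *s = lookup("b");
    if (s) printf("%s has type %s\n", s->name, s->type);
    return 0;
}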
The compiler has two modules, namely the front end and the back end. The front end
consists of the lexical analyzer, syntax analyzer, semantic analyzer and intermediate
code generator. The rest are assembled to form the back end.
1. Lexical Analyzer – It reads the program and converts it into tokens. It
converts a stream of lexemes into a stream of tokens. Tokens are defined
by regular expressions which are understood by the lexical analyzer. It also
removes white-spaces and comments.
2. Syntax Analyzer – It is sometimes called the parser. It constructs the parse
tree. It takes all the tokens one by one and uses a Context Free Grammar to
construct the parse tree.
Why Grammar?
The rules of a programming language can be entirely represented by a few
productions. Using these productions we can describe what a valid program
is. The input has to be checked for whether it is in the desired format or
not.
A syntax error can be detected at this level if the input is not in accordance
with the grammar.
3. Semantic Analyzer – It verifies the parse tree, checking whether it is
meaningful or not, and produces a verified parse tree. It also does type
checking, label checking and flow-control checking.
4. Intermediate Code Generator – It generates intermediate code, that is, a
form which can be readily executed by a machine. We have many popular
intermediate codes; three-address code is one example (a small sketch of it
follows after this list). Intermediate code is converted to machine language
using the last two phases, which are platform dependent.
Up to the intermediate code, compilation is the same for every compiler;
after that, it depends on the platform. To build a new compiler we don't
need to build it from scratch: we can take the intermediate code from an
already existing compiler and build only the last two parts.
5. Code Optimizer – It transforms the code so that it consumes fewer
resources and runs faster. The meaning of the code being transformed is
not altered. Optimization can be categorized into two types: machine
dependent and machine independent.
6. Target Code Generator – Its main purpose is to produce code that the
machine can understand; it also performs register allocation, instruction
selection, etc. The output depends on the type of assembler. This is the
final stage of compilation.
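As the small sketch promised above: one plausible three-address code translation of
the C statement a = b + c * d (the temporary names t1 and t2 are hypothetical,
introduced by the intermediate code generator):

t1 = c * d
t2 = b + t1
a = t2

Each three-address instruction applies at most one operator, which keeps the
translation performed by the last two platform-dependent phases simple.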
The symbol table is built in the syntax and semantic analysis phases.
It is used by the compiler to achieve compile-time efficiency.
Lexical Analysis is the first phase of the compiler, also known as scanning. It converts
the high-level input program into a sequence of Tokens.
● Lexical Analysis can be implemented with a Deterministic Finite Automaton.
● The output is a sequence of tokens that is sent to the parser for syntax
analysis
What is a token?
A lexical token is a sequence of characters that can be treated as a unit in the grammar
of the programming languages.
Example of tokens:
● Type token (id, number, real, . . . )
● Punctuation tokens (IF, void, return, . . . )
● Alphabetic tokens (keywords)
Keywords; Examples: for, while, if, etc.
Identifiers; Examples: variable names, function names, etc.
Operators; Examples: '+', '++', '-', etc.
Separators; Examples: ',', ';', etc.
Example of Non-Tokens:
● Comments, preprocessor directives, macros, blanks, tabs, newlines, etc.
Lexeme: The sequence of characters matched by a pattern to form
the corresponding token, or a sequence of input characters that comprises a single
token, is called a lexeme. E.g. “float”, “abs_zero_Kelvin”, “=”, “-”, “273”, “;”.
How Lexical Analyzer functions
1. Tokenization, i.e. dividing the program into valid tokens.
2. Remove white space characters.
3. Remove comments.
4. It also helps in generating error messages by providing the row number and
column number.
The lexical analyzer identifies errors with the help of the automaton and the
grammar of the given language on which it is based (like C or C++), and gives the
row number and column number of the error.
Suppose we pass a statement through the lexical analyzer –
a = b + c ; It will generate a token sequence like this:
id = id + id ; where each id refers to its variable in the symbol table,
with all its details.
For example, consider the program
int main()
{
// 2 variables
int a, b;
a = 10;
return 0;
}
All the valid tokens are:
'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';'
'a' '=' '10' ';' 'return' '0' ';' '}'
Above are the valid tokens.
You can observe that we have omitted comments.
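A minimal sketch of this tokenization in C (the token class names are assumptions; a
real lexical analyzer also handles keywords, multi-character operators and comments):

#include <stdio.h>
#include <ctype.h>

int main(void) {
    const char *src = "a = b + c;";
    const char *p = src;
    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;                                   /* skip white space */
        } else if (isalpha((unsigned char)*p) || *p == '_') {
            const char *start = p;                 /* identifier lexeme */
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            printf("id: %.*s\n", (int)(p - start), start);
        } else if (isdigit((unsigned char)*p)) {
            const char *start = p;                 /* number lexeme */
            while (isdigit((unsigned char)*p)) p++;
            printf("number: %.*s\n", (int)(p - start), start);
        } else {
            printf("op/sep: %c\n", *p);            /* single-character operator/separator */
            p++;
        }
    }
    return 0;
}

For the input a = b + c; it prints the sequence id, op/sep, id, op/sep, id, op/sep –
exactly the id = id + id ; stream described above.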
Flex (Fast Lexical Analyzer Generator)
FLEX is a tool/computer program for generating lexical analyzers (scanners or lexers),
written by Vern Paxson in C around 1987. It is used together with the Berkeley Yacc
parser generator or the GNU Bison parser generator. Flex and Bison are both more
flexible than Lex and Yacc and produce faster code.
Bison produces a parser from the input file provided by the user. The function yylex()
is automatically generated by flex when it is provided with a .l file, and this yylex()
function is called by the parser to retrieve tokens from the token stream.
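A minimal sketch of a Flex specification (.l file); the token classes below are
assumptions for illustration, and the actions print what was matched instead of
returning token codes to a parser:

%{
#include <stdio.h>
%}
%%
[0-9]+                  { printf("NUMBER: %s\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]*  { printf("ID: %s\n", yytext); }
[ \t\n]+                { /* skip white space */ }
.                       { printf("OP/SEP: %s\n", yytext); }
%%
int yywrap(void) { return 1; }
int main(void) { yylex(); return 0; }

It can be built with something like flex scanner.l && cc lex.yy.c. In a real Flex/Bison
setup, each action would instead return a token code for the parser to consume via
yylex().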
Introduction to Syntax Analysis
When an input string (source code or a program in some language) is given to a
compiler, the compiler processes it in several phases, starting from lexical analysis
(scans the input and divides it into tokens) to target code generation.
Syntax Analysis or Parsing is the second phase, i.e. after lexical analysis. It checks the
syntactical structure of the given input, i.e. whether the given input is in the correct
syntax (of the language in which the input has been written) or not. It does so by
building a data structure, called a Parse tree or Syntax tree. The parse tree is
constructed by using the pre-defined Grammar of the language and the input string. If
the given input string can be produced with the help of the syntax tree (in the derivation
process), the input string is found to be in the correct syntax. If not, an error is
reported by the syntax analyzer.
The Grammar for a Language consists of Production rules.
Example:
Suppose Production rules for the Grammar of a language are:
S -> cAd
A -> bc|a
And the input string is “cad”.
Now the parser attempts to construct the syntax tree from this grammar for the given
input string. It uses the given production rules and applies them as needed to
generate the string “cad”:
i. Start with the start symbol S.
ii. Apply S -> cAd, giving the sentential form “cAd”.
iii. Apply the first alternative A -> bc, giving “cbcd”.
iv. Backtrack and apply A -> a instead, giving “cad”.
In step iii above, the production rule A -> bc was not a suitable one to apply (because
the string produced is “cbcd”, not “cad”); here the parser needs to backtrack and apply
the next production rule available for A, which is shown in step iv, and the string
“cad” is produced.
Thus, the given input can be produced by the given grammar, therefore the input is
correct in syntax.
But backtracking was needed to get the correct syntax tree, which is a complex
process to implement. There is an easier way to solve this, using the concepts of
FIRST and FOLLOW sets.
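Still, the brute-force idea is easy to sketch. Below is a minimal backtracking
recognizer in C for this very grammar (the function and variable names are my own,
not any standard API):

#include <stdio.h>

static const char *input;
static int pos;

/* A -> bc | a : try each alternative, restoring pos on failure */
static int matchA(void) {
    int save = pos;
    if (input[pos] == 'b' && input[pos + 1] == 'c') { pos += 2; return 1; }
    pos = save;                     /* backtrack to try the next alternative */
    if (input[pos] == 'a') { pos += 1; return 1; }
    pos = save;
    return 0;
}

/* S -> cAd */
static int matchS(void) {
    if (input[pos] != 'c') return 0;
    pos++;
    if (!matchA()) return 0;
    if (input[pos] != 'd') return 0;
    pos++;
    return 1;
}

int main(void) {
    input = "cad";
    pos = 0;
    if (matchS() && input[pos] == '\0')
        printf("\"%s\" is in the language\n", input);
    else
        printf("\"%s\" is NOT in the language\n", input);
    return 0;
}

On “cad”, matchA() first tries A -> bc, fails, restores the input position and succeeds
with A -> a – mirroring the backtrack from step iii to step iv above.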
Context Free Grammars (CFG) can be classified on the basis of the following two
properties:
1) Based on the number of strings the grammar generates.
● If a CFG generates a finite number of strings, then the CFG is Non-Recursive.
● If a CFG can generate an infinite number of strings, then the grammar is said
to be a Recursive grammar.
2) Based on the number of derivation trees.
● If there is only 1 derivation tree, then the CFG is unambiguous.
● If there is more than 1 derivation tree, then the CFG is ambiguous.
During compilation, the parser uses the grammar of the language to make a
parse tree (or derivation tree) out of the source code. The grammar used must be
unambiguous; an ambiguous grammar must not be used for parsing.
Types of Recursive Grammars
Based on the nature of the recursion in a recursive grammar, a recursive CFG can be
again divided into the following:
● Left Recursive Grammar (having left Recursion)
● Right Recursive Grammar (having right Recursion)
● General Recursive Grammar (having general Recursion)
Note: A linear grammar is a context-free grammar that has at most one non-terminal in
the right-hand side of each of its productions.
Inherently ambiguous Language:
Let L be a Context Free Language (CFL). If every Context Free Grammar G with
language L = L(G) is ambiguous, then L is said to be an inherently ambiguous language.
Ambiguity is a property of grammars, not languages. An ambiguous grammar is
unlikely to be useful for a programming language, because two (or more) parse-tree
structures for the same string (program) imply two different meanings (executable
programs) for the program.
An inherently ambiguous language would be absolutely unsuitable as a programming
language, because we would not have any way of fixing a unique structure for all its
programs.
To disambiguate the grammar means rewriting the grammar such that there is only
one derivation or parse tree possible for any string of the language which the grammar
represents.
Role of the parser :
In the syntax analysis phase, a compiler verifies whether or not the tokens generated by
the lexical analyzer are grouped according to the syntactic rules of the language. This is
done by a parser. The parser obtains a string of tokens from the lexical analyzer and
verifies that the string can be generated by the grammar of the source language. It detects and
reports any syntax errors and produces a parse tree from which intermediate code can
be generated.
The process of deriving the string from the given grammar is known as derivation
(parsing).
Depending upon how the derivation is done, we have two kinds of parsers:
1. Top Down Parser
2. Bottom Up Parser
Top Down Parser
Top down parsing attempts to build the parse tree from the root to the leaves. A top
down parser starts from the start symbol and proceeds towards the string. It follows
leftmost derivation: in a leftmost derivation, the leftmost non-terminal in each
sentential form is always chosen.
Parsing is classified into two categories, i.e. Top Down Parsing and Bottom-Up Parsing.
Top-Down Parsing is based on Left Most Derivation whereas Bottom Up Parsing is
dependent on Reverse Right Most Derivation.
The process of constructing the parse tree which starts from the root and goes down to
the leaf is Top-Down Parsing.
● Top-Down Parsers construct the parse tree from a grammar which is free from
ambiguity and left recursion.
● Top-Down Parsers use leftmost derivation to construct the parse tree.
● They require a grammar without common prefixes, i.e. one to which left
factoring has already been applied.
Classification of Top-Down Parsing –
1. With Backtracking: Brute Force Technique or Recursive Descent Parsing
2. Without Backtracking: Predictive Parsing or Non-Recursive Parsing or LL(1)
Parsing or Table-Driven Parsing
Brute Force Technique or Recursive Descent Parsing –
1. Whenever a non-terminal is expanded for the first time, go with the first
alternative and compare it with the given input string (as in the backtracking
sketch shown earlier).
2. If no match occurs, go with the second alternative and compare it with the
given input string.
3. If a match is again not found, go with the next alternative, and so on.
4. If a match occurs for at least one alternative, the input string is parsed
successfully.
LL(1) or Table-Driven or Predictive Parser –
1. In LL(1), the first L stands for Left-to-Right scanning and the second L stands
for Left-most Derivation. The 1 stands for the number of lookahead tokens
used by the parser while parsing a sentence.
2. An LL(1) parser is constructed from a grammar which is free from left
recursion, common prefixes, and ambiguity.
3. The LL(1) parser depends on 1 lookahead symbol to predict the production
used to expand the parse tree.
4. This parser is Non-Recursive (a small table-driven sketch follows this list).
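As promised, a minimal sketch in C of non-recursive, table-driven LL(1) parsing. The
grammar S -> aS / b is a hypothetical toy chosen for brevity, and the parsing table M
is hard-coded into the if/else chain:

#include <stdio.h>

int main(void) {
    const char *input = "aab";       /* string to parse */
    char stack[64];
    int top = 0, ip = 0;

    stack[top++] = 'S';              /* push the start symbol */
    while (top > 0) {
        char X = stack[--top];       /* pop the topmost grammar symbol */
        char a = input[ip];          /* current lookahead */
        if (X == 'S') {
            /* table lookup: M[S, a] = S -> aS and M[S, b] = S -> b */
            if (a == 'a') { stack[top++] = 'S'; stack[top++] = 'a'; }
            else if (a == 'b') { stack[top++] = 'b'; }
            else { printf("error\n"); return 1; }
        } else {                     /* X is a terminal: it must match the input */
            if (X == a) ip++;
            else { printf("error\n"); return 1; }
        }
    }
    printf(input[ip] == '\0' ? "accepted\n" : "error: trailing input\n");
    return 0;
}

The explicit stack replaces recursion: on each step the parser pops a symbol and, for
a non-terminal, uses the single lookahead token to pick the production, pushing its
right-hand side in reverse so the leftmost symbol is expanded first.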
Next, we discuss Bottom Up parsers.
Bottom Up Parsers / Shift Reduce Parsers
Build the parse tree from leaves to root. Bottom-up parsing can be defined as an
attempt to reduce the input string w to the start symbol of grammar by tracing out the
rightmost derivations of w in reverse.
E.g., for the grammar E -> E+E / id and the input string “id + id”, a shift-reduce
parser reduces:
id + id   (reduce E -> id)
E + id    (reduce E -> id)
E + E     (reduce E -> E+E)
E
This traces the rightmost derivation E => E+E => E+id => id+id in reverse.
Classification of bottom up parsers
The most general form of shift-reduce parsing is LR parsing. The L stands for scanning
the input from left to right and the R stands for constructing a rightmost derivation in
reverse.
Benefits of LR parsing:
1. Many programming languages are parsed using some variation of an LR
parser. It should be noted that C++ and Perl are exceptions.
2. LR Parser can be implemented very efficiently.
3. Of all the parsers that scan their symbols from left to right, LR parsers
detect syntactic errors as soon as possible.
Important Notes
1. Even though a CLR parser does not have RR conflicts, the corresponding LALR parser may contain RR conflicts.
2. If number of states LR(0) = n1,
number of states SLR = n2,
number of states LALR = n3,
number of states CLR = n4 then,
n1 = n2 = n3 <= n4
Operator grammar and precedence parser
A grammar that is generated to define the mathematical operators is called an
operator grammar, with some restrictions on the grammar. An operator precedence
grammar is a context-free grammar that has the property that no production has
either an empty right-hand side (null production) or two adjacent non-terminals in its
right-hand side.
Examples –
This is an example of an operator grammar:
E->E+E/E*E/id
However, the grammar given below is not an operator grammar, because two
non-terminals are adjacent to each other:
S->SAS/a
A->bSb/b
Although, we can convert it into an operator grammar by substituting for A:
S->SbSbS/SbS/a
Operator precedence parser –
An operator precedence parser is one of the bottom-up parsers; it interprets an
operator-precedence grammar. This parser is only used for operator grammars.
Ambiguous grammars are not allowed for any parser except the operator precedence
parser (a small precedence-based sketch follows).
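A minimal sketch in C of precedence-based parsing for the operator grammar
E->E+E/E*E/id above, using the precedence-climbing formulation (ids are single
digits here, the parser evaluates as it reduces, and the helper names are hypothetical):

#include <stdio.h>

static const char *p;                /* cursor into the input */

static int prec(char op) {           /* precedence relations: * binds tighter than + */
    if (op == '*') return 2;
    if (op == '+') return 1;
    return 0;                        /* anything else ends the expression */
}

static int parse_expr(int min_prec) {
    int left = *p++ - '0';           /* primary: a single-digit id */
    while (prec(*p) > min_prec) {    /* next operator binds tighter: keep shifting */
        char op = *p++;
        int right = parse_expr(prec(op));
        left = (op == '+') ? left + right : left * right;  /* reduce */
    }
    return left;
}

int main(void) {
    p = "2+3*4";
    printf("2+3*4 = %d\n", parse_expr(0));   /* prints 14, since * > + */
    return 0;
}

The prec() table plays the role of the operator-precedence relations: the parser shifts
while the incoming operator has higher precedence than the current context, and
reduces otherwise.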
Syntax Directed Translation
Background: The parser uses a CFG (Context-Free Grammar) to validate the input
string and produce output for the next phase of the compiler. The output can be
either a parse tree or an abstract syntax tree. To interleave semantic analysis with the
syntax analysis phase of the compiler, we use Syntax Directed Translation.
Syntax Directed Translations are augmented rules to the grammar that facilitate
semantic analysis. SDT involves passing information bottom-up and/or top-down
through the parse tree in the form of attributes attached to the nodes.
Syntax directed translation rules use 1) lexical values of nodes, 2) constants, and 3)
attributes associated with the non-terminals in their definitions.
The general approach to Syntax-Directed Translation is to construct a parse tree or
syntax tree and compute the values of attributes at the nodes of the tree by visiting
them in some order. In many cases, translation can be done during parsing without
building an explicit tree.
Consider the following grammar, augmented with SDT rules, for expressions having
additions and multiplications in them:
E -> E+T { E.val = E.val + T.val }
E -> T { E.val = T.val }
T -> T*F { T.val = T.val * F.val }
T -> F { T.val = F.val }
F -> num { F.val = num.lexval }
The grammar alone syntactically validates such an expression. To carry out semantic
analysis we augment SDT rules to it, in order to pass information up the parse tree
and check for semantic errors, if any. In this example we focus on evaluation of the
given expression, as we don't have any semantic assertions to check in this very basic
example.
For understanding the translation rules further, consider the first SDT rule, augmented
to the [ E -> E+T ] production. The translation rule in question has val as an attribute
for both of the non-terminals – E & T. The right-hand side of the translation rule
corresponds to attribute values of the right-side nodes of the production rule, and
vice versa. Generalizing, SDTs are augmented rules to a CFG that associate 1) a set of
attributes with every node of the grammar and 2) a set of translation rules with every
production rule, using attributes, constants and lexical values.
To evaluate the translation rules, we can employ one depth-first traversal of the parse
tree. This works here because, for a grammar having all synthesized attributes, the
SDT rules impose no specific order of evaluation beyond requiring that children's
attributes be computed before their parents'. Otherwise, we would have to figure out
the best-suited plan to traverse the parse tree and evaluate all the attributes in one or
more traversals. For better understanding, we will move bottom-up, in left-to-right
fashion, computing the translation rules of our example.
In the resulting annotated parse tree, the flow of information happens bottom-up and
all the children's attributes are computed before their parents', as discussed above.
Right-hand-side nodes are sometimes annotated with subscript 1 to distinguish
between children and parent.
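A minimal sketch in C of this bottom-up evaluation, using a hand-built tree for
2 + (3 + 4) in place of a real parser and covering only the E -> E+T / num shape (the
node layout and helper names are assumptions for illustration):

#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    char kind;                        /* '+' for E -> E+T, 'n' for a numeric leaf */
    int  val;                         /* the synthesized attribute */
    struct Node *left, *right;
} Node;

static Node *leaf(int v) {
    Node *n = malloc(sizeof *n);
    n->kind = 'n'; n->val = v; n->left = n->right = NULL;
    return n;
}

static Node *add(Node *l, Node *r) {
    Node *n = malloc(sizeof *n);
    n->kind = '+'; n->val = 0; n->left = l; n->right = r;
    return n;
}

/* Depth-first: children first, then the parent's rule E.val = E.val + T.val */
static int eval(Node *n) {
    if (n->kind == 'n') return n->val;
    n->val = eval(n->left) + eval(n->right);
    return n->val;
}

int main(void) {
    Node *tree = add(leaf(2), add(leaf(3), leaf(4)));   /* 2 + (3 + 4) */
    printf("E.val = %d\n", eval(tree));                  /* prints 9 */
    return 0;
}

eval() computes all children before their parent, which is exactly the order the
synthesized rule E.val = E.val + T.val requires.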
Additional Information
Synthesized Attributes are such attributes that depend only on the attribute values of
children nodes.
Thus [ E -> E+T { E.val = E.val + T.val } ] has a synthesized attribute val corresponding
to node E. If all the semantic attributes in an augmented grammar are synthesized,
one depth-first (children-first) traversal is sufficient for the semantic analysis phase.
Inherited Attributes are such attributes that depend on parent and/or siblings
attributes.
Thus [ Ep -> E+T { Ep.val = E.val + T.val, T.val = Ep.val } ], where E & Ep are same
production symbols annotated to differentiate between parent and child, has an
inherited attribute val corresponding to node T.
Types of attributes –
Attributes may be of two types – Synthesized or Inherited.
1. Synthesized attributes –
A Synthesized attribute is an attribute of the non-terminal on the left-hand
side of a production. Synthesized attributes represent information that is
being passed up the parse tree. The attribute can take value only from its
children (Variables in the RHS of the production).
For example, let's say A -> BC is a production of a grammar; if A's attribute
depends on B's attributes or C's attributes, then it is a synthesized
attribute.
2. Inherited attributes –
An attribute of a nonterminal on the right-hand side of a production is called an
inherited attribute. The attribute can take value either from its parent or from its
siblings (variables in the LHS or RHS of the production).
For example, let's say A -> BC is a production of a grammar; if B's attribute
depends on A's attributes or C's attributes, then it is an inherited attribute.
Now, let's discuss S-attributed and L-attributed SDTs.
1. S-attributed SDT:
○ If an SDT uses only synthesized attributes, it is called an
S-attributed SDT.
○ S-attributed SDTs are evaluated in bottom-up parsing, as the
values of the parent nodes depend upon the values of the child
nodes.
○ Semantic actions are placed at the rightmost end of the RHS.
2. L-attributed SDT:
○ If an SDT uses both synthesized attributes and inherited
attributes, with the restriction that an inherited attribute can
inherit values from left siblings only, it is called an L-attributed SDT.
○ Attributes in L-attributed SDTs are evaluated in a depth-first,
left-to-right parsing manner.
○ Semantic actions can be placed anywhere in the RHS.
Note – If a definition is S-attributed, then it is also L-attributed, but NOT vice-versa.
Semantics
Semantics of a language provide meaning to its constructs, like tokens and
syntax structure. Semantics help interpret symbols, their types, and their
relations with each other. Semantic analysis judges whether the syntax
structure constructed in the source program derives any meaning or not.
CFG + semantic rules = Syntax Directed Definitions
For example:
int a = “value”;
should not issue an error in lexical and syntax analysis phase, as it is
lexically and structurally correct, but it should generate a semantic error as
the type of the assignment differs. These rules are set by the grammar of
the language and evaluated in semantic analysis. The following tasks should
be performed in semantic analysis:
● Scope resolution
● Type checking
● Array-bound checking
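As a minimal sketch of the type-checking task, here is how a semantic analyzer might
flag the int a = “value”; example above in C (the Type enum and function names are
hypothetical):

#include <stdio.h>

typedef enum { T_INT, T_STRING } Type;

/* Report a semantic error when the declared type and the
   initializer's type disagree, as in:  int a = "value";  */
static int check_assignment(Type declared, Type initializer) {
    if (declared != initializer) {
        fprintf(stderr, "semantic error: type mismatch in assignment\n");
        return 0;
    }
    return 1;
}

int main(void) {
    check_assignment(T_INT, T_STRING);  /* int a = "value";  -> error  */
    check_assignment(T_INT, T_INT);     /* int a = 5;        -> OK     */
    return 0;
}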
Attribute grammar is a special form of context-free grammar where some
additional information (attributes) is appended to one or more of its
non-terminals in order to provide context-sensitive information. Each
attribute has a well-defined domain of values, such as integer, float,
character, string, or expressions.
Attribute grammar is a medium to provide semantics to the context-free
grammar and it can help specify the syntax and semantics of a
programming language. Attribute grammar (when viewed as a parse-tree)
can pass values or information among the nodes of a tree.
Reduction: when the right-hand side of a production is reduced to its
corresponding non-terminal according to the grammar rules. Syntax trees are
parsed top-down and left to right; whenever a reduction occurs, we apply
the corresponding semantic rules (actions).
Semantic analysis uses Syntax Directed Translations to perform the above
tasks.
Semantic analyzer receives AST (Abstract Syntax Tree) from its previous
stage (syntax analysis).
The semantic analyzer attaches attribute information to the AST, which is
then called an attributed AST.
Attributes are two-tuple values: <attribute name, attribute value>.
For example:
int value = 5;
<type, “integer”>
<presentvalue, “5”>
For every production, we attach a semantic rule.
S-attributed SDT
If an SDT uses only synthesized attributes, it is called an S-attributed SDT.
These attributes are evaluated using S-attributed SDTs that have their
semantic actions written after the production (at the right-hand end).
As noted above, attributes in S-attributed SDTs are evaluated in
bottom-up parsing, as the values of the parent nodes depend upon the
values of the child nodes.
L-attributed SDT
This form of SDT uses both synthesized and inherited attributes with
restriction of not taking values from right siblings.
In L-attributed SDTs, a non-terminal can get values from its parent, child,
and sibling nodes. As in the following production
S → ABC
S can take values from A, B, and C (synthesized). A can take values from S
only. B can take values from S and A. C can get values from S, A, and B. No
non-terminal can get values from the sibling to its right.
Attributes in L-attributed SDTs are evaluated by depth-first and left-to-right
parsing manner.
We may conclude that if a definition is S-attributed, then it is also
L-attributed, as the L-attributed definition encloses S-attributed definitions.
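Finally, a minimal sketch in C of inherited-attribute flow for the classic declaration
grammar D -> T L, where every identifier in the list L inherits its type from T (the flat
array is a hypothetical stand-in for a real parse tree):

#include <stdio.h>

typedef enum { T_INT, T_FLOAT } Type;

/* L.inh = T.type : the type flows from the parent down to each id */
static void declare_list(Type inh, const char **ids, int n) {
    for (int i = 0; i < n; i++)
        printf("add to symbol table: %s : %s\n",
               ids[i], inh == T_INT ? "int" : "float");
}

int main(void) {
    /* models the declaration:  float a, b, c; */
    const char *ids[] = { "a", "b", "c" };
    declare_list(T_FLOAT, ids, 3);
    return 0;
}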