0% found this document useful (0 votes)
47 views37 pages

CD Unit-1

The document provides an overview of compilers and their structure. It discusses the different phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, code optimization, and code generation. It also compares compilers to interpreters and assemblers, noting key differences in how each translates source code. The overall purpose is to explain the basic components and process of compilation.

Uploaded by

dartionjp
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
47 views37 pages

CD Unit-1

The document provides an overview of compilers and their structure. It discusses the different phases of a compiler, including lexical analysis, syntax analysis, semantic analysis, code optimization, and code generation. It also compares compilers to interpreters and assemblers, noting key differences in how each translates source code. The overall purpose is to explain the basic components and process of compilation.

Uploaded by

dartionjp
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 37

UNIT-1

Overview of the Compiler


and its Structure

1
Language processor
 A language processor is a software program designed or used to
perform tasks such as processing program code to machine code.
 The programs are written mostly in high level languages like Java,
C++, Python etc. and are called source code.
 These source code cannot be executed directly by the computer
and must be converted into machine language to be executed.
 Hence, a special translator system software is used to translate
the program written in high-level language into machine code is
called Language Processor and the program after translated into
machine code (object program / object code).

2
Translator
 A translator is a program that takes one form of program as input
and converts it into another form.
 Types of translators are:
1. Compiler
2. Interpreter
3. Assembler

Source Translator Target


Program Program

Error
Messages (If any)
3
Compiler
 A compiler is a program that reads a program written in source
language and translates it into an equivalent program in target
language.

void main() 0000 1100 0010


{ 0100
Source
int a=1,b=2,c; Compiler 0111 1000 0001
Target
c=a+b; Program 1111 0101 1110
Program
printf(“%d”,c); 1100 0000 1000
} 1011

Source Error Target


Program Messages (If any) Program

4
Interpreter
 Interpreter is also program that reads a program written in source
language and translates it into an equivalent program in target
language line by line.

Void main() 0000 1100 0010


{ 0000
int a=1,b=2,c; Interpreter 1111 1100 0010
c=a+b;
1010 1100 0010
printf(“%d”,c);
0011 1100 0010
} 1111
Error
Source Target
Messages (If any)
Program Program

5
Assembler
• Assembler is a translator which takes the assembly code as an
input and generates the machine code as an output.

MOV id3, R1 0000 1100 0010


MUL #2.0, R1 0100
MOV id2, R2 0111 1000 0001
MUL R2, R1 Assembler 1111 0101 1110
MOV id1, R2 1100 0000 1000
ADD R2, R1 1011
MOV R1, id1 1100 0000 1000
Error
Assembly Code Messages (If any) Machine Code

6
Difference between compiler & interpreter
Compiler Interpreter
Scans the entire program and translates It translates program’s one statement at
it as a whole into machine code. a time.

It generates intermediate code. It does not generate intermediate code.

An error is displayed after entire An error is displayed for every


program is checked. instruction interpreted if any.

Requirement of memory is more. Requirement of memory is less.


Example: C compiler Example: Basic, Python, Ruby

7
Difference between assembler & interpreter
Assembler Interpreter
It converts low-level language to the It converts high-level language to the
machine language. machine language.
The program for an Assembler is written The program for an Interpreter is
for particular hardware. written for particular language.
It is one to one i.e. one instruction It is one to many i.e. one instruction
translates to only one instruction. translates to many instruction.
It translates entire program before It translates program instructions line by
running. line.
Errors are displayed before program is Errors are displayed for the every
running. interpreted instruction (if any).
It is used only one time to create an It is used everytime when the program
executable file. is running.
Requirement of memory is less. Requirement of memory is more.

8
Analysis synthesis model of compilation
 There are two part of compilation.
1. Analysis Phase
2. Synthesis Phase

void main() Analysis Synthesis 0000 1100


{ Phase Phase 0111 1000
int a=1,b=2,c; 0001
c=a+b; 1111 0101
printf(“%d”,c); Intermediate 1000
} Representation 1011

Source Code Target Code

9
Analysis phase & Synthesis phase
Analysis Phase Synthesis Phase
 Analysis part breaks up the  The synthesis part constructs
source program into the desired target program
constituent pieces and from the intermediate
creates an intermediate representation.
representation of the source  Synthesis phase consist of the
program. following sub phases:
 Analysis phase consist of 1. Code optimization
three sub phases: 2. Code generation
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
10
Phases of compiler
Compiler

Analysis phase Synthesis phase

Lexical analysis
Intermediate Code
code optimization
Syntax analysis generation
Code
Semantic analysis generation

11
Lexical analysis
 Lexical Analysis is also called linear analysis Position = initial + rate*60
or scanning.
Lexical analysis
 Lexical Analyzer divides the given source
statement into the tokens.
id1 = id2 + id3 * 60
 Ex: Position = initial + rate * 60 would be
grouped into the following tokens:
Position (identifier)
= (Assignment symbol)
initial (identifier)
+ (Plus symbol)
rate (identifier)
* (Multiplication symbol)
60 (Number)

12
Phases of compiler
Compiler

Analysis phase Synthesis phase

Lexical analysis
Intermediate Code
code optimization
Syntax analysis generation
Code
Semantic analysis generation

13
Syntax analysis
 Syntax Analysis is also called Parsing or Position = initial + rate*60
Hierarchical Analysis. Lexical analysis
 The syntax analyzer checks each line of
the code and spots every tiny mistake. id1 = id2 + id3 * 60
 If code is error free then syntax Syntax analysis
analyzer generates the tree.
=

id1 +

id2 *
id3 60

14
Phases of compiler
Compiler

Analysis phase Synthesis phase

Lexical analysis
Intermediate Code
code optimization
Syntax analysis generation
Code
Semantic analysis generation

15
Semantic analysis
=
 Semantic analyzer determines the
meaning of a source string. id1 +
 It performs following operations: id2 * int to
1. matching of parenthesis in the real
id3 60
expression.
2. Matching of if..else statement. Semantic analysis
3. Performing arithmetic operation =
that are type compatible. id1 +
4. Checking the scope of operation.
id2 *

*Note: Consider id1, id2 and id3 are real id3 inttoreal

60
16
Phases of compiler
Compiler

Analysis phase Synthesis phase

Lexical analysis
Intermediate Code
code optimization
Syntax analysis generation

Code
Semantic analysis generation

17
Intermediate code generator
=
 Two important properties of
intermediate code : id1 +
1. It should be easy to produce.
id2 *
2. Easy to translate into target
t3 id3 inttoreal
program. t2 t1
 Intermediate form can be represented 60
Intermediate code
using “three address code”.
 Three address code consist of a
t1= int to real(60)
sequence of instruction, each of which t2= id3 * t1
has at most three operands. t3= t2 + id2
id1= t3

18
Phases of compiler
Compiler

Analysis phase Synthesis phase

Lexical analysis
Intermediate Code
code optimization
Syntax analysis generation
Code
Semantic analysis generation

19
Code optimization
 It improves the intermediate code.
 This is necessary to have a faster Intermediate code
execution of code or less consumption
t1= int to real(60)
of memory. t2= id3 * t1
t3= t2 + id2
id1= t3

Code optimization

t1= id3 * 60.0


id1 = id2 + t1

20
Phases of compiler
Compiler

Analysis phase Synthesis phase

Lexical analysis
Intermediate Code
code optimization
Syntax analysis generation
Code
Semantic analysis generation

21
Code generation
 The intermediate code instructions are
Code optimization
translated into sequence of machine
instruction. t1= id3 * 60.0
id1 = id2 + t1

Code generation

MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2,R1
MOVF R1, id1

22
Phases of compiler
Source program

Analysis Phase
Lexical analysis

Syntax analysis

Semantic analysis
Symbol table Error detection
and recovery
Intermediate code

Sr. Variable Type Address Code optimization


Name
1 Position Float 0001
Code generation Synthesis Phase
2 Initial Float 0005
3 Rate Float 0009
Target Program
23
Exercise
 Write output of all the phases of compiler for following
statements:
1. x = b-c*2
2. I=p*n*r/100

24
Context of compiler (Cousins of compiler)
Skeletal
Source Preprocessor
Program

Preprocessor
It performs the following functions:
1. Macro processing
2. File inclusion
3. Rational preprocessor
4. Language extensions

25
Context of compiler (Cousins of compiler)
Skeletal
Source Preprocessor
Program

Preprocessor
1. Macro processing: Allows user to define macros. Macro is shorthand
for longer constructs.
Ex: #define PI 3.14159265358979323846
2. File inclusion: A preprocessor may include the header file into the
program.
Ex: #include<stdio.h>
3. Rational preprocessor: It provides built in macro for construct like
while statement or if statement.

26
Context of compiler (Cousins of compiler)
Skeletal
Source Preprocessor
Program

Preprocessor
4. Language extensions: Add capabilities to the language by using
built-in macros.
• Ex: the language equal is a database query language
embedded in C.
• Statement beginning with ## are taken by preprocessor to be
database access statement unrelated to C and translated into
procedure call on routines that perform the database access.

27
Context of compiler (Cousins of compiler)
Skeletal
Source Preprocessor Compiler
Source
Program Program

Compiler
• A compiler is a program that reads a program written in source
language and translates it into an equivalent program in target
language.

28
Context of compiler (Cousins of compiler)
Skeletal
Preprocessor Compiler
Source Source Target Assembler
Program Program Assembly
Program

Assembler
• Assembler is a translator which takes the assembly program
(mnemonic) as an input and generates the machine code as an
output.

29
Context of compiler (Cousins of compiler)
Skeletal
Preprocessor Compiler
Source Source Target Assembler Relocatable
Program Program Assembly Object Code
Program
Linker / Loader
Libraries &
Linker Object Files
• Linker makes a single program from a several files of relocatable
machine code.
• These files may have been the result of several different compilation,
and one or more library files.

30
Context of compiler (Cousins of compiler)
Skeletal
Preprocessor Compiler
Source Source Target Assembler Relocatable
Program Program Assembly Object Code
Program
Linker / Loader
Libraries &
Loader Object Files
• The process of loading consists of: Absolute
1. Taking relocatable machine code. Machine
Code
2. Altering the relocatable address.
3. Placing the altered instructions and
data in memory at the proper
location.

31
Front end & back end (Grouping of phases)
Front end: Depends primarily on source language and largely independent of
the target machine.
It includes following phases:
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Intermediate code generation
5. Creation of symbol table
Back end: Depends on target machine and do not depend on source program.
It includes following phases:
6. Code optimization
7. Code generation phase
8. Error handling and symbol table operation

32
Pass structure
 One complete scan of a source program is called pass.
 Pass includes reading an input file and writing to the output file.
 Compiler pass refers to the traversal of a compiler through the
entire program.
 Compiler pass are two types: Single Pass Compiler, and Two Pass
Compiler or Multi Pass Compiler.

33
Single Pass Compiler
 If we combine or group all the phases of compiler in
a single module known as single pass compiler.
 A one pass/single pass compiler is that type of compiler that
passes through the part of each compilation unit exactly once.
 Single pass compiler is faster and smaller than the multi pass
compiler.
 As a disadvantage of single pass compiler is that it is less efficient
in comparison with multipass compiler.
 Single pass compiler is one that processes the input exactly once,
so going directly from lexical analysis to code generator, and then
going back for the next read.

34
Two Pass compiler or Multi Pass compiler
 A Two pass/multi-pass Compiler is a type of compiler that processes
the source code or abstract syntax tree of a program multiple times.
In multipass Compiler we divide phases in two pass as:
1. First Pass is refers as
• Front end
• Analytic part
• Platform independent
2. Second Pass: is refers as
• Back end
• Synthesis Part
• Platform Dependent

35
Pass structure
Forward reference: A forward reference of a program entity is a
reference to the entity which precedes its definition in the program.
 This problem can be solved by postponing the generation of target
code until more information concerning the entity becomes
available.
 It leads to multi pass model of compilation.
 In Pass I: Perform analysis of the source program and note
relevant information.
 In Pass II: Generate target code using information noted in pass I.

36
Types of compiler
1. One pass compiler
• It is a type of compiler that compiles whole process in one-pass.
2. Two pass compiler
• It is a type of compiler that compiles whole process in two-pass.
• It generates intermediate code.
3. Incremental compiler
• The compiler which compiles only the changed line from the source
code and update the object code.
4. Native code compiler
• The compiler used to compile a source code for a same type of
platform only.
5. Cross compiler
• The compiler used to compile a source code for a different kinds platform.
37

You might also like