Compiler: Various Phases of Compilation
Introduction to Compiler
• A compiler is a translator that converts a program written in a high-level language (HLL) into machine language.
• High-level language is written by a developer, while machine language is what the processor understands.
• The compiler also reports errors in the source program to the programmer.
• The main purpose of a compiler is to translate code from one language into another without changing the meaning of the program.
• A program written in an HLL is translated in two parts before it can be executed.
• In the first part, the source program is compiled and translated into the object program (low-level language).
• In the second part, the object program is translated into the target program by the assembler.
There are various types of compilers.
Some of the common types include:
• Source-to-source compiler: This compiler transforms source code in one language into source code in a different language. Examples include CoffeeScript and Haxe.
• Cross compiler: This compiler runs on one machine and produces code that is executed on a different machine. GNU Compiler Collection (GCC) can be built as a cross compiler.
• JIT (just-in-time) compiler: This compiler defers compilation until run time. It is used by language implementations such as the Java HotSpot virtual machine, JavaScript engines such as V8, and PyPy for Python.
• Hardware compiler: This compiler produces a hardware configuration as its output, rather than a sequence of instructions. Xilinx Synthesis Tool (XST) is a good example of a hardware compiler.
Introduction to language processing system
Learning the language processing system for the C programming language can improve our understanding of how a compiler is used. This system consists of various components that process the input language to produce the desired output.
The following provides a brief explanation of these components:
• Preprocessor: This tool produces an output that is used as the input for the compiler. It
performs various operations such as macro-processing, language extension, and file inclusion.
• Compiler: This compiles the high-level language and translates it into a language that can be
understood by the assembler (assembly code or low-level language).
• Assembler: This tool uses the output of the compiler as its input. In this tool, the assembly
code is transformed into machine code. The output produced by the assembler is called the
object file.
• Linker: This tool combines the object files produced by the assembler, together with any library code they reference, into executable machine code. Here, references between the separately translated program parts are resolved.
• Loader: This tool loads the executable machine code into memory for execution.
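The macro-processing step of the preprocessor can be sketched in a few lines. This is a toy model, not the real C preprocessor: the function name `preprocess` is made up for illustration, and it only handles object-like `#define NAME value` lines.

```python
import re

def preprocess(source: str) -> str:
    """Expand object-like '#define NAME value' macros (toy model only)."""
    macros = {}   # macro name -> replacement text
    output = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#define"):
            parts = stripped.split(None, 2)
            name = parts[1]
            value = parts[2] if len(parts) > 2 else ""
            macros[name] = value
            continue  # directive lines do not reach the compiler
        for name, value in macros.items():
            # \b keeps MAX from matching inside MAXIMUM
            line = re.sub(rf"\b{re.escape(name)}\b", lambda m, v=value: v, line)
        output.append(line)
    return "\n".join(output)

print(preprocess("#define MAX 10\nint a[MAX];"))
```

The output of this step is plain source text with the macros expanded, which is what the compiler receives as input.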
The compiler design architecture
The compiler design architecture can be divided into two main parts: analysis and synthesis.
Analysis
This part represents the front-end in compiler design.
It consists of various operations such as analyzing the source code, dividing the code into sections, and checking for errors. It also constructs a symbol table that maps source code symbols to related information such as type, scope, and location.
An intermediate representation (IR) of the program is generated and analyzed before it is sent to the
synthesis phase. The analysis part of the architecture consists of phases such as preprocessing, lexical
analysis, syntax analysis, and semantic analysis.
Synthesis
This part uses the intermediate code representation as the input. It represents the back-end in compiler
design. The synthesis part of the architecture utilizes the symbol table and the intermediate code
representation to produce the target program. It consists of phases such as optimization and code
generation.
Difference Between Compiler and Interpreter
Basis | Compiler | Interpreter
Analysis | The entire program is analyzed. | The program is analyzed line by line.
Machine code | Machine code is stored on disk. | Machine code is not stored anywhere.
Execution | Execution happens only after the entire program is compiled. | Execution takes place after every line is evaluated, so errors, if any, are raised line by line.
Run time | A compiled program runs faster. | An interpreted program runs slower.
Generation | Compilation produces an output program that runs independently of the source file. | Interpretation does not produce an output program, so the source is evaluated on every execution.
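The difference in the Execution row can be imitated with Python's built-in `compile` and `exec`. This is only an analogy under stated assumptions: the helper names `run_whole` and `run_line_by_line` are invented here, and Python itself compiles to bytecode, so neither function is a real compiler or interpreter.

```python
PROGRAM = "a = 1\nb = a + 1\nc = = 3\n"   # the third line has a syntax error

def run_whole(source, env):
    """Compiler-like: analyze the entire program first, then execute it."""
    code = compile(source, "<program>", "exec")  # SyntaxError is raised here
    exec(code, env)

def run_line_by_line(source, env):
    """Interpreter-like: analyze and execute one line at a time."""
    for line in source.splitlines():
        exec(compile(line, "<line>", "exec"), env)

for runner in (run_whole, run_line_by_line):
    env = {}
    try:
        runner(PROGRAM, env)
    except SyntaxError:
        # Report which variables had already been assigned when the error hit.
        print(runner.__name__, "stopped; assigned so far:",
              sorted(k for k in env if not k.startswith("__")))
```

With whole-program analysis nothing runs before the error is reported, while the line-by-line version has already executed the first two lines.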
Compiler Phases
The compilation process is a sequence of phases. Each phase takes the source program in one representation and produces output in another representation. Each phase takes its input from the previous phase.
Symbol table
It is a kind of data structure that stores the identifiers of a program and their attributes.
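A symbol table is often modeled as a dictionary per scope, with a link to the enclosing scope. The sketch below is one possible layout, not a prescribed design; the class name `SymbolTable` and the attribute keys `type` and `location` are assumptions chosen for illustration.

```python
class SymbolTable:
    """One scope of identifiers; 'parent' points to the enclosing scope."""

    def __init__(self, parent=None):
        self.symbols = {}     # identifier name -> attribute dictionary
        self.parent = parent

    def define(self, name, type_, location):
        if name in self.symbols:
            raise KeyError(f"'{name}' is already declared in this scope")
        self.symbols[name] = {"type": type_, "location": location}

    def lookup(self, name):
        """Search this scope first, then the enclosing scopes."""
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        return None           # undeclared identifier


global_scope = SymbolTable()
global_scope.define("x", "int", 0)
inner_scope = SymbolTable(parent=global_scope)
inner_scope.define("y", "float", 4)
print(inner_scope.lookup("x"))   # found in the enclosing scope
print(global_scope.lookup("y"))  # not visible from the outer scope
```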
1. Lexical analysis
This is the first phase of the compiler. It receives the source code, scans it, and groups the characters into lexemes. The lexical analyzer represents each lexeme as a token.
Tokens fall into categories such as keywords, identifiers, constants, operators, and separators. Comments and whitespace are usually discarded at this stage.
Example: x = a + b * 20
Here, x, a, and b are identifiers (tokens), 20 is a constant (token),
and =, +, and * are operators (tokens).
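A lexical analyzer for statements like the one above can be sketched with regular expressions. The token category names and the `tokenize` function are assumptions made for this example; a real lexer would also handle keywords, comments, and richer literals.

```python
import re

# Order matters: earlier patterns win when several could match.
TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[=+\-*/]"),
    ("SEPARATOR",  r"[();]"),
    ("SKIP",       r"\s+"),
    ("MISMATCH",   r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Turn source text into a list of (category, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":
            continue                      # whitespace is not a token
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        tokens.append((kind, lexeme))
    return tokens

for token in tokenize("x = a + b * 20"):
    print(token)
```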
2. Syntax analysis
This phase is also referred to as parsing. It uses the tokens generated in the previous phase to produce a syntax tree (parse tree). It checks whether the token expressions are syntactically correct.
Syntax tree for x = a + b * 20:
      =
     / \
    x   +
       / \
      a   *
         / \
        b   20
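One common way to build such a tree is recursive descent, with one method per precedence level. The sketch below is a minimal version under assumptions: the `Parser` class, the tuple representation `(operator, left, right)`, and the tiny grammar (one assignment with `+ - * /` and parentheses) are all chosen for illustration.

```python
import re

def tokenize(source):
    # Simplified lexer: returns lexemes only and silently skips other characters.
    return re.findall(r"\d+|[A-Za-z_]\w*|[=+\-*/()]", source)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        return self.advance()

    def parse_assignment(self):          # assignment := IDENT '=' expr
        target = self.advance()
        if target is None or not (target[0].isalpha() or target[0] == "_"):
            raise SyntaxError(f"expected an identifier, found {target!r}")
        self.expect("=")
        value = self.parse_expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return ("=", target, value)

    def parse_expr(self):                # expr := term (('+' | '-') term)*
        node = self.parse_term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.parse_term())
        return node

    def parse_term(self):                # term := factor (('*' | '/') factor)*
        node = self.parse_factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.parse_factor())
        return node

    def parse_factor(self):              # factor := IDENT | NUMBER | '(' expr ')'
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = self.parse_expr()
            self.expect(")")
            return node
        if token[0].isalnum() or token[0] == "_":
            return token
        raise SyntaxError(f"unexpected token {token!r}")

def parse(source):
    return Parser(tokenize(source)).parse_assignment()

print(parse("x = a + b * 20"))
```

Because `parse_term` is called from inside `parse_expr`, multiplication binds tighter than addition, which is why `b * 20` ends up as a subtree of `+`.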
3. Semantic Analyzer
It checks whether the parse tree is meaningful and produces a verified parse tree. It also performs type checking, label checking, and flow control checking.
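Type checking can be sketched as a walk over the parse tree that looks up each identifier in the symbol table. The rules below are assumptions invented for this example (three types, `int` may widen to `float` on assignment, no arithmetic on strings); real languages define their own rules.

```python
def check_types(node, symbols):
    """Return the type of a tree node, or raise if the tree is not meaningful.

    'node' is either a lexeme (str) or a tuple (operator, left, right);
    'symbols' maps identifier names to "int", "float", or "string".
    """
    if isinstance(node, str):
        if node.isdigit():
            return "int"                          # integer constant
        if node not in symbols:
            raise NameError(f"undeclared identifier '{node}'")
        return symbols[node]

    op, left, right = node
    left_type = check_types(left, symbols)
    right_type = check_types(right, symbols)

    if op == "=":
        # Assumed rule: same types, or an int value stored in a float variable.
        if left_type != right_type and (left_type, right_type) != ("float", "int"):
            raise TypeError(f"cannot assign {right_type} to {left_type}")
        return left_type

    if "string" in (left_type, right_type):
        raise TypeError(f"operator {op} is not defined for strings")
    return "float" if "float" in (left_type, right_type) else "int"


tree = ("=", "x", ("+", "a", ("*", "b", "20")))
print(check_types(tree, {"x": "float", "a": "int", "b": "float"}))
```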
4. Intermediate Code Generation
In intermediate code generation, the compiler translates the source program into intermediate code. Intermediate code sits between the high-level language and the machine language.
It should be generated in such a way that it can easily be translated into the target machine code.
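A widely used intermediate form is three-address code, where each instruction has at most one operator. The sketch below generates it from a tree of `(operator, left, right)` tuples; the function name `generate_tac` and the `tempN` naming scheme are assumptions, and the exact temporaries it produces may differ from those of any particular compiler.

```python
def generate_tac(tree):
    """Translate an assignment tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def visit(node):
        if isinstance(node, str):
            return node                   # identifier or constant: use as is
        op, left, right = node
        left_name = visit(left)           # operands are evaluated first
        right_name = visit(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, value = tree               # root is ('=', target, expression)
    code.append(f"{target} = {visit(value)}")
    return code


for instruction in generate_tac(("=", "id1", ("+", "id2", ("*", "id3", "50")))):
    print(instruction)
```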
5. Code Optimization
Code optimization is an optional phase. It improves the intermediate code so that the output program runs faster and takes less space. It removes unnecessary lines of code and rearranges the sequence of statements to speed up program execution.
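Two simple optimizations, constant propagation and removal of a redundant copy, can be sketched on three-address code as follows. The function `optimize` is a hypothetical helper; it assumes temporaries are named `tempN`, that each is assigned once, and that a temporary copied into a variable is not used again afterwards.

```python
def optimize(code):
    """Apply constant propagation and copy removal to three-address code."""
    constants = {}    # temp name -> constant value it is known to hold
    result = []       # list of (destination, expression) pairs
    for line in code:
        dest, expr = [part.strip() for part in line.split("=", 1)]
        # Replace temps that hold a known constant by the constant itself.
        operands = [constants.get(token, token) for token in expr.split()]

        if len(operands) == 1 and operands[0].isdigit() and dest.startswith("temp"):
            constants[dest] = operands[0]         # 'temp = 50' is no longer needed
            continue

        if (len(operands) == 1 and result
                and operands[0].startswith("temp")
                and result[-1][0] == operands[0]):
            # 'x = temp' right after 'temp = ...': compute into x directly.
            result[-1] = (dest, result[-1][1])
            continue

        result.append((dest, " ".join(operands)))
    return [f"{dest} = {expr}" for dest, expr in result]


unoptimized = [
    "temp1 = 50",
    "temp2 = id3 * temp1",
    "temp3 = id2 + temp2",
    "id1 = temp3",
]
for instruction in optimize(unoptimized):
    print(instruction)
```

Four instructions shrink to two. The surviving temporary keeps its original name here; a real optimizer might renumber it.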
6. Code Generation
Code generation is the final stage of the compilation process. It takes the optimized intermediate code as input and maps it to the machine language of the target computer.
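Example: a statement of the form id1 = id2 + id3 * 50 passing through the later phases.
Semantic analysis output: parse tree, semantically verified (it has some meaning)
Intermediate code:
temp1 = 50
temp2 = id3 * temp1
temp3 = id2 + temp2
id1 = temp3
Optimized code:
temp1 = id3 * 50
id1 = id2 + temp1
Target code (illustrative assembly, destination operand first):
MOV R1, #50
MOV R2, id3
MUL R1, R2
ADD R1, id2
MOV id1, R1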
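The mapping from intermediate code to target instructions can be sketched as a simple per-instruction translation. Everything about the target here is an assumption: a made-up two-address instruction set (`MOV`, `ADD`, `SUB`, `MUL`, `DIV`, destination first), a single register `R1`, and temporaries kept in memory. A real code generator would also allocate registers and select instructions far more carefully.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def operand(token):
    """Constants become immediates (#50); names are used as memory operands."""
    return f"#{token}" if token.isdigit() else token

def generate_code(tac):
    """Translate three-address code into the hypothetical assembly."""
    asm = []
    for line in tac:
        dest, expr = [part.strip() for part in line.split("=", 1)]
        parts = expr.split()
        if len(parts) == 1:                       # plain copy: dest = src
            asm.append(f"MOV R1, {operand(parts[0])}")
        else:                                     # dest = left op right
            left, op, right = parts
            asm.append(f"MOV R1, {operand(left)}")
            asm.append(f"{OPCODES[op]} R1, {operand(right)}")
        asm.append(f"MOV {dest}, R1")             # store the result
    return asm


for instruction in generate_code(["temp1 = id3 * 50", "id1 = id2 + temp1"]):
    print(instruction)
```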
Compiler Passes
A pass is a complete traversal of the source program. A compiler can be designed as either a multi-pass or a one-pass compiler.
Multi-pass Compiler
• A multi-pass compiler processes the source code of a program several times.
• In the first pass, the compiler reads the source program, scans it, extracts the tokens, and stores the result in an output file.
• In the second pass, the compiler reads the output file produced by the first pass, builds the syntax tree, and performs syntax analysis. The output of this pass is a file that contains the syntax tree.
• In the third pass, the compiler reads the output file produced by the second pass and checks whether the tree follows the rules of the language. The output of the semantic analysis pass is the annotated syntax tree.
• Passes continue in this way until the target output is produced.
One-pass Compiler
• A one-pass compiler traverses the program only once. It passes through each part of a compilation unit a single time and translates each part into its final machine code.
• In a one-pass compiler, when a source line is processed, it is scanned and its tokens are extracted.
• Then the syntax of the line is analyzed and the tree structure is built. After semantic analysis, the code is generated.
• The same process is repeated for each line of code until the entire program is compiled.
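The difference between the two organizations is the order in which the phases touch the program. The sketch below makes that order visible with a trace list. The phase functions `scan`, `check`, and `emit` are deliberately trivial stand-ins invented for this example, and in-memory lists play the role of the intermediate files.

```python
import re

def scan(line, trace):
    trace.append("scan")
    return re.findall(r"\d+|[A-Za-z_]\w*|[=+\-*/]", line)

def check(tokens, trace):
    trace.append("check")
    if len(tokens) < 3 or tokens[1] != "=":       # toy syntax rule
        raise SyntaxError(f"not an assignment: {tokens}")
    return tokens

def emit(tokens, trace):
    trace.append("emit")
    return " ".join(tokens)                       # stand-in for code generation

def multi_pass(lines, trace):
    """Each phase runs over the whole program before the next phase starts."""
    token_lists = [scan(line, trace) for line in lines]           # pass 1
    checked = [check(tokens, trace) for tokens in token_lists]    # pass 2
    return [emit(tokens, trace) for tokens in checked]            # pass 3

def one_pass(lines, trace):
    """Each line goes through every phase before the next line is read."""
    return [emit(check(scan(line, trace), trace), trace) for line in lines]


program = ["x = a + 1", "y = x * 2"]
multi_trace, one_trace = [], []
multi_pass(program, multi_trace)
one_pass(program, one_trace)
print(multi_trace)
print(one_trace)
```

Both functions produce the same output for a valid program; only the interleaving of the phases differs.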
1. What is a compiler?
a) system program that converts instructions to machine language
b) system program that converts machine language to high-level language
c) system program that writes instructions to perform
d) None of the mentioned
Answer: a
2. Which of the following is a stage of compiler design?
a) Semantic analysis
b) Intermediate code generator
c) Code generator
d) All of the mentioned
Answer: d
Explanation: The phases of a compiler are:
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Intermediate code generator
5. Code optimizer
6. Code generator
3. What is the output of the lexical analyzer?
a) Data Types
b) Tokens
c) Code
d) All of the mentioned
Answer: b
4. Which of the following errors can a compiler detect?
a) Syntax Error
b) Logical Error
c) Both Logical and Syntax Error
d) Compiler cannot check errors
Answer: a
5. A programmer by mistake writes a program to multiply two numbers instead of dividing them. How can this error be detected?
a) Compiler or interpreter
b) Compiler only
c) Interpreter only
d) None of the mentioned
Answer: d
Explanation: This is a logical error that can’t be detected by any compiler or interpreter.
6. Who is responsible for the creation of the symbol table?
a) Assembler
b) Compiler
c) Interpreter
d) All of the mentioned
Answer: b
Explanation: The compiler builds the symbol table, which stores the identifiers of the program and their attributes.
7. Which of the following is known as a compiler for a high-level language that runs on one
machine and produces code for a different machine?
a) Cross compiler
b) Multipass compiler
c) Optimizing compiler
d) One pass compiler
Answer: a
8. Which system program integrates a program’s individually compiled modules into a form that can be executed?
a) Interpreter
b) Assembler
c) Compiler
d) Linking Loader
Answer: d
Explanation: A linking loader is a loader that combines the functionality of a relocating loader with the ability to combine a number of independently compiled program segments.
9. A compiler is a program that
a) Accepts a program written in a high-level language and produces an object program
b) Puts a program into memory and executes it
c) Translates assembly language into machine language
d) None of the mentioned
Answer: a
Explanation: A compiler is software (or a combination of programs) that converts source code written in one programming language (the source language) into code written in another programming language (the target language), often in a binary form known as object code.
10. Which phase of the compiler is Syntax Analysis?
a) Second
b) Third
c) First
d) All of the mentioned
Answer: a