Compiler Construction
Phases of a Compiler
The 6 phases of a compiler are:
1. Lexical Analysis
2. Syntactic Analysis or Parsing
3. Semantic Analysis
4. Intermediate Code Generation
5. Code Optimization
6. Code Generation
1. Lexical Analysis: Lexical analysis is the first phase of the compiler. This phase
scans the source code and transforms the input program into a sequence of
tokens.
A token is a sequence of characters that forms a single unit of information in
the source code, such as a keyword, an identifier, an operator, or a literal.
In computer science, a program that executes the process of lexical analysis is
called a scanner, tokenizer, or lexer.
Roles and Responsibilities of Lexical Analyzer
• It removes comments and white space from the source program.
• It identifies the tokens.
• It categorizes the lexical units.
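The three responsibilities above can be illustrated with a minimal sketch of a scanner. The token categories (COMMENT, SKIP, NUMBER, ID, OP) and the `#` comment syntax are assumptions chosen for a tiny expression language; they do not come from any particular compiler.

```python
import re

# Hypothetical token categories for a tiny expression language.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),       # discarded by the scanner
    ("SKIP",    r"\s+"),           # white space, also discarded
    ("NUMBER",  r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("OP",      r"[+\-*/=()]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (category, lexeme) pairs, skipping comments and white space."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup not in ("COMMENT", "SKIP"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("x = 3 + y  # add"))
# [('ID', 'x'), ('OP', '='), ('NUMBER', '3'), ('OP', '+'), ('ID', 'y')]
```

Each pair holds the category assigned by the scanner and the matched characters. The comment and the white space never reach the next phase.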
2. Syntax Analysis:
Syntax analysis is the second phase of compilation. Here the token sequence is
checked against the grammar of the programming language. In other words, this
phase analyses the syntactic structure of the input and determines whether it is
correct in terms of the language's syntax.
The syntax analyzer accepts tokens as input and produces a parse tree as output.
This phase is also known as parsing.
Roles and Responsibilities of Syntax Analyzer
• Acquire tokens from the lexical analyzer.
• Build a parse tree.
• Detect and report syntax errors, if any.
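As an illustration of these responsibilities, here is a minimal sketch of a recursive-descent parser for arithmetic expressions. The grammar, the (category, lexeme) token format, and the nested-tuple tree shape are all assumptions made for this example; production parsers are often generated from a grammar instead of being written by hand.

```python
def parse(tokens):
    """Parse a list of (category, lexeme) tuples into a nested-tuple tree.

    Assumed grammar:
        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NUMBER | ID | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():
        kind, text = advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "ID":
            return ("id", text)
        if (kind, text) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree

tokens = [("ID", "a"), ("OP", "+"), ("NUMBER", "2"), ("OP", "*"), ("ID", "b")]
print(parse(tokens))
# ('+', ('id', 'a'), ('*', ('num', 2), ('id', 'b')))
```

The multiplication ends up deeper in the tree than the addition, which is how the tree records operator precedence. Malformed input such as a missing closing parenthesis raises a `SyntaxError`.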
3. Semantic Analysis:
Semantic analysis is the third phase of compilation. It checks whether the parse
tree follows the semantic rules of the language, for example that operand types
are compatible. It also keeps track of identifiers and expressions. In simple
words, the semantic analyzer verifies the validity of the parse tree and produces
an annotated syntax tree as output.
Roles and Responsibilities of Semantic Analyzer:
• Save the collected information in the symbol table or the syntax tree.
• Check for semantic errors.
• Report semantic errors.
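A minimal sketch of such a check is shown below. It assumes a nested-tuple tree with `num`, `str`, and `id` leaves, a plain dictionary standing in for the symbol table, and a deliberately strict rule that both operands of an operator must have the same type. Real languages have richer rules, so treat this only as an illustration of annotating a tree and reporting semantic errors.

```python
def annotate(node, symbols):
    """Return a copy of the tree in which every node carries its type as the last element.

    `symbols` is a hypothetical symbol table mapping identifier names to type names.
    """
    kind = node[0]
    if kind == "num":
        return ("num", node[1], "int")
    if kind == "str":
        return ("str", node[1], "string")
    if kind == "id":
        name = node[1]
        if name not in symbols:
            raise NameError(f"undeclared identifier {name!r}")
        return ("id", name, symbols[name])
    # Otherwise the node is a binary operator: (op, left, right).
    left = annotate(node[1], symbols)
    right = annotate(node[2], symbols)
    if left[-1] != right[-1]:
        raise TypeError(f"operands of {kind!r} have types {left[-1]} and {right[-1]}")
    return (kind, left, right, left[-1])

tree = ("+", ("id", "a"), ("num", 2))
print(annotate(tree, {"a": "int"}))
# ('+', ('id', 'a', 'int'), ('num', 2, 'int'), 'int')
```

With `{"a": "string"}` the same tree is syntactically valid but is rejected with a `TypeError`, which is the kind of error only this phase can detect.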
4. Intermediate Code Generation:
Once the parse tree has been semantically verified, the intermediate code
generator produces three-address code. Intermediate code is code in a
middle-level language that the compiler generates while translating a source
program into object code.
A few important points:
• Intermediate code is neither high-level code nor machine code; it sits at a
level between the two.
• This code can be translated to machine code later.
• This stage serves as a bridge from analysis to synthesis.
Roles and Responsibilities:
• Preserves the operator precedence of the source language.
• Produces code that can later be translated into machine code.
• Holds the operands of each instruction explicitly.
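To make three-address code concrete, the sketch below flattens an expression tree into instructions that each have one result and at most two operands. The nested-tuple tree shape, the `(result, left, op, right)` instruction format, and the temporary names `t1`, `t2`, ... are assumptions for illustration.

```python
def to_three_address(tree):
    """Flatten a nested-tuple expression tree into three-address instructions.

    Returns (code, result_name), where each instruction is a tuple
    (result, left_operand, operator, right_operand).
    """
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if node[0] in ("num", "id"):
            return str(node[1])          # leaves are used directly as operands
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        counter += 1
        temp = f"t{counter}"             # fresh temporary for this result
        code.append((temp, left_name, op, right_name))
        return temp

    result = emit(tree)
    return code, result

# a + b * c
tree = ("+", ("id", "a"), ("*", ("id", "b"), ("id", "c")))
code, result = to_three_address(tree)
for dest, left, op, right in code:
    print(f"{dest} = {left} {op} {right}")
# t1 = b * c
# t2 = a + t1
```

The multiplication is emitted before the addition, so the precedence recorded in the tree is preserved in the order of the instructions.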
5. Code Optimization:
Code optimization is an optional phase. It improves the intermediate code so
that the resulting program runs faster and consumes less space. To improve the
speed of the program, it eliminates unnecessary lines of code and rearranges the
sequence of statements.
Roles and Responsibilities:
• Remove unused variables and unreachable code.
• Improve the running time of the program.
• Produce streamlined code from the intermediate representation.
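Two simple optimizations, constant folding and dead code elimination, can be sketched on straight-line three-address code as follows. The `(result, left, op, right)` tuple format, the `const` pseudo-operator, and the `live` parameter are assumptions for this example. Division is left out to avoid dealing with division by zero, and branches are not handled.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def is_const(operand):
    """True if the operand is an integer literal."""
    return re.fullmatch(r"-?\d+", operand) is not None

def fold_constants(code):
    """Evaluate operations on constants at compile time and propagate the results."""
    known = {}                       # names whose value is a known constant
    result = []
    for dest, left, op, right in code:
        left = known.get(left, left)
        right = known.get(right, right)
        if op in OPS and is_const(left) and is_const(right):
            value = str(OPS[op](int(left), int(right)))
            known[dest] = value
            result.append((dest, value, "const", ""))
        else:
            known.pop(dest, None)    # dest no longer holds a known constant
            result.append((dest, left, op, right))
    return result

def remove_dead(code, live):
    """Drop instructions whose result is never used; `live` names the values needed afterwards."""
    needed = set(live)
    kept = []
    for dest, left, op, right in reversed(code):
        if dest in needed:
            kept.append((dest, left, op, right))
            needed.discard(dest)
            needed.update((left, right))
    kept.reverse()
    return kept

code = [("t1", "2", "*", "3"), ("t2", "x", "+", "t1"), ("t3", "t1", "-", "1")]
print(remove_dead(fold_constants(code), live={"t2"}))
# [('t2', 'x', '+', '6')]
```

Folding turns `t1` into the constant 6 and substitutes it into later uses. After that, neither `t1` nor `t3` is needed to compute `t2`, so both instructions are removed.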
6. Code Generation: Code generation is the final phase of the compilation
process. It takes the optimized intermediate code as input and maps it to the
target machine language.
Roles and Responsibilities:
• Translate the intermediate code to target machine code.
• Select and allocate memory locations and registers.
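The sketch below shows both responsibilities on a very small scale. The target is a hypothetical register machine with `LOAD`, `ADD`, `SUB`, `MUL`, and `DIV` instructions and four registers; none of this corresponds to a real instruction set. It also assumes that every temporary is used exactly once, which holds for code produced from a single expression tree.

```python
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def generate(code, registers=("R0", "R1", "R2", "R3")):
    """Translate (result, left, op, right) tuples into hypothetical machine instructions."""
    free = list(registers)           # registers not currently holding a value
    location = {}                    # temporary name -> register holding it
    output = []

    def load(operand):
        if operand in location:
            return location.pop(operand)     # assumes each temporary is used once
        if not free:
            raise RuntimeError("out of registers")
        reg = free.pop(0)
        output.append(f"LOAD {reg}, {operand}")
        return reg

    for dest, left, op, right in code:
        left_reg = load(left)
        right_reg = load(right)
        # Two-operand form: left_reg = left_reg op right_reg
        output.append(f"{MNEMONIC[op]} {left_reg}, {right_reg}")
        free.insert(0, right_reg)    # the right operand's register can be reused
        location[dest] = left_reg
    return output

code = [("t1", "b", "*", "c"), ("t2", "a", "+", "t1")]
for line in generate(code):
    print(line)
# LOAD R0, b
# LOAD R1, c
# MUL R0, R1
# LOAD R1, a
# ADD R1, R0
```

Register `R1` is reused once the value of `c` is no longer needed. A real code generator would also spill values to memory when it runs out of registers instead of raising an error.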
What is a Symbol Table?
The symbol table is a data structure maintained by the compiler. It stores each
identifier's name and type, which makes searching for and retrieving this
information easy.
The symbol table interacts with every phase of the compiler and with the error
handler, each of which may read or update it. It is also responsible for scope
management.
It stores:
• Literal constants and strings.
• Function names.
• Variable names and constants.
• Labels in the source language.
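One common way to combine storage, lookup, and scope management is a stack of dictionaries, sketched below. The class name, the method names, and the recorded attributes (`kind`, `type`) are hypothetical choices for this example; real compilers usually record more, such as memory offsets or source positions.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a name to its recorded attributes."""

    def __init__(self):
        self.scopes = [{}]                   # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, kind, type_):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} is already declared in this scope")
        scope[name] = {"kind": kind, "type": type_}

    def lookup(self, name):
        """Search from the innermost scope outwards; return None if not found."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = SymbolTable()
table.declare("main", "function", "int")
table.declare("x", "variable", "int")
table.enter_scope()
table.declare("x", "variable", "float")      # shadows the outer x
print(table.lookup("x")["type"])             # float
table.exit_scope()
print(table.lookup("x")["type"])             # int
```

Dictionary lookups keep searching fast, and pushing or popping a dictionary on entry to or exit from a block gives the usual innermost-first scoping rule.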
The advantages of using a compiler to translate high-level programming
languages into machine code are:
1. Portability: Compilers allow programs to be written in a high-level
programming language, so the same source code can be compiled for different
hardware platforms with little or no modification. This means that programs
can be written once and built for multiple platforms, making them more
portable.
2. Optimization: Compilers can apply various optimization techniques to the
code, such as loop unrolling, dead code elimination, and constant
propagation, which can significantly improve the performance of the
generated machine code.
3. Error Checking: Compilers perform a thorough check of the source code,
which can detect syntax and semantic errors at compile-time, thereby
reducing the likelihood of runtime errors.
4. Maintainability: Programs written in high-level languages are easier to
understand and maintain than programs written in low-level assembly
language. Compilers help in translating high-level code into machine code,
making programs easier to maintain and modify.
5. Productivity: High-level programming languages and compilers help in
increasing the productivity of developers. Developers can write code faster
in high-level languages, which can be compiled into efficient machine code.