Bottom-up Parsing: Bottom-up parsing works in the reverse direction of top-down
parsing. It starts from the input string and reduces it step by step until it reaches the start
symbol, tracing out a rightmost derivation in reverse.
Shift-Reduce Parsing: Shift-reduce parsing works in two steps: the shift step and the reduce step.
Shift step: The parser advances the input pointer and pushes the next input symbol onto
the stack; that symbol is said to be shifted.
Reduce step: When the top of the stack holds the complete right-hand side (RHS) of a
grammar rule, the parser replaces it with the rule's left-hand side (LHS).
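The two steps can be sketched in a few lines of Python. This is a minimal illustration, not a full LR parser: the toy grammar (E -> E + n | n), the names RULES and shift_reduce, and the greedy policy of reducing whenever a right-hand side matches the top of the stack are all assumptions made for this example. A real parser would consult a parsing table to choose between shifting and reducing.

```python
# Hypothetical toy grammar, chosen only for illustration:
#   E -> E + n
#   E -> n
# Longer right-hand sides are listed first so they are tried first.
RULES = [
    ("E", ["E", "+", "n"]),
    ("E", ["n"]),
]


def shift_reduce(tokens):
    """Parse a token list and return the sequence of actions taken."""
    stack, actions, pos = [], [], 0
    while True:
        # Reduce step: if the top of the stack equals the right-hand side
        # of a rule, replace it with that rule's left-hand side.
        for lhs, rhs in RULES:
            if stack[-len(rhs):] == rhs:
                del stack[-len(rhs):]
                stack.append(lhs)
                actions.append(f"reduce {lhs} -> {' '.join(rhs)}")
                break
        else:
            # Shift step: no reduction applies, so move the next input
            # symbol onto the stack, or stop when the input is used up.
            if pos == len(tokens):
                break
            token = tokens[pos]
            stack.append(token)
            pos += 1
            actions.append(f"shift {token}")
    if stack != ["E"]:
        raise SyntaxError(f"could not reduce input to E, stack is {stack}")
    return actions


for action in shift_reduce(["n", "+", "n"]):
    print(action)
```

The parse succeeds only when the whole input has been reduced to the start symbol E, which is the bottom-up direction described above.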
LR Parsing: The LR parser is one of the most efficient syntax analysis techniques and handles a
large class of context-free grammars. In LR parsing, L stands for left-to-right scanning of the
input, and R stands for constructing a rightmost derivation in reverse.
Compiler Phases
The compilation process consists of a sequence of phases. Each phase takes the
source program in one representation and produces output in another representation.
Each phase takes its input from the previous phase.
The phases of a compiler are:
Fig: phases of compiler
Lexical Analysis:
Lexical analysis is the first phase of the compilation process. It takes source code as
input, reads the source program one character at a time, and groups the characters into
meaningful lexemes. The lexical analyzer represents these lexemes in the form of tokens.
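A small tokenizer makes the lexeme-to-token step concrete. This is a sketch under assumptions: the token names (NUMBER, IDENT, OP, SKIP), their patterns, and the function name tokenize are invented for the example and do not come from any particular compiler.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan the source left to right and return (token type, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # white space is dropped, not tokenized
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = 42 + y"))
```

Each pair couples a lexeme (the matched characters) with the token class it belongs to.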
Syntax Analysis
Syntax analysis is the second phase of the compilation process. It takes tokens as input and
generates a parse tree as output. In the syntax analysis phase, the parser checks whether the
expression formed by the tokens is syntactically correct.
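As an illustration of turning tokens into a tree, here is a minimal recursive-descent parser for arithmetic expressions. Note the assumptions: recursive descent is a top-down technique chosen here only for brevity, and the grammar, the tuple-based tree shape, and the function name parse are all invented for the example.

```python
def parse(tokens):
    """Build a tree from a list of token strings such as ["1", "+", "2"].

    Assumed grammar (illustrative only):
        expr   -> term (("+" | "-") term)*
        term   -> factor (("*" | "/") factor)*
        factor -> NUMBER | "(" expr ")"
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        token = peek()
        if token == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token is not None and token.isdigit():
            return ("num", int(advance()))
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse(["1", "+", "2", "*", "3"]))
```

A syntactically incorrect token sequence, such as ["1", "+"], raises SyntaxError instead of producing a tree.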
Semantic Analysis
Semantic analysis is the third phase of the compilation process. It checks whether the parse
tree follows the semantic rules of the language. The semantic analyzer keeps track of
identifiers, their types, and expressions. The output of the semantic analysis phase is the
annotated syntax tree.
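The kind of check performed in this phase can be sketched as a tiny type checker. Everything here is an assumption made for illustration: the node kinds ("int", "str", "var", "add"), the symbol table represented as a plain dict, and the function name check.

```python
def check(node, symbols):
    """Return the type of an expression node, or raise on a semantic error.

    node is a tuple: ("int", value), ("str", value), ("var", name),
    or ("add", left, right). symbols maps declared names to type names.
    """
    kind = node[0]
    if kind in ("int", "str"):
        return kind
    if kind == "var":
        name = node[1]
        if name not in symbols:
            raise NameError(f"undeclared identifier {name!r}")
        return symbols[name]
    if kind == "add":
        left = check(node[1], symbols)
        right = check(node[2], symbols)
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left
    raise ValueError(f"unknown node kind {kind!r}")


symbols = {"count": "int", "label": "str"}
print(check(("add", ("var", "count"), ("int", 1)), symbols))
```

Both errors it reports (an undeclared identifier and a type mismatch) are syntactically valid input, which is why they are caught here rather than by the parser.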
Intermediate Code Generation
In intermediate code generation, the compiler translates the source code into
intermediate code. Intermediate code lies between the high-level language and
the machine language. It should be generated in such a way that
it can easily be translated into the target machine code.
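Three-address code is one common intermediate form, and generating it from an expression tree takes only a few lines. This is a sketch under assumptions: the tuple tree shape (op, left, right), the temporary names t1, t2, ..., and the function name to_three_address are invented for the example.

```python
def to_three_address(tree):
    """Flatten an expression tree into three-address instructions.

    Leaves are ints or variable names; inner nodes are (op, left, right).
    Returns (list of instruction strings, name holding the final result).
    """
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if not isinstance(node, tuple):
            return str(node)  # a constant or a variable name
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        counter += 1
        temp = f"t{counter}"
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(tree)
    return code, result


code, result = to_three_address(("+", "a", ("*", "b", 4)))
print("\n".join(code))
```

Each instruction has at most one operator and two operands, which keeps the later translation to machine instructions simple.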
Code Optimization
Code optimization is an optional phase. It improves the intermediate code so
that the resulting program runs faster and takes less space. It removes
unnecessary lines of code and rearranges the sequence of statements to
speed up program execution.
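Constant folding is one simple example of such an improvement: arithmetic on known constants is done once at compile time instead of every time the program runs. The sketch below assumes the tuple tree shape (op, left, right) and the function name fold, both invented for illustration.

```python
import operator

# Operators this sketch knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Replace sub-expressions made only of integer constants by their value."""
    if not isinstance(node, tuple):
        return node  # a constant or a variable name
    op, left, right = node
    left, right = fold(left), fold(right)
    if op in OPS and isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)


print(fold(("+", "x", ("*", 2, 3))))
```

The folded tree computes the same value as the original, so the meaning of the program is unchanged.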
Code Generation
Code generation is the final stage of the compilation process. It takes the
optimized intermediate code as input and maps it to the target machine language.
The code generator translates the intermediate code into the machine code of the
target computer.
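To show the idea of mapping an expression to target instructions, the sketch below generates code for a made-up stack machine and then runs it. The instruction set (PUSH, LOAD, ADD, MUL) and the function names generate and run are assumptions for this example; real code generators target an actual processor's instruction set.

```python
# Hypothetical mapping from source operators to stack-machine opcodes.
OPCODES = {"+": "ADD", "*": "MUL"}


def generate(node):
    """Translate an expression tree into stack-machine instructions."""
    if isinstance(node, int):
        return [("PUSH", node)]
    if isinstance(node, str):
        return [("LOAD", node)]
    op, left, right = node
    return generate(left) + generate(right) + [(OPCODES[op], None)]


def run(program, env):
    """Execute the generated instructions; env maps variable names to values."""
    stack = []
    for opcode, arg in program:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(env[arg])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()


program = generate(("+", "a", ("*", "b", 4)))
print(program)
print(run(program, {"a": 1, "b": 2}))
```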
Compilation Definition
Introduction
Compiling is the process computers use to translate high-level programming languages
into computer-understandable machine language. A compiler is the name of the
software that performs this conversion.
Source code is the format in which programmers create their programs. Before a
program can be executed, the source code should go through several steps. The source
code must go through a compiler to convert the high-level language instruction into
object code.
When the compiler has generated object code, passing that code through a linker is the
last step in creating an executable program. The linker produces the executable by
combining modules and assigning actual addresses to all symbolic references.
The two components of compilation are analysis and synthesis. The analysis stage
separates the source code into its constituent elements and produces an intermediate
representation of the source program. The synthesis stage creates the target program
from that intermediate representation.
Compilation Process
Translating a high-level program into low-level binary machine code is called compilation.
Because of its hardware design, the computer can execute only binary machine
instructions. Thus, every program written in a language other than machine code must first
be translated into machine instructions.
Compilation is a multi-stage process that transforms high-level computer programs
that are understandable by humans into low-level binary code that is readable by
machines. Four steps convert a program's source code into an executable file:
preprocessing, compilation, assembly, and
linking.
Preprocessor
The preprocessing stage is the initial step in the compilation process, and it runs before
the compiler's own lexical analysis. The program source code file is the input to
the preprocessing step, which outputs a preprocessed file with the .i extension.
The preprocessor scans the source code file for preprocessor directives such as
#include and #define. The contents of every included header file are inserted at this
stage, and all macros are expanded by replacing them with their definitions.
Comments are also removed at this stage.
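Macro handling can be illustrated with a deliberately small sketch. It assumes only object-like macros of the form "#define NAME value" and replaces whole-word occurrences of the name in later lines; the function name preprocess is invented for the example. A real C preprocessor does much more, such as handling #include, function-like macros, and leaving string literals untouched.

```python
import re


def preprocess(source):
    """Expand simple '#define NAME value' macros in the lines that follow them."""
    macros = {}
    output = []
    for line in source.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) >= 2 and parts[0] == "#define":
            # Record the macro; the directive itself is not copied to the output.
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
            continue
        for name, value in macros.items():
            pattern = rf"\b{re.escape(name)}\b"  # whole-word match only
            line = re.sub(pattern, lambda _match, v=value: v, line)
        output.append(line)
    return "\n".join(output)


print(preprocess("#define SIZE 10\nint a[SIZE];\nint SIZES;"))
```

The whole-word match is why SIZE is replaced while the longer identifier SIZES is left alone.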
Compiler
The second step in the compilation process is compilation itself. The compiler accepts
the preprocessed file as input and produces an output file, with the .s extension,
that contains assembly code. The compiler translates all high-level software
instructions into their corresponding assembly code instructions. These instructions are
specific to a particular architecture and are therefore platform-dependent.
Assembler
The third step in the compilation process is assembly. The assembler translates
assembly instructions into the binary code that the computer's processor executes to
perform its fundamental operations.
The input instructions are written in assembly language. The assembler translates this
assembly code into object code. The object file created by the assembler has the same
base name as the source file, with a different extension.
Linker
The fourth and last step in the compilation process is linking. The primary purpose of
linking is to combine all object code files into a single executable file (sourcefile.exe). Big
computer programs are organized into a variety of manageable files.
Separate files are used to store the user-defined functions. These files are made visible
to the main program file through its header section, as with the C language's #include
directive. Similarly, the programming language offers built-in standard library functions
that programs can use immediately to simplify coding tasks.
Because the standard library code is already in object code (a pre-compiled format), the
linker can incorporate it directly at the linking step while producing the executable
file.
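The core bookkeeping of linking, matching every referenced symbol to the object file that defines it, can be sketched as follows. The dictionary layout of an "object file" (name, defines, references), the file names, and the function name link are assumptions made for illustration; real object formats carry far more information.

```python
def link(object_files):
    """Build a symbol table mapping each symbol to the object file defining it."""
    symbol_table = {}
    for obj in object_files:
        for symbol in obj["defines"]:
            if symbol in symbol_table:
                raise ValueError(f"duplicate symbol {symbol!r}")
            symbol_table[symbol] = obj["name"]

    # Every symbol that some file references must be defined somewhere.
    referenced = {symbol for obj in object_files for symbol in obj["references"]}
    undefined = sorted(referenced - set(symbol_table))
    if undefined:
        raise LookupError(f"undefined symbols: {', '.join(undefined)}")
    return symbol_table


objects = [
    {"name": "main.o", "defines": ["main"], "references": ["add", "printf"]},
    {"name": "math.o", "defines": ["add"], "references": []},
    {"name": "libc", "defines": ["printf"], "references": []},  # pre-compiled library
]
print(link(objects))
```

Leaving out math.o would make the sketch raise LookupError for add, which mirrors the "undefined reference" errors linkers report.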
Why are Computer Programs Compiled?
The need to compile high-level programs is something that computer science students
should understand fully. High-level programming languages like C, C++, Python, Java,
and the .NET languages are used to create high-level programs.
Humans can understand high-level programs. Every high-level programming language
has a unique syntax and reserved keywords instructing the machine to perform
particular operations.
The ease of programming is a special consideration in developing high-level
programming languages. The majority of keywords in high-level programming languages
are common English words.
Compiler
Compilers are pieces of software that translate source code into object code. In other
words, a compiler transforms high-level language into machine (binary) language. This
step is needed to make the program executable, because the
computer understands only binary language.
Some compilers translate the high-level language into assembly language as an
initial step, while others translate it directly into machine code. Compilation refers to
the process of transforming source code into machine code.
Types of compiler
Compilers come in a variety of forms, including the following:
   o   Single-Pass Compiler: In a single-pass compiler, each source line is scanned and its
       tokens are extracted as it is processed. Once the syntax of the line has been
       examined, the tree structure and the various tables containing details about each
       token are generated.
       After the semantic checks pass, the code for the line is written out.
       Each line of code goes through the same procedure until the entire program is
       compiled. The parser, which calls methods to carry out the various functions,
       typically serves as the central component of the compiler.
   o   Two-Pass Compiler: A two-pass compiler processes the program to be translated
       twice. It has two sections:
           o   Front end: It converts legal source code into an Intermediate Representation
               (IR).
           o   Back end: It maps the IR onto the target machine.
       The two-pass approach makes retargeting easier. Moreover, it supports
       multiple front ends.
   o   Multi-Pass Compiler: The compiler builds a first modified structure after scanning the
       input source once, then builds a second modified structure after scanning the first one
       it created, and so on until the object form is completed. This kind of compiler is
       called a multi-pass compiler.
Phases/Structure Of Compiler
There are various phases in the compilation process. Also, the results of each stage
serve as the input for the following step. The compilation process contains the following
phases or structures:
1. Lexical Analyzer
   o   It accepts the source code of high-level languages as input.
   o   It examines the source code's characters from left to right. For this reason it is
       also called a scanner.
   o   The characters are grouped into lexemes. A lexeme is a sequence of characters with a
       specific meaning.
   o   Each lexeme is mapped to a token.
   o   White space and comments are eliminated.
   o   Lexical errors are detected and reported.
2. Syntax Analyzer
   o   The syntax analyzer is also referred to as a "parser".
   o   The lexical analyzer's output serves as its input.
   o   It checks the source code for syntax errors.
   o   It accomplishes this by creating a parse tree from the stream of tokens.
   o   The parse tree needs to follow the grammar rules of the source language for the
       syntax to be correct.
   o   A context-free grammar is the appropriate kind of grammar for describing this syntax.
3. Semantic Analyzer
   o   It checks the syntax analyzer's parse tree.
   o   It verifies that the code is valid under the rules of the programming language, such
       as data type compatibility, variable declaration, initialization, etc.
   o   It also generates a verified parse tree, which is also called the annotated parse
       tree.
   o   Moreover, it conducts type checking, flow checking, etc.
4. Intermediate Code Generator (ICG)
   o   It produces intermediate code.
   o   The intermediate code is neither machine language nor high-level language; it is in
       an intermediate form.
   o   The final two phases, which translate the intermediate code into machine language,
       depend on the platform.
   o   The intermediate code is the same regardless of the target platform; the machine code
       is then generated for the specific platform.
   o   Three-address code is an example of intermediate code.
5. Code Optimizer
   o   The intermediate code is optimized.
   o   Its purpose is to modify the code to run faster and consume fewer resources (CPU,
       memory).
   o   It rearranges the code and deletes any unnecessary lines.
   o   The meaning of the program stays the same.
6. Target Code Generator
   o   The optimized intermediate code is then transformed into machine code.
   o   This is the final step of compilation.
   o   This process generates relocatable machine code.
Compiler Operations
The compiler's crucial operations include the following:
   o   It breaks the source program into smaller pieces and imposes a grammatical structure
       on each piece.
   o   It uses the intermediate representation to build the symbol table and
       the desired target program.
   o   It assists with error detection while compiling the source code.
   o   It organizes and stores all variables and code.
   o   It supports separate compilation.
   o   It reads the entire program, analyzes it, and then translates it into a semantically
       equivalent program in another language.
   o   It transforms source code into object code that depends
       on the type of machine.
Code Optimization in Compiler Design
Last Updated : 04 Sep, 2024
Code optimization is a crucial phase in compiler design aimed at
enhancing the performance and efficiency of the executable code.
By improving the quality of the generated machine code,
optimizations can reduce execution time, minimize resource
usage, and improve overall system performance. This process
applies various techniques and strategies during
compilation to produce more efficient code without altering the
program’s functionality.
Code optimization in the synthesis phase is a program
transformation technique that tries to improve the intermediate
code by making it consume fewer resources (i.e. CPU, memory) so
that faster-running machine code results. The compiler's
optimization process should meet the following objectives:
• The optimization must be correct: it must not, in any way,
   change the meaning of the program.
• Optimization should increase the speed and performance of the
   program.
• The compilation time must be kept reasonable.
• The optimization process should not delay the overall compiling
   process.
When to Optimize?
Optimization of the code is often performed at the end of the
development stage since it reduces readability and adds code
that is used to increase performance.
Why Optimize?
Optimizing an algorithm is beyond the scope of the code
optimization phase, so it is the program itself that is optimized, which may
involve reducing the size of the code. Optimization helps to:
• Reduce the space consumed and increase the execution speed of the
   compiled program.
• Avoid tedious manual work. Manually analyzing datasets takes a lot of time,
   so we use software like Tableau for data analysis. Similarly,
   manually performing optimization is tedious and is
   better done using a code optimizer.
• Promote re-usability, which optimized code often does.
Types of Code Optimization
The optimization process can be broadly classified into two types:
• Machine Independent Optimization: This code optimization
   phase attempts to improve the intermediate code to get a
   better target code as the output. The part of the intermediate
   code which is transformed here does not involve any CPU
   registers or absolute memory locations.
• Machine Dependent Optimization: Machine-dependent
   optimization is done after the target code has been generated
   and when the code is transformed according to the target
   machine architecture. It involves CPU registers and may have
   absolute memory references rather than relative references.
   Machine-dependent optimizers try to take
   maximum advantage of the memory hierarchy.
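Dead-code elimination is a typical machine-independent optimization, because it works purely on intermediate code and needs no knowledge of registers or addresses. The sketch below is written under assumptions: instructions are tuples (dest, op, left, right) in straight-line code with no side effects, and the function name eliminate_dead_code and the live_out parameter are invented for the example.

```python
def eliminate_dead_code(instructions, live_out):
    """Drop instructions whose result is never used.

    instructions: list of (dest, op, left, right) tuples in execution order.
    live_out: names whose values are still needed after the last instruction.
    """
    live = set(live_out)
    kept = []
    # Walk backwards so that uses are seen before the definitions they need.
    for dest, op, left, right in reversed(instructions):
        if dest in live:
            live.discard(dest)
            live.update(name for name in (left, right) if isinstance(name, str))
            kept.append((dest, op, left, right))
    kept.reverse()
    return kept


code = [
    ("t1", "*", "b", 4),
    ("t2", "+", "a", "t1"),
    ("t3", "-", "a", "b"),  # t3 is never used afterwards
]
print(eliminate_dead_code(code, live_out=["t2"]))
```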
Difference Between Linker and Loader
Last Updated : 20 Sep, 2024
When we run a program, two major players work behind the
scenes to make it happen: the Linker
and the Loader. They can be thought of as a double act of
computing. The linker does the first part of the work: it gathers
the code from all the different places it lives and produces an
executable file. The loader then
loads this file into memory, ready to run.
What is a Linker?
A linker is a special program that combines the object files
generated by the compiler/assembler with other pieces of code to
produce an executable file (for example, one with a .exe extension). The
linker searches for and appends all the libraries the object files need in order to execute.
A linker, also known as a link editor or binder, is a system program that
combines object modules into a single file. It
performs the process of linking: it takes one or more object files
generated by a compiler and combines them into a single executable file.
The different pieces of code, written in
programming languages, are called modules. Linking is the process of gathering and
combining these different pieces of code into a single executable file. With the help of a
linker, a module can also be linked against a system library.
The primary function of the linker is to take object files from the assembler as input
and create an executable file as output for the loader. Linking makes it possible to break a
large program into small modules, which simplifies the programming task.
Usually, computer programs are made up of various modules, each of which is
compiled separately into its own object file. The modules of the
program refer to one another by means of
symbols. The linker combines these separate files into a single executable
file. The source code is converted into machine code, and linking is
performed as the last step in building the program.
Source code -> Compiler -> Assembler -> Object code -> Linker -> Executable file -> Loader
The linker can collect objects from a library or runtime library. Most
linkers include in the output only those library files that are referenced by other
libraries or object files; they do not include the whole library. Library linking
may be an iterative process, because the modules pulled in may themselves reference
additional modules that must then be linked. Generally, one or more
system libraries are linked by default, and libraries are available for different
purposes.
The linker also handles the arrangement of the objects in the program's address space.
The compiler often assumes a fixed base location (such as zero), because it
seldom knows where an object will reside. Relocating machine
code may involve re-targeting absolute jumps, loads, and stores.
When the executable output produced by the linker is finally loaded into
memory, it may require another relocation pass. This pass is usually omitted on
hardware that offers virtual memory: there is no conflict even when
all programs load at the same base address, because each program is put
into its own address space. This pass is also omitted if the executable file is a
position-independent executable.
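Relocation by the linker can be sketched with a toy model. The assumptions are stated plainly: memory is a flat list of integers, each module was produced as if it started at address zero, and its "relocs" list names the positions in its code that hold addresses. The field names and the function name relocate are invented for the example.

```python
def relocate(modules):
    """Concatenate modules into one image and patch their address fields.

    Each module is a dict with:
      "code":   list of ints, assembled as if the module started at address 0
      "relocs": indices into "code" that hold module-relative addresses
    """
    image = []
    for module in modules:
        base = len(image)  # where this module ends up in the final image
        code = list(module["code"])
        for index in module["relocs"]:
            code[index] += base  # re-target the address to its real location
        image.extend(code)
    return image


modules = [
    {"code": [10, 0, 11, 2], "relocs": [3]},  # placed at base 0, unchanged
    {"code": [20, 1, 21], "relocs": [1]},     # placed at base 4, address 1 becomes 5
]
print(relocate(modules))
```

Only the positions listed in "relocs" are adjusted; the other values are treated as opcodes or constants and copied unchanged.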
Beyond appending the libraries a file needs in order to execute, the linker regulates the memory
space that will hold the code from each module. It also merges
two or more separate object programs and establishes links
among them.
Loaders
In compiler design, a loader is a program that is responsible for loading
executable programs into memory for execution. The loader reads the
object code of a program, which is usually in binary form, and copies it
into memory. It also performs other tasks such as allocating memory for
the program’s data and resolving any external references to other
programs or libraries. The loader is typically part of the operating system
and is invoked by the system’s bootstrap program or by a command from
a user. Loaders can be of two types:
• Absolute Loader: It loads a program at a specific memory location,
   specified in the program’s object code. This location is usually absolute
   and does not change when the program is loaded into memory.
• Relocating Loader: It loads a program at any memory location, and
   then adjusts all memory references in the program to reflect the new
   location. This allows the same program to be loaded into different
   memory locations without having to modify the program’s object code.
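The difference between the two loader types can be shown with a toy memory model. As assumptions for this sketch, memory is a Python list of integers, a program is a dict, and the field names (load_address, code, relocs) and function names are invented for illustration.

```python
def absolute_load(memory, program):
    """Copy the program to the fixed address recorded in its object code."""
    start = program["load_address"]
    code = program["code"]
    if start + len(code) > len(memory):
        raise MemoryError("program does not fit at its fixed address")
    memory[start:start + len(code)] = code
    return start


def relocating_load(memory, program, base):
    """Copy the program to any base address and adjust its address fields."""
    code = list(program["code"])
    if base + len(code) > len(memory):
        raise MemoryError("program does not fit at the requested base")
    for index in program["relocs"]:
        code[index] += base  # addresses were written relative to address 0
    memory[base:base + len(code)] = code
    return base


memory = [0] * 12
absolute_load(memory, {"load_address": 4, "code": [7, 5, 9]})
relocating_load(memory, {"code": [7, 1, 9], "relocs": [1]}, base=8)
print(memory)
```

The relocating loader leaves the program's own object code untouched and patches a copy, so the same program can be loaded again at a different base.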
Architecture of Loader
Below is the Architecture of the Loader:
The architecture of a loader in compiler design typically consists of
several components:
1. Source program: This is a program written in a high-level
   programming language that needs to be executed.
2. Translator: This component, such as a compiler or interpreter,
   converts the source program into an object program.
3. Object program: This is the program in a machine-readable form,
   usually in binary, that contains both the instructions and data of the
   program.
4. Executable object code: This is the object program that has been
   processed by the loader and is ready to be executed.
Overall, the Loader is responsible for loading the program into memory,
preparing it for execution, and transferring control to the program’s entry
point. It acts as a bridge between the Operating System and the program
being loaded.
Role of Loader in Compilation
In the compilation process, the Loader is responsible for bringing the
machine code into memory for execution. It performs the following key
functions:
• Loading the executable program into memory from the secondary
   storage device.
• Allocating memory space to the program and its data.
• Resolving external references between different parts of the program.
• Setting up the initial values of the program counter and stack pointer.
• Preparing the program for execution by the CPU.
The Loader plays a critical role in the compilation process as it ensures
that the program is properly loaded into memory, and the necessary
memory space is allocated for the program and its data. It also resolves
external references between different parts of the program and prepares it
for execution by the CPU.
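The loader's functions listed above can be sketched as a single routine. This is a toy model under assumptions: the executable is a dict with invented fields (code, data, entry, imports), libraries is a dict from symbol name to address, and the stack is assumed to grow downward from the top of the allocated memory.

```python
def load_program(executable, libraries, stack_size=4):
    """Return a process image: memory contents plus initial register values."""
    # Resolve external references against the available libraries.
    resolved = {}
    for name in executable["imports"]:
        if name not in libraries:
            raise LookupError(f"missing library symbol {name!r}")
        resolved[name] = libraries[name]

    # Allocate memory for code, data, and stack, and copy the program in.
    memory = list(executable["code"]) + list(executable["data"]) + [0] * stack_size

    return {
        "memory": memory,
        "pc": executable["entry"],  # program counter starts at the entry point
        "sp": len(memory),          # stack pointer starts at the top of memory
        "imports": resolved,
    }


executable = {"code": [1, 2, 3], "data": [42], "entry": 0, "imports": ["print"]}
process = load_program(executable, libraries={"print": 1000})
print(process)
```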
Features of Loaders
• Relocation: Loaders can relocate the program to different memory
   locations to avoid memory conflicts with other programs.
• Linking: Loaders can link different parts of the program to resolve
   external references and create a single executable program.
• Error Detection: Loaders can detect and report errors that occur
   during the loading process.
• Memory Allocation: Loaders can allocate memory space to the
   program and its data, ensuring that the program has enough memory
   to execute efficiently.
• Execution Preparation: Loaders can prepare the program for
   execution by setting up the initial values of the program counter and
   stack pointer.
• Dynamic Loading: Loaders can load program segments dynamically,
   allowing the program to only load the necessary segments into memory
   as they are needed.
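Dynamic loading can be illustrated by analogy using Python's own module system, where importlib.import_module loads a module only at the moment it is called. This is an analogy rather than an operating-system loader, and the class name LazySegment is invented for the example.

```python
import importlib


class LazySegment:
    """Stand-in for a program segment that is loaded on first use."""

    def __init__(self, module_name):
        self.module_name = module_name
        self._module = None  # nothing is loaded yet

    @property
    def loaded(self):
        return self._module is not None

    def get(self):
        # Load the segment only when it is actually needed, then reuse it.
        if self._module is None:
            self._module = importlib.import_module(self.module_name)
        return self._module


segment = LazySegment("json")
print(segment.loaded)             # the segment has not been requested yet
print(segment.get().dumps([1]))   # first use triggers the load
print(segment.loaded)
```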
Advantages of Loader
There are several advantages of using a loader in compiler design:
1. Memory management: The loader is responsible for allocating
   memory for the program’s instructions and data. This allows the
   program to execute in a separate, protected area of memory, which
   can help prevent errors in the program from affecting the rest of the
   system.
2. Dynamic linking: The loader can resolve external references to other
   programs or libraries at runtime, which allows for more flexibility in the
   design of the program. This means that if a library is updated, the
   program will automatically use the new version without requiring any
   changes to the program’s object code.
3. Relocation: The loader can relocate a program to any memory
   location, which allows for efficient use of memory and prevents
   conflicts with other programs.
4. Error handling: The loader can check the compatibility of the program
   with the system and handle any errors that occur during loading, such
   as missing libraries or incompatible instruction sets.
5. Modularity: The loader makes it possible to develop and use separate
   modules or components, which can be linked together to form a
   complete program. This allows for a more modular design, which can
   make programs easier to maintain and update.
6. Reusability: As the program is separated into different modules, it can
   be reused in other programs. The loader also allows the use of shared
   libraries, which can be used by multiple programs.
7. Ease of use: The loader provides a simple and consistent interface for
   loading and executing programs, which makes it easier for users to run
   and manage programs on their system.
Disadvantages of Loader
There are several disadvantages of using a loader in compiler design:
1. Complexity: Loaders can be complex to implement and maintain, as
   they need to perform a variety of tasks such as memory management,
   symbol resolution, and relocation.
2. Overhead: Loaders introduce additional overhead in terms of memory
   usage and execution time, as they need to read the object code from
   storage and perform various operations before the program can be
   executed.
3. Size limitations: Loaders have limitations on the size of the program
   that can be loaded and might not be able to handle large programs or
   programs with a lot of external references.
4. Limited Flexibility: Loaders are typically specific to a particular
   operating system or architecture, and may not be easily portable to
   other systems.
5. Security: A poorly designed or implemented loader can introduce
   security vulnerabilities, such as buffer overflows or other types of
   memory corruption.
6. Error handling: Loaders need to handle various types of errors, such
   as missing libraries, incompatible object code, and insufficient memory.
7. Overlapping Memory: Due to the use of dynamic loading and
   relocation, the same memory location may be used by multiple
   programs leading to overlapping memory.
8. Dependency issues: Programs might have dependencies on external
   libraries or other programs, and the loader needs to resolve these
   dependency