0% found this document useful (0 votes)
65 views46 pages

Compiler Design: Dr. Sherif Eletriby

Uploaded by

Hadeer Anwar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
65 views46 pages

Compiler Design: Dr. Sherif Eletriby

Uploaded by

Hadeer Anwar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 46

Compiler Design

Dr. Sherif Eletriby


Textbook
❑Compilers: Principles, Techniques, and Tools, A. V. Aho, R.
Sethi, J. D. Ullman.

❑Compiler Design: Theory, Tools, and Examples, Seth D.


Bergmann.
Welcome!

•Programming languages are notations for describing


computations to people and to machines.

•Before a program can be run, it first must be translated


into a form in which it can be executed by a computer.
•The software systems that do this translation are
called compilers.
Why to Learn Compiler design?

Computers are a balanced mix of software and hardware. Hardware is


just a piece of mechanical device and its functions are being controlled
by a compatible software. Hardware understands instructions in the form
of electronic charge, which is the counterpart of binary language in
software programming. Binary language has only two alphabets, 0 and
1. To instruct, the hardware codes must be written in binary format,
which is simply a series of 1s and 0s. It would be a difficult and
cumbersome task for computer programmers to write such codes, which
is why we have compilers to write such codes.
Planned Topics

⚫ Chapter 1: Introduction
⚫ Chapter 2: Lexical Analysis
⚫ Chapter 3: Syntax Analysis
⚫ Chapter 4: Top-Down Parsing
⚫ Chapter 5: Bottom-Up Parsing
⚫ Chapter 6: Code Generation
⚫ Chapter 7: Optimization
Chapter 1: Introduction
What is a Compiler?
• Recall from your study of assembly language or computer
organization the kinds of instructions that the computer’s CPU is
capable of executing.
(1) add two numbers stored in memory,
(2) move numbers from one location in memory to another,
(3) move information between the CPU and memory.

• But, How computer can see like this instruction?


➢if (array6[loc]<MAX) sum=0; else array6[loc]=0;

It is needing Compiler
Definition: Compiler

A compiler is a program which translates any program written in a


high-level language (source code/ source program) into an equivalent
program in the machine language (Target program).

Input: A program (source file/ source program)


Output: A program (object file), in machine language (Target program)

Source
program Equivalent
written by program in
Compiler low level
source
language language

source program Target program


Compiler
• A compiler accepts as input a program written in a particular high-level language
and produces as output an equivalent program in machine language.
• The input program is known as the source program, and its language is the source
language.
• The output program is known as the object program, and its language is the object
language.
• A compiler translates source language programs into equivalent object language
programs.

9
If a portion of the input to a compiler looked like this:

A = B + C * D;
The output corresponding to this input might look
something like this:

The compiler must be smart enough to know that the multiplication


should be done before the addition even though the addition is read
first when scanning the input.
Compiler
• Show the output of a Java native code compiler, in any typical
assembly language, for the following Java input string:
while (x<a+b) x = 2*x;

11
Language Processing System

We have learnt that any


computer system is made of
hardware and software.
The hardware understands a
language, which humans cannot
understand.
So we write programs in high-
level language, which is easier
for us to understand and
remember. These programs are
then fed into a series of tools and
OS components to get the
desired code that can be used by
the machine. This is known as
Language Processing System.
Let us first understand how a program, using compiler, is
executed on a host machine.

• User writes a program in language


(high-level language).
• The compiler, compiles the program and
translates it to assembly program (low-
level language).
• An assembler then translates the
assembly program into machine code
(object).
• A linker tool is used to link all the parts
of the program together for execution
(executable machine code).
• A loader loads all of them into memory
and then the program is executed.
Example of compiler

C Compiler
C++ compiler for the Mac (Intel).
Java compiler for the Sun (MIPS).

The major task of a linker is to search and locate


referenced module/routines in a program and to
determine the memory location where these codes
will be loaded, making the program instruction to
have absolute references.
Example of compiler

Save
code

Object files
Source files Compiler .Obj
.cpp, .h Run -
.java Build

Linker
⚫ Linker : computer program that takes one or more
objects files generated by a compiler and Object
combines them into a single executable program Program
.exe - com
Example of compiler

Examples:
C++ compiler for the Mac (Intel).

Assembler:
An assembler translates assembly language
programs into machine code. The output of an
assembler is called an object file, which contains a
combination of machine instructions as well as the
data required to place these instructions in memory.
‫‪Example of compiler‬‬

‫‪A Java compiler for the Sun (MIPS).‬‬

‫‪▪ The java compiler compiles to 'byte code', an‬‬


‫‪intermediate form for the java 'virtual machine'.‬‬

‫في بعض األحيان تتم عملية الترجمة على شكل خليط من عمل المترجم والمفسر معا كما هو الحال في معالج لغة ‪ JAVA‬الذي يجمع‬
‫بين الترجمة والتفسير حيث تتم أوال ترجمة البرنامج المكتوب بلغة ‪ JAVA‬إلى برنامج وسيط يسمى ‪ bytecodes‬والذي يتم‬
‫الحقا تفسيره بواسطة اآللة االفتراضية ‪.Virtual Machine‬‬
‫الفائدة من هذا الترتيب هو أن ‪ bytecodes‬التي تمت ترجمتها على آلة معينة يمكن تفسيرها على آلة أخرى كما يمكن ذلك من‬
‫خالل شبكة‪ .‬باإلضافة إلى المترجم فإن هناك مجموعة من البرامج ال بد من توفرها حتى تتم عملية إنتاج برنامج الهدف القابل‬
‫للتنفيذ‪.‬‬
Example of compiler A Java compiler for the Sun (MIPS).
Compiler and Interpreter

Interpreters: is software translate source program statement by statement


(or line by line) and then this statement is given to the CPU to be executed
before the next statement in being translated.

• The input to an interpreter is a program written in a high-level language, but rather


than generating a machine language program, the interpreter actually carries out the
computations specified in the source program.

‫ الذي يتلخص عمله في أنه بدال من إنتاج برنامج الهدف‬Interpreter ‫هناك نوع آخر شائع من معالجات اللغة يسمى المفسر‬
‫ يقوم المفسر بإدخال البيانات في نفس الوقت الذي تجري فيه ترجمة البرنامج أي‬،‫أوال ثم إدخال البيانات الالزمة للتنفيذ بعد ذلك‬
.‫بشكل متزامن‬
Compiler and Interpreter

The output of a compiler is a program (Target program), whereas


the output of an interpreter is the source program’s output.
Compiler vs Interpreter

Sample Problem:
Show the compiler output and the interpreter output for the following source code:
for (i=1; i<=4; i++) cout << i*3;

Compiler output Interpreter output


Difference between compiler and Interpreter

The machine-language target program produced by a


compiler is usually much faster than an interpreter at
mapping inputs to outputs .

An interpreter, however, can usually give better error


diagnostics than a compiler, because it executes the
source program statement by statement.
Compile-Time vs. Run-Time Errors
• Compile time: the time at which a source program is compiled.
• Run time: the time at which the resulting object program is loaded
and executed.

24
Compiler Language
• It is important to remember that a compiler is a program, and it must
be written in some language (machine, assembly, high-level).
• In describing this program, we are dealing with three languages:
(1) the source language, i.e. the input to the compiler,
(2) the object language, i.e. the output of the compiler,
(3) the language in which the compiler is written, or the language in which it
exists,

A compiler which translates Java A compiler which translates PC


A Java compiler for the Mac.
programs to Mac machine language, machine language programs to
and which runs on a Sun machine. Java, written in Ada

25
Compiler Language
Using the big C notation to show each of the following compilers:
1. An Ada compiler which runs on the PC and compiles to the PC machine
language.
2. An Ada compiler which compiles to the PC machine language, but which is
written in Ada.
3. An Ada compiler which compiles to th PC machine language, but which runs on
a Sun.

26
The Structure of a Compiler
The Phases of a Compiler
The Structure of a Compiler
The Structure of a Compiler
The Phases of a Compiler
The compilation process is a
sequence of various phases.

Each phase takes input from its


previous stage, has its own
representation of source program,
and feeds its output to the next
phase of the compiler.

Let us now understand the phases


of a compiler.
The Phases of a Compiler
Lexical Analysis (Scanner)
• The first phase of a compiler is called lexical analysis which generate a
stream of tokens as output from source program.
• Also, it build a symbol table to store all identifiers used in the source
program.
• key words - while, void, if, for, ...
• identifiers - declared by the programmer
• operators - +, -, *, /, =, ==, ...
• numeric constants - numbers such as 124, 12.35, 0.09E-23, etc.
• character constants - single characters or strings of characters
• special characters - characters used as delimiters such as . ( ) , ; :
• comments - ignored by subsequent phases. These must be identified by the
scanner, but are not included in the output.
Lexical Analysis (Scanner) Example
• Show the token classes, or “words”, put out by the lexical analysis phase
corresponding to this Java source input:

sum = sum + unit * /* accumulate sum */ 12 ;

• Identifier (sum)
• operator (=)
• Identifier (sum)
• operator (+)
• identifier (unit)
• operator (*)
• numeric constant (12)
• special characters (;)
Syntax Analysis Phase
• The syntax analysis phase is often called the parser.
• The parser will check for the program syntax error and determine the
underlying structure of the source program.
• The output of this phase may be a stream of atoms or a collection of
syntax trees.
• For example, (ADD, A, B, C): The meaning of the following atom
would be to add A and B, and store the result into C.
Syntax Analysis Phase
Example:
• Show the atoms corresponding to the following Java statement:
a=b+c*d;

(MULT, c, d, temp1)
(ADD, b, temp1, temp2)
(MOVE, temp2, a)
Syntax Analysis Phase
Example:
• Show the atoms corresponding to the following Java statement:
while (a <= b) a = a + 1;

(LBL, L1)
(Test, a, <=, b, L2)
(JMP, L3)
(LBL, L2)
(ADD, a, 1, a)
(JMP, L1)
(LBL, L3)
Syntax Analysis Phase
• Show a syntax tree for the Java statement:
if (a+3 < 400) a =0; else b = a*a;
• Assume that an if statement consists of three subtrees, one for the
condition, one for the consequent statement, and one for the else
statement, if necessary.
Global Optimization
• The global optimization phase is optional. Its purpose is simply to make the object
program more efficient in space and/or time.
• Examining the sequence of atoms put out by the parser to find redundant or
unnecessary instructions or inefficient code.
• Since it is invoked before the code generator, this phase is often called machine-
independent optimization.
• Examples:
stmt1 y=16; x = Math.sqrt (y); // loop invariant
go to label1 for (i=1; i<=100000; i++) for (i=1; i<=100000; i++)
stmt2 { x = Math.sqrt (y); // square root method System.out.println (x+i);
stmt3 System.out.println (x+i);
label1: stmt4 }

This would eliminate 99,999 unnecessary


stmt2 and stmt3 can the assignment to x need not be inside calls to the sqrt method at run time.
never be executed. the loop since y does not change as the
loop repeats
Code Generation
• Java compilers produce an intermediate form, known as byte code, which can be interpreted by the
Java run-time environment.
• But, in this course, we will be assuming that our compiler is to produce native code for a particular
machine.
• So, It is assumed that you have had some experience with assembly language and machine
language.
• For example, an ADD atom might be translated to three machine language instructions: (1) load the
first operand into a register, (2) add the second operand to that register, and (3) store the result, as
shown for the atom (ADD, a, b,temp):

• LOD r1,a // Load a into reg. 1


• ADD r1,b // Add b to reg. 1
• STO r1,temp // Store reg. 1 in temp
Code Generation Example
• Show assembly language instructions corresponding to the following atom
string:
• (ADD, a, b, temp1)
• (TEST, a, ==, b, L1)
• (MOVE, temp1, a)
• (LBL, L1)
• (MOVE, temp1, b)

LOD r1,a
ADD r1,b
STO r1,temp1 // ADD, a, b, temp1
CMP a,b
BE L1 // TEST, A, ==, B, L1
MOV a,temp1 // MOVE, temp1, a
L1: MOV b,temp1 // MOVE, temp1, b
Local Optimization
• It involves examining sequences of instructions put out by the code generator to
find unnecessary or redundant instructions.
• For this reason, local optimization is often called machine-dependent optimization.
• the expression a + b + c in the source program might result in the following
instructions as code generator output:

LOD r1,a // Load a into register 1


ADD r1,b // Add b to register 1
STO r1,temp1 // Store the result in temp1*
LOD r1,temp1 // Load result into reg 1*
ADD r1,c // Add c to register 1
STO r1,temp2 // Store the result in temp2
Local Optimization
• It involves examining sequences of instructions put out by the code generator to
find unnecessary or redundant instructions.
• For this reason, local optimization is often called machine-dependent optimization.
• the expression a + b + c in the source program might result in the following
instructions as code generator output:

LOD r1,a // Load a into register 1


ADD r1,b // Add b to register 1
STO r1,temp1 // Store the result in temp1*
LOD r1,temp1 // Load result into reg 1*
ADD r1,c // Add c to register 1
STO r1,temp2 // Store the result in temp2
Implementation Techniques
• A new compiler, with all optimizations, could take over a person-year
to implement.
• For this reason, we are always looking for techniques or shortcuts
which will speed up the development process.
• So, we can generate compilers from other compilers that have been
developed previously.

If the source program is a compiler which translates language A into language B, then the object
program will also be a compiler which translates language A into language B.
Implementation Techniques
The End

Questions?

You might also like