0% found this document useful (0 votes)
4 views27 pages

1 1 Introduction

The document outlines the goals and structure of a compiler course, emphasizing the importance of understanding compilers' functionality and correctness. It discusses the history of high-level languages, particularly the development of FORTRAN I, and the major components of a compiler, including lexical analysis, parsing, semantic analysis, optimization, and code generation. Additionally, it highlights the evolution of compilers and their significance in modern programming languages.

Uploaded by

12drstrange
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views27 pages

1 1 Introduction

The document outlines the goals and structure of a compiler course, emphasizing the importance of understanding compilers' functionality and correctness. It discusses the history of high-level languages, particularly the development of FORTRAN I, and the major components of a compiler, including lexical analysis, parsing, semantic analysis, optimization, and code generation. Additionally, it highlights the evolution of compilers and their significance in modern programming languages.

Uploaded by

12drstrange
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 27

Compiler

(CS3104)
Introduction 1

Taken from: https://web.stanford.edu/class/cs143/


Course Goal

• Open the lid of compilers and see inside


– Understand what they do
– Understand how they work
– Understand how to build them

• Correctness over performance


– Correctness is essential in compilers
– They must produce correct code
– Enormous consequences if they do not

10
How are Languages Implemented?

• Two major strategies:


– Interpreters run your program
– Compilers translate your program

Program
Program Compiler Binary Code
Interpreter
Machine
Machine

11
Language Implementations

• Compilers dominate low-level languages


– C, C++, Go, Rust

• Interpreters dominate high-level languages


– Python, JavaScript

• Many language implementations provide both


– Java, Javascript, WebAssembly
– Interpreter + Just in Time (JIT) compiler

12
History of High-Level Languages

• 1954: IBM develops the 704

• Problem
– Software costs exceeded
hardware costs!

• All programming done in


assembly

13
The Solution

• Enter “Speedcoding”

• An interpreter

• Ran 10-20 times slower than hand-written


assembly

14
FORTRAN I

• Enter John Backus

• Idea
– Translate high-level code to
assembly

– Many thought this


impossible

– Had already failed in other


projects

15
FORTRAN I (Cont.)

• 1954-7
– FORTRAN I project

• 1958
– >50% of all software is in
FORTRAN

• Development time halved

• Performance close to
hand-written assembly!

16
FORTRAN I

• The first compiler


– Huge impact on computer science

• Led to an enormous body of theoretical and practical


work

• Modern compilers preserve the outlines of FORTRAN I

• Can you name a modern compiler?

17
The Structure of a Compiler

1. Lexical Analysis — identify words


2. Parsing — identify sentences
3. Semantic Analysis — analyse sentences
4. Optimization — editing
5. Code Generation — translation

Can be understood by analogy to how


humans comprehend English.

18
Lexical Analysis

• First step: recognize words.


– Smallest unit above letters

This is a sentence.

19
More Lexical Analysis

• Lexical analysis is not trivial.

• Suppose we scramble the whitespaces:


ist his ase nte nce

• Suppose we replace whitespace with z:


iszthiszazsentence

20
And More Lexical Analysis

• Lexical analyzer divides program text into


“words” or “tokens”
if x == y then z = 1; else z = 2;

21
Parsing

• Once words are understood, the next step is to


understand sentence structure

• Parsing = Diagramming Sentences


– The diagram is a tree

22
Diagramming a Sentence

This line is a longer sentence

article noun verb article adjective noun

subject object

sentence

23
Parsing Programs

• Parsing program expressions is the same


• Consider:
if x == y then z = 1 else z = 2
• Diagrammed:
x == y z 1 z 2
relation assign assign

predicate then-stmt else-stmt


if-then-else
24
Semantic Analysis

• Once sentence structure is understood, we can


try to understand “meaning”
– But meaning is too hard for compilers

• Compilers perform limited semantic analysis to


catch inconsistencies

25
Semantic Analysis in English

• Example:
Jack said Jerry left his assignment at home.
What does “his” refer to? Jack or Jerry?

• Even worse:
Jill said Jill left her assignment at home?
How many Jills are there?
Which one left the assignment?

26
Semantic Analysis in Programming

• Programming {
languages define strict int i = 3;
rules to avoid such {
ambiguities
int i = 4;
cout << i;
• This C++ code prints
“4”; the inner definition }
is used }

27
More Semantic Analysis

• Compilers perform many semantic checks


besides variable bindings

• Example:
Jack left her homework at home.

• Possible type mismatch between her and Jack


– If Jack is male

28
Optimization

• Akin to editing
– Minimize reading time
– Minimize items the reader must keep in short-term
memory

• Automatically modify programs so that they


– Run faster
– Use less memory
– In general, to conserve some resource

29
Optimization Example

x = y * 0 is the same as x = 0

(the * operator is annihilated by zero)

Is this optimization legal?

30
Code Generation

• Typically produces assembly code

• Generally a translation into another language


– Analogous to human translation

31
Intermediate Representations (IR)

• Compilers typically perform translations between


successive intermediate languages
– All but first and last are intermediate representations
(IR) internal to the compiler
Source

• IRs are generally ordered in


descending level of abstraction IR

– Highest is source …
– Lowest is assembly IR

Assembly
32
Intermediate Representations (IR) (Cont.)

• IRs are useful because lower levels expose


features hidden by higher levels
– registers
– memory layout
– raw pointers
– etc.

• But lower levels obscure high-level meaning


– Classes
– Higher-order functions
– Even loops…

33
Issues

• Compiling is almost this simple, but there are


many pitfalls

• Example: How to handle erroneous programs?

• Language design has a big impact on the compiler


– Determines what is easy and hard to compile
– Course theme: many trade-offs in language design

34
Compilers Today

• The overall structure of almost every compiler


adheres to our outline

• The proportions have changed since FORTRAN


– Early: lexing and parsing most complex/expensive

– Today: optimization dominates all other phases, lexing


and parsing are well understood and cheap

• Compilers are now also found inside libraries:


• XLA, TVM, Halide, DBMS, …
35

You might also like