Principles of Compiler Design
Intermediate Representation
Compiler
Lexical Syntax Semantic
Analysis Analysis Analysis
Abstract
Source Token Target
Syntax
Program stream tree Intermediate Program
Code
Front End Back End
(Language specific)
1
Intermediate Code Generation
• Code generation is a mapping from source level
abstractions to target machine abstractions
• Abstraction at the source level
identifiers, operators, expressions, statements,
conditionals, iteration, functions (user defined, system
defined or libraries)
• Abstraction at the target level
memory locations, registers, stack, opcodes, addressing
modes, system libraries, interface to the operating
systems
2
Intermediate Code Generation ...
• Front end translates a source program into an intermediate
representation
• Back end generates target code from intermediate representation
• Benefits
– Retargeting is possible
– Machine independent code optimization is possible
Intermediate Machine
Front Code Code
end generator generator
3
Three address code
• Assignment • Function
– x = y op z – param x
– x = op y – call p,n
– x=y – return y
• Jump • Pointer
– goto L – x = &y
– if x relop y goto L – x = *y
• Indexed assignment – *x = y
– x = y[i]
– x[i] = y
4
Syntax directed translation of
expression into 3-address code
• Two attributes
• E.place, a name that will hold the
value of E, and
• E.code, the sequence of three-address
statements evaluating E.
• A function gen(…) to produce
sequence of three address statements
– The statements themselves are kept in some
data structure, e.g. list
– SDD operations described using pseudo code
5
Syntax directed translation of
expression into 3-address code
S → id := E
S.code := E.code ||
gen(id.place:= E.place)
E → E1 + E 2
E.place:= newtmp
E.code:= E1.code || E2.code ||
gen(E.place := E1.place + E2.place)
E → E1 * E 2
E.place:= newtmp
E.code := E1.code || E2.code ||
gen(E.place := E1.place * E2.place)
6
Syntax directed translation of
expression …
E → -E1
E.place := newtmp
E.code := E1.code ||
gen(E.place := - E1.place)
E → (E1)
E.place := E1.place
E.code := E1.code
E → id
E.place := id.place
E.code := ‘ ‘
7
Example
For a = b * -c + b * -c
following code is generated
t1 = -c
t2 = b * t1
t3 = -c
t4 = b * t3
t5 = t2 + t4
a = t5
8
Flow of Control
S → while E do S1 S.begin := newlabel
Desired Translation is S.after := newlabel
S. begin :
S.code := gen(S.begin:) ||
E.code
E.code ||
if E.place = 0 goto S.after
gen(if E.place = 0 goto S.after) ||
S1.code S1.code ||
goto S.begin gen(goto S.begin) ||
S.after : gen(S.after:)
9
Flow of Control …
S → if E then S1 else S2 S.else := newlabel
S.after := newlabel
E.code
if E.place = 0 goto S.else S.code = E.code ||
S1.code gen(if E.place = 0 goto S.else) ||
S1.code ||
goto S.after
gen(goto S.after) ||
S.else: gen(S.else :) ||
S2.code S2.code ||
S.after: gen(S.after :)
10
Declarations
P→ D
D→D;D
D → id : T
T → integer
T → real
11
Declarations
For each name create symbol table entry with information
like type and relative address
P → {offset=0} D
D→D;D
D → id : T
enter(id.name, T.type, offset);
offset = offset + T.width
T → integer
T.type = integer; T.width = 4
T → real
T.type = real; T.width = 8
12
Declarations
For each name create symbol table entry with information
like type and relative address
P → {offset=0} D
D→D;D
D → id : T
enter(id.name, T.type, offset);
offset = offset + T.width
T → integer
T.type = integer; T.width = 4
T → real
T.type = real; T.width = 8
13
Declarations …
T → array [ num ] of T1
T.type = array(num.val, T1.type)
T.width = num.val x T1.width
T → ↑T1
T.type = pointer(T1.type)
T.width = 4
14
Keeping track of local information
• when a nested procedure is seen, processing of
declaration in enclosing procedure is temporarily
suspended
• assume following language
P→D
D → D ;D | id : T | proc id ;D ; S
• a new symbol table is created when procedure
declaration
D → proc id ; D1 ; S is seen
• entries for D1 are created in the new symbol table
• the name represented by id is local to the enclosing
procedure 15
Example
program sort;
var a : array[1..n] of integer;
x : integer;
procedure readarray;
var i : integer;
……
procedure exchange(i,j:integers);
……
procedure quicksort(m,n : integer);
var k,v : integer;
function partition(x,y:integer):integer;
var i,j: integer;
……
……
begin{main}
……
end.
16
sort
nil header
a
x
readarray to readarray
exchange to exchange
quicksort
readarray exchange quicksort
header header header
i k
v
partition
partition
header
i
j 17
Creating symbol table: Interface
• mktable (previous)
create a new symbol table and return a pointer to the
new table. The argument previous points to the
enclosing procedure
• enter (table, name, type, offset)
creates a new entry
• addwidth (table, width)
records cumulative width of all the entries in a table
• enterproc (table, name, newtable)
creates a new entry for procedure name. newtable
points to the symbol table of the new procedure
• Maintain two stacks: (1) symbol tables and (2) offsets
• Standard stack operations: push, pop, top
18
Creating symbol table …
D→ proc id;
{t = mktable(top(tblptr));
push(t, tblptr); push(0, offset)}
D1; S
{t = top(tblptr);
addwidth(t, top(offset));
pop(tblptr); pop(offset);
enterproc(top(tblptr), id.name, t)}
D→ id: T
{enter(top(tblptr), id.name, T.type, top(offset));
top(offset) = top (offset) + T.width}
19
Creating symbol table …
P→
{t=mktable(nil);
push(t,tblptr);
push(0,offset)}
D
{addwidth(top(tblptr),top(offset));
pop(tblptr); // save it somewhere!
pop(offset)}
D→ D;D
20
Field names in records
T → record
{t = mktable(nil);
push(t, tblptr); push(0, offset)}
D end
{T.type = record(top(tblptr));
T.width = top(offset);
pop(tblptr); pop(offset)}
21
Names in the Symbol table
S → id := E
{p = lookup(id.place);
if p <> nil then emit(p := E.place)
else error}
emit is like gen, but
instead of returning
E → id code, it generates
{p = lookup(id.name); code as a side effect
if p <> nil then E.place = p in a list of three
else error} address instructions.
22
Type conversion within assignments
E → E1+ E2
E.place= newtmp;
if E1.type = integer and E2.type = integer
then emit(E.place ':=' E1.place 'int+' E2.place);
E.type = integer;
…
similar code if both E1.type and E2.type are real
…
else if E1.type = int and E2.type = real
then
u = newtmp;
emit(u ':=' inttoreal E1.place);
emit(E.place ':=' u 'real+' E2.place);
E.type = real;
…
similar code if E1.type is real and E2.type is integer
26
Example
real x, y;
int i, j;
x=y+i*j
generates code
t1 = i int* j
t2 = inttoreal t1
t3 = y real+ t2
x = t3
27
Boolean Expressions
• compute logical values
• change the flow of control
• boolean operators are: and or not
E → E or E
| E and E
| not E
| (E)
| id relop id
| true
| false
28
Methods of translation
• Evaluate similar to arithmetic expressions
– Normally use 1 for true and 0 for false
• implement by flow of control
– given expression E1 or E2
if E1 evaluates to true
then E1 or E2 evaluates to true
without evaluating E2
29
Numerical representation
• a or b and not c
t1 = not c
t2 = b and t1
t3 = a or t2
• relational expression a < b is equivalent to
if a < b then 1 else 0
1. if a < b goto 4.
2. t = 0
3. goto 5
4. t = 1
5.
30
Syntax directed translation of
boolean expressions
E → E1 or E2
E.place := newtmp
emit(E.place ':=' E1.place 'or' E2.place)
E → E1 and E2
E.place:= newtmp
emit(E.place ':=' E1.place 'and' E2.place)
E → not E1
E.place := newtmp
emit(E.place ':=' 'not' E1.place)
E → (E1) E.place = E1.place
31
Syntax directed translation of
boolean expressions
E → id1 relop id2
E.place := newtmp
emit(if id1.place relop id2.place goto nextstat+3)
emit(E.place = 0)
emit(goto nextstat+2)
emit(E.place = 1)
“nextstat” is a global
E → true variable; a pointer to
E.place := newtmp the statement to be
emit(E.place = '1')
emitted. emit also
E → false updates the nextstat
E.place := newtmp as a side-effect.
emit(E.place = '0')
32
Example:
Code for a < b or c < d and e < f
100: if a < b goto 103 if e < f goto 111
101: tl = 0 109: t3 = 0
102: goto 104 110: goto 112
103: tl = 1 111: t3 = 1
104:
112:
if c < d goto 107
t4 = t2 and t3
105: t2 = 0
106: goto 108 113: t5 = tl or t4
107: t2 = 1
108:
33
Short Circuit Evaluation of boolean
expressions
• Translate boolean expressions without:
– generating code for boolean operators
– evaluating the entire expression
Each Boolean
• Flow of control statements expression E has two
attributes, true and
S → if E then S1 false. These
| if E then S1 else S2 attributes hold the
label of the target
| while E do S1 stmt to jump to.
34
Control flow translation of
boolean expression
if E is of the form: a<b
then code is of the form: if a < b goto E.true
goto E.false
E → id1 relop id2
E.code = gen( if id1 relop id2 goto E.true) ||
gen(goto E.false)
E → true E.code = gen(goto E.true)
E → false E.code = gen(goto E.false)
35
E.true
E.code
E.false
E.true
S1.code
E.false
S → if E then S1
E.true = newlabel
E.false = S.next
S1.next = S.next
S.code = E.code ||
gen(E.true ':') ||
S1.code
36
S → if E then S1 else S2
E.true
E.code E.true = newlabel
E.false E.false = newlabel
E.true
S1.next = S.next
S1.code
S2.next = S.next
goto S.next S.code = E.code ||
E.false
S2.code
gen(E.true ':') ||
S1.code ||
S.next
gen(goto S.next) ||
gen(E.false ':') ||
S2.code
37
S → while E do S1
S.begin E.true S.begin = newlabel
E.code
E.false E.true = newlabel
E.true E.false = S.next
S1.code S1.next = S.begin
S.code = gen(S.begin ':') ||
goto S.begin E.code ||
E.false gen(E.true ':') ||
S1.code ||
gen(goto S.begin)
38
Control flow translation of
boolean expression
E → E1 or E2
E1.true := E.true
E1.false := newlabel
E2.true := E.true
E2.false := E.false
E.code := E1.code || gen(E1.false) || E2.code
E → E1 and E2
E1.true := newlabel
E1 false := E.false
E2.true := E.true
E2 false := E.false
E.code := E1.code || gen(E1.true) || E2.code
39
Control flow translation of
boolean expression …
E → not E1 E1.true := E.false
E1.false := E.true
E.code := E1.code
E → (E1) E1.true := E.true
E1.false := E.false
E.code := E1.code
40
Example
Code for a < b or c < d and e < f
if a < b goto Ltrue
goto L1
L1: if c < d goto L2
goto Lfalse
L2: if e < f goto Ltrue
goto Lfalse
Ltrue:
Lfalse:
41
Example …
Code for while a < b do
if c<d then x=y+z
else x=y-z
L1: if a < b goto L2
goto Lnext
L2: if c < d goto L3
goto L4
L3: t1 = Y + Z
X= t1
goto L1
L4: t1 = Y - Z
X= t1
goto L1
Lnext:
42
Case Statement
• switch expression
begin
case value: statement
case value: statement
….
case value: statement
default: statement
end
• evaluate the expression
• find which value in the list of cases is the same as
the value of the expression.
– Default value matches the expression if none of the
values explicitly mentioned in the cases matches the
expression
• execute the statement associated with the value
found
43
Translation
code to evaluate E into t code to evaluate E into t
if t <> V1 goto L1 goto test
code for S1 L1: code for S1
goto next goto next
L1 if t <> V2 goto L2
L2: code for S2
code for S2
goto next
goto next
……
L2: ……
Ln-2 if t <> Vn-l goto Ln-l Ln: code for Sn
code for Sn-l goto next
goto next test: if t = V1 goto L1
Ln-1: code for Sn if t = V2 goto L2
next: ….
if t = Vn-1 goto Ln-1
goto Ln
next:
Efficient for n-way branch 44
BackPatching
• way to implement boolean expressions and
flow of control statements in one pass
• code is generated as quadruples into an
array
• labels are indices into this array
• makelist(i): create a newlist containing only
i, return a pointer to the list.
• merge(p1,p2): merge lists pointed to by p1
and p2 and return a pointer to the
concatenated list
• backpatch(p,i): insert i as the target label for
the statements in the list pointed to by p
45
Boolean Expressions
E → E1 or M E2
| E1 and M E2
| not E1
| (E1)
| id1 relop id2
| true
| false
M→Є
• Insert a marker non terminal M into the grammar
to pick up index of next quadruple.
• attributes truelist and falselist are used to
generate jump code for boolean expressions
• incomplete jumps are placed on lists pointed to
by E.truelist and E.falselist 46
Boolean expressions …
• Consider E → E1 and M E2
– if E1 is false then E is also false so
statements in E1.falselist become
part of E.falselist
– if E1 is true then E2 must be tested
so target of E1.truelist is beginning
of E2
– target is obtained by marker M
– attribute M.quad records the
number of the first statement of
E2.code 47
E → E1 or M E2
backpatch(E1.falselist, M.quad)
E.truelist = merge(E1.truelist, E2.truelist)
E.falselist = E2.falselist
E → E1 and M E2
backpatch(E1.truelist, M.quad)
E.truelist = E2.truelist
E.falselist = merge(E1.falselist, E2.falselist)
E → not E1
E.truelist = E1 falselist
E.falselist = E1.truelist
E → ( E1 )
E.truelist = E1.truelist
E.falselist = E1.falselist
48
E → id1 relop id2
E.truelist = makelist(nextquad)
E.falselist = makelist(nextquad+ 1)
emit(if id1 relop id2 goto --- )
emit(goto ---)
E → true
E.truelist = makelist(nextquad)
emit(goto ---)
E → false
E.falselist = makelist(nextquad)
emit(goto ---)
M→ Є
M.quad = nextquad
49
Generate code for
a < b or c < d and e < f
100: if a < b goto -
Initialize nextquad to 100 101: goto - 102
102: if c < d goto - 104
103: goto -
E.t={100,104} 104: if e < f goto -
E.f={103,105} 105 goto –
backpatch(102,104)
E.t={100} E.t={104} backpatch(101,102)
or M.q=102
E.f={101} E.f={103,105}
Є
a < b
E.t={102} and M.q=104 E.t ={104}
E.f={103} E.f={105}
Є
c < d e < f 50
Flow of Control Statements
S if E then S1
| if E then S1 else S2
| while E do S1
| begin L end
|A
LL;S
| S
S : Statement
A : Assignment
L : Statement list
51
Scheme to implement translation
• E has attributes truelist and falselist
• L and S have a list of unfilled quadruples to
be filled by backpatching
• S while E do S1
requires labels S.begin and E.true
– markers M1 and M2 record these labels
S while M1 E do M2 S1
– when while. .. is reduced to S
backpatch S1.nextlist to make target of all the
statements to M1.quad
– E.truelist is backpatched to go to the beginning
of S1 (M2.quad)
52
Scheme to implement translation …
S if E then M S1
backpatch(E.truelist, M.quad)
S.nextlist = merge(E.falselist,
S1.nextlist)
S if E them M1 S1 N else M2 S2
backpatch(E.truelist, M1.quad)
backpatch(E.falselist, M2.quad )
S.next = merge(S1.nextlist,
N.nextlist,
S2.nextlist)
53
Scheme to implement translation …
S while M1 E do M2 S1
backpatch(S1.nextlist, M1.quad)
backpatch(E.truelist, M2.quad)
S.nextlist = E.falselist
emit(goto M1.quad)
54
Scheme to implement translation …
S begin L end S.nextlist = L.nextlist
SA S.nextlist = makelist()
L L1 ; M S backpatch(L1.nextlist,
M.quad)
L.nextlist = S.nextlist
LS L.nextlist = S.nextlist
N N.nextlist = makelist(nextquad)
emit(goto ---)
M M.quad = nextquad
55