Compiler Design
Translator: Any program written in a programming language must be translated before it can be executed. This translation is typically accomplished by a software system called a translator.

    Source program → Translator → Target program

Compiler: If the source is a high-level language and the target language is object code, then the translator is called a compiler.

Interpreter: Certain other translators transform a programming language into an intermediate code, which can be directly executed. Such a translator is called an interpreter.

    Src program → Intermediate code → Object code

Compiler                        Interpreter
- Faster                        - Slower
- Whole source program          - One line at a time
- Large and complex problems    - Small and simple problems
Source code: Source code is the input to a compiler or other code translator.

Object code: Contains a sequence of instructions that the processor can understand but that is difficult for a human to read or modify.
    .exe: object code (OS / application software)

Byte code: Byte code is the kind of code that can be executed by a virtual machine. The virtual machine reads the byte code and performs the operations it specifies. Byte code is typically slower but platform independent.

Machine code: Machine code is code that is directly executable by the computer's physical processor without further translation.

Date: 08.6.22
Execution of a program
[Figure: stages from a high-level language program to machine code. Labels recoverable from the scan: high-level language, symbol table manager, intermediate code generation, code optimization, machine dependent part, assembly code, machine code (relocated).]
Qualities of a good compiler
What qualities would you want in a compiler?
- Generates correct code (first and foremost)
Uses of compiler technology
- Most common use: translate a high-level program to object code
  - Program translation: binary translation, hardware synthesis
- Optimizations for computer architectures:
  - Improve program performance, take into account hardware parallelism, etc.
  - Automatic parallelisation or vectorisation
- Performance instrumentation: e.g. -pg option of cc or gcc
- Interpreters: e.g. Python, Ruby, Perl, Matlab, sh
- Software productivity tools
  - Debugging aids: e.g. purify
- Security: Java VM uses compiler analysis to prove "safety" of Java code
- Text formatters, just-in-time compilation for Java, power management, global distributed computing
Key: ability to extract properties of a source program (analysis) and transform it to construct a target program (synthesis).
Summary: Lexical Analysis (scanning)
- Reads characters in the source program and groups them into words (basic unit of syntax)
- Produces words and recognises what sort they are
- The output is called a token and is a pair of the form <type, lexeme> or <token_class, attribute>
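The scanning step described above can be sketched in a few lines of Python. This is only an illustration: the token classes, the keyword list and the function name `tokenize` are assumptions made for the example, not part of the notes.

```python
import re

# Hypothetical token classes; the patterns are illustrative only.
TOKEN_SPEC = [
    ("KEYWORD",    r"\b(?:int|float|if|else|while|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER",     r"\d+(?:\.\d+)?"),
    ("OPERATOR",   r"[-+*/=<>]"),
    ("PUNCT",      r"[();,{}]"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Group characters into words and return (type, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":      # whitespace produces no token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("int a = 10;"))
```

Keywords are listed before identifiers so that `int` is classified as a keyword, while a longer word such as `integer` still falls through to the identifier pattern.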
Syntax (or syntactic) Analysis (parsing)
- Imposes a hierarchical structure on the token stream
- This hierarchical structure is usually expressed by recursive rules
- Context-free grammars formalise these recursive rules
[Figure: AST for b*b - 4*a*c; the tree sketch is not recoverable from the scan.]
- An Abstract Syntax Tree (AST) is a more useful data structure for internal representation. It is a compressed version of the parse tree.
- ASTs are one form of IR.
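To make the AST for b*b - 4*a*c concrete, here is a minimal sketch that stores the tree as nested tuples and evaluates it. The tuple layout and the names `AST` and `evaluate` are assumptions chosen for illustration; real compilers use richer node types.

```python
# AST for b*b - 4*a*c as nested tuples: (operator, left, right).
# Leaves are variable names (str) or integer constants (int).
AST = ("-",
       ("*", "b", "b"),
       ("*", ("*", 4, "a"), "c"))


def evaluate(node, env):
    """Walk the tree bottom-up and compute its value in environment env."""
    if isinstance(node, int):
        return node
    if isinstance(node, str):
        return env[node]
    op, left, right = node
    lhs, rhs = evaluate(left, env), evaluate(right, env)
    if op == "+":
        return lhs + rhs
    if op == "-":
        return lhs - rhs
    if op == "*":
        return lhs * rhs
    raise ValueError(f"unknown operator {op!r}")


print(evaluate(AST, {"a": 1, "b": 5, "c": 6}))  # 5*5 - 4*1*6
```

The tree keeps only operators and operands; grouping is expressed by nesting, which is why an AST is more compact than the full parse tree.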
Semantic Analysis (context handling)
- Collects context (semantic) information, checks for semantic errors, and annotates nodes of the tree with the results.
Examples:
- Type checking: report an error if an operator is applied to an incompatible operand
- Check flow-of-control
- Uniqueness or name-related checks
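A type check of the kind listed above can be sketched as a recursive walk over an expression tree. The tuple-based tree, the symbol table layout and the function name `check` are assumptions for this example only.

```python
def check(node, symbols):
    """Return the type of an expression node, or raise on a semantic error.

    node    : int/float constant, variable name (str), or (op, left, right)
    symbols : hypothetical symbol table mapping variable names to type names
    """
    if isinstance(node, bool):
        return "bool"
    if isinstance(node, int):
        return "int"
    if isinstance(node, float):
        return "float"
    if isinstance(node, str):
        if node not in symbols:
            raise NameError(f"undeclared identifier {node!r}")
        return symbols[node]
    op, left, right = node
    left_type = check(left, symbols)
    right_type = check(right, symbols)
    if left_type != right_type:
        # The operator is applied to incompatible operands.
        raise TypeError(f"{op!r} applied to {left_type} and {right_type}")
    return left_type


print(check(("+", "x", 1), {"x": "int"}))
```

This sketch rejects any mixed-type operation; a real checker would also handle implicit conversions and annotate each node with its type.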
Intermediate code generation

Code optimisation
- The goal is to improve the intermediate code and, thus, the effectiveness of code generation and the performance of the target code.
- Optimisations can range from trivial (e.g. constant folding) to highly sophisticated (e.g. in-lining).
- For example: replace the first two statements in the example of the previous slide with: tmp2 = 4*a
- Modern compilers perform such a range of optimisations that one could argue for: [diagram not recoverable from the scan]
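Constant folding, the trivial end of the range of optimisations, can be sketched on a tuple-based expression tree. The tree layout and the function name `fold` are assumptions for illustration; the input 2*2*a is a made-up expression chosen so that the result has the same shape as tmp2 = 4*a.

```python
def fold(node):
    """Replace operator nodes whose operands are both constants by their value."""
    if not isinstance(node, tuple):
        return node                      # constant or variable name
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        if op == "+":
            return left + right
        if op == "-":
            return left - right
        if op == "*":
            return left * right
    return (op, left, right)


# (2 * 2) * a  becomes  4 * a
print(fold(("*", ("*", 2, 2), "a")))
```

The folding happens at compile time, so the generated code performs one multiplication instead of two.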
Source code (input) → translator → source code (output)
Examples: cfront (input C++, output C); HPHPc (input PHP, output C++)
Cross compiler: A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running.
Decompiler: translates machine code back to source code.
Bootstrapping: Bootstrapping is a technique for producing a self-compiling compiler, that is, a compiler (or assembler) written in the source programming language that it intends to compile.
Example: the target is a Pascal translator written in the C programming language; create another Pascal translator written in C++.
[T-diagram not recoverable from the scan.]
Source program → lexical analyzer → stream of tokens
    printf("Enter a number");
    scanf("%d", &a);
Token: (type, attribute)
    int : keyword
    a   : identifier, integer
    10  : number, integer
    // (comments) are not included in the tokens
RE of identifier: l(l + d)*, where l = letter and d = digit
Integer: (d)+
Real number: [pattern not legible in the scan]

Lexical Analyzer
Big picture
- Formal language instead of natural language
- Lexical analyzer → recognizes the parts of speech (like a noun in a natural language): keyword, identifier, ...
- Language
- Token, lexeme, pattern
- Tokenization: lexeme to token
- Why all this?
- Lexical analysis (code)
Regular expressions
- ∅ is a regular expression denoting the empty set
- REs → automatic scanner construction
Example: RE → FA → lexical analyzer
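The step from regular expressions to a working recognizer can be tried out with Python's `re` module, which builds the matcher automatically. The identifier pattern follows the rule letter (letter | digit)* from these notes; the pattern used for real numbers is an assumption, since that rule is not legible in the scan.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")   # letter (letter | digit)*
INTEGER = re.compile(r"[0-9]+")                    # one or more digits
REAL = re.compile(r"[0-9]+\.[0-9]+")               # assumed: digits . digits


def classify(lexeme):
    """Return the first pattern name that matches the whole lexeme."""
    for name, pattern in (("identifier", IDENTIFIER),
                          ("real", REAL),
                          ("integer", INTEGER)):
        if pattern.fullmatch(lexeme):
            return name
    return "unknown"


for word in ("count1", "103", "3.14", "1count"):
    print(word, "->", classify(word))
```

`fullmatch` is used so that the whole lexeme has to belong to the regular language, not just a prefix of it.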
A regular expression represents a regular language.
[Automaton sketches not recoverable from the scan.]

Alphabet: finite
Keyword: finite
Identifier: infinite
Sentence: infinite
Operator: finite

Date: 22.6.22

Regular languages and operations:
- RL = ∅ (empty language)
- RL = {ε} (empty string)
- RL = {a}
- Concatenation
- Union
- Closure
[Worked examples not recoverable from the scan.]
State elimination method
1. Multiple final states must be reduced to a single final state.
2. The starting state should not have incoming transitions.
3. The final state should not have outgoing transitions.
[Example automata not recoverable from the scan.]
Regular grammar: left linear or right linear
Right linear example:
    A → aB
    B → aB | bB | ε
Right-to-left conversion: [worked productions over P, Q and R are not recoverable from the scan]
Exercise: recognize which grammar is right linear. [Grammars not recoverable from the scan.]
General structure of a compiler
[Figure: compiler phases. Labels recoverable from the scan: analysis, code generation, target code optimisation, target code, front-end, back-end.]

Conceptual structure: two major phases
    source code → [front-end] → intermediate code → [back-end] → target code
Front-end: performs the analysis of the source language.
- Recognises legal and illegal programs and reports errors
- "Understands" the input program and collects its semantics in an IR
- Produces IR and shapes the code for the back-end
- Much can be automated
Back-end: does the target language synthesis.
- Chooses instructions to implement each IR operation
- Translates IR into target code
- Needs to conform with system interfaces
- Automation has been less successful

m×n compilers with m+n components
[Figure: front-ends for several source languages (e.g. Fortran, Pascal) and back-ends for several targets sharing one IR.]
- All language-specific knowledge must be encoded in the front-end
- All target-specific knowledge must be encoded in the back-end
Qualities of a good compiler
What qualities would you want in a compiler?
- generates correct code (first and foremost)
- generates fast code
- conforms to the specifications of the input language
- copes with essentially arbitrary input size, variables, etc.
- compilation time (linearly) proportional to size of source
- good diagnostics
- consistent optimisations
- works well with the debugger

Principles of compilation
The compiler must:
- preserve the meaning of the program being compiled
- "improve" the source code in some way
Other issues (depending on the setting):
- Speed (of compiled code)
- Space (size of compiled code)
- Feedback (information provided to the user)
- Debugging (transformations obscure the relationship source code vs. target)
- Compilation time efficiency (fast or slow compiler?)
Historical note: towards higher-level programming
- Machine language (1st generation)
- Assembly languages (2nd generation): early 1950s
- High-level languages (3rd generation): later 1950s
- 4th generation higher-level languages (SQL, Postscript)
- 5th generation languages (logic based, e.g. Prolog)
- Other classifications:
  - Imperative (how); declarative (what)
  - Object-oriented languages
  - Scripting languages

Noam Chomsky divided formal languages into subdivisions:
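Grammar / language class               Recognizer
Regular grammar (RG)                   Finite automaton (FA)
Context-free grammar (CFG)             Pushdown automaton
Context-sensitive grammar (CSG)        Linear bounded automaton
Recursively enumerable set (RES)       Turing machine (TM)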
Context-free grammar (CFG)
- Recursive
- Used to determine whether a string is syntactically valid or not
- Production rules

G = (V, T, P, S)
    E → E + E | E * E | id
    V = {E}
    T = {+, *, id}
    S = E
    P = {E → E + E, E → E * E, E → id}

Valid string: id + id * id
    E ⇒ E + E
      ⇒ id + E           sentential form (combination of terminals and non-terminals)
      ⇒ id + E * E       (ii)
      ⇒ id + id * E      (iii)
      ⇒ id + id * id     (iv)  sentence / statement (no variables)
Date: 27.6.22

Leftmost derivation of id + id * id, using E → E + E | E * E | id:
    E ⇒ E + E
      ⇒ id + E
      ⇒ id + E * E       sentential form
      ⇒ id + id * E
      ⇒ id + id * id

Rightmost derivation of the same string:
    E ⇒ E + E
      ⇒ E + E * E
      ⇒ E + E * id
      ⇒ E + id * id
      ⇒ id + id * id
Ambiguous grammar
More than one leftmost derivation, rightmost derivation or parse tree for the same string → ambiguous grammar.
A parser cannot work with an ambiguous grammar: parsing must be deterministic.
[Parse tree sketches for 2 + 3 * 4 not recoverable from the scan.]

Examples:
    S → aS | Sa | a                        ambiguous grammar
    S → aSbS | bSaS | ε,  w = abab         [two parse trees, not recoverable from the scan]
    R → R + R | RR | R* | a | b | c        (regular expressions),  w = a + b*c
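One way to see the ambiguity of E → E + E | E * E | id concretely is to count how many parse trees a token string has. The sketch below does this by trying every operator as the root of the tree; the function name `count_parse_trees` and the token representation are assumptions for the example.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of tokens under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):    # tokens[k] is the root operator
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

A result greater than 1 means the string has more than one parse tree, which is exactly the definition of ambiguity given above.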
Context-free grammar (CFG): resolving ambiguity
1. Associativity (same precedence)
2. Priority / precedence

E → E + E | id,  string 2 + 3 + 4
[Two parse trees, one grouping to the left and one to the right; same precedence, so the result is the same.]

Left associative: left-recursive grammar
    E → E + id | id
Right associative: right-recursive grammar
    E → id + E | id
[Parse tree sketches for 2 + 3 + 4 not recoverable from the scan.]
Precedence
    E → E + E | E * E | 2 | 3 | 5          (ambiguous)
Unambiguous form:
    E → E + T | T
    T → T * F | F
    F → 2 | 3 | 5
    2 + 3 * 5 = 17
Lower priority → nearer to the root
Higher priority → lower level

Unary minus: higher priority; ↑: right associative
    E → E + E | E * E | E ↑ E | (E) | −E | id
Unambiguous form:
    E → E + T | T
    T → T * F | F
    F → G ↑ F | G
    G → (E) | −G | id

Regular expressions:
    R → R + R | RR | R* | a | b | c

Boolean expressions:
    E → E and E | E or E | not E | True | False
Unambiguous form:
    E → E or T | T
    T → T and F | F
    F → not F | True | False

Unambiguous form of the regular expression grammar:
    E → E + T | T
    T → TF | F
    F → F* | a | b | c
Left recursion:  A → Aα | β   generates  L = βα*
Right recursion: A → αA | β   generates  L = α*β

Eliminating left recursion
A → Aα | β is replaced with a pair of productions that generate the same language L = βα*:
    A  → βA'
    A' → αA' | ε

Example:
    E → E + T | T
    T → T * F | F
    F → id
becomes a context-free, unambiguous, left-recursion-free grammar:
    E  → T E'
    E' → + T E' | ε
    T  → F T'
    T' → * F T' | ε
    F  → id
Leftmost derivation: top-down approach.
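The replacement rule A → Aα | β ⇒ A → βA', A' → αA' | ε can be applied mechanically. The following is a minimal sketch for immediate left recursion only; the list-of-symbols grammar representation and the function name are assumptions made for this example.

```python
EPSILON = "ε"


def eliminate_immediate_left_recursion(nt, productions):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | ε.

    nt          : the non-terminal A
    productions : list of right-hand sides, each a list of symbols
    """
    alphas = [p[1:] for p in productions if p and p[0] == nt]
    betas = [p for p in productions if not p or p[0] != nt]
    if not alphas:
        return {nt: productions}           # nothing to do
    new_nt = nt + "'"
    return {
        # beta A' (an ε beta simply disappears in front of A')
        nt: [[s for s in beta if s != EPSILON] + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[EPSILON]],
    }


print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
print(eliminate_immediate_left_recursion("T", [["T", "*", "F"], ["F"]]))
```

Indirect left recursion is not handled by this sketch; it needs an extra substitution step before the rule can be applied.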
Ambiguous grammar
A grammar is ambiguous if some string has more than one leftmost derivation, i.e. more than one parse tree.
    E → E + E | E * E | id
    Strings: id + id + id,  id + id * id
Ambiguous to unambiguous: done manually; there is no alternative way.
[Parse tree sketches not recoverable from the scan.]

Two ways of analysing ambiguity:
1. Order of precedence is the same → associativity
2. Priority list

Associativity (id + id + id):
1. Left associative → left-recursive rule
2. Right associative → right-recursive rule
    E → E + T | T

Unambiguous: level of precedence
    Lowest precedence → near to the root
    Higher precedence → far away from the root
    E → E + T | T
    T → T * F | F
    F → id
[Derivation tree not recoverable from the scan.]
Context-free grammar: left-recursive grammar and right-recursive grammar
G = (V, T, P, S)
- A context-free grammar is left recursive if some non-terminal A has a derivation A ⇒+ Aα.
- It is right recursive if some non-terminal A has a derivation A ⇒+ αA.

Left recursive:
    S → aAc
    A → Ab | ε
    L = {a bⁿ c | n ≥ 0} = ac, abc, abbc, ...
There must be a production for each non-terminal.
With left recursion a top-down parser can keep expanding A forever (infinite loop): this is the problem in the case of top-down parsing.

Right recursive:
    S → aAc
    A → bA | ε

Elimination of left recursion:
    A → Aα | β    becomes    A → βA',  A' → αA' | ε
Indirect left recursion:
    A → Bα | c
    B → Aβ | d

Date: 8 [illegible] 22
Example:
    S → cAd
    A → ab | a
    w = cad
[Parse tree sketches not recoverable from the scan.]

Parsing of a CFG
- Top-down: leftmost derivation
- Bottom-up: rightmost derivation, shift-reduce parsing
- Deterministic / non-deterministic
Parsing → parse tree generation
Must be deterministic
Non-deterministic grammar:
    A → αβ1 | αβ2 | αβ3,   w = αβ3
Non-deterministic to deterministic: left factoring (no backtracking)
    A  → αA'
    A' → β1 | β2 | β3

Example (if expression then statement else statement):
    S → iEtS | iEtSeS | a
    E → b
After left factoring:
    S  → iEtSS' | a
    S' → eS | ε
    E  → b
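Left factoring can also be sketched as a small function: alternatives that start with the same symbol are grouped, their longest common prefix is pulled out, and the differing tails move to a new non-terminal. The grammar representation and the names `common_prefix` and `left_factor` are assumptions for the example, and the sketch performs a single factoring pass.

```python
EPSILON = "ε"


def common_prefix(sequences):
    """Longest prefix shared by all symbol sequences."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(nt, productions):
    """One pass of left factoring over the (non-empty) alternatives of nt."""
    groups = {}
    for production in productions:
        groups.setdefault(production[0], []).append(production)
    result = {nt: []}
    primes = 0
    for group in groups.values():
        if len(group) == 1:
            result[nt].append(group[0])
            continue
        prefix = common_prefix(group)
        primes += 1
        new_nt = nt + "'" * primes
        result[nt].append(prefix + [new_nt])
        # Tails after the common prefix; an empty tail becomes ε.
        result[new_nt] = [p[len(prefix):] or [EPSILON] for p in group]
    return result


grammar = [list("iEtS"), list("iEtSeS"), ["a"]]
print(left_factor("S", grammar))
```

For the if-then-else grammar this yields S → iEtSS' | a and S' → ε | eS, so the parser can decide after reading the common prefix instead of backtracking.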
Date: 17-8-[illegible]

Left factoring: non-deterministic → deterministic

LL(k)
    L: the input is scanned from left to right
    L: leftmost derivation
    k: number of lookahead symbols
LL(1) → prediction with 1 lookahead symbol
LL(1) grammar | LL(1) language | LL(1) parser

Predictive parsers: a class of parsers (also known as LL(k) parsers) that are implemented with a parsing table: table-driven parsers.
LL(1) grammar → table-driven parser
LL(1) grammar:
* Non-ambiguous
* Non-left-recursive
* Deterministic (left factored)
Recursive descent parser: a parser that uses a set of recursive procedures to recognize its input (may or may not involve backtracking).

    E → E + T | T              E  → T E'
    T → T * F | F              E' → + T E' | ε
    F → (E) | id               T  → F T'
                               T' → * F T' | ε
                               F  → (E) | id

Using this → recursive descent parser:

    procedure E();
    begin
        T();
        EPRIME();
    end

    procedure EPRIME();
        if input_symbol = '+' then
        begin
            ADVANCE();
            T();
            EPRIME();
        end
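The procedures above translate almost line by line into a runnable sketch, with one method per non-terminal of the left-recursion-free grammar. The class name `Parser`, the helper `accepts` and the list-of-tokens input are assumptions for this example; the sketch only recognizes input and builds no tree.

```python
class Parser:
    """Recursive descent recognizer for
    E -> T E',  E' -> + T E' | ε,  T -> F T',  T' -> * F T' | ε,  F -> ( E ) | id
    """

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def advance(self, expected):
        if self.lookahead != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def E(self):
        self.T()
        self.E_prime()

    def E_prime(self):
        if self.lookahead == "+":
            self.advance("+")
            self.T()
            self.E_prime()
        # otherwise E' -> ε: consume nothing

    def T(self):
        self.F()
        self.T_prime()

    def T_prime(self):
        if self.lookahead == "*":
            self.advance("*")
            self.F()
            self.T_prime()
        # otherwise T' -> ε: consume nothing

    def F(self):
        if self.lookahead == "(":
            self.advance("(")
            self.E()
            self.advance(")")
        else:
            self.advance("id")


def accepts(tokens):
    """Return True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.advance("$")
    except SyntaxError:
        return False
    return True


print(accepts(["id", "+", "id", "*", "id"]))
```

Because the grammar is left factored and free of left recursion, one lookahead token is enough to pick the production and no backtracking is needed.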
Predictive parser
A predictive parser is an efficient way of implementing recursive descent parsing by handling the stack of activation records explicitly.

[Fig: predictive parser model: input, stack, parsing table, output (parse tree).]

The parser is controlled by a program that considers X, the top symbol of the stack, and a, the current input symbol. There are three possibilities:
1. If X = a = $, the parser halts and announces successful completion of parsing.
2. If X = a ≠ $, the parser pops X off the stack and advances the input pointer to the next input symbol.
3. If X is a non-terminal symbol, the program consults the entry M[X, a] of the parsing table:
   - if M[X, a] = {X → UVW}, the parser replaces X on top of the stack by WVU (with U on top);
   - if M[X, a] = error, the parser calls an error recovery routine.
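Grammar:
    E  → TE'
    E' → +TE' | ε
    T  → FT'
    T' → *FT' | ε
    F  → (E) | id
w = id + id * id

Stack        Input              Output
$E           id + id * id $     E → TE'
$E'T         id + id * id $     T → FT'
$E'T'F       id + id * id $     F → id
$E'T'id      id + id * id $
$E'T'        + id * id $        T' → ε
$E'          + id * id $        E' → +TE'
$E'T+        + id * id $
$E'T         id * id $          T → FT'
$E'T'F       id * id $          F → id
$E'T'id      id * id $
$E'T'        * id $             T' → *FT'
$E'T'F*      * id $
$E'T'F       id $               F → id
$E'T'id      id $
$E'T'        $                  T' → ε
$E'          $                  E' → ε
$            $                  accept

Parsing table M:
Non-terminal   id          +            *             (           )          $
E              E → TE'                                E → TE'
E'                         E' → +TE'                              E' → ε     E' → ε
T              T → FT'                                T → FT'
T'                         T' → ε       T' → *FT'                 T' → ε     T' → ε
F              F → id                                 F → (E)

The output of the LL(1) parser represents a parse tree:
    E ⇒ TE'
      ⇒ FT'E'
      ⇒ id T'E'
      ⇒ id E'
      ⇒ id + TE'
      ⇒ id + FT'E'
      ⇒ id + id T'E'
      ⇒ id + id * FT'E'
      ⇒ id + id * id T'E'
      ⇒ id + id * id E'
      ⇒ id + id * id
(refers to the leftmost derivation)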
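The table-driven algorithm can be sketched directly from the three cases of the predictive parser. The dictionary `TABLE` below encodes the LL(1) parsing table for the expression grammar, with an empty list standing for ε; the names `TABLE`, `NONTERMINALS` and `parse` are assumptions for this example.

```python
# LL(1) parsing table: (non-terminal, lookahead) -> right-hand side.
TABLE = {
    ("E", "id"): ["T", "E'"],      ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],      ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"],           ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens):
    """Return the productions used, in leftmost-derivation order."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]                     # top of the stack is the list end
    pos = 0
    output = []
    while True:
        top, lookahead = stack[-1], tokens[pos]
        if top == "$" and lookahead == "$":
            return output                  # case 1: successful completion
        if top not in NONTERMINALS:
            if top != lookahead:
                raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
            stack.pop()                    # case 2: match terminal, advance
            pos += 1
        else:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"no rule for ({top}, {lookahead})")
            stack.pop()                    # case 3: replace X by its RHS,
            stack.extend(reversed(rhs))    # leftmost symbol on top
            output.append((top, rhs))


for lhs, rhs in parse(["id", "+", "id", "*", "id"]):
    print(lhs, "->", " ".join(rhs) or "ε")
```

The printed productions appear in the same order as the output column of a stack trace for id + id * id, which is why the parser's output corresponds to a leftmost derivation.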