0% found this document useful (0 votes)
25 views19 pages

PCD - Unit Iii

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
25 views19 pages

PCD - Unit Iii

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 19

UNIT III

Intermediate Code Generation


Syntax Directed Definitions – Evaluation order for SDD - Intermediate Code Generation – Three
address code Types and Declarations - Type Systems -Specification of a simple type checker,
Equivalence of Type Expressions, Type Conversions.

Syntax Directed Definitions


• A formalist called as syntax directed definition is used fort specifying translations for
programming language constructs.
• A syntax directed definition is a generalization of a context free grammar in which each grammar
symbol has associated set of attributes and each and each production is associated with a set of
semantic rules

Definition of (Syntax Directed Definition) SDD:

SDD is a generalization of CFG in which each grammar productions X->α is associated with it a
setof semantic rules of the form

a: = f(b1,b2…..bk)

Where a is an attribute obtained from the function f.

• A syntax-directed definition is a generalization of a context-free grammar in which:

– Each grammar symbol is associated with a set of attributes.

– This set of attributes for a grammar symbol is partitioned into two subsets called synthesized and
inherited attributes of that grammar symbol.

– Each production rule is associated with a set of semantic rules.

• Semantic rules set up dependencies between attributes which can be represented by a


dependencygraph.

• This dependency graph determines the evaluation order of these semantic rules.

• Evaluation of a semantic rule defines the value of an attribute. But a semantic rule may also
havesome side effects such as printing a value.

The two attributes for non terminal are :

1) Synthesized attribute (S-attribute) : (↑)

An attribute is said to be synthesized attribute if its value at a parse tree node is determined
fromattribute values at the children of the node

2) Inherited attribute: (↑,→)

An inherited attribute is one whose value at parse tree node is determined in terms of attributes at
theparent and | or siblings of that node.

• The attribute can be string, a number, a type, a, memory location or anything else.
• The parse tree showing the value of attributes at each node is called an annotated parse tree.

The process of computing the attribute values at the node is called annotating or decorating the
parsetree. Terminals can have synthesized attributes, but not inherited attributes.

Annotated Parse Tree

• A parse tree showing the values of attributes at each node is called an Annotated parse tree.

• The process of computing the attributes values at the nodes is called annotating (or
decorating) ofthe parse tree.

• Of course, the order of these computations depends on the dependency graph induced by
thesemantic rules.
Ex1:1) Synthesized Attributes : Ex: Consider the CFG :
S→ EN E→ E+T E→E-T E→ T T→ T*F T→T/F T→F F→ (E) F→digit N→;
Solution: The syntax directed definition can be written for the above grammar by using
semanticactions for each production.

Production rule Semantic actions

S →EN S.val=E.val
E →E1+T E.val =E1.val + T.val
E →E1-T E.val = E1.val – T.val
E →T E.val =T.val
T →T*F T.val = T.val * F.val
T →T|F T.val =T.val | F.val
F → (E) F.val =E.val
T →F T.val =F.val
F →digit F.val =digit.lexval
N →; can be ignored by lexical Analyzer as; I
is terminating symbol

For the Non-terminals E,T and F the values can be obtained using the attribute

“Val”.The taken digit has synthesized attribute “lexval”.

In S→EN, symbol S is the start symbol. This rule is to print the final answer of expressed.

Following steps are followed to Compute S attributed definition

1. Write the SDD using the appropriate semantic actions for corresponding production rule of
thegiven Grammar.

2. The annotated parse tree is generated and attribute values are computed. The Computation is
donein bottom up manner.

3. The value obtained at the node is supposed to be final output.

PROBLEM 1:
Consider the string 5*6+7; Construct Syntax tree, parse tree and annotated tree.

Solution:

The corresponding annotated parse tree is shown below for the string 5*6+7;

Syntax tree:

Annotated parse tree :

Advantages: SDDs are more readable and hence useful for specifications

Disadvantages: not very efficient.


Ex2:

PROBLEM : Consider the grammar that is used for Simple desk calculator. Obtain
theSemantic action and also the annotated parse tree for the string
3*5+4n. L→En E→E1+T
E→T
T→T
1*F
T→F
F→
(E)
F→digit
Solution :

Production rule Semantic actions


L→En L.val=E.val
E→E1+T E.val=E1.val + T.val
E→T E.val=T.val
T→T1*F T.val=T1.val*F.val
T→F T.val=F.val
F→(E) F.val=E.val
F→digit F.val=digit.lexval

The corresponding annotated parse tree U shown below, for the string 3*5+4n.

Evaluation order for SDD

Evaluation order for SDD includes how the SDD(Syntax Directed Definition) is evaluated
with the help of attributes, dependency graphs, semantic rules, and S and L attributed
definitions. SDD helps in the semantic analysis in the compiler so it’s important to know
about how SDDs are evaluated and their evaluation order. This article provides detailed
information about the SDD evaluation. It requires some basic knowledge of grammar,
production, parses tree, annotated parse tree, synthesized and inherited attributes.
Terminologies:
• Parse Tree: A parse tree is a tree that represents the syntax of the production hierarchically.
• Annotated Parse Tree: Annotated Parse tree contains the values and attributes at each node.
• Synthesized Attributes: When the evaluation of any node’s attribute is based on children.
• Inherited Attributes: When the evaluation of any node’s attribute is based on children or
parents.
Dependency Graphs:
A dependency graph provides information about the order of evaluation of attributes with the
help of edges. It is used to determine the order of evaluation of attributes according to the
semantic rules of the production. An edge from the first node attribute to the second node
attribute gives the information that first node attribute evaluation is required for the evaluation
of the second node attribute. Edges represent the semantic rules of the corresponding
production.
Dependency Graph Rules: A node in the dependency graph corresponds to the node of the
parse tree for each attribute. Edges (first node from the second node)of the dependency graph
represent that the attribute of the first node is evaluated before the attribute of the second
node.

Ordering the Evaluation of Attributes:


The dependency graph provides the evaluation order of attributes of the nodes of the parse
tree. An edge( i.e. first node to the second node) in the dependency graph represents that the
attribute of the second node is dependent on the attribute of the first node for further
evaluation. This order of evaluation gives a linear order called topological order.

There is no way to evaluate SDD on a parse tree when there is a cycle present in the graph and
due to the cycle, no topological order exists.
Explanation of dependency graph:
Node number in the graph represents the order of the evaluation of the associated attribute.
Edges in the graph represent that the second value is dependent on the first value.

Table-1 represents the attributes corresponding to each node.


Table-2 represents the semantic rules corresponding to each edge.
Table-1
Node Attribute
1 digit.lexval
2 digit.lexval
3 digit.lexval
4 B.syn
5 B.syn
6 B.syn
7 A1.syn
8 A.syn
9 A1.inh
10 S.val

S-Attributed Definitions:
S-attributed SDD can have only synthesized attributes. In this type of definitions semantic rules
are placed at the end of the production only. Its evaluation is based on bottom up parsing.

Example: S ⇢ AB { S.x = f(A.x | B.x) }


L-Attributed Definitions:
L-attributed SDD can have both synthesized and inherited (restricted inherited as attributes can
only be taken from the parent or left siblings). In this type of definition, semantics rules can be
placed anywhere in the RHS of the production. Its evaluation is based on inorder (topological
sorting).
Example: S ⇢ AB {A.x = S.x + 2} or S ⇢ AB { B.x = f(A.x | B.x) } or S ⇢ AB { S.x = f(A.x
| B.x) }
Note:
Every S-attributed grammar is also L-attributed.
For L-attributed evaluation in order of the annotated parse tree is used.
For S-attributed reverse of the rightmost derivation is used.
Three address code
Three address code is a type of intermediate code which is easy to generate and can be easily
converted to machine code. It makes use of at most three addresses and one operator to
represent an expression and the value computed at each instruction is stored in temporary
variable generated by compiler. The compiler decides the order of operation given by three
address code.

General representation –

a = b op c
Where a, b or c represents operands like names, constants or compiler generated temporaries
and op represents the operator

Example-1: Convert the expression a * – (b + c) into three address code.

Example-2: Write three address code for following code


for(i = 1; i<=10; i++)
{
a[i] = x * 5;
}
Implementation of Three Address Code –
There are 3 representations of three address code namely
1. Quadruple
2. Triples
3. Indirect Triples
1. Quadruple – It is a structure which consists of 4 fields namely op, arg1, arg2 and result. op
denotes the operator and arg1 and arg2 denotes the two operands and result is used to store the
result of the expression.
Advantage –
• Easy to rearrange code for global optimization.
• One can quickly access value of temporary variables using symbol table.
Disadvantage –
• Contain lot of temporaries.
• Temporary variable creation increases time and space complexity.
Example – Consider expression a = b * – c + b * – c. The three address code is:
t1 = uminus c
t2 = b * t1
t3 = uminus c
t4 = b * t3
t5 = t2 + t4
a = t5
2. Triples – This representation doesn’t make use of extra temporary variable to represent a
single operation instead when a reference to another triple’s value is needed, a pointer to that
triple is used. So, it consist of only three fields namely op, arg1 and arg2.
Disadvantage –
• Temporaries are implicit and difficult to rearrange code.
• It is difficult to optimize because optimization involves moving intermediate code.
When a triple is moved, any other triple referring to it must be updated also. With help
of pointer one can directly access symbol table entry.
Example – Consider expression a = b * – c + b * – c

3. Indirect Triples – This representation makes use of pointer to the listing of all references to
computations which is made separately and stored. Its similar in utility as compared to
quadruple representation but requires less space than it. Temporaries are implicit and easier to
rearrange code.
Example – Consider expression a = b * – c + b * – c
Question – Write quadruple, triples and indirect triples for following expression : (x + y) * (y
+ z) + (x + y + z)
Explanation – The three address code is:
t1 = x + y
t2 = y + z
t3 = t1 * t2
t4 = t1 + z
t5 = t3 + t4
Type Expressions
A primary type is a type of expression. Typical basic types for a language contain boolean,
char, integer, float, and void; the void implies "the absence of a value."
We can form a type name or type expression by applying the array type constructor to a number
and a type expression.
Types have structure, which we can represent using type expressions: a type expression is
either a primary type or is formed by applying a type constructor operator to a type expression.
The basic types and constructors depend on the language to be checked.
For Example,

The array type int [2][3][2] can be read as an "array of 2 arrays each of 3 arrays of 2 integers
each" and written as a type expression array(2, array(3, array(2, integer))).
Basic types
char, int, double, float are type expressions.
Type names
A name used to denote a type expressions.
Arrays
If E is a type expression and n is an int, then
array(n, E)
is a type expression indicating the type of an array with elements in T and indices in the range
0 ... n - 1.
Products
If E1 and E2 are type expressions then
E1 × E2
is a type expression indicating the type of an element of the Cartesian product of E1 and E2.
The products of more than two types are defined similarly, and this product operation is left-
associative. Hence
(E1 × E2) × E3 and E1 × E2 × E3
are equivalent.
Records
The only difference between a record and a product is that the fields of a record have names.
If abc and xyz are two type names, if E1 and E2 are type expressions then
record(abc: E1, xyz: E2)
is a type expression indicating the type of an element of the Cartesian product of T1 and T2.
Pointers
Let E is a type expression, then
pointer(E)
is a type expression denoting the type pointer to an object of type T.
Function types
If E1 and E2 are type expressions then
E1 -> E2
is a type expression denoting the type of functions associating an element from E1 with an
element from E2.
Type Equivalence
When graphs represent type expressions, two types are structurally equivalent if and only if
one of the following conditions is true:
• They are of the same basic type.
• They are formed by applying the identical constructor to structurally equivalent types.
• One is a type name that refers to the other.

The first two conditions in the above definition lead to the name equivalence of type
expressions if type names are interpreted as though they stand for themselves.
"If two type expressions are equal, return a specified type else error," says a common type-
checking rule. Ambiguities might develop when type expressions are given names and then
utilized in later type expressions. The critical question is whether a type expression's name
refers to itself or is an abbreviation for another type of expression.
Type names can create implicit cycles by denoting type expressions. The resulting graph can
have cycles owing to recursive types if edges to type names are redirected to the type
expressions given by the names.
Example of a non Cyclic graph:

Declarations
When we encounter declarations, we need to layout storage for the declared variables.
For every local name in a procedure, we create a Symbol Table(ST) entry including:
1. The type of the name
2. How much storage the name requires

Grammar:
D -> real, id
D -> integer, id
D -> D1, id

We use ENTER to enter the symbol table and use ATTR to trace the data type.
A succession of declarations is generated by nonterminal D: Basic, array, and record types are
developed by nonterminal T. One of the basic types int and float is caused by nonterminal B.
Nonterminal C, which stands for "component," creates strings of zero or more integers, each
encircled by brackets. A primary kind given by B is followed by array components specified
by nonterminal C in an array type. A record type (T's second product) is a set of declarations
for the record fields, all of which are enclosed in curly braces.

Specification of a Simple Type Checker


• Specification of a simple type checker for a simple language in which the type of
each identifier must be declared before the identifier is used.
• The type checker is a translation scheme that synthesizes the type of each expression
from the types of its subexpressions.
• The type checker can handle arrays, pointers, statements, and functions.
• Specification of a simple type checker includes the following:
o A Simple Language
o Type Checking of Expressions
o Type Checking of Statements
o Type Checking of Functions

Simple Language
• The following grammar generates programs, represented by the nonterminal P,
consisting of a sequence of declarations D followed by a single expression E.

P -> D ; E

D -> D ; D | id : T

T -> char | integer | array[num] of T | ‡ T

A translation scheme for above rules:

P -> D ; E

D -> D ; D

D -> id : T { addtype(id.entry, T.type }


T -> char { T.type := char }

T -> integer { T.type := integer }

T -> ‡ T1 { T.type := pointer(T1.type) }

T -> array[num] of T1 { T.type := array(1..num.val, T1.type) }

Type Checking of Expressions


• The synthesized attribute type for E gives the type expression assigned by the type
system to the expression generated by E.
• The following semantic rules say that constants represented by the
tokens literal and num have type char and integer, respectively:

Rule Semantic Rule

E -> literal { E.type := char }

E -> num { E.type := integer}

• A function lookup(e) is used to fetch the type saved in the symbol-table entry pointed
to by e.
• When an identifier appears in an expression, its declared type is fetched and assigned
to the attribute type;

E -> id { E.type := lookup(id.entry}

• The expression formed by applying the mod operator to two subexpressions of type
integer has type integer; otherwise, its type is type_error. The rule is

E -> E1 mod E2 { E.type := if E1.type = integer and E2.type = integer

then integer

else typr_error}
• In an array reference E1 [E2], the index expression E2 must have type integer, in which
case the result is the dement type t obtained from the type array(s. t ) of E1; we make
no use of the index set s of the array.

E -> E1 [ E2 ] { E.type := if E2.type = integer and E1.type = array(s,t)


then t

else typr_error}
• Within expressions, the postfix operator ↑ yields the object pointed to by its operand.
The type of E ‡ is the type ‡ of the object pointed to by the pointer E:

E -> E1 ‡ { E.type := if E1.type = ponter(t) then t

else typr_error}

Type Checking of Statements


• Since language constructs like statements typically do not have values, the special basic
type void can be assigned to them.
• If an error is detected within a statement, then the type type_error assigned.
• The assignment statement, conditional statement, and while statements are considered
for the type checking.
• The Sequences of statements are separated by semicolons.

Statement Rule Semantic Rule

Assignment S -> id := E {S.type := if id.type = E,type then void

else type_error}

Conditional E -> if E then S1 { S.type := if E.type = Boolean then S1.type

else type_error}

While E -> while E do S1 { S.type := if E.type = Boolean then S1.type

else type_error}

Sequence E -> S1 ; S2 { S.type := if S1.type = void and S2.type =


void then void

else type_error}

Type Checking of Functions


• The application of a function to an argument can be captured by the production

E -> E ( E )
• An expression is the application of one expression to another. The rules for associating
type expressions with nonterminal T can be augmented by the following production
and action to permit function types in declarations.

T -> T1' -> T2' {T.type := T1.type -> T2.type}

• Quotes around the arrow used as a function constructor distinguish it from the arrow
used as the meta symbol in a production.
• The rule for checking the type of a function application is

E -> E1 ( E2 ) { E.type := if E2.type = s and E1.type = s -> t

then t

else typr_error}

EQUIVALENCE OF EXPRESSIONS IN TYPE CHECKING


Type checker find whether two type expressions are equivalent or not.
This type equivalence is of two categories:
1. Structural equivalence
2. Name equivalence.
In type checking if two type expressions are equal then return a certain type else return type-
error.
Structural equivalence:
Replace the named types by their definitions and recursively check the substituted trees. If type
expressions are built from basic types and constructors then those expressions are called
structurally equivalent.
Example:
S1 S2 Equivalence Reason
Char Char S1 equivalent to S2 Similar basic types
Pointer (char) Pointer (char) S1 equivalent to S2 Similar constructor ptr to the char type
Name equivalence :
Two type expressions are name equivalent if and only if they are identical, that is if they can
be represented by the same syntax tree, with the same labels.

Example:
typedef struct Node
{
int x;
} Node;
Node *first,*second;
Struct Node *last1,*last2;

The variables first and second are name equivalent similarly last1 and last2 are name
equivalent.
Type Conversion
The type conversion is an operation that takes a data object of one type and creates the
equivalent data objects of multiple types. The signature of a type conversion operation is given
as
conversion_op :type1→type2
There are two types of type conversions which are as follows −
• Implicit type conversion (Coercions) − The programming languages that enable
mixed-mode expressions should describe conventions for implicit operand type
conversions.
Coercion is defined as an automatic conversion between types. For example in Pascal, if the
operands for the addition operation are of integer type and other real types, one of then the
integer data object is implicitly changed to type real earlier the addition is implemented.
• Explicit type conversion − Some languages support few efficiencies for doing explicit
conversions, both widening and narrowing. In some cases, warning messages are
created when an explicit narrowing conversion results in a meaningful change to the
value of the object being modified.
For example, Pascal supports a built-in function round that changes a real-number data object
to an integer data object with a value similar to the rounded value of the real. In C-based
languages, explicit type conversions are known as casts. The desired type is located in
parentheses only before the expression to be modified, as shown in (int) X for float X converts
the value of X to type integer. One of the reasons for the parenthesis in C conversions is that C
has several two-word type names, such as long int.

You might also like