UNIT III:
Syntax Directed Translation: Syntax-Directed Definitions, Evaluation Orders for
SDD’s, Applications of Syntax Directed Translation, Syntax-Directed Translation
Schemes, Implementing L-Attributed SDD’s.
Intermediate Code Generation: Variants of Syntax Trees, Three Address Code,
Types and Declarations, Translation of Expressions, Type Checking, Control
Flow, Backpatching, Intermediate Code for Procedures.
Introduction:
The input string is checked syntactically by the syntax analyzer. In semantic analysis the compiler analyzes the meaning of the program: beyond syntax analysis, the programming constructs themselves are examined, so additional rules beyond the syntax are imposed in this phase. Semantic analysis is carried out by scanning the input text and is static in nature.
Need of Semantic analysis:
Semantic analysis is done in order to obtain the precise meaning of each programming construct. This phase is also known as static semantic analysis.
The need for semantic analysis is:
1. Firstly, it is used to build the symbol table from the information established in declarations.
2. Secondly, the data type of each identifier is obtained, and type checking of whole expressions and statements is done so that the type rules of the language are followed.
3. Thirdly, it is used to identify the scope of identifiers.
Syntax Directed Translation (SDT)
While doing the static analysis of the language we use a syntax directed definition, i.e., an augmented CFG: a set of attributes is associated with each terminal and non-terminal symbol. An attribute can be a string, a number, a type, a memory location or anything else.
The conceptual view of syntax directed translation is as follows.
First we parse the input token string and a parse tree is generated. Then the tree is traversed to evaluate the semantic rules at the parse tree nodes.
An implementation need not follow all of these steps explicitly. In a single-pass implementation the semantic rules can be evaluated during parsing, without explicit construction of the parse tree or a dependency graph. In such an evaluation at the nodes of the tree, the values of the attributes are computed for the given input string. A parse tree containing the values of the attributes at each node is called an annotated (decorated) parse tree.
There are two types of attributes:
1) Synthesized attributes
2) Inherited attributes
Synthesized attributes
If an attribute of a parent node depends on the attributes of its children, the attribute is called a synthesized attribute.
If an SDD uses only synthesized attributes it is called an S-attributed definition. To compute the attributes of an S-attributed SDD we follow the rules below (a small sketch follows this list):
•Write the syntax directed definition (SDD), i.e., the appropriate semantic action for each rule of the given grammar.
•Construct an annotated parse tree. The computation is done in a bottom-up manner.
•The value obtained at the root node is the final output.
•For the non-terminals E, T and F the values are held in the attribute "val"; the semantic rules compute the value of val.
•The token digit has a synthesized attribute lexval whose value is supplied by the lexical analyzer. In the rule S → E N, the symbol S is the start symbol; this rule prints the final value of the expression.
•In a syntax directed definition, terminals have synthesized attributes only.
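As a concrete illustration, the following is a minimal sketch (not taken from the textbook; the Node class and the tree-building code are assumptions) of how the synthesized attribute val can be computed by a bottom-up traversal of an already-built parse tree for the grammar E → E + T | T, T → T * F | F, F → digit:

class Node:
    def __init__(self, symbol, children=None, lexval=None):
        self.symbol = symbol            # grammar symbol at this parse-tree node
        self.children = children or []
        self.lexval = lexval            # set by the lexical analyzer for 'digit'
        self.val = None                 # synthesized attribute

def evaluate(node):
    # Post-order (bottom-up) traversal: children first, then the parent,
    # so every synthesized attribute is available when the parent needs it.
    for child in node.children:
        evaluate(child)
    if node.symbol == 'digit':
        node.val = node.lexval
    elif node.symbol == 'F':                       # F -> digit
        node.val = node.children[0].val
    elif node.symbol in ('T', 'E'):
        if len(node.children) == 3:                # T -> T * F  or  E -> E + T
            left, right = node.children[0].val, node.children[2].val
            node.val = left * right if node.symbol == 'T' else left + right
        else:                                      # T -> F  or  E -> T
            node.val = node.children[0].val
    return node.val

# Parse tree for 3 * 4 (E -> T, T -> T * F, T -> F, F -> digit):
tree = Node('E', [Node('T', [Node('T', [Node('F', [Node('digit', lexval=3)])]),
                             Node('*'),
                             Node('F', [Node('digit', lexval=4)])])])
print(evaluate(tree))                              # prints 12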
Inherited Attributes
The value of an inherited attribute at a node in a parse tree is defined using the attribute values of the node's parent and/or siblings.
The following steps are followed for inherited attributes (a small sketch follows this list):
•Construct the syntax directed definition (SDD) using semantic actions.
•Annotate the parse tree with the inherited attributes by processing it in a top-down manner.
In the annotated parse tree:
•The value at an L node is first obtained from T.type (its sibling). T.type is basically the lexical value obtained as int / float / char / double.
•Then the L nodes give the type of the identifiers a, b and c. The computation of the type is done in a top-down manner, i.e., in a preorder traversal.
•Using the function Enter_type, the type of the identifiers a, b and c is inserted into the symbol table at the corresponding id entry.
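A minimal sketch of this idea (the helper name enter_type and the flattened handling of the L productions are assumptions) for a declaration grammar D → T L, where L.inh = T.type and each identifier's type is entered into the symbol table:

symbol_table = {}

def enter_type(name, typ):
    # plays the role of the Enter_type routine mentioned above
    symbol_table[name] = typ

def declaration(type_token, id_names):
    # D -> T L : L inherits its type from its sibling T (L.inh = T.type)
    l_inh = type_token                   # e.g. 'int', 'float'
    for name in id_names:                # L expands left-to-right over the ids
        enter_type(name, l_inh)          # each id entry receives L.inh

declaration('int', ['a', 'b', 'c'])
print(symbol_table)                      # {'a': 'int', 'b': 'int', 'c': 'int'}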
Dependency Graph:
A dependency graph is used to represent the flow of information among the attributes in a parse tree. In a parse tree, a dependency graph basically helps to determine the evaluation order for the attributes. The dependency graph also helps the compiler check for the various dependencies between statements, in order to prevent them from being executed in an incorrect sequence, i.e. in a way that affects the program's meaning. This is also the main aid in identifying a program's parallelizable components.
Example of Dependency Graph:
Design dependency graph for the following grammar:
E -> E1 + E2
E -> E1 * E2
PRODUCTIONS          SEMANTIC RULES
E -> E1 + E2         E.val = E1.val + E2.val
E -> E1 * E2         E.val = E1.val * E2.val
Evaluation Order For SDD
Evaluation order for SDDs concerns how an SDD (Syntax Directed Definition) is evaluated with the help of attributes, dependency graphs, semantic rules, and S-attributed and L-attributed definitions. SDDs support semantic analysis in the compiler, so it is important to know how SDDs are evaluated and in what order. This section provides detailed information about SDD evaluation; it requires some basic knowledge of grammars, productions, parse trees, annotated parse trees, and synthesized and inherited attributes.
1. Dependency Graphs:
A dependency graph provides information about the order of evaluation of attributes with
the help of edges. It is used to determine the order of evaluation of attributes according
to the semantic rules of the production. An edge from a first node to a second node indicates that the attribute of the first node must be evaluated before the attribute of the second node. The edges thus represent the semantic rules of the corresponding production.
Dependency Graph Rules: There is a node in the dependency graph for each attribute at each node of the parse tree. An edge from the first node to the second node means that the attribute of the first node is evaluated before the attribute of the second node.
Ordering the Evaluation of Attributes:
The dependency graph provides the evaluation order for the attributes at the nodes of the parse tree. An edge (from a first node to a second node) in the dependency graph indicates that the attribute of the second node depends on the attribute of the first node for its evaluation. Any evaluation order consistent with the edges is a linear order called a topological order.
When there is a cycle in the graph, no topological order exists, and there is no way to evaluate the SDD on that parse tree.
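The following small sketch (the node names are illustrative) computes such a topological order for a dependency graph given as a list of edges, and reports failure when a cycle makes evaluation impossible:

from collections import defaultdict, deque

def topological_order(edges):
    # edge (u, v) means: attribute u must be evaluated before attribute v
    graph = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        indegree[v] += 1
        nodes.update((u, v))
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in graph[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    if len(order) != len(nodes):
        raise ValueError("cycle detected: no topological order, SDD cannot be evaluated")
    return order

# Dependencies for E -> E1 + E2 with E.val = E1.val + E2.val:
print(topological_order([('E1.val', 'E.val'), ('E2.val', 'E.val')]))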
Production Table
S.No.   Productions      Semantic Rules
1.      S ⇢ A & B        S.val = A.syn + B.syn
2.      A ⇢ A1 # B       A.syn = A1.syn * B.syn
                         A1.inh = A.syn
3.      A1 ⇢ B           A1.syn = B.syn
4.      B ⇢ digit        B.syn = digit.lexval
S-Attributed Definitions:
An S-attributed SDD can have only synthesized attributes. In this type of definition, semantic rules are placed at the end of the production only. Its evaluation is based on bottom-up parsing.
Example: S ⇢ AB { S.x = f(A.x | B.x) }
L-Attributed Definitions:
An L-attributed SDD can have both synthesized and inherited attributes (with the restriction that an inherited attribute can take values only from the parent or left siblings). In this type of definition, semantic rules can be placed anywhere on the RHS of the production. Its evaluation follows a depth-first, left-to-right traversal, i.e., a topological order consistent with that traversal.
Example: S ⇢ AB {A.x = S.x + 2} or S ⇢ AB { B.x = f(A.x | B.x) } or S ⇢ AB { S.x = f(A.x | B.x) }
Semantic Rules with controlled side-effects:
Side effects are program fragments contained within semantic rules. Side effects in an SDD can be controlled in two ways: permit incidental side effects that do not constrain attribute evaluation, or constrain the admissible evaluation orders so that every admissible order produces the same translation.
Application of Syntax Directed Translation :
•SDT is used for executing arithmetic expressions.
•In the conversion from infix to postfix expressions.
•In the conversion from infix to prefix expressions.
•It is also used for binary to decimal conversion.
•In counting the number of reductions.
•In creating a syntax tree.
•SDT is used to generate intermediate code.
•In storing information into the symbol table.
•SDT is commonly used for type checking as well.
Example :
Here we cover an example of an application of SDT, to better understand how SDT is used. Consider an arithmetic expression and see how an SDT is constructed for it.
Let the arithmetic expression be:
Input: 2+3*4      Output: 14
The SDT for the above example, with its semantic actions, is:
E -> E + T   { E.val = E.val + T.val; then print (E.val) }
   | T       { E.val = T.val }
T -> T * F   { T.val = T.val * F.val }
   | F       { T.val = F.val }
F -> digit   { F.val = digit.lexval }
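A minimal sketch (the tokenizer and function names are assumptions, and the left-recursive productions are handled with loops) of applying these semantic actions while scanning the input left to right:

def tokenize(s):
    return [int(c) if c.isdigit() else c for c in s if not c.isspace()]

def evaluate(tokens):
    pos = 0
    def parse_T():                     # T -> T * F | F, with T.val = T.val * F.val
        nonlocal pos
        t_val = tokens[pos]            # F -> digit, F.val = digit.lexval
        pos += 1
        while pos < len(tokens) and tokens[pos] == '*':
            pos += 1
            t_val *= tokens[pos]
            pos += 1
        return t_val
    e_val = parse_T()                  # E -> T, E.val = T.val
    while pos < len(tokens) and tokens[pos] == '+':
        pos += 1
        e_val += parse_T()             # E -> E + T, E.val = E.val + T.val
    print(e_val)                       # the "print the final answer" action
    return e_val

evaluate(tokenize("2+3*4"))            # prints 14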
An AST (Abstract Syntax Tree) is a condensed form of a parse tree. It has operators at internal nodes, not at leaves. Syntactic details like parentheses, commas, semicolons, etc., are omitted in an AST.
An AST is a better structure for later compiler stages since it omits details tied to the source-language syntax; it contains only information about the essential structure of the program.
For example: 2*(4+5)
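A small sketch (class names assumed) of the AST for 2*(4+5): the parentheses disappear, the operators become interior nodes, and the operands become leaves:

class Num:
    def __init__(self, value): self.value = value

class BinOp:
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

# 2 * (4 + 5)  becomes  BinOp('*', Num(2), BinOp('+', Num(4), Num(5)))
ast = BinOp('*', Num(2), BinOp('+', Num(4), Num(5)))

def eval_ast(node):
    if isinstance(node, Num):
        return node.value
    left, right = eval_ast(node.left), eval_ast(node.right)
    return left * right if node.op == '*' else left + right

print(eval_ast(ast))   # 18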
Syntax-Directed Translation Schemes
The syntax-directed translation helps in the semantic analysis phase in the
compiler. SDT has semantic actions along with the production in the
grammar.
•A syntax directed translation scheme is a context-free grammar with semantic actions.
•The syntax directed translation scheme specifies the order in which the semantic rules are evaluated.
•In a translation scheme, the semantic rules are embedded within the right side of the productions.
•The position at which an action is to be executed is shown by enclosing the action in braces and writing it at that position within the right side of the production.
1 Postfix Translation Schemes
2 Parser-Stack Implementation of Postfix SDT's
3 SDT's With Actions Inside Productions
4 Eliminating Left Recursion From SDT's
5 SDT's for L-Attributed Definitions
1.Postfix Translation Schemes
The simplest SDD implementation occurs when we can parse the
grammar bottom-up and the SDD is S-attributed. In that case, we can
construct an SDT in which each action is placed at the end of the production
and is executed along with the reduction of the body to the head of that
production. SDT's with all actions at the right ends of the production bodies
are called postfix SDT's.
2. Parser-Stack Implementation of Postfix SDT's
Postfix SDT's can be implemented during LR parsing by executing the actions when
reductions occur. The attribute(s) of each grammar symbol can be put on the stack
in a place where they can be found during the reduction. The best plan is to place
the attributes along with the grammar symbols (or the LR states that represent
these symbols) in records on the stack itself.
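A simplified sketch (the stack layout and function names are assumptions) of keeping attribute values in the records on the parser stack, so that each reduction pops the attributes of the body and pushes the synthesized attribute of the head:

stack = []   # each entry: (grammar symbol, attribute value)

def shift(token, lexval=None):
    stack.append((token, lexval))

def reduce_unit(new_symbol):          # E -> T, T -> F, F -> digit: copy the value up
    (_, value) = stack.pop()
    stack.append((new_symbol, value))

def reduce_T_mul_F():                 # T -> T * F  { T.val = T.val * F.val }
    (_, f_val) = stack.pop()
    stack.pop()                       # the '*' token carries no attribute
    (_, t_val) = stack.pop()
    stack.append(('T', t_val * f_val))

def reduce_E_plus_T():                # E -> E + T  { E.val = E.val + T.val }
    (_, t_val) = stack.pop()
    stack.pop()                       # '+'
    (_, e_val) = stack.pop()
    stack.append(('E', e_val + t_val))

# For 2 + 3 * 4 an LR parser would perform: shift 2, F->digit, T->F, E->T,
# shift '+', shift 3, F->digit, T->F, shift '*', shift 4, F->digit,
# reduce_T_mul_F, reduce_E_plus_T, leaving ('E', 14) on top of the stack.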
3. SDT's With Actions Inside Productions
An action may be placed at any position within the body of a production. It is performed immediately after all symbols to its left have been processed. Thus, for a production B → X {a} Y, the action a is done after we have recognized X (if X is a terminal) or all the terminals derived from X (if X is a non-terminal).
4. Eliminating Left Recursion From SDT's
Since no grammar with left recursion can be parsed deterministically top-down, left-recursion elimination is applied. When the grammar is part of an SDT and we transform the grammar, we treat the actions as if they were terminal symbols.
This principle is based on the idea that the grammar transformation preserves
the order of the terminals in the generated string. The actions are therefore
executed in the same order in any left-to-right parse, top-down or bottom-up.
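As a small worked example of this principle (the postfix-printing action is illustrative), consider an SDT that prints '+' after each addition:

E → E1 + T { print('+') }
E → T

Treating { print('+') } like a terminal symbol and eliminating the left recursion gives:

E → T R
R → + T { print('+') } R
R → ε

The action still sits immediately after the "+ T" it follows, so it is executed in the same order in any left-to-right parse and the printed output is unchanged.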
5. SDT's for L-Attributed Definitions
We converted S-attributed SDD's into postfix SDT's, with actions at the right ends of productions. As long as the underlying grammar is LR, postfix SDT's can be parsed and translated bottom-up.
Now we consider the more general case of an L-attributed SDD. We assume that the underlying grammar can be parsed top-down, for if not, it is frequently impossible to perform the translation in connection with either an LL or an LR parser.
S – attributed and L – attributed SDTs in Syntax directed translation
Before discussing S-attributed and L-attributed SDTs, here is a brief introduction to the types of attributes: attributes may be of two types, synthesized or inherited.
1.Synthesized attributes – A Synthesized attribute is an attribute of the non-
terminal on the left-hand side of a production. Synthesized attributes represent
information that is being passed up the parse tree.
For example, let's say A -> BC is a production of a grammar; if A's attribute depends on B's attributes or C's attributes, then it is a synthesized attribute.
Inherited attributes – An attribute of a non-terminal on the right-hand side of a production
is called an inherited attribute.
The attribute can take its value either from its parent or from its siblings. The non-terminal concerned must be in the body (RHS) of the production.
For example, let's say A -> BC is a production of a grammar; if B's attribute depends on A's attributes or C's attributes, then it is an inherited attribute, because A is the parent here and C is a sibling.
S-attributed SDT :
•If an SDT uses only synthesized attributes, it is called an S-attributed SDT.
•S-attributed SDTs are evaluated during bottom-up parsing, as the values of the parent nodes depend upon the values of the child nodes.
•Semantic actions are placed at the rightmost position of the RHS, i.e., at the end of the production.
L-attributed SDT:
•If an SDT uses both synthesized attributes and inherited attributes, with the restriction that an inherited attribute can inherit values from the parent and left siblings only, it is called an L-attributed SDT.
•Attributes in L-attributed SDTs are evaluated in a depth-first, left-to-right manner.
•Semantic actions can be placed anywhere in the RHS.
•Example : S -> ABC. Here B can obtain its value either from the parent S or from its left sibling A, but it cannot inherit from its right sibling C. The same goes for A and C: A can only get its value from its parent, while C can get its value from S, A and B as well, because C is the rightmost symbol in the given production.
Implementing L-Attributed SDD's
1. Translation During Recursive-Descent Parsing
2. On-The-Fly Code Generation
3. L-Attributed SDD's and LL Parsing
4. Bottom-Up Parsing of L-Attributed SDD's
1.Build the parse tree and annotate. This method works for any
noncircular SDD whatsoever.
2. Build the parse tree, add actions, and execute the actions in preorder.
This approach works for any L-attributed definition. We discussed how to turn an L-attributed SDD into an SDT; in particular, how to embed actions into productions based on the semantic rules of such an SDD.
3. Use a recursive-descent parser with one function for each non-terminal. The function for non-terminal A receives the inherited attributes of A as arguments and returns the synthesized attributes of A (see the sketch after this list).
4. Generate code on the fly, using a recursive-descent parser.
5. Implement an SDT in conjunction with an LL-parser. The attributes are kept
on the parsing stack, and the rules fetch the needed attributes from known
locations on the stack.
6. Implement an SDT in conjunction with an LR-parser. This method may be
surprising, since the SDT for an L-attributed SDD typically has actions in the
middle of productions, and we cannot be sure during an LR parse that we are
even in that production until its entire body has been constructed. We shall
see, however, that if the underlying grammar is LL, we can always handle
both the parsing and translation bottom-up.
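The following is a minimal sketch of method 3 above for a small L-attributed SDD (the grammar, its attributes and the function names are assumptions used only for illustration):
T  → F T'      with T'.inh = F.val and T.val = T'.syn
T' → * F T'1   with T'1.inh = T'.inh * F.val and T'.syn = T'1.syn
T' → ε         with T'.syn = T'.inh
Each non-terminal becomes one function; inherited attributes are parameters and synthesized attributes are return values.

tokens = []
pos = 0

def parse_F():
    global pos
    val = tokens[pos]                  # F -> digit, F.val = digit.lexval
    pos += 1
    return val

def parse_T():
    inh = parse_F()                    # T'.inh = F.val
    return parse_Tprime(inh)           # T.val = T'.syn

def parse_Tprime(inh):
    global pos
    if pos < len(tokens) and tokens[pos] == '*':
        pos += 1
        f_val = parse_F()
        return parse_Tprime(inh * f_val)   # T'1.inh = T'.inh * F.val
    return inh                             # T' -> epsilon: T'.syn = T'.inh

tokens, pos = [3, '*', 5, '*', 2], 0
print(parse_T())                       # prints 30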
Intermediate Code Generation :
Types and Declarations :
Type analysis and type checking are important activities carried out in the semantic analysis phase. The need for type checking is:
•To detect errors arising in expressions due to incompatible operands.
•To generate intermediate code for expressions and statements.
Typically, a language supports two kinds of data types:
1) Basic data types
2) Constructed data types
•The basic data types are integer, character, real, Boolean and enumerated types.
•Arrays, records (structures), sets and pointers are the constructed data types. The constructed data types are built using the basic data types.
Type Expression :
The type of a language construct is denoted by a type expression. A type expression is either a basic type or is formed by applying an operator called a type constructor to other type expressions.
The set of basic types and constructors depends on the language being checked.
The following are the definitions of type expressions:
1. A basic type is a type expression. Among the basic types are boolean, char, integer and real. A special basic type, type_error, signals an error during type checking. Finally, a basic type void, denoting "the absence of a value", allows statements to be checked.
2. A type name is also a type expression.
Example: the type definition typedef int *INT_PTR;
defines the type name INT_PTR as a type expression which is actually a pointer to integer.
3. A type constructor applied to type expressions is a type expression. The constructors include:
ARRAYS :
If T is a type expression, then array(I, T) is a type expression denoting the type of an array with elements of type T and index set I; I is often a range of integers.
Example : int array[20];
The type expression for this array, whose elements are indexed from 0 to 19 and have data type int, is array(0..19, int).
Product :
If T1 and T2 are type expressions, then their Cartesian product T1 x T2 is also a type expression.
Pointers:
If T is a type expression, then pointer(T) is a type expression denoting the type "pointer to an object of type T".
Example : In Pascal, the declaration
var p : ↑row
declares the variable p to have type pointer(row).
Records :
The difference between a record and a product is that the fields of a record have names. The record type constructor is applied to a tuple formed from the field names and the field types.
Example :
In a Pascal program,
type row = record
    address : integer;
    lexeme  : array [1..15] of char
end;
var table : array [1..101] of row;
declares the type name row to represent the type expression
record((address x integer) x (lexeme x array(1..15, char)))
and the variable table to be an array of records of this type.
Functions:
A function maps elements of one set, the domain, to another set, the range. In a programming language we describe a function as mapping a domain type D to a range type R, denoted by the type expression D → R.
Example: the mod operator of Pascal has domain type int x int and range type int, so its type expression is
int x int → int
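The following is an illustrative sketch (the class names and field names are assumptions) of how these type expressions can be represented as objects, so that a type checker can build and compare them:

from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Basic:                       # int, real, char, boolean, void, type_error
    name: str

@dataclass(frozen=True)
class Array:                       # array(I, T)
    index: range
    element: object

@dataclass(frozen=True)
class Pointer:                     # pointer(T)
    to: object

@dataclass(frozen=True)
class Product:                     # T1 x T2
    left: object
    right: object

@dataclass(frozen=True)
class Record:                      # record((name1 x T1) x (name2 x T2) ...)
    fields: Tuple[tuple, ...]

@dataclass(frozen=True)
class Function:                    # D -> R
    domain: object
    range_: object

INT, CHAR = Basic('int'), Basic('char')

# type row = record address: integer; lexeme: array[1..15] of char end
row = Record((('address', INT), ('lexeme', Array(range(1, 16), CHAR))))

# Pascal's mod operator: int x int -> int
mod_type = Function(Product(INT, INT), INT)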
Type Checking :
•Type checking is the process of verifying and enforcing the constraints of types on values.
•A compiler must check that the source program follows both the syntactic and the semantic conventions of the source language, and it must also check the type rules of the language.
•Type checking allows the programmer to limit what types may be used in certain circumstances and assigns types to values.
•The type checker determines whether these values are used appropriately or not.
•It checks the types of objects and reports a type error in the case of a violation; where the language permits, incorrect types may be coerced into correct ones (a tiny illustration follows this list).
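A tiny sketch (the specific rules, including the int-to-real widening, are assumptions for illustration) of the kind of rule a type checker applies to an addition:

def check_add(left_type, right_type):
    if left_type == 'int' and right_type == 'int':
        return 'int'
    if left_type in ('int', 'real') and right_type in ('int', 'real'):
        return 'real'              # assumed implicit int -> real widening
    print("type error: cannot add", left_type, "and", right_type)
    return 'type_error'

print(check_add('int', 'int'))     # int
print(check_add('int', 'real'))    # real
print(check_add('int', 'array'))   # reports a type error, returns type_error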
Types of Type Checking:
There are two kinds of type checking:
•Static Type Checking.
•Dynamic Type Checking.
Static type checking is defined as type checking performed at compile time.
•It checks the types of variables at compile time, which means the type of each variable is known at compile time.
•It generally examines the program text during the translation of the program.
•Using the type rules of the system, a compiler can infer from the source text that a function (fun) will be applied to an operand of the right type each time the expression is evaluated.
Examples of Static checks include:
Type-checks: A compiler should report an error if an operator is applied to an
incompatible operand. For example, if an array variable and function variable are added
together.
Flow-of-control checks: Statements that cause the flow of control to leave a construct must have some place to which to transfer the flow of control.
Uniqueness checks: There are situations in which an object must be defined exactly once. For example, in Pascal an identifier must be declared uniquely, labels in a case statement must be distinct, and elements of a scalar (enumerated) type must not be repeated.
Name-related checks: Sometimes the same name may appear two or more times. For
example : a loop may have a name that appears at the beginning and end of the construct.
The compiler must check that the same name is used at both places.
The Benefits of Static Type Checking:
•Runtime Error Protection.
•It catches syntactic errors like spurious words or extra punctuation.
•It catches wrong names like Math and Predefined Naming.
•Detects incorrect argument types.
•It catches the wrong number of arguments.
Dynamic Type Checking:
•Dynamic Type Checking is defined as the type checking being done at run
time.
•In Dynamic Type Checking, types are associated with values, not variables.
•In implementations of dynamically type-checked languages, runtime objects generally carry a type tag, i.e., a reference to a type containing the type information for that object.
The design of the type checker depends on:
•The syntactic structure of the language constructs.
•The expressions of the language.
•The rules for assigning types to constructs (semantic rules).
The Position of the Type Checker in the Compiler:
The type checker typically sits between the parser and the intermediate code generator: it takes the syntax tree produced by the parser, verifies the type rules, and passes an annotated tree on to intermediate code generation.
The Benefits of Dynamic Type Checking:
•Dynamic typing is more flexible.
•A static type system always restricts what can be conveniently expressed.
•Dynamic typing results in more compact programs since it is more flexible and
does not require types to be spelled out.
•Programming with a static type system often requires more design and
implementation effort.
Control Flow
The control flow is the order in which the computer executes statements in
a script.
Code is run in order from the first line in the file to the last line, unless the
computer runs across the (extremely frequent) structures that change the
control flow, such as conditionals and loops.
1. Boolean Expressions
2. Short-Circuit Code
3. Flow-of-Control Statements
4. Control-Flow Translation of Boolean Expressions
5. Avoiding Redundant Gotos
6. Boolean Values and Jumping Code
1. Boolean Expressions:
Boolean expressions are composed of the Boolean operators (which we denote &&, ||, and !, using the C convention for the operators AND, OR, and NOT, respectively) applied to elements that are Boolean variables or relational expressions.
Relational expressions are of the form E1 rel E2, where E1 and E2 are arithmetic expressions.
In this section, we consider Boolean expressions generated by the following grammar:
B → B || B | B && B | ! B | ( B ) | E rel E | true | false
2. Short-Circuit Code :
In short-circuit (or jumping) code, the Boolean operators &&, ||, and ! translate into jumps. The operators themselves do not appear in the code; instead, the value of a Boolean expression is represented by a position in the code sequence.
Example 6.21 : The statement
if ( x < 100 || x > 200 && x != y ) x = 0;
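One possible short-circuit translation of this statement into three-address code (the label names are illustrative: L2 is the start of the assignment and L1 is the exit past the whole statement) is:

        if x < 100 goto L2
        ifFalse x > 200 goto L1
        ifFalse x != y goto L1
L2:     x = 0
L1:

If x < 100 holds, the remaining conditions are never evaluated; otherwise both x > 200 and x != y must hold before control reaches the assignment.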
3. Flow-of-Control Statements
We now consider the translation of Boolean expressions into three-address code in the context of statements such as those generated by the following grammar:
S → if ( B ) S1
S → if ( B ) S1 else S2
S → while ( B ) S1
In these productions, non-terminal B represents a Boolean expression and non-terminal S represents a statement.
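For instance, the usual code layout for while ( B ) S1 (label names assumed for illustration) places the code for B at the top of the loop, with its true exit entering the body and its false exit leaving the loop:

begin:  code for B          (B.true = L1, B.false = next)
L1:     code for S1
        goto begin
next:   ...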
4. Control-Flow Translation of Boolean Expressions
The semantic rules for Boolean expressions complement the semantic rules for statements. A Boolean expression B is translated into three-address instructions that evaluate B using conditional and unconditional jumps to one of two labels: B.true if B is true, and B.false if B is false. The translation creates labels only when they are needed; alternatively, unnecessary labels can be eliminated during a subsequent optimization phase.
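As an illustration (following the usual textbook convention, with newlabel() generating a fresh label, gen() emitting an instruction, and || between code fragments denoting concatenation), the rules for two of the productions are:

B → B1 || B2 :   B1.true = B.true
                 B1.false = newlabel()
                 B2.true = B.true
                 B2.false = B.false
                 B.code = B1.code || label(B1.false) || B2.code

B → E1 rel E2 :  B.code = E1.code || E2.code
                          || gen('if' E1.addr rel.op E2.addr 'goto' B.true)
                          || gen('goto' B.false)

If B1 is true the whole || expression is true, so B1.true is simply B.true; only if B1 is false does control fall through to the code for B2.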
5. Avoiding Redundant Gotos
For example, the comparison x > 200 translates into the code fragment:
        if x > 200 goto L4
        goto L1
L4: ...
The unconditional goto can be avoided by using the complementary jump instead, letting control fall through to L4 when the comparison is true:
        ifFalse x > 200 goto L1
L4: ...
6. Boolean Values and Jumping Code
The focus in this section has been on the use of Boolean expressions to alter the flow of control in statements. A Boolean expression may also be evaluated for its value, as in assignment statements such as x = true; or x = a < b;.
A clean way of handling both roles of Boolean expressions is to first build a syntax tree for
expressions, using either of the following approaches:
1. Use two passes. Construct a complete syntax tree for the input, and then walk the tree in depth-
first order, computing the translations specified by the semantic rules.
2. Use one pass for statements, but two passes for expressions. With this approach, we would
translate E in while (E) S1 before S1 is examined. The translation of E, however, would be done by
building its syntax tree and then walking the tree.
Backpatching:
•Backpatching is basically the process of filling in unspecified information; this missing information consists of the labels that are the targets of jumps.
•It uses appropriate semantic actions during the process of code generation (a small sketch appears at the end of this topic).
Backpatching is mainly used for two purposes:
1. Boolean expression:
Boolean expressions are statements whose results can be either true or false. A
boolean expression which is named for mathematician George Boole is an
expression that evaluates to either true or false. Let’s look at some common
language examples:
My favorite color is blue. → true
I am afraid of mathematics. → false
2 is greater than 5. → false
2. Flow of control statements:
The targets of the jumps generated for flow-of-control statements need to be filled in during the translation of the statements of a program. For example, the jump taken when a while condition is false must eventually be patched with the label of the statement that follows the loop.
Applications of Backpatching:
•Backpatching is used to translate flow-of-control statements in one pass itself.
•Backpatching is used for producing quadruples for boolean expressions during
bottom-up parsing.
•It is the activity of filling in unspecified label information during the code generation process.
•It helps to resolve forward branches that have been planted in the code.
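A compact sketch (simplified; the instruction format and the helper names follow the usual makelist / merge / backpatch convention, and the concrete instructions are assumptions) of how jump targets are left as holes and filled in later:

code = []                              # the growing list of three-address instructions

def emit(instr):
    code.append(instr)
    return len(code) - 1               # index of the instruction just emitted

def makelist(i):
    return [i]                         # a new patch list containing only instruction i

def merge(list1, list2):
    return list1 + list2               # concatenation of two patch lists

def backpatch(patch_list, label):
    for i in patch_list:
        code[i] = code[i].replace('_', str(label))   # fill in the missing target

# For B -> a < b : emit a conditional and an unconditional jump with holes.
truelist = makelist(emit("if a < b goto _"))
falselist = makelist(emit("goto _"))
backpatch(truelist, 100)               # target of the true branch, once it is known
backpatch(falselist, 104)              # target of the false branch
print("\n".join(code))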
Intermediate Code for Procedures.
Procedures and their implementation will be discussed at length later, along with the run-time management of storage for names.
We use the term function in this section for a procedure that returns a value, and we briefly discuss function declarations and three-address code for function calls.
In three-address code, a function call is unraveled into the evaluation of the parameters in preparation for the call, followed by the call itself. For simplicity, we assume that parameters are passed by value; other parameter-passing methods are discussed later.
Example 6.25 : Suppose that a is an array of integers, and that f is a function from integers to integers. Then consider an assignment whose right side calls f on an array element, such as n = f(a[i]).
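A sketch of how such a call might unravel into three-address code (assuming 4-byte integers; the temporary names t1, t2, t3 are illustrative):

t1 = i * 4
t2 = a [ t1 ]
param t2
t3 = call f, 1
n  = t3

The array element is evaluated first, passed with a param instruction, the call instruction records that one parameter was passed, and the returned value is copied into n.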