Principles of Programming Languages
Unit-II
Describing Syntax and Semantics
Axiomatic Semantics & Denotational Semantics
K. Ravi Chythanya 1-1
Introduction
• Syntax: the form or structure of the expressions,
statements, and program units
• Semantics: the meaning of the expressions,
statements, and program units
• Pragmatics:
• Syntax and semantics provide a language’s
definition
– Users of a language definition
• Other language designers
• Implementers
• Programmers (the users of the language)
K. Ravi Chythanya 1-2
Language Theory
• Chomsky identified four classes of language
Typ Characteristics
e
0 Unrestricted
1 Context-sensitive
2 Context-free
3 Regular
– Types 2 and 3 useful for programming language
specification
• Backus (on ALGOL58 committee) developed
notation for specifying programming languages
K. Ravi Chythanya 1-3
Terminology
• A metalanguage is a language used to describe
another language
• A sentence is a string of characters over some
alphabet
• A language is a set of sentences
– a language is specified by a set of rules
• A lexeme is the lowest level syntactic unit of a
language (e.g., *, sum, begin)
• A token is a category of lexemes (e.g., identifier)
K. Ravi Chythanya 1-4
Two approaches to Language Definition
• Recognizers
– Read a string and decide whether it follows the rules for the language
– Example: syntax analysis part of a compiler (Chapter 4)
• Generators
– A device that generates sentences of a language (BNF)
– More useful for specifying the language than for checking a string
K. Ravi Chythanya 1-5
Formal Methods of Describing Syntax
• Backus-Naur Form and Context-Free Grammars
– Most widely known method for describing
programming language syntax
– Developed as part of the process for specifying
ALGOL
– Define a class of languages called context-free
languages
• Extended BNF
– Improves readability and writability of BNF
K. Ravi Chythanya 1-6
BNF Fundamentals
• Non-terminals: BNF abstractions used to represent
classes of syntactic structures
• Terminals: lexemes and tokens
• Grammar: a collection of rules
– Examples of BNF rules:
<ident_list> → identifier
| identifier, <ident_list>
<if_stmt> → if <logic_expr> then <stmt>
K. Ravi Chythanya 1-7
BNF Rules
• A rule has a left-hand side (LHS) and a right-hand
side (RHS), and consists of terminal and
nonterminal symbols
• In a context-free grammar, there can only be one
symbol on the LHS
• A grammar is a finite nonempty set of rules
• An abstraction (or nonterminal symbol) can have
more than one RHS
<stmt> <single_stmt>
| begin <stmt_list> end
K. Ravi Chythanya 1-8
Describing Lists
• Syntactic lists are described using recursion
<ident_list>-> <ident>
| <ident>, <ident_list>
<arg_list> -> <expr>
| <expr>, <arg_list>
<param_list> -> <param>
| <param>, <param_list>
<stmt_list> -> <statement>
| <statement> <stmt_list>
K. Ravi Chythanya 1-9
K. Ravi Chythanya 1-10
Derivations
• BNF is a generative device
– Use a grammar to generate sentences that belong to the language
the grammar describes
• A derivation is a repeated application of rules, starting
with the start symbol and ending with a sentence (all
terminal symbols)
K. Ravi Chythanya 1-11
Derivation
• Every string of symbols in the derivation is a sentential
form
• A sentence is a sentential form that has only terminal
symbols
• A leftmost derivation is one in which the leftmost
nonterminal in each sentential form is the one that is
expanded
• A derivation may be neither leftmost nor rightmost
K. Ravi Chythanya 1-12
An Example Grammar
<program> -> <stmts>
<stmts> -> <stmt> | <stmt> ; <stmts>
<stmt> -> <var> = <expr>
<var> -> a | b | c | d
<expr> -> <term> + <term> | <term> -
<term>
<term> -> <var> | const
K. Ravi Chythanya 1-13
An Example Derivation
<program> => <stmts> => <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const
K. Ravi Chythanya 1-14
Parse Tree
• A hierarchical representation of a derivation
<program>
<stmts>
<stmt>
<var> = <expr>
a <term> + <term>
<var> const
b
K. Ravi Chythanya 1-15
K. Ravi Chythanya 1-16
K. Ravi Chythanya 1-17
Ambiguity in Grammars
• A grammar is ambiguous if and only if it generates a
sentential form that has two or more distinct parse trees
K. Ravi Chythanya 1-18
An Ambiguous Expression Grammar
<expr> <expr> <op> <expr> | const
<op> / | -
<expr> <expr>
<expr> <op> <expr> <expr> <op> <expr>
<expr> <op> <expr> <expr> <op> <expr>
const - const / const const - const / const
K. Ravi Chythanya 1-19
An Unambiguous Expression Grammar
• If we use the parse tree to indicate precedence levels of the
operators, we cannot have ambiguity
<expr> <expr> - <term> | <term>
<term> <term> / const | const
<expr>
<expr> - <term>
<term> <term> / const
const const
K. Ravi Chythanya 1-20
Associativity of Operators
• Operator associativity can also be indicated by a
grammar
<expr> -> <expr> + <expr> | const
(ambiguous)
<expr> -> <expr> + const | const
(unambiguous) <expr> <expr>
<expr> + const
<expr> + const
const
K. Ravi Chythanya 1-21
Extended BNF
• Optional parts are placed in brackets [ ]
<proc_call> -> ident
([<expr_list>])
• Alternative parts of RHSs are placed inside parentheses
and separated via vertical bars
<term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside braces { }
<ident> → letter {letter|digit}
K. Ravi Chythanya 1-22
BNF and EBNF
• BNF
<expr> <expr> + <term>
| <expr> - <term>
| <term>
<term> <term> * <factor>
| <term> / <factor>
| <factor>
• EBNF
<expr> <term> {(+ | -) <term>}
<term> <factor> {(* | /) <factor>}
K. Ravi Chythanya 1-23
Static Semantics
• Context-free grammars (CFGs) cannot describe all of the
syntax of programming languages
• For example
– the requirement that a variable be declared before it can be used is
impossible to express in a grammar
– information about variable and expression types could be included
in a grammar but only at the cost of great complexity
K. Ravi Chythanya 1-24
K. Ravi Chythanya 1-25
Axiomatic Semantics
• Based on formal logic (predicate calculus)
• Original purpose: proof correctness of programs.
• The logical expressions are called assertions.
• An assertion before a statement (a precondition)
describes the constraints on the program variables at
that point in the program.
• An assertion following a statement (a
postcondition) describes the new constraints on
those variables after execution of the statement.
K. Ravi Chythanya 1-26
Example
• We examine assertions from the point of view that
preconditions are computed from given postconditions.
• Assume all variables are integer.
• Postconditions and preconditions are presented in
braces.
• A simple example:
• sum = 2 * x + 1 {sum > 1}
• The postcondition is {sum > 1}
• One possible precondition is {x > 10}
K. Ravi Chythanya 1-27
Weakest precondition
• A weakest precondition is the least restrictive
precondition that will guarantee the postcondition.
• For example, in the above statement and
postcondition,
{ x > 10 }
{ x > 50 }
{ x > 100 }
• Are all valid precondition.
• The weakest precondition of all preconditions in this
case is { x > 10 }
K. Ravi Chythanya 1-28
Correctness proofs
• If the Weakest precondition can be computed from the given
postconditions for each statement of a language, then
correctness proofs can be constructed for programs in that
language as follows:
• The proof is begun by using the desired result of the
program’s execution as the postcondition of the last statement
of the program.
• This postcondition, along with the last statement, is used to
compute the weakest precondition for the last statement.
• This precondition is then used as the postcondition for the
second last statement.
• This process continues until the beginning of the program is
reached.
K. Ravi Chythanya 1-29
Correctness proofs
• At that point, the precondition of the first statement states the
condition under which the program will compute the desired
results.
• If this condition is implied by the input specification of the
program, the program has been verified to be correct.
• To use axiomatic semantics for correctness proofs or for
formal semantic specifications, either an axiom or an
inference rule must be available for each kind of statement in
the language.
• An axiom is a true logical statement.
• An inference rule is a method of inferring the truth of an
assertion based on other assertions.
K. Ravi Chythanya 1-30
Axiomatic Semantics: Assignment statement
• Let x = E be a general assignment
statement and Q be the postcondition.
• Then its weakest precondition P, is defined
by the axiom
P = Qx→E
• P is computed as Q with all instances of x
replaced by E.
K. Ravi Chythanya 1-31
Example
• For example, consider the following statement
and postcondition.
a = b / 2 - 1 { a < 10}
• The weakest precondition is computed by
subsituting b/2-1 in the postcondition
b / 2 - 1 < 10
b < 22
K. Ravi Chythanya 1-32
Notations for axiomatic semantics
• The usual notations are:
{P} S {Q}
• Where P is the precondition, Q is the
postcondition and S is the statement.
• For the assignment statement, the notation is
{Qx→E} x = E {Q}
K. Ravi Chythanya 1-33
Example
• Compute the precondition for the assignment
statement
x=2*y-3 { x > 25 }
• The weakest precondition is computed as
2 * y -3 > 25
y > 14
K. Ravi Chythanya 1-34
Example
• What about if the left side of the assignment
appears in the right side of the assignment?
x = x + y - 3 {x > 10}
• The weakest precondition is
x + y - 3 > 10
y > 13 – x
• Has no effect on the process of
computing the precondition.
K. Ravi Chythanya 1-35
Axiomatic Semantics: Sequences
• The precondition for a sequence of statements cannot be
described by an axiom, because the precondition depends on
the particular kind of statements in the sequence.
• The precondition can only be described with an inference rule.
• Let S1 and S2 be adjacent statements.
• Assume that S1 and S2 have the following pre/postconditions:
{P1} S1 {P2}
{P2} S2 {P3}
• The inference rule for such two-statement sequence is
{P1} S1{P2}, {P2} S2 {P3}
{P1} S1; S2 {P3}
• The axiomatic semantics of the sequence S1; S2 is
{P1} S1; S2 {P3}
K. Ravi Chythanya 1-36
Axiomatic Semantics: Sequences
• The above inference rule states that to get the
sequence precondition, the precondition of the
second statement is computed.
• This new assertion is used as the
postcondition of the first statement , which
can then be used to compute the precondition
of the first statement.
• This precondition can be used as the
precondition for the whole sequence.
K. Ravi Chythanya 1-37
Example
• Assume we have the following sequence of
statements:
x1 = E1
x2 = E2
• Then we have
{P3x2→E2} x2 = E2 {P3}
{P3x2→E2}x1→E1 x1 = E1 {P3x2→E2 }
• Therefore, the precondition for the sequence x1=E1;
x2=E2 with postcondition P3 is {P3x2→E2}x1→E1
K. Ravi Chythanya 1-38
Example
• Consider the following sequence and postcondition:
y = 3 * x + 1;
x = y + 3;
{x < 10}
• The precondition for the last assignment statement is
y<7
• Which is used as the postcondition for the first statement.
• The precondition for the first statement and the sequence can
be now computed.
3*x+1 < 7
x<2
K. Ravi Chythanya 1-39
Axiomatic Semantics: Selection
• The general form of the selection statement is
If B then S1 elese S2
• The inference rule is
{B and P} S1{Q}, {(not B) and P} S2 {Q}
{P} if B then S1 else S2 {Q}
• This rule indicates that selection statements must be proven for
both when the condition expression is true and when it is false.
• The first logical statement above the line represents the then
clause; the second represents the else clause.
• We need a precondition P that can be used in the precondition
of both the then and else clauses.
K. Ravi Chythanya 1-40
Example
• Consider the following selection statement:
if ( x > 0 )
y = y - 1
else y = y + 1
• Suppose the postcondition, Q for the selection statement is {y > 0}
• We can then use the axiom for assignment on the then clause.
y = y - 1 { y > 0} This produces {y -1 > 0} or {y > 1}.
• It can be used as the P part of the precondition of the then clause
• Now, Apply the same axiom for the else clause
y = y + 1 { y > 0} which produces y = y + 1 { y > 0} or { y > -1}
• Because {y > 1} → {y > -1}
• The rule uses {y > 1} for the precondition of the whole selection statement.
K. Ravi Chythanya 1-41
Axiomatic Semantics: Logical Pretest Loop
• Computing the weakest precondition for a logical
pretest loop (while loop) is more difficult that for a
sequence, because the number if iterations cannot
always be predetermined.
• In a case where the number of iterations is known,
the loop can be unrolled and treated as a sequence.
K. Ravi Chythanya 1-42
Axiomatic Semantics: Logical Pretest Loops
{P} while B do S end {Q}
(I and B) S {I}
{I} while B do S {I and (not B)}
where I is the loop invariant (the inductive hypothesis)
K. Ravi Chythanya 1-43
Axiomatic Semantics: Axioms
• Characteristics of the loop invariant: I must meet the following
conditions:
– P => I -- the loop invariant must be true initially
– {I} B {I} -- evaluation of the Boolean must not change the validity of I
– {I and B} S {I} -- I is not changed by executing the body of the loop
– (I and (not B)) => Q -- if I is true and B is false, is implied
– The loop terminates
K. Ravi Chythanya 1-44
Loop Invariant
• The loop invariant I is a weakened version of the loop
postcondition, and it is also a precondition.
• I must be weak enough to be satisfied prior to the beginning
of the loop, but when combined with the loop exit condition,
it must be strong enough to force the truth of the
postcondition
K. Ravi Chythanya 1-45
Evaluation of Axiomatic Semantics
• Developing axioms or inference rules for all of the statements
in a language is difficult
• It is a good tool for correctness proofs, and an excellent
framework for reasoning about programs, but it is not as
useful for language users and compiler writers
• Its usefulness in describing the meaning of a programming
language is limited for language users or compiler writers
K. Ravi Chythanya 1-46
Denotational Semantics
• The most widely known method for describing the
meaning of programs.
• Based on recursive function theory.
• Originally developed by Scott and Strachey (1970)
K. Ravi Chythanya 1-47
Denotational Semantics (continued)
• The process of building a denotational specification
for a language define for each language entity both
– a mathematical object and
– a function that maps instances of that entity onto
instances of the mathematical object.
• The difficulty with this method lies in creating the
objects and the mapping functions.
• The method is named denotational because the
mathematical object denote the meaning of their
corresponding language entity .
K. Ravi Chythanya 1-48
Example – Binary Number
• The syntax of a binary number is:
• <bin_num> →0
| 1
| <bin_num> 0
| <bin_num> 1
• To describe the meaning of a binary number using
denotational semantics we associate the actual meaning with
each rule that has a single terminal symbol in its RHS.
• The syntactic entities in this case are ‘0’ and ‘1’.
• The objects are the decimal equivalent.
K. Ravi Chythanya 1-49
Example – Binary Number
• Let the domain of semantic values of the objects be N, the set
of nonnegative decimal integer values.
• The function Mbin maps the syntactic entities of the previous
grammar to the objects in N.
• The function Mbin ,for the above grammar, is defined as
follows:
Mbin(‘0’) = 0
Mbin(‘1’) = 1
Mbin(<bin_num> ‘0’) = 2 * Mbin(<bin_num>)
Mbin(<bin_num> ‘1’) = 2 * Mbin(<bin_num>) + 1
K. Ravi Chythanya 1-50
Example - Decimal Numbers
<dec_num> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
| <dec_num> ( 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9)
• The denotational semantics for these syntax rules are:
Mdec('0')=0, Mdec('1') =1, …, Mdec('9') = 9
Mdec(<dec_num> '0') = 10 * Mdec(<dec_num>)
Mdec(<dec_num> '1’) = 10 * Mdec(<dec_num>) + 1
…
Mdec(<dec_num> '9') = 10 * Mdec(<dec_num>) + 9
K. Ravi Chythanya 1-51
Denotational Semantics: Program Constructs
• OS are defined in terms of state changes.
• DS are defined in nearly the same way.
• For simplicity, DS are defined in terms of only the
values of all of the program’s variables.
• The difference between OS and DS is that:
– state changes in OS are defined by coded
algorithms, written in some programming
language
– state changes in DS are defined by mathematical
functions.
K. Ravi Chythanya 1-52
Denotational Semantics: Program Constructs
• Let the state of a program be represented as a set of ordered
pairs as follows:
s = {<i1, v1>, <i2, v2>, …, <in, vn>}
• Each i is a variable and the associated v is its current value.
• Any of the v’s can have the special value undef.
• Let VARMAP be a function that, when given a variable name
and a state, returns the current value of the variable
VARMAP(ij, s) = vj
• The state changes are used to define the meanings of
programs and program constructs.
• Some constructs, such as expressions, are mapped to values,
not states.
K. Ravi Chythanya 1-53
Denotational Semantics: Expressions
• We assume here that we deal with only simple
expressions:
– Only + and * operators.
– An expression can have at most one operator.
– The only operands are scalar variables and integer
literals.
– No parenthesis.
– The value of an expression is integer.
K. Ravi Chythanya 1-54
Denotational Semantics: Expressions
• The BNF description of these expressions:
<expr> <dec_num> | <var> | <binary_exp>
<inary_exp> <left_exp> <operator> <right_exp>
<left_exp> <dec_num> | <var>
<right_exp> <dec_num> | <var>
<operator> + | *
• The only error we consider in expressions is that a variable
has an undefined value.
• Let Z be the set of integers, and let error be the error value.
• Then Z U {error} is the set of values to which an expression
can evaluate.
K. Ravi Chythanya 1-55
Denotational Semantics: Expressions
• The DS of expressions are (dot notation refer to child
nodes of a node)
Me(<expr>, s) = case <expr> of
<dec_num> = Mdec(<dec_num>, s)
<var> = if VARMAP(<var>, s) == undef
then error
else VARMAP(<var>, s)
<binary_expr> = if (Me(<binary_expr>.<left_expr>, s) == undef OR
Me(<binary_expr>.<right_expr>, s) = undef)
then error
else if (<binary_expr>.<operator> == ‘+’ then
Me(<binary_expr>.<left_expr>, s) +
Me(<binary_expr>.<right_expr>, s)
else Me(<binary_expr>.<left_expr>, s) *
Me(<binary_expr>.<right_expr>, s)
K. Ravi Chythanya 1-56
Assignment Statements
• An assignment statement is an expression evaluation plus
the setting of the left-side variable to the expression’s value.
Maps state sets to state sets
Ma(x := E, s) = if Me(E, s) == error
then error
else s’ = {<i1’,v1’>, <i2’,v2’>,..., <in’,vn’>},
where for j = 1, 2, ..., n,
vj’ = VARMAP(ij, s) if ij <> x
= Me(E, s) if ij == x
K. Ravi Chythanya 1-57
Logical Pretest Loops
• Assume we have two mapping functions, M sl and Mb
– Msl Maps statement list to states.
– Mb Maps boolean expression to boolean value.
• The DS of a simple loop are:
Ml(while B do L, s) = if Mb(B, s) == undef
then error
else if Mb(B, s) == false
then s
else if Msl(L, s) == error
then error
else Ml(while B do L, Msl(L, s))
K. Ravi Chythanya 1-58
Loop Meaning
• The meaning of the loop is the value of the program variables
after the statements in the loop have been executed the
prescribed number of times, assuming there have been no
errors
• In essence, the loop has been converted from iteration to
recursion, where the recursive control is mathematically
defined by other recursive state mapping functions
K. Ravi Chythanya 1-59