
Michael J. C.

Gordon

THE
DENOTATIONAL
DESCRIPTION
OF
PROGRAMMING
LANGUAGES
An Introduction

Springer-Verlag
New York Heidelberg Berlin
Michael J. C. Gordon
Department of Computer Science
James Clerk Maxwell Building
University of Edinburgh
Mayfield Road
Edinburgh EH9 3JZ, Scotland
Great Britain

AMS Subject Classifications (1970): 68-A30

Library of Congress Cataloging in Publication Data


Gordon, Michael J C 1948-
The denotational description of programming
languages.

Bibliography: p.
Includes indexes.
1. Programming languages (Electronic computers)
I. Title.
QA76.7.G67 001.6'424 79-15723

All rights reserved.

No part of this book may be translated or reproduced


in any form without written permission from Springer-Verlag.

© 1979 by Springer-Verlag New York Inc.

9 8 7 6 5 4 3 2 1

ISBN-13: 978-0-387-90433-7 e-ISBN-13: 978-1-4612-6228-2


DOI: 10.1007/978-1-4612-6228-2
Preface
This book explains how to formally describe programming languages
using the techniques of denotational semantics. The presentation is
designed primarily for computer science students rather than for (say)
mathematicians. No knowledge of the theory of computation is required,
but it would help to have some acquaintance with high level programming
languages. The selection of material is based on an undergraduate
semantics course taught at Edinburgh University for the last few years.
Enough descriptive techniques are covered to handle all of ALGOL 60,
PASCAL and other similar languages.
Denotational semantics combines a powerful and lucid descriptive
notation (due mainly to Strachey) with an elegant and rigorous theory
(due to Scott). This book provides an introduction to the descriptive
techniques without going into the background mathematics at all. In
some ways this is very unsatisfactory; reliable reasoning about semantics
(e.g. correctness proofs) cannot be done without knowing the underlying
model and so learning semantic notation without its model theory could
be argued to be pointless. My own feeling is that there is plenty to be
gained from acquiring a purely intuitive understanding of semantic
concepts together with manipulative competence in the notation. For
these equip one with a powerful conceptual framework-a framework
enabling one to visualize languages and constructs in an elegant and
machine-independent way. Perhaps a good analogy is with calculus: for
many practical purposes (e.g. engineering calculations) an intuitive
understanding of how to differentiate and integrate is all that is needed.
Once the utility of the ideas and techniques are appreciated it becomes
much easier to motivate the underlying mathematical notions (like limits
and continuity). Similarly an intuitive understanding of the descriptive
techniques of denotational semantics is valuable, both as a tool for
understanding programming, and as a motivation for the advanced
theory.
Because the underlying mathematics is not described I have occasion-
ally used notation which, whilst intuitively straightforward, is technically
sloppy. For example I have used the symbol = with several conceptually
similar, but mathematically distinct, meanings. I felt it best not to
introduce minor technical distinctions and restrictions if they could not be
satisfactorily explained. Any reader familiar with Scott's theory should
easily be able to detect and correct the few abuses of notation.
I have changed certain standard notations to be less standard but more
mnemonic. Also since one or two standard symbols were not available on
the typesetting machine used I have had to use substitutes. I hope the
overall attractiveness of the typesetting compensates for the slightly un-
orthodox appearance of some of the notation.
Acknowledgements
None of the material in this book is original, that which has not been
published is part of the folklore of the subject. In deciding how to present
things I was strongly influenced by Milne and Strachey's advanced
treatise [Milne & Strachey]. Jim Donahue, Robert Milne and Bob Tennent
all made detailed criticisms of an early draft-I have tried to implement
their suggestions as much as possible. Errors in the final draft were
pointed out to me by members of CS4 and the postgraduate theory
course (78/79); special thanks to: Gordon Brebner, Mark Jerrum, Alan
Mycroft, Don Sannella and Michael Sanderson. Finally I'd like to thank the
following people for numerous fruitful discussions, suggestions and
explanations: Dana Angluin, Rod Burstall, Avra Cohn, Dave Macqueen,
Robin Milner, Peter Mosses, Mogens Nielsen, Gordon Plotkin, Jerry
Schwarz and Chris Wadsworth.
The writing of this book was supported by the Science Research
Council.
Contents
Preface
Acknowledgements
1. Introduction 5
1.1. Syntax, semantics and pragmatics 6
1.2. The purposes of formal semantics 6
1.2.1. Providing precise and machine-independent concepts 7
1.2.2. Providing unambiguous specification techniques 7
1.2.3. Providing a rigorous theory to support reliable reasoning 8
1.3. Denotational semantics 9
1.4. Abstract entities and their description 10

2. A first example: the language TINY 12


2.1. Informal syntax of TINY 12
2.2. Informal semantics of TINY 12
2.2.1. Informal semantics of expressions 13
2.2.2. Informal semantics of commands 14
2.3. An example 14
2.4. Formal semantics of TINY 14
2.4.1. Syntax 15
2.4.2. States, memories, inputs, outputs and values 15
2.4.3. Semantic functions 16
2.4.3.1. Denotations of expressions 16
2.4.3.2. Denotations of commands 18
2.4.4. Semantic clauses 19
2.4.4.1. Clauses for expressions 19
2.4.4.2. Clauses for commands 20
2.4.5. Summary of the formal semantics of TINY 21

3. General concepts and notation 23


3.1. Abstract syntax 23
3.2. Sets and domains 24
3.2.1. The problem of recursively defined functions 25
3.2.2. The problem of recursively defined sets 26
3.2.3. The role of Dana Scott's theory 28
3.2.4. The role of mathematics in this book 29
3.3. Defining domains 30
3.3.1. Standard domains 30
3.3.2. Finite domains 30
3.3.3. Domain constructors 30
3.3.3.1. Function space [D1 → D2] 30
3.3.3.2. Product [D1 × D2 × ... × Dn] 31
3.3.3.3. Sequences D* 31
3.3.3.4. Sum [D1 + D2 + ... + Dn] 32
3.3.4. Domain equations 34
3.4. Functions 34
3.4.1. A-notation 36
3.4.1.1. Basic idea 36
3.4.1.2. Elaborations 36
3.4.1.2.1. Explicitly indicating source and/ or target 36
3.4.1.2.2. More than one argument 37
3.4.1.3. Applying A-expressions to arguments 37

3.4.1.4. Changing bound variables 38


3.4.2. Higher order functions 38
3.4.3. Important notational conventions on precedence and association 39
3.4.4. Currying 40
3.4.5. Conditionals 41
3.4.6. Cases notation 42
3.4.7. Updating functions 43
3.4.8. Generic functions 43
3.4.9. Ways of defining functions (including recursion) 43
3.4.10. Cancelling out variables 45
3.4.11. where notation 46
3.4.12. Composition and sequencing 47
3.4.12.1. Composition 47
3.4.12.2. Sequencing 47

4. Denotational description of TINY 49


4.1. Abstract syntax 49
4.1.1. Syntactic domains 49
4.1.2. Syntactic clauses 49
4.2. Semantics 49
4.2.1. Semantic domains 50
4.2.2. Auxiliary functions 50
4.2.2.1. result 50
4.2.2.2. donothing 50
4.2.2.3. checkNum 50
4.2.2.4. checkBool 51
4.2.3. Semantic functions 51
4.2.4. Semantic clauses 51
4.2.4.1. Clauses for expressions 51
4.2.4.2. Clauses for commands 51

5. Standard semantics 52
5.1. Continuations 52
5.1.1. Modelling the 'rest of the program' 53
5.1.2. Direct and continuation semantics 55
5.1.3. Continuation semantics of TINY 57
5.1.3.1. Semantic domains and functions 57
5.1.3.2. Semantic clauses 58
5.1.4. Final answers and output 60
5.1.4.1. Final answers are not states 60
5.1.4.2. Output is not part of the state 60
5.1.4.3. Output can be infinite 60
5.2. Locations, stores and environments 62
5.2.1. Sharing 62
5.2.2. Variables and locations 62
5.2.3. Stores 63
5.2.4. Environments 64
5.3. Standard domains of values 65
5.4. Blocks, declarations and scope 65
5.5. Standard domains of continuations 68
5.5.1. Command continuations 68
5.5.2. Expression continuations 68
5.5.3. Declaration continuations 68
5.6. Standard semantic functions 69

5.7. Continuation transforming functions 70


5.7.1. cont 71
5.7.2. update 71
5.7.3. ref 72
5.7.4. deref 72
5.7.5. err 72
5.7.6. Domain checks: D? 72
5.8. Assignments and L and R values 73
5.8.1. L and R values 74
5.9. Procedures and functions 75
5.9.1. Procedures 75
5.9.2. Functions 77
5.9.3. Summary 78
5.10. Non-standard semantics and concurrency 78

6. A second example: the language SMALL 80


6.1. Syntax of SMALL 80
6.1.1. Syntactic domains 80
6.1.2. Syntactic clauses 80
6.2. Semantics of SMALL 81
6.2.1. Semantic domains 81
6.2.2. Semantic functions 82
6.2.3. Semantic clauses 82
6.2.3.1. Programs 82
6.2.3.2. Expressions 82
6.2.3.3. Commands 83
6.2.3.4. Declarations 84
6.3. A worked example 85

7. Escapes and jumps 88


7.1. Escapes 88
7.1.1. Escapes from commands 88
7.1.2. Escapes from expressions 89
7.1.3. valof and resultis 90
7.2. Jumps 92
7.2.1. The semantics of jumps 93
7.2.2. Assigning label values to variables 96

8. Various kinds of procedures and functions 98


8.1. Procedures (or functions) with zero or more parameters 98
8.1.1. Zero parameters 98
8.1.2. More than one parameter 99
8.2. Recursive procedures and functions 100
8.2.1. Recursive functions in ALGOL 60 and PASCAL 102
8.3. Static and dynamic binding 102
8.3.1. Semantics of binding 103
8.3.2. Advantages and disadvantages of dynamic binding 104
8.4. Parameter passing mechanisms 105
8.4.1. Call by value 105
8.4.2. Call by reference 108
8.4.2.1. Simple call by reference 108
8.4.2.2. PASCAL call by reference 108
8.4.2.3. FORTRAN call by reference 108
8.4.3. Call by value and result 109

8.5. Procedure calling mechanisms 110


8.5.1. Call by closure (ALGOL 60 call by name) 111
8.5.2. Call by text (LISP FEXPRs) 112
8.5.3. Call by denotation 113
8.5.4. Quotation constructs 113
8.6. Summary of calling and passing mechanisms 114
8.7. Procedure and function denoting expressions (abstractions) 115
8.8. Declaration binding mechanisms 116

9. Data structures 118


9.1. References 118
9.2. Arrays 119
9.2.1. news 120
9.2.2. newarray 120
9.2.3. subscript 120
9.3. Records 121
9.4. Data structure valued expressions 123
9.5. Files 124

10. Iteration constructs 129


10.1. repeat C until E 129
10.2. Event loops 129
10.3. For-statements 130

11. Own-variables 134


11.1. The within construct 135
11.2. Different interpretations of ALGOL 60 own-variables 135
11.3. Semantics of own-variables 136
11.3.1. Static interpretation 140
11.3.2. Intermediate interpretation 140
11.3.3. Dynamic interpretation 140

12. Types 142


12.1. Various kinds of types 142
12.2. Well-typed programs and type-checking 143
12.2.1. The denotational description of type-checking 144
12.3. The semantics of types 145

Appendix: Remarks for instructors and sample exercises 147


Sample exercises 149

References 152
Subject and Author Index 154
Symbols 160
1. Introduction
1.1. Syntax, semantics and pragmatics
The study of a natural or artificial language is traditionally split into three
areas:
(i) Syntax: This deals with the form, shape, structure etc. of the
various expressions in the language.
(ii) Semantics: This deals with the meaning of the language.
(iii) Pragmatics: This deals with the uses of the language.

For programming languages the words "syntax" and "semantics" both
have fairly clear meanings; "pragmatics" on the other hand is rather more
vague-some people use the term to cover anything to do with
implementations (the idea being that languages are 'used' via
implementations); others use the term for the study of the 'practical
value' of programming concepts. We shall discuss our uses of "syntax"
and "semantics" later in the book; the word "pragmatics" will be
avoided.

1.2. The purposes of formal semantics


This book is about the formal semantics of programming languages. The
word "formal" means that our study will be based on precise
mathematical principles (although we shall avoid the actual mathematics).
Many books on programming languages discuss meanings intuitively or
informally. For example in the PASCAL report [Jensen and Wirth] the
meaning of an assignment statement is described thus: "The assignment
statement serves to replace the current value of a variable by a new value
specified as an expression". For many purposes this is a perfectly
adequate semantic description and one might wonder if there is any point
in being more formal. To see that there is, we now briefly discuss three
uses of formal techniques:
(i) Providing precise and machine-independent concepts.
(ii) Providing unambiguous specification techniques.
(iii) Providing a rigorous theory to support reliable reasoning.

We shall explain each of these in turn.

1.2.1. Providing precise and machine-independent concepts
When informally describing or comparing languages one frequently
makes use of various concepts. For example recall the description of the
PASCAL assignment statement quoted above. If one was trying to
explain this to someone who knew nothing about programming one
would have to explain:
(i) The concept of a "variable".
(ii) What the "current value" of a variable is.
(iii) How a value is "specified as an expression".
One way of explaining these would be to describe some actual
implementation of PASCAL (say the one for the DEC 10 computer) and
then to explain how variables, current values, expression evaluation etc.
are represented. The trouble with this is that if someone else had been
given explanations with respect to a different implementation (say the one
for the CDC 6000 computer) then the two explanations might assign
subtly different meanings to the same concepts. In any case the
"essence" of these notions clearly is not dependent on any particular
machine. What a formal semantics gives one is a universal set of
concepts, defined in terms of abstract mathematical entities, which
enable things like (i), (ii), (iii) above to be explained without reference to
the arbitrary mechanisms of particular implementations.
Unfortunately it is rather hard to justify the analytic power of formal
concepts before the concepts themselves have been described. As we
proceed the reader must decide for himself whether we are really, as
claimed, providing powerful tools for thought.

1.2.2. Providing unambiguous specification techniques


Both users and implementers of a programming language need a
description that is comprehensible, unambiguous and complete. During
the last twenty years or so notation such as Backus-Naur form (BNF for
short), supported by the concepts of formal language theory, have
provided adequate ways of specifying, and thinking about, the syntax of
languages. Unfortunately the development of notation and concepts for
dealing with semantics has been much slower. Almost all languages have
been defined in English. Although these descriptions are frequently
masterpieces of apparent clarity they have nevertheless usually suffered

from both inconsistency and incompleteness. For example many different
interpretations were compatible with the defining reports of both ALGOL
60 [Knuth '67] and PASCAL [Welsh, Sneeringer and Hoare]. Experience
has shown that it just is not possible to achieve the necessary precision
using informal descriptions.
Another reason why formal semantic definitions are useful is that they
can be made machine readable. This enables useful software tools to be
produced - just as formal syntactic notations lead to tools such as parser
generators. For example Peter Mosses has produced a system which
generates test bed implementations of programming languages from
formal specifications similar to those described in this book[Mosses].

1.2.3. Providing a rigorous theory to support reliable reasoning


Formal semantic techniques enable us to state and rigorously prove
various properties of programs and programming languages. For example
in order to prove a program correct one must show that its actual meaning
coincides with its intended meaning, and to do this both meanings must
be formalised.
When reasoning about programs one is often tempted to use various
apparently plausible rules of inference. For example suppose at some
point in a computation some sentence, S[E] say, involving the expression
E is true; then after doing the assignment x := E one would expect S[x] to
hold (since x now has the value of E). Thus if "y is prime" is true before
doing the assignment x := y then "x is prime" is true after it (here E is y
and S[E] is "E is prime"). Although at first sight this rule seems to be
intuitively correct it does not work for most real languages. For example
in PASCAL if x is of type real and y of type integer then if the sentence "y
is an integer" is true before doing x := y the sentence "x is an integer" is
not true after it (since the value of y will have been coerced to a value of
type real before being assigned to x). It can be shown [Ligler] that for
ALGOL 60 the assignment rule discussed above can fail to hold in six
ways.
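The PASCAL counterexample above can be mimicked in any language with an int/real distinction. The following sketch is illustrative only (the helper `is_integer` and the use of `float()` to simulate PASCAL's coercion are my own devices, not from the text): S[E] is the sentence "the value of E is an integer", and the assignment x := y coerces y's integer value to a real.

```python
# A hypothetical analogue of the PASCAL coercion example: the rule says
# that if S[E] holds before x := E then S[x] should hold after.
# Here S[E] is "the value of E is an integer" and E is the variable y.

def is_integer(v):
    """S[.]: the sentence 'v is an integer'."""
    return isinstance(v, int) and not isinstance(v, bool)

y = 7
assert is_integer(y)       # "y is an integer" holds before the assignment

x = float(y)               # models x := y where x has type real: y is coerced
assert not is_integer(x)   # ...but "x is an integer" fails after it

print(is_integer(y), is_integer(x))   # -> True False
```

The rule fails not because the informal reasoning is sloppy but because the assignment does more than copy a value; a formal semantics makes such hidden steps explicit.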
A formal semantics provides tools for discovering and proving exactly
when rules like the above one work: without a formal semantics one has
to rely on intuition-experience shows this is not enough. For example
the designers of EUCLID hoped that by making the language clean and
simple the validity of various rules of inference would be intuitively
Introduction 9

obvious. Unfortunately when they came to actually write down the rules
they found their correctness was not at all obvious and in the paper
describing them [London et al.] they have to point out that they are still not
sure the rules are all correct. If EUCLID had been given a formal semantics
(of the type to be described in this book) then whether or not a given rule
was correct would have been a matter of routine mathematical calcula-
tion.
In this book we shall concentrate on the topics discussed in 1.2.1. and
1.2.2. - readers interested in the underlying mathematics (and its use for
rigorous reasoning) are advised to look at Joe Stoy's excellent text [Stoy].
1.3. Denotational semantics
The kind of formal semantics described in this book is called denotational
semantics. The word "denotational" is used because in this approach
language constructs are described by giving them denotations-these are
abstract mathematical entities which model their meaning. We shall be
going into great detail about what these denotations are and how they are
related to the constructs that denote them-for the time being perhaps a
hint of the idea can be conveyed by saying that the expressions (4 + 2),
(12 - 6) and (2 x 3) all denote the same number, namely the number
denoted by 6.
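The idea that distinct pieces of syntax share one denotation can be sketched in a few lines. The tuple representation and the function name `denote` are my own choices for illustration, not the book's notation:

```python
# A toy "denotation" function: expressions are tuples like ('+', 4, 2),
# and denote maps each piece of syntax to the number it stands for.

def denote(e):
    """Map an expression tree to the number it denotes."""
    if isinstance(e, int):
        return e                       # a numeral denotes itself
    op, e1, e2 = e
    v1, v2 = denote(e1), denote(e2)
    if op == '+': return v1 + v2
    if op == '-': return v1 - v2
    if op == 'x': return v1 * v2

# (4 + 2), (12 - 6) and (2 x 3) are three distinct expressions...
exprs = [('+', 4, 2), ('-', 12, 6), ('x', 2, 3)]
# ...but all three have the same denotation, the number denoted by 6:
print([denote(e) for e in exprs])   # -> [6, 6, 6]
```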
In the early days of the subject denotational semantics was called
"mathematical semantics" -this term has now been abandoned since it
incorrectly suggests that other kinds of semantics are non-mathematical.
These other kinds of semantics-for example operational semantics, axio-
matic semantics and algebraic semantics-are not alternative ways of
doing what denotational semantics does; they each have their own set of
goals. A certain amount of confusion has arisen from considering these
kinds of semantics to be rival theories. This confusion leads one to under-
take the useless activity of trying to decide which kind is best. To try and
decide, for example, if denotational semantics is better than axiomatic
semantics is like trying to decide if physical chemistry is better than
organic chemistry.
The purposes of the various kinds of semantics are outlined below:

Denotational semantics
This is concerned with designing denotations for constructs. In this book
we shall only describe denotations which model abstract machine-independent
meanings; however denotations can be (and have been)
designed to model particular implementations (for examples see Vol. II of
[Milne and Strachey]).

Operational semantics
This is concerned with the operations of machines which run programs-
namely implementations. These operations can be described with
machine-oriented denotations or using non-denotational techniques like
automata theory.

Axiomatic semantics
This is concerned with axioms and rules of inference for reasoning about
programs.

Algebraic semantics
This is a branch of denotational semantics making use of algebraic
concepts.

1.4. Abstract entities and their description


Denotational semantics is concerned with denotations-abstract entities
which model meanings. Now unfortunately such abstract entities are
rather elusive-indeed it is impossible to discuss them without using
some particular concrete notation. For example the only way to talk about
numbers is to use some arbitrary set of names like 1, 2, 3 or I, II, III or
one, two, three ... etc. Since we cannot directly manipulate abstract
entities but only talk about them with languages, perhaps we should
conclude that the languages are the only things that 'really exist'.
In this book we shall write as though abstract entities like numbers or
sets really do exist and that they are different from the notations and
languages used to talk about them. This approach is the standard one and
(we believe) leads to clean and well structured thinking-for example it
focuses one on the 'real problems' and prevents one's thought from
getting bogged down in inessential details (like whether to use 1, 2, 3 or
I, II, III-there is one set of numbers but many ways of naming them).
However it is possible to use the notation of denotational semantics whilst

denying that this notation refers to abstract entities at all. If one adopts
this attitude then a denotational description is thought of as specifying a
translation from one language (the language being described) to another
language (a 'meta language' consisting of the notations described in this
book). The problem with this is that the metalanguage-the 'target
language' of the translation-is ad hoc and not really properly defined.
Now it is possible to formalize the metalanguage (see [Mosses]) but doing
this, although very valuable for some purposes, is a separate task from our
main goal-namely describing programming languages-just as formal-
izing the language of physics is a separate task from actually doing
physics (the former activity is part of the philosophy of science).
Just as mathematicians use ad hoc pieces of notation to increase clarity
so we use various informal abbreviations to make the descriptions of
denotations clearer. If one thinks of semantics as being about abstract
concepts then this is a natural and reasonable thing to do-the concepts
are what really matter, the notation is just one of many possible ways of
talking about them. If, on the other hand, one thinks of semantics as
explaining meanings by translating them into a metalanguage then one
will (rightfully) worry about the lack of precise details (for example exact
syntax) of this metalanguage. For readers inclined toward the second
view this book is likely to be frustrating and unsatisfactory.
2. A first example: the language TINY

In this chapter we shall describe the syntax and semantics of a little


programming language called TINY. Our purpose is to provide a vehicle
for illustrating various formal concepts in use. In subsequent chapters we
shall describe these concepts systematically but here we just sketch out
the main ideas and associated techniques.

2.1. Informal syntax of TINY


TINY has two main kinds of constructs, expressions and commands, both
of which can contain identifiers which are strings of letters or digits begin-
ning with a letter (for example x, y1, thisisaverylongidentifier). If we
let I, I1, I2, ... stand for arbitrary identifiers, E, E1, E2, ... stand for
arbitrary expressions and C, C1, C2, ... stand for arbitrary commands
then the constructs of TINY can be listed as follows:
E ::= 0 | 1 | true | false | read | I | not E | E1 = E2 | E1 + E2
C ::= I := E | output E | if E then C1 else C2 | while E do C | C1;C2
This notation is a variant of BNF; the symbol "::=" should be read as
"can be" and the symbol "|" as "or". Thus a command C can be I := E or
output E or if E then C1 else C2 or while E do C or C1;C2. Notice that
this syntactic description is ambiguous-for example it does not say
whether while E do C1;C2 is (while E do C1);C2 or while E do (C1;C2).
We shall use brackets (as above) and indentation to avoid such ambi-
guities. In the next chapter we shall clarify further our approach to syntax.
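One way to see how the grammar above determines a set of trees rather than strings is to mirror it as an abstract-syntax representation. The following sketch (class names are my own, not the book's) gives one class per production; the grouping ambiguity simply cannot arise, because every tree has exactly one shape:

```python
from dataclasses import dataclass

# Expressions: 0 | 1 | true | false | read | I | not E | E1 = E2 | E1 + E2
@dataclass
class Num:
    val: int          # the constants 0 and 1
@dataclass
class BoolConst:
    val: bool         # true and false
@dataclass
class Read:
    pass              # read
@dataclass
class Ide:
    name: str         # an identifier I
@dataclass
class Not:
    e: object
@dataclass
class Eq:
    e1: object
    e2: object
@dataclass
class Add:
    e1: object
    e2: object

# Commands: I := E | output E | if E then C1 else C2 | while E do C | C1;C2
@dataclass
class Assign:
    ide: str
    e: object
@dataclass
class Output:
    e: object
@dataclass
class If:
    e: object
    c1: object
    c2: object
@dataclass
class While:
    e: object
    c: object
@dataclass
class Seq:
    c1: object
    c2: object

# The ambiguity noted above disappears in tree form:
# while E do (C1;C2) and (while E do C1);C2 are different trees.
amb1 = While(BoolConst(True), Seq(Output(Num(0)), Output(Num(1))))
amb2 = Seq(While(BoolConst(True), Output(Num(0))), Output(Num(1)))
print(amb1 == amb2)   # -> False
```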

2.2. Informal semantics of TINY


Each command of TINY, when executed, changes the state. This state
has three components:
(i) The memory: this is a correspondence between identifiers and
values. In the memory each identifier is either bound to some value
or unbound.
(ii) The input: this is supplied by the user before programs are run; it
consists of a (possibly empty) sequence of values which can be
read using the expression read (explained later).


(iii) The output: this is an initially empty sequence of values which
records the results of the command output E (explained later).
Each expression of TINY specifies a value; since expressions may contain
identifiers (for example x + y) this value depends on the state. All values-
the values of expressions, the values bound to identifiers or the values in
the input or output-are either truth values (true, false) or numbers
(0,1,2, ... ).
We shall now explain informally the meaning of each construct.

2.2.1. Informal semantics of expressions


The value of each expression is as follows:
(E1) 0 or 1
The value of 0 is the number 0 and the value of 1 is the number 1.
(E2) true or false
The value of true is the truth value true and the value of false is
the truth value false.
(E3) read
The value of read is the next item on the input (an error occurs if
the input is empty). read has the 'side effect' of removing the first item
so after it has been evaluated the input is one item shorter.
(E4) I
The value of an expression I is the value bound to I in the memory (if I
is unbound an error occurs).
(E5) not E
If the value of E is true then the value of not E is false; if the value of E
is false then the value of not E is true. In all other cases an
error occurs.
(E6) E1 = E2
The value of E1 = E2 is true if the value of E1 equals the value of E2,
otherwise it is false.
(E7) E1 + E2
The value of E1 + E2 is the numerical sum of the values of E1 and E2
(if either of these values is not a number then an error occurs).

2.2.2. Informal semantics of commands


(C1) I := E
I is bound to the value of E in the memory (overwriting whatever was
previously bound to I).
(C2) output E
The value of E is put onto the output.
(C3) if E then C1 else C2
If the value of E is true then C1 is done, if its value is false then C2
is done (in any other case an error occurs).
(C4) while E do C
If the value of E is true then C is done and then while E do C is
repeated starting with the state resulting from C's execution. If
the value of E is false then nothing is done. If the value of E is neither
true nor false an error occurs.
(C5) C1;C2
C1 and then C2 are done in that order.

2.3. An example
The following TINY command outputs the sum of the numbers on the
input. The end of the input is marked with true.
sum := 0; x := read;
while not (x = true) do (sum := sum + x; x := read);
output sum
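The informal semantics of 2.2 can be played out directly. The sketch below is operational rather than denotational in style, and the tuple representation of constructs and all function names are my own; it executes the summation command above on the input 1, 2, 3 followed by true:

```python
def eval_exp(e, mem, inp):
    """Evaluate an expression: return (value, remaining input)."""
    tag = e[0]
    if tag == 'num':  return e[1], inp
    if tag == 'bool': return e[1], inp
    if tag == 'read':
        if not inp: raise RuntimeError('read from empty input')
        return inp[0], inp[1:]           # read consumes one input item
    if tag == 'ide':
        if e[1] not in mem: raise RuntimeError(e[1] + ' is unbound')
        return mem[e[1]], inp
    if tag == 'not':
        v, inp = eval_exp(e[1], mem, inp)
        if not isinstance(v, bool): raise RuntimeError('not: non-boolean')
        return (not v), inp
    if tag == 'eq':
        v1, inp = eval_exp(e[1], mem, inp)
        v2, inp = eval_exp(e[2], mem, inp)
        # TINY's = never identifies a number with a truth value, so compare
        # types as well (in Python 1 == True would otherwise hold)
        return (type(v1) is type(v2) and v1 == v2), inp
    if tag == 'add':
        v1, inp = eval_exp(e[1], mem, inp)
        v2, inp = eval_exp(e[2], mem, inp)
        if isinstance(v1, bool) or isinstance(v2, bool):
            raise RuntimeError('+: non-number')
        return v1 + v2, inp

def exec_com(c, mem, inp, out):
    """Execute a command: return the new state (memory, input, output)."""
    tag = c[0]
    if tag == 'assign':
        v, inp = eval_exp(c[2], mem, inp)
        return {**mem, c[1]: v}, inp, out
    if tag == 'output':
        v, inp = eval_exp(c[1], mem, inp)
        return mem, inp, out + [v]
    if tag == 'if':
        v, inp = eval_exp(c[1], mem, inp)
        if not isinstance(v, bool): raise RuntimeError('if: non-boolean test')
        return exec_com(c[2] if v else c[3], mem, inp, out)
    if tag == 'while':
        v, inp = eval_exp(c[1], mem, inp)
        if not isinstance(v, bool): raise RuntimeError('while: non-boolean test')
        if not v: return mem, inp, out
        return exec_com(c, *exec_com(c[2], mem, inp, out))   # do C, repeat
    if tag == 'seq':
        return exec_com(c[2], *exec_com(c[1], mem, inp, out))

# The summation command of 2.3, with input 1, 2, 3 terminated by true:
body = ('seq', ('assign', 'sum', ('add', ('ide', 'sum'), ('ide', 'x'))),
               ('assign', 'x', ('read',)))
prog = ('seq', ('seq', ('assign', 'sum', ('num', 0)),
                       ('assign', 'x', ('read',))),
        ('seq', ('while', ('not', ('eq', ('ide', 'x'), ('bool', True))), body),
                ('output', ('ide', 'sum'))))
mem, inp, out = exec_com(prog, {}, [1, 2, 3, True], [])
print(out)   # -> [6]
```

The denotational description developed in 2.4 assigns these constructs mathematical functions rather than executing them, but the behaviour it specifies is the one sketched here.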

2.4. Formal semantics of TINY


We shall now formalize the above informal description of TINY. Our
hope is to convey the general 'shape' of a denotational description. The
reader should not attempt to grasp all the details (some of which are over-
simplified) but just get the main ideas.
An essential part of our formalization will be the defining of various sets
(for example sets of denotations). It turns out that some of the ways of
defining sets we need only work properly if we use certain special sets
called domains. We shall thus use "domain" instead of "set" - however
intuitively domains can be thought of just like sets (indeed the class of

domains is just a subclass of the class of sets) and we shall employ normal
set theoretic notation on them. For example {x | P[x]} is the set of all x's
satisfying P[x]; x ∈ S means x belongs to S; f : S1 → S2 means f is a func-
tion from S1 to S2. In the next chapter (in fact in 3.2.) we shall explain
why domains rather than sets must be used.

2.4.1. Syntax
To deal with the syntax we define the following syntactic domains:
Ide = {I | I is an identifier}
Exp = {E | E is an expression}
Com = {C | C is a command}
Domain names like these will always start with a capital letter. Ide is a
standard domain, Exp and Com vary from language to language.

2.4.2. States, memories, inputs, outputs and values


We start by formalizing the concept of a state. To do this we define
domains State of states, Memory of memories, Input of inputs, Output
of outputs and Value of values. The definition of these domains consists
of the following domain equations which we first state and then explain:
(i) State = Memory × Input × Output
(ii) Memory = Ide → [Value + {unbound}]
(iii) Input = Value*
(iv) Output = Value*
(v) Value = Num + Bool

(i) Means that State is the domain of all triples (m,i,o) where m ∈
Memory, i ∈ Input and o ∈ Output. In general D1 × D2 × ... × Dn is the
domain {(d1,d2, ... ,dn) | d1 ∈ D1, d2 ∈ D2, ... , dn ∈ Dn} of n-tuples.
(ii) Means that Memory is the domain of all functions from the domain
Ide to the domain [Value + {unbound}]. The domain [Value + {unbound}]
is the union of the domain Value and the one element domain {unbound}.
In general [D1 → D2] is the domain {f | f : D1 → D2} of all functions from D1
to D2 and [D1 + D2] is the 'disjoint union' of D1 and D2-we clarify this
later; for now think of + as ordinary union restricted to disjoint domains.
If m ∈ Memory and I ∈ Ide then m I, the result of applying the function m
to argument I, is either in Value or is unbound; in the former case the

value m I is the value bound to I in m, in the latter case I is unbound in m.


(iii) Means Input is the domain of all strings (including the empty string)
of members of Value. In general D* is the domain
{(d1, ... ,dn) | d1 ∈ D, d2 ∈ D, ... , dn ∈ D}
of strings or sequences over D. We shall denote the empty string by ( )
and sometimes write d1 • d2 • ... • dn instead of (d1, ... ,dn).
(iv) Means Output is the domain of all strings of values.
(v) Means that a value is either a number (i.e. a member of Num) or a truth
value (i.e. a member of Bool). Num = {0,1,2, ...} and Bool = {true, false}.
Thus a state is a triple (m,i,o) where m is a function from identifiers to
values (or unbound) and i and o are sequences of values.
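The domain equations above can be mirrored quite literally: a memory is a function from identifiers to a member of Value + {unbound}, and a state is a triple. A minimal sketch (the names UNBOUND, empty_memory and update are my own, not the book's):

```python
# Modelling the domains of 2.4.2: Memory = Ide -> [Value + {unbound}],
# State = Memory x Input x Output.  Sequences are Python tuples.

UNBOUND = 'unbound'          # the single element of the domain {unbound}

def empty_memory(ide):
    """The memory in which every identifier is unbound."""
    return UNBOUND

def update(m, ide, v):
    """The memory that is like m except that ide is now bound to v."""
    return lambda i: v if i == ide else m(i)

# A state (m, i, o): memory m, input (1, 2), initially empty output ( )
m = update(update(empty_memory, 'x', 3), 'y', True)
state = (m, (1, 2), ())
print(m('x'), m('y'), m('z'))   # -> 3 True unbound
```

Representing a memory as a function (rather than, say, a dict) keeps the code close to equation (ii): applying m to I either yields a value or unbound, exactly as in the text.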

2.4.3. Semantic functions


In this section we discuss the semantic functions for TINY. Semantic func-
tions are functions which define the denotation of constructs. For TINY
we need:
E : Exp → {denotations of expressions}
C : Com → {denotations of commands}

If E ∈ Exp and C ∈ Com then E[E] and C[C] are the results of applying the
functions E and C to E and C respectively and are the denotations of the
corresponding constructs defined by the semantics. We discuss these
denotations later, but first note that:
(i) In general if X is a variable which ranges over some syntactic
domain of constructs then we will use X for the corresponding semantic
function. Thus C[I := E] is the denotation-or meaning-of I := E etc.
(ii) The "emphatic brackets" [ and ] are used to surround syntactic
objects when applying semantic functions to them. They are supposed to
increase readability but have no other significance-X[X] is just the result
of applying the function X to X.

2.4.3.1. Denotations of expressions


Since expressions produce values one might at first take their denotations
to be members of Value. To model this idea one would give the semantic
function E the type Exp → Value; then E[E] would be E's value (e.g.
E[0] = 0). This works for constant expressions but in general it fails to
handle:
(i) The possibility of expressions causing errors (for example 1 + true).
A First Example: The Language TINY 17

(ii) The dependence of some expressions' values on the state (for
example the value of x + 1 depends on what x is bound to in the
memory; the value of read depends on the input).
(iii) The possibility that the evaluation of an expression might change
the state (for example read removes the first item from the input).

To handle (i) we must define E: Exp → [Value + {error}] so that:

E[E] = v      if v is E's value
       error  if E causes an error

For example E[1 + 1] = 2 but E[1 + true] = error.


To handle (ii) we must make the result of an evaluation a function of the
state - i.e. define E: Exp → [State → [Value + {error}]] so that:

E[E]s = v      if v is E's value in s
        error  if the evaluation causes an error

For example:

E[1 + x]s = 1 + (m x)  if s = (m, i, o) and m x is a number
            error      otherwise

Thus the denotation E[E] of E is a function of type
State → [Value + {error}].
Finally, to handle (iii) we must further complicate E's type so that:
E: Exp → [State → [[Value × State] + {error}]]
and then

E[E]s = (v, s')  where v is E's value in s and s' is the state after
                 the evaluation
        error    if an error occurs



For example:

E[read]s = (hd i, (m, tl i, o))  if s = (m, i, o), i is non-empty, its
                                 first member is hd i and the rest is tl i
           error                 if the input is empty

We formally define E by cases on the different kinds of expressions. For
example the denotations of expressions of the form I are defined by the
following semantic clause:

E[I](m, i, o) = (m I = unbound) → error, (m I, (m, i, o))

Here "b → v1, v2" means "if b is true then v1 else v2" - we give a
precise account of this notation in 3.4.5. Thus the above semantic clause
says that if m I is unbound (i.e. I is unbound in m) then an error occurs,
otherwise the value of I is whatever it is bound to in the memory. The
state resulting from the evaluation is (m, i, o) - i.e. the original state. An
example of a semantic clause in which the evaluation changes the state is:

E[read](m, i, o) = null i → error, (hd i, (m, tl i, o))

Here null i is true if i is empty, hd i is the first element of i and tl i the rest
(see 3.3.3.3.). Thus if the input is empty an error results, otherwise the
value of read is the first item in the input and the resulting state has this
item removed.
The rest of the semantic clauses for TINY are described in 2.4.4. below.
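The two clauses just given can be rendered directly as functions from states to results. The sketch below is our own Python rendering (memories as dictionaries, input and output as tuples, the string 'error' as the error result); each denotation maps a state to a pair (value, new state) or to 'error':

```python
def E_ident(ident):
    """E[I](m,i,o) = (m I = unbound) -> error, (m I, (m,i,o))"""
    def denotation(state):
        m, i, o = state
        if ident not in m:               # m I = unbound
            return 'error'
        return (m[ident], (m, i, o))     # value bound to I; state unchanged
    return denotation

def E_read(state):
    """E[read](m,i,o) = null i -> error, (hd i, (m, tl i, o))"""
    m, i, o = state
    if not i:                            # null i
        return 'error'
    return (i[0], (m, i[1:], o))         # hd i; the state loses one input item

s = ({'x': 7}, (42, 43), ())
assert E_ident('x')(s) == (7, s)
assert E_ident('y')(s) == 'error'
assert E_read(s) == (42, ({'x': 7}, (43,), ()))
```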

2.4.3.2. Denotations of commands


The effect of executing a command is to produce a new state or generate
an error, thus:

C: Com → [State → [State + {error}]]

A typical semantic clause is:

C[output E]s = (E[E]s = (v, (m, i, o))) → (m, i, v·o), error

Here v·o is the string resulting from sticking v onto the front of o. Thus
this semantic clause says that C[output E] is a function which when
applied to a state s first evaluates E, and if E produces a value v and new
state (m, i, o) then the result is (m, i, v·o), otherwise the result is error.

2.4.4. Semantic clauses


In this section we describe and explain the semantic clauses for TINY.

2.4.4.1. Clauses for expressions


(E1) E[0]s = (0, s)
     E[1]s = (1, s)
The value of a numeral is the corresponding number; the evaluation does
not change the state.
(E2) E[true]s = (true, s)
     E[false]s = (false, s)
The value of a boolean constant is the corresponding boolean value (truth
value); the evaluation does not change the state.
(E3) E[read](m, i, o) = null i → error, (hd i, (m, tl i, o))

This was explained above: null i is true if i is empty and false otherwise;
hd i is the first element of i and tl i is the rest of i.
(E4) E[I](m, i, o) = (m I = unbound) → error, (m I, (m, i, o))
This was explained in 2.4.3.1. above.
(E5) E[not E]s = (E[E]s = (v, s')) → (isBool v → (not v, s'), error), error

isBool v is true if v∈Bool and is false otherwise (here
v∈Value = Num + Bool); not: Bool → Bool is the function defined by
not true = false and not false = true. Thus the value of not E is not of
the value of E (or error if E's evaluation leads to a number or error), and
the state is changed to the state s' resulting from E's evaluation in s.
(E6) E[E1 = E2]s = (E[E1]s = (v1, s1)) → ((E[E2]s1 = (v2, s2)) →
         (v1 = v2, s2), error), error
Here v1 = v2 is true if v1 equals v2 and false otherwise. Thus the result of
E1 = E2 in s is obtained by first evaluating E1 in s to get (v1, s1) (or error
- in which case error is the value of E1 = E2), then evaluating E2 in s1 to
get (v2, s2) (or error - in which case error is the value of E1 = E2); finally
(v1 = v2, s2) is returned as the result of E1 = E2.
(E7) E[E1 + E2]s = (E[E1]s = (v1, s1)) →
         ((E[E2]s1 = (v2, s2)) →
             (isNum v1 and isNum v2 →
                 (v1 + v2, s2), error), error), error
This semantic clause is similar to the preceding one. isNum v is true if v
is a number (i.e. a member of Num) and false otherwise. Thus (E7) says
that to evaluate E1 + E2 one evaluates E1, then evaluates E2 (in the state
resulting from E1), then tests their values to make sure they are numbers
and if so returns their sum and the state resulting from E2. If either of
these values is not a number, or if E1 or E2 generates an error, then E1 + E2
generates an error.
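Clause (E7), with its nested error tests, transliterates into nested conditionals. Here is a sketch in the same Python representation as before (states as (memory, input, output) triples, 'error' as the error result); the helpers one and true standing for E[1] and E[true] are ours:

```python
def E_add(E1, E2):
    """(E7): evaluate E1, then E2 in the resulting state; add if both are numbers."""
    def denotation(s):
        r1 = E1(s)
        if r1 == 'error':
            return 'error'
        v1, s1 = r1
        r2 = E2(s1)                      # E2 is evaluated in E1's result state
        if r2 == 'error':
            return 'error'
        v2, s2 = r2
        # isNum v1 and isNum v2 (note: Python bools are ints, so exclude them)
        if isinstance(v1, int) and isinstance(v2, int) and \
           not isinstance(v1, bool) and not isinstance(v2, bool):
            return (v1 + v2, s2)
        return 'error'
    return denotation

one = lambda s: (1, s)       # E[1]
true = lambda s: (True, s)   # E[true]
s0 = ({}, (), ())
assert E_add(one, one)(s0) == (2, s0)
assert E_add(one, true)(s0) == 'error'   # 1 + true causes an error
```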

2.4.4.2. Clauses for commands


(C1) C[I := E]s = (E[E]s = (v, (m, i, o))) → (m[v/I], i, o), error

(m[v/I]) I' = v     if I = I'
             m I'   otherwise

Thus C[I := E]s is a state identical to the state resulting from the evalua-
tion of E except that E's value v is bound to I in the memory (if E produces
error then so does I := E).
(C2) C[output E]s = (E[E]s = (v, (m, i, o))) → (m, i, v·o), error
This semantic clause was explained in 2.4.3.2. above.
(C3) C[if E then C1 else C2]s =
         (E[E]s = (v, s')) →
             (isBool v → (v → C[C1]s', C[C2]s'), error), error

isBool v is true if v is a truth value (i.e. if v∈Bool = {true, false}) and is
false otherwise. Thus if E produces result (v, s') when evaluated in s then
C1 or C2 is executed (in the state s' resulting from E) depending on
whether E's value v is true or false.

(C4) C[while E do C]s =
         (E[E]s = (v, s')) →
             (isBool v →
                 (v → ((C[C]s' = s'') → C[while E do C]s'', error), s'),
                 error), error

If E produces an error then so does while E do C; if E produces value v
and a new state s' then if v is true C is done to get s'' and then
while E do C is done starting with s''. If v is false then the result of
while E do C is s', the result of E. Finally if v is not a truth value or
C[C]s' = error then while E do C causes an error. Notice that (C4) is
recursive: C[while E do C] occurs on the right hand side.
(C5) C[C1 ; C2]s = (C[C1]s = error) → error, C[C2](C[C1]s)

Thus if C1 generates an error then so does C1 ; C2, otherwise C2 is done in
the state resulting from C1.
This completes the semantic clauses, and also the formal semantics, of
TINY. In some of the clauses, especially (E6), (E7), (C3) and (C4), the
tests for the various error conditions make it hard to follow what is going
on - the semantics of non-error executions fails to stand out. In the next
chapter we describe various notations for clarifying and simplifying
semantic clauses; the reader may like to peep ahead to chapter 4 to see
how neat the clauses can be made.
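Taken together, the command clauses amount to a small interpreter. The following Python sketch is our own rendering of (C1), (C2), (C4) and (C5), using the same representation as before; note how the function for while simply calls itself, mirroring the recursion in (C4). The test expression E_nonempty and the sample program are our inventions:

```python
def C_assign(ident, E):
    """(C1): bind E's value to ident in the memory."""
    def run(s):
        r = E(s)
        if r == 'error':
            return 'error'
        v, (m, i, o) = r
        return (dict(m, **{ident: v}), i, o)      # m[v/I]
    return run

def C_output(E):
    """(C2): stick E's value onto the front of the output."""
    def run(s):
        r = E(s)
        if r == 'error':
            return 'error'
        v, (m, i, o) = r
        return (m, i, (v,) + o)                   # v . o
    return run

def C_seq(C1, C2):
    """(C5): do C1, then C2 in the resulting state."""
    def run(s):
        s1 = C1(s)
        return 'error' if s1 == 'error' else C2(s1)
    return run

def C_while(E, C):
    """(C4): a recursive definition, just like the semantic clause."""
    def run(s):
        r = E(s)
        if r == 'error':
            return 'error'
        v, s1 = r
        if not isinstance(v, bool):               # not isBool v
            return 'error'
        if not v:
            return s1                             # while yields E's state
        s2 = C(s1)
        return 'error' if s2 == 'error' else run(s2)
    return run

E_read = lambda s: 'error' if not s[1] else (s[1][0], (s[0], s[1][1:], s[2]))
E_nonempty = lambda s: (len(s[1]) > 0, s)         # a test: is there input left?

# while (input non-empty) do (x := read; output x)
prog = C_while(E_nonempty,
               C_seq(C_assign('x', E_read),
                     C_output(lambda s: (s[0]['x'], s))))
m, i, o = prog(({}, (1, 2, 3), ()))
assert o == (3, 2, 1)     # output grows at the front, as in (C2)
assert i == ()
```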
An important thing to note about the semantic clauses above is that
each construct has its denotation defined in terms of the denotations of its
components; for example C[C1 ; C2] is defined in terms of C[C1] and
C[C2]. The only exception to this is (C4), the clause for while E do C,
where the denotation is defined 'in terms of itself' - we shall have more to
say on such recursive definitions in 3.2.1.

2.4.5. Summary of the formal semantics of TINY


The formal semantic description of TINY just given had three main parts:
(i) Specification of the syntactic domains Exp and Com.
(ii) Specification of the semantic domains State, Value etc.
(iii) Specification of the semantic functions E and C which map syntactic
entities to semantic ones.

In the next chapter we shall describe in detail the concepts and notations
of these specifications. In subsequent chapters we shall refine and extend
the concepts and notation, and apply the techniques to a wide variety of
programming constructs (including most of ALGOL 60 and PASCAL).
3. General concepts and notation
This chapter is rather long and tedious-it is a 'user manual' for the nota-
tion and concepts we shall need. On first reading one should only look in
detail at 3.1. and 3.2.; the other sections should be quickly skimmed and
then referred to later when their contents are used. Here is a quick over-
view of the main sections:
3.1. We explain the concept of 'abstract syntax' -the kind of syntactic
description convenient for semantics.
3.2. We explain why we use "domains" rather than sets and discuss
informally the role of the underlying mathematical theories.
3.3. We describe different kinds of domains and ways of building them.
3.4. We discuss the concept of a function and then describe
numerous notations for manipulating them.

3.1. Abstract syntax


The programs of most languages are built out of various kinds of con-
structs such as identifiers, expressions, commands and declarations. For
example let C be the TINY command x := read; sum := sum + x. Then:

C is the command C1 ; C2
where C1 is the command I1 := E1
where I1 is the identifier x
and E1 is the expression read
and C2 is the command I2 := E2
where I2 is the identifier sum
and E2 is the expression E21 + E22
where E21 is the identifier sum
and E22 is the identifier x
We call the various constructs which make up a given construct its
constituents; for example C1, C2, I1, I2, E1, E2, E21, E22 are the consti-
tuents of C. The immediate constituents of a construct are its 'biggest'
constituents; for example the immediate constituents of C are C1 and
C2, the immediate constituents of C1 are I1 and E1, and the immediate
constituents of C2 are I2 and E2.


As we noted after listing the semantic clauses of TINY, the denotation
assigned to a construct by a denotational semantics only depends on the
kind of construct and the denotations of its immediate constituents. Thus
for the purposes of semantics a syntactic description need only specify
the various constructs and what their immediate constituents are. Other
details of syntax - for example precedence - are irrelevant for semantics
and can thus be ignored. A syntactic description which just lists the kinds
of constructs and their immediate constituents is called an abstract
syntax.
The notation we shall use to specify abstract syntax has two parts:
(i) A list of the various syntactic categories of the language (for TINY
the syntactic categories are expressions and commands). For each
category we provide a name for the domain of all constructs of
that category (for TINY the names are "Exp" and "Com") and a
metavariable to range over the domain (in TINY E ranges over Exp
and C ranges over Com).
(ii) A list of syntactic clauses, each one specifying the various kinds of
constructs in a category (for TINY the syntactic clauses were given
in 2.1., namely E ::= 0 | ... and C ::= I := E | ...).

The metavariables stand for constructs of the corresponding category
(i.e. members of the corresponding syntactic domain). To distinguish
different instances metavariables may be primed or subscripted (for
example E, E', E1, E2 stand for expressions). The notation for each kind
of construct given in the syntactic clauses must display the immediate
constituents and distinguish it from other kinds of construct, but other-
wise it is arbitrary. Usually one chooses a notation based on some actual
concrete syntax for the language being defined.
We shall always use a fixed standard domain Ide of identifiers (with
metavariable I). Although languages do in fact differ in what they allow as
identifiers, these differences are not normally semantically significant.
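An abstract syntax in this sense is just a family of tree-shaped data types: one type per syntactic category, one constructor per kind of construct, with the immediate constituents as fields. A sketch in Python (the constructor names are ours, not the text's):

```python
from dataclasses import dataclass

# Syntactic domain Exp: each kind of expression lists its immediate constituents.
@dataclass
class Ident:             # the expression I
    ident: str

@dataclass
class Read:              # the expression read
    pass

@dataclass
class Add:               # E1 + E2
    left: object
    right: object

# Syntactic domain Com:
@dataclass
class Assign:            # I := E
    ident: str
    exp: object

@dataclass
class Seq:               # C1 ; C2
    first: object
    second: object

# The command  x := read; sum := sum + x  from the example above:
C = Seq(Assign('x', Read()),
        Assign('sum', Add(Ident('sum'), Ident('x'))))
# The immediate constituents of C are its two sub-commands:
assert C.first == Assign('x', Read())
assert C.second.ident == 'sum'
```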

3.2. Sets and domains


In describing the semantics of TINY we mysteriously used the word
"domain" instead of "set". As we hinted, this was for technical mathe-
matical reasons. Unfortunately, to fully explain the difference between sets
and domains, and to convincingly justify our use of the latter, we would
have to delve in detail into the underlying mathematics. This we wish to
avoid, and so I shall just very crudely sketch some of the issues involved.
Readers who are not satisfied with this discussion should consult Joe
Stoy's excellent book [Stoy] where all details are lucidly explained.
There are really two related, but separate, problems which lead to the
use of domains rather than sets:
(i) Recursive definitions of functions.
(ii) Recursive definitions of sets.

We shall look at these in turn.

3.2.1. The problem of recursively defined functions


We often find it necessary to define functions recursively. For example the
semantic clause for while E do C, (C4), defines C[while E do C] as some
expression involving C[while E do C]. At first sight such 'definitions' are
highly suspicious since they seem to assume the thing they are trying to
define is already defined. The way out is to regard recursive definitions as
equations - to regard (C4) as analogous to the quadratic equation
x = x²/3 + 2/3 (which 'defines' x to be 1 or 2) and to seek methods of
solving functional equations, like (C4), analogous to the methods we have
for solving quadratic equations. It turns out that the kind of functional
equations we want to solve in semantics only have solutions when the
functions involved map between specially structured sets called domains.
To see what goes wrong with ordinary sets consider the following two
equations 'defining' f: Num → Num
(i) f x = (f x) + 1
(ii) f x = f x

If Num = {0, 1, 2, ...} there is no f satisfying (i) and every f satisfies (ii);
thus neither of these two equations in any sense defines f. On the other
hand the equation:
(iii) f x = (x = 0) → 1, x × f(x − 1)

uniquely defines f to be the factorial function (i.e. for all x: f x = x!). If we
used arbitrary sets then it would be just as easy to write bad definitions
like (i) and (ii) as good ones like (iii). What the theory of domains does is
ensure every definition is good. It does this by:
(a) Requiring all domains to have a certain 'implicit structure' which
can be shown to guarantee that all equations (including (i), (ii) and
(iii)) have at least one solution.
(b) Providing a way, via the 'implicit structure', of choosing an 'in-
tended solution' from among the various solutions guaranteed
by (a).

For example the domain Num, in virtue of its 'implicit structure', contains
besides 0, 1, 2, ... an 'undefined' element undefined, and then (i) and (ii)
both end up defining f to be the 'undefined function': f x = undefined for
all x∈Num. (iii) defines f to satisfy f x = x! for x = 0, x = 1, ... and
f undefined = undefined. We shall not go into the details obscurely
hinted at here; all the reader needs to know is that recursive definitions
can be solved in a way that justifies the intuitions behind them. Some
further discussion of recursive definitions occurs in 3.4.9.
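The 'intended solution' of (b) is in fact the limit of a sequence of better and better approximations, starting from the everywhere-undefined function. We can mimic that construction concretely. In the sketch below (ours, not the text's), None plays the role of undefined; applying the functional induced by equation (iii) repeatedly makes the factorial function emerge:

```python
# Equation (iii) induces a functional F mapping an approximation f to a
# better one; None plays the role of the 'undefined' element.
def F(f):
    def better(x):
        if x == 0:
            return 1
        prev = f(x - 1)
        return None if prev is None else x * prev
    return better

undefined = lambda x: None      # the everywhere-undefined starting point

# Iterate: after n applications the approximation is correct on 0..n-1.
f = undefined
for _ in range(6):
    f = F(f)

assert [f(x) for x in range(5)] == [1, 1, 2, 6, 24]
assert f(10) is None            # still undefined beyond the 6th approximation
```

For equations (i) and (ii) the same iteration never improves on undefined, which matches the claim that both define f to be the undefined function.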

3.2.2. The problem of recursively defined sets


It is often the case that intuitions about the 'data' manipulated by pro-
grams in a language lead us to want to define recursively the domains
modelling the corresponding 'data-types'. As an example let us examine
one way of adding parameterless procedures to TINY. We add to the
syntax of TINY a new kind of expression proc C and a new kind of
command I. The informal semantics of these is:
(E8) proc C
The value of proc C is a procedure which when invoked (by an
identifier denoting it) executes C.
(C6) I
When the command I is executed the procedure denoted by I is
invoked.

For example in a program:

P := proc (x := x + 1);

P;

P;

each time a P was executed x would be incremented by 1.


To handle the formal semantics we must extend the domain Value to
include the denotations of procedures. Thus:
Value = Num + Bool + Proc
For denotations of procedures we simply take the state transformation
done each time the procedure is invoked. Thus
Proc = State → [State + {error}]

Now the semantic clauses are simply:

(E8) E[proc C]s = (C[C], s)
(C6) C[I](m, i, o) = (m I = unbound) → error,
         isProc(m I) → m I (m, i, o), error

Here m I (m, i, o) is the procedure value m I denoted by I in m applied to
the state (m, i, o). To see the point of this example let us write out the
domain equations for this extended TINY:
State = Memory × Input × Output
Memory = Ide → [Value + {unbound}]
Input = Value*
Output = Value*
Value = Num + Bool + Proc
Proc = State → [State + {error}]

These domain equations are recursive: State is defined in terms of Input
which is defined in terms of Value which is defined in terms of Proc
which is defined in terms of State. Thus just as the 'looping' of
while E do C led us to define its denotation as a recursive function, so the
embedding of procedures in states leads us to define State as a recursive
domain. Now just as there are problems with recursive definitions of
functions, so there are problems - much harder problems in fact - with
recursive definitions of domains. For example it can be shown mathe-
recursive definitions of domains. For example it can be shown mathe-
matically that there are no sets satisfying the domain equations above!
Fortunately if we work with domains, and interpret the domain-building
operators ×, →, + in a clever way, then solutions do exist. In other
words set equations cannot in general be solved but domain equations
can. The crucial two ideas which ensure all domain equations have solu-
tions are firstly to define [D1 → D2] to be the domain, not of all functions,
but just of those functions which preserve the 'implicit structure' on the
domains D1 and D2 (this 'implicit structure' is present as a consequence
of the exact definition of a domain), and secondly to interpret = not as
equality between domains but as 'isomorphism'. Alas we cannot explain
these tantalising remarks further here.
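Operationally the circularity is harmless: a procedure value is just a state transformer, and nothing stops us storing such a function in the memory component of a state. A Python sketch of the two new clauses, in our usual representation (the helper incr_x and the identifier names are ours):

```python
def E_proc(C):
    """(E8): the value of proc C is C's state transformation."""
    return lambda s: (C, s)        # here a member of Proc is just a function

def C_call(ident):
    """(C6): invoke the procedure bound to ident."""
    def run(state):
        m, i, o = state
        p = m.get(ident, 'unbound')
        if p == 'unbound' or not callable(p):   # unbound, or not isProc
            return 'error'
        return p((m, i, o))        # a member of Proc applied to a State
    return run

# A hand-written Proc for the command x := x + 1:
incr_x = lambda s: (dict(s[0], x=s[0].get('x', 0) + 1), s[1], s[2])

s0 = ({'P': incr_x, 'x': 0}, (), ())    # as if after P := proc (x := x + 1)
s1 = C_call('P')(s0)
s2 = C_call('P')(s1)
assert s2[0]['x'] == 2                  # each call to P increments x
assert C_call('Q')(s0) == 'error'       # Q is unbound
```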

3.2.3. The role of Dana Scott's theory


A major breakthrough in semantic theory came when Dana Scott showed
[Scott] how to consistently interpret both kinds of recursive definitions-
definitions of members of domains (see 3.2.1.) and definitions of domains
themselves (see 3.2.2.) - in a single unified framework. In summary what
he did was:
(i) Devise a class of 'structured' sets called domains and define the
operators ×, +, →, * etc. on them.
(ii) Show how elements of domains could be defined recursively.
(iii) Show how domains themselves could be defined recursively.
Many of the intuitive ideas of denotational semantics have been around
for ages-for example in 1966 Strachey published a paper "Towards a
Formal Semantics" [Strachey] which contains several of the key concepts
described in this book, and even before that John McCarthy had outlined
what was essentially a denotational semantics of a fragment of ALGOL 60
[McCarthy] .
The main achievement of Dana Scott was firstly to point out that the
naive formalization of some of the early semantic models could not be
consistently extended to non-trivial languages, and then to show how a
consistent (and very beautiful) theory was in fact possible if domains
(instead of arbitrary sets) were used to model denotations.

The problem with inconsistent formalizations (like Strachey's
potentially was) is that they provide no basis for reliable reasoning. For
example, consider a proof of 1 = 2 that proceeds by first assuming 0 = 1
and then adding 1 to both sides - proofs of properties of programs from
inconsistent formalizations are just like this!
Of course the early inconsistent theories were not without value; for
one thing they prompted the discovery of the modern consistent versions.
It is usually easier to formalize ideas than to think of them and the early
pioneers of semantics must take the credit for originating the general ap-
proach if not the fine details. There are plenty of other examples of in-
consistent ideas being useful-for example the use of infinitesimals in
calculus and the use of Dirac delta-functions in physics; both these were
valuable for years (indeed centuries) before consistent models of them
were devised.
Thus although the idea of denotational semantics is quite old it was
nevertheless a major breakthrough when in 1969 Scott discovered his
theory; for only then could we start to use semantic models as the basis
for trustworthy proofs. Furthermore his theory has enormously sharpened
our intuitions about the concepts involved and it is inconceivable that the
descriptive techniques described in this book could have advanced as fast
as they have without it. Also it seems that future developments in seman-
tics (for example to handle concurrency [Milne and Milner]) will be even
more dependent on the underlying theory.

3.2.4. The role of mathematics in the book


We shall not discuss the mathematics involved in Scott's theory at all; our
approach to recursive equations is similar to an engineer's approach to
differential equations, namely we assume they have solutions but don't
bother with the mathematical justification. In practice this is perfectly
satisfactory for the kinds of things we discuss in this book; one only needs
to dabble in the mathematics if:
(i) One wishes to perform rigorous proofs.
(ii) One wishes to devise descriptive techniques for radically different
kinds of constructs (for example involving concurrency).

3.3. Defining domains

In this section we explain all the ways of defining domains we shall need.
D, D', D1, D2, ... etc. will stand for arbitrary domains, and the names of
all particular domains will start with a capital letter (for example: Num,
Value, State).

3.3.1. Standard domains

The following domains are standard and will be used without further
explanation:
numbers: Num = {0, 1, 2, ...}
truth values: Bool = {true, false}
identifiers: Ide = {I | I is a string of letters or digits beginning with a
letter}

3.3.2. Finite domains

Finite domains will be defined explicitly by listing their elements, for ex-
ample:
Spectrum = {red, orange, yellow, green, blue, indigo, violet}

3.3.3. Domain constructors


We shall build domains out of standard, or finite, domains using the
various domain constructors described below. Since we are avoiding the
underlying mathematics some of the descriptions will be slightly over-
simplified; these oversimplifications only concern technical details and do
not affect the main ideas.

3.3.3.1. Function space [D1 → D2]

[D1 → D2] is the domain of functions from D1 to D2:
[D1 → D2] = {f | f: D1 → D2}
Comments (i) f∈[D1 → D2] if and only if f: D1 → D2 and we say "f has
type D1 → D2". In general for any domain D (not neces-
sarily a function space) if d∈D we say "d has type D"
and write "d: D".
(ii) If f∈[D1 → D2] then D1 is called the source of f and D2
the target of f (sometimes "domain" and "range" are
used instead of "source" and "target" - but this con-
flicts with our other uses of the word "domain").
(iii) We may write D1 → D2 instead of [D1 → D2]; by conven-
tion → associates to the right so, for example,
D1 → D2 → D3 → D4 means [D1 → [D2 → [D3 → D4]]].
(iv) In Scott's theory [D1 → D2] is defined to be the set of all
functions which 'preserve the structure' of the domains;
thus strictly speaking not every function is in [D1 → D2].
In practice all the functions we shall use - indeed all the
functions our notations allow us to define - do 'preserve
structure' and no problems arise. Our assumption that
all functions are in [D1 → D2] is analogous to the
engineer's assumption that all functions are differenti-
able.

3.3.3.2. Product [D1 × D2 × ... × Dn]

[D1 × D2 × ... × Dn] is the domain of all n-tuples (d1, d2, ..., dn) of ele-
ments d1∈D1, d2∈D2, ..., dn∈Dn. If d∈[D1 × D2 × ... × Dn] then el i d is the
ith coordinate of d. Thus
[D1 × D2 × ... × Dn] = {(d1, d2, ..., dn) | d1∈D1, d2∈D2, ..., dn∈Dn}
el i (d1, d2, ..., dn) = di
d = (el 1 d, el 2 d, ..., el n d)

3.3.3.3. Sequences D*

D* is the domain of all finite sequences of elements of D. If d∈D* then
either d is the empty sequence () or d = (d1, d2, ..., dn) where n > 0 and
each di is a member of D. As with the product, el i d is the ith coordinate
of d; hd d is the first member of d (thus hd d = el 1 d) and tl d is the
sequence consisting of all but the first element of d; null d is true if d is
the empty string and false otherwise. Thus:

D* = {(d1, d2, ..., dn) | 0 < n, di∈D} + {()}

el i (d1, d2, ..., dn) = di
hd (d1, d2, ..., dn) = d1
tl (d1, d2, ..., dn) = (d2, ..., dn)

null d = true   if d = ()
         false  otherwise

Comments (i) el, hd, tl and null can be thought of as functions with
types:
el: Num → D* → D
hd: D* → D
tl: D* → D*
null: D* → Bool
el, hd and tl only make sense when applied to non-
empty sequences.
(ii) An alternative notation for the sequence (d1, d2, ..., dn)
is d1·d2· ... ·dn. Also if d∈D and
d* = (d1, d2, ..., dn)∈D* then d·d* is the sequence
(d, d1, d2, ..., dn). When using this notation we may
refer to sequences as strings.
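With sequences represented as tuples, the four operations can be sketched in a few lines (a hypothetical rendering of ours, keeping the text's 1-based indexing for el):

```python
def hd(d): return d[0]          # first member (only for non-empty sequences)
def tl(d): return d[1:]         # all but the first member
def null(d): return d == ()     # true exactly on the empty sequence
def el(i, d): return d[i - 1]   # ith coordinate, counting from 1 as in the text

s = (10, 20, 30)
assert hd(s) == 10 and tl(s) == (20, 30)
assert el(2, s) == 20
assert null(()) and not null(s)
```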

3.3.3.4. Sum [D1 + D2 + ... + Dn]

Each member of [D1 + D2 + ... + Dn] corresponds to exactly one member
of some Di. The difference between the sum [D1 + D2 + ... + Dn] and the
union of D1, ..., Dn is that if d is in the union and d∈Di and d∈Dj (for some
i ≠ j) then it does not make sense to ask if d comes from Di or Dj. Each
member of [D1 + D2 + ... + Dn] on the other hand is 'flagged' to indicate
which domain it comes from. To make this clear we define

[D1 + D2 + ... + Dn] = {(di, i) | di∈Di, 0 < i < n + 1}

so that if d∈Di and d∈Dj then (d, i) represents d considered as a member
of Di and (d, j) represents d considered as a member of Dj.
If D = [D1 + D2 + ... + Dn] and d∈D then isDi d is true if d corresponds
to a member of Di and false otherwise. If isDi d = true then outDi d is the
member of Di corresponding to d. If di∈Di then inDi di is the member of D
corresponding to di. In summary:

isDi d = true   if d is of the form (di, i)
         false  otherwise

outDi d = di            if d is of the form (di, i)
          'undefined'   otherwise

inDi di = (di, i)

Comments (i) isDi, outDi and inDi can be thought of as functions
with types:
isDi: [D1 + D2 + ... + Dn] → Bool
outDi: [D1 + D2 + ... + Dn] → Di
inDi: Di → [D1 + D2 + ... + Dn]
(ii) If two summands of a sum have the same name then
this notation breaks down. For example if
D = Num + Num + Bool then isNum d is not well de-
fined. We shall always have summands with distinct
names - if one wants to avoid this restriction then a
different notation must be used (for example is1, is2,
is3 etc. as tests for the first, second and third sum-
mands).
(iii) To avoid having to clutter up expressions with inDi and
outDi we shall usually allow context to determine
whether an element is in Di or in D where
D = [D1 + D2 + ... + Dn]. We thus adopt the following
conventions, which are analogous to the 'coercions' of
programming languages.
(a) If di∈Di occurs in a context requiring a member of
D = [D1 + D2 + ... + Dn] then we interpret the
occurrence of di as inDi di. For example if n∈Num,
Value = Num + Bool and f: Value → D (some D)
then "f n" really means "f (inNum n)".
(b) If d∈[D1 + D2 + ... + Dn] occurs in a context re-
quiring a member of Di then we interpret the occur-
rence of d as outDi d. For example if
f: D → [Num + Bool] and g: Bool → D' then "g(f d)"
really means "g(outBool (f d))".
(iv) The domain D* of finite sequences can be thought of as
an infinite sum
D* = {()} + D + [D × D] + [D × D × D] + ...
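The 'flagging' of sum elements is exactly what a programmer would call tagging. A small Python sketch of inDi, isDi and outDi (ours; tags are strings naming the summand, and None stands in for 'undefined'):

```python
# A member of [D1 + D2 + ... + Dn] is modelled as a pair (value, tag).
def in_d(tag):  return lambda d: (d, tag)
def is_d(tag):  return lambda x: x[1] == tag
def out_d(tag): return lambda x: x[0] if x[1] == tag else None  # 'undefined'

inNum, isNum, outNum = in_d('Num'), is_d('Num'), out_d('Num')
inBool, isBool, outBool = in_d('Bool'), is_d('Bool'), out_d('Bool')

v = inNum(3)                 # 3 considered as a member of Value = Num + Bool
assert isNum(v) and not isBool(v)
assert outNum(v) == 3
assert outBool(v) is None    # 'undefined': v does not come from Bool
```

Note that the tag, not the value, decides membership: inNum(3) and a hypothetical inBool(3) would be different members of the sum, which is precisely the difference between sum and union described above.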

3.3.4. Domain equations

We discussed in 3.2.2. why we need to be able to define domains by re-
cursive equations. In general such equations have the form
D1 = T1[D1, ..., Dn]
D2 = T2[D1, ..., Dn]
...
Dn = Tn[D1, ..., Dn]

where each Ti[D1, ..., Dn] is some expression built out of D1, ..., Dn
and finite or standard domains using the domain constructors →, ×, *, +
described above. The example we met in 3.2.2. was
State = Memory × Input × Output
Memory = Ide → [Value + {unbound}]
Input = Value*
Output = Value*
Value = Num + Bool + Proc
Proc = State → [State + {error}]
An example of a single domain equation is the domain Dω of infinite
sequences of members of D, defined by:
Dω = [D × Dω]

If s∈Dω then s = (d1, s1)∈[D × Dω] where s1 = (d2, s2)∈[D × Dω] where
s2 = (d3, s3)∈[D × Dω] ... i.e. s = (d1, (d2, (d3, ...))).
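A member of Dω can be modelled concretely as a pair whose tail is computed on demand. The sketch below (ours) hides each tail behind a zero-argument function (a thunk), so s = (d1, (d2, (d3, ...))) is unfolded only as far as we ask:

```python
# An infinite sequence is a thunk returning (head, tail-thunk).
def ones():
    return (1, ones)                   # the sequence (1, (1, (1, ...)))

def nats_from(n):
    return (n, lambda: nats_from(n + 1))

def take(n, seq_thunk):
    """First n elements of an infinite sequence."""
    out = []
    for _ in range(n):
        d, tail = seq_thunk()
        out.append(d)
        seq_thunk = tail
    return out

assert take(4, ones) == [1, 1, 1, 1]
assert take(4, lambda: nats_from(0)) == [0, 1, 2, 3]
```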

3.4. Functions
The word "function" has a number of different meanings which, if not
carefully distinguished, can lead to total confusion. It is especially im-
portant to distinguish mathematical functions from the so-called "func-
tions" which occur as constructs in several programming languages (for
example PASCAL, LISP and POP-2).
(i) A mathematical function f with source A and target B (which we
write f: A → B) is a set of pairs (a, b)∈A × B such that if (a, b1)∈f
and (a, b2)∈f then b1 = b2. If (a, b)∈f we write f a = b and thus
f = {(a, f a) | a∈A}. In the notation f: A → B the expression "A → B"
is called the type of f and denotes the set of all functions from
A to B.
(ii) A 'function' in a programming language like PASCAL is a
construct which describes a rule for transforming an argument -
the actual parameter - to a result.
Now if F is a function construct (as in (ii)) then one can associate with it
a mathematical function f: {actual parameters} → {results} defined by
f x = result of calling F on parameter x.

Indeed a simple denotational semantics might give f as the denotation of F
(i.e. F[F] = f). However note that:
(a) Many different function constructs can denote the same mathe-
matical function (for example different algorithms for sorting a list).
(b) A function construct is a finite object whereas a mathematical
function is typically an infinite set - for example the factorial
function is the infinite set
{(0, 1), (1, 1), (2, 2), (3, 6), (4, 24), ...}
(c) As we shall see later, except in simple cases, it will not be satis-
factory to take the denotation of a function construct to be the
mathematical function defined by its input/output behaviour. To
handle side-effects, jumps out of the function's body etc. we will
need more complicated denotations.

Unless the contrary is clear from context, the word "function" in this book
means "mathematical function".
We shall use two main techniques for defining functions:
(i) λ-notation (see 3.4.1.).
(ii) Definition by recursion (see 3.4.9.).

3.4.1. λ-notation

3.4.1.1. Basic idea


Suppose E[x] is some expression involving x such that whenever d ∈ D is substituted for x (and we shall denote the result of such a substitution by E[d]) the resulting expression (namely E[d]) denotes a member of D'. For example if both D and D' are Num then E[x] could be x + 1 (so E[5] = 5 + 1 = 6) or perhaps x × x (then E[5] = 5 × 5 = 25). For such expressions the notation:

λx. E[x]

denotes the function f: D→D' such that:

for all d ∈ D: f d = E[d]

For example:

(i) λx. x + 1 denotes the successor function of type Num→Num.
(ii) λx. x × x denotes the squaring function of type Num→Num.
(iii) λx. (x = 0)→true, false denotes the test-for-zero function of type Num→Bool.

An expression of the form λx. E[x] is called a λ-expression, x is its bound variable and E[x] its body.
This is the central idea of λ-notation; the various elaborations described below are just to make the descriptions of complicated functions a bit more intelligible.
N.B. The body of a λ-expression always extends as far to the right as possible, thus λx. x + 1 is λx. (x + 1) not (λx. x) + 1.
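These λ-expressions can be transcribed directly into any programming language with first-class 'functions'; here is a small Python sketch (the names succ, square and is_zero are ours, not part of the notation):

```python
# λx. x + 1, λx. x × x and λx. (x = 0) → true, false as Python lambdas
succ = lambda x: x + 1        # successor function, Num → Num
square = lambda x: x * x      # squaring function, Num → Num
is_zero = lambda x: x == 0    # test-for-zero, Num → Bool

print(succ(5), square(5), is_zero(0))   # 6 25 True
```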

3.4.1.2. Elaborations

3.4.1.2.1. Explicitly indicating source and/or target


Sometimes it is not clear what the source (D say) or target (D' say) of a function defined by a λ-expression are (for example consider λx. x); in such cases one can, if necessary, indicate the desired information by writing :D→D' after the λ-expression. For example λx. x: Num→Num, λx. x: Bool→Bool, λx. x + 1: Num→Num etc.
General Concepts and Notation 37

Often the target but not the source is clear; in such cases one can write :D after the bound variable. For example λx:Num. x, λx:Bool. x, λx:Num. x + 1 etc.

3.4.1.2.2. More than one argument


If E[x₁, ..., xₙ] is an expression having a value in D' when x₁ ∈ D₁, ..., xₙ ∈ Dₙ then the notation

λ(x₁, ..., xₙ). E[x₁, ..., xₙ]

is used to describe the function f: [D₁ × ... × Dₙ]→D' such that:

for all d₁ ∈ D₁, ..., dₙ ∈ Dₙ: f(d₁, ..., dₙ) = E[d₁, ..., dₙ]

Examples are:

(i) λ(x₁, x₂). x₁ + x₂: the addition function of type [Num × Num]→Num
(ii) λ(x₁, x₂). x₁ < x₂→error, x₁ − x₂: the subtraction function of type [Num × Num]→[Num + {error}]

This notation for more than one argument can be combined with explicit indication of source or target, for example

λ(x₁, x₂). x₁ + x₂: [[Num × Num]→Num]
or λ(x₁, x₂): [Num × Num]. x₁ + x₂

Functions like this can either be thought of as having many arguments or, perhaps more elegantly, as having just one argument which is a tuple.

3.4.1.3. Applying λ-expressions to arguments


Just as we can form expressions like "f 1" to denote the application of f to 1 so we can form expressions in which λ-expressions are applied to arguments, for example:

(λx. x + 1) 2 = 2 + 1 = 3
(λ(x, y). x + y)(2, 3) = 2 + 3 = 5

When 'evaluating' (λx. E[x]) a to E[a] one must only substitute a for those occurrences of x in E[x] which are not bound by inner λ's. For example (λx. (λx. x)) a evaluates to λx. x not λx. a.

3.4.1.4. Changing bound variables


The meaning of a λ-expression does not depend on the names of its bound variables: λx. x + 1, λn. n + 1, λm. m + 1 all denote the same function. In general bound variables can be renamed as long as the new name does not occur elsewhere in the λ-expression; if the new name does occur elsewhere then 'variable capture' may result and the meaning of the λ-expression change. An example of variable capture occurs if we rename x in λx. (λy. x) to y to get λy. (λy. y): here the x is initially bound by the outer λ but after being renamed to y gets 'captured' by the inner λ and this capturing changes the meaning, thus:

((λx. (λy. x)) 1) 2 = (λy. 1) 2 = 1
((λy. (λy. y)) 1) 2 = (λy. y) 2 = 2

It is quite tricky (but not really difficult) to spell out exactly the general conditions under which variables get captured (see [Stoy]) but fortunately it is usually obvious in particular cases.

3.4.2. Higher order functions


The example in the previous section, λx. (λy. x), is an example of a λ-expression which denotes a higher order function. Higher order functions are functions whose source or target contains functions. Thus λx. (λy. x) has a type of the form D₁→[D₂→D₁] and so the result of applying λx. (λy. x) to an argument, d₁ say, is a member of [D₂→D₁] (λy. d₁ to be exact), i.e. a function.

Another example of a higher order function is

twice: [[Num→Num]→[Num→Num]]

defined by twice f = λx. f(f x) (i.e. twice = λf. (λx. f(f x))). twice f is a function which does f twice, for example twice (λn. n + 1) is a function which applies λn. n + 1 twice, i.e. adds 1 twice, i.e. adds 2. Thus twice (λn. n + 1) = λx. x + 2; this can be verified formally as follows:
twice (λn. n + 1) = λx. (λn. n + 1) ((λn. n + 1) x)
= λx. (λn. n + 1) (x + 1)
= λx. (x + 1) + 1
= λx. x + 2
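The same calculation can be checked operationally; a minimal Python sketch of twice (the function names are ours):

```python
def twice(f):
    # twice f = λx. f(f x)
    return lambda x: f(f(x))

succ = lambda n: n + 1
add2 = twice(succ)     # behaves like λx. x + 2
print(add2(5))         # 7
```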

Higher order functions are very useful and we shall frequently use them; for example the semantic functions for TINY were higher order: C: Com→[State→[State + {error}]] so C[C] (the result of applying C to the command C) is a function. Higher order functions are perfectly ordinary and everything we say about functions in general applies to them also. The reason why we singled them out for discussion is because in many programming languages (for example ALGOL 60 and PASCAL) 'functions' (as opposed to mathematical functions) are subject to various constraints. For example in neither ALGOL 60 nor PASCAL can 'functions' return 'functions' as results (so twice could not be programmed). These constraints are to enable efficient implementations of the languages to be possible and do not reflect anything inherent in the concept of a function. Indeed other (less efficient) languages like LISP or POP-2 allow the unrestricted programming of higher order 'functions'. As discussed earlier (at the beginning of 3.4.) one must be careful not to confuse the mathematical concept of a function with the various 'functions' occurring in programming languages.

3.4.3. Important notational conventions on precedence and association

(i) f x₁ x₂ ... xₙ means (...((f x₁) x₂)...). For example f g x means (f g) x not f(g x). Thus application associates to the left.
(ii) f;x₁;x₂;...;xₙ means f(x₁(x₂(...(xₙ₋₁ xₙ)...))). For example f;g;x means f(g x) not (f g) x. Thus ";" denotes application but unlike juxtaposition associates to the right.
(iii) f₁ x₁₁...x₁ₙ; f₂ x₂₁...x₂ₙ; ...; fₙ xₙ₁...xₙₙ means (f₁ x₁₁...x₁ₙ);(f₂ x₂₁...x₂ₙ);...;(fₙ xₙ₁...xₙₙ). For example f g;x means (f g);x i.e. (f g) x and f;g x means f;(g x) i.e. f(g x). Thus ";" binds weaker than ordinary application (as denoted by juxtaposition).
(iv) λx₁ x₂...xₙ. E[x₁, x₂, ..., xₙ] means λx₁. (λx₂. ...(λxₙ. E[x₁, x₂, ..., xₙ])...). This notation can be extended to functions which take tuples as arguments, thus for example λx₁ (x₂, x₃) x₄. E[x₁, x₂, x₃, x₄] means λx₁. (λ(x₂, x₃). (λx₄. E[x₁, x₂, x₃, x₄])).
(v) Conventions (i) and (iv) cooperate nicely together, for example (λx₁ x₂...xₙ. E[x₁, x₂, ..., xₙ]) d₁ d₂...dₙ = E[d₁, d₂, ..., dₙ]
(vi) D₁→D₂→...→Dₙ means [D₁→[D₂→[...[Dₙ₋₁→Dₙ]...]]] (see 3.3.3.3.).
(vii) The body of a λ-expression always extends as far to the right as possible, for example λx. f x y z means λx. (f x y z) not (λx. f) x y z.

3.4.4. Currying
Consider the two functions plus and plusc defined below:

(i) plus: [Num × Num]→Num    plus = λ(n, m). n + m
(ii) plusc: Num→Num→Num    plusc = λn m. n + m

These functions are related in that plus (n, m) = plusc n m. It will often be convenient to use higher order functions like plusc which take their arguments 'one at a time' instead of plus which takes a pair (or more generally an n-tuple) as an argument. The advantage of plusc is that it can be applied to just one argument. For example plusc 1 is a well formed expression denoting the function λm. 1 + m, plusc 2 denotes λm. 2 + m etc.; plus 1, plus 2 do not make sense as 1, 2 are not in the source [Num × Num] of plus. Another example is the semantic function C of TINY; because

C: Com→State→[State + {error}]

we can use C[C] s to denote the result of applying the denotation of C to s; if C had type [Com × State]→[State + {error}] this expression would not make sense and we would have to use λs. C[C, s].
Functions of more than one argument which take them 'one at a time' like plusc and C are called curried (after the logician Haskell Curry). In general if f: [D₁ × D₂ × ... × Dₙ]→D then there is an equivalent curried function, curry f say, of type D₁→D₂→...→Dₙ→D defined by

curry f x₁ x₂ ... xₙ = f(x₁, x₂, ..., xₙ)
curry itself is a higher order function defined by:

curry: [[D₁ × D₂ × ... × Dₙ]→D]→[D₁→D₂→...→Dₙ→D]
curry = λf x₁ x₂...xₙ. f(x₁, x₂, ..., xₙ)

For example: curry plus = (λf x₁ x₂. f(x₁, x₂)) plus
= (λf. (λx₁ x₂. f(x₁, x₂))) plus
= λx₁ x₂. plus (x₁, x₂)
= λx₁ x₂. x₁ + x₂
= plusc

There is an inverse to curry called uncurry defined by:

uncurry: [D₁→D₂→...→Dₙ→D]→[[D₁ × D₂ × ... × Dₙ]→D]
uncurry = λf (x₁, x₂, ..., xₙ). f x₁ x₂...xₙ

The reader might like to show that uncurry plusc = plus and that for all f, f' of the appropriate types curry (uncurry f) = f and uncurry (curry f') = f'.
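For the two-argument case, curry and uncurry can be sketched in Python (representing a pair-taking function as one that takes a 2-tuple; all names are ours):

```python
def curry(f):
    # curry f = λx. λy. f(x, y)
    return lambda x: lambda y: f((x, y))

def uncurry(g):
    # uncurry g = λ(x, y). g x y
    return lambda pair: g(pair[0])(pair[1])

plus = lambda pair: pair[0] + pair[1]   # plus: takes a pair
plusc = curry(plus)                     # plusc: takes arguments one at a time
print(plusc(1)(2))                      # 3
print(uncurry(plusc)((1, 2)))           # 3
```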

3.4.5. Conditionals
A very important standard function is the conditional cond defined by:

cond: [D × D]→Bool→D    (D arbitrary)

cond (d₁, d₂) b = d₁  if b = true
                  d₂  if b = false

We use the following notations:

(i) b→d₁, d₂ means cond (d₁, d₂) b
(ii) b₁→d₁, b₂→d₂, ..., bₙ→dₙ, dₙ₊₁ means b₁→d₁, (b₂→d₂, (...(bₙ→dₙ, dₙ₊₁)...)). If b₁, ..., bₙ are mutually exclusive we may omit dₙ₊₁ and write b₁→d₁, ..., bₙ→dₙ.

Strictly speaking there is a conditional function, cond D say, for each domain D. Thus cond Num: [Num × Num]→Bool→Num is different from cond Ide: [Ide × Ide]→Bool→Ide etc. In practice when we write b→d₁, d₂ it should be clear from context which domain is intended.
An important property of conditionals is: (b→f, g) x = b→f x, g x. Here f, g must have types of the form D₁→D₂, so the left hand conditional has type [[D₁→D₂] × [D₁→D₂]]→Bool→[D₁→D₂] and the right hand conditional has type [D₂ × D₂]→Bool→D₂. Using cond instead of → we have:

cond [D₁→D₂] (f, g) b x = cond D₂ (f x, g x) b.
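Both cond and the property just stated are easy to reproduce in Python (a sketch with names of our own choosing):

```python
def cond(d1, d2):
    # cond (d1, d2) = λb. b → d1, d2   (works for d1, d2 from any domain)
    return lambda b: d1 if b else d2

# The property (b → f, g) x = b → f x, g x, with f and g functions:
f = lambda x: x + 1
g = lambda x: x * 2
for b in (True, False):
    assert cond(f, g)(b)(10) == cond(f(10), g(10))(b)
```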

3.4.6. Cases notation


We shall illustrate cases notation with a couple of examples and hope the reader gets the general idea from these.
Suppose D = [Value × State] + {error} and d ∈ D; then the expression:

(d = (v, s))→E₁[v, s],
(d = error)→E₂

means: if d corresponds to a member (v, s) of [Value × State] then E₁[v, s], else if d = error then E₂.
As another example suppose D = D₁ + D₂ + D₃ where

D₁ = D₁₁ × D₁₂
D₂ = {d₂}
D₃ = D₃₁ × D₃₂ × D₃₃

then the expression:

(d = (d₁₁, d₁₂))→E₁[d₁₁, d₁₂],
(d = d₂)→E₂,
(d = (d₃₁, d₃₂, d₃₃))→E₃[d₃₁, d₃₂, d₃₃], E₄

means: if d corresponds to (d₁₁, d₁₂) ∈ D₁ then E₁[d₁₁, d₁₂], if d corresponds to d₂ ∈ D₂ then E₂, if d corresponds to (d₃₁, d₃₂, d₃₃) ∈ D₃ then E₃[d₃₁, d₃₂, d₃₃], else E₄. This second cases statement is equivalent to

isD₁ d→(λ(d₁₁, d₁₂). E₁[d₁₁, d₁₂]) (outD₁ d),
isD₂ d→E₂,
isD₃ d→(λ(d₃₁, d₃₂, d₃₃). E₃[d₃₁, d₃₂, d₃₃]) (outD₃ d), E₄

This shows that cases notation both tests values and binds their components to variables. Examples of the use of cases notation are the semantic clauses for TINY above.
3.4.7. Updating functions


If f: D→D', d₁, ..., dₙ ∈ D and d₁', ..., dₙ' ∈ D' then f[d₁', ..., dₙ'/d₁, ..., dₙ] denotes the function identical to f except at d₁, ..., dₙ where it has values d₁', ..., dₙ' respectively. Thus:

f[d₁', ..., dₙ'/d₁, ..., dₙ] = λd. d = d₁→d₁',
                                   d = d₂→d₂',
                                   ...
                                   d = dₙ→dₙ', f d
3.4.8. Generic functions


Functions like cond which, strictly speaking, are collections of functions, one for each domain of an appropriate type, are called generic. Other examples are curry and uncurry. The actual functions which make up the collections are called instances; thus cond Num and cond Ide are instances of cond etc. The expressions which describe the types of the instances of generic functions are called generic types. For example [D × D]→Bool→D is the generic type of cond; an instance of this type with D = Num is [Num × Num]→Bool→Num. The generic type of curry is [[D₁ × D₂ × ... × Dₙ]→D]→[D₁→D₂→...→Dₙ→D]; in curry plus (see 3.4.4.) it is used at instance

[[Num × Num]→Num]→[Num→Num→Num].

3.4.9. Ways of defining functions (including recursion)


A typical definition of a function f has the form

f(x₁, ..., xₙ) = E[x₁, ..., xₙ]
or f x₁ ... xₙ = E[x₁, ..., xₙ] if f is curried

For example:

plus (x, y) = x + y
plusc x y = x + y

An equivalent (and sometimes more convenient) way of writing such definitions is as

f = λ(x₁, ..., xₙ). E[x₁, ..., xₙ]
or f = λx₁ ... xₙ. E[x₁, ..., xₙ]

For example

plus = λ(x, y). x + y
plusc = λx y. x + y

We shall often need to define functions recursively, for example we might define fact: Num→Num thus:

fact n = (n = 0)→1, n × fact (n − 1)
or fact = λn. (n = 0)→1, n × fact (n − 1)

We also sometimes need to define several functions by mutual recursion. If we use λ-expressions (as in the second definition of fact above) then the general form of mutually recursive definitions is:

f₁ = E₁[f₁, ..., fₙ]
f₂ = E₂[f₁, ..., fₙ]
...
fₙ = Eₙ[f₁, ..., fₙ]

where the Eᵢ[f₁, ..., fₙ] are λ-expressions. For example if E[f] is λn. (n = 0)→1, n × f(n − 1) then the second definition of fact above is: fact = E[fact]. We thus see fact is a 'fixed point' of the function λf. E[f], i.e. (λf. E[f]) fact = fact. λf. E[f] has type [Num→Num]→[Num→Num] and its 'fixed point' fact has type [Num→Num]. If we introduce a generic function fix of type [D→D]→D defined intuitively by:

fix f = the fixed point of f

then the recursive definition of fact can be written non-recursively as

fact = fix (λf n. (n = 0)→1, n × f(n − 1))

Here fix is used at the instance D = [Num→Num] of its generic type [D→D]→D.
The precise definition, and mathematical analysis, of the function fix constitute the "theory of fixed points"; this was discussed in 3.2.1. We shall not use fix but will define things recursively instead.
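For intuition only, fix can be sketched operationally in Python; this recursive unwinding is a programming trick, not the domain-theoretic construction of 3.2.1.:

```python
def fix(F):
    # fix F = F (fix F); the lambda delays the recursion so it
    # only unwinds one step each time the fixed point is applied
    return lambda n: F(fix(F))(n)

# fact = fix (λf n. (n = 0) → 1, n × f(n − 1))
fact = fix(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))
print(fact(4))   # 24
```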

3.4.10. Cancelling out variables


Consider the equation f x = g x; if this holds for all x then it follows that f = g. In general an equation like

f x₁ ... xᵢ ... xₙ = (E[x₁, ..., xᵢ]) xᵢ₊₁ ... xₙ

can be simplified to

f x₁ ... xᵢ = E[x₁, ..., xᵢ]

as long as xᵢ₊₁, ..., xₙ do not occur in E[x₁, ..., xᵢ]. A common example of such simplifications is with definitions like

f x y = p x→g x y, h x y

Using the fact that p x→g x y, h x y = (p x→g x, h x) y we can reduce this equation to

f x y = (p x→g x, h x) y

and then cancel y to get:

f x = p x→g x, h x

Now the definition:

f x₁ ... xₙ = (E[x₁, ..., xᵢ]) xᵢ₊₁ ... xₙ

is equivalent to

f = λx₁ ... xₙ. (E[x₁, ..., xᵢ]) xᵢ₊₁ ... xₙ

and just as the first definition can be simplified to

f x₁ ... xᵢ = E[x₁, ..., xᵢ]

so the second one can be simplified to

f = λx₁ ... xᵢ. E[x₁, ..., xᵢ]

In general λx₁ ... xᵢ ... xₙ. (E[x₁, ..., xᵢ]) xᵢ₊₁ ... xₙ can be simplified to λx₁ ... xᵢ. E[x₁, ..., xᵢ] as long as xᵢ₊₁, ..., xₙ do not occur in E[x₁, ..., xᵢ]. For example λx y. p x→g x y, h x y can be simplified to λx. p x→g x, h x.
N.B. When using ";" (see 3.4.3.) one must be careful with cancelling. For example if f x = g;h;x then it does not follow that f = g;h. Similarly λx. g;h;x does not simplify to g;h.

3.4.11. where notation


Sometimes it is convenient to 'structure' expressions by writing

E[x₁, ..., xₙ] where x₁ = E₁
and x₂ = E₂
...
and xₙ = Eₙ

instead of E[E₁, ..., Eₙ]. For example

x₁ × x₂ where x₁ = 2 + 3
and x₂ = 6 − 7

instead of (2 + 3) × (6 − 7). Use of where can sometimes shorten expressions and highlights essential aspects of their structure, for example

C[C₁; C₂] s = (s' = error)→error, C[C₂] s'
where s' = C[C₁] s

One can also use where for 'subsidiary' functions, for example

(f 1) + (f 2) where f x = x × x

this means ((λx. x × x) 1) + ((λx. x × x) 2) = 5

If the subsidiary function is recursive we draw attention to that fact by using "whererec" instead of "where". For example

fact 6 whererec fact n = (n = 0)→1, n × fact (n − 1)

If several subsidiary functions are to be mutually recursive then we write

E[f₁, ..., fₙ] whererec f₁ = E₁[f₁, ..., fₙ]
and f₂ = E₂[f₁, ..., fₙ]
...
and fₙ = Eₙ[f₁, ..., fₙ]

(here we assume the subsidiary functions are being defined as λ-expressions but this is not necessary).
3.4.12. Composition and sequencing

3.4.12.1. Composition

If f: D₁→D₂ and g: D₂→D₃ then their composition g∘f: D₁→D₃ is defined by g∘f = λx. g(f x). Notice that in g∘f, f is 'done first'.

3.4.12.2. Sequencing
We are going to define f*g where f and g have a variety of types; before exact details here is the essential idea: f*g is the function corresponding to first doing f; if f produces an error then error is the result of f*g, otherwise the results of f are 'fed to' g. f*g is like g∘f except that

(i) Errors are propagated.
(ii) g may be curried.

Here now are the cases we shall need:

(a) Suppose f: D₁→[D₂ + {error}] and g: D₂→[D₃ + {error}]; then f*g: D₁→[D₃ + {error}] is defined by:

f*g = λx. f x = error→error, g(f x)

(b) Suppose f: D₁→[[D₂ × D₃] + {error}] and g: D₂→D₃→[D₄ + {error}]; then f*g: D₁→[D₄ + {error}] is defined by:

f*g = λx. (f x = error)→error, (f x = (d₂, d₃))→g d₂ d₃

To see the use of * observe that the semantic clause:

C[C₁; C₂] s = (C[C₁] s = error)→error, C[C₂] (C[C₁] s)

can be simplified to

C[C₁; C₂] = C[C₁] * C[C₂]

Here we have an example of case (a) with D₁ = D₂ = D₃ = State. As another example the semantic clause:

C[output E] s = (E[E] s = error)→error,
(E[E] s = (v, (m, i, o)))→(m, i, v . o)
can be simplified to

C[output E] = E[E] * λv (m, i, o). (m, i, v . o)

Here we have case (b) with D₁ = D₃ = State, D₂ = Value.
As a final example (again of case (b) with D₁ = D₃ = State, D₂ = Value) the semantic clause (E6) in 2.4.4.2. can be written:

E[E₁ = E₂] = E[E₁] * λv₁. E[E₂] * λv₂ s. (v₁ = v₂, s)

The reader is urged to ensure he follows these examples by expanding the *'s. We shall also use f*g when f cannot produce error. Case (a) then corresponds to f*g = g∘f, case (b) to f*g = g∘(uncurry f).
Finally note that:

(i) * is associative so expressions like f*g*h are unambiguous (since f*(g*h) = (f*g)*h).
(ii) Expressions like f*g*λx. h*k mean f*g*(λx. (h*k)); see 3.4.3. (vii).
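Case (a) of * can be sketched in Python; we call the operator seq and encode error as the string "error" (both choices are ours, for illustration only):

```python
ERROR = "error"

def seq(f, g):
    # f * g (case (a)): do f first; propagate error, else feed the result to g
    def h(x):
        r = f(x)
        return ERROR if r == ERROR else g(r)
    return h

inc = lambda s: s + 1
safe_div = lambda s: ERROR if s == 0 else 100 // s
print(seq(inc, safe_div)(4))    # 20
print(seq(safe_div, inc)(0))    # error
```

Note that associativity, f*(g*h) = (f*g)*h, holds for this encoding too: an error raised anywhere cuts off all the later stages.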
4. Denotational description of TINY
The purpose of the battery of abbreviatory conventions described in the preceding chapter is to provide a set of notations which are both powerful and concise. We would like the expressions and definitions we write, semantic clauses for example, to be both brief and readable. Unfortunately there is some conflict between these two requirements: compact notations are often hard to understand, but, conversely, long descriptions often hide the really important facts in a confusing morass of minor detail. Exactly how to trade off compactness versus explicit detail is still a matter of controversy and the reader must decide for himself whether powerful abbreviations (like the sequencing operator * of 3.4.12.2. for example) are a help or hindrance to understanding.
In the rest of this chapter we describe TINY using the notations and con-
cepts presented in chapter 3. We give the new version of the semantics
without comment but indicating previous sections which explain various
notations when they are first used. The new semantics is completely
equivalent to the one given in chapter 2 and so the explanations there
apply here also.

4.1. Abstract syntax (see 3.1.)

4.1.1. Syntactic domains


E ranges over the domain Exp of expressions
C ranges over the domain Com of commands

4.1.2. Syntactic clauses


E ::= 0 | 1 | true | false | read | I | not E | E₁ = E₂ | E₁ + E₂
C ::= I := E | output E | if E then C₁ else C₂ | while E do C | C₁; C₂

4.2. Semantics


4.2.1. Semantic domains (see 3.3.)


State = Memory × Input × Output
Memory = Ide→[Value + {unbound}]
Input = Value*
Output = Value*
Value = Num + Bool

4.2.2. Auxiliary functions (see 3.4.)


These functions are useful for 'factoring out' common parts of the
semantic clauses in 4.2.4.

4.2.2.1. result
Informal description: result v is the denotation of an expression which returns v as value and does not change the state.
Formal description: result: Value→State→[[Value × State] + {error}]
result = λv s. (v, s)
(see (iii) (a) in the comments at the end of 3.3.3.4.).

4.2.2.2. donothing
Informal description: donothing is the denotation of a command which
has no effect.
Formal description: donothing: State→[State + {error}]
donothing = λs. s

4.2.2.3. checkNum
Informal description: checkNum v is the denotation of an expression which returns v and an unchanged state if v ∈ Num and error otherwise.
Formal description:
checkNum: Value→State→[[Value × State] + {error}]
checkNum = λv s. isNum v→(v, s), error
(see 3.3.3.4.).
4.2.2.4. checkBool
Informal description: checkBool v is the denotation of an expression which returns v and an unchanged state if v ∈ Bool and error otherwise.
Formal description:
checkBool: Value→State→[[Value × State] + {error}]
checkBool = λv s. isBool v→(v, s), error

4.2.3. Semantic functions


E: Exp→State→[[Value × State] + {error}]
C: Com→State→[State + {error}]

4.2.4. Semantic clauses

4.2.4.1. Clauses for expressions


(E1) E[0] = result 0,  E[1] = result 1
(E2) E[true] = result true,  E[false] = result false
(E3) E[read] = λ(m, i, o). null i→error, (hd i, (m, tl i, o))  (see 3.3.3.3.)
(E4) E[I] = λ(m, i, o). m I = unbound→error, (m I, (m, i, o))
(E5) E[not E] = E[E] * checkBool * λv. result (not v)  (see 3.4.12.2.)
(E6) E[E₁ = E₂] = E[E₁] * λv₁. E[E₂] * λv₂. result (v₁ = v₂)
(E7) E[E₁ + E₂] = E[E₁] * checkNum * λv₁. E[E₂] * checkNum * λv₂. result (v₁ + v₂)

4.2.4.2. Clauses for commands

(C1) C[I := E] = E[E] * λv (m, i, o). (m[v/I], i, o)
(C2) C[output E] = E[E] * λv (m, i, o). (m, i, v . o)
(C3) C[if E then C₁ else C₂] = E[E] * checkBool * cond (C[C₁], C[C₂])  (see 3.4.5.)
(C4) C[while E do C] = E[E] * checkBool * cond (C[C] * C[while E do C], donothing)
(C5) C[C₁; C₂] = C[C₁] * C[C₂]
5. Standard semantics
The description of TINY given in the last chapter was designed to illustrate the main ideas and notations of denotational semantics. For real languages (such as PASCAL or ALGOL 60) it is necessary to use rather more sophisticated denotations, in particular:

(i) Instead of denotations transforming states 'directly' it is necessary for them to transform states indirectly via continuations (see 5.1. below). This enables jumps to be handled.
(ii) The binding of identifiers to values needs to be split into two parts: a mapping from identifiers to 'variables' and a mapping from 'variables' to values. This enables sharing or aliasing to be handled.

A semantics based on (i) and (ii) is called a standard semantics. If we describe languages using fixed standard techniques then comparisons between languages are made easier. The disadvantage is that for any particular language the 'fit' of the techniques may not be perfect. For example when we come to discuss the 'dynamic binding' of LISP the two-stage binding of identifiers to values of (ii) is not completely appropriate; however the distortion it involves is compensated for by being able to discuss very different languages (for example PASCAL and LISP) within a single framework.
In this chapter we describe the central ideas and techniques of standard semantics. In the next chapter we illustrate these by giving a standard semantics of a slight extension of TINY called SMALL. Then in the rest of the book we apply standard techniques to describe many different kinds of constructs (including most of those occurring in PASCAL and ALGOL 60).

5.1. Continuations
The development of continuations [Strachey and Wadsworth] was an important advance in the descriptive techniques of semantics. It led to:

(i) Simplification, both conceptual and notational, in the description of most constructs.
(ii) Smooth descriptions of various constructs which, previously, were impossible to handle (for example jumps back into already exited procedures, co-routines and 'state saving').

5.1.1. Modelling the 'rest of the program'


Most constructs can be thought of as transforming some input into some result, the exact type of these depending on the construct. In TINY the inputs to both commands and expressions are states and the results are states (or error) in the case of commands and value/state pairs (or error) in the case of expressions.
In the kind of semantics described so far each construct directly denotes its input/result transformation, and the transformation of a complete program is got (roughly) by combining (with *) the transformations of its components. This type of semantics is called a direct semantics and it suffers from the problem that there is no way a construct can avoid passing its result to the rest of the program following it. If a construct produces an abnormal result, say error, then the rest of the program has to cope with this. Thus semantic clauses get cluttered up with tests for abnormal values. Although these tests can be hidden in operators like * the checking involved is unnatural: intuitively when an error occurs the rest of the program is simply ignored and the computation just stops.
In a continuation semantics we make the denotations of constructs depend on the 'rest of the program', or continuation, following them. The intuitive idea is that each construct decides for itself where to pass its result. Usually it will pass it to the continuation corresponding to the 'code' textually following it in the program (the normal continuation) but in some cases this will be ignored and the result passed to some other 'abnormal' continuation. For example:

(i) When an error occurs the normal continuation is ignored and control passes to a continuation corresponding to an error stop.
(ii) When a jump occurs the normal continuation is ignored and control passes to a continuation corresponding to the rest of the program following the label jumped to.

This is all very vague and probably will not be fully comprehensible until we have explained the formal details. Continuations can, at first, be a bit tricky and the reader should not worry if to start with everything seems
back to front and unnaturally higher order; one needs to work through a few concrete examples before thinking with continuations comes naturally.
We now explain exactly what continuations are and which aspects of the 'rest of the program' they model. Formally a continuation is a function from whatever the 'rest of the program' expects to be passed as an intermediate result (and this depends on what the 'rest of the program' follows) to the final answer of the program. For example in the TINY program I := E;C the intermediate result passed to the 'rest of the program' following I := E, i.e. passed to C, is a state and the final answer is a state or error; however the intermediate result passed to the 'rest of the program' following E, i.e. passed to I := ;C, is a value (E's value) together with a state (the state after doing E). The continuations corresponding to the first case, i.e. corresponding to the 'rest of the program' following a command, are called command continuations and form a domain Cont defined by

Cont = State→[State + {error}]

(a command continuation maps the results of commands to final answers). Continuations corresponding to the second case, i.e. corresponding to the 'rest of the program' following expressions, are called expression continuations and form a domain Econt defined by:

Econt = Value→State→[State + {error}]

(an expression continuation maps the results of expressions, curried, to final answers).

We define Econt as above rather than uncurried as [Value × State]→[State + {error}] so that Econt = Value→Cont; this turns out to be very convenient when we come to write semantic clauses.
We shall use c, c', c₁, c₂ etc. for typical members of Cont; k, k', k₁, k₂ etc. will be typical members of Econt. The continuations described above are appropriate for TINY but slightly too simple for general use. In 5.5. we describe domains Cc and Ec of standard command and expression continuations; however for the time being we shall continue to work with Cont and Econt since they embody the essential ideas without unnecessary extra detail.

5.1.2. Direct and continuation semantics


Recall that in our direct semantics of TINY (see 4.) we used semantic functions with the following types:

E: Exp→State→[[Value × State] + {error}]
C: Com→State→[State + {error}]

In a continuation semantics the denotations of constructs are functions of continuations as well as of states. If we (temporarily) use E' and C' for the continuation semantic functions, then their types are:

E': Exp→Econt→State→[State + {error}]
(i.e. E': Exp→Econt→Cont)
C': Com→Cont→State→[State + {error}]
(i.e. C': Com→Cont→Cont)

and are defined so that:

E'[E] k s = k v s'   where E has value v and transforms s to s',
            error    otherwise;
C'[C] c s = c s'     if C transforms s to s',
            error    otherwise

Thus in general the continuation denotation of a construct is a function of a continuation and state which yields the final answer of the program, this final answer being obtained by passing some intermediate results (e.g. v, s' etc.) to some continuation (i.e. to the 'rest of the program'). Normally this continuation will correspond to the program text following the construct, but sometimes other continuations are appropriate. For
example in the case of jumps, commands of the form goto I say, the semantic clause will roughly be:

C'[goto I] c s = c' s   where c' is some continuation specified by I
                        (e.g. bound to I in s)

We shall discuss this kind of thing in detail later; however for the time being we concentrate on explaining what a continuation semantics just for TINY looks like. We start with identifiers.
The continuation semantic clause for identifiers is:

E'[I] k (m, i, o) = (m I = unbound)→error, k (m I) (m, i, o)

Thus the final answer yielded by I when the 'rest of the program' is k and the state is (m, i, o) is error if I is unbound in m, and otherwise is the final answer produced by k when it is passed m I and (m, i, o) as intermediate results. Notice how continuation semantics allows us to explicitly specify final answers (e.g. error) as well as intermediate results (e.g. m I). In the case that m I is unbound we ignore k and immediately stop with error. In a direct semantics one can only specify the intermediate results and so error has to be an intermediate result with the consequence that messy operators like * are then needed to handle it. To see this more clearly here is the semantic clause for C₁;C₂:

C'[C₁; C₂] c s = C'[C₁] (C'[C₂] c) s

This says we evaluate C₁ with a continuation C'[C₂] c (i.e. a 'rest of the program' corresponding to C₂ followed by c). Now if C₁ generates an error then it can simply ignore its continuation C'[C₂] c and 'stop' with final answer error. In a direct semantics denotations cannot choose to not send an intermediate result to the 'rest of the program'. One way of thinking of direct semantics is as a special kind of continuation semantics in which one is always forced to send some intermediate result to the normal (i.e. textual) continuation. This follows from the fact that one can prove (for TINY) that:

for all E ∈ Exp, k ∈ Econt: E'[E] k = E[E] * k
for all C ∈ Com, c ∈ Cont: C'[C] c = C[C] * c

For TINY this actually defines E' and C', but normally one has to write out semantic clauses for continuation semantic functions directly since there
Standard Semantics 57

may be no direct semantics available. For example it is rather hard (though


not impossible) to define C[goto I] and to extend * so that:
C'[goto I] c = C[goto I]*c. The meaning of * (at least as defined in
3.4.12.2.) implies that if C[goto I] s is not error then
(C[goto I]*c) s = c(C[goto I] s), so there is no way to avoid c getting some
intermediate result passed to it. In other words
C'[goto I] c = C[goto I]*c is incompatible with C'[goto I] c ignoring c.
By complicating * it is possible to get round this and to obtain a direct
semantics of goto I (i.e. to concoct a definition of C[goto I] that works)
but the details are messy and only of technical interest (see [Bjorner &
Jones]). For languages in which the flow of control may fail to follow the
textual structure of programs the most natural description is a continua-
tion semantics. From now on all our semantics will use continuations and
we shall drop the primes on continuation semantic functions.

5.1.3. Continuation semantics of TINY


In this section we illustrate the ideas just described by giving a continua-
tion semantics of TINY.

5.1.3.1. Semantic domains and functions


Domains:
State = Memory x Input x Output
Memory = Ide→[Value + {unbound}]
Input = Value*
Output = Value*
Value = Num + Bool
Cont = State→[State + {error}]
Econt = Value→Cont
Functions:
E: Exp→Econt→Cont
C: Com→Cont→Cont
(we no longer use E' and C')
58 The Denotational Description of Programming Languages

5.1.3.2. Semantic clauses


Expressions:
(E1) E[0] k s = k 0 s , E[1] k s = k 1 s
Each numeral passes the appropriate number, together with an un-
changed state, to the continuation. The state is unchanged since evalu-
ating numerals has no side effects. Notice that by cancelling (see
3.4.10.) (E1) can be simplified to:
E[0] k = k 0 , E[1] k = k 1

(E2) E[true] k = k true, E[false] k = k false


(E3) E[read] k (m,i,o) = null i → error, k (hd i) (m,tl i,o)
If the input i is empty then the program stops with final answer error,
otherwise hd i and the side-effected state (m,tl i,o) are passed as inter-
mediate results to k.
(E4) E[I] k (m,i,o) = (m I = unbound) → error, k (m I) (m,i,o)
This was explained above.
(E5) E[not E] k s = E[E](λv s' . isBool v → k (not v) s', error) s
E is evaluated and the resulting value v and state s' are passed to the
continuation (λv s' . isBool v → k (not v) s', error). This continuation
sends not v and s' to k if v∈Bool, otherwise k is ignored and the result is
error. If we define err = λs . error then the techniques of 3.4.10 allow us
to simplify (E5) to:
E[not E] k = E[E] λv . isBool v → k (not v), err
(E6) E[E1 = E2] k = E[E1] λv1 . E[E2] λv2 . k(v1 = v2)
We evaluate E1 to get v1 which is then passed to
λv1 . E[E2] λv2 . k(v1 = v2); this evaluates E2 to get v2 which is then
passed to the continuation λv2 . k(v1 = v2) which passes v1 = v2 to k. If
E1 caused an error then its continuation would be ignored, for example
E[read = E] k (m,(),o) = E[read](...)(m,(),o)
= error (since null () = true; see (E3))

(E7) E[E1 + E2] k =
E[E1] λv1 . E[E2] λv2 . isNum v1 and isNum v2 → k(v1 + v2), err
where err = λs . error; thus E1 is evaluated to get v1, then E2 is evaluated
to get v2, then if v1 and v2 are both numbers v1 + v2 is passed to k,
otherwise k is ignored and error results.
Commands:
(C1) C[I := E] c = E[E] λv(m,i,o) . c(m[v/I],i,o)
E is evaluated and its result passed to the continuation
λv(m,i,o) . c(m[v/I],i,o) which passes (m[v/I],i,o) to c.
(C2) C[output E] c = E[E] λv(m,i,o) . c(m,i,v . o)
The value of E is put on to the front of the output component of the state
and the result passed to c. Further discussion of output is given in the
next section.
(C3) C[if E then C1 else C2] c =
E[E] λv . isBool v → (v → C[C1] c, C[C2] c), err
(C4) C[while E do C] c =
E[E] λv . isBool v → (v → C[C];C[while E do C] c, c), err
As described in 3.4.3. (iii) C[C];C[while E do C] c means
C[C](C[while E do C] c). Thus if E's value is true C is done and the
resulting state passed back to C[while E do C] c ((C4) is recursive). If v is
false then the state resulting from E is sent straight to c.

Thus C[while false do C] c = E[false] λv . isBool v → (v → ..., c), err
= (λv . isBool v → (v → ..., c), err) false
= (false → ..., c)
= c
so while false do C has no effect.
(C5) C[C1;C2] = C[C1] ∘ C[C2]
here ∘ is function composition (see 3.4.12.1.) so (C5) is equivalent to:
C[C1;C2] c s = C[C1](C[C2] c) s
which we discussed above.
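Since continuations are just functions, the clauses above can be animated directly. The following Python sketch (not from the book: the tuple encoding of TINY syntax and all helper names are our own illustrative assumptions) implements (E1)-(E5), (C1), (C3) and (C5):

```python
# A minimal Python sketch of the continuation semantics of a TINY
# fragment, using nested Python functions for continuations. States
# are triples (m, i, o); the encoding is an illustrative assumption.

ERROR = 'error'
err = lambda s: ERROR            # err = \s.error

def E(exp):
    """E[exp] k s: expression semantics with continuation k."""
    tag = exp[0]
    if tag in ('num', 'bool'):              # (E1), (E2): E[0] k = k 0 etc.
        return lambda k: k(exp[1])
    if tag == 'read':                       # (E3): error on empty input
        def sem(k):
            def c(s):
                m, i, o = s
                return ERROR if not i else k(i[0])((m, i[1:], o))
            return c
        return sem
    if tag == 'ide':                        # (E4): look identifier up in m
        def sem(k):
            def c(s):
                m, i, o = s
                return ERROR if exp[1] not in m else k(m[exp[1]])(s)
            return c
        return sem
    if tag == 'not':                        # (E5), simplified form
        return lambda k: E(exp[1])(
            lambda v: k(not v) if isinstance(v, bool) else err)
    raise ValueError(exp)

def C(com):
    """C[com] c s: command semantics with continuation c."""
    tag = com[0]
    if tag == 'assign':                     # (C1): I := E
        def sem(c):
            return E(com[2])(
                lambda v: lambda s: c((dict(s[0], **{com[1]: v}), s[1], s[2])))
        return sem
    if tag == 'if':                         # (C3)
        return lambda c: E(com[1])(
            lambda v: (C(com[2])(c) if v else C(com[3])(c))
                      if isinstance(v, bool) else err)
    if tag == 'seq':                        # (C5): C[C1;C2] = C[C1] o C[C2]
        return lambda c: C(com[1])(C(com[2])(c))
    raise ValueError(com)
```

Running x := read; if x then y := 0 else y := 1 with input [true] and final continuation λs . s yields a memory binding x to true and y to 0; with empty input, read ignores its continuation and the whole answer is error, just as in the discussion above.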

5.1.4. Final answers and output


In this section we point out, and correct, three inaccuracies which have
crept into our semantics. These are:

5.1.4.1. Final answers are not states


In defining continuations we took the domain of final answers to be
[State + {error}]. This is unnatural: in practice the result of running a
program is not thought of as the whole state but just the output (together
with an indication of whether the program halted normally or via an error).

5.1.4.2. Output is not part of the state


By defining State = Memory x Input x Output we made the output file
just as accessible to programs as the memory and input. Usually informa-
tion which has been output cannot be retrieved; the command output E
should put E's value directly on to the final answer rather than pass it to
the rest of the program embedded in the state.

5.1.4.3. Output can be infinite


The definition Output = Value* assumes that the output of programs can
be represented by a finite string of values. This is incorrect for non-
terminating programs like: x := 0; while true do (output x; x := x + 1)
whose output is infinite (namely 0 . 1 . 2 . 3 ...).
All the inaccuracies of the last three sections can be corrected if we re-
move output from the state and change the type of continuations. The
required domain equations are:
State = Memory x Input
Memory = Ide→[Value + {unbound}]
Input = Value*
Value = Num + Bool
Cont = State→Ans
Econt = Value→Cont
Ans = {error, stop} + [Value x Ans]
The domain Ans of final answers is defined recursively (see 3.2.2. and
3.3.4.). A final answer is either error, stop or a value paired with another
answer; 'unwinding' this we see an answer is either a finite string of
values ending with error or stop, or an infinite string of values. Because
output is no longer part of the state we must change the semantic clause
for output E to:
C[output E] c = E[E] λv s . (v, c s)
Thus E is evaluated to get a value v and state s and then the final answer
is v followed by whatever the 'rest of the program' c yields when given s.
This equation models output as like a 'system call' to an output device: if
the program 'crashes' after outputting then the values output will already
be in the final answer (e.g. 'printed on the line printer'). For example if
c = λs . error, a continuation representing a 'rest of the program' that
generates an error for all states, then:
C[output 0] c = E[0] λv s . (v, c s)
= E[0] λv s . (v, error)
= (λv s . (v, error)) 0
= λs . (0, error)
With the old semantics (with output in the state) outputting is like writing
an 'in core' file and so if a 'crash' occurs all previous output will be lost.
For example recall (C2) in 5.1.3.2. - with that definition of C[output E]
we get:
C[output 0] c = E[0] λv(m,i,o) . c(m,i,v . o)
= E[0] λv(m,i,o) . error
= E[0] λv s . error
= (λv s . error) 0
= λs . error
Using the new continuations with Ans we can also express the fact that
when a program has finished it should stop. If C is a command represent-
ing some program to be run then the answer it produces with initial state S
is just C[C](As. stop)s-the 'rest of the program' represented by the
continuation AS. stop just outputs stop and stops. If C were to run on
forever AS. stop would never be reached.
In general the domain Ans of final answers is language dependent: for
some languages (e.g. machine code) it might include the whole state (as
we initially had things) or even the sequence of states gone through
during the computation. Determining Ans is part of the work that has to
be done when we write the semantics of a language.
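The recursively defined Ans can be animated with lazy streams. In the following Python sketch (an illustration under our own encoding, with generators standing in for Ans = {error, stop} + [Value x Ans]) output puts its value straight into the answer, so values output before a 'crash' survive, and a non-terminating program still has a well-defined infinite answer:

```python
# Generators stand in for the recursive domain Ans; every name here is
# an illustrative assumption, not the book's metalanguage.
import itertools

ERROR, STOP = 'error', 'stop'

def err(s):                  # the continuation \s.error
    yield ERROR

def stop(s):                 # the continuation \s.stop
    yield STOP

def output(v):
    """C[output E]-style step: v goes straight into the answer stream
    before the rest of the program c is run."""
    def trans(c):
        def cont(s):
            yield v
            yield from c(s)
        return cont
    return trans

# C[output 0] c with c = \s.error: the answer is (0, error), i.e. the
# 0 is already 'printed on the line printer' when the crash happens.
ans = list(output(0)(err)(None))                       # [0, 'error']

# x := 0; while true do (output x; x := x + 1), written directly as a
# recursive continuation: its answer is the infinite stream 0,1,2,...
def loop(s):
    yield from output(s['x'])(lambda s1: loop({'x': s1['x'] + 1}))(s)

first_five = list(itertools.islice(loop({'x': 0}), 5))  # [0, 1, 2, 3, 4]
```

With the old semantics (output inside the state) the corresponding run would simply return error and the 0 would be lost, as the derivation above shows.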

5.2. Locations, stores and environments


In some languages identifiers are not bound directly to values but instead
are bound to things called variables, and it is these that possess values.
One might think that these variables were formally unnecessary and that (as
in TINY) one could just bind each identifier to some value (namely the
value possessed by the corresponding variable). This works fine unless
what is called sharing (or aliasing) is possible.

5.2.1. Sharing
Sometimes distinct identifiers can come to denote the same variable so
that assigning to one of them will have the side effect of changing the
value of the other. Such sharing of a single variable by several identifiers
(or aliases) may occur as the result of an explicit command or declaration
- for example one might have a command let I1 == I2 whose evaluation
causes I1 and I2 to denote the same variable. Sharing can also come
about indirectly-for example in PASCAL if one declares a procedure of
the form procedure P(var x: real; var y: real); <statement part> and then
executes a call P(z,z) then inside the body of the procedure (i.e. inside
<statement part>) both x and y will share the variable denoted by z.
If sharing is possible then we have no choice but to introduce some
model of variables into our semantics. For suppose not: then if I1 and I2
share, when we execute the command I1 := 1 we must ensure in our
model that I2 gets value 1 also; the semantic clause for assignments
would thus have to be:
C[I := E] c = E[E] λv(m,i,o) . c(m[v,v,...,v/I,I1,...,In],i,o)
where I1,...,In are all the identifiers which share with I
However to make this precise we have to define what it means for two
identifiers to share, and the simplest way of doing this is to say two
identifiers share when they denote the same variable.

5.2.2. Variables and locations


The word "variable" has a number of misleading connotations so in
connection with formal semantics the less loaded term "location" is often
used instead. To model locations (i.e. variables) we introduce a new primi-
tive domain, Loc, typical members of which are l, l', l1, l2 etc. The only
structure we shall assume on Loc is that locations can be tested for
equality with =. When modelling implementations it may be convenient
to make Loc more concrete, for example to identify it with Num; for
abstract semantics this is unnecessary.

5.2.3. Stores
To model the association of locations with values we introduce the con-
cept of a store. The domain Store of stores is defined by:
Store = Loc→[Sv + {unused}]
where Sv is a language dependent domain of storable values: the values
that can be held in locations. Exactly what the storable values of a langu-
age are is an important question to ask and forms a dimension along
which languages can be classified [Strachey 74]. If s∈Store and l∈Loc
then if s l = v∈Sv we say l has (or holds) value v, and if s l = unused we
say l is unused in s. Certain constructs (for example ALGOL 60 blocks)
cause new locations to come into use and to model these we assume a
function new: Store→[Loc + {error}] which has the property that if
isLoc(new s) = true then s(new s) = unused, i.e. new s is an unused
location in s (new s = error means that there are no locations available in
s). There are serious dangers in only partially specifying functions as we
have just done for new. For example suppose we had assumed
new: Store→Loc and that for all s, s(new s) = unused. Then for any
v∈Sv if s∈Store is defined by s = λl . v then v = s(new s) = unused!
The lesson to learn from this is that one must be very careful about as-
suming there exist functions with given properties. In fact the
assumptions we made are safe. To see that there do exist new's satisfy-
ing them we arbitrarily choose l1, l2,..., ln ∈ Loc and then define new by:
new = λs . s l1 = unused → l1,
           s l2 = unused → l2,
           ...
           s ln = unused → ln, error



This models finite stores with locations l1, l2,..., ln and where new s is the
first unused lj (or error if all the lj's are in use). Clearly for this particular
new, if isLoc(new s) = true then s(new s) = unused. When modelling
implementations it is necessary to specify details of storage allocation and
so particular new's must be defined; however for our purposes this is un-
necessary. Finally note that we shall use s, s1, s2 etc. to range over Store
and v1,...,vn/l1,...,ln to denote the 'little' store defined by:
v1,...,vn/l1,...,ln = λl . l = l1 → v1,
                          l = l2 → v2,
                          ...
                          l = ln → vn, unused
Thus in v1,...,vn/l1,...,ln the only locations in use are l1,...,ln and these
hold v1,...,vn respectively.
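This particular new and the 'little' store notation can be sketched as follows (a Python illustration with an arbitrary three-location supply; all names are our own assumptions):

```python
# Stores as functions from locations; a fixed arbitrary supply of
# locations l1,...,ln as in the definition of new above.

UNUSED, ERROR = 'unused', 'error'
LOCS = ['l1', 'l2', 'l3']          # arbitrarily chosen locations

def new(s):
    """First li unused in s, or error if all the li's are in use."""
    for l in LOCS:
        if s(l) == UNUSED:
            return l
    return ERROR

def little_store(vals, locs):
    """v1,...,vn/l1,...,ln: only l1,...,ln in use, holding v1,...,vn."""
    table = dict(zip(locs, vals))
    return lambda l: table.get(l, UNUSED)

s = little_store([41, 42], ['l1', 'l2'])
# new s is l3, and s(new s) = unused, as required of new.
```

A store with all three locations in use makes new return error, modelling storage running out.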

5.2.4. Environments
To model the binding of identifiers we use the concept of an environment.
The domain Env of environments is defined by:
Env = Ide→[Dv + {unbound}]
where Dv is the language dependent domain of denotable values. If the
only thing identifiers can be bound to is locations then Dv = Loc; how-
ever in most languages identifiers can denote other things besides loca-
tions, for example constants (e.g. numbers), arrays, records, procedures
etc. Exactly what the denotable values are is an important question to ask
about a language and, like the storable values, forms a dimension for
classification. We shall use r, r', r1, r2 etc to range over Env and also the
following two notations:
(i) If d1,...,dn ∈ Dv and I1,...,In ∈ Ide then d1,...,dn/I1,...,In is the 'little'
environment defined by
d1,...,dn/I1,...,In = λI . I = I1 → d1,
                          ...
                          I = In → dn, unbound

(ii) If r1, r2 ∈ Env then r1[r2] ∈ Env is defined by:
r1[r2] = λI . r2 I = unbound → r1 I, r2 I
If we put (i) and (ii) together we get
r[d1,...,dn/I1,...,In] = λI . I = I1 → d1,
                              ...
                              I = In → dn, r I
which is consistent with the notation described in 3.4.7.
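Notations (i) and (ii) can be sketched as follows (Python, with environments modelled as functions; all names are illustrative assumptions):

```python
# The 'little' environment d1,...,dn/I1,...,In and the overlay r1[r2].

UNBOUND = 'unbound'

def little_env(ds, ids):
    """d1,...,dn/I1,...,In."""
    table = dict(zip(ids, ds))
    return lambda i: table.get(i, UNBOUND)

def overlay(r1, r2):
    """r1[r2] = \\I . r2 I = unbound -> r1 I, r2 I."""
    return lambda i: r1(i) if r2(i) == UNBOUND else r2(i)

r = little_env(['l1', 'l2'], ['x', 'y'])
r3 = overlay(r, little_env(['l3'], ['x']))   # r[l3/x]
```

In r3 the binding of x from the inner environment shadows the outer one, while y and any unbound identifier fall through to r, exactly the behaviour of r[d1,...,dn/I1,...,In] above.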

5.3. Standard domains of values


In TINY we had just one domain of values, namely Value. In general one
needs to distinguish several value domains, amongst the most important
are:
(i) Storable values Sv: the values which can be stored in locations in
the store. We use v, v', v1, v2 etc. to range over Sv.
(ii) Denotable values Dv: the values which can be bound to (or
"denoted by") identifiers in the environment. We use d, d', d1, d2
etc. to range over Dv.
(iii) Expressible values Ev: the values expressions can produce as
results. We use e, e', e1, e2 etc. to range over Ev.
Although these domains usually intersect they are conceptually distinct
and are not normally the same. For example in PASCAL constants such
as numbers and truth values are denotable but in ALGOL 60 they are not.
In PASCAL locations are storable but in ALGOL 60 they are not. Other
domains of values besides Sv, Dv and Ev may also be needed for particu-
lar languages. For example one may need to explicitly distinguish and
define the 'outputable values' or the values that can be passed to pro-
cedures. Another general purpose value domain which we discuss later (in
5.8.1) is Rv, the domain of R-values; this is the domain of 'dereferenced'
expression values (called "R-values" because they are extracted on the
right of assignments).

5.4. Blocks, declarations and scope


In languages like ALGOL 60 and PASCAL commands change the

contents of locations but not the binding of identifiers to locations. For


example in the ALGOL 60 program:

begin integer x;
   x := 1;
   x := 2;
end
throughout the body of the block (the text between begin and end) x de-
notes a fixed location, l say, in the environment but the contents of l
changes with each assignment to x. The only way bindings in the environ-
ment can be changed is with declarations. For example in:
begin integer x;
   integer y;
   begin integer x;
   end
end
The x's in the inner and outer blocks denote different locations. Since y is
not declared in the inner block it denotes the same location in this as it
does in the outer block. Each declaration 'holds' throughout a certain part
of the program. In ALGOL 60 declarations hold throughout the body of
the block in which they are introduced unless they are overwritten by a
declaration in an inner block. In PASCAL declarations hold throughout
the procedure body at whose head they occur unless they are overwritten
by a with-statement (see 9.) within this procedure body. In the example
above (which is based on ALGOL 60) the outer declaration of x holds
throughout the parts of the outer block not included in the inner one. The
declaration of y holds throughout the whole outer block. If x (like y) were
not declared in the inner block then its outer declaration would hold there
also.
The parts of a program where a declaration holds are called its scope.

The example above illustrates holes in scope: the inner block is a hole in
the scope of the outer declaration of x but not of y. The scope of an
identifier with respect to a declaration is the scope of the declaration. If
this declaration is unique (or defined by the context) then we can un-
ambiguously talk about the scope of the identifier. Thus "the scope of y"
makes sense in the above example but not "the scope of x"; the latter
phrase is ambiguous between the scopes of the inner and outer declarations
of x.
In standard semantics:
(i) Commands change the store but not the environment.
(ii) Declarations change the environment (and possibly also the store).

Thus environments only change at 'scope boundaries'. Declarations may


change the store as well as the environment since-for example in the
case of variable declarations-they may 'allocate' new storage (i.e. put
new locations into use). In the language SMALL described below in 6.
there are variable declarations of the form var I = E which have the effect
that:
(i) A new location, l say, is obtained.
(ii) E's value is stored in l.
(iii) I is bound to l in the environment.
(ii) changes the store and (iii) the environment.
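These three steps can be sketched concretely (a Python illustration with dicts for environments and stores and a trivial location supply; E's value is taken as already computed, and all names are our own assumptions):

```python
# Elaborating var I = E: get a new location, store E's value in it,
# bind I to it. The encoding is illustrative only.

UNUSED, ERROR = 'unused', 'error'
LOCS = ['l1', 'l2', 'l3']

def new(store):
    """First unused location, or error if none remain."""
    for l in LOCS:
        if store.get(l, UNUSED) == UNUSED:
            return l
    return ERROR

def elaborate_var(ident, value, env, store):
    l = new(store)                   # (i)   obtain a new location l
    store2 = {**store, l: value}     # (ii)  store E's value in l
    env2 = {**env, ident: l}         # (iii) bind I to l
    return env2, store2

env2, store2 = elaborate_var('x', 42, {}, {})
```

The result pairs a changed environment with a changed store, which is exactly what a declaration passes on to the rest of the program.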
In general the declarations of a language form a syntactic category
modelled by a syntactic domain Dec. We shall use D to range over Dec
and D for the associated semantic function (see 5.6.). In SMALL Dec will
be defined by:
D ::= const I = E | var I = E | proc I(I1);C | fun I(I1);E | D1;D2
var I = E is as just described, const I = E binds E's value directly to I in
the environment (and so does not change the store), proc I(I1);C declares
a procedure named I with formal parameter I1 and body C (see 5.9.1. for
details), fun I(I1);E declares a function named I with formal parameter I1
and body E (see 5.9.2.) and finally D1;D2 is a compound declaration
whose effects are those of D1 followed by those of D2.
In the semantics (see 5.5.3. and 5.6.) each declaration generates a
'little' environment containing the bindings it specifies. For example
const I = E generates e/I where e is E's value, whereas var I = E
generates l/I where l is a new location updated with E's value. This little
environment is then passed, together with a possibly changed store, to
the rest of the program following the declaration.

5.5. Standard domains of continuations


In standard semantics three main kinds of continuation are used.

5.5.1. Command continuations


Since commands pass a store to the rest of the program following them
we define Cc, the domain of command continuations, by:
Cc = Store→Ans
where Ans is the domain of final answers. The exact structure of Ans is
language dependent; all we shall assume is that it contains an error ele-
ment error. We shall use c, c', c1, c2 etc. to range over Cc.

5.5.2. Expression continuations


Since expressions pass their values, together with a possibly changed
store, to the rest of the program following them we define Ec, the
domain of expression continuations, by:
Ec = Ev→Store→Ans
or more neatly:
Ec = Ev→Cc
We use k, k', k1, k2 etc to range over Ec.

5.5.3. Declaration continuations


Since declarations pass an environment, together with a possibly
changed store, to the rest of the program following them we define Dc,
the domain of declaration continuations, by:
Dc = Env→Store→Ans
or more neatly:
Dc = Env→Cc
We use u, u', u1, u2 etc. to range over Dc.

5.6. Standard semantic functions


In a standard semantics we use semantic functions with the following
types:
E: Exp→Env→Ec→Store→Ans
C: Com→Env→Cc→Store→Ans
D: Dec→Env→Dc→Store→Ans
When no errors, jumps etc. occur the intuitive meanings of these func-
tions are:
E[E] r k s = k e s' where e is E's value in environment r and store s and
s' is the store after E's evaluation.
C[C] r c s = c s' where s' is the store resulting from executing C in
environment r and store s.
D[D] r u s = u r' s' where r' is the environment consisting of the 'bind-
ings' specified in D (when evaluated with respect to
r and s) and s' the store resulting from D's evalua-
tion.
Thus each semantic function passes the appropriate intermediate results
to its continuation. Examples of typical semantic clauses for SMALL are
(E1) E[0] r k s = k 0 s
The value 0 is passed, together with an unchanged store, to k.
(C7) C[C1;C2] r c s = C[C1] r (λs' . C[C2] r c s') s
C1 is executed in environment r and store s to get a store s' which is
passed to the continuation λs' . C[C2] r c s', which executes C2 in the same
environment but in store s' and then finally sends the resulting store on to
c. Notice that since commands don't change the environment, r is passed
to both C1 and C2. Note also that the semantic clause can be simplified to
C[C1;C2] r c = C[C1] r (C[C2] r c) (see 3.4.10.)
or C[C1;C2] r c = C[C1] r;C[C2] r c (see 3.4.3. (iii))
or even C[C1;C2] r = C[C1] r ∘ C[C2] r (see 3.4.12.1.).
(D1) D[const I = E] r u s = E[E] r (λe s' . u(e/I) s') s
Here E's value e is bound to I to form the little environment e/I which is
passed on to u together with the store s' resulting from E's evaluation.
The clause can be simplified to:
D[const I = E] r u = E[E] r λe . u(e/I)
(N.B. This clause, although right in spirit, has a minor error: the value of E
needs to be 'dereferenced' before being bound to I. See 5.8 for a discus-
sion and 6.2.3.4. for the correct equation.)
The rest of the semantic clauses for SMALL are explained in the next
chapter (specifically 6.2.3.).
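The shape of these clauses can be animated as follows (a Python sketch covering only numerals, identifiers and const, under our own illustrative encoding; like (D1) above it deliberately omits dereferencing):

```python
# E: Exp -> Env -> Ec -> Store -> Ans, with Ec = value -> store -> Ans;
# environments are functions from identifiers. Names are illustrative.

UNBOUND, ERROR = 'unbound', 'error'

def E(exp):
    tag = exp[0]
    if tag == 'num':                  # (E1): E[0] r k s = k 0 s
        return lambda r: lambda k: k(exp[1])
    if tag == 'ide':                  # look the identifier up in r
        def sem(r):
            def with_k(k):
                def c(s):
                    d = r(exp[1])
                    return ERROR if d == UNBOUND else k(d)(s)
                return c
            return with_k
        return sem
    raise ValueError(exp)

def D(dec):
    """(D1): D[const I = E] r u = E[E] r (lambda e . u(e/I))."""
    if dec[0] == 'const':
        ident, body = dec[1], dec[2]
        return lambda r: lambda u: E(body)(r)(
            lambda e: u(lambda i: e if i == ident else UNBOUND))
    raise ValueError(dec)
```

Running D on const x = 7 in an empty environment passes the little environment 7/x, together with the unchanged store, to the declaration continuation u.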

5.7. Continuation transforming functions


In this section we describe some functions for transforming continua-
tions. These functions intuitively transform values and stores to new
values and stores but are defined over continuations to make them con-
venient for use in continuation semantics. The general idea is as follows:
suppose f: Ev→Store→[[Ev x Store] + {error}]; then instead of using f we
shall use f': Ec→Ec defined by:
f' k = λe s . (f e s = (e',s')) → k e' s', error
= f * k (using * as defined in 3.4.12.2.)
Thus f' takes a continuation k and produces another continuation which
'first does f and then passes the results to k'. The relationship between f
and f' can be factored out into a function mkconfun defined so that
f' = mkconfun f. The appropriate definition is simply:
mkconfun = λf k . f * k
The use of these continuation transformers is as follows: frequently we
want to evaluate an expression, E say, and then transform the resulting
values by a sequence of functions, f1,...,fn say, and then pass the result
of these transformations to the 'rest of the program', k say. Now if
f1',...,fn': Ec→Ec are the continuation transformations corresponding to
f1,...,fn (i.e. fi' = mkconfun fi) then the desired effect is modelled by:
E[E] r (f1'(f2'...(fn' k)...))
or using ";" (see 3.4.3.):
E[E] r;f1';...;fn';k
To see that this works we simply note that
f1'(f2'...(fn' k)...) = f1*f2*...*fn*k
Summing up, an expression like E[E] r;f1';...;fn';k should be 'read' as
"evaluate E, do f1',...,fn' and then pass the results to k".
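The operator * and mkconfun can be sketched as follows (Python; f is encoded as a two-argument function from value and store to a pair or error, and every name is an illustrative assumption):

```python
# f * k: first do f, then pass the resulting value and store to the
# continuation k, short-circuiting on error.

ERROR = 'error'

def star(f, k):
    """f * k : Ec, for f: Ev -> Store -> [[Ev x Store] + {error}]."""
    def ec(e):
        def c(s):
            r = f(e, s)
            return ERROR if r == ERROR else k(r[0])(r[1])
        return c
    return ec

def mkconfun(f):
    return lambda k: star(f, k)       # f' = mkconfun f

# Two simple transformations and their continuation versions:
double = mkconfun(lambda e, s: (2 * e, s))
incr = mkconfun(lambda e, s: (e + 1, s))

k_final = lambda e: lambda s: (e, s)
# double'(incr' k_final) = double * incr * k_final:
ans = star(lambda e, s: (2 * e, s), star(lambda e, s: (e + 1, s), k_final))
same = double(incr(k_final))
```

Both ans and same send 5 in store 's0' to (11, 's0'): doubling first, then incrementing, then handing the results to k_final, which is precisely f1'(f2' k) = f1*f2*k.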
In what follows we define the continuation transformations (e.g. f1')
directly since the non-continuation ones (e.g. f1) will not be of use to us.
If we were going to use direct semantics then it would be the non-
continuation transformations that we would need. For example compare
checkNum and checkBool, which we used in the direct semantics of
TINY (see 4.2.2.), with Num? and Bool?, which are what we shall use with
continuations.
5.7.1. cont: Ec→Ec
Informal description
cont k e s checks that e is a location and if so looks up its contents in s
and passes the result, together with s, to k. If e is not a location, or if it is
but is unused in s, then cont k e s = error.
Formal description
cont = λk e s . isLoc e → (s e = unused → error, k(s e) s), error

5.7.2. update: Loc→Cc→Ec
Informal description
update l c e s stores e at location l in s and passes the resulting store to c.
If e is not a storable value then update l c e s = error.
Formal description
update = λl c e s . isSv e → c(s[e/l]), error
5.7.3. ref: Ec→Ec
Informal description
ref k e s gets an unused location from s, updates it with e and then passes
it, and the updated store, to k. If s has no unused locations available then
ref k e s = error.
Formal description
ref = λk e s . new s = error → error, update (new s)(k(new s)) e s
5.7.4. deref: Ec→Ec
Informal description
deref k e s tests whether e is a location and if so passes its contents in s,
together with s, to k. If e is not a location then e and s are passed to k.
Formal description
deref = λk e s . isLoc e → cont k e s, k e s
5.7.5. err: Cc
Informal description
err is the error continuation.
Formal description
err = λs . error
5.7.6. Domain checks: D?: Ec→Ec
Informal description
For each summand D of Ev we define a function D? which checks
whether an element is in D and produces an error if not. D? k e s passes e
and s to k if e is in D and is error otherwise.
Formal description
D? = λk e . isD e → k e, err
Examples
If Ev = Num + Bool + ... then
Num? = λk e . isNum e → k e, err
Bool? = λk e . isBool e → k e, err

5.8. Assignment and L and R values


The meaning of an assignment I := E in TINY is the function which takes
a state and then updates the memory component at I with E's value. In
standard semantics, where instead of states we have environments and
stores, E is evaluated and its value stored in the location, l say, denoted
by I in the environment (if I does not denote a location then an error
results). The assignment only changes the contents of l in the store; the
binding of I to l in the environment is unaffected.
Suppose I1 and I2 denote locations l1 and l2 in the environment; then
what is the effect of I1 := I2? There are two obvious possibilities:
(i) Location l2 is stored in location l1.
(ii) The contents of location l2 is stored in location l1.
In both ALGOL 60 and PASCAL (ii) describes what happens (in PASCAL
the effect of (i) can be obtained using pointers; see 9.1.).
Thus in standard semantics we shall assume that expressions occurring
on the right of assignments have their values dereferenced, i.e. have
their values looked up in the store if they are locations.
On the left of assignments, however, we need a location, not its
contents, since it is a location we update. The dereferencing that is neces-
sary on the right hand side can be done with the function deref defined in
5.7.4. If k is a continuation expecting a dereferenced value (e.g. the 'rest
of the program' following an expression on the right of an assignment)
then deref k is a continuation which, when sent an undereferenced value,
will first dereference it and then send the result to k. The standard
semantics of I := E is thus:

C[I := E] r c s =
E[I] r k1 s
where k1 = λe1 s1 . isLoc e1 → E[E] r k2 s1, error
where k2 = λe2 s2 . deref k3 e2 s2
where k3 = λe3 s3 . update e1 c e3 s3
We evaluate I and pass the results to k1. k1 takes the results, e1 and s1,
of I's evaluation, checks e1 is a location and then evaluates E and passes
the results to k2. k2 takes the results, e2 and s2, of E, dereferences e2
and passes the results to k3. k3 takes the results, e3 and s3, of de-
referencing e2, updates the location e1 with contents e3 and then passes
the resulting store to c. This semantic clause can be simplified. First
notice that:
k3 = update e1 c
hence k2 = deref k3
= deref;update e1;c
hence k1 = λe1 . isLoc e1 → E[E] r;deref;update e1;c, err
= λe1 . Loc?(λl . E[E] r;deref;update l;c) e1
= Loc? λl . E[E] r;deref;update l;c
and thus:
C[I := E] r c = E[I] r;Loc? λl . E[E] r;deref;update l;c
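The transformers and the simplified clause can be animated as follows (a Python sketch: locations are strings, stores are dicts, and E[I] r, E[E] r are supplied directly as maps from expression continuations to command continuations; every name here is an illustrative assumption):

```python
# deref, update and a Loc? check as continuation transformers, plus
# the simplified assignment clause built from them.

ERROR, UNUSED = 'error', 'unused'
err = lambda s: ERROR
is_loc = lambda e: isinstance(e, str) and e.startswith('l')

def cont(k):                 # cont k e s: look up e's contents in s
    return lambda e: lambda s: (
        k(s[e])(s) if is_loc(e) and s.get(e, UNUSED) != UNUSED else ERROR)

def deref(k):                # dereference e only if it is a location
    return lambda e: (cont(k)(e) if is_loc(e) else k(e))

def update(l, c):            # update l c e s: store e at l, pass store to c
    return lambda e: lambda s: c({**s, l: e})

def loc_check(k):            # Loc? : domain check for locations
    return lambda e: (k(e) if is_loc(e) else err)

def assign(EI, EE):
    """C[I := E] r c, with E[I] r = EI and E[E] r = EE:
    EI; Loc? \\l. EE; deref; update l; c."""
    return lambda c: EI(loc_check(lambda l: EE(deref(update(l, c)))))
```

With I denoting location l1 and E producing location l2, the contents of l2 ends up stored at l1 (the R-value is dereferenced); if I's value is not a location, the continuation is ignored and the answer is error.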

5.8.1. L and R values


To distinguish the different values extracted from expressions during
assignments the following terminology is often used:
(i) The value of an expression needed on the left of an assignment is
the expression's L-value. This value is a location and is obtained by
the semantic function E without any dereferencing.
(ii) The value of an expression needed on the right of an assignment is
the expression's R-value. This value is (normally) obtained by de-
referencing the value obtained with E.
There are often other contexts besides the right hand sides of assign-
ments in which we need to dereference expression values. For example in

output E, if E's value is a location (e.g. E = I), then we should output its
contents. Rather than continually write E[E] r;deref it is traditional to
define a new semantic function R for obtaining R-values.
R: Exp→Env→Ec→Cc
and is defined by: R[E] r k = E[E] r (deref k)
A function L: Exp→Env→Ec→Cc defined by L[E] r k = E[E] r (Loc? k)
and which obtains L-values is also sometimes used. Using L and R the
semantic clause for assignments becomes simply:
C[I := E] r c = L[I] r λl . R[E] r;update l;c

In some languages only a subset of the expressible values are allowed as
the contents of locations. For example in PASCAL procedures are
expressible but cannot be assigned to variables. The storable subset of Ev
is the domain Rv of R-values, and it is convenient to make the semantic
function R check that it only produces R-values. Thus, in this case, we
define:
R[E] r k = E[E] r;deref;Rv?;k
5.9. Procedures and functions
In this section we outline the standard semantics of declarations and calls
of procedures and functions. Note that by "functions" we mean the pro-
gramming concept of a function-see the discussion at the beginning
of 3.4.

5.9.1. Procedures
A procedure, declared by proc I(I1);C say, has name I, formal parameter
I1 and body C. Within the scope of its declaration it can be called, for
example by a command of the form I(E). The effect of this is to execute
the body C in an environment identical to the environment when the pro-
cedure was declared except that the formal parameter I1 denotes E's
value. The evaluation of procedure bodies in an environment derived from
the declaration time environment is called static binding. Other kinds of
binding using other environments are possible (for example dynamic bind-
ing, found in LISP and POP-2, uses the call time environment). The

semantics and merits of different kinds of binding are discussed in detail in
8.3.; for the time being we stick to static binding since it is the one used in
ALGOL 60 and PASCAL. The expression E in a call I(E) is called the actual
parameter; thus when a procedure is called the actual parameter value is
bound to the formal parameter. The semantics of a procedure declaration
is to bind some 'procedure value' to the procedure name in the environ-
ment, thus:
D[proc I(I1);C] r u = u(p/I)
where p is a procedure value modelling a
procedure with formal parameter I1 and
body C.
Procedure values are members of the domain Proc defined by:
Proc = Cc→Ev→Store→Ans
We use p, p', p1, p2 etc. to range over Proc. The definition of Proc can
be simplified to:
Proc = Cc→Ec
The intuitive idea is that if pcProc then:
pc e s = c s' where s' is the store resulting from executing p's
body with actual parameter e and store s.
The continuation c passed to procedure values is analogous to the 'return
address' in an implementation-it enables the procedure to return to the
rest of the program following its call when execution of its body has
finished. The complete semantic clause for procedure declarations can
now be given:
    D[proc I(I₁);C] r u = u(p/I)
        where p = λc e. C[C] r[e/I₁] c
Notice that the environment r[e/I₁] in which the body C is evaluated is
derived from the declaration time environment r, i.e. we have static
binding. Notice also that for this equation to make sense p's must be
denotable values, i.e. Dv = ... + Proc + ....
The semantics of a procedure call is given by:
    C[I(E)] r c s =
        E[I] r (λe₁ s₁. isProc e₁→E[E] r (λe₂ s₂. e₁ c e₂ s₂) s₁, error) s
We evaluate I to get e₁ and s₁ and check that e₁ is a procedure value. If it
is not then an error occurs. If it is we evaluate E in s₁ to get e₂ and s₂ and
then apply e₁ to 'return address' c and actual parameter value e₂. This
equation can be written more compactly as:
    C[I(E)] r c = E[I] r;Proc? λp. E[E] r;p c
which can be 'read' as: "evaluate I, check the result is in Proc, if so call
the result p and evaluate E and pass its value to p c".
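The clause transcribes directly into any language with first-class functions. The following Python sketch is my own illustration, not the book's metalanguage: the names make_proc, body and double are invented, stores are dicts, and the final answer is built as nested pairs.

```python
# Illustrative sketch: Proc = Cc -> Ev -> Cc. A procedure value takes a
# 'return address' continuation c and an actual parameter value e.
def make_proc(formal, body, decl_env):
    # Static binding: the body runs in the declaration-time environment,
    # extended so the formal parameter denotes the actual parameter value.
    def p(c, e):
        env = dict(decl_env, **{formal: e})
        return body(env, c)
    return p

# Body of  proc double(n); output n+n  -- output puts a value on the answer.
def body(env, c):
    return lambda s: (env["n"] + env["n"], c(s))

double = make_proc("n", body, {})
stop = lambda s: "stop"               # the final continuation
print(double(stop, 21)({}))           # -> (42, 'stop')
```

Calling double with the continuation stop plays the role of the 'return address': the body outputs 42 and then the rest of the program (here just stop) runs.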

5.9.2. Functions
In many languages one can declare 'functions' which, when called, return
a value as a result of the call. For example if fact is such a function then
fact(n) is an expression and things like output fact(n) make sense. Note
that these 'functions' must be carefully distinguished from the mathe-
matical functions they denote. In both ALGOL 60 and PASCAL the result
of a function is the last value assigned to the function name in its body.
For example in PASCAL a function to square an integer n would be
declared thus:
    function square(n:integer):integer; square := n × n
Since function calls are expressions the continuation passed to a function
value must be an expression continuation. Thus we define the domain
Fun of function values by:
    Fun = Ec→Ec
We use f, f′, f₁, f₂ etc. to range over Fun. The semantics of function
calls is:
    E[I(E)] r k = E[I] r;Fun? λf. E[E] r;f k
which can be understood by analogy with the clause for procedure calls
which it closely resembles. Note that for this to make sense f's must be
denotable.
In SMALL (see next chapter) function declarations are of the form
fun I(I₁);E. Thus the body of a function is an expression which, when
evaluated, yields the value to be returned. This way of yielding values is
much cleaner than assigning to the function's name; we discuss some of
the problems of that in 8.2.1. The semantic clause for function declara-
tions is similar to the one for procedures:
    D[fun I(I₁);E] r u = u(f/I)
        where f = λk e. E[E] r[e/I₁] k

5.9.3. Summary
Domains:    Proc = Cc→Ec  - procedure values p
            Fun = Ec→Ec   - function values f

Clauses for declarations:

    D[proc I(I₁);C] r u = u((λc e. C[C] r[e/I₁] c)/I)
    D[fun I(I₁);E] r u = u((λk e. E[E] r[e/I₁] k)/I)

Clauses for calls:

    C[I(E)] r c = E[I] r;Proc? λp. E[E] r;p c
    E[I(E)] r k = E[I] r;Fun? λf. E[E] r;f k
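A function value Fun = Ec→Ec can be sketched the same way as a procedure value: a call simply hands the caller's expression continuation to the function value. This Python fragment is my own illustration (square and make_fun are invented names, and the store is omitted so an expression continuation is just a function from values to answers).

```python
# Illustrative sketch: Fun = Ec -> Ec, store omitted.
def make_fun(formal, body, decl_env):
    def f(k, e):                       # k: where to send the result
        return body(dict(decl_env, **{formal: e}), k)
    return f

# Body of  fun square(n); n*n  -- the body passes its value to k directly.
square = make_fun("n", lambda env, k: k(env["n"] * env["n"]), {})

# A call passes the caller's continuation straight to the function value.
print(square(lambda v: v, 7))          # -> 49
```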

5.10. Non-standard semantics and concurrency


Standard semantics is based on the idea that the meaning of a construct is
its effect on the final answer of programs in which it occurs. This implies,
for example, that x := 2 and x := 1;x := x+1 have the same meaning.
For most purposes this is exactly what we want, but not always. Consider
a language containing a construct cobegin C₁,C₂ coend whose meaning
is that C₁ and C₂ are executed concurrently until they both terminate.
In such a language x := 2 and x := 1;x := x+1 must have different
denotations because, for example, cobegin x := 2, x := 2 coend can have
different effects than cobegin x := 2, x := 1;x := x+1 coend. In the
second case the x := 2 might be done after the x := 1 but before the
x := x+1, resulting in x denoting 3. The only result of
cobegin x := 2, x := 2 coend is for x to denote 2. If we had
C[x := 2] = C[x := 1;x := x+1] then, since the denotation of a construct
must depend only on the denotations of its immediate constituents, no
matter how we defined C[cobegin C₁,C₂ coend] we would have:

    C[cobegin x := 2, x := 1;x := x+1 coend]
        = ... C[x := 2] ... C[x := 1;x := x+1] ...
        = ... C[x := 2] ... C[x := 2] ...
        = C[cobegin x := 2, x := 2 coend]

To enable a semantics of cobegin C₁,C₂ coend to be given, each com-
mand must denote some sort of 'process' in which the 'interleavable
operations' are modelled (see [Milner] for example); in such cases stan-
dard semantics is not appropriate. Non-standard semantics are also useful
for modelling implementations, see [Milne and Strachey].
6. A second example: the language SMALL
The language SMALL described in this chapter has two roles:
(i) To illustrate the main ideas of standard semantics.
(ii) To provide a 'kernel language' which we shall extend as we discuss
the semantics of various constructs in later chapters.

6.1. Syntax of SMALL

6.1.1. Syntactic domains


The primitive syntactic domains of SMALL are:
    Ide  The domain of identifiers I
    Bas  The domain of basic constants B
    Opr  The domain of binary operators O
We do not further specify these primitive domains. Bas could, for ex-
ample, be {0, 1, 2, ...} and Opr could be {+, -, ×, /, <, >, ...}.
The compound syntactic domains of SMALL are:
Pro The domain of programs P
Exp The domain of expressions E
Com The domain of commands C
Dec The domain of declarations D

These domains are defined by the following syntactic clauses:

6.1.2. Syntactic clauses


P ::= program C
E ::= B | true | false | read | I | E₁(E₂)
    | if E then E₁ else E₂ | E₁ O E₂
C ::= E₁ := E₂ | output E | E₁(E₂) | if E then C₁ else C₂
    | while E do C | begin D;C end | C₁;C₂
D ::= const I = E | var I = E | proc I(I₁);C | fun I(I₁);E | D₁;D₂

We have procedure (and function) calls of the form E₁(E₂) rather than just
I(E) to permit calls like (if E₁ then I₁ else I₂)(E₂) in which either the pro-
cedure (or function) denoted by I₁ or the one denoted by I₂ is applied to
E₂ depending on the value of E₁. Similarly we permit assignments of the
form (if E₁ then I₁ else I₂) := E₂ or even I(E₁) := E₂ (in which the loca-
tion assigned to is the result of the function call I(E₁)).

6.2. Semantics of SMALL

6.2.1. Semantic domains


The primitive semantic domains of SMALL are:
    Num  The domain of numbers n.
    Bool The domain of booleans b.
    Loc  The domain of locations l.
    Bv   The domain of basic values e.
Basic values are the denotations of basic constants. If Bas = {0, 1, 2, ...}
then Bv = {0, 1, 2, ...}; however for simplicity we shall not specify any
particular choice of basic values or constants.
The compound semantic domains are defined by the following domain
equations:
    Dv = Loc + Rv + Proc + Fun        - denotable values d
    Sv = File + Rv                    - storable values v
    Ev = Dv                           - expressible values e
    Rv = Bool + Bv                    - R-values e
    File = Rv*                        - files i
    Env = Ide→[Dv + {unbound}]        - environments r
    Store = Loc→[Sv + {unused}]       - stores s
    Cc = Store→Ans                    - command continuations c
    Ec = Ev→Cc                        - expression continuations k
    Dc = Env→Cc                       - declaration continuations u
    Proc = Cc→Ec                      - procedure values p
    Fun = Ec→Ec                       - function values f
    Ans = {error, stop} + [Rv × Ans]  - final answers a
We assume Loc contains a location input which holds the input file. We
have not provided SMALL with any facilities for creating new files; this is
discussed in 9.5. For the time being the only file around is the input.
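The equation Store = Loc→[Sv + {unused}] can be read off directly as code. Here is a small sketch of my own (not the book's), with stores as Python functions and "unused" standing for the unused summand:

```python
# A store maps locations to storable values, or "unused". The distinguished
# location input holds the input file (a list of R-values).
def empty_store(l):
    return "unused"

def update(s, l, v):
    # s[v/l]: a new store agreeing with s everywhere except at l
    return lambda l2: v if l2 == l else s(l2)

s1 = update(empty_store, "input", [1, 2, 3])
print(s1("input"), s1("x"))            # -> [1, 2, 3] unused
```

Representing stores as functions rather than dicts keeps the code term-for-term parallel with the domain equation, at the cost of efficiency.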

6.2.2. Semantic functions


We assume as given:
    B: Bas→Bv
    O: Opr→[Rv × Rv]→Ec→Cc
For example if Bas = {0, 1, 2, ...}, Opr = {+, ×, -, /, ...} and Bv = Num
then we might have B[0] = 0, O[+](1,2)k = k 3, O[/](1,0)k = err etc.
The meanings of the non-basic constructs are given by:
    P: Pro→File→Ans
    R: Exp→Env→Ec→Cc
    E: Exp→Env→Ec→Cc
    C: Com→Env→Cc→Cc
    D: Dec→Env→Dc→Cc
These are defined by the following semantic clauses.

6.2.3. Semantic clauses

6.2.3.1. Programs
(P) P[program C] i = C[C] () (λs. stop) (i/input)
The final answer produced when program C is run with input i is ob-
tained by evaluating the command C with an empty environment
(() = λI. unbound), a store in which the only used location is input,
which has contents i ((i/input) = λl. l = input→i, unused), and a con-
tinuation which when sent a store stops with final answer stop.

6.2.3.2. Expressions
(R) R[E] r k = E[E] r;deref;Rv?;k
E is evaluated, its result dereferenced and then checked to make sure it is
an R-value and then passed to the rest of the program.
(E1) E[B] r k = k(B[B])
B[B] is passed to the rest of the program.


(E2) E[true] r k = k true,  E[false] r k = k false
The appropriate boolean value is passed to the rest of the program.
(E3) E[read] r k s =
        null(s input)→error, k(hd(s input))(s[tl(s input)/input])
If the input file (i.e. s input) is empty then an error occurs, otherwise the
first item on the input (hd(s input)) is sent to the rest of the program k to-
gether with a store in which the item just read has been removed from
the input file (s[tl(s input)/input]).
(E4) E[I] r k = (r I = unbound)→err, k(r I)
If I is unbound an error occurs, otherwise the value denoted by I in the
environment is sent to the rest of the program.
(E5) E[E₁(E₂)] r k = E[E₁] r;Fun? λf. E[E₂] r;f;k
E₁ is evaluated and its value checked to ensure it is a function f, E₂ is then
evaluated and its value passed to f and finally the result of f is passed to
the rest of the program k.
(E6) E[if E then E₁ else E₂] r k =
        R[E] r;Bool?;cond(E[E₁] r k, E[E₂] r k)
E is evaluated for its R-value which is then checked to ensure it is a
boolean, then E₁ or E₂ is evaluated depending on whether E's value was
true or false.
(E7) E[E₁ O E₂] r k =
        R[E₁] r λe₁. R[E₂] r λe₂. O[O](e₁,e₂)k
E₁ is evaluated for its R-value e₁, then E₂ is evaluated for its R-value e₂
and finally the result of doing O to e₁ and e₂ is sent to the rest of the
program k.

6.2.3.3. Commands
(C1) C[E₁ := E₂] r c =
        E[E₁] r;Loc? λl. R[E₂] r;update l;c
E₁ is evaluated for its L-value l, then E₂ is evaluated for its R-value which
is then stored in l and the resulting store passed to the rest of the
program c.
(C2) C[output E] r c =
        R[E] r λe s. (e, c s)
E is evaluated for its R-value e and new store s, then e is put onto the
front of the answer produced when the rest of the program c is passed s.
(C3) C[E₁(E₂)] r c =
        E[E₁] r;Proc? λp. E[E₂] r;p;c
E₁ is evaluated and its value checked to ensure it is a procedure p, E₂ is
then evaluated and its value passed to p and finally the store resulting
from p is passed to the rest of the program c.
(C4) C[if E then C₁ else C₂] r c =
        R[E] r;Bool?;cond(C[C₁] r c, C[C₂] r c)
E is evaluated for its R-value which is then checked to ensure it is a
boolean, then C₁ or C₂ is evaluated depending on whether E's value
was true or false.
(C5) C[while E do C] r c =
        R[E] r;Bool?;cond(C[C] r;C[while E do C] r c, c)
E is evaluated for its R-value which is then checked to ensure it is a
boolean; then if E's value is true C is evaluated and the resulting store
passed to the beginning of while E do C again. If E's value is false then
the rest of the program is immediately done.
(C6) C[begin D;C end] r c = D[D] r λr′. C[C] r[r′] c
D is evaluated to get a little environment r′, then C is evaluated in r up-
dated with r′ and then the rest of the program c is done.
(C7) C[C₁;C₂] r c = C[C₁] r;C[C₂] r;c
C₁ is done, then C₂ is done and then the rest of the program c is done.

6.2.3.4. Declarations
(D1) D[const I = E] r u = R[E] r λe. u(e/I)
E is evaluated for its R-value e and then the little environment (e/I) is
passed to the rest of the program u.
(D2) D[var I = E] r u = R[E] r;ref λl. u(l/I)
E is evaluated for its R-value which is then stored in a new location l and
the little environment (l/I) passed to the rest of the program u.
(D3) D[proc I(I₁);C] r u = u((λc e. C[C] r[e/I₁] c)/I)
A little environment in which the procedure value λc e. C[C] r[e/I₁] c is
bound to I is passed to the rest of the program u. When this procedure
value is called the body C is evaluated in an environment got from the
declaration time environment r by binding the actual parameter e to the
formal parameter I₁.
(D4) D[fun I(I₁);E] r u = u((λk e. E[E] r[e/I₁] k)/I)
A little environment in which the function value λk e. E[E] r[e/I₁] k is
bound to I is passed to the rest of the program u.
(D5) D[D₁;D₂] r u = D[D₁] r λr₁. D[D₂] r[r₁] λr₂. u(r₁[r₂])
D₁ is evaluated in r to get a little environment r₁, then D₂ is evaluated in
r[r₁] (i.e. r updated with the bindings in r₁) to get another little environ-
ment r₂ and then finally r₁[r₂] (the bindings from D₁ updated with those
of D₂) are passed to the rest of the program u. Notice that D₂ is evaluated
in an environment which contains the effects of D₁ and that if D₁ and D₂
contain declarations of the same identifier then the result of D₁;D₂ is the
result of D₂. For example const x = 1;const x = x+1 would bind x to 2.
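Clause (D5) and the const x = 1;const x = x+1 example can be checked with a small executable sketch of my own (the store and R-value machinery are omitted, and the helper names are invented):

```python
# Declarations take an environment r and a continuation u expecting the
# 'little environment' of new bindings.
def E_const(n):
    return lambda r, k: k(n)

def E_plus1(ident):                    # stands for the expression I + 1
    return lambda r, k: k(r[ident] + 1) if ident in r else "error"

def D_const(ident, exp):               # (D1), with the store omitted
    return lambda r, u: exp(r, lambda v: u({ident: v}))

def D_seq(d1, d2):                     # (D5): d2 sees r updated with d1's bindings
    return lambda r, u: d1(r, lambda r1: d2(dict(r, **r1),
                                            lambda r2: u(dict(r1, **r2))))

# const x = 1; const x = x+1
decl = D_seq(D_const("x", E_const(1)), D_const("x", E_plus1("x")))
print(decl({}, lambda r1: r1))         # -> {'x': 2}
```

The second declaration both sees the first binding (so x+1 is 2) and shadows it in the resulting little environment, just as the clause requires.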

6.3. A worked example


To illustrate how the semantics of SMALL works we show how to use it
to 'evaluate' program begin var x = read;output x end with respect to
an initial input file i such that null i = false and hd i = 1 (for example
i = 1.2.3...). We show, as expected, that
    P[program begin var x = read;output x end] i = (1, stop)
i.e. C[begin var x = read;output x end] () (λs. stop) (i/input) =
    (1, stop)
We do the calculation in several stages. First let r = (), s₁ = (i/input),
s₂ = s₁[tl i/input] = (tl i/input), assume l₂ = new s₂ and finally let
s₃ = s₂[1/l₂]. Then:
(i) E[read] r k s₁
   = null(s₁ input)→error, k(hd(s₁ input))(s₁[tl(s₁ input)/input])  (E3)
   = null i→error, k 1 (s₁[tl i/input])                             (s₁ input = i)
   = k 1 s₂                                                         (null i = false)
(ii) D[var x = read] r u s₁
   = (R[read] r;ref λl. u(l/x)) s₁                      (D2)
   = (E[read] r;deref;Rv?;ref λl. u(l/x)) s₁            (R)
   = E[read] r (deref(Rv?(ref λl. u(l/x)))) s₁          (definition of ;)
   = deref(Rv?(ref λl. u(l/x))) 1 s₂                    (by (i) above)
   = isLoc 1→..., Rv?(ref λl. u(l/x)) 1 s₂              (definition of deref)
   = Rv?(ref λl. u(l/x)) 1 s₂                           (isLoc 1 = false)
   = isRv 1→(ref λl. u(l/x)) 1 s₂, ...                  (definition of Rv?)
   = ref (λl. u(l/x)) 1 s₂                              (isRv 1 = true)
   = new s₂ = error→error,
         update(new s₂)((λl. u(l/x))(new s₂)) 1 s₂      (definition of ref)
   = update l₂ (u(l₂/x)) 1 s₂                           (new s₂ = l₂)
   = isSv 1→u(l₂/x)(s₂[1/l₂]), error                    (definition of update)
   = u(l₂/x)(s₂[1/l₂])                                  (isSv 1 = true)
   = u(l₂/x) s₃                                         (s₃ = s₂[1/l₂])
(iii) C[begin var x = read;output x end] r c s₁
   = D[var x = read] r (λr′. C[output x] r[r′] c) s₁    (C6)
   = (λr′. C[output x] r[r′] c)(l₂/x) s₃                (by (ii) above)
   = C[output x] r[l₂/x] c s₃
   = R[x] r[l₂/x] (λe s. (e, c s)) s₃                   (C2)
   = E[x] r[l₂/x] (deref(Rv? λe s. (e, c s))) s₃        (R)
   = deref(Rv? λe s. (e, c s)) l₂ s₃                    ((E4), r[l₂/x] x = l₂)
   = isLoc l₂→cont(Rv? λe s. (e, c s)) l₂ s₃, ...       (definition of deref)
   = cont(Rv? λe s. (e, c s)) l₂ s₃
   = (Rv? λe s. (e, c s)) 1 s₃                          (definition of cont, isLoc l₂ = true
                                                         and s₃ l₂ = 1)
   = (λe s. (e, c s)) 1 s₃                              (definition of Rv?, isRv 1 = true)
   = (1, c s₃)
(iv) P[program begin var x = read;output x end] i
   = C[begin var x = read;output x end] () (λs. stop) (i/input)  (P)
   = (1, (λs. stop) s₃)                                          (by (iii) above)
   = (1, stop)
Calculations like this, though very tedious, are purely mechanical. The
Semantics Implementation System (SIS) of Peter Mosses [Mosses] is a
computer system which, by automating such calculations, runs programs
directly from a denotational semantics. Although the resulting 'imple-
mentation' is very inefficient it is nevertheless useful for 'debugging
semantics' and as an aid to language designers. An implementation pack-
age (on DEC tapes) is currently available for PDP 10's from Peter Mosses
at Aarhus University, Denmark.
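In the same spirit, the worked example can be machine-checked by transcribing just enough of the clauses into a programming language. The sketch below is my own Python transcription, not SIS and not the book's metalanguage: stores are dicts, the allocator new is a crude invention, and dereferencing is folded into the identifier clause. It computes exactly the answer (1, stop) derived above.

```python
# Continuations: Cc = Store -> Ans, Ec = Ev -> Cc. The distinguished
# location "input" holds the input file (a list).
def new(s):                                  # a crude allocator (invented)
    return f"loc{len(s)}"

def E_read(r, k):                            # (E3)
    def c(s):
        if not s["input"]:
            return "error"
        v, rest = s["input"][0], s["input"][1:]
        return k(v)(dict(s, input=rest))
    return c

def E_ide(ident):                            # (E4) with dereferencing folded in
    def e(r, k):
        def c(s):
            if ident not in r:
                return "error"
            return k(s[r[ident]])(s)         # R-value: contents of the location
        return c
    return e

def D_var(ident, exp):                       # (D2): var I = E
    def d(r, u):
        def k(v):
            def c(s):
                l = new(s)
                return u({ident: l})(dict(s, **{l: v}))
            return c
        return exp(r, k)
    return d

def C_output(exp):                           # (C2)
    def cmd(r, c):
        return exp(r, lambda v: lambda s: (v, c(s)))
    return cmd

def C_block(dec, body):                      # (C6)
    def cmd(r, c):
        return dec(r, lambda r1: body(dict(r, **r1), c))
    return cmd

def P_program(cmd, i):                       # (P)
    return cmd({}, lambda s: "stop")({"input": i})

prog = C_block(D_var("x", E_read), C_output(E_ide("x")))
print(P_program(prog, [1, 2, 3]))            # -> (1, 'stop')
```

Running the same program on an empty input file yields "error", matching clause (E3).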
7. Escapes and Jumps
In this chapter we describe various constructs which change the 'flow of
control'. From a semantic viewpoint the normal continuation is ignored
and control passes to some other one. Constructs of this sort are some-
times called sequencers.

7.1. Escapes
Escapes enable one to exit from the middle of a construct and then
resume computation at its end.

7.1.1. Escapes from commands


To illustrate escaping from commands we shall describe a construct re-
lated to 'event mechanisms' [Knuth 74; Zahn]. We introduce two new
kinds of commands: traps trap C I₁:C₁,...,Iₙ:Cₙ end and escapes escapeto I.
Informally the meaning of these is as follows:
(i) On entering trap C I₁:C₁,...,Iₙ:Cₙ end the body C is executed. If
during the execution of C a command escapeto Iᵢ is encountered
then immediately the 'postlude' Cᵢ is executed and then control
passes to the rest of the program.
(ii) If no escapes are encountered then the effect of the trap is just the
effect of its body.
The use of constructs like these has been advocated as a more 'struc-
tured' way of exiting from loops and dealing with exceptional conditions
than the use of unrestricted jumps. For example in:
    trap while E do
            if E₁ then escapeto x₁ else C₁;
            if E₂ then escapeto x₂ else C₂
         x₁:C₁′,
         x₂:C₂′
    end
if E₁ or E₂ becomes true then the while command is aborted and either
C₁′ or C₂′ is done.


For the semantics of escapes we simply make the escape identifiers de-
note continuations defined by the corresponding postludes, and then
when escapeto I is encountered the normal continuation is ignored and
the one bound to I used instead. Thus:
    C[trap C I₁:C₁,...,Iₙ:Cₙ end] r c =
        C[C] r[C[C₁] r c,...,C[Cₙ] r c/I₁,...,Iₙ] c
    C[escapeto I] r c = E[I] r;Cc? λc′. c′
This last clause may at first look a bit funny; to see that it is indeed what
we want we have:
    C[escapeto I] r c s
      = (E[I] r;Cc? λc′. c′) s
      = E[I] r (Cc? λc′. c′) s
      = (r I = unbound)→error, (Cc? λc′. c′)(r I) s
      = (r I = unbound)→error,
            isCc(r I)→(λc′. c′)(r I) s, error
      = (r I = unbound)→error,
            isCc(r I)→r I s, error

Thus if r I is bound to a command continuation (as it should be in the body
of a trap) then the effect of escapeto I is to pass the store to this
continuation. As is clear from the clause for traps, the continuation bound
to I will cause the appropriate postlude and then the rest of the program
to be done. Notice that for these semantic clauses to make sense
members of Cc must be denotable (i.e. Dv = ... + Cc + ...).
An iteration construct closely related to this kind of escape is described
in 10.2.
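The two clauses can be animated with a small sketch of my own (the Python helper names are invented, and outputting a constant stands in for arbitrary commands):

```python
def C_output_const(n):                 # output n, for illustration
    def cmd(r, c):
        return lambda s: (n, c(s))
    return cmd

def C_seq(c1, c2):                     # C1;C2
    return lambda r, c: c1(r, c2(r, c))

def C_escapeto(lbl):
    def cmd(r, c):                     # the normal continuation c is ignored
        return r[lbl] if lbl in r else (lambda s: "error")
    return cmd

def C_trap(body, postludes):           # trap C  l1:C1 ... ln:Cn  end
    def cmd(r, c):
        # each escape label denotes: do its postlude, then the rest c
        r2 = dict(r, **{l: post(r, c) for l, post in postludes.items()})
        return body(r2, c)
    return cmd

stop = lambda s: "stop"
# trap (output 1; escapeto x; output 2)  x: output 3  end
prog = C_trap(C_seq(C_output_const(1),
                    C_seq(C_escapeto("x"), C_output_const(2))),
              {"x": C_output_const(3)})
print(prog({}, stop)({}))              # -> (1, (3, 'stop')); the 2 is skipped
```

The escape discards the continuation that would have output 2, so the answer records 1, then the postlude's 3, then stop.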

7.1.2. Escapes from expressions


An example of an escape mechanism for expressions is Landin's J-
operator [Landin] or the jumpout facility in POP-2. To illustrate the es-
sential idea of these we introduce expressions jumpout I in E with mean-
ing as follows:
(i) To evaluate jumpout I in E one starts evaluating the body E. If
during E's evaluation a function call I(E₁) is encountered then E₁ is
evaluated and its value immediately returned as the value of
jumpout I in E.
(ii) If no call of I is encountered then the result of jumpout I in E is
just the result of E.
Thus in the body E of jumpout I in E, I denotes a 'function-like' value
which, when called, immediately returns the value of its actual parameter
as the value of jumpout I in E. I differs from an ordinary function in that it
never returns to the point from which it was called. For example
    jumpout I in ((if E then I(E₁) else E₂) + E₃)
is equivalent to
    if E then E₁ else (E₂ + E₃)
The semantics is straightforward:
    E[jumpout I in E] r k = E[E] r[(λe k′. k e)/I] k

E is evaluated in an environment in which the 'jumpout function' I is
bound to the function value λe k′. k e. This function value, when passed
an actual parameter e and 'return address' k′, promptly ignores k′ and
passes e to the rest of the program following jumpout I in E.
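The clause is easy to exercise in code. This is my own sketch with invented helpers and the store omitted, so continuations are just functions from values to answers:

```python
def E_const(n):
    return lambda r, k: k(n)

def E_add(e1, e2):                     # left-to-right CPS addition
    return lambda r, k: e1(r, lambda v1: e2(r, lambda v2: k(v1 + v2)))

def E_call(ident, arg):                # I(E1): pass value and 'return address'
    return lambda r, k: arg(r, lambda v: r[ident](v, k))

def E_jumpout(ident, body):            # jumpout I in E
    def e(r, k):
        jump = lambda v, k1: k(v)      # discards the call's continuation k1
        return body(dict(r, **{ident: jump}), k)
    return e

# jumpout j in (j(5) + 1)  ==  5 : the +1 is never done
expr = E_jumpout("j", E_add(E_call("j", E_const(5)), E_const(1)))
print(expr({}, lambda v: v))           # -> 5
```

If the body never calls j, the value of the body is returned as usual, matching case (ii).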

7.1.3. valof and resultis


A hybrid kind of escape construct occurs in BCPL and PAL. In these one
has an escape trap valof C which is an expression, though its body C is a
command. The expression valof C is used with the escape command
resultis E as follows:
(i) The value of valof C is obtained by evaluating C until a resultis E
is encountered. The value of valof C is then the value of E.
(ii) If the evaluation of C finishes without any resultis E being en-
countered then an error results.
For example a function fact to compute the factorial of n could be de-
clared by:
    fun fact(n);
        valof begin var x = 1;
                    while n > 0 do (x := n × x; n := n - 1);
                    resultis x
              end
The semantics of resultis E is roughly:
    C[resultis E] r c = R[E] r k
where k is the continuation representing the rest of the program following
the enclosing valof C. There are two possible ways of handling this k:
(i) One can make k an argument of the semantic function C, i.e.
change the type of C to:
    C: Com→Env→Ec→Cc→Cc
(here the Ec argument is the valof continuation and the Cc argument
the normal continuation) then:
    C[resultis E] r k c = R[E] r k
    E[valof C] r k = C[C] r k err
where the continuation err in the last clause is only used if C finishes with-
out encountering a resultis E. With this approach to handling k one also
has to change all the other semantic clauses for commands so that k is
either passed on or ignored. For example:
    C[output E] r k c = R[E] r λe s. (e, c s)
    C[C₁;C₂] r k c = C[C₁] r k;C[C₂] r k;c
(ii) One can make k part of the environment and redefine
    Env = [Ide→[Dv + {unbound}]] × Ec
and then
    C[resultis E] r c = R[E] r;el2 r
    E[valof C] r k = C[C] (el1 r, k) err
We must also redefine E[I] to handle these new environments:
    E[I] r k = (el1 r I = unbound)→error, k(el1 r I)
We discussed these two essentially equivalent methods of treating valof
and resultis to show that sometimes there is not a single, obviously pre-
ferred, way of doing things. The first method has the advantage that
escape continuations are not hidden, so the possibility of escape can im-
mediately be seen from the type of C. The second method has the ad-
vantage that it expresses the fact that escape continuations are 'statically
scoped' just like environments.
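Method (i) is easy to animate: every command receives both continuations, and resultis simply uses k. Here is a sketch of my own (invented names, store omitted, so Cc collapses to the answer itself):

```python
# Commands take (env, k, c): k is the valof continuation, c the normal one.
def C_resultis(exp):
    return lambda r, k, c: exp(r, k)          # normal continuation c ignored

def C_noaction():
    return lambda r, k, c: c                  # does nothing: just the rest c

def C_seq(c1, c2):
    return lambda r, k, c: c1(r, k, c2(r, k, c))

def E_valof(cmd):
    err = "error"                             # reached only if no resultis occurs
    return lambda r, k: cmd(r, k, err)

seven = lambda r, k: k(7)                     # the expression 7

# valof (noaction; resultis 7)  evaluates to 7
val = E_valof(C_seq(C_noaction(), C_resultis(seven)))({}, lambda v: v)
print(val)                                    # -> 7

# valof noaction  falls through to the err continuation
bad = E_valof(C_noaction())({}, lambda v: v)
print(bad)                                    # -> error
```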

7.2. Jumps
Jumps are changes in the flow of control typically invoked by goto com-
mands of the form goto I. For example if noaction is a command which
does nothing (i.e. C[noaction] r c = c) then
    L:if E then C;goto L else noaction
is equivalent to while E do C. Escapes are restricted jumps in which one
can only transfer control to the end of a construct. The semantics of
escapes easily generalizes to handle arbitrary jumps. For example:
    C[goto I] r c = E[I] r;Cc? λc′. c′
This is just like C[escapeto I]; general jumps differ not in the semantics
of changing the flow of control but in the places control can pass to. The
only problem is getting the appropriate continuation bound to the cor-
responding labels. To do this we must:
(i) Determine the scope of each label, i.e. the places in the program
from which one can jump to it. In PASCAL labels must occur in
procedure bodies and one can only jump between commands in
the same body. Thus the scope of a label in PASCAL is the
smallest procedure body in which it occurs. In ALGOL 60 on the
other hand the scope of a label is the smallest textually enclosing
block. Throughout its scope each label denotes a fixed value (a
continuation) just as other identifiers do.


(ii) Determine the continuation denoted by each label. For example
in ALGOL 60 one can jump 'into' for-statements and the arms of
conditionals, and so if a label occurs in the middle of such a
construct then the continuation it denotes is the 'rest of the
construct' followed by the 'rest of the program' following the
construct. Determining this continuation can be messy, as we
shall see.

7.2.1. The semantics of jumps


To illustrate a typical semantics we add labels and gotos to SMALL. Thus
we add to the commands of SMALL goto I and I:C with meaning such
that:
(i) The scope of each label is the smallest textually enclosing begin/
end block.
(ii) One can jump into both the arms of conditionals and the body of
while commands.
To handle the semantics of this we must ensure that on entering a block
begin D;C end we bind to each label occurring in C the continuation cor-
responding to the 'rest of the program' following it. Now this 'rest of the
program' is only implicitly defined by the position of the label, and so to
extract the appropriate continuation we define a function
J: Com→Env→Cc→Env such that
    J[C] r c = (c₁,...,cₙ/I₁,...,Iₙ)
where I₁,...,Iₙ are the labels occurring in C and c₁,...,cₙ the continuations
corresponding to them (if C is evaluated in environment r and with
continuation c). Thus cᵢ is the continuation from Iᵢ to the 'end of the pro-
gram' in which the block occurs. This consists of the program from Iᵢ to
the end of the block followed by the 'rest of the program' c from the end
of the block to the end of the program. Before we define J note that Com
is now defined by:
    C ::= E₁ := E₂ | output E | E₁(E₂) | if E then C₁ else C₂
        | while E do C | begin D;C end | C₁;C₂ | goto I | I:C
Then J is defined by cases as follows:
    J[E₁ := E₂] r c = J[output E] r c = J[E₁(E₂)] r c = ()
Since none of these commands can contain labels, J produces the empty
label binding () when applied to them.
    J[if E then C₁ else C₂] r c = J[C₁] r c [J[C₂] r c]
The label bindings specified by if E then C₁ else C₂ are the ones specified
by C₁ combined with those specified by C₂. Recall that r₁[r₂] is r₁ up-
dated by r₂, so the asymmetry of J[C₁] r c [J[C₂] r c] reflects an arbitrary
decision that if a label occurs in both C₁ and C₂ then it is the one in C₂
that counts. For example
    goto L;if E then L:C₁ else L:C₂
will cause a jump to C₂. If we had wanted to disambiguate the other way
we would have used J[C₂] r c [J[C₁] r c]. Probably the least ad hoc thing
to do would be to prohibit more than one occurrence of a label in a given
scope. To do this is straightforward but requires a messy syntax.
    J[while E do C] r c = J[C] r (C[while E do C] r c)
The label bindings specified by while E do C are those of C computed
with respect to a continuation C[while E do C] r c to reflect the fact that
after the body of a while is completed control returns to its beginning.
    J[begin D;C end] r c = ()
Since the scope of labels is the block in which they occur, one cannot jump
into blocks and so no label bindings are exported out.
    J[C₁;C₂] r c = J[C₁] r (C[C₂] r c) [J[C₂] r c]
The label bindings from C₁;C₂ are those of C₁, computed with respect to
the continuation C[C₂] r c, combined with those of C₂. If a label occurs in
both C₁ and C₂ then the one in C₁ is ignored (see the discussion of the
conditional).
    J[goto I] r c = ()
goto I cannot contain any labelled commands.
    J[I:C] r c = J[C] r c [C[C] r c/I]
I:C gives rise to all the bindings of C together with the new binding
C[C] r c/I, i.e. I is bound to the continuation corresponding to doing C
then the 'rest of the program' c.
Before we describe the use of J note that when executing the com-
mand I:C the I is ignored and hence
    C[I:C] = C[C]
The denotation of I:C is identical to that of C; the purpose of the "I:" is
just to mark a place in the program so J can extract the appropriate
continuation.
We can now use J to write a semantic clause for blocks which sets up
label bindings:
    C[begin D;C end] r c = D[D] r λr′. C[C] r[r′][r″] c
        whererec r″ = J[C] r[r′][r″] c

The explanation of why r″ must be recursively defined like this is rather
subtle and perhaps best conveyed by an example. Let C be the 'infinite
loop' L:output 1;goto L. It is easy to see from our definition of J that:
    J[C] r c = J[L:output 1;goto L] r c = (c₁/L)
        where c₁ = λs. (1, r L s)
Now if we just bound c₁ to L when evaluating L:output 1;goto L then:
    C[L:output 1;goto L] r[c₁/L] c s = (1,(1, r L s))
so 1 would be output twice and then control would pass out of the loop to
r L. This is wrong: what we want is to bind to L a continuation, c₂ say,
which after it outputs 1 'goes back' to c₂ again, and thus repeatedly out-
puts 1 forever. This is achieved by defining c₂ recursively by
c₂ = λs. (1, c₂ s), i.e. defining (c₂/L) = J[C] r[c₂/L] c, or in other words
defining c₂ = r″ L where r″ = J[C] r[r″] c. Then:
    C[L:output 1;goto L] r[c₂/L] c s = (1,(1,(1, ...)))
Generalizing this argument leads to the requirement that
r″ = J[C] r[r′][r″] c in the semantic clause for blocks.
The mutually recursive definition of the denotations of labels is not
really surprising if one reflects on the fact that in, for example:
    L₁:goto L₂;
    ...
    L₂:goto L₁;
the 'rest of the program' following L₁ starts with the 'rest of the program'
following L₂ and the 'rest of the program' following L₂ starts with the 'rest
of the program' following L₁.
Summing up, the semantics of jumps is:
    C[I:C] = C[C]
    C[goto I] r c = E[I] r;Cc? λc′. c′
    C[begin D;C end] r c = D[D] r λr′. C[C] r[r′][r″] c
        whererec r″ = J[C] r[r′][r″] c
Finally notice that to handle computed gotos such as
goto (if E then L₁ else L₂) we just replace the construct goto I with the
more general one goto E and define:
    C[goto E] r c = R[E] r;Cc? λc′. c′
(we use R to cope with the possibility that continuations are the values of
variables; this is discussed briefly in the next section).
Note also that we have arbitrarily assumed labels are identifiers; in
some languages (e.g. PASCAL) other things (e.g. numerals) are used. In
such cases a special label environment would be needed since label
bindings could no longer be held in the ordinary environment.

7.2.2. Assigning label values to variables


In some languages, e.g. PAL [Evans], one can assign labels to variables.
For example when goto I is executed in:
    begin
        ...
        L: ...
        I := L;
        ...
    end;
    goto I
control re-enters the block which has already been left. To handle the
semantics of this all one has to do is include command continuations in
the R-values. The semantic clauses for E₁ := E₂ and goto E described
previously will then handle such jumps automatically.
Treating labels as 'first class values' in this way enables powerful
control structures like backtracking and co-routines to be implemented. It
also makes the language hard to implement efficiently, since local storage
cannot be reclaimed after exiting from blocks and procedures as they
might be jumped back into later.
8. Various kinds of procedures and functions
In this chapter we discuss a number of 'dimensions' along which pro-
cedures (and functions) can be varied. In outline these are:
(i) Number of parameters.
(ii) Possibility of recursion.
(iii) Environment used to bind free identifiers (i.e. identifiers in the body
which are not formal parameters).
(iv) Coercion of the actual parameter value.
(v) Evaluation of the actual parameter expression.

8.1. Procedures (or functions) with zero or more parameters


In SMALL every procedure (and function) had exactly one parameter;
other possibilities are:

8.1.1. Zero parameters


We might have declarations proc I;C and fun I;E with semantics:
    D[proc I;C] r u = u(p/I)
        where p = λc. C[C] r c = C[C] r
    D[fun I;E] r u = u(f/I)
        where f = λk. E[E] r k = E[E] r
Here p and f have types Cc→Cc and Ec→Cc respectively so we define:
    Proc₀ = Cc→Cc  - procedure values of zero parameters
    Fun₀ = Ec→Cc   - function values of zero parameters
For the above semantic clauses to make sense Proc₀ and Fun₀ must be
included in (i.e. made summands of) the denotable values Dv.
To handle parameterless procedure calls we just add commands of the
form I and give them semantics:
    C[I] r c = E[I] r;Proc₀? λp. p c
Parameterless function calls are slightly more tricky since we already have
expressions of the form I. Two possibilities are:

(i) To syntactically distinguish calls - e.g. give them syntax I(), or
more generally E() - with semantics:
E[E()] r k = E[E] r;Fun0? λf. f k
(ii) To use the expression I both for parameterless function calls and
for naming constants and variables. In this case we must redefine
E[I] by:
E[I] r k = (r I = unbound)→err,
           isFun0(r I)→r I k, k(r I)
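The clauses of this section can be sketched in Python, with continuations as ordinary functions. The encoding below is our own illustration, not part of the book's metalanguage: stores are dicts, Proc0 and Fun0 values are tagged tuples, and identifier lookup calls a parameterless function automatically, as in the redefined E[I] clause above.

```python
# Hypothetical Python sketch: command continuations map stores to stores,
# expression continuations map a value to a command continuation.

def make_proc0(body):                 # body : Cc -> Cc
    return ("proc0", body)

def make_fun0(f):                     # f : Ec -> Cc
    return ("fun0", f)

def lookup(r, i, k):
    """E[I] r k: err if unbound; a Fun0 is called automatically."""
    if i not in r:
        return lambda s: ("error", s)
    v = r[i]
    if isinstance(v, tuple) and v[0] == "fun0":
        return v[1](k)                # parameterless function call
    return k(v)

def call0(r, i, c):
    """C[I] r c: a parameterless procedure call."""
    def k(v):
        if isinstance(v, tuple) and v[0] == "proc0":
            return v[1](c)
        return lambda s: ("error", s)
    return lookup(r, i, k)

# A Fun0 returning 42, a Proc0 that increments a counter in the store.
r = {"f": make_fun0(lambda k: k(42)),
     "p": make_proc0(lambda c: lambda s: c({**s, "n": s.get("n", 0) + 1})),
     "x": 7}

seen = []
record = lambda v: lambda s: (seen.append(v), s)[1]
lookup(r, "f", record)({})            # calls the Fun0: records 42
lookup(r, "x", record)({})            # plain value: records 7
final = call0(r, "p", lambda s: s)({})
```

Running the sketch, `seen` collects 42 then 7, and `final` is the store after one invocation of the Proc0.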

8.1.2. More than one parameter


We might have declarations proc I(I1,I2);C, proc I(I1,I2,I3);C, ... etc.
and fun I(I1,I2);E, fun I(I1,I2,I3);E, ... etc. with semantics:
D[proc I(I1,...,In);C] r u = u(p/I)
   where p = λc(e1,...,en). C[C] r[e1,...,en/I1,...,In] c
D[fun I(I1,...,In);E] r u = u(f/I)
   where f = λk(e1,...,en). E[E] r[e1,...,en/I1,...,In] k
For each n we need domains Procn and Funn for procedure and function
values with n parameters:
Procn = Cc→Ev^n→Cc
Funn = Ec→Ev^n→Cc
Then for the above semantic clauses to make sense we must make all
these procedure and function values denotable, i.e.
Dv = ... + Proc1 + Proc2 + ... + Fun1 + Fun2 + ...
For calling we use the syntax E(E1,...,En) with semantics:
C[E(E1,...,En)] r c =
   E[E] r;Procn? λp. E[E1] r λe1. ... E[En] r λen. p c (e1,...,en)
E[E(E1,...,En)] r k =
   E[E] r;Funn? λf. E[E1] r λe1. ... E[En] r λen. f k (e1,...,en)
Thus the actual parameters are evaluated from left to right. Other orders
of parameter evaluation would correspond to other orders of E[E1], ...,
E[En] in the right hand side of the semantic clauses.
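The left-to-right order fixed by these clauses can be seen in a small Python sketch (our own encoding, not the book's): each actual parameter is a function expecting an expression continuation, and the call clause nests those continuations so that each parameter's continuation starts the next.

```python
# Hypothetical CPS sketch of the Procn/Funn call clause: actuals are
# evaluated left to right because each one's continuation begins the next.

def lit(n, log, name):
    """An 'expression' that logs its evaluation and sends n to k."""
    return lambda k: (log.append(name), k(n))[1]

def call_n(f, actuals, k):
    """E[E(E1,...,En)]-style call: evaluate actuals, then apply f."""
    def step(i, es):
        if i == len(actuals):
            return f(k, tuple(es))          # f k (e1,...,en)
        return actuals[i](lambda e: step(i + 1, es + [e]))
    return step(0, [])

log = []
add3 = lambda k, es: k(sum(es))             # a Fun3-like value
result = call_n(add3,
                [lit(1, log, "E1"), lit(2, log, "E2"), lit(3, log, "E3")],
                lambda v: v)
```

Reordering the entries of `actuals` would change the logged evaluation order, just as reordering E[E1], ..., E[En] does in the semantic clause.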

8.2. Recursive procedures and functions


Consider the following SMALL procedure copy:
proc copy(n);
     if n = 0 then output 0 else (output read;copy(n-1))
The idea is that a call copy(n) will be equivalent to
output read;output read; ... ;output read;output 0
\_______________________________/
        n output commands
and so n items will be copied from the input to the output and then 0 will
be printed. Unfortunately according to the semantics we have given so far
this will not work. To see why not recall that:
D[proc I(I1);C] r u = u((λc e. C[C] r[e/I1] c)/I)
Thus the environment in which procedure bodies are done - r[e/I1] -
only differs from the declaration time environment r in that the actual
parameter value e is bound to the formal parameter I1. Thus in the body
the procedure name I denotes whatever it is bound to in r - i.e. whatever
I happens to be bound to before the declaration. For example the effect of
the command:
begin proc copy(n);output n;
      proc copy(n);if n = 0 then output 0 else
                        (output read;copy(n-1));
      copy(10)
end
would be to output the first item on the input, then to output 9 and then
to stop. The 'recursive' call copy(n-1) in the body of the second version of
copy would be a call to the first version.
To model recursive procedures we must bind the procedure value being
declared to its name in its body. If we use rec proc I(I1);C for recursive
procedure declarations and retain proc I(I1);C for non recursive ones then
the semantics are:

D[proc I(I1);C] r u = u(p/I)
   where p = λc e. C[C] r[e/I1] c

D[rec proc I(I1);C] r u = u(p/I)
   whererec p = λc e. C[C] r[p,e/I,I1] c

N.B. In both PASCAL and ALGOL 60 all procedures are recursive but in
general this need not be so.
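The difference between where and whererec is whether the procedure value is in scope in its own body environment. A hypothetical Python sketch (our own encoding, with environments as dicts) makes the knot-tying explicit:

```python
# proc vs rec proc: only the recursive declaration binds the procedure's
# own name in the environment its body is evaluated in.

def declare_proc(r, name, formal, body):
    p = lambda arg: body({**r, formal: arg})            # where
    return {**r, name: p}

def declare_rec_proc(r, name, formal, body):
    def p(arg):                                         # whererec: p sees p
        return body({**r, name: p, formal: arg})
    return {**r, name: p}

# A copy-like body: call whatever 'copy' denotes in the body environment.
def body(env):
    if env["n"] == 0:
        return ["zero"]
    return ["item"] + env["copy"](env["n"] - 1)

outer = {"copy": lambda arg: ["first"]}                 # an earlier copy
r1 = declare_proc(outer, "copy", "n", body)
r2 = declare_rec_proc(outer, "copy", "n", body)

non_recursive = r1["copy"](2)   # inner call reaches the earlier copy
recursive = r2["copy"](2)       # inner call reaches itself
```

With the non-recursive declaration the inner call falls through to the previously declared copy after one step, exactly as in the copy(10) example above.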
To handle the declaration of several mutually recursive procedures we
could introduce the construct:

rec proc I1(I11);C1
and proc I2(I21);C2
...
and proc In(In1);Cn

with semantics

D[rec proc I1(I11);C1
  and proc I2(I21);C2
  ...
  and proc In(In1);Cn] r u = u(p1,p2,...,pn/I1,I2,...,In)

   whererec p1 = λc1 e1. C[C1] r[e1,p1,p2,...,pn/I11,I1,I2,...,In] c1
   and      p2 = λc2 e2. C[C2] r[e2,p1,p2,...,pn/I21,I1,I2,...,In] c2
   ...
   and      pn = λcn en. C[Cn] r[en,p1,p2,...,pn/In1,I1,I2,...,In] cn


Recursive functions rec fun I(I1);E can be handled analogously to
recursive procedures:

D[fun I(I1);E] r u = u(f/I)
   where f = λk e. E[E] r[e/I1] k

D[rec fun I(I1);E] r u = u(f/I)
   whererec f = λk e. E[E] r[f,e/I,I1] k
8.2.1. Recursive functions in ALGOL 60 and PASCAL


In both ALGOL 60 and PASCAL the bodies of functions are commands
(not expressions) and the results are returned by assigning to the
function's name. For example in PASCAL a squaring function could be
defined by:
function square(n:real):real; square := n × n
An initially plausible semantics to handle functions which return results in
this way is as follows:
(i) The function name is made to denote a new location (initialized to
unassigned say) within its body.
(ii) When control leaves the function body the location denoted by its
name has its contents passed as the result of the call. If this con-
tents is still unassigned an error occurs.
It is routine (but messy) to write a semantic clause which expresses this
and we leave it as an exercise for the reader. Notice however that this ap-
proach cannot be combined with the semantics of recursive functions dis-
cussed in the previous section since the function name would have to be
bound to both a function value (to handle recursive calls) and a location
(to handle the returning of results). To cope with recursive functions we
would have to distinguish result returning assignments- i.e. assignments
to the function name-from ordinary ones. The former could then (for ex-
ample) be treated as assignments to some 'system variable'. To write a
semantics to model this would be straightforward but very messy-the
messiness of the semantics reflecting the messiness of the construct.

8.3. Static and dynamic binding


In some languages (for example LISP and POP-2) procedure (or function)
bodies are evaluated in the call time rather than the declaration time
environment. This is called dynamic binding to distinguish it from the static


binding of ALGOL 60 and PASCAL (and SMALL). In implementation
terms dynamic binding corresponds to using the 'dynamic chain' for look-
ing up identifiers-no 'static chain' is needed.
An example illustrating the difference between the two kinds of bind-
ing is:
begin const x = 1;
      proc p(y);output(x + y);
      begin const x = 2;
            p(3)
      end
end
With static binding 1 + 3 = 4 is output whereas with dynamic binding
2 + 3 = 5 is output. To see this note that when p is declared x = 1 but when
p is called x = 2.
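The example can be replayed in a small Python sketch (our own encoding): a statically bound procedure closes over its declaration-time environment, a dynamically bound one receives the call-time environment, and a mix-style procedure generalises both.

```python
# Static vs dynamic binding of free identifiers; environments are dicts.

def static_proc(r_decl, formal, body):
    return lambda r_call, arg: body({**r_decl, formal: arg})

def dynamic_proc(formal, body):
    return lambda r_call, arg: body({**r_call, formal: arg})

def mixed_proc(r_decl, formal, body, mix):
    return lambda r_call, arg: body({**mix(r_decl, r_call), formal: arg})

body = lambda env: env["x"] + env["y"]      # output(x + y)

outer = {"x": 1}                            # const x = 1
inner = {**outer, "x": 2}                   # const x = 2 in the inner block

static_result = static_proc(outer, "y", body)(inner, 3)     # 1 + 3
dynamic_result = dynamic_proc("y", body)(inner, 3)          # 2 + 3
# mix = lambda r, r2: r recovers static binding:
mixed_result = mixed_proc(outer, "y", body, lambda r, r2: r)(inner, 3)
```

The static call yields 4, the dynamic one 5, matching the analysis above; swapping the mix function to `lambda r, r2: r2` would make the mixed procedure behave dynamically.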

8.3.1. Semantics of binding


To describe the semantics of dynamic binding we simply change pro-
cedure values so they can be passed the call time environment and
change the semantic clause for calling to pass it (functions are analogous
to procedures and so we shall not bother to discuss them). Thus:
Proc = Env→Cc→Ec

D[proc I(I1);C] r u = u((λr' c e. C[C] r'[e/I1] c)/I)

C[E1(E2)] r c = E[E1] r;Proc? λp. E[E2] r;p r c

(N.B. the call time environment r is passed to the procedure value p.)

More generally one can imagine procedures whose bodies are evaluated in
a mixture of the declaration and call time environments. For example for
any function mix: Env × Env→Env we could have procedures defined by:
D[proc I(I1);C] r u = u((λr' c e. C[C] mix(r,r')[e/I1] c)/I)

Static and dynamic binding are special cases of this:
(i) Static binding corresponds to mix = λ(r,r'). r
(ii) Dynamic binding corresponds to mix = λ(r,r'). r'.
An example of a language with a non-trivial mix function is INTERLISP in
which mix depends on the 'alist' parameter to the function FUNCTION -
(FUNCTION F (X1 ... Xn)) creates a function value with X1,...,Xn bound
statically and all other free identifiers bound dynamically, i.e.
mix = λ(r,r'). r'[r X1,...,r Xn/X1,...,Xn]
(N.B. This discussion of INTERLISP is oversimplified and approximate!)

8.3.2. Advantages and disadvantages of dynamic binding


The main advantages of dynamic binding are:
(i) With dynamic binding recursion works automatically. Since pro-
cedure calls occur in the scope of their declaration the call time
environment will include the binding of the procedure's name to
its value. Put another way: by the time calls are done the pro-
cedure is already bound to its name and so recursive calls work.
(ii) Procedures can be written 'top down' -i.e. one can write pro-
cedures which call as yet undefined subprocedures. Parts of the
main program which do not use the undefined subprocedures can
be run directly.
(iii) On-line interactive modification and debugging is easy. If a pro-
cedure is redefined there is no need to recompile all programs
which use it. Procedures may thus be 'traced' by temporarily re-
defining them to print their arguments on entry and result on
exit.
For reasons such as these many interactive languages use dynamic bind-
ing. The main disadvantages of it are:
(iv) It is impossible to do compile time type checking since at compile
time one cannot determine what are the types of the things free
identifiers in procedures will be bound to.

(v) Subtle and elusive 'bugs' can appear as a result of procedures or


functions being called in contexts in which some of their free
identifiers have been rebound. One cannot protect procedure
definitions against subsequent inadvertent corruption.
(vi) The conceptual distinction between formal parameters and free
identifiers is blurred since all binding is done at call time.
On the whole people seem to feel that the disadvantages of dynamic bind-
ing outweigh the advantages-especially as most of the advantages can
be obtained with a statically bound language if a good compiler and de-
bugger are available.

8.4. Parameter passing mechanisms


In SMALL the undereferenced value of the actual parameter is bound
directly to a procedure's formal parameter. This is somewhat similar to
PASCAL's call by reference (the differences are discussed below). In
other languages various transformations are done to the actual
parameter's value before it is bound to the formal parameter. We shall call
such transformations parameter passing mechanisms and discuss below
the main mechanisms that commonly occur.

8.4.1. Call by value


Call by value, as discussed here, is found in most common languages in-
cluding ALGOL 60 and PASCAL. The value of the actual parameter is first
dereferenced, the resulting R-value is then stored in a new location (local
to the procedure body) and finally this new location is bound to the formal
parameter. For example consider
begin proc P(x);x := 1;
      var x = 2;
      P(x);
      output x
end
In SMALL the call P(x) would simply assign 1 to the variable bound to x
and so 1 would be output. In ALGOL 60 or PASCAL (with x called by
value) the variable bound to x in the body of P would be different from the

one bound by the declaration var x = 2. Thus no side effect would occur
and 2 would be output.
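The example can be traced with an explicit store. In the hypothetical Python model below (our own, with locations as list indices) simple call by reference passes the caller's location while call by value dereferences and allocates a fresh one.

```python
# Hypothetical store model: locations are indices into a list.

store = []

def ref(v):                       # allocate a new location holding v
    store.append(v)
    return len(store) - 1

def body(loc):                    # proc P(x); x := 1
    store[loc] = 1

x = ref(2)                        # var x = 2

body(x)                           # SMALL: pass the location itself
by_ref_result = store[x]          # the assignment is visible: 1

store[x] = 2                      # reset x to 2
body(ref(store[x]))               # call by value: deref, then a fresh ref
by_value_result = store[x]        # no side effect: still 2
```

Under the reference-passing call the outer variable ends up holding 1; under the value-passing call the assignment lands in the fresh location and the outer variable keeps 2.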
To model call by value we must arrange that inside a procedure's body
the formal parameter denotes a new reference to the dereferenced actual
parameter value. This can be described in at least three ways:
(i) The semantic clause for procedure calling can do the necessary
dereferencing and creation of a new location and then pass this
new location to the procedure value which binds it directly.
C[E1(E2)] r c =
   E[E1] r;Proc? λp. E[E2] r;deref;ref;p c
D[proc I(I1);C] r u = u(p/I)
   where p = λc e. C[C] r[e/I1] c
(ii) The procedure value can do the dereferencing and creation of a
new location - in this case the semantic clause for procedure
calling would just pass the undereferenced actual parameter value.
C[E1(E2)] r c = E[E1] r;Proc? λp. E[E2] r;p c
D[proc I(I1);C] r u = u(p/I)
   where p = λc. deref;ref λe. C[C] r[e/I1] c
(iii) The semantic clause for procedure calling could do the de-
referencing and the procedure value the allocation of a new
location:
C[E1(E2)] r c = E[E1] r;Proc? λp. E[E2] r;deref;p c
D[proc I(I1);C] r u = u(p/I)
   where p = λc. ref λe. C[C] r[e/I1] c
For languages like ALGOL 60 and PASCAL each of these three ap-
proaches would be equivalent since procedures are always entered im-
mediately after being called. In other languages, POP-2 and SL5 [Hanson]
for example, it is possible to bind a procedure to its actual parameters but
not execute the body. The execution of the resulting 'closure' can then be
invoked separately-perhaps much later. In cases like this each of the
three semantics above is different:
(i) With this dereferencing and storage allocation (via ref) occur at
parameter binding time so if they cause errors the program will
stop then.

(ii) With this dereferencing and storage allocation are done not when
the actual parameters are bound but when the resulting closure
is executed - thus errors will occur then.
(iii) With this dereferencing errors would manifest themselves at para-
meter binding time and allocation errors at closure execution
time.
In fact in POP-2 everything is done when the parameters are bound (as in
(i) above) whereas in SL5 everything is done when the procedure body is
executed (as in (ii) above). Thus which of these three descriptions to use
in general depends on the language being described. When they are equi-
valent-as with ALGOL 60 and PASCAL-we shall adopt approach (ii) for
the following two reasons:
(a) The parameter passing mechanism is usually specified in the pro-
cedure declaration (for example in PASCAL
procedure p(var x:real) ... for variable parameters and
procedure p(x:real) ... for value parameters). Thus conceptually
the semantics of parameter passing should be embedded in the
procedure value
(b) It is nice to have a single semantic clause for procedure calls,
namely:
C[E1(E2)] r c = E[E1] r;Proc? λp. E[E2] r;p c

If we put the semantics of parameter passing into the semantic


clause for calls then we would require a messy cases switch in its
right hand side, for example:
C[E1(E2)] r c =
   E[E1] r;Proc? λp. varparameter p→E[E2] r;p c,
                     valueparameter p→E[E2] r;deref;ref;p c

As we shall see in the section on calling mechanisms this sort of


thing is sometimes inevitable - however when it can be avoided we
shall do so.
(a) and (b) above are typical of the sort of 'reasoning' one often has to use
to decide between equivalent formal descriptions. The decision must be
based on judgements both of conceptual appropriateness (e.g. (a)) and
technical convenience (e.g. (b)).

8.4.2. Call by reference


We shall use the phrase call by reference for any parameter passing
mechanism in which if the actual parameter value is a location then it is
bound directly (i.e. undereferenced) to the formal parameter. Different
kinds of call by reference correspond to different actions taken when the
actual parameter value is not a location.

8.4.2.1. Simple call by reference


With simple call by reference the actual parameter value is bound directly
to the formal parameter - nothing is done if this value is not a location.
This is the parameter passing method of SMALL and is described by:
D[proc I(I1);C] r u = u(p/I)
   where p = λc e. C[C] r[e/I1] c

8.4.2.2. PASCAL call by reference


Formal parameters of PASCAL can be decorated with var as in
procedure P(var x:real).... In this case the value of the corresponding
actual parameter must be a location (if not an error results) which is bound
directly. The semantics is thus:
D[proc I(I1);C] r u = u(p/I)
   where p = λc. Loc? λl. C[C] r[l/I1] c

8.4.2.3. Fortran call by reference


In (some implementations of) FORTRAN parameters are passed in the
following way:
(i) If the value of the actual parameter is a location then this is bound
directly to the formal parameter.
(ii) If the value of the actual parameter is not a location then a new
location containing it is bound to the formal parameter.
This differs from PASCAL call by reference because of (ii) - in PASCAL

an error would occur. The semantics of FORTRAN call by reference is:

D[proc I(I1);C] r u = u(p/I)
   where p = λc. fortran λl. C[C] r[l/I1] c
   where fortran = λk e. isLoc e→k e, ref k e

The function fortran: Ec→Ec takes a continuation k and produces a
continuation which when sent a value e tests it to see if it is a location; if
so it sends it straight on to k, otherwise it sends a reference to e to k.

8.4.3. Call by value and result


In this parameter passing method:
(i) The actual parameter value e must be a location.
(ii) Inside the procedure body the formal parameter denotes a new
location, l say, containing the contents of e.
(iii) On leaving the procedure the contents of l is copied into e.

This is called call by value and result since (i) and (ii) are similar to call by
value (with the extra requirement that the actual parameter value must be
a location) and (iii) is sometimes known as call by result. The semantics is
routine but slightly messy:
D[proc I(I1);C] r u = u(p/I)
   where p = λc e. (Loc?;deref;ref λl. C[C] r[l/I1] (cont(update e c) l)) e
The definition of p has the form λc e. (... e ...) e - we cannot cancel the e's
because of the inner occurrence (in update e c).
The purpose of call by value and result is to simultaneously gain ad-
vantages of both call by reference and call by value. Call by reference has
the virtue that results can be returned via formal parameters whereas with
call by value this does not work but, as compensation, the dangers of in-
advertent sharing are eliminated. For example consider the following pro-
gram to compute the factorial of n and place the result in the location
denoted by res.
proc fact(n,res);
     res := 1;
     while not (n = 0) do (res := n × res; n := n-1)
This program has a lurking bug: suppose one calls fact (x,x) (hoping to

replace x's value by its factorial) then things will go wrong. Inside the call
fact(x,x) both n and res will denote the location denoted by x; thus the
initialization res := 1 will also set the value of n to 1 and the result of the
call will, after one iteration, be to assign 0 as the contents of this location.
Thus fact(x,x) is equivalent to x := 0!
If we used call by value and result on fact then things would work as
planned since the effect of a call fact(x,x) would be:
(i) New locations, each initialized to x's R-value, would be bound to n
and res.
(ii) The initialization res := 1 would only affect the location bound to
res.
(iii) The while command would correctly build up n! in res.

Thus with call by value and result we get the best of both worlds - the
ability to return results via parameters and the security of call by value.
This has to be balanced against the ugly and ad hoc nature of the
mechanisms.

8.5. Procedure calling mechanisms


In the previous section we assumed that when a procedure is called the
first thing that happens is that the actual parameter is evaluated. Thus:
C[E1(E2)] r c = E[E1] r;Proc? λp. E[E2] r;p c
In many languages there are kinds of procedures which, when called, do
not 'fully evaluate' the actual parameter. Examples we shall discuss later
are ALGOL 60 call by name and LISP FEXPRs. The general shape of the
semantics of such a call is:
C[E1(E2)] r c = E[E1] r;Proc? λp. p c ('partially evaluated E2')
The evaluation of the actual parameter E2 is completed each time the
formal parameter denoting it is encountered during the execution of the
procedure's body.
Before getting down to details a note on our terminology is appropriate.
For us procedure calling is the processing of the actual parameter expres-
sion to yield some (partially or fully evaluated) value, whereas parameter
passing is the further processing of this value to yield whatever is eventu-
ally bound to the formal parameter. Different procedure calling methods


are modelled via different semantic clauses for procedure calls whereas
(as we saw in the last section) different parameter passing methods cor-
respond to different procedure values and so are modelled with different
semantic clauses for procedure declarations. It would, perhaps, have
been better to have given the various parameter passing methods names
such as "pass by value" or "pass by reference" and so have been able to
use "call by" exclusively for calling methods - however we defer to con-
vention. The reader should note that our terminology is not always stan-
dard though.

8.5.1. Call by closure (ALGOL 60 call by name)


With this the closure E[E2] r of the actual parameter E2 in the call time
environment r is passed. The semantics of calls is thus:
C[E1(E2)] r c = E[E1] r;Proc? λp. p c (E[E2] r)
E[E2] r has type Ec→Cc hence we define the domain Closure of closures
by:
Closure = Ec→Cc
Closures are usually implemented using parameterless functions called
"thunks"; thus it is not surprising that the denotation of closures has the
same type as that of parameterless functions (Closure = Fun0 - see
8.1.1).
With call by closure, closures must be made denotable and the
semantic clause defining E[I] must be fixed so that if I denotes a closure
then it is evaluated. Thus
E[I] r k = (r I = unbound)→err,
           isClosure(r I)→r I k, k(r I)
This ensures that when a formal parameter bound to a closure is en-
countered in the procedure body the corresponding closure is evaluated.
Notice that this clause is similar to that required for parameterless function
calls.

As an example consider:
begin proc P(x);(output x;output x);
      P(read)
end
If P is called by closure then, if r is the call time environment, E[read] r is
bound to x and so at call time no reading is done. During the execution of
P's body each evaluation of x will cause E[read] r to be evaluated and
thus the two occurrences of output x will cause the first two items from
the input to be output. If call by value had been used then at call time the
first item on the input would have been read and bound to x and then this
would have been output twice.
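The read example corresponds closely to the standard thunk implementation of closures. A hypothetical Python sketch (our own encoding, with the input as a queue):

```python
from collections import deque

inp = deque([10, 20, 30])                # the input stream
out = []

def read():
    return inp.popleft()

def P_by_closure(x):                     # proc P(x);(output x;output x)
    out.append(x())                      # each use re-evaluates the thunk
    out.append(x())

def P_by_value(x):
    out.append(x)
    out.append(x)

P_by_closure(read)                       # P(read): the thunk reads twice
closure_out = list(out)

out.clear()
P_by_value(read())                       # read once at call time
value_out = list(out)
```

Under call by closure the two uses of x consume two input items (10 then 20); under call by value the single item read at call time (30) is output twice.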

8.5.2. Call by text (LISP FEXPRs)


With this the text E2 ∈ Exp of the actual parameter is passed and thus the
semantics is:
C[E1(E2)] r c = E[E1] r;Proc? λp. p c E2
Formal parameters can now denote members of Exp and so these must
be included in the denotable values Dv. To ensure that when text de-
noting formal parameters are evaluated the corresponding expression text
is evaluated we must define:
E[I] r k = (r I = unbound)→err,
           isExp(r I)→E[r I] r k, k(r I)
Notice that with call by text the environment in which actual parameters
are eventually evaluated depends on where the corresponding formal
parameters occur in the procedure body and so might vary. With call by
closure this environment is fixed as the call time one. To illustrate this
consider:
begin const x = 1;
      proc P(y);begin const x = 2; output y end;
      P(x)
end
If P is called by text y will be bound to the text x in P; since x denotes 2
when the command output y is executed, 2 will be output. With call by
closure E[x] r would be bound to y where r is the call time environment in
which x denotes 1; when output y is done E[x] r is evaluated yielding x's
value in r - i.e. 1 - which is then output.
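The contrast can be sketched in Python with a toy evaluator (our own encoding; 'texts' here are just identifier strings, so `evaluate` is a stand-in for E[.]):

```python
# Call by text vs call by closure: a text is evaluated in the environment
# current at each use of the formal parameter; a closure carries the
# call-time environment with it.

def evaluate(text, env):
    return env[text]

outer = {"x": 1}                         # const x = 1

def P(y, call_env):
    """Body: begin const x = 2; output y end.
    y is ('text', Exp) or ('closure', thunk)."""
    inner = {**call_env, "x": 2}
    kind, v = y
    if kind == "text":
        return evaluate(v, inner)        # evaluated at the point of use
    return v()                           # closure: environment fixed already

by_text = P(("text", "x"), outer)
by_closure = P(("closure", lambda: evaluate("x", outer)), outer)
```

The text is evaluated where y occurs, so it sees the inner x = 2; the closure evaluates x in the call time environment and yields 1, as in the discussion above.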

8.5.3. Call by denotation


With this the denotation E[E2] of the actual parameter is passed, thus:
C[E1(E2)] r c = E[E1] r;Proc? λp. p c (E[E2])
Formal parameters can now denote members of Ed = Env→Ec→Cc
(since E[E2] ∈ Ed) which must therefore be included in the denotable
values. We must also ensure that denotation denoting formal parameters
get evaluated when used, thus:
E[I] r k = (r I = unbound)→err,
           isEd(r I)→r I r k, k(r I)
The difference between call by text and denotation is fairly subtle since, as
we have described them, they would yield identical final answers. If we
did not always force the evaluation of texts and denotations, i.e. if we
reverted to
E[I] r k = (r I = unbound)→err, k(r I)
then texts but not denotations could be subjected to various syntactic
processes like compiling, optimising etc., e.g.
proc run(x);eval(optimise x);
where
E[eval E] r k = E[E] r;Exp? λe. E[e] r k
All one could do with denotations would be to 'run' them, for example
using run E where:
E[run E] r k = E[E] r;Ed? λe. e r k

8.5.4. Quotation constructs


Consider expressions text E, denotation E and closure E with

semantics:
E[text E] r k = k(E)
E[denotation E] r k = k(E[E])
E[closure E] r k = k(E[E] r)
Using these we can simulate the various ways of calling a procedure P on
actual parameter E as follows:
call by text:       P(text E)
call by denotation: P(denotation E)
call by closure:    P(closure E)
In LISP the expressions (QUOTE E), (COMPILE E) and (FUNCTION E)
correspond roughly to text E, denotation E and closure E.

8.6. Summary of calling and passing mechanisms


Gathering together the various mechanisms discussed in the previous two
sections we see that the general sequence of events when a procedure is
invoked is:
At call time: The actual parameter is evaluated to some degree
and passed to the procedure. At this time the
distinction between calls by text, denotation and
closure is made.
At entry time: The value passed to the procedure is transformed
to a denotable value to be bound to the formal
parameter. At this time the distinction between
calls by value and reference is made.
At execution time: The body of the procedure is executed. If any
formal parameters denote texts, denotations or
closures then it is during this execution that they
are evaluated.
At exit time: The store resulting from the execution of the body
is possibly transformed (for example with call by
value and result) and then passed to the continua-
tion following the call.

The general form of the semantics is:


C[E1(E2)] r c = E[E1] r;Proc? λp. call p c E2 r
D[proc I(I1);C] r u = u(p/I)
   where p = λc. entry λe. C[C] r[e/I1] (exit c)
E[I] r k = (r I = unbound)→err,
           isnoteval(r I)→eval(r I) k, k(r I)

where, for example, ALGOL 60 call by value corresponds to:

call p c E2 r = E[E2] r;p c
entry k = deref;ref;k
exit c = c

whereas ALGOL 60 call by name corresponds to:

call p c E2 r = p c (E[E2] r)
entry k = k
exit c = c
eval e k = e k
Writing the semantics like this shows that many possibilities exist for
manufacturing new kinds of procedures by combining different calling
and parameter passing mechanisms. For example we could combine a call
by closure calling mechanism with a call by value like parameter passing
mechanism to get a system in which formal parameters denote references
to the closures of actual parameters. To do this we would take:
call p c E2 r = p c (E[E2] r)
entry k = ref k
Finally notice that everything we have said about procedures applies
analogously to functions.
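The call/entry/exit decomposition can be expressed directly as a parameterised invocation function. The hook names below (invoke, call_value, entry_value) are our own, not the book's; the store is a dict from location names to values, and the instantiation shown is call-by-value-like.

```python
# One invocation = calling mechanism + entry transformation + body +
# exit transformation, as in the four-phase summary above.

def invoke(call, entry, exit_, body, actual, store):
    passed = call(actual, store)         # call time
    denoted = entry(passed, store)       # entry time
    s2 = body(denoted, store)            # execution time
    return exit_(s2)                     # exit time

def call_value(actual, store):           # fully evaluate the actual parameter
    return actual(store)

def entry_value(v, store):               # deref;ref: fresh location holding v
    loc = "loc%d" % len(store)
    store[loc] = v
    return loc

def body(loc, s):                        # the procedure body: bump the cell
    s[loc] = s[loc] + 1
    return s

store = {"x": 41}
out = invoke(call_value, entry_value, lambda s: s, body,
             lambda s: s["x"], store)
```

Swapping in different call/entry/exit hooks gives the other mechanisms of the table: a thunk-returning call for closures, an identity entry for simple reference passing, a copy-back exit for value and result.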

8.7. Procedure and function denoting expressions (abstractions)


In ALGOL 60 and PASCAL one can pass a procedure (or function) as an
actual parameter, but the only way of doing this is to write the procedure
(or function) name directly. In other languages there are expressions other
than identifiers called abstractions which evaluate to procedure (or func-
tion) values. For example we might add to SMALL expressions pr I;C and
fn I;E with semantics:
E[pr I;C] r k = k(λc e. C[C] r[e/I] c)
E[fn I;E] r k = k(λk' e. E[E] r[e/I] k')
Notice that with these constructs we can dispense with separate pro-
cedure and function declarations since proc I(I1);C is exactly equivalent
to const I = pr I1;C and fun I(I1);E is exactly equivalent to
const I = fn I1;E. For example
D[const I = pr I1;C] r u
= E[pr I1;C] r λe. u(e/I)
= (λe. u(e/I)) (λc e. C[C] r[e/I1] c)
= u((λc e. C[C] r[e/I1] c)/I)
= D[proc I(I1);C] r u
Because the procedure (or function) denoted by pr I;C (or fn I;E) has no
name these expressions are sometimes called anonymous procedures (or
functions). Just as ordinary procedures (or functions) can have a variety
of calling, parameter passing and identifier binding mechanisms so can
abstractions. For example abstractions with call by value and dynamic
binding would have semantics:
E[pr I;C] r k = k(λr' c. deref;ref λl. C[C] r'[l/I] c)
In LISP (QUOTE (LAMBDA (X) E)) is dynamically bound whereas
(FUNCTION (LAMBDA (X) E)) is statically bound (the latter is thus
more analogous to fn X;E than the former).

8.8. Declaration binding mechanisms


A command of the form begin var I = E;C end is similar in effect to
(pr I;C)(E) where the parameter I is called by value. Both evaluate C in an
environment in which I 'locally' denotes a reference to E's R-value. Thus
variable declarations are related to call by value. In fact to any parameter
passing mechanism there corresponds a kind of declaration, for example
to abstractions pr I;C with semantics
E[pr I;C] r k = k(λc. entry λe. C[C] r[e/I] c)

there corresponds a declaration dec I = E with semantics

D[dec I = E] r u = E[E] r;entry λe. u(e/I)
Thus var I = E corresponds to procedures with
entry k = deref;ref;k
i.e. to call by value; whereas const I = E corresponds to procedures with
entry k = deref;k
Procedures of this sort can be found lurking in ALGOL 68.
This semantic correspondence between declarations and parameter
passing is often obscured in programming languages. For example
declarations might exist without there being any way of defining pro-
cedures with the corresponding parameter passing mechanism (or vice
versa). PASCAL has been criticised on these grounds [Tennent] and the
same criticisms can be applied to SMALL as follows: although there are
constant declarations const I = E there is no corresponding parameter
passing method. Thus in SMALL (and PASCAL) there is no way of
declaring a procedure P so that
begin const I = E; C end
is equivalent to P(E).
9. Data Structures
In this chapter we discuss four kinds of data structures: references
(pointers), arrays, records and files. References have a single component,
arrays and records have several components (the number being deter-
mined when the structure is declared) and files have a dynamically varying
number of components. The components of arrays are accessed with a
subscript (e.g. A[3]) whereas the components of records are accessed
with a name (e.g. R.three). The components of files are accessed by
moving a 'window' up and down the file.
These four data structures are precisely those that are found in
PASCAL. We illustrate the main ideas (but not all the details) in the con-
text of SMALL (indicating here and there how things differ in PASCAL).

9.1. References
A reference to a value is just a location holding it. Suppose ref I = E is a
declaration which binds I to a reference to E's value and cont E is an ex-
pression whose value is the contents of the reference denoted by E, then:
D[ref I = E] r u = E[E] r;ref λl. u(l/I)
E[cont E] r k = E[E] r;Loc?;cont;k
Thus if E's value is a location l then ref I = E binds I to another location
containing l. This differs from var I = E which would bind I to a location
containing l's contents. To illustrate cont E suppose identifiers I1 and I2
denote locations l1 and l2 respectively, then:
C[cont I1 := cont I2] r c
= E[cont I1] r;Loc?
    λl. R[cont I2] r;update l c
= E[I1] r;Loc?;cont;Loc?
    λl. E[cont I2] r;deref;Rv?;update l c
= E[I1] r;Loc?;cont;Loc?
    λl. E[I2] r;Loc?;cont;deref;Rv?;update l c
Thus we update the location which is the contents of l1 with the de-
referenced contents of l2 - i.e. the contents of the contents of l2.
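The clauses for ref and cont can be traced against a concrete store. The Python sketch below is our own illustration (integer locations, a dict store):

```python
# ref I = E vs var I = E when E's value is itself a location.

store = {}
_next = [0]

def new_loc(v):                 # allocate a fresh location holding v
    l = _next[0]
    _next[0] += 1
    store[l] = v
    return l

x_loc = new_loc(7)              # var x = 7: x denotes x_loc
ref_I = new_loc(x_loc)          # ref I = x: I's location holds x_loc itself
var_J = new_loc(store[x_loc])   # var J = x: J's location holds x's contents

cont_I = store[ref_I]           # cont I yields the location x_loc
store[store[ref_I]] = 8         # cont I := 8 updates x through x_loc
```

cont I yields the location x_loc, so the assignment through it changes x's contents, while J's location (which received only the contents 7) is unaffected.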


9.2. Arrays
An array is a data structure whose components are accessed with sub-
scripts. References are like the special case of arrays with just one
component. To declare an array A with components A[1], A[2], ..., A[n]
we shall use the declaration array A[1;n]. Normally each component of
an array is a variable which is initially unassigned; however constant
arrays are possible. In the former case array components can be changed
by assignment, in the latter this would cause an error. We shall just deal
with arrays of variables subscripted by integers (as in ALGOL 60); other
possibilities such as constant arrays, or arrays subscripted by other
values, can be handled in an analogous way.
Thus, in general, a declaration array I[E1;E2] binds I to an array value,
and this value contains:
(i) The lower bound, n1 say, specified by E1.
(ii) The upper bound, n2 say, specified by E2.
(iii) The n2-n1+1 locations comprising the array.

A suitable domain Array of array values is thus:

Array = Num x Num x Loc*

where the three components are the lower bound, the upper bound and
the locations.

Each time array I[E1;E2] is evaluated E1 will be evaluated to get n1, E2
will be evaluated to get n2 and then new locations l1, l2, ..., ln (where
n = n2-n1+1) will be obtained and the array value (n1,n2,l1,l2,...,ln)
bound to I. To access array components we use expressions of the form
E1[E2]; the value of E1 must be an array value and the value of E2 an
integer between the lower and upper bounds. E1[E2] then denotes the
appropriate location. To simplify the semantic clauses for array declara-
tion and accessing it helps to first define some functions.

9.2.1. news
Informal description
news: Num-Store-[[Loc* x Store] + {error}]
news n s = ((l1,...,ln), s[unassigned,...,unassigned/l1,...,ln])
               if s has n unused locations l1,...,ln
          = error   otherwise
Thus news n s gets n new locations from s and returns them together
with a store in which they each have contents unassigned.
Formal description
news = An s . (n = 0)-(( ),s),
       ((news (n-1) s = (l*,s'))-
        ((new s' = l)-(l . l*, s'[unassigned/l]),error),
        error)
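The behaviour of news can be sketched in Python (our own modelling, not part of the formal text; the fixed address space MAX stands in for the store running out of unused locations):

```python
# Sketch of news: a store is a dict from locations 0..MAX-1 to values;
# news(n, s) returns (locations, new_store) with n fresh locations
# holding 'unassigned', or 'error' if the store is exhausted.
MAX = 8  # a tiny address space, for illustration only

def new(s):
    """One unused location, or 'error' (models the primitive new)."""
    for l in range(MAX):
        if l not in s:
            return l
    return 'error'

def news(n, s):
    # mirrors the formal definition: get n-1 locations, then one more
    if n == 0:
        return (), s
    rest = news(n - 1, s)
    if rest == 'error':
        return 'error'
    ls, s1 = rest
    l = new(s1)
    if l == 'error':
        return 'error'
    return (l,) + ls, {**s1, l: 'unassigned'}

locs, s = news(3, {0: 'in use'})
assert len(locs) == 3 and 0 not in locs
assert all(s[l] == 'unassigned' for l in locs)
assert news(10, {}) == 'error'
```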

9.2.2. newarray
Informal description
newarray:[Num x Num]-Ec-Cc
newarray (n1,n2) k s = k e s' where e is the array value (n1,n2,l1,...,ln)
(n = n2-n1+1) and s' = s[unassigned,...,unassigned/l1,...,ln]. Thus
newarray (n1,n2) k sends a new array value on to k together with a
modified store.
Formal description
newarray = A(n1,n2)ks.
  (n1>n2)-error,
  (news (n2-n1+1) s = (l*,s'))-k(n1,n2,l*) s',error

9.2.3. subscript
Informal description
subscript:Array-Ec-Ec
subscript a k e = k l    if l is the e-th component of the array a
                = err    otherwise
subscript a k e checks that e is a number between the bounds of a and if
so sends to k the appropriate location.
Formal description
subscript = A(n1,n2,l*)ke.
  isNum e-(between (n1,n2) e-k(el (e-n1+1) l*),err),err
where between (n1,n2) e is true if n1 is less than or equal to n2 and e is
between them, and false otherwise.
We can now write the semantics of array declarations and accesses:
D[array I[E1;E2]] r u
 = R[E1] r;Num? An1 . R[E2] r;Num? An2 . newarray (n1,n2) Ae . u(e/I)
E[E1[E2]] r k
 = E[E1] r;Array? Ae . R[E2] r;Num?;subscript e k
In PASCAL array bounds are fixed at compile time so in array I[E1;E2]
the bounds expressions E1 and E2 must be evaluable then.
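Here is a hedged Python sketch of how newarray and subscript cooperate (ours, not SMALL's official semantics: continuations are ordinary functions, errors are strings, and news here simply hands out successive unused integers):

```python
# Sketch: an array value is a triple (n1, n2, locs); continuations k
# are ordinary Python functions.

def news(n, s):
    """n fresh locations (successive unused integers, for illustration)."""
    base = max(s, default=-1) + 1
    locs = tuple(range(base, base + n))
    return locs, {**s, **{l: 'unassigned' for l in locs}}

def newarray(n1, n2, k, s):
    if n1 > n2:
        return 'error'
    locs, s1 = news(n2 - n1 + 1, s)
    return k((n1, n2, locs), s1)   # send the array value on to k

def subscript(a, k, e):
    n1, n2, locs = a
    if not isinstance(e, int) or not (n1 <= e <= n2):
        return 'err'               # not a number between the bounds
    return k(locs[e - n1])         # el (e-n1+1) l*

# array A[2;4], then look up the location of A[3]
loc, store = newarray(2, 4, lambda a, s: subscript(a, lambda l: (l, s), 3), {})
assert loc == 1
assert store[loc] == 'unassigned'
assert subscript((2, 4, (0, 1, 2)), lambda l: l, 9) == 'err'
```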

9.3. Records
A record is a data structure with a fixed number of named components or
fields. We shall use record I(I1,...,In) to declare a record named I with
fields named I1,...,In. As with arrays we shall assume each component is a
location which can be assigned to. To access component I of a record de-
noted by E we use the expression E. I. To illustrate records here is a
declaration of a record named family with field names father, mother,
son, daughter:
record family (father, mother, son, daughter)
To initialize this we could do:
family.father := "Phil";
family.mother := "Liz";
family.son := "Charlie";
family.daughter := "Annie";

(assuming quotations such as "Phil" are valid expressions).


In PASCAL there is a command with E do C which when evaluated
first evaluates E and checks its value is a record; then C is evaluated in an
environment where each field name of this record is bound to the cor-
responding field. The initialization of family could thus be achieved more
succinctly by:
with family do father: = "Phil";
mother: = "Liz";
son: = "Charlie";
daughter: = "Annie"
Notice that E.I and with E do C are similar: both cause the evaluation of
a construct - I in the case of E.I, C in the case of with E do C - in an
environment in which the field names of the record denoted by E denote
the corresponding fields. Thus both E.I and with E do C are kinds of
block - E acts like a declaration (of field names) and I (in the case of E.I)
or C (in the case of with E do C) are the bodies. This observation sug-
gests two things:
(i) That E.I be generalized to E1.E2 - i.e. the 'body' of the 'block' be
allowed to be a full expression. This would then allow things like
E.(if ... then mother else father).
(ii) That a syntax should be chosen which expresses the similarity bet-
ween E.I and with E do C. For example either E.I and E.C or
with E do I and with E do C.

For the semantics of records we must define a domain Record of record


values. Since records specify a location for each field name an appropriate
domain is:
Record = Ide-[Dv + {unbound}] = Env
By using Dv here rather than just Loc we allow for the possibility (not
discussed above) that record fields can be other things than locations. If
r is in Record then r I = unbound means I is not a field name of r.
The semantics of the various constructs is now routine:
D[record I(I1,...,In)] r u s
 = (news n s = (l1,...,ln,s'))-u((l1,...,ln/I1,...,In)/I) s',error

E[E1.E2] r k
 = E[E1] r;Record? Ar' . E[E2] r[r'] k
C[with E do C] r c
 = E[E] r;Record? Ar' . C[C] r[r'] c
Notice how the semantics brings out the similarity between E1.E2 and
with E do C and the similarity between these and begin D;C end. Notice
also that a declaration record I(I1,...,In) somewhat resembles a procedure
declaration proc I(I1,...,In);C but without the body C. In SL5 [Hanson]
procedures can be called but 'not entered' (the formal parameters are
bound to actual parameters but the body is not executed) and one use of
this is to enable procedures to double up as records. One can, in effect,
declare proc I(I1,...,In) (no body) and then a 'call' I(E1,...,En) creates an
entity which can be used like a record - the formal parameters being the
field names and the actual parameters the contents of the fields.
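Since Record = Env, a record value is just a small environment mapping field names to locations; a Python sketch of the idea (ours; the names record and with_do are not SMALL's):

```python
# Sketch: a record value is a dict from field names to locations;
# E1.E2 and 'with E do C' evaluate their body in r[r'] - the current
# environment extended by the record's bindings.

def record(store, *fields):
    """Declare a record: one fresh location per field name."""
    base = max(store, default=-1) + 1
    for i, f in enumerate(fields):
        store[base + i] = 'unassigned'
    return {f: base + i for i, f in enumerate(fields)}

def with_do(r, record_value, body):
    """with E do C: run body in the extended environment r[r']."""
    return body({**r, **record_value})

store = {}
family = record(store, 'father', 'mother')

def init(env):
    # inside the 'with', field names denote the record's locations
    store[env['father']] = 'Phil'
    store[env['mother']] = 'Liz'

with_do({'family': family}, family, init)
assert store[family['father']] == 'Phil'
assert store[family['mother']] == 'Liz'
```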
9.4. Data structure valued expressions
Instead of having declarations ref I = E, array I[E1;E2] and record I(I1,...,In)
one could have expressions ref E, array [E1;E2] and record (I1,...,In) so
that:

ref I = E is equivalent to const I = ref E
array I[E1;E2] is equivalent to const I = array [E1;E2]
record I(I1,...,In) is equivalent to const I = record (I1,...,In)
Thus the expressions are related to the declarations as pr I1;C is related to
proc I(I1);C (see 8.7.). The semantics of data structure valued expres-
sions is straightforward:
E[ref E] r k = E[E] r;ref k
E[array [E1;E2]] r k =
  R[E1] r;Num? An1 . R[E2] r;Num? An2 . newarray (n1,n2) k
E[record (I1,...,In)] r k =
  As . (news n s = (l1,...,ln,s'))-k(l1,...,ln/I1,...,In) s',error
Using expressions like these has the advantage that it makes apparent
that when we declare a data structure in SMALL we are really declaring a
constant with variable components.

Data structure variables (which are what occur in PASCAL) would be
declared by:
var I = ref E
var I = array [E1;E2]
var I = record (I1,...,In)
To ensure that such variables work we must dereference them when
accessing components of their contents. For example instead of
E[E1.E2] r k = E[E1] r;Record? Ar' . E[E2] r[r'] k
we must have:
E[E1.E2] r k = R[E1] r;Record? Ar' . E[E2] r[r'] k
The changes to the other clauses are analogous - i.e. replace some occur-
rences of E by R. However notice that confusion can arise between
variables (locations created using var I = E) and references (locations
created using ref E). For example in our semantics of var I = E, E's value
is dereferenced and so both var x = 1 and var x = ref 1 would bind x to a
location containing 1. One needs to get clear - and this is really a language
design problem - whether variables and references are the same or
different and hence, for example, whether R should only dereference
locations created by variable declarations or all locations, including those
obtained by evaluating ref E. In short one must decide which de-
referencing coercions are to be automatic.

9.5. Files
In this section we describe how to model PASCAL-like file structures.
Such files are sequences of values such that:
(i) The number of components can vary dynamically - unlike arrays it
is not fixed when the structure is created.
(ii) Component values are accessed via a 'window' which can be
moved up and down the file and whose position defines the effect
of 'reading' and 'writing' operations.
(iii) Associated with each file is a buffer variable which provides the
channel through which values are read or written.


We shall use file I1 withbuffer I2 to declare a file named I1 with buffer
variable denoted by I2. The effect of the declaration is to make both I1
and I2 denote new locations, l1 and l2 say, where l1 holds an 'empty' file
value and l2 holds unassigned. We call l1 the file variable and l2 the
buffer variable.
If E is an expression whose value is a file variable then the meanings of the
expression eof E and commands reset E, rewrite E, get E and put E are
informally as follows:
eof E:
If the window is beyond the end of the file then eof E evaluates to true; if
the window is within the file then it evaluates to false.
reset E:
The window is reset to the beginning of the file and the first component is
assigned to the buffer variable (if the file is empty the buffer is assigned
unassigned).
rewrite E:
The file is emptied, the window is reset to the beginning and the buffer
variable is assigned unassigned.
get E:
The window is advanced to the next component and the value of this
component is assigned to the buffer variable. If the window is not within
the file an error occurs; if it is at the last element unassigned is assigned
to the buffer variable.
put E:
The contents of the buffer variable is appended to the file. The window
must be at the first free space in the file otherwise an error will occur.

In PASCAL instead of having these constructs one has equivalent 'pre-
declared' functions and procedures. To model this one just modifies the
semantic clause for P by replacing the empty environment ( ) by one
containing the desired initial bindings (see 6.2.3.1.).
The domain of file values must model:

(i) The sequence of component values (we shall assume these are
R-values).
(ii) The position of the window.
(iii) The buffer variable.

We thus define:
File = Rv* x Num x Loc

where the three components are the component values, the window
position and the buffer variable.

The semantics of file declarations is then:


D[file I1 withbuffer I2] r u s =
  (news 2 s = (l1,l2,s'))-u(l1,l2/I1,I2)(s'[(( ),1,l2)/l1]),error
We get two new locations l1 and l2 and then pass the binding l1,l2/I1,I2,
together with a store in which l1 holds the empty file (( ),1,l2) and l2 holds
unassigned, to the rest of the program u (see 9.2.1. for details of news).
The semantics of eof E is:
E[eof E] r k =
  E[E] r;Loc? Al s . (s l = (e*,n,l'))-k(n>length e*) s,error
We evaluate E, test that its value is a location holding a file value (e*,n,l')
and then send the result of the test n>length e* to the rest of the
program k.
Semantic clauses for the other constructs can easily be written in a
similar way to this, but unless one is careful messy details rapidly obscure
what is going on. To simplify the semantics it is convenient to define the
domain:
Filestate = Rv* x Num x [Rv + {unassigned}]
A member (e* ,n,e) of Filestate represents the 'state' of a file whose
components are e*, whose window position is n and whose buffer vari-
able holds e. The effect of reset E, rewrite E, get E and put E can now

be described with the 'state' transformations resetf, rewritef, getf and
putf which are the functions of type Filestate-[Filestate + {error}] de-
fined by:
resetf(e*,n,e) = null e*-(e*,1,unassigned),(e*,1,el 1 e*)
rewritef(e*,n,e) = (( ),1,unassigned)
getf(e*,n,e) = (n>length e*)-error,
              (n = length e*)-(e*,n+1,unassigned),
              (n<length e*)-(e*,n+1,el (n+1) e*)
putf(e*,n,e) = (n = length e*+1)-(e* . e,n+1,unassigned),
              error
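The four transformations translate almost verbatim into Python (our sketch, not part of the formal text; a filestate is a tuple (components, window position, buffer contents), with 1-based window positions as in the clauses):

```python
# Filestate = (components, window position, buffer contents);
# window positions are 1-based, as in the semantic clauses.

def resetf(fs):
    es, n, e = fs
    return (es, 1, 'unassigned') if not es else (es, 1, es[0])

def rewritef(fs):
    return ((), 1, 'unassigned')

def getf(fs):
    es, n, e = fs
    if n > len(es):
        return 'error'
    if n == len(es):
        return (es, n + 1, 'unassigned')   # window moves past the end
    return (es, n + 1, es[n])              # el (n+1) e*

def putf(fs):
    es, n, e = fs
    if n != len(es) + 1:
        return 'error'                     # window not at first free space
    return (es + (e,), n + 1, 'unassigned')

# write two values, then read them back
fs = rewritef(((1, 2, 3), 2, 'x'))
fs = putf((fs[0], fs[1], 'a'))             # buffer := 'a'; put
fs = putf((fs[0], fs[1], 'b'))             # buffer := 'b'; put
fs = resetf(fs)
assert fs == (('a', 'b'), 1, 'a')
fs = getf(fs)
assert fs == (('a', 'b'), 2, 'b')
assert getf(getf(fs)) == 'error'           # reading past the end
```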
These functions constitute a direct semantics of the file operations. To
'interface' them to a continuation semantics we define a function:
do:[Filestate-[Filestate + {error}]]-Cc-Ec
where informally
do f c e s = c s' where s' is a store derived from s by doing the file
state transformation f on the file in the file variable e.
do is analogous to the function mkconfun discussed in 5.7. The formal
description of it is rather messy:
do f c e s = isLoc e-
  ((s e = (e*,n,l))-
   ((s l = e')-
    ((f(e*,n,e') = (e*',n',e''))-
     c(s[(e*',n',l),e''/e,l]),
     error),
    error),
   error),
  error
If e is a file variable whose contents in s is the file (e*,n,l), and if the buffer
variable l has contents e', and if f(e*,n,e') = (e*',n',e'') then the file vari-
able e is updated with the new file value (e*',n',l) and the buffer variable l
is updated with e'' and the resulting store passed to c.

We can now write the remaining semantic clauses very simply:


C[reset E] r c = E[E] r;do resetf c
C[rewrite E] r c = E[E] r;do rewritef c
C[get E] r c = E[E] r;do getf c
C[put E] r c = E[E] r;do putf c
The 'real meaning' is clearly expressed in the definitions of resetf,
rewritef, getf and putf whilst all the messiness is factored out into the
definition of do.
10. Iteration constructs
In SMALL the only iteration construct is while E do C. In this chapter we
look briefly at three others: repeat C until E,
loop C then I1:C1;...;In:Cn end together with event I, and for-statements
for I := F do C. The first two of these are minor variations on constructs
we have already met; for-statements are a bit more tricky since we have to
devise denotations for 'for-lists' F.

10.1. repeat C until E


This construct is found in PASCAL and has semantics:
C[repeat C until E] r c =
  C[C] r;R[E] r;Bool?;cond(c,C[repeat C until E] r c)
We evaluate C then evaluate E; if E's value is false then we evaluate
repeat C until E again, but if it is true we proceed to the rest of the pro-
gram c.

10.2. Event loops


The construct described in this section has been advocated as good for
'structured programming' [Knuth 74]. It is very similar to the escapes from
commands described in 7.1.1. The idea is that one has a command
loop C then I1:C1,...,In:Cn end which traps escapes invoked by evalu-
ating the command event I. The body C of the event loop is repeatedly
evaluated until an event event Ii is encountered whereupon the postlude
Ci is done and control proceeds to the rest of the program. Thus:
C[loop C then I1:C1,...,In:Cn end] r c =
  C[C] r[C[C1] r c,...,C[Cn] r c/I1,...,In];
  C[loop C then I1:C1,...,In:Cn end] r c
(notice the recursion)
C[event I] r c = E[I] r;Cc? Ac' . c'
(this is just like the clause for escapeto I and goto I).


10.3. For-statements
We shall discuss the 'for-statements' of ALGOL 60 - a special case of
this is found in PASCAL. ALGOL 60 style for-statements are commands
of the form for I := F do C where F is a member of the syntactic domain
For of for-lists defined by:
F ::= E | E1 while E2 | E1 step E2 until E3 | F1,F2
For each particular for-list it is straightforward to write a semantic clause
for the corresponding for-statement. For example it is easy to define
C[for I := E1 while E2 do C]
or C[for I := E1 step E2 until E3 do C]
or even C[for I := E1 while E2,E3 step E4 until E5 do C]
However since there are infinitely many different for-lists it is necessary
(as well as aesthetically pleasing) to have a single clause for
C[for I := F do C] and then four separate clauses, one for each of the
different kinds of for-list F.
Thus we must devise denotations F[F] for for-lists F and then:
(i) Write C[for I := F do C] in terms of F[F].
(ii) Write semantic clauses for F[F].

Before formalizing the semantics we must be clear about the meaning we
are going to express - there have been several incompatible interpreta-
tions of the original descriptions of ALGOL 60 for-statements. For us the
effect of for I := F do C is:
(i) If I denotes a location l and if F specifies a sequence of values
e1,...,en then C is executed n times with l successively containing
these values.
(ii) On exiting the for-statement the final contents of l is whatever the
last iteration of C left in it.
Thus, roughly speaking, the effect of for I := F do C is equivalent to
I := e1;C;I := e2;C;...;I := en;C (this does not really make sense as the ei's
are not expressions, but perhaps it conveys the idea).
The way we shall model this is for C[for I := F do C] to pass to F[F] the
procedure value p = Ac' . update l;C[C] r c' which when applied to a value
stores it in l and then does C. F[F] can then apply p to the arguments
specified by F (i.e. to e1,...,en of (i) above). For example in
for I := 1,2,3 do C we would apply p successively to 1,2,3.
A semantics based on this idea is as follows:

Semantic function:
F: For-Env-Proc-Cc-Cc
Then F[F] r p c applies p successively to values generated by F and then
passes control to the rest of the program c.

Semantic clauses:
C[for I := F do C] r c =
  E[I] r;Loc? Al . F[F] r (Ac' . update l;C[C] r c') c
Thus I is evaluated, its value is checked to ensure it is a location l and then
F[F] r p c is done where p = Ac' . update l;C[C] r c'.

We now give the semantic clauses for F[F], recall that:


F ::= E | E1 while E2 | E1 step E2 until E3 | F1,F2
then:
F[E] r p c = R[E] r;p c
Thus the for-list E - a simple expression - just 'calls' p with 'actual para-
meter value' the R-value of E.
F[E1 while E2] r p c =
  R[E1] r Ae . R[E2] r;Bool?;cond(p(F[E1 while E2] r p c)e,c)
Thus E1 is evaluated for its R-value e then E2 is evaluated and if it yields
true then p is applied to e with 'return address' the beginning of
E1 while E2. If E2 yields false then control immediately passes to the
rest of the program c.

F[E1 step E2 until E3] r p c =
  R[E1] r;Num?;step (R[E2] r,R[E3] r) p c
where
step (w1,w2) p c n =
  w1;Num? An1 . w2;Num? An2 .
  ((n-n2) x (sign n1)<1)-p(step (w1,w2) p c (n+n1)) n,c
Here step:[Closure x Closure]-Proc-Cc-Ec where Closure = Ec-Cc
is the domain of closures discussed in 8.5.1. The details of the semantic
clause for E1 step E2 until E3 are a bit intricate and if the reader wishes to
check that we have got things right he should first read 4.6.4.2. of the
modified report on ALGOL 60 [Backus et al.]. The idea is that we evaluate
E1 once and check it produces a number n which we then pass to
step (w1,w2) p c where w1 and w2 are the closures R[E2] r and R[E3] r
respectively. Then 'each time around the loop' step evaluates w1 and w2
to get numbers n1 and n2; if we have not finished ((n-n2) x (sign n1)<1)
then p is applied to n with a 'return address' consisting of
step (w1,w2) p c (n+n1) - i.e. we increment n by n1 and start again.
When we have incremented n past n2 then control passes to the rest of
the program c. If the reader wants to, but cannot, follow this he is advised
to evaluate for I := 1 step 1 until 2 do C. The result should be the same as
for I := 1,2 do C which we evaluate as a simple example shortly. First the
remaining semantic clause for for-lists F1,F2:
F[F1,F2] r p c = F[F1] r p;F[F2] r p;c
Thus F1 repeatedly calls p until it is finished and then F2 repeatedly calls p
and finally control passes to c.
Here is an example to illustrate how things work:
C[for I := 1,2 do C] r c
 = E[I] r;Loc? Al . F[1,2] r (Ac' . update l;C[C] r c') c
Now F[1,2] r p c
 = F[1] r p;F[2] r p;c
 = F[1] r p (F[2] r p c)
 = R[1] r;p (R[2] r;p c)
 = R[1] r (p (R[2] r (p c)))
 = p (R[2] r (p c)) 1
 = p (p c 2) 1
and if p = Ac' . update l;C[C] r c' then
p (p c 2) 1 s
 = update l (C[C] r (p c 2)) 1 s
 = C[C] r (p c 2) s[1/l]
 = C[C] r (update l (C[C] r c) 2) s[1/l]
 = C[C] r (As' . C[C] r c s'[2/l]) s[1/l]
Thus
C[for I := 1,2 do C] r c s
 = E[I] r;Loc? Al . C[C] r (As' . C[C] r c (s'[2/l])) (s[1/l])

So C is first executed in the store s[1/l] to yield a new store s', then C is
executed again but this time in s'[2/l] and then control passes to the rest
of the program c.
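The for-list denotations can be animated in Python (a sketch under simplifying assumptions, not the official semantics: stores are elided, so continuations are thunks; p plays the procedure value Ac' . update l;C[C] r c', here just recording l's successive contents; and the step bounds are constants rather than closures):

```python
# Sketch of for-list denotations: a for-list denotes a function
# taking p (the loop-body procedure value) and a continuation c.

def F_exp(e):
    """F[E]: call p once with E's value."""
    return lambda p, c: p(c, e)

def F_pair(f1, f2):
    """F[F1,F2]: exhaust F1, then F2, then continue with c."""
    return lambda p, c: f1(p, lambda: f2(p, c))

def F_step(e1, e2, e3):
    """F[E1 step E2 until E3], with constant step and bound."""
    def run(p, c, n=None):
        n = e1 if n is None else n
        if (n - e3) * (1 if e2 > 0 else -1) < 1:   # (n-n2) x (sign n1) < 1
            return p(lambda: run(p, c, n + e2), n)  # 'return address' re-steps
        return c()
    return run

trace = []

def p(c, v):          # update l; C[C]; continue - C just records v
    trace.append(v)
    return c()

# for I := 1,2 do C
F_pair(F_exp(1), F_exp(2))(p, lambda: trace.append('done'))
assert trace == [1, 2, 'done']

# for I := 1 step 1 until 2 do C gives the same sequence
trace.clear()
F_step(1, 1, 2)(p, lambda: trace.append('done'))
assert trace == [1, 2, 'done']
```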
11. Own-variables
In this chapter we discuss the semantics of ALGOL 60 own-variables.
Since most people feel that these are badly designed we need to explain
why it is worth considering their semantics at all. The two main reasons
are:
(i) The semantics of own-variables vividly shows how an at first sight
intuitively clean construct is in fact very messy and fraught with
subtle ambiguities. Thus it illustrates one kind of insight one can
rapidly gain by attempting a formal description. It is true that the
various ambiguities were discovered without using any formal
methods, but their uncovering took several years. Had a formal
semantics been attempted the ambiguities would immediately have
revealed themselves and one would have been forced to face and
resolve them. Also semantic concepts enable one to concisely and
lucidly articulate all the possible interpretations.
(ii) The semantics of own-variables requires the use of 'position
dependent denotations' and the techniques to handle these are of
some interest in their own right. Although these techniques are
not very often useful for describing the conceptual meaning of
languages-which is what this book has concentrated on-they
are required in describing operational semantics modelling imple-
mentations.
The intuitive idea of own-variables is simple: an own-variable is a local
variable of a block which 'remembers' its value between activations of the
block. Thus if control leaves a block and then sometime later re-enters it
the contents of the own-variables on entry are what they were on the pre-
ceding exit. A typical use of own-variables is the following ALGOL 60
procedure:
integer procedure count;
begin own integer n;
count := n;
n := n + 1
end


If we assume the own-variable n is initialized to 0 (as in modern ALGOL
60) then successive calls of count will produce successive integers start-
ing with 0 - i.e. the first time count is called it will return 0, the next time 1
and so on. Each time count exits the own-variable n, though
inaccessible, still exists and can be read and updated by entering the block
in which it was declared - i.e. by calling count again. Before proceeding
to the semantics of own-variables we point out that most of their uses can
be better achieved with the much cleaner within construct.
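In a language with closures, the effect of count (and of the within construct below) can be mimicked by hiding a local variable in a closure; a Python sketch (ours, for comparison only):

```python
# Own-variable-like state via a closure: n survives between calls of
# count but is invisible outside it (compare var n = 0 within ... below).

def make_count():
    n = 0                     # plays the role of 'own integer n'
    def count():
        nonlocal n
        result = n            # count := n
        n = n + 1             # n := n + 1
        return result
    return count

count = make_count()
assert [count(), count(), count()] == [0, 1, 2]
```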

11.1 The within construct


If D1 and D2 are declarations then D1 within D2 is a declaration whose
meaning is that D1 is 'local' to D2 - thus D1 within D2 is like a block
whose body is D2 (perhaps a syntax begin D1;D2 end would thus have
been better). The semantics is simply:
D[D1 within D2] r u = D[D1] r;Ar' . D[D2] r[r'] u
Using the within construct the function count can be elegantly pro-
grammed by:
var n = 0 within (integer procedure count;count := n;n := n + 1)

11.2. Different interpretations of ALGOL 60 own-variables


In early descriptions of ALGOL 60 own-variables were ambiguously des-
cribed and various rival interpretations arose. These interpretations differ
on the times at which own-variables declared in procedure bodies are
initialized. We shall consider three possibilities.
(i) The static interpretation: all own-variables are initialized before the
program containing them is run. Thus own-variables can be
thought of as global variables declared in an imaginary outermost
block but only 'visible' within the block they are declared in. This is
now the official interpretation.
(ii) The intermediate interpretation: own-variables in procedure bodies
get initialized each time the procedure is declared.
(iii) The dynamic interpretation: own-variables in procedure bodies get
initialized each time the procedure is called.

To illustrate the difference between these interpretations consider the
ALGOL 60 program below:
begin
  L:begin integer procedure count;
          begin own integer n;
            count := n;
            n := n + 1
          end;
      print count;
      print count;
    end;
  goto L
end
(i) On the static interpretation n is initialized just once to 0 at the
beginning and so the program will loop forever printing 0, 1, 2, ...
(each time count is called n is increased by 1).
(ii) On the intermediate interpretation each time count is declared n
will be reinitialized to 0 so the program will loop forever printing
0, 1, 0, 1, 0, 1, ....
(iii) On the dynamic interpretation each time count is called n will be
reinitialized to 0 so the program will loop forever printing
0, 0, 0, 0, ....

11.3. Semantics of own-variables


To show how to formally describe own-variables we add declarations
own I = E to SMALL. When I is initialized it is bound to a new location
containing E's R-value. Because of the possibility that the same identifier
might be declared to denote different own-variables in different blocks we
must somehow arrange that a single identifier can simultaneously denote
several own-variables. We shall do this by introducing a domain Posn of
'positions' and then redefine environments by
Env = [Ide-[Dv + {unbound}]] x [[Ide x Posn]-[Loc + {unbound}]]
       (ordinary bindings)        (own-variable bindings)



The binding of ordinary variables, constants, procedures, functions etc.
will be held in the first component which has type Ide-[Dv + {unbound}].
In the second component of type [Ide x Posn]-[Loc + {unbound}] we
hold the bindings of own-variables - specifically if I is declared at position
w in Posn to denote location l then we bind l to the pair (I,w) in the second
component. To make this intelligible we must explain Posn. Pretty well
any way of distinguishing different occurrences of declarations could be
used to disambiguate own-variable names. We shall 'name' each occur-
rence of a construct in a program by its position which we represent as a
string of numbers as follows:
(i) The position of the whole program is the empty string ( ).
(ii) If a construct has position w then its immediate constituents have
positions w.1, w.2, w.3, ... from left to right.
Thus we define Posn = Num*. As an example consider the following silly
program:
program begin proc P(x); output (x + x);
P(1);
P(1)
end
Then the position of the whole program is ( ). The position of the block
constituting the body of the program is 1. The position of the declaration
of P is 1.1. The position of the first call P(1) is 1.2. The position of the
second call P(1) is 1.3. The positions of P, x and output (x + x) in the
declaration of P are 1.1.1, 1.1.2 and 1.1.3.
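Position strings can be computed mechanically; a sketch (ours) in which constructs are nested tuples and each constituent is labelled with its position:

```python
# Sketch: a construct is a tuple (operator, *constituents); positions
# are tuples of numbers, with ( ) for the whole program.

def positions(node, w=()):
    """Yield (position, node) for node and all its constituents."""
    yield w, node
    if isinstance(node, tuple):
        for i, child in enumerate(node[1:], start=1):
            yield from positions(child, w + (i,))

# program begin proc P(x); output(x+x); P(1); P(1) end
prog = ('program',
        ('begin', ('proc', 'P', 'x', ('output', 'x+x')),
                  ('call', 'P', '1'),
                  ('call', 'P', '1')))
pos = dict(positions(prog))
assert pos[(1,)][0] == 'begin'               # the body block, position 1
assert pos[(1, 1)][0] == 'proc'              # declaration of P, position 1.1
assert pos[(1, 2)][0] == 'call'              # first P(1), position 1.2
assert pos[(1, 1, 3)] == ('output', 'x+x')   # output(x+x), position 1.1.3
```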
To initialize own-variables we use a function W (so named because of
the w in own) which when applied to constructs locates the own-variable
declarations in them and generates the appropriate bindings. W has type:
W:[Com + Dec]-Posn-Env-Dc-Cc
and the idea is that
W[X] w r u = u(( ),(l1,...,ln/(I1,w1),...,(In,wn)))
where I1,...,In are the identifiers declared to denote own-variables in X (X
is either a command or a declaration) at positions w1,...,wn and l1,...,ln
are new locations initialized appropriately. Thus W passes an environment

containing an empty - i.e. ( ) - normal part and new own-variable
bindings to u. The definition of W is straightforward:
Those constructs which cannot have own-variable declarations hidden
in them generate empty bindings. Thus
W[I := E] w r u = u(( ),( ))
W[output E] w r u = u(( ),( ))
W[E1(E2)] w r u = u(( ),( ))
W[var I = E] w r u = u(( ),( ))
The bindings of commands which contain other commands as constitu-
ents are got by merging the bindings of the constituents. Thus:
W[while E do C] w r u = W[C] w.2 r u
W[if E then C1 else C2] w r u =
  W[C1] w.2 r Ar1 . W[C2] w.3 r Ar2 . u(r1[r2])
W[begin D;C end] w r u =
  W[D] w.1 r Ar1 . W[C] w.2 r Ar2 . u(r1[r2])
W[C1;C2] w r u =
  W[C1] w.1 r Ar1 . W[C2] w.2 r Ar2 . u(r1[r2])
Here r1[r2] means the environment obtained by merging the normal part
of r1 with the normal part of r2 and the own-variable part of r1 with that of
r2. Notice how we 'pass down' the appropriate position to each constitu-
ent; this ensures that we can generate different bindings for the same
identifier declared own in different positions. The clause which actually
generates these bindings is:
W[own I = E] w r u = R[E] r;ref Al . u(( ),(l/(I,w)))

Finally we must define W[proc I(I1);C]. If own-variables in procedure
bodies are initialized at the same time as all other own-variables - i.e.
before the program has been run, then:
W[proc I(I1);C] w r u = W[C] w.3 r u
This corresponds to the static interpretation. With the other interpreta-
tions the own-variables in C are not initialized at the beginning and so
W[proc I(I1);C] w r u = u(( ),( ))

In this case the semantic clause for D[proc I(I1);C] must ensure initializa-
tion is done at the right time. With the intermediate interpretation
initialization is done at declaration time so, if w is the position of C, then
we need something like
D[proc I(I1);C] r u = W[C] w r Ar' . u((p/I),( ))
where p = Ac e . C[C] r[r'][e/I1] c
i.e. at declaration time we generate an initialization r' of the own-variables
in the body C and bind that into the procedure value. This semantic clause
is defective in various ways which we rectify below - first, though, we
give a defective clause for the dynamic interpretation. With this the
initialization is done each time the procedure is called so each time the
procedure value p is applied it must invoke W. Thus:
D[proc I(I1);C] r u = u((p/I),( ))
where p = Ac e . W[C] w r[e/I1] Ar' . C[C] r[r'][e/I1] c

so r' is generated each time the procedure is called. The last two clauses
are defective because the position w occurring in their right hand sides is
not defined. What we must do is change C and D so they 'pass positions
down' to their constituents. (The problem we are having with w is like the
problem we had with valof continuations k discussed in 7.1.3. - we adopt
the analogue of solution (i) to handle w). Thus we must change the types
of C and D so that
C: Com-Posn-Env-Cc-Cc
D: Dec-Posn-Env-Dc-Cc
The semantic clauses must now be rewritten to handle:
(i) The passing down of positions.
(ii) The new environments.
(iii) The binding of own-variables to their names in their scopes.
To cope with (i) we proceed as in the definition of W, for example
C[if E then C1 else C2] w r c =
  R[E] r;Bool?;cond(C[C1] w.2 r c, C[C2] w.3 r c)
C[C1;C2] w r c = C[C1] w.1 r;C[C2] w.2 r;c
To cope with (ii) we must change E[I] so that:
E[I] r k = (el 1 r I = unbound)-err,k(el 1 r I)
and declarations must generate bindings in the normal part of the environ-
ment, thus:
D[const I = E] w r u = R[E] r Ae . u((e/I),( ))
D[var I = E] w r u = R[E] r;ref Al . u((l/I),( ))

For procedure and function declarations we require a different semantic
clause for each interpretation of own-variables. These clauses are the
correct versions of the defective ones discussed above. We just give the
clauses for procedures as the ones for functions are similar.

11.3.1. Static interpretation


D[proc I(I1);C] w r u = u((p/I),( ))
where p = Ac e . C[C] w.3 r[e/I1] c
No initialization is done at either declaration or call time - it is all done
before the program is run.

11.3.2. Intermediate interpretation


D[proc I(I₁);C] w r u = W[C] w.3 r λr' . u((p/I),( ))
where p = λc e . C[C] w.3 r[r'][e/I₁] c
Initialization is done at declaration time.

11.3.3. Dynamic interpretation


D[proc I(I₁);C] w r u = u((p/I),( ))
where p = λc e . W[C] w.3 r[e/I₁] λr' . C[C] w.3 r[r'][e/I₁] c
Initialization is done at call time.
Notice that with the dynamic interpretation procedures like:
proc P(x);begin own I = x; ... end
make sense. With the static or intermediate interpretation the x in
own I = x would be evaluated in the initial (i.e. 'compile-time') or declara-
tion-time environments respectively.
We still have not explained how own-variables are made accessible
within their scope. This is done by 'moving' the bindings from the own-
variable part of the environment to the normal part at block entry, i.e.
when own I = E is executed. Thus
D[own I = E] w r u =
(el 2 r (I,w) = unbound) → err, u(((el 2 r (I,w))/I),( ))
So on block entry E is ignored (it was evaluated earlier by W at 'own-
variable initialization time') and the value bound to (I,w) in the second
component of the environment is bound to I in the first component.
Finally we must fix the semantics P[P] of whole programs to initialize
own-variables before running the program P (with the intermediate or
dynamic interpretations only own-variables not occurring in procedure
bodies are initialized now). The position of the 'main command' is 1,
hence:
P[program C] i = W[C] 1 ( ) (λr . C[C] 1 r (λs . stop)) (i/input)
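As a concrete (and deliberately over-simplified) illustration of the machinery above, the two-part environments can be modeled as a pair of dictionaries: a normal part keyed by identifiers and an own part keyed by (identifier, position) pairs. Everything here, including the names own_init and enter_own, is an invented sketch, not the book's definitions:

```python
# A toy model of the two-part environments used for own-variables:
# the normal part maps identifiers to values, the own part maps
# (identifier, position) pairs to values fixed by W at initialization time.
# All names here (own_init, enter_own) are illustrative, not from the text.

def own_init(env, ident, position, value):
    """W-style initialization: bind value to (ident, position) in the own part."""
    normal, own = env
    return (normal, {**own, (ident, position): value})

def enter_own(env, ident, position):
    """Block entry for 'own I = E': E is ignored, and the (I, w) binding is
    moved into the normal part; an uninitialized own-variable is an error."""
    normal, own = env
    if (ident, position) not in own:
        return 'err'
    return ({**normal, ident: own[(ident, position)]}, own)

empty = ({}, {})
env1 = own_init(empty, 'I', (1, 2), 42)   # own-variable initialization time
env2 = enter_own(env1, 'I', (1, 2))       # block entry time
```

After enter_own, the identifier I is looked up in the normal part just like any other binding, which is exactly the effect of the D[own I = E] clause.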
12. Types
In many programming languages one must declare the type of all identi-
fiers in use. For example in PASCAL variables are declared by: var I:T
where T is a type such as integer, real, array [1 .. 100] of real, etc. The
purposes of such explicit types include making programs easier to read,
aiding 'debugging' by enabling type errors (such as 1 + true) to be
detected before programs are run, and permitting efficient code to be com-
piled (by removing the need to generate time-consuming run-time type
checks).
To describe the type system of a language one must:
(i) Define what the types are.
(ii) Define which programs are well typed.
(iii) Define how the types relate to the semantic domains.

We consider these in turn in the next three sections.

12.1. Various kinds of types


Both the types available and the nature of the types themselves differ
from language to language. For PASCAL we might use a domain Type of
types T defined by:
T ::= integer | real | boolean | char |
B₁ .. B₂ | array T₁ of T₂ | record I₁:T₁; ... ;Iₙ:Tₙ |
set of T | file of T | ...
ALGOL 60 on the other hand has no records, sets or files and its array
types do not have any index-type specified. Since ALGOL 60 array
bounds are computed dynamically (when the array declaration is executed
at run-time) it has array types of the form T array (e.g. integer array)
rather than of the form array T₁ of T₂ (e.g. array 1 .. 100 of integer).
Exactly what types are good to have is a current research problem. For
instance a number of recent experimental languages allow types to con-
tain type variables so that, for example, a general purpose sorting pro-
cedure can take an argument of type x array (where x is a type variable)-
thus avoiding having to write separate procedures for sorting integer
arrays, real arrays, ... etc. as one would in ALGOL 60 or PASCAL.
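The PASCAL-style domain Type sketched by the grammar above can be written down as a small recursive datatype. The constructor names below (Base, Subrange, Array, AlgolArray) are invented for this sketch:

```python
from dataclasses import dataclass

# PASCAL-style types as a small recursive datatype; constructor names
# (Base, Subrange, Array, AlgolArray) are invented for the sketch.

@dataclass(frozen=True)
class Base:
    name: str                 # 'integer', 'real', 'boolean' or 'char'

@dataclass(frozen=True)
class Subrange:               # B1 .. B2
    lo: int
    hi: int

@dataclass(frozen=True)
class Array:                  # array T1 of T2: index type and element type
    index: object
    elem: object

@dataclass(frozen=True)
class AlgolArray:             # ALGOL 60 'T array': no index type (dynamic bounds)
    elem: object

pascal_style = Array(Subrange(1, 100), Base('integer'))   # array 1..100 of integer
algol_style = AlgolArray(Base('integer'))                 # integer array
```

The two array constructors make the PASCAL/ALGOL 60 contrast in the text concrete: only the PASCAL-style value carries an index type.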


12.2. Well-typed programs and type-checking


Given the types of a language one must next define which programs are
well-typed. For example if x, y have types integer and real respectively
then one must say whether x + y is well-typed and if so what its type is.
process of checking that constructs are well-typed is called type-checking
and it is important that what has to be checked is carefully specified.
There is a notorious ambiguity in the description of which PASCAL pro-
grams are well-typed [Welsh, Sneeringer and Hoare]: consider types
T₁, T₂ and T₃ defined by:
type T₁, T₂ = array 1 .. 100 of integer;
type T₃ = array 1 .. 100 of integer
then it is not clear if T₁ and T₂ are the same type and, if so, whether they
are different from T₃. Thus if we declare:
var A₁:T₁;
var A₂:T₂;
var A₃:T₃;
procedure P(var x:T₁); ...
then it is not clear (and currently implementation dependent) whether
P(A₂) or P(A₃) are allowable procedure calls. To settle ambiguities like
this the well-typing conditions should be rigorously defined.
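The T₁/T₂/T₃ question is precisely the choice between name equivalence (types are equal only if introduced by the same declaration) and structural equivalence (types are equal whenever they are built the same way). Both rules can be sketched directly; the representation below is invented for the illustration:

```python
# Name vs structural type equivalence for the T1/T2/T3 example.
# Each 'type' declaration gets a tag; one declaration may introduce
# several names (T1, T2). The representation is invented for the sketch.

decls = {
    'T1': ('decl_a', ('array', (1, 100), 'integer')),
    'T2': ('decl_a', ('array', (1, 100), 'integer')),   # same declaration as T1
    'T3': ('decl_b', ('array', (1, 100), 'integer')),   # a separate declaration
}

def name_equivalent(t1, t2):
    """Equal only if introduced by the same type declaration."""
    return decls[t1][0] == decls[t2][0]

def structurally_equivalent(t1, t2):
    """Equal whenever the type structures match."""
    return decls[t1][1] == decls[t2][1]
```

Under name equivalence P(A₂) is allowable but P(A₃) is not; under structural equivalence both are, which is why an implementation-dependent answer is possible when the report does not choose.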
Type-checking is at first sight a purely 'syntactic' process; however it is
possible-at least for simple languages-to describe it denotationally. The
idea is that one regards programs as having two sets of meanings: a
'dynamic meaning' which describes their execution and a 'static meaning'
which describes their type-checking. Looked at this way types are 'static
values' computed (usually) at compile-time. Thus just as at run-time
2 + 3.5 evaluates to 5.5, so at compile-time integer + real evaluates to
real. Thus type checking is thought of as a 'non-standard' model of the
language in which calculations with types replace calculations with (run-
time) values. This approach appears elegant and unifying and it works
well for simple type systems (as found in FORTRAN, ALGOL 60, ALGOL
68, PASCAL etc.); we outline the details in the next section. However for
more sophisticated systems-for example those which use type variables
- it may be conceptually confusing and technically inefficient. Since this
is still a research area we shall not attempt further discussion.
12.2.1. The denotational description of type-checking (outline)


Describing type-checking denotationally is just like describing run-time
meanings but much easier. For example we might define a domain Tenv
of type environments by:
Tenv = Ide → [Type + {unbound}]
A type environment r specifies the type of the various identifiers in the
program. We might then define static semantic functions:
ET : Exp → Tenv → [Type + {typeerror}]
CT : Com → Tenv → {welltyped, typeerror}
DT : Dec → Tenv → [Tenv + {typeerror}]
where intuitively:
ET[E] r = T if E is well-typed and has type T in r
ET[E] r = typeerror if E is badly typed
CT[C] r = welltyped if C is well-typed
CT[C] r = typeerror if C is badly typed
DT[D] r = r' if D is well-typed and r' gives the types
of the identifiers declared in D
DT[D] r = typeerror if D is badly typed
Typical 'static' semantic clauses would be:
ET[true] r = boolean
ET[I] r = (r I = unbound) → typeerror, r I
ET[E₁[E₂]] r =
(ET[E₁] r = array T₁ of T₂) and (ET[E₂] r = T₁) → T₂, typeerror
CT[if E then C₁ else C₂] r =
(ET[E] r = boolean) →
((CT[C₁] r = CT[C₂] r = welltyped) → welltyped, typeerror),
typeerror
DT[const I:T = E] r = (ET[E] r = T) → (T/I), typeerror
(Here we assume PASCAL-style arrays, and constant declarations
const I:T = E where I is declared to have type T and is initialized (at run-
time) to E's value.) We shall not give a fully worked out example since for
simple type systems the details are very straightforward and for complex
systems unclear.

12.3. The semantics of types


To fully explain the types of a language one must relate them to the (run-
time) semantic domains. For example if I is declared to be a variable of
type integer then this means only members of the semantic domain Num
(say) should be stored in it. Having defined the meaning of types one can
then prove that well-typed programs conform to this meaning. For ex-
ample one could attempt to prove that if ET[E] r = integer then E (the
run-time semantic function) gives a member of Num as E's value.
Whether or not this is true depends on whether the language has a secure
type system. This question (whether programs which are syntactically
well-typed, i.e. successfully type-checked, are always semantically well-
typed, i.e. do not cause run-time type errors) is thus an important ques-
tion to ask when designing a language. Both ALGOL 60 and PASCAL
have needless type insecurities, illustrated by the following PASCAL pro-
gram [Welsh, Sneeringer and Hoare]:
procedure P(procedure Q);begin Q(1,2) end;
procedure R(x:integer);begin write (x) end;
begin P(R) end
Since one cannot say that Q must take two parameters there is no way to
exclude P being applied to R which only takes one. Had the designers
thought about proving that well-typed programs cannot cause run-time
type errors then perhaps they would have avoided this-say by requiring
one to write:
procedure P(procedure Q (integer,integer)); ...
Unfortunately there are as yet unanswered technical difficulties in giving
the semantics of all except the most trivial types. Intuitively one would like
to define a semantic function, T say, such that:
T : Type → {denotations of types}
where the denotation of a type T is something like the set of values having
type T. Thus we might define Tv by:
Tv = {x | x is a subset of Ev}
and then T : Type → Tv might be defined by:
T[integer] = {e | isNum e = true}
T[boolean] = {e | isBool e = true}

Alas there appear to be technical problems with this, and it is not clear if
such an approach is mathematically sound. We cannot go into the details
here but interested readers could look at [Donahue] and [Reynolds].
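Side-stepping the mathematical difficulties just mentioned, the intended reading of Tv can at least be mimicked operationally: a type denotes the characteristic function of a set of values. In this sketch isNum and isBool are simulated by Python type tests, which is an assumption, not a model of the book's domains:

```python
# Type denotations mimicked as predicates on values: T maps a type to
# the characteristic function of "the set of values having that type".
# isNum/isBool are simulated by Python type tests (an assumption).

def T(ty):
    if ty == 'integer':
        return lambda e: isinstance(e, int) and not isinstance(e, bool)
    if ty == 'boolean':
        return lambda e: isinstance(e, bool)
    raise ValueError('unknown type')

# "e is in the denotation of ty" becomes T(ty)(e):
checks = [T('integer')(3), T('boolean')(True), T('integer')(True)]
```

A semantic-security proof would then show that whenever ET[E] r = integer, the run-time value of E satisfies T('integer'); the hard part, as the text notes, is making the denotations of richer types mathematically respectable.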
Appendix: Remarks for instructors and
sample exercises
This book is based on a course taught at Edinburgh University to under-
graduates and first year graduate students. The material is covered in
twelve hour-long lectures over a period of six weeks. Exercises are handed
out weekly. At first the exercises are mainly concerned with using formal
notation, especially λ-notation. Students are often confused over the
difference between mathematical functions and programs, and they find
higher-order functions especially hard to grasp. A typical first exercise
sheet would contain questions like Q1 and Q2 below.
The next big problem is getting across how continuations work. From
the students' point of view a good kind of exercise is 'evaluating' particular
programs from semantic clauses, e.g. Q4; unfortunately such exercises
generate solutions which are very tedious to mark! Exercises like Q3 and
Q6-Q10 are also good for inducing an understanding of continuations.
After the use of the formalism has been mastered questions can be set on
applying it to new constructs. Easy questions of this type are Q6-Q10;
harder examples are Q11 and Q12.
Initially semantic clauses produced by students are often syntactically
and semantically ill-formed; common mistakes are:
(i) To fail to formalize crucial details and revert to English when the
going gets tough. For example Q9 might elicit the following:
C[exit] r c = c' where c' is the continuation corresponding to
the smallest enclosing cycle C repeat.
This is an intuitive description of a semantic clause; the whole
problem is to construct an actual clause meeting the description.
(ii) To write semantic clauses which are mathematical rubbish but, in
some loose intuitive sense, match a correct informal description, e.g.:
O[+](e₁,e₂) k = e₁;Num?;e₂;Num?;k(e₁ + e₂)
This looks plausible until one considers the types. Q3 is designed to
correct this kind of mistake.


(iii) To use various undefined entities, e.g.
D[newvar I] r u = u(new s/I)
Here it is incorrectly assumed that s, like a global variable in pro-
gramming, is floating around in the context.

Finally it is useful to test that all the ideas have fitted together by requiring
students to construct the complete semantics of a whole language. Suit-
able examples can easily be invented by modifying the various constructs
in SMALL. Unfortunately existing 'real' languages, although usually
straightforward, are very tedious to describe formally and are not suitable
for (undergraduate) exercises (though they might be good as term pro-
jects). Simplified subsets of real languages are more useful for learning
basic principles.
Sample exercises
(Q1) Write down the types of the following functions:
(i) λx . x + 1
(ii) λ(x, y) . x + y
(iii) λx y . x + y
(iv) λf x . (f x) + 1

(Q2) The function lit (list iterator) is defined informally by:
lit f (x₁, ... ,xₙ) xₙ₊₁ = f x₁ (f x₂ ( ... (f xₙ xₙ₊₁) ... ))
For example if plusc = λx y . x + y then:
lit plusc (x₁, ... ,xₙ) xₙ₊₁ = x₁ + x₂ + ... + xₙ + xₙ₊₁
(e.g. lit plusc (1,2,3) 4 = 10)
(i) Write down the generic type of lit.
(ii) Use λ-notation and recursion (and no ellipsis, i.e. " ... ") to
define lit.
(iii) Devise a function f such that lit f (x₁, ... ,xₙ) x = true
if x = xᵢ for some i, and false otherwise.
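The informal definition of lit can be rendered recursively; the Python version below is one possible answer to Q2(ii), so instructors may prefer to withhold it:

```python
def lit(f, xs, z):
    """lit f (x1,...,xn) z = f x1 (f x2 (... (f xn z) ...)); lit f () z = z."""
    if not xs:
        return z
    return f(xs[0], lit(f, xs[1:], z))

plusc = lambda x, y: x + y
total = lit(plusc, (1, 2, 3), 4)   # 1 + (2 + (3 + 4)) = 10
```

Its generic type, asked for in Q2(i), is the familiar right-fold type: lit takes an (A × B → B) function, a sequence of A's, and a B.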
(Q3) If k ∈ Ec what is the type of:
λ(e₁,e₂) . isNum e₁ and isNum e₂ → k(e₁ + e₂), err
Show this function is the same as:
λ(e₁,e₂) . (Num? λe₁' . (Num? λe₂' . k(e₁' + e₂')) e₂) e₁
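The two functions in Q3 can also be compared by running them. In the sketch below NumCheck plays the role of Num? (pass numbers on to the continuation, otherwise return the error answer), and the representations of k, err and isNum are assumptions for the demo, not the book's domains:

```python
# Executable versions of Q3's two functions; NumCheck plays Num? (pass
# numbers on to the continuation k, otherwise return the error answer err).
# The representations of k, err and isNum are assumptions for the demo.

err = 'err'
isNum = lambda e: isinstance(e, int) and not isinstance(e, bool)

def NumCheck(k):
    return lambda e: k(e) if isNum(e) else err

def f1(k):
    return lambda p: k(p[0] + p[1]) if isNum(p[0]) and isNum(p[1]) else err

def f2(k):
    return lambda p: NumCheck(lambda e1: NumCheck(lambda e2: k(e1 + e2))(p[1]))(p[0])

k = lambda n: ('answer', n)
samples = [(1, 2), (1, True), ('a', 2)]
agree = all(f1(k)(s) == f2(k)(s) for s in samples)
```

Agreement on these samples is of course no substitute for the typing argument the exercise asks for, but it makes the continuation plumbing in the second form concrete.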
(Q4) Use the semantic clauses of SMALL to evaluate:
(i) P[program output 1;output 2] ( )
(ii) P[program output (read + read)] 1.2
(iii) P[program begin var x = read;output x end] ( )

(Q5) Compare and contrast the following SMALL commands:
(i) begin const I = E;C end
(ii) (pr I;C)(E)
(iii) begin proc I'(I);C;I'(E) end

(Q6) Devise a declaration dec I = E so that begin dec I = E;C end is
completely equivalent to (pr I;C)(E). Demonstrate the equivalence
from your semantic clause for dec I = E.
(Q7) Describe the semantics of a command skip whose execution has
no effect. Show that:
(i) C[C;skip] = C[skip;C] = C[C]
(ii) C[while E do C] =
C[if E then (C;while E do C) else skip]
(Q8) Describe the semantics of a command stop whose execution
causes the program to stop. Show that C₁;stop;C₂ is equivalent to
just C₁;stop.
(Q9) Describe the semantics of commands cycle C repeat, exit and
continue where:
(i) Executing cycle C repeat consists of repeatedly executing
C until an exit is encountered whereupon the loop termi-
nates and control passes to the rest of the program follow-
ing it.
(ii) Executing continue causes a jump to the beginning of the
C of the smallest enclosing cycle C repeat.
(iii) Executing exit or continue when not in the C of a
cycle C repeat causes an error.
(Q10) Describe the semantics of commands return and result = E where:
(i) Executing return causes a return from the smallest currently
entered procedure.
(ii) Executing result = E causes a return with E's value from the
smallest currently entered function.
(iii) Executing return or result = E when not in a procedure or
function respectively causes an error.
(Q11) Describe how to add ALGOL 60 switches to SMALL. You should
proceed as follows:
(i) Read 3.5 and 5.3 of the "Modified Report on the Algo-
rithmic Language ALGOL 60" [Backus et al.].
(ii) Devise an abstract syntax for the constructs involved (e.g.
switch declarations and designational expressions).
(iii) Decide what new domains (if any) and changes to existing
ones are required.
(iv) Write semantic clauses for the various constructs.
(Q12) Describe how to add coroutines to SMALL. Assume a coroutine
named I is declared by coroutine I(I₁);C. To run I with actual para-
meter E one evaluates the expression run I(E). To temporarily leave
a coroutine one executes the command leavewith E; this passes
E's value to the context which ran (or resumed) the coroutine. A
temporarily-left coroutine can be resumed by evaluating the expres-
sion resume I; this 'starts up' I at the beginning of the command
following the last leavewith E in I. To permanently leave a co-
routine one evaluates the expression quitwith E; subsequent at-
tempts to resume it cause errors. For example:
begin coroutine count (n);
while n < 3 do
(leavewith n;
n := n + 1);
quitwith 3;
output run count (0); - outputs 0 and leaves
output resume count; - outputs 1 and leaves
output resume count; - outputs 2 and leaves
output resume count; - outputs 3 and quits
output resume count; - causes an error
end
References
Extensive bibliographies can be found in [Scott] and [Stoy]; here we just
list the references mentioned in the text.

Backus, J. W., et al.: Modified Report on the Algorithmic Language
ALGOL 60; The Computer Journal, Vol. 19, No. 4 (1976)
Bjorner, D., and Jones, C.: The Vienna Development Method: The Meta-
Language; Vol. 61, Lecture Notes in Computer Science, Springer-
Verlag (1978)
Burstall, R. M., et al.: Programming in POP-2; Edinburgh University Press
(1971)
Donahue, J.: On the Semantics of "Data Type"; TR 77-311, Computer
Science Department, Cornell University (1977)
Evans, A.: PAL Reference Manual and Primer; Department of Electrical
Engineering, MIT (1970)
Hanson, D. R.: The Syntax and Semantics of SL5; SL5 Project Document
S5LD2b, The University of Arizona, Tucson, Arizona (1976)
Jensen, K., and Wirth, N.: PASCAL User Manual and Report; Springer-
Verlag (1974)
Knuth, D. E.: The Remaining Trouble Spots in ALGOL 60; Communica-
tions of the ACM, Vol. 10, No. 10 (1967)
Knuth, D. E.: Structured Programming with GOTO Statements; STAN-
CS-74-416, Computer Science Department, Stanford University (1974)
Landin, P. J.: A Correspondence between ALGOL 60 and Church's
Lambda-Notation; Communications of the ACM, Vol. 8, Nos. 2, 3
(1965)
Ligler, G.T.: A Mathematical Approach to Language Design; Proceedings
of the Second ACM Symposium on Principles of Programming
Languages, Palo Alto (1975)
London, R. L., et al.: Proof Rules for the Programming Language Euclid;
Acta Informatica, Vol. 10, No.1 (1978)
McCarthy, J.: A Formal Description of a Subset of ALGOL, pp. 1-7 of
Formal Language Description Languages for Computer Programming
(ed. Steel, T. B.), North Holland (1966)

Milne, G. J., and Milner, A. J. R. G.: Concurrent Processes and their
Syntax; Journal of the ACM (1979)
Milne, R. E., and Strachey, C.: A Theory of Programming Language
Semantics; Chapman and Hall (UK), John Wiley (USA) (1976)
Milner, A. J. R. G.: Processes: A Mathematical Model of Computing
Agents; pp. 157-174 of the Proceedings of the Logic Colloquium '73 (ed.
Rose, H. E., and Shepherdson, J. C.), North Holland (1975)
Mosses, P. D.: Compiler Generation Using Denotational Semantics; pp.
436-441 of Mathematical Foundations of Computer Science, Gdansk,
Lecture Notes in Computer Science, Vol. 45, Springer-Verlag (1976)
Reynolds, J. C.: Towards a Theory of Type Structure; Colloquium on
Programming, Paris (1974)
Scott, D. S.: Data Types as Lattices; SIAM Journal of Computing, Vol. 5,
No.3 (1976)
Stoy, J. E.: Denotational Semantics: The Scott-Strachey Approach to
Programming Language Theory. MIT Press (1977)
Strachey, C.: Towards a Formal Semantics; pp. 198-220 of Formal
Language Description Languages for Computer Programming (ed.
Steel, T. B.), North Holland (1966)
Strachey, C.: The Varieties of Programming Languages; Proceedings of
the International Computing Symposium, Cini Foundation, Venice
(1972) and PRG-10 Programming Research Group, University of Oxford
(1974)
Strachey, C., and Wadsworth, C. P.: Continuations-a Mathematical
Semantics for Handling Full Jumps; PRG-11, Programming Research
Group, University of Oxford (1974)
Tennent, R. D.: Language Design Methods Based on Semantic Principles;
Acta Informatica 8 (1977)
Welsh, J., Sneeringer, W. J., and Hoare, C. A. R.: Ambiguities and
Insecurities in PASCAL; Software Practice and Experience 7 (1977)
Zahn, C. T.: A Control Statement for Natural Top-Down Structured
Programming; Symposium on Programming Languages, Paris (1975)
Subject and Author Index
abstract entities as denotations 10 with mathematical functions 35
abstract syntax 24 association, left, of functions 39
abstractions 115 association, right, of -+ 31
accessing of arrays 121 *
associativity of 48
actual parameters 35,76 automata theory 10
reo jumpout 90 axiomatic semantics 9-10
algebraic semantics 9- 10
aliasing 62 backtracking 97
ALGOL 60 Backus 132
ambiguity of English semantics for 8 Bas 81
arrays in 142 BCPL 90
assignment in 73 begin 80,94,95
blocks causing new locations to come into between 121
use 63 binding, see also static binding, dynamic binding
call by closure in 111 of actual to formal parameter 76
call by name in 115 dynamic vs. static 104
call by value in 105-107 of identifiers in environment 64,67,69,75,52
commands, effects of 65-66 informal, in TINY 12
denotable values in 65 for own variables 136-138
early semantics for fragment of by McCarthy in TINY 12
28 Bjorner 57
effects of commands in 65-66 Blocks 93
evaluation of actual parameters in 110 reo own variables 134, 141
for statements in 130-132 reo records 122
functions in 39, 77 semantics of in SMALL 84
jumps in 93 BNF 7, 12
own variables in 134-136 body of procedure 75
passing procedures in 115 Bool 30
procedures in, passing 115 boolean constants, in TINY 19
procedures in, recursive 101 bound variables in Aexpressions 36-38
proof rules for failure of 8 buffer variable for files 123-124
recursive functions in 102 Bv 81
recursive procedures in 101
scope of declarations in 66 call by
scope of labels in 92-93 closure 111-112
static binding in 76, 103 denotation 113
type checking for 143 name 111
type insecurity in 145 reference 108-109
types in 142 text 112-113
ALGOL 68 value 105-108
procedures in 117 value and result 109-110
type checking for 143 call time 114
anonymous procedures 116 call time environment 75, 103
Ans 61,68 calling of procedures 75
answers, see final answers cancellation, in Aexpressions 45
application, of functions, notation for 39 capture, of bound variables, in A expressions 38
application, in Anotation 37-38 cases, notation for 42
array 119 Cc 68
Array 119 CheckBool 51
arrays 119-121 CheckNum 50
assignment 73-74 closures 106,111-112,132
reo semantics of recursive functions 102 Closure 111
association of programming language functions coercions 33
Com 80 Dc 68
command continuations 54, 68, 89 Dec 80
commands (in TINY) 12 declaration binding mechanisms 116
semantics of, in TINY 18-21 declaration continuations 68-69
reo store and environment 67 declaration time environment 75,103,141
compile time declarations
type checking 104 in ALGOL 66
fixing array bounds 121 of arrays 121
evaluating own variables 141 compound 67
types 142-143 reo parameter passing 117
compound declarations 67 in PASCAL 66
composition of functions 47 records' relation to procedure declaration 123
concurrency 29, 78-79 reo store and environment 67
cond 41 definition of functions (see also recursive
conditionals, notation for 41-42 functions) 35
consistency denotable values 64-65
of English language semantics 8 denotational description of TINY 14
of naive semantic models 28-29 denotation semantics, definition of 9
of new function 63-64 denotations 9
recursive definitions deref 72
interpretation of 28 dereferencing 65,70,73-74
const 67,84 dimensions for classifying languages 63-64
constituents, of programs 23 direct semantics 53
constructs, of programs 23 continuation semantics, relation to 55-56
cont 71 for goto's 57
cont 118 disjoint union 15
Cont 54 domain building operations 15-16,30-34
contents, of locations 66 regarding recursive domains 28
continuations 52-57 domain constructors, see domain building
command continuations 54 operations
declaration continuations 68-69 domain checking functions 72-73
for escapes 89 domains, use of 14-15,24-28
for errors 56 domains, of functions (see also source of
expression continuations 54 functions) 31
for jumps 56, 92 domain equations
for jumpouts 89 solution of recursive 28
for labels 93 for TI NY semantics 15
in semantics of TINY 57-59 types of 34
transforming of 70-71 do 127
for valof and resultis 91 donothing 50
control 88, 92, 97 Donahue 146
in for loops 132 Dv 64-65
of own variables 134 dynamic binding 52,75, 103
coordinates, for product domains 31 semantics of 103-104
coroutines 97 dynamic chain 103
curry 40 dynamic interpretation of own variables 135
Curry 40 dynamic meaning of programs 143
currying, of functions 40-41
Ec 68
data structures 118-128 Econt 54
as values of expressions 123-124 el 31
data types, recursively defined 26 embedding of procedures in states 27
07 72 empty environment 82
empty sequence 12,31 formal parameters 75


empty string 16 reo free identifiers in dynamic binding 105
English descriptions, inadequacy of 7 formal semantics, definition of 6
entry time 114 justification of 6
Env 64 formalisation of TINY semantics 14
environments 64-65 fortran 109
for own variables 136 FORTRAN, call by reference in 108
80f 125-126 type checking for 143
equations, recursive definitions as 25 fun 67,78,98-99
err 72 Fun 77-78
error 60,68 Funo 98
errors Fun" 99
in TINY 13-21 function composition 47
complicating semantics of TINY 21 function spaces 30
reo continuations 53, 56 functions
escapes 88-90 in ALGOL 102
from commands 88 domain of all 15
reo event loops 130 mathematical 34-35, 39
from expressions 89 in PASCAL 102
escapeto 88-89 in programming languages 34-35, 39
EUCLID 8,9 programming language, semantics of 77-78
Ev 65 functional equations, solving 25
event 129
event loops 129 generic functions 43-44
event mechanisms 88 generic types 43
execution time 114 getf 127
exit time 114 get 125, 128
Exp 80 goto 56,92,94,96
expressible values 65, 75
expressions Hanson 106, 123
continuations for 54, 68 hd 31
in TINY 12 higher order functions 38
semantics in TINY 16-18 reo continuations 54
expression continuations 54, 68 Hoare 8, 143, 145
holes, in scopes 67
fields, in records 121
File 126 Ide 30
file 125-126 identifiers, of TINY 12
file values 125 immediate constituents of programs 23
file variables 125 implementations 7, 10,76
files 81-82, 124-128 reo higher order functions 39
Filestate 126 modelling of 63,64,79, 134
final answers 54,60,68,78 producing from denotational semantics 87
finite domains 30 reo labels as first class values 97
finite output 61 reo dynamic and static binding 103,
finite sequences 31 types 142-146
fix 44 implicit structure of domains 26,28
fixed points 44 inD
flow of control 88, 92 initialisation of own variables 135, 138-139
fn 116 infinite outputs 61
for 130 infinite sum, domain of finite sequences as 34
for lists 130 input 81
for statements 130 input, irrTINY 12,15
instances of generic functions 43 memory, in TINY 12,15


interactive programming 104 meta-language, semantics as 11
INTERLlSP, mix of static and dynamic binding meta variables, in notation 24
in 104 Milne, G. 29,79
intermediate interpretation of own variables 135 Milne, R. 10
intermediate results, reo continuations 54, 56, 69 Milner 29,79
interpretation of recursive definitions 28 mix 103
invocation of procedures 114-115 mkcon fun 70
in TINY 26-27 Mosses 8, 11, 87
isO 32 mutual recursion, definition by 44
isomorphisms, in domain equations 28 mutually recursive procedures, semantics 101
it 51,83-84
n-tuples, domain of 15
J operator 89 new 63
Jensen 6 new locations 63
Jones 57 newarray 120
jumps 35, 53, 56, 57, 92-97 news 120
jumpout 89, 90 normal continuations 53
null 31
Knuth 8,88, 129 Num 30

Aexpressions 36 one-element domain 15


A notation 35-37 operational semantics 9-10, 134
labels 93 outO 32
label environments 96 output, in TINY 12, 15
Landin 89 as distinct from States 60-62
least fixed point operator 44 reo dereferencing 75
left association, of function application 39 output 12,51,59,61
L values 73-75 own variables 134-141
Ligler 8 various interpretations 135
LISP initialisation of 135, 138-139
abstraction in 115-116 scopes 139
call by text in 112 own 134, 138, 141
dynamic binding in 52,75, 102, 104 over-writing, by assignment, in TINY 14
evaluation of actual parameters in 110
FEXPR's in 110, 112 PAL
functions in 35 assignment of labels in 96
higher order functions in 39 escape construct of 90
INTER LISP 104 PASCAL
quotation constructs of 113-114 ambiguity of English semantics 8
little environments 67,70 ambiguity of well-typed ness in 143
little stores 64 arrays in 121, 144
Loc 63 assignment in 73
locations 63 call by value in 105-107
London 9 call by reference in 108
loop 129 data structures in 118
lower bounds, of arrays 119 denotable values in 65
effect of commands in 65-66
machine independence 7 file structures in 124-125
machine readability 8 for statements in 130
mathematical functions 34-35, 39 functions in 35, 39, 77
mathematics, underlying 9, 25, 29 L values for 75
McCarthy 28 labels in 96
passing procedures in 115 quotation constructs 113-114


procedures in 117
proof rules for, failure of 8 R-values 65,73-75
R-values for 75 read 12, 58, 83
recursive functions in 102 reading from files 124
recursive procedures in 101 reasoning about programs 6, 29
repeat construct of 129 ree 101-102
scope of declaration in 66 record 121
scope of labels in 92 Record 122
sharing induced by procedure declarations records 121-123
in 62 recursive definitions, consistent interpretation
static binding in 76, 103 of 28
type checking for 143 recursive functions, mathematical 25,35,44
type insecurity in 145 recursive functions, in programming languages
types in 142-143 100-102, 104
with command 122 recursive procedures, 100-102, 104
passing down of positions, reo own variables 139 recursively defined data types 26
parameter calling vs. parameter passing 110-111 recursiveness of semantic clause for while in
parameter passing mechanisms (see also call TINY 21,25
by) 105-115, 117 ref 118
embedded in procedure declarations 107 ref 72
parameterless function calls 98-99 references 118
parameterless functions 111 vs. variables 124
parameterless procedures in TINY, example of renaming of bound variables in Aexpressions 38
recursively defined data type 26 repeat 129
parameters 26, 35, 98 reset 125, 128
POP-235 resetf 127
call by value in 106-107 result 50
dynamic binding in 75, 102 resultis 90, 91
jumpouts in 89 return address, of procedures 76
position dependent denotations 134 for jumpouts 90
Posn 136-137 rewrite 125, 128
postludes 88, 129 rewritef 127
pr 116 Reynolds 146
pragmatics 6 right association of -+ 31
preserving domain structure 28, 31, 98 right extension, in A notation 36
proe 26,67,78,98-99 rule, functions in programming languages as 35
Pro 80 running denotations 113
Proc 76, 78 Rv 65,75
Proc o 98
Proc n 99 scope, of declarations 66-67, 75
procedure calling semantics 76-77 of labels 92-94
procedure declarations 76, 116 of own variables 139
procedure names 76 Scott 28
procedure values 76, 103 Scott's theory, role of 28-9
procedures 75-77 security of type systems 145
processes 79 semantic clauses 18
product, domain constructor 15,31 semantics, different approaches to 9-10
program proof 8 Semantics Implementation System 87
progra~ 80,82,141 sequencers 88
propogation of errors 47 sequences, over domains 16,31-32
put 125, 128 sequencing operator on functions 47
putf 127 changed for direct semantics 57
sharing 62-63
  re. call by value 109
side-effects 13, 35, 62, 106
simplification of equations involving λ-expressions 45
SIS 87
SL5, binding procedures but not executing bodies 106-107, 123
  call by value in 106
  closures in 106-107
  records in 123
Sneeringer 8, 143, 145
solutions to recursive domain equations 26, 28
source of functions 31, 35
  in λ notation 36-37
standard semantics 52, 78
states, as recursive domains 27-28
  in TINY 12, 15, 17
static binding 75, 76, 103-104
static chain 103
static interpretation of own variables 135
static meaning of programs 143
step 132
stop 60
storable values 63, 65
storage allocation 64, 67
stores 63
Store 63
Stoy 9, 25, 38
Strachey 10, 28, 29, 52, 63
strings 16, 32
structure, of domains 28, 31
structuring, re. loops and exits 88, 129
subscript 120
subscripts of arrays 119
subsidiary functions, in recursive definitions 46
substitution, in λ notation 36
sum, domain constructor 15, 32-34
Sv 63, 65
syntactic categories, of a language 24
syntactic clauses 24
syntax, definition of 6
system variables 102
target, of functions 31, 35
  in λ notation 36-37
Tennent 117
Tenv 144
text, in continuation passing 55
  in LISP call-by-text 112
thunks 111
tl 31
top-down programming 104
transforming continuations 70-71
translation, semantics viewed as 11
traps 88
trap 88-89
Tv 146
type checking 104, 143
  as a non-standard semantics 143
type environments 144
type variables 142
Type 144
types 30, 35, 142-146
uncurry 41
uncurrying 41
undefined element 26
union, of domains 15, 32
  vs. sum 32
unused locations 63
update 71
updating notation 43
upper bounds of arrays 119
valof 90-91
values, of variables in programs 62
var 67, 85
variables in programs 62
  declarations of 116
  vs. references 124
variable types 142
Wadsworth 52
well-typing 143
  syntactic and semantic 145
Welsh 8, 143, 145
where 46
whererec 46
while 12, 59, 84, 94
windows, on files 118, 124
Wirth 6
with 122
within 135
writing, on files 124
Zahn 88
Symbols
b → v₁, v₂  41
*  31
*  72
/  43
λ  36
×  28, 31
+  28, 32
::=  12, 23
=  as equality test 19
   in domain equations 34
   in cases notation 42
Selections from the Springer-Verlag Series:
Texts and Monographs in Computer Science

The Design of Well-Structured and Correct Programs


By S. Alagic and M.A. Arbib
1978/x, 292 pp./68 illus./cloth
ISBN 0-387-90299-6

Chess Skill in Man and Machine


Edited by P.W. Frey
1978/xi, 225 pp./55 illus./cloth
ISBN 0-387-90359-3

Design of Digital Computers


An Introduction
Second Revised Edition
By H.W. Gschwind and E.J. McCluskey
1975/x, 548 pp./375 illus./cloth
ISBN 0-387-06915-1

The Origins of Digital Computers


Selected Papers
Second Edition
Edited by B. Randell
1975/xvi, 464 pp./120 illus./cloth
ISBN 0-387-07114-8

Programming Methodology
A Collection of Articles by Members of IFIP WG 2.3
Edited by D. Gries
1978/xiv, 437 pp./64 illus./cloth
ISBN 0-387-90329-1

Automata-Theoretic Aspects of Formal Power Series


By A. Salomaa and M. Soittola
1978/x, 171 pp./cloth
ISBN 0-387-90282-1

Adaptive Information Processing


An Introductory Survey
By J.R. Sampson
1976/x, 214 pp./83 illus./cloth
ISBN 0-387-07739-1

Pattern Classifiers and Trainable Machines


(in preparation)
By J. Sklansky and G.N. Wassel
Computer Science from Springer-Verlag

Software Engineering
An Advanced Course
Edited by F.L. Bauer
1977/xii, 545 pp./paper
ISBN 0-387-08364-2

Compiler Construction
An Advanced Course
Second Edition
Edited by F.L. Bauer and J. Eickel
1976/xiv, 638 pp./123 illus./2 tables/paper
ISBN 0-387-08046-5

(Microcomputer) Problem Solving Using PASCAL


By K.L. Bowles
1977/x, 563 pp./paper
ISBN 0-387-90286-4

A Structured Programming Approach to Data


By D. Coleman
1979/approx. 240 pp./paper
ISBN 0-387-91138-3

A Concurrent PASCAL Compiler for Minicomputers


By A.C. Hartmann
1977/vi, 119 pp./paper
(Lecture Notes in Computer Science, Vol. 50)
ISBN 0-387-08240-9

PASCAL User Manual and Report


Second Edition
Edited by K. Jensen and N. Wirth
1975/viii, 167 pp./paper
(Springer Study Edition)
ISBN 0-387-90144-2

A Practical Introduction to PASCAL


By I.R. Wilson and A.M. Addyman
1979/x, 148 pp./paper
ISBN 0-387-91136-7
