
Sample Logic Text

Open Logic Project

Sample Logic Text by OLP is licensed under a Creative Commons Attribution 4.0 International License. It is based on The Open Logic Text by the Open Logic Project, used under a Creative Commons Attribution 4.0 International License.
Contents

I Sets, Relations, Functions 1

1 Sets 3
1.1 Extensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Subsets and Power Sets . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Some Important Sets . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Unions and Intersections . . . . . . . . . . . . . . . . . . . . . . 6
1.5 Pairs, Tuples, Cartesian Products . . . . . . . . . . . . . . . . . . 9
1.6 Russell’s Paradox . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2 Relations 13
2.1 Relations as Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Special Properties of Relations . . . . . . . . . . . . . . . . . . . 15
2.3 Equivalence Relations . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.5 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.6 Operations on Relations . . . . . . . . . . . . . . . . . . . . . . . 20

3 Functions 21
3.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 Kinds of Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3 Functions as Relations . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4 Inverses of Functions . . . . . . . . . . . . . . . . . . . . . . . . 26
3.5 Composition of Functions . . . . . . . . . . . . . . . . . . . . . . 28
3.6 Partial Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4 The Size of Sets 31


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2 Enumerations and Countable Sets . . . . . . . . . . . . . . . . . 31
4.3 Cantor’s Zig-Zag Method . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Pairing Functions and Codes . . . . . . . . . . . . . . . . . . . . 36
4.5 An Alternative Pairing Function . . . . . . . . . . . . . . . . . . 37
4.6 Uncountable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.7 Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41


4.8 Equinumerosity . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.9 Sets of Different Sizes, and Cantor’s Theorem . . . . . . . . . . 44
4.10 The Notion of Size, and Schröder-Bernstein . . . . . . . . . . . 45

II First-order Logic 47

5 Introduction to First-Order Logic 49


5.1 First-Order Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.2 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.3 Formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.4 Satisfaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.5 Sentences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.6 Semantic Notions . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.7 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.8 Models and Theories . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.9 Soundness and Completeness . . . . . . . . . . . . . . . . . . . 57

6 Syntax of First-Order Logic 59


6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.2 First-Order Languages . . . . . . . . . . . . . . . . . . . . . . . . 59
6.3 Terms and Formulae . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.4 Unique Readability . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.5 Main operator of a Formula . . . . . . . . . . . . . . . . . . . . . 66
6.6 Subformulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
6.7 Formation Sequences . . . . . . . . . . . . . . . . . . . . . . . . 69
6.8 Free Variables and Sentences . . . . . . . . . . . . . . . . . . . . 72
6.9 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

7 Semantics of First-Order Logic 75


7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
7.2 Structures for First-order Languages . . . . . . . . . . . . . . . 76
7.3 Covered Structures for First-order Languages . . . . . . . . . . 77
7.4 Satisfaction of a Formula in a Structure . . . . . . . . . . . . . . 78
7.5 Variable Assignments . . . . . . . . . . . . . . . . . . . . . . . . 83
7.6 Extensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.7 Semantic Notions . . . . . . . . . . . . . . . . . . . . . . . . . . 87

8 Theories and Their Models 91


8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
8.2 Expressing Properties of Structures . . . . . . . . . . . . . . . . 93
8.3 Examples of First-Order Theories . . . . . . . . . . . . . . . . . 93
8.4 Expressing Relations in a Structure . . . . . . . . . . . . . . . . 96
8.5 The Theory of Sets . . . . . . . . . . . . . . . . . . . . . . . . . . 97


8.6 Expressing the Size of Structures . . . . . . . . . . . . . . . . . . 99

9 Natural Deduction 101


9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
9.2 Natural Deduction . . . . . . . . . . . . . . . . . . . . . . . . . . 102
9.3 Rules and Derivations . . . . . . . . . . . . . . . . . . . . . . . . 104
9.4 Propositional Rules . . . . . . . . . . . . . . . . . . . . . . . . . 104
9.5 Derivations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
9.6 Examples of Derivations . . . . . . . . . . . . . . . . . . . . . . 107
9.7 Quantifier Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
9.8 Derivations with Quantifiers . . . . . . . . . . . . . . . . . . . . 112
9.9 Proof-Theoretic Notions . . . . . . . . . . . . . . . . . . . . . . . 116
9.10 Derivability and Consistency . . . . . . . . . . . . . . . . . . . . 118
9.11 Derivability and the Propositional Connectives . . . . . . . . . 119
9.12 Derivability and the Quantifiers . . . . . . . . . . . . . . . . . . 120
9.13 Soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
9.14 Derivations with Identity predicate . . . . . . . . . . . . . . . . 125
9.15 Soundness with Identity predicate . . . . . . . . . . . . . . . . . 127

10 The Completeness Theorem 129


10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
10.2 Outline of the Proof . . . . . . . . . . . . . . . . . . . . . . . . . 130
10.3 Complete Consistent Sets of Sentences . . . . . . . . . . . . . . 132
10.4 Henkin Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . 133
10.5 Lindenbaum’s Lemma . . . . . . . . . . . . . . . . . . . . . . . . 135
10.6 Construction of a Model . . . . . . . . . . . . . . . . . . . . . . . 136
10.7 Identity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
10.8 The Completeness Theorem . . . . . . . . . . . . . . . . . . . . 141
10.9 The Compactness Theorem . . . . . . . . . . . . . . . . . . . . . 141
10.10 A Direct Proof of the Compactness Theorem . . . . . . . . . . . 143
10.11 The Löwenheim–Skolem Theorem . . . . . . . . . . . . . . . . . 144

11 Beyond First-order Logic 147


11.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
11.2 Many-Sorted Logic . . . . . . . . . . . . . . . . . . . . . . . . . . 148
11.3 Second-Order logic . . . . . . . . . . . . . . . . . . . . . . . . . . 149
11.4 Higher-Order logic . . . . . . . . . . . . . . . . . . . . . . . . . . 153
11.5 Intuitionistic Logic . . . . . . . . . . . . . . . . . . . . . . . . . . 155
11.6 Modal Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
11.7 Other Logics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160


III Turing Machines 163

12 Turing Machine Computations 165


12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
12.2 Representing Turing Machines . . . . . . . . . . . . . . . . . . . 167
12.3 Turing Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
12.4 Configurations and Computations . . . . . . . . . . . . . . . . . 171
12.5 Unary Representation of Numbers . . . . . . . . . . . . . . . . . 173
12.6 Halting States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
12.7 Disciplined Machines . . . . . . . . . . . . . . . . . . . . . . . . 177
12.8 Combining Turing Machines . . . . . . . . . . . . . . . . . . . . 178
12.9 Variants of Turing Machines . . . . . . . . . . . . . . . . . . . . 180
12.10 The Church–Turing Thesis . . . . . . . . . . . . . . . . . . . . . 182

13 Undecidability 185
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
13.2 Enumerating Turing Machines . . . . . . . . . . . . . . . . . . . 187
13.3 Universal Turing Machines . . . . . . . . . . . . . . . . . . . . . 189
13.4 The Halting Problem . . . . . . . . . . . . . . . . . . . . . . . . . 191
13.5 The Decision Problem . . . . . . . . . . . . . . . . . . . . . . . . 192
13.6 Representing Turing Machines . . . . . . . . . . . . . . . . . . . 193
13.7 Verifying the Representation . . . . . . . . . . . . . . . . . . . . 196
13.8 The Decision Problem is Unsolvable . . . . . . . . . . . . . . . . 201
13.9 Trakhtenbrot’s Theorem . . . . . . . . . . . . . . . . . . . . . . . 202

IV Computability and Incompleteness 207

14 Introduction to Incompleteness 209


14.1 Historical Background . . . . . . . . . . . . . . . . . . . . . . . . 209
14.2 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
14.3 Overview of Incompleteness Results . . . . . . . . . . . . . . . 217
14.4 Undecidability and Incompleteness . . . . . . . . . . . . . . . . 219

15 Recursive Functions 223


15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
15.2 Primitive Recursion . . . . . . . . . . . . . . . . . . . . . . . . . 224
15.3 Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
15.4 Primitive Recursion Functions . . . . . . . . . . . . . . . . . . . 227
15.5 Primitive Recursion Notations . . . . . . . . . . . . . . . . . . . 230
15.6 Primitive Recursive Functions are Computable . . . . . . . . . 230
15.7 Examples of Primitive Recursive Functions . . . . . . . . . . . . 231
15.8 Primitive Recursive Relations . . . . . . . . . . . . . . . . . . . 234
15.9 Bounded Minimization . . . . . . . . . . . . . . . . . . . . . . . 236


15.10 Primes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237


15.11 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
15.12 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
15.13 Other Recursions . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
15.14 Non-Primitive Recursive Functions . . . . . . . . . . . . . . . . 243
15.15 Partial Recursive Functions . . . . . . . . . . . . . . . . . . . . . 244
15.16 The Normal Form Theorem . . . . . . . . . . . . . . . . . . . . . 246
15.17 The Halting Problem . . . . . . . . . . . . . . . . . . . . . . . . . 247
15.18 General Recursive Functions . . . . . . . . . . . . . . . . . . . . 248

16 Arithmetization of Syntax 249


16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
16.2 Coding Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
16.3 Coding Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
16.4 Coding Formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
16.5 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
16.6 Derivations in Natural Deduction . . . . . . . . . . . . . . . . . 255

17 Representability in Q 261
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
17.2 Functions Representable in Q are Computable . . . . . . . . . . 263
17.3 The Beta Function Lemma . . . . . . . . . . . . . . . . . . . . . 264
17.4 Simulating Primitive Recursion . . . . . . . . . . . . . . . . . . 267
17.5 Basic Functions are Representable in Q . . . . . . . . . . . . . . 268
17.6 Composition is Representable in Q . . . . . . . . . . . . . . . . 271
17.7 Regular Minimization is Representable in Q . . . . . . . . . . . 272
17.8 Computable Functions are Representable in Q . . . . . . . . . . 275
17.9 Representing Relations . . . . . . . . . . . . . . . . . . . . . . . 276
17.10 Undecidability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277

18 Incompleteness and Provability 279


18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
18.2 The Fixed-Point Lemma . . . . . . . . . . . . . . . . . . . . . . . 280
18.3 The First Incompleteness Theorem . . . . . . . . . . . . . . . . . 282
18.4 Rosser’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 284
18.5 Comparison with Gödel’s Original Paper . . . . . . . . . . . . . 286
18.6 The Derivability Conditions for PA . . . . . . . . . . . . . . . . 286
18.7 The Second Incompleteness Theorem . . . . . . . . . . . . . . . 287
18.8 Löb’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
18.9 The Undefinability of Truth . . . . . . . . . . . . . . . . . . . . . 292


V Methods 295

A Proofs 297
A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
A.2 Starting a Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
A.3 Using Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
A.4 Inference Patterns . . . . . . . . . . . . . . . . . . . . . . . . . . 300
A.5 An Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
A.6 Another Example . . . . . . . . . . . . . . . . . . . . . . . . . . 309
A.7 Proof by Contradiction . . . . . . . . . . . . . . . . . . . . . . . 310
A.8 Reading Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
A.9 I Can’t Do It! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
A.10 Other Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . 316

B Induction 319
B.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
B.2 Induction on N . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
B.3 Strong Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
B.4 Inductive Definitions . . . . . . . . . . . . . . . . . . . . . . . . 323
B.5 Structural Induction . . . . . . . . . . . . . . . . . . . . . . . . . 325
B.6 Relations and Functions . . . . . . . . . . . . . . . . . . . . . . . 326

C Biographies 331
C.1 Georg Cantor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
C.2 Alonzo Church . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
C.3 Gerhard Gentzen . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
C.4 Kurt Gödel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
C.5 Emmy Noether . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
C.6 Rózsa Péter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
C.7 Julia Robinson . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
C.8 Bertrand Russell . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
C.9 Alfred Tarski . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
C.10 Alan Turing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
C.11 Ernst Zermelo . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343

D Problems 345

Photo Credits 363

Bibliography 365

Part I

Sets, Relations, Functions

Chapter 1

Sets

1.1 Extensionality
A set is a collection of objects, considered as a single object. The objects making
up the set are called elements or members of the set. If x is an element of a set A,
we write x ∈ A; if not, we write x ∉ A. The set which has no elements is
called the empty set and denoted “∅”.
It does not matter how we specify the set, or how we order its elements, or
indeed how many times we count its elements. All that matters is what its
elements are. We codify this in the following principle.

Definition 1.1 (Extensionality). If A and B are sets, then A = B iff every ele-
ment of A is also an element of B, and vice versa.

Extensionality licenses some notation. In general, when we have some


objects a1 , . . . , an , then { a1 , . . . , an } is the set whose elements are a1 , . . . , an . We
emphasise the word “the”, since extensionality tells us that there can be only
one such set. Indeed, extensionality also licenses the following:

{ a, a, b} = { a, b} = {b, a}.

This delivers on the point that, when we consider sets, we don’t care about
the order of their elements, or how many times they are specified.

Example 1.2. Whenever you have a bunch of objects, you can collect them
together in a set. The set of Richard’s siblings, for instance, is a set that con-
tains one person, and we could write it as S = {Ruth}. The set of positive
integers less than 4 is {1, 2, 3}, but it can also be written as {3, 2, 1} or even as
{1, 2, 1, 2, 3}. These are all the same set, by extensionality. For every element
of {1, 2, 3} is also an element of {3, 2, 1} (and of {1, 2, 1, 2, 3}), and vice versa.
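To see extensionality at work concretely, here is a tiny Python sketch (an illustration of ours, assuming we model sets by Python's built-in set type, which is likewise extensional):

    # Python's sets ignore order and repetition, just as extensionality demands.
    print({1, 2, 3} == {3, 2, 1})        # True
    print({1, 2, 3} == {1, 2, 1, 2, 3})  # True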

Frequently we’ll specify a set by some property that its elements share.
We’ll use the following shorthand notation for that: { x | ϕ( x )}, where the


ϕ( x ) stands for the property that x has to have in order to be counted among
the elements of the set.

Example 1.3. In our example, we could have specified S also as

S = { x | x is a sibling of Richard}.

Example 1.4. A number is called perfect iff it is equal to the sum of its proper
divisors (i.e., numbers that evenly divide it but aren’t identical to the number).
For instance, 6 is perfect because its proper divisors are 1, 2, and 3, and 6 =
1 + 2 + 3. In fact, 6 is the only positive integer less than 10 that is perfect. So,
using extensionality, we can say:

{6} = { x | x is perfect and 0 ≤ x ≤ 10}

We read the notation on the right as “the set of x’s such that x is perfect and
0 ≤ x ≤ 10”. The identity here confirms that, when we consider sets, we don’t
care about how they are specified. And, more generally, extensionality guar-
antees that there is always only one set of x’s such that ϕ( x ). So, extensionality
justifies calling { x | ϕ( x )} the set of x’s such that ϕ( x ).

Extensionality gives us a way for showing that sets are identical: to show
that A = B, show that whenever x ∈ A then also x ∈ B, and whenever y ∈ B
then also y ∈ A.

1.2 Subsets and Power Sets


We will often want to compare sets. And one obvious kind of comparison one
might make is as follows: everything in one set is in the other too. This situation
is sufficiently important for us to introduce some new notation.

Definition 1.5 (Subset). If every element of a set A is also an element of B,


then we say that A is a subset of B, and write A ⊆ B. If A is not a subset of B
we write A ⊈ B. If A ⊆ B but A ≠ B, we write A ⊊ B and say that A is a
proper subset of B.

Example 1.6. Every set is a subset of itself, and ∅ is a subset of every set. The
set of even numbers is a subset of the set of natural numbers. Also, { a, b} ⊆
{ a, b, c}. But { a, b, e} is not a subset of { a, b, c}.

Example 1.7. The number 2 is an element of the set of integers, whereas the
set of even numbers is a subset of the set of integers. However, a set may hap-
pen to both be an element and a subset of some other set, e.g., {0} ∈ {0, {0}}
and also {0} ⊆ {0, {0}}.


Extensionality gives a criterion of identity for sets: A = B iff every element


of A is also an element of B and vice versa. The definition of “subset” defines
A ⊆ B precisely as the first half of this criterion: every element of A is also
an element of B. Of course the definition also applies if we switch A and B:
that is, B ⊆ A iff every element of B is also an element of A. And that, in turn,
is exactly the “vice versa” part of extensionality. In other words, extensionality
entails that sets are equal iff they are subsets of one another.

Proposition 1.8. A = B iff both A ⊆ B and B ⊆ A.

Now is also a good opportunity to introduce some further bits of helpful


notation. In defining when A is a subset of B we said that “every element of A
is . . . ,” and filled the “. . . ” with “an element of B”. But this is such a common
shape of expression that it will be helpful to introduce some formal notation
for it.

Definition 1.9. (∀ x ∈ A)ϕ abbreviates ∀ x ( x ∈ A ⊃ ϕ). Similarly, (∃ x ∈ A)ϕ


abbreviates ∃ x ( x ∈ A & ϕ).

Using this notation, we can say that A ⊆ B iff (∀ x ∈ A) x ∈ B.


Now we move on to considering a certain kind of set: the set of all subsets
of a given set.

Definition 1.10 (Power Set). The set consisting of all subsets of a set A is called
the power set of A, written ℘( A).

℘( A) = { B | B ⊆ A}

Example 1.11. What are all the possible subsets of { a, b, c}? They are: ∅,
{ a}, {b}, {c}, { a, b}, { a, c}, {b, c}, { a, b, c}. The set of all these subsets is
℘({ a, b, c}):

℘({ a, b, c}) = {∅, { a}, {b}, {c}, { a, b}, {b, c}, { a, c}, { a, b, c}}
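For readers who like to experiment, the subsets of a finite set can be generated mechanically. The following Python sketch (the helper name powerset is our own illustrative choice) lists all 2ⁿ subsets of an n-element set:

    from itertools import combinations

    def powerset(A):
        """Return the set of all subsets of A, each as a frozenset."""
        elems = list(A)
        return {frozenset(c) for r in range(len(elems) + 1)
                for c in combinations(elems, r)}

    print(len(powerset({'a', 'b', 'c'})))  # 8, matching Example 1.11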


1.3 Some Important Sets


Example 1.12. We will mostly be dealing with sets whose elements are math-
ematical objects. Four such sets are important enough to have specific names:

N = {0, 1, 2, 3, . . .}                       the set of natural numbers
Z = {. . . , −2, −1, 0, 1, 2, . . .}          the set of integers
Q = {m/n | m, n ∈ Z and n ≠ 0}                the set of rationals
R = (−∞, ∞)                                   the set of real numbers (the continuum)
These are all infinite sets, that is, they each have infinitely many elements.
As we move through these sets, we are adding more numbers to our stock.
Indeed, it should be clear that N ⊆ Z ⊆ Q ⊆ R: after all, every natural
number is an integer; every integer is a rational; and every rational is a real.
Equally, it should be clear that N ⊊ Z ⊊ Q, since −1 is an integer but not
a natural number, and 1/2 is rational but not an integer. It is less obvious that
Q ⊊ R, i.e., that there are some real numbers which are not rational.
We’ll sometimes also use the set of positive integers Z+ = {1, 2, 3, . . . } and
the set containing just the first two natural numbers B = {0, 1}.

Example 1.13 (Strings). Another interesting example is the set A∗ of finite


strings over an alphabet A: any finite sequence of elements of A is a string
over A. We include the empty string Λ among the strings over A, for every
alphabet A. For instance,

B∗ = {Λ, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, 0000, . . .}.
If x = x1 . . . xn ∈ A∗ is a string consisting of n “letters” from A, then we say the
length of the string is n and write len( x ) = n.

Example 1.14 (Infinite sequences). For any set A we may also consider the
set Aω of infinite sequences of elements of A. An infinite sequence a1 a2 a3 a4 . . .
consists of a one-way infinite list of objects, each one of which is an element
of A.

1.4 Unions and Intersections


In section 1.1, we introduced definitions of sets by abstraction, i.e., definitions
of the form { x | ϕ( x )}. Here, we invoke some property ϕ, and this property


Figure 1.1: The union A ∪ B of two sets is the set of elements of A together with
those of B.

can mention sets we’ve already defined. So for instance, if A and B are sets,
the set { x | x ∈ A ∨ x ∈ B} consists of all those objects which are elements
of either A or B, i.e., it’s the set that combines the elements of A and B. We
can visualize this as in Figure 1.1, where the highlighted area indicates the
elements of the two sets A and B together.
This operation on sets—combining them—is very useful and common,
and so we give it a formal name and a symbol.
Definition 1.15 (Union). The union of two sets A and B, written A ∪ B, is the
set of all things which are elements of A, B, or both.

A ∪ B = { x | x ∈ A ∨ x ∈ B}

Example 1.16. Since the multiplicity of elements doesn’t matter, the union of
two sets which have an element in common contains that element only once,
e.g., { a, b, c} ∪ { a, 0, 1} = { a, b, c, 0, 1}.
The union of a set and one of its subsets is just the bigger set: { a, b, c} ∪
{ a} = { a, b, c}.
The union of a set with the empty set is identical to the set: { a, b, c} ∪ ∅ =
{ a, b, c}.

We can also consider a “dual” operation to union. This is the operation


that forms the set of all elements that are elements of A and are also elements
of B. This operation is called intersection, and can be depicted as in Figure 1.2.

Definition 1.17 (Intersection). The intersection of two sets A and B, written


A ∩ B, is the set of all things which are elements of both A and B.

A ∩ B = { x | x ∈ A & x ∈ B}

Two sets are called disjoint if their intersection is empty. This means they have
no elements in common.


Figure 1.2: The intersection A ∩ B of two sets is the set of elements they have
in common.

Example 1.18. If two sets have no elements in common, their intersection is


empty: { a, b, c} ∩ {0, 1} = ∅.
If two sets do have elements in common, their intersection is the set of all
those: { a, b, c} ∩ { a, b, d} = { a, b}.
The intersection of a set with one of its subsets is just the smaller set:
{ a, b, c} ∩ { a, b} = { a, b}.
The intersection of any set with the empty set is empty: { a, b, c} ∩ ∅ = ∅.

We can also form the union or intersection of more than two sets. An
elegant way of dealing with this in general is the following: suppose you
collect all the sets you want to form the union (or intersection) of into a single
set. Then we can define the union of all our original sets as the set of all objects
which belong to at least one element of the set, and the intersection as the set
of all objects which belong to every element of the set.
Definition 1.19. If A is a set of sets, then ⋃A is the set of elements of elements
of A:

⋃A = { x | x belongs to an element of A}, i.e.,
   = { x | there is a B ∈ A so that x ∈ B}

Definition 1.20. If A is a set of sets, then ⋂A is the set of objects which all
elements of A have in common:

⋂A = { x | x belongs to every element of A}, i.e.,
   = { x | for all B ∈ A, x ∈ B}

Example 1.21. Suppose A = {{ a, b}, { a, d, e}, { a, d}}. Then ⋃A = { a, b, d, e}
and ⋂A = { a}.
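Computed concretely, in a Python sketch of ours (with the set of sets modelled as a list of Python sets):

    A = [{'a', 'b'}, {'a', 'd', 'e'}, {'a', 'd'}]
    big_union = set().union(*A)              # elements of elements of A
    big_intersection = set.intersection(*A)  # elements common to every member of A
    print(big_union)                         # {'a', 'b', 'd', 'e'}
    print(big_intersection)                  # {'a'}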


Figure 1.3: The difference A \ B of two sets is the set of those elements of A
which are not also elements of B.

We could also do the same for a sequence of sets A1 , A2 , . . .

⋃i Ai = { x | x belongs to one of the Ai }
⋂i Ai = { x | x belongs to every Ai }.

When we have an index of sets, i.e., some set I such that we are considering
Ai for each i ∈ I, we may also use these abbreviations:

⋃i∈I Ai = ⋃{ Ai | i ∈ I }
⋂i∈I Ai = ⋂{ Ai | i ∈ I }

Finally, we may want to think about the set of all elements in A which are
not in B. We can depict this as in Figure 1.3.
Definition 1.22 (Difference). The set difference A \ B is the set of all elements
of A which are not also elements of B, i.e.,
A \ B = { x | x ∈ A and x ∉ B }.

1.5 Pairs, Tuples, Cartesian Products


It follows from extensionality that sets have no order to their elements. So if
we want to represent order, we use ordered pairs ⟨ x, y⟩. In an unordered pair
{ x, y}, the order does not matter: { x, y} = {y, x }. In an ordered pair, it does:
if x ≠ y, then ⟨ x, y⟩ ≠ ⟨y, x ⟩.
How should we think about ordered pairs in set theory? Crucially, we
want to preserve the idea that ordered pairs are identical iff they share the
same first element and share the same second element, i.e.:
⟨ a, b⟩ = ⟨c, d⟩ iff both a = c and b = d.


We can define ordered pairs in set theory using the Wiener–Kuratowski defi-
nition.

Definition 1.23 (Ordered pair). ⟨ a, b⟩ = {{ a}, { a, b}}.

Having fixed a definition of an ordered pair, we can use it to define fur-


ther sets. For example, sometimes we also want ordered sequences of more
than two objects, e.g., triples ⟨ x, y, z⟩, quadruples ⟨ x, y, z, u⟩, and so on. We can
think of triples as special ordered pairs, where the first element is itself an or-
dered pair: ⟨ x, y, z⟩ is ⟨⟨ x, y⟩, z⟩. The same is true for quadruples: ⟨ x, y, z, u⟩ is
⟨⟨⟨ x, y⟩, z⟩, u⟩, and so on. In general, we talk of ordered n-tuples ⟨ x1 , . . . , xn ⟩.
Certain sets of ordered pairs, or other ordered n-tuples, will be useful.

Definition 1.24 (Cartesian product). Given sets A and B, their Cartesian prod-
uct A × B is defined by

A × B = {⟨ x, y⟩ | x ∈ A and y ∈ B}.

Example 1.25. If A = {0, 1}, and B = {1, a, b}, then their product is

A × B = {⟨0, 1⟩, ⟨0, a⟩, ⟨0, b⟩, ⟨1, 1⟩, ⟨1, a⟩, ⟨1, b⟩}.

Example 1.26. If A is a set, the product of A with itself, A × A, is also writ-


ten A2 . It is the set of all pairs ⟨ x, y⟩ with x, y ∈ A. The set of all triples ⟨ x, y, z⟩
is A3 , and so on. We can give a recursive definition:

A1 = A
Ak+1 = Ak × A

Proposition 1.27. If A has n elements and B has m elements, then A × B has n · m


elements.

Proof. For every element x in A, there are m elements of the form ⟨ x, y⟩ ∈


A × B. Let Bx = {⟨ x, y⟩ | y ∈ B}. Since whenever x1 ≠ x2 , ⟨ x1 , y⟩ ≠ ⟨ x2 , y⟩,
Bx1 ∩ Bx2 = ∅. But if A = { x1 , . . . , xn }, then A × B = Bx1 ∪ · · · ∪ Bxn , and so
has n · m elements.
To visualize this, arrange the elements of A × B in a grid:

Bx1 = {⟨ x1 , y1 ⟩ ⟨ x1 , y2 ⟩ ... ⟨ x1 , ym ⟩}
Bx2 = {⟨ x2 , y1 ⟩ ⟨ x2 , y2 ⟩ ... ⟨ x2 , ym ⟩}
 ⋮                 ⋮                       ⋮
Bxn = {⟨ xn , y1 ⟩ ⟨ xn , y2 ⟩ . . . ⟨ xn , ym ⟩}

Since the xi are all different, and the y j are all different, no two of the pairs in
this grid are the same, and there are n · m of them.
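The counting argument can be checked mechanically for small sets. Here is an illustrative Python sketch (ours), using itertools.product for the Cartesian product:

    from itertools import product

    A = {0, 1}
    B = {1, 'a', 'b'}
    AxB = set(product(A, B))            # all pairs (x, y) with x in A and y in B
    print(len(AxB) == len(A) * len(B))  # True: 6 = 2 * 3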


Example 1.28. If A is a set, a word over A is any sequence of elements of A. A


sequence can be thought of as an n-tuple of elements of A. For instance, if A =
{ a, b, c}, then the sequence “bac” can be thought of as the triple ⟨b, a, c⟩. Words,
i.e., sequences of symbols, are of crucial importance in computer science. By
convention, we count elements of A as sequences of length 1, and ∅ as the
sequence of length 0. The set of all words over A then is

A∗ = {∅} ∪ A ∪ A2 ∪ A3 ∪ . . .

1.6 Russell’s Paradox


Extensionality licenses the notation { x | ϕ( x )}, for the set of x’s such that ϕ( x ).
However, all that extensionality really licenses is the following thought. If
there is a set whose members are all and only the ϕ’s, then there is only one
such set. Otherwise put: having fixed some ϕ, the set { x | ϕ( x )} is unique, if
it exists.
But this conditional is important! Crucially, not every property lends itself
to comprehension. That is, some properties do not define sets. If they all did,
then we would run into outright contradictions. The most famous example of
this is Russell’s Paradox.
Sets may be elements of other sets—for instance, the power set of a set A
is made up of sets. And so it makes sense to ask or investigate whether a set
is an element of another set. Can a set be a member of itself? Nothing about
the idea of a set seems to rule this out. For instance, if all sets form a collection
of objects, one might think that they can be collected into a single set—the set
of all sets. And it, being a set, would be an element of the set of all sets.
Russell’s Paradox arises when we consider the property of not having itself
as an element, of being non-self-membered. What if we suppose that there is a
set of all sets that do not have themselves as an element? Does

R = { x | x ∉ x }

exist? It turns out that we can prove that it does not.

Theorem 1.29 (Russell’s Paradox). There is no set R = { x | x ∉ x }.

Proof. If R = { x | x ∉ x } exists, then R ∈ R iff R ∉ R, which is a contradiction.

Let’s run through this proof more slowly. If R exists, it makes sense to ask
whether R ∈ R or not. Suppose that indeed R ∈ R. Now, R was defined as
the set of all sets that are not elements of themselves. So, if R ∈ R, then R does
not itself have R’s defining property. But only sets that have this property are
in R, hence, R cannot be an element of R, i.e., R ∉ R. But R can’t both be and
not be an element of R, so we have a contradiction.


Since the assumption that R ∈ R leads to a contradiction, we have R ∉ R.


But this also leads to a contradiction! For if R ∉ R, then R itself does have R’s
defining property, and so R would be an element of R just like all the other
non-self-membered sets. And again, it can’t both not be and be an element
of R.
How do we set up a set theory which avoids falling into Russell’s Para-
dox, i.e., which avoids making the inconsistent claim that R = { x | x ∉ x }
exists? Well, we would need to lay down axioms which give us very precise
conditions for stating when sets exist (and when they don’t).
The set theory sketched in this chapter doesn’t do this. It’s genuinely naïve.
It tells you only that sets obey extensionality and that, if you have some sets,
you can form their union, intersection, etc. It is possible to develop set theory
more rigorously than this.

Chapter 2

Relations

2.1 Relations as Sets


In section 1.3, we mentioned some important sets: N, Z, Q, R. You will no
doubt remember some interesting relations between the elements of some of
these sets. For instance, each of these sets has a completely standard order
relation on it. There is also the relation is identical with that every object bears
to itself and to no other thing. There are many more interesting relations that
we’ll encounter, and even more possible relations. Before we review them,
though, we will start by pointing out that we can look at relations as a special
sort of set.
For this, recall two things from section 1.5. First, recall the notion of an ordered
pair: given a and b, we can form ⟨ a, b⟩. Importantly, the order of elements
does matter here. So if a ≠ b then ⟨ a, b⟩ ≠ ⟨b, a⟩. (Contrast this with unordered
pairs, i.e., 2-element sets, where { a, b} = {b, a}.) Second, recall the notion of
a Cartesian product: if A and B are sets, then we can form A × B, the set of all
pairs ⟨ x, y⟩ with x ∈ A and y ∈ B. In particular, A2 = A × A is the set of all
ordered pairs from A.
Now we will consider a particular relation on a set: the <-relation on the
set N of natural numbers. Consider the set of all pairs of numbers ⟨n, m⟩
where n < m, i.e.,
R = {⟨n, m⟩ | n, m ∈ N and n < m}.
There is a close connection between n being less than m, and the pair ⟨n, m⟩
being a member of R, namely:
n < m iff ⟨n, m⟩ ∈ R.
Indeed, without any loss of information, we can consider the set R to be the
<-relation on N.
In the same way we can construct a subset of N2 for any relation between
numbers. Conversely, given any set of pairs of numbers S ⊆ N2 , there is a


corresponding relation between numbers, namely, the relationship n bears to


m if and only if ⟨n, m⟩ ∈ S. This justifies the following definition:
Definition 2.1 (Binary relation). A binary relation on a set A is a subset of A2 .
If R ⊆ A2 is a binary relation on A and x, y ∈ A, we sometimes write Rxy (or
xRy) for ⟨ x, y⟩ ∈ R.

Example 2.2. The set N2 of pairs of natural numbers can be listed in a 2-


dimensional matrix like this:
⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ ...
⟨1, 0⟩ ⟨1, 1⟩ ⟨1, 2⟩ ⟨1, 3⟩ ...
⟨2, 0⟩ ⟨2, 1⟩ ⟨2, 2⟩ ⟨2, 3⟩ ...
⟨3, 0⟩ ⟨3, 1⟩ ⟨3, 2⟩ ⟨3, 3⟩ ...
 ⋮        ⋮        ⋮        ⋮        ⋱

We have put the diagonal, here, in bold, since the subset of N2 consisting of
the pairs lying on the diagonal, i.e.,

{⟨0, 0⟩, ⟨1, 1⟩, ⟨2, 2⟩, . . . },

is the identity relation on N. (Since the identity relation is popular, let’s define
Id A = {⟨ x, x ⟩ | x ∈ A} for any set A.) The subset of all pairs lying above the
diagonal, i.e.,

L = {⟨0, 1⟩, ⟨0, 2⟩, . . . , ⟨1, 2⟩, ⟨1, 3⟩, . . . , ⟨2, 3⟩, ⟨2, 4⟩, . . .},

is the less than relation, i.e., Lnm iff n < m. The subset of pairs below the
diagonal, i.e.,

G = {⟨1, 0⟩, ⟨2, 0⟩, ⟨2, 1⟩, ⟨3, 0⟩, ⟨3, 1⟩, ⟨3, 2⟩, . . . },

is the greater than relation, i.e., Gnm iff n > m. The union of L with the identity
relation IdN , which we might call K = L ∪ IdN , is the less than or equal to relation:
Knm iff n ≤ m. Similarly, H = G ∪ IdN is the greater than or equal to relation. These relations L, G,
K, and H are special kinds of relations called orders. L and G have the property
that no number bears L or G to itself (i.e., for all n, neither Lnn nor Gnn).
Relations with this property are called irreflexive, and, if they also happen to
be orders, they are called strict orders.

Although orders and identity are important and natural relations, it should
be emphasized that according to our definition any subset of A2 is a relation
on A, regardless of how unnatural or contrived it seems. In particular, ∅ is a
relation on any set (the empty relation, which no pair of elements bears), and
A2 itself is a relation on A as well (one which every pair bears), called the
universal relation. But also something like E = {⟨n, m⟩ | n > 5 or m × n ≥ 34}
counts as a relation.
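Since a relation on A is just a set of pairs from A², it can be built directly. The following Python sketch (our illustration) constructs the less-than relation and the identity relation on a finite initial segment of N:

    N5 = range(5)
    L = {(n, m) for n in N5 for m in N5 if n < m}   # the <-relation, as a set of pairs
    Id = {(n, n) for n in N5}                        # the identity relation
    print((1, 3) in L, (3, 1) in L)                  # True False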


2.2 Special Properties of Relations


Some kinds of relations turn out to be so common that they have been given
special names. For instance, ≤ and ⊆ both relate their respective domains
(say, N in the case of ≤ and ℘( A) in the case of ⊆) in similar ways. To get
at exactly how these relations are similar, and how they differ, we categorize
them according to some special properties that relations can have. It turns out
that (combinations of) some of these special properties are especially impor-
tant: orders and equivalence relations.

Definition 2.3 (Reflexivity). A relation R ⊆ A2 is reflexive iff, for every x ∈ A,


Rxx.

Definition 2.4 (Transitivity). A relation R ⊆ A2 is transitive iff, whenever Rxy


and Ryz, then also Rxz.

Definition 2.5 (Symmetry). A relation R ⊆ A2 is symmetric iff, whenever Rxy,


then also Ryx.

Definition 2.6 (Anti-symmetry). A relation R ⊆ A2 is anti-symmetric iff, when-


ever both Rxy and Ryx, then x = y (or, in other words: if x ≠ y then either
∼ Rxy or ∼ Ryx).

In a symmetric relation, Rxy and Ryx always hold together, or neither


holds. In an anti-symmetric relation, the only way for Rxy and Ryx to hold to-
gether is if x = y. Note that this does not require that Rxy and Ryx hold when
x = y, only that it isn’t ruled out. So an anti-symmetric relation can be reflex-
ive, but it is not the case that every anti-symmetric relation is reflexive. Also
note that being anti-symmetric and merely not being symmetric are different
conditions. In fact, a relation can be both symmetric and anti-symmetric at the
same time (e.g., the identity relation is).

Definition 2.7 (Connectivity). A relation R ⊆ A2 is connected if for all x, y ∈


A, if x ≠ y, then either Rxy or Ryx.

Definition 2.8 (Irreflexivity). A relation R ⊆ A2 is called irreflexive if, for all


x ∈ A, not Rxx.

Definition 2.9 (Asymmetry). A relation R ⊆ A2 is called asymmetric if for no


pair x, y ∈ A we have both Rxy and Ryx.

Note that if A ≠ ∅, then no irreflexive relation on A is reflexive and every


asymmetric relation on A is also anti-symmetric. However, there are R ⊆ A2
that are not reflexive and also not irreflexive, and there are anti-symmetric
relations that are not asymmetric.
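These properties are easy to test for finite relations. Here is a minimal Python sketch (the helper names are ours), with a relation given as a set of pairs and its domain as a set:

    def is_reflexive(R, A):
        return all((x, x) in R for x in A)

    def is_symmetric(R):
        return all((y, x) in R for (x, y) in R)

    def is_transitive(R):
        return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

    def is_antisymmetric(R):
        return all(x == y for (x, y) in R if (y, x) in R)

    A = {1, 2, 3}
    LEQ = {(x, y) for x in A for y in A if x <= y}
    print(is_reflexive(LEQ, A), is_antisymmetric(LEQ), is_transitive(LEQ))  # True True True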


2.3 Equivalence Relations


The identity relation on a set is reflexive, symmetric, and transitive. Rela-
tions R that have all three of these properties are very common.

Definition 2.10 (Equivalence relation). A relation R ⊆ A2 that is reflexive,


symmetric, and transitive is called an equivalence relation. Elements x and y
of A are said to be R-equivalent if Rxy.

Equivalence relations give rise to the notion of an equivalence class. An


equivalence relation “chunks up” the domain into different partitions. Within
each partition, all the objects are related to one another; and no objects from
different partitions relate to one another. Sometimes, it’s helpful just to talk
about these partitions directly. To that end, we introduce a definition:

Definition 2.11. Let R ⊆ A2 be an equivalence relation. For each x ∈ A, the


equivalence class of x in A is the set [ x ] R = {y ∈ A | Rxy}. The quotient of A
under R is A/R = {[ x ] R | x ∈ A}, i.e., the set of these equivalence classes.

The next result vindicates the definition of an equivalence class, in proving


that the equivalence classes are indeed the partitions of A:

Proposition 2.12. If R ⊆ A2 is an equivalence relation, then Rxy iff [ x ] R = [y] R .

Proof. For the left-to-right direction, suppose Rxy, and let z ∈ [ x ] R . By defi-
nition, then, Rxz. Since R is an equivalence relation, Ryz. (Spelling this out:
as Rxy and R is symmetric we have Ryx, and as Rxz and R is transitive we
have Ryz.) So z ∈ [y] R . Generalising, [ x ] R ⊆ [y] R . But exactly similarly,
[y] R ⊆ [ x ] R . So [ x ] R = [y] R , by extensionality.
For the right-to-left direction, suppose [ x ] R = [y] R . Since R is reflexive,
Ryy, so y ∈ [y] R . Thus also y ∈ [ x ] R by the assumption that [ x ] R = [y] R . So
Rxy.

Example 2.13. A nice example of equivalence relations comes from modular


arithmetic. For any a, b, and n ∈ Z+ , say that a ≡n b iff dividing a by n gives
the same remainder as dividing b by n. (Somewhat more symbolically: a ≡n b
iff, for some k ∈ Z, a − b = kn.) Now, ≡n is an equivalence relation, for any n.
And there are exactly n distinct equivalence classes generated by ≡n ; that is,
N/≡n has n elements. These are: the set of numbers divisible by n without
remainder, i.e., [0]≡n ; the set of numbers divisible by n with remainder 1, i.e.,
[1]≡n ; . . . ; and the set of numbers divisible by n with remainder n − 1, i.e., [n − 1]≡n .
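A Python sketch (ours, working over a finite slice of the numbers) that computes these classes and the quotient:

    n = 3
    A = range(12)

    def eq_class(x):
        return frozenset(y for y in A if y % n == x % n)   # [x] under congruence mod n

    quotient = {eq_class(x) for x in A}
    print(len(quotient))        # 3 equivalence classes
    print(sorted(eq_class(1)))  # [1, 4, 7, 10]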


2.4 Orders
Many of our comparisons involve describing some objects as being “less than”,
“equal to”, or “greater than” other objects, in a certain respect. These involve
order relations. But there are different kinds of order relations. For instance,
some require that any two objects be comparable, others don’t. Some include
identity (like ≤) and some exclude it (like <). It will help us to have a taxon-
omy here.

Definition 2.14 (Preorder). A relation which is both reflexive and transitive is


called a preorder.

Definition 2.15 (Partial order). A preorder which is also anti-symmetric is called


a partial order.

Definition 2.16 (Linear order). A partial order which is also connected is called
a total order or linear order.

Example 2.17. Every linear order is also a partial order, and every partial or-
der is also a preorder, but the converses don’t hold. The universal relation
on A is a preorder, since it is reflexive and transitive. But, if A has more than
one element, the universal relation is not anti-symmetric, and so not a partial
order.

Example 2.18. Consider the no longer than relation ≼ on B∗ : x ≼ y iff len( x ) ≤


len(y). This is a preorder (reflexive and transitive), and even connected, but
not a partial order, since it is not anti-symmetric. For instance, 01 ≼ 10 and
10 ≼ 01, but 01 ≠ 10.

Example 2.19. An important partial order is the relation ⊆ on a set of sets.


This is not in general a linear order, since if a ≠ b and we consider ℘({ a, b}) =
{∅, { a}, {b}, { a, b}}, we see that { a} ⊈ {b} and { a} ≠ {b} and {b} ⊈ { a}.

Example 2.20. The relation of divisibility without remainder gives us a partial


order which isn’t a linear order. For integers n, m, we write n | m to mean
n (evenly) divides m, i.e., iff there is some integer k so that m = kn. On N,
this is a partial order, but not a linear order: for instance, 2 ∤ 3 and also 3 ∤ 2.
Considered as a relation on Z, divisibility is only a preorder since it is not
anti-symmetric: 1 | −1 and −1 | 1 but 1 ≠ −1.
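To see concretely that divisibility on the positive integers is a partial order but not a linear order, here is a small Python check (illustrative only; we look at 1, . . . , 10):

    A = range(1, 11)
    D = {(n, m) for n in A for m in A if m % n == 0}   # n | m
    print(all((n, n) in D for n in A))                  # reflexive: True
    print((2, 3) in D or (3, 2) in D)                   # 2 and 3 are incomparable: False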

Definition 2.21 (Strict order). A strict order is a relation which is irreflexive,


asymmetric, and transitive.

Definition 2.22 (Strict linear order). A strict order which is also connected is
called a strict total order or strict linear order.


Example 2.23. ≤ is the linear order corresponding to the strict linear order <.
⊆ is the partial order corresponding to the strict order ⊊.

Any strict order R on A can be turned into a partial order by adding the
diagonal Id A , i.e., adding all the pairs ⟨ x, x ⟩. (This is called the reflexive closure
of R.) Conversely, starting from a partial order, one can get a strict order by
removing Id A . These next two results make this precise.

Proposition 2.24. If R is a strict order on A, then R+ = R ∪ Id A is a partial order.


Moreover, if R is a strict linear order, then R+ is a linear order.

Proof. Suppose R is a strict order, i.e., R ⊆ A2 and R is irreflexive, asymmetric,


and transitive. Let R+ = R ∪ Id A . We have to show that R+ is reflexive, anti-
symmetric, and transitive.
R+ is clearly reflexive, since ⟨ x, x ⟩ ∈ Id A ⊆ R+ for all x ∈ A.
To show R+ is anti-symmetric, suppose for reductio that R+ xy and R+ yx
but x ≠ y. Since ⟨ x, y⟩ ∈ R ∪ Id A , but ⟨ x, y⟩ ∉ Id A , we must have ⟨ x, y⟩ ∈ R,
i.e., Rxy. Similarly, Ryx. But this contradicts the assumption that R is asym-
metric.
To establish transitivity, suppose that R+ xy and R+ yz. If both ⟨ x, y⟩ ∈ R
and ⟨y, z⟩ ∈ R, then ⟨ x, z⟩ ∈ R since R is transitive. Otherwise, either ⟨ x, y⟩ ∈
Id A , i.e., x = y, or ⟨y, z⟩ ∈ Id A , i.e., y = z. In the first case, we have that R+ yz
by assumption, x = y, hence R+ xz. Similarly in the second case. In either
case, R+ xz, thus, R+ is also transitive.
Concerning the “moreover” clause, suppose that R is also connected. So
for all x ≠ y, either Rxy or Ryx, i.e., either ⟨ x, y⟩ ∈ R or ⟨y, x ⟩ ∈ R. Since
R ⊆ R+ , this remains true of R+ , so R+ is connected as well.

Proposition 2.25. If R is a partial order on A, then R− = R \ Id A is a strict order.


Moreover, if R is a linear order, then R− is a strict linear order.

Proof. This is left as an exercise.

The following simple result establishes that strict linear orders satisfy an
extensionality-like property:

Proposition 2.26. If < is a strict linear order on A, then:

(∀ a, b ∈ A)((∀ x ∈ A)( x < a ≡ x < b) ⊃ a = b).

Proof. Suppose (∀ x ∈ A)( x < a ≡ x < b). If a < b, then a < a, contradicting
the fact that < is irreflexive; so a ≮ b. Exactly similarly, b ≮ a. So a = b, as <
is connected.


2.5 Graphs

A graph is a diagram in which points—called “nodes” or “vertices” (plural of


“vertex”)—are connected by edges. Graphs are a ubiquitous tool in discrete
mathematics and in computer science. They are incredibly useful for repre-
senting, and visualizing, relationships and structures, from concrete things
like networks of various kinds to abstract structures such as the possible out-
comes of decisions. There are many different kinds of graphs in the literature
which differ, e.g., according to whether the edges are directed or not, have la-
bels or not, whether there can be edges from a node to the same node, multiple
edges between the same nodes, etc. Directed graphs have a special connection
to relations.

Definition 2.27 (Directed graph). A directed graph G = ⟨V, E⟩ is a set of ver-


tices V and a set of edges E ⊆ V 2 .

According to our definition, a graph just is a set together with a relation


on that set. Of course, when talking about graphs, it’s only natural to expect
that they are graphically represented: we can draw a graph by connecting two
vertices v1 and v2 by an arrow iff ⟨v1 , v2 ⟩ ∈ E. The only difference between a
relation by itself and a graph is that a graph specifies the set of vertices, i.e., a
graph may have isolated vertices. The important point, however, is that every
relation R on a set X can be seen as a directed graph ⟨ X, R⟩, and conversely, a
directed graph ⟨V, E⟩ can be seen as a relation E ⊆ V 2 with the set V explicitly
specified.

Example 2.28. The graph ⟨V, E⟩ with V = {1, 2, 3, 4} and E = {⟨1, 1⟩, ⟨1, 2⟩,
⟨1, 3⟩, ⟨2, 3⟩} looks like this:

[Figure: vertices 1, 2, 3, 4 with an arrow from 1 to itself, from 1 to 2, from 1 to 3, and from 2 to 3; vertex 4 is isolated.]


This is a different graph than ⟨V ′ , E⟩ with V ′ = {1, 2, 3}, which looks like this:

[Figure: vertices 1, 2, 3 with the same arrows; there is no isolated vertex.]

2.6 Operations on Relations


It is often useful to modify or combine relations. In Proposition 2.24, we con-
sidered the union of relations, which is just the union of two relations consid-
ered as sets of pairs. Similarly, in Proposition 2.25, we considered the relative
difference of relations. Here are some other operations we can perform on
relations.

Definition 2.29. Let R, S be relations, and A be any set.


The inverse of R is R−1 = {⟨y, x ⟩ | ⟨ x, y⟩ ∈ R}.
The relative product of R and S is ( R | S) = {⟨ x, z⟩ : ∃y( Rxy & Syz)}.
The restriction of R to A is R↾ A = R ∩ A2 .
The application of R to A is R[ A] = {y : (∃ x ∈ A) Rxy}.

Example 2.30. Let S ⊆ Z2 be the successor relation on Z, i.e., S = {⟨ x, y⟩ ∈


Z2 | x + 1 = y}, so that Sxy iff x + 1 = y.
S−1 is the predecessor relation on Z, i.e., {⟨ x, y⟩ ∈ Z2 | x − 1 = y}.
S | S is {⟨ x, y⟩ ∈ Z2 | x + 2 = y}
S↾N is the successor relation on N.
S[{1, 2, 3}] is {2, 3, 4}.
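For finite relations these operations are one-liners. A Python sketch (ours), using the successor relation on a finite slice of Z:

    S = {(x, x + 1) for x in range(-3, 3)}                   # successor, on a finite slice of Z
    inverse = {(y, x) for (x, y) in S}                       # the predecessor relation
    relprod = {(x, z) for (x, y) in S for (v, z) in S if y == v}   # S | S: pairs with x + 2 = z
    restricted = {(x, y) for (x, y) in S if x >= 0 and y >= 0}     # S restricted to N
    image = {y for (x, y) in S if x in {1, 2}}                     # S[{1, 2}]
    print(image)   # {2, 3}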

Definition 2.31 (Transitive closure). Let R ⊆ A2 be a binary relation.


The transitive closure of R is R+ = ⋃0<n∈N Rn , where we recursively define
R1 = R and Rn+1 = Rn | R.
The reflexive transitive closure of R is R∗ = R+ ∪ Id A .

Example 2.32. Take the successor relation S ⊆ Z2 . S2 xy iff x + 2 = y, S3 xy iff


x + 3 = y, etc. So S+ xy iff x + n = y for some n ≥ 1. In other words, S+ xy iff
x < y, and S∗ xy iff x ≤ y.
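The transitive closure of a finite relation can be computed by iterating the relative product until nothing new appears; a Python sketch (ours):

    def transitive_closure(R):
        closure = set(R)
        while True:
            step = {(x, z) for (x, y) in closure for (v, z) in closure if y == v}
            if step <= closure:          # nothing new: we have reached R+
                return closure
            closure |= step

    S = {(x, x + 1) for x in range(5)}      # successor on {0, ..., 5}
    print(sorted(transitive_closure(S)))    # all pairs (x, y) with x < y <= 5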

Chapter 3

Functions

3.1 Basics
A function is a map which sends each element of a given set to a specific ele-
ment in some (other) given set. For instance, the operation of adding 1 defines
a function: each number n is mapped to a unique number n + 1.
More generally, functions may take pairs, triples, etc., as inputs and re-
turn some kind of output. Many functions are familiar to us from basic arith-
metic. For instance, addition and multiplication are functions. They take in
two numbers and return a third.
In this mathematical, abstract sense, a function is a black box: what matters
is only what output is paired with what input, not the method for calculating
the output.

Definition 3.1 (Function). A function f : A → B is a mapping of each element


of A to an element of B.
We call A the domain of f and B the codomain of f . The elements of A are
called inputs or arguments of f , and the element of B that is paired with an
argument x by f is called the value of f for argument x, written f ( x ).
The range ran( f ) of f is the subset of the codomain consisting of the values
of f for some argument; ran( f ) = { f ( x ) | x ∈ A}.

The diagram in Figure 3.1 may help to think about functions. The ellipse
on the left represents the function’s domain; the ellipse on the right represents
the function’s codomain; and an arrow points from an argument in the domain
to the corresponding value in the codomain.

Example 3.2. Multiplication takes pairs of natural numbers as inputs and maps
them to natural numbers as outputs, so goes from N × N (the domain) to N
(the codomain). As it turns out, the range is also N, since every n ∈ N is
n × 1.


Figure 3.1: A function is a mapping of each element of one set to an element of
another. An arrow points from an argument in the domain to the corresponding
value in the codomain.

Example 3.3. Multiplication is a function because it pairs each input—each


pair of natural numbers—with a single output: × : N2 → N. By contrast,
the square root operation applied to the domain N is not functional, since
each positive integer n has two square roots: √n and −√n. We can make it
functional by only returning the positive square root: √ : N → R.

Example 3.4. The relation that pairs each student in a class with their final
grade is a function—no student can get two different final grades in the same
class. The relation that pairs each student in a class with their parents is not a
function: students can have zero, or two, or more parents.

We can define functions by specifying in some precise way what the value
of the function is for every possible argument. Different ways of doing this are
by giving a formula, describing a method for computing the value, or listing
the values for each argument. However functions are defined, we must make
sure that for each argument we specify one, and only one, value.

Example 3.5. Let f : N → N be defined such that f ( x ) = x + 1. This is a


definition that specifies f as a function which takes in natural numbers and
outputs natural numbers. It tells us that, given a natural number x, f will
output its successor x + 1. In this case, the codomain N is not the range of f ,
since the natural number 0 is not the successor of any natural number. The
range of f is the set of all positive integers, Z+ .

Example 3.6. Let g : N → N be defined such that g( x ) = x + 2 − 1. This tells


us that g is a function which takes in natural numbers and outputs natural
numbers. Given a natural number x, g will output the predecessor of the
successor of the successor of x, i.e., x + 1.

We just considered two functions, f and g, with different definitions. How-


ever, these are the same function. After all, for any natural number n, we have
that f (n) = n + 1 = n + 2 − 1 = g(n). Otherwise put: our definitions for f


Figure 3.2: A surjective function has every element of the codomain as a value.

and g specify the same mapping by means of different equations. Implicitly,


then, we are relying upon a principle of extensionality for functions,

if ∀ x f ( x ) = g( x ), then f = g

provided that f and g share the same domain and codomain.

Example 3.7. We can also define functions by cases. For instance, we could
define h : N → N by
h( x ) = x/2 if x is even, and h( x ) = ( x + 1)/2 if x is odd.
Since every natural number is either even or odd, the output of this function
will always be a natural number. Just remember that if you define a function
by cases, every possible input must fall into exactly one case. In some cases,
this will require a proof that the cases are exhaustive and exclusive.
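As a quick sanity check, h can be written out in Python (an illustration of ours; integer division plays the role of the two cases):

    def h(x):
        # the two cases are exhaustive and exclusive: every x is even or odd
        return x // 2 if x % 2 == 0 else (x + 1) // 2

    print([h(x) for x in range(8)])   # [0, 1, 1, 2, 2, 3, 3, 4]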

3.2 Kinds of Functions


It will be useful to introduce a kind of taxonomy for some of the kinds of
functions which we encounter most frequently.
To start, we might want to consider functions which have the property that
every member of the codomain is a value of the function. Such functions are
called surjective, and can be pictured as in Figure 3.2.

Definition 3.8 (Surjective function). A function f : A → B is surjective iff B


is also the range of f , i.e., for every y ∈ B there is at least one x ∈ A such
that f ( x ) = y, or in symbols:

(∀y ∈ B)(∃ x ∈ A) f ( x ) = y.

We call such a function a surjection from A to B.

If you want to show that f is a surjection, then you need to show that every
object in f ’s codomain is the value of f ( x ) for some input x.


Figure 3.3: An injective function never maps two different arguments to the
same value.

Note that any function induces a surjection. After all, given a function
f : A → B, let f ′ : A → ran( f ) be defined by f ′ ( x ) = f ( x ). Since ran( f ) is
defined as { f ( x ) ∈ B | x ∈ A}, this function f ′ is guaranteed to be a surjection.
Now, any function maps each possible input to a unique output. But there
are also functions which never map different inputs to the same outputs. Such
functions are called injective, and can be pictured as in Figure 3.3.
Definition 3.9 (Injective function). A function f : A → B is injective iff for
each y ∈ B there is at most one x ∈ A such that f ( x ) = y. We call such a
function an injection from A to B.
If you want to show that f is an injection, you need to show that for any
elements x and y of f ’s domain, if f ( x ) = f (y), then x = y.
Example 3.10. The constant function f : N → N given by f ( x ) = 1 is neither
injective nor surjective.
The identity function f : N → N given by f ( x ) = x is both injective and
surjective.
The successor function f : N → N given by f ( x ) = x + 1 is injective but
not surjective.
The function f : N → N defined by

       f ( x ) = x/2            if x is even
       f ( x ) = ( x + 1)/2     if x is odd
is surjective, but not injective.
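
When the domain and codomain are finite, the definitions of injectivity and surjectivity can be checked by brute force. The following Python sketch is only an illustration of the definitions (the helper names are ours, not part of the text); it tests the last function of Example 3.10 on a finite initial segment of N:

    def is_injective(f, domain):
        # f is injective iff it never maps two different arguments to the same value
        values = [f(x) for x in domain]
        return len(values) == len(set(values))

    def is_surjective(f, domain, codomain):
        # f is surjective iff every element of the codomain is a value of f
        return set(codomain) <= {f(x) for x in domain}

    f = lambda x: x // 2 if x % 2 == 0 else (x + 1) // 2
    print(is_injective(f, range(10)))             # False: f(1) = f(2) = 1
    print(is_surjective(f, range(10), range(5)))  # True: 0, ..., 4 are all values of f

Of course, no such exhaustive check is possible on an infinite domain like N; there the properties must be proved.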
Often enough, we want to consider functions which are both injective and
surjective. We call such functions bijective. They look like the function pic-
tured in Figure 3.4. Bijections are also sometimes called one-to-one correspon-
dences, since they uniquely pair elements of the codomain with elements of
the domain.
Definition 3.11 (Bijection). A function f : A → B is bijective iff it is both sur-
jective and injective. We call such a function a bijection from A to B (or be-
tween A and B).


Figure 3.4: A bijective function uniquely pairs the elements of the codomain
with those of the domain.

3.3 Functions as Relations


A function which maps elements of A to elements of B obviously defines a
relation between A and B, namely the relation which holds between x and
y iff f ( x ) = y. In fact, we might even—if we are interested in reducing the
building blocks of mathematics for instance—identify the function f with this
relation, i.e., with a set of pairs. This then raises the question: which relations
define functions in this way?

Definition 3.12 (Graph of a function). Let f : A → B be a function. The graph


of f is the relation R f ⊆ A × B defined by

R f = {⟨ x, y⟩ | f ( x ) = y}.

The graph of a function is uniquely determined, by extensionality. More-


over, extensionality (on sets) will immediately vindicate the implicit princi-
ple of extensionality for functions, whereby if f and g share a domain and
codomain then they are identical if they agree on all values.
Similarly, if a relation is “functional”, then it is the graph of a function.

Proposition 3.13. Let R ⊆ A × B be such that:

1. If Rxy and Rxz then y = z; and

2. for every x ∈ A there is some y ∈ B such that ⟨ x, y⟩ ∈ R.

Then R is the graph of the function f : A → B defined by f ( x ) = y iff Rxy.

Proof. Suppose there is a y such that Rxy. If there were another z ≠ y such
that Rxz, the first condition on R would be violated. Hence, if there is a y such
that Rxy, this y is unique. By the second condition, for every x ∈ A there is at
least one such y, so f is defined on all of A and thus well-defined. Obviously, R f = R.

Every function f : A → B has a graph, i.e., a relation on A × B defined


by f ( x ) = y. On the other hand, every relation R ⊆ A × B with the proper-
ties given in Proposition 3.13 is the graph of a function f : A → B. Because
of this close connection between functions and their graphs, we can think of


a function simply as its graph. In other words, functions can be identified


with certain relations, i.e., with certain sets of tuples. We can now consider
performing similar operations on functions as we performed on relations (see
section 2.6). In particular:

Definition 3.14. Let f : A → B be a function with C ⊆ A.


The restriction of f to C is the function f ↾C : C → B defined by ( f ↾C )( x ) =
f ( x ) for all x ∈ C. In other words, f ↾C = {⟨ x, y⟩ ∈ R f | x ∈ C }.
The application of f to C is f [C ] = { f ( x ) | x ∈ C }. We also call this the
image of C under f .

It follows from these definitions that ran( f ) = f [dom( f )], for any func-
tion f . These notions are exactly as one would expect, given the definitions
in section 2.6 and our identification of functions with relations. But two other
operations—inverses and relative products—require a little more detail. We
will provide that in section 3.4 and section 3.5.

3.4 Inverses of Functions


We think of functions as maps. An obvious question to ask about functions,
then, is whether the mapping can be “reversed.” For instance, the successor
function f ( x ) = x + 1 can be reversed, in the sense that the function g(y) =
y − 1 “undoes” what f does.
But we must be careful. Although the definition of g defines a function
Z → Z, it does not define a function N → N, since g(0) ∉ N. So even in
simple cases, it is not quite obvious whether a function can be reversed; it
may depend on the domain and codomain.
This is made more precise by the notion of an inverse of a function.

Definition 3.15. A function g : B → A is an inverse of a function f : A → B if


f ( g(y)) = y and g( f ( x )) = x for all x ∈ A and y ∈ B.

If f has an inverse g, we often write f −1 instead of g.


Now we will determine when functions have inverses. A good candidate
for an inverse of f : A → B is g : B → A “defined by”

g(y) = “the” x such that f ( x ) = y.

But the scare quotes around “defined by” (and “the”) suggest that this is not
a definition. At least, it will not always work, with complete generality. For,
in order for this definition to specify a function, there has to be one and only
one x such that f ( x ) = y—the output of g has to be uniquely specified. More-
over, it has to be specified for every y ∈ B. If there are x1 and x2 ∈ A with
x1 ̸= x2 but f ( x1 ) = f ( x2 ), then g(y) would not be uniquely specified for
y = f ( x1 ) = f ( x2 ). And if there is no x at all such that f ( x ) = y, then g(y) is


not specified at all. In other words, for g to be defined, f must be both injective
and surjective.
Let’s go slowly. We’ll divide the question into two: Given a function f : A →
B, when is there a function g : B → A so that g( f ( x )) = x? Such a g “undoes”
what f does, and is called a left inverse of f . Secondly, when is there a function
h : B → A so that f (h(y)) = y? Such an h is called a right inverse of f — f
“undoes” what h does.

Proposition 3.16. If f : A → B is injective, then there is a left inverse g : B → A


of f so that g( f ( x )) = x for all x ∈ A.

Proof. Suppose that f : A → B is injective. Consider a y ∈ B. If y ∈ ran( f ),


there is an x ∈ A so that f ( x ) = y. Because f is injective, there is only one
such x ∈ A. Then we can define: g(y) = x, i.e., g(y) is “the” x ∈ A such that
f ( x ) = y. If y ∉ ran( f ), we can map it to any a ∈ A. So, we can pick an a ∈ A
and define g : B → A by:

       g(y) = x    if f ( x ) = y
       g(y) = a    if y ∉ ran( f ).

It is defined for all y ∈ B, since for each y ∈ ran( f ) there is exactly one
x ∈ A such that f ( x ) = y. By definition, if y = f ( x ), then g(y) = x, i.e.,
g( f ( x )) = x.

Proposition 3.17. If f : A → B is surjective, then there is a right inverse h : B →


A of f so that f (h(y)) = y for all y ∈ B.

Proof. Suppose that f : A → B is surjective. Consider a y ∈ B. Since f is


surjective, there is an xy ∈ A with f ( xy ) = y. Then we can define: h(y) = xy ,
i.e., for each y ∈ B we choose some x ∈ A so that f ( x ) = y; since f is surjective
there is always at least one to choose from.1 By definition, if x = h(y), then
f ( x ) = y, i.e., for any y ∈ B, f (h(y)) = y.

By combining the ideas in the previous proof, we now get that every bijec-
tion has an inverse, i.e., there is a single function which is both a left and right
inverse of f .

Proposition 3.18. If f : A → B is bijective, there is a function f −1 : B → A so that


for all x ∈ A, f −1 ( f ( x )) = x and for all y ∈ B, f ( f −1 (y)) = y.
1 Since f is surjective, for every y ∈ B the set { x | f ( x ) = y } is nonempty. Our definition

of h requires that we choose a single x from each of these sets. That this is always possible is
actually not obvious—the possibility of making these choices is simply assumed as an axiom.
In other words, this proposition assumes the so-called Axiom of Choice, an issue we will gloss
over. However, in many specific cases, e.g., when A = N or is finite, or when f is bijective, the
Axiom of Choice is not required. (In the particular case when f is bijective, for each y ∈ B the set
{ x | f ( x ) = y} has exactly one element, so that there is no choice to make.)


Proof. Exercise.

There is a slightly more general way to extract inverses. We saw in sec-


tion 3.2 that every function f induces a surjection f ′ : A → ran( f ) by letting
f ′ ( x ) = f ( x ) for all x ∈ A. Clearly, if f is injective, then f ′ is bijective, so that
it has a unique inverse by Proposition 3.18. By a very minor abuse of notation,
we sometimes call the inverse of f ′ simply “the inverse of f .”
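
For a function on a finite set represented as a Python dictionary, the inverse of a bijection is obtained simply by swapping keys and values. This is a minimal sketch of Definition 3.15 for the finite case only (the names and the dictionary representation are ours):

    def inverse(f):
        # assumes f (given as a dict) is a bijection
        return {y: x for x, y in f.items()}

    f = {1: 'a', 2: 'b', 3: 'c'}
    g = inverse(f)
    assert all(g[f[x]] == x for x in f)   # g( f ( x )) = x for all x
    assert all(f[g[y]] == y for y in g)   # f ( g(y)) = y for all y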

Proposition 3.19. If f : A → B has a left inverse g and a right inverse h, then
h = g.

Proof. Exercise.

Proposition 3.20. Every function f has at most one inverse.

Proof. Suppose g and h are both inverses of f . Then in particular g is a left


inverse of f and h is a right inverse. By Proposition 3.19, g = h.

3.5 Composition of Functions


We saw in section 3.4 that the inverse f −1 of a bijection f is itself a function.
Another operation on functions is composition: we can define a new function
by composing two functions, f and g, i.e., by first applying f and then g. Of
course, this is only possible if the ranges and domains match, i.e., the range
of f must be a subset of the domain of g. This operation on functions is the
analogue of the operation of relative product on relations from section 2.6.
A diagram might help to explain the idea of composition. In Figure 3.5, we
depict two functions f : A → B and g : B → C and their composition ( g ◦ f ).
The function ( g ◦ f ) : A → C pairs each element of A with an element of C. We
specify which element of C an element of A is paired with as follows: given
an input x ∈ A, first apply the function f to x, which will output some f ( x ) =
y ∈ B, then apply the function g to y, which will output some g( f ( x )) =
g(y) = z ∈ C.

Definition 3.21 (Composition). Let f : A → B and g : B → C be functions.


The composition of f with g is g ◦ f : A → C, where ( g ◦ f )( x ) = g( f ( x )).

Example 3.22. Consider the functions f ( x ) = x + 1, and g( x ) = 2x. Since


( g ◦ f )( x ) = g( f ( x )), for each input x you must first take its successor, then
multiply the result by two. So their composition is given by ( g ◦ f )( x ) =
2( x + 1).
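
In programming terms, composition is just nested application: to compute ( g ◦ f )( x ), apply f first and then g. A minimal Python sketch of Definition 3.21 and Example 3.22 (illustrative only, not part of the text's formal development):

    def compose(g, f):
        # returns the function x -> g(f(x)), i.e., the composition g ∘ f
        return lambda x: g(f(x))

    f = lambda x: x + 1      # successor
    g = lambda x: 2 * x      # doubling
    h = compose(g, f)
    print(h(3))              # 8, i.e., 2 * (3 + 1)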


Figure 3.5: The composition g ◦ f of two functions f and g.

3.6 Partial Functions


It is sometimes useful to relax the definition of function so that it is not re-
quired that the output of the function is defined for all possible inputs. Such
mappings are called partial functions.
Definition 3.23. A partial function f : A ⇀ B is a mapping which assigns to
every element of A at most one element of B. If f assigns an element of B to
x ∈ A, we say f ( x ) is defined, and otherwise undefined. If f ( x ) is defined, we
write f ( x ) ↓, otherwise f ( x ) ↑. The domain of a partial function f is the subset
of A where it is defined, i.e., dom( f ) = { x ∈ A | f ( x ) ↓}.

Example 3.24. Every function f : A → B is also a partial function. Partial


functions that are defined everywhere on A—i.e., what we so far have simply
called a function—are also called total functions.

Example 3.25. The partial function f : R ⇀ R given by f ( x ) = 1/x is unde-


fined for x = 0, and defined everywhere else.

Definition 3.26 (Graph of a partial function). Let f : A ⇀ B be a partial func-


tion. The graph of f is the relation R f ⊆ A × B defined by
R f = {⟨ x, y⟩ | f ( x ) = y}.

Proposition 3.27. Suppose R ⊆ A × B has the property that whenever Rxy and
Rxy′ then y = y′ . Then R is the graph of the partial function f : A ⇀ B defined by:
if there is a y such that Rxy, then f ( x ) = y, otherwise f ( x ) ↑. If R is also serial, i.e.,
for each x ∈ A there is a y ∈ B such that Rxy, then f is total.

Proof. Suppose there is a y such that Rxy. If there were another y′ ̸= y such
that Rxy′ , the condition on R would be violated. Hence, if there is a y such
that Rxy, that y is unique, and so f is well-defined. Obviously, R f = R and f
is total if R is serial.

Chapter 4

The Size of Sets

4.1 Introduction
When Georg Cantor developed set theory in the 1870s, one of his aims was
to make palatable the idea of an infinite collection—an actual infinity, as the
medievals would say. A key part of this was his treatment of the size of dif-
ferent sets. If a, b and c are all distinct, then the set { a, b, c} is intuitively larger
than { a, b}. But what about infinite sets? Are they all as large as each other?
It turns out that they are not.
The first important idea here is that of an enumeration. We can list every
finite set by listing all its elements. For some infinite sets, we can also list
all their elements if we allow the list itself to be infinite. Such sets are called
countable. Cantor’s surprising result, which we will fully understand by the
end of this chapter, was that some infinite sets are not countable.

4.2 Enumerations and Countable Sets


We’ve already given examples of sets by listing their elements. Let’s discuss
in more general terms how and when we can list the elements of a set, even if
that set is infinite.

Definition 4.1 (Enumeration, informally). Informally, an enumeration of a set A


is a list (possibly infinite) of elements of A such that every element of A ap-
pears on the list at some finite position. If A has an enumeration, then A is
said to be countable.

A couple of points about enumerations:

1. We count as enumerations only lists which have a beginning and in


which every element other than the first has a single element immedi-
ately preceding it. In other words, there are only finitely many elements
between the first element of the list and any other element. In particular,


this means that every element of an enumeration has a finite position:


the first element has position 1, the second position 2, etc.

2. We can have different enumerations of the same set A which differ by


the order in which the elements appear: 4, 1, 25, 16, 9 enumerates the
(set of the) first five square numbers just as well as 1, 4, 9, 16, 25 does.

3. Redundant enumerations are still enumerations: 1, 1, 2, 2, 3, 3, . . . enu-


merates the same set as 1, 2, 3, . . . does.

4. Order and redundancy do matter when we specify an enumeration: we


can enumerate the positive integers beginning with 1, 2, 3, 1, . . . , but the
pattern is easier to see when enumerated in the standard way as 1, 2, 3,
4, . . .

5. Enumerations must have a beginning: . . . , 3, 2, 1 is not an enumeration


of the positive integers because it has no first element. To see how this
follows from the informal definition, ask yourself, “at what position in
the list does the number 76 appear?”

6. The following is not an enumeration of the positive integers: 1, 3, 5, . . . ,


2, 4, 6, . . . The problem is that the even numbers occur at places ∞ + 1,
∞ + 2, ∞ + 3, rather than at finite positions.

7. The empty set is enumerable: it is enumerated by the empty list!

Proposition 4.2. If A has an enumeration, it has an enumeration without repeti-


tions.

Proof. Suppose A has an enumeration x1 , x2 , . . . in which each xi is an element


of A. We can remove repetitions from this enumeration by deleting repeated
elements: we turn the enumeration into a new one in which we keep xi if it
does not already appear among x1 , . . . , xi−1 , and remove xi from the list if it
does.

The last argument shows that in order to get a good handle on enumer-
ations and countable sets and to prove things about them, we need a more
precise definition. The following provides it.

Definition 4.3 (Enumeration, formally). An enumeration of a set A ̸= ∅ is any


surjective function f : Z+ → A.

Let’s convince ourselves that the formal definition and the informal defini-
tion using a possibly infinite list are equivalent. First, any surjective function
from Z+ to a set A enumerates A. Such a function determines an enumeration
as defined informally above: the list f (1), f (2), f (3), . . . . Since f is surjective,
every element of A is guaranteed to be the value of f (n) for some n ∈ Z+ .


Hence, every element of A appears at some finite position in the list. Since the
function may not be injective, the list may be redundant, but that is acceptable
(as noted above).
On the other hand, given a list that enumerates all elements of A, we can
define a surjective function f : Z+ → A by letting f (n) be the nth element
of the list, or the final element of the list if there is no nth element. The only
case where this does not produce a surjective function is when A is empty,
and hence the list is empty. So, every non-empty list determines a surjective
function f : Z+ → A.
Definition 4.4. A set A is countable iff it is empty or has an enumeration.

Example 4.5. A function enumerating the positive integers (Z+ ) is simply the
identity function given by f (n) = n. A function enumerating the natural
numbers N is the function g(n) = n − 1.

Example 4.6. The functions f : Z+ → Z+ and g : Z+ → Z+ given by


f (n) = 2n and
g(n) = 2n − 1
enumerate the even positive integers and the odd positive integers, respec-
tively. However, neither function is an enumeration of Z+ , since neither is
surjective.
Example 4.7. The function f (n) = (−1)ⁿ ⌈(n − 1)/2⌉ (where ⌈ x ⌉ denotes the
ceiling function, which rounds x up to the nearest integer) enumerates the set
of integers Z. Notice how f generates the values of Z by “hopping” back and
forth between positive and negative integers:

       f (1)      f (2)     f (3)      f (4)     f (5)      f (6)     f (7)     ...
       −⌈0/2⌉     ⌈1/2⌉     −⌈2/2⌉     ⌈3/2⌉     −⌈4/2⌉     ⌈5/2⌉     −⌈6/2⌉    ...
       0          1         −1         2         −2         3         −3        ...

You can also think of f as defined by cases as follows:

       f (n) = 0                 if n = 1
       f (n) = n/2               if n is even
       f (n) = −(n − 1)/2        if n is odd and n > 1

Although it is perhaps more natural when listing the elements of a set to


start counting from the 1st element, mathematicians like to use the natural
numbers N for counting things. They talk about the 0th, 1st, 2nd, and so on,
elements of a list. Correspondingly, we can define an enumeration as a surjec-
tive function from N to A. Of course, the two definitions are equivalent.


Proposition 4.8. There is a surjection f : Z+ → A iff there is a surjection g : N →


A.

Proof. Given a surjection f : Z+ → A, we can define g(n) = f (n + 1) for


all n ∈ N. It is easy to see that g : N → A is surjective. Conversely, given
a surjection g : N → A, define f (n) = g(n − 1).

This gives us the following result:

Corollary 4.9. A set A is countable iff it is empty or there is a surjective function


f : N → A.

We discussed above that a list of elements of a set A can be turned into


a list without repetitions. This is also true for enumerations, but a bit harder
to formulate and prove rigorously. Any function f : Z+ → A must be defined
for all n ∈ Z+ . If there are only finitely many elements in A then we clearly
cannot have a function defined on the infinitely many elements of Z+ that
takes as values all the elements of A but never takes the same value twice. In
that case, i.e., in the case where the list without repetitions is finite, we must
choose a different domain for f , one with only finitely many elements. Not
having repetitions means that f must be injective. Since it is also surjective,
we are looking for a bijection between some finite set {1, . . . , n} or Z+ and A.

Proposition 4.10. If f : Z+ → A is surjective (i.e., an enumeration of A), there is


a bijection g : Z → A where Z is either Z+ or {1, . . . , n} for some n ∈ Z+ .

Proof. We define the function g recursively: Let g(1) = f (1). If g(i ) has al-
ready been defined, let g(i + 1) be the first value of f (1), f (2), . . . not already
among g(1), . . . , g(i ), if there is one. If A has just n elements, then g(1), . . . ,
g(n) are all defined, and so we have defined a function g : {1, . . . , n} → A. If
A has infinitely many elements, then for any i there must be an element of A
in the enumeration f (1), f (2), . . . , which is not already among g(1), . . . , g(i ).
In this case we have defined a function g : Z+ → A.
The function g is surjective, since any element of A is among f (1), f (2), . . .
(since f is surjective) and so will eventually be a value of g(i ) for some i. It is
also injective, since if there were j < i such that g( j) = g(i ), then g(i ) would
already be among g(1), . . . , g(i − 1), contrary to how we defined g.
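
The construction of g in this proof can be carried out mechanically if the enumeration f is given as a stream of values: list f (1), f (2), . . . and skip anything that has already appeared. A rough Python sketch (representing an enumeration as a generator is our own illustrative choice):

    from itertools import islice

    def without_repetitions(enumeration):
        # yield the values of an enumeration, skipping values already seen
        seen = set()
        for x in enumeration:
            if x not in seen:
                seen.add(x)
                yield x

    def f():
        # the redundant enumeration 1, 1, 2, 2, 3, 3, ... mentioned earlier
        n = 1
        while True:
            yield n
            yield n
            n += 1

    print(list(islice(without_repetitions(f()), 5)))   # [1, 2, 3, 4, 5]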

Corollary 4.11. A set A is countable iff it is empty or there is a bijection f : N → A


where either N = N or N = {0, . . . , n} for some n ∈ N.

Proof. A is countable iff A is empty or there is a surjective f : Z+ → A. By


Proposition 4.10, the latter holds iff there is a bijective function f : Z → A
where Z = Z+ or Z = {1, . . . , n} for some n ∈ Z+ . By the same argument
as in the proof of Proposition 4.8, that in turn is the case iff there is a bijection
g : N → A where either N = N or N = {0, . . . , n − 1}.


4.3 Cantor’s Zig-Zag Method


We’ve already considered some “easy” enumerations. Now we will consider
something a bit harder. Consider the set of pairs of natural numbers, which
we defined in section 1.5 thus:

N × N = {⟨n, m⟩ | n, m ∈ N}

We can organize these ordered pairs into an array, like so:

0 1 2 3 ...
0 ⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ ...
1 ⟨1, 0⟩ ⟨1, 1⟩ ⟨1, 2⟩ ⟨1, 3⟩ ...
2 ⟨2, 0⟩ ⟨2, 1⟩ ⟨2, 2⟩ ⟨2, 3⟩ ...
3 ⟨3, 0⟩ ⟨3, 1⟩ ⟨3, 2⟩ ⟨3, 3⟩ ...
⋮

Clearly, every ordered pair in N × N will appear exactly once in the array.
In particular, ⟨n, m⟩ will appear in the nth row and mth column. But how
do we organize the elements of such an array into a “one-dimensional” list?
The pattern in the array below demonstrates one way to do this (although of
course there are many other options):

0 1 2 3 4 ...
0 0 1 3 6 10 ...
1 2 4 7 11 ... ...
2 5 8 12 ... ... ...
3 9 13 ... ... ... ...
4 14 ... ... ... ... ...
⋮

This pattern is called Cantor’s zig-zag method. It enumerates N × N as follows:

⟨0, 0⟩, ⟨0, 1⟩, ⟨1, 0⟩, ⟨0, 2⟩, ⟨1, 1⟩, ⟨2, 0⟩, ⟨0, 3⟩, ⟨1, 2⟩, ⟨2, 1⟩, ⟨3, 0⟩, . . .

And this establishes the following:

Proposition 4.12. N × N is countable.

Proof. Let f : N → N × N take each k ∈ N to the tuple ⟨n, m⟩ ∈ N × N such


that k is the value of the nth row and mth column in Cantor’s zig-zag array.
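
The zig-zag enumeration is easy to generate mechanically, since the kth diagonal of the array contains exactly the pairs ⟨n, m⟩ with n + m = k. A short Python sketch (our own illustration, not part of the proof):

    from itertools import islice

    def zigzag():
        # enumerate N × N diagonal by diagonal: (0,0), (0,1), (1,0), (0,2), ...
        k = 0
        while True:
            for n in range(k + 1):     # the pairs <n, m> with n + m = k
                yield (n, k - n)
            k += 1

    print(list(islice(zigzag(), 10)))
    # [(0,0), (0,1), (1,0), (0,2), (1,1), (2,0), (0,3), (1,2), (2,1), (3,0)]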

This technique also generalises rather nicely. For example, we can use it to
enumerate the set of ordered triples of natural numbers, i.e.:

N × N × N = {⟨n, m, k⟩ | n, m, k ∈ N}


We think of N × N × N as the Cartesian product of N × N with N, that is,

       N³ = (N × N) × N = {⟨⟨n, m⟩, k⟩ | n, m, k ∈ N}

and thus we can enumerate N³ with an array by labelling one axis with the
enumeration of N, and the other axis with the enumeration of N²:

0 1 2 3 ...
⟨0, 0⟩ ⟨0, 0, 0⟩ ⟨0, 0, 1⟩ ⟨0, 0, 2⟩ ⟨0, 0, 3⟩ ...
⟨0, 1⟩ ⟨0, 1, 0⟩ ⟨0, 1, 1⟩ ⟨0, 1, 2⟩ ⟨0, 1, 3⟩ ...
⟨1, 0⟩ ⟨1, 0, 0⟩ ⟨1, 0, 1⟩ ⟨1, 0, 2⟩ ⟨1, 0, 3⟩ ...
⟨0, 2⟩ ⟨0, 2, 0⟩ ⟨0, 2, 1⟩ ⟨0, 2, 2⟩ ⟨0, 2, 3⟩ ...
⋮

Thus, by using a method like Cantor’s zig-zag method, we may similarly ob-
tain an enumeration of N³. And we can keep going, obtaining enumerations
of Nⁿ for any natural number n. So, we have:

Proposition 4.13. Nⁿ is countable, for every n ∈ N.

4.4 Pairing Functions and Codes


Cantor’s zig-zag method makes the enumerability of Nⁿ visually evident. But
let us focus on our array depicting N². Following the zig-zag line in the array
and counting the places, we can check that ⟨1, 2⟩ is associated with the num-
ber 7. However, it would be nice if we could compute this more directly. That
is, it would be nice to have to hand the inverse of the zig-zag enumeration,
g : N² → N, such that

       g(⟨0, 0⟩) = 0, g(⟨0, 1⟩) = 1, g(⟨1, 0⟩) = 2, . . . , g(⟨1, 2⟩) = 7, . . .

This would enable us to calculate exactly where ⟨n, m⟩ will occur in our enu-
meration.
In fact, we can define g directly by making two observations. First: if the
nth row and mth column contains value v, then the (n + 1)st row and (m − 1)st
column contains value v + 1. Second: the first row of our enumeration con-
sists of the triangular numbers, starting with 0, 1, 3, 6, etc. The kth triangular
number is the sum of the natural numbers ≤ k, which can be computed as
k (k + 1)/2. Putting these two observations together, consider this function:

       g(n, m) = (n + m + 1)(n + m)/2 + n
We often just write g(n, m) rather that g(⟨n, m⟩), since it is easier on the eyes.
This tells you first to determine the (n + m)th triangle number, and then add


n to it. And it populates the array in exactly the way we would like. So in
particular, the pair ⟨1, 2⟩ is sent to (4 · 3)/2 + 1 = 7.
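
Since g is given by a closed formula, the position of any pair can be computed directly. A quick Python check of the formula (illustrative only):

    def g(n, m):
        # position of <n, m> in the zig-zag enumeration of N × N
        return (n + m + 1) * (n + m) // 2 + n

    print(g(0, 0), g(0, 1), g(1, 0), g(1, 2))   # 0 1 2 7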
This function g is the inverse of an enumeration of a set of pairs. Such
functions are called pairing functions.

Definition 4.14 (Pairing function). A function f : A × B → N is an arithmeti-


cal pairing function if f is injective. We also say that f encodes A × B, and that
f ( x, y) is the code for ⟨ x, y⟩.

We can use pairing functions to encode, e.g., pairs of natural numbers; or,
in other words, we can represent each pair of elements using a single number.
Using the inverse of the pairing function, we can decode the number, i.e., find
out which pair it represents.

4.5 An Alternative Pairing Function

There are other enumerations of N² that make it easier to figure out what their
inverses are. Here is one. Instead of visualizing the enumeration in an array,
start with the list of positive integers associated with (initially) empty spaces.
Imagine filling these spaces successively with pairs ⟨n, m⟩ as follows. Starting
with the pairs that have 0 in the first place (i.e., pairs ⟨0, m⟩), put the first (i.e.,
⟨0, 0⟩) in the first empty place, then skip an empty space, put the second (i.e.,
⟨0, 1⟩) in the next empty place, skip one again, and so forth. The (incomplete)
beginning of our enumeration now looks like this:

       1       2       3       4       5       6       7       8       9       10      ...
       ⟨0, 0⟩          ⟨0, 1⟩          ⟨0, 2⟩          ⟨0, 3⟩          ⟨0, 4⟩          ...

Repeat this with pairs ⟨1, m⟩ for the places that still remain empty, again skip-
ping every other empty place:

       1       2       3       4       5       6       7       8       9       10      ...
       ⟨0, 0⟩  ⟨1, 0⟩  ⟨0, 1⟩          ⟨0, 2⟩  ⟨1, 1⟩  ⟨0, 3⟩          ⟨0, 4⟩  ⟨1, 2⟩  ...

Enter pairs ⟨2, m⟩, ⟨3, m⟩, etc., in the same way. Our completed enumeration
thus starts like this:

       1       2       3       4       5       6       7       8       9       10      ...
       ⟨0, 0⟩  ⟨1, 0⟩  ⟨0, 1⟩  ⟨2, 0⟩  ⟨0, 2⟩  ⟨1, 1⟩  ⟨0, 3⟩  ⟨3, 0⟩  ⟨0, 4⟩  ⟨1, 2⟩  ...


If we number the cells in the array above according to this enumeration, we


will not find a neat zig-zag line, but this arrangement:

0 1 2 3 4 5 ...
0 1 3 5 7 9 11 ...
1 2 6 10 14 18 ... ...
2 4 12 20 28 ... ... ...
3 8 24 40 ... ... ... ...
4 16 48 ... ... ... ... ...
5 32 ... ... ... ... ... ...
⋮

We can see that the pairs in row 0 are in the odd numbered places of our
enumeration, i.e., pair ⟨0, m⟩ is in place 2m + 1; pairs in the second row, ⟨1, m⟩,
are in places whose number is the double of an odd number, specifically, 2 ·
(2m + 1); pairs in the third row, ⟨2, m⟩, are in places whose number is four
times an odd number, 4 · (2m + 1); and so on. The factors of (2m + 1) for
each row, 1, 2, 4, 8, . . . , are exactly the powers of 2: 1 = 2⁰ , 2 = 2¹ , 4 = 2² ,
8 = 2³ , . . . In fact, the relevant exponent is always the first member of the pair
in question. Thus, for pair ⟨n, m⟩ the factor is 2ⁿ . This gives us the general
formula: 2ⁿ · (2m + 1). However, this is a mapping of pairs to positive integers,
i.e., ⟨0, 0⟩ has position 1. If we want to begin at position 0 we must subtract 1
from the result. This gives us:

Example 4.15. The function h : N² → N given by

       h(n, m) = 2ⁿ (2m + 1) − 1

is a pairing function for the set of pairs of natural numbers N².

Accordingly, in our second enumeration of N², the pair ⟨0, 0⟩ has code
h(0, 0) = 2⁰ (2 · 0 + 1) − 1 = 0; ⟨1, 2⟩ has code 2¹ · (2 · 2 + 1) − 1 = 2 · 5 − 1 = 9;
⟨2, 6⟩ has code 2² · (2 · 6 + 1) − 1 = 51.
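
One advantage of h is that it is easy to invert: given a code, add 1, divide out the largest power of 2 to recover n, and read off m from the odd factor that remains. A Python sketch of encoding and decoding (our own illustration):

    def h(n, m):
        # h(n, m) = 2^n * (2m + 1) - 1
        return 2**n * (2 * m + 1) - 1

    def h_inverse(code):
        # factor code + 1 as 2^n * (2m + 1) and return <n, m>
        k, n = code + 1, 0
        while k % 2 == 0:
            k //= 2
            n += 1
        return (n, (k - 1) // 2)

    print(h(1, 2), h(2, 6))               # 9 51
    print(h_inverse(9), h_inverse(51))    # (1, 2) (2, 6)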
Sometimes it is enough to encode pairs of natural numbers N² without
requiring that the encoding is surjective. Such encodings have inverses that
are only partial functions.

Example 4.16. The function j : N² → N given by

       j(n, m) = 2ⁿ · 3ᵐ

is an injective function N² → N, but it is not surjective.


4.6 Uncountable Sets


Some sets, such as the set Z+ of positive integers, are infinite. So far we’ve
seen examples of infinite sets which were all countable. However, there are
also infinite sets which do not have this property. Such sets are called un-
countable.
First of all, it is perhaps already surprising that there are uncountable sets.
For any countable set A there is a surjective function f : Z+ → A. If a set
is uncountable there is no such function. That is, no function mapping the
infinitely many elements of Z+ to A can exhaust all of A. So there are “more”
elements of A than the infinitely many positive integers.
How would one prove that a set is uncountable? You have to show that
no such surjective function can exist. Equivalently, you have to show that the
elements of A cannot be enumerated in a one way infinite list. The best way
to do this is to show that every list of elements of A must leave at least one
element out; or that no function f : Z+ → A can be surjective. We can do this
using Cantor’s diagonal method. Given a list of elements of A, say, x1 , x2 , . . . ,
we construct another element of A which, by its construction, cannot possibly
be on that list.
Our first example is the set Bω of all infinite, non-gappy sequences of 0’s
and 1’s.

Theorem 4.17. Bω is uncountable.

Proof. Suppose, by way of contradiction, that Bω is countable, i.e., suppose


that there is a list s1 , s2 , s3 , s4 , . . . of all elements of Bω . Each of these si is
itself an infinite sequence of 0’s and 1’s. Let’s call the j-th element of the i-th
sequence in this list si ( j). Then the i-th sequence si is

s i (1), s i (2), s i (3), . . .

We may arrange this list, and the elements of each sequence si in it, in an
array:
              1         2         3         4        ...
       1      s1 (1)    s1 (2)    s1 (3)    s1 (4)   ...
       2      s2 (1)    s2 (2)    s2 (3)    s2 (4)   ...
       3      s3 (1)    s3 (2)    s3 (3)    s3 (4)   ...
       4      s4 (1)    s4 (2)    s4 (3)    s4 (4)   ...
       ⋮      ⋮         ⋮         ⋮         ⋮

The labels down the side give the number of the sequence in the list s1 , s2 , . . . ;
the numbers across the top label the elements of the individual sequences. For
instance, s1 (1) is a name for whatever number, a 0 or a 1, is the first element
in the sequence s1 , and so on.


Now we construct an infinite sequence, s, of 0’s and 1’s which cannot pos-
sibly be on this list. The definition of s will depend on the list s1 , s2 , . . . .
Any infinite list of infinite sequences of 0’s and 1’s gives rise to an infinite
sequence s which is guaranteed to not appear on the list.
To define s, we specify what all its elements are, i.e., we specify s(n) for all
n ∈ Z+ . We do this by reading down the diagonal of the array above (hence
the name “diagonal method”) and then changing every 1 to a 0 and every 0 to
a 1. More abstractly, we define s(n) to be 0 or 1 according to whether the n-th
element of the diagonal, sn (n), is 1 or 0.
       s(n) = 1    if sn (n) = 0
       s(n) = 0    if sn (n) = 1.

If you like formulas better than definitions by cases, you could also define
s(n) = 1 − sn (n).
Clearly s is an infinite sequence of 0’s and 1’s, since it is just the mirror
sequence to the sequence of 0’s and 1’s that appear on the diagonal of our
array. So s is an element of Bω . But it cannot be on the list s1 , s2 , . . . Why not?
It can’t be the first sequence in the list, s1 , because it differs from s1 in the
first element. Whatever s1 (1) is, we defined s(1) to be the opposite. It can’t be
the second sequence in the list, because s differs from s2 in the second element:
if s2 (2) is 0, s(2) is 1, and vice versa. And so on.
More precisely: if s were on the list, there would be some k so that s = sk .
Two sequences are identical iff they agree at every place, i.e., for any n, s(n) =
sk (n). So in particular, taking n = k as a special case, s(k) = sk (k) would
have to hold. sk (k) is either 0 or 1. If it is 0 then s(k ) must be 1—that’s how
we defined s. But if sk (k ) = 1 then, again because of the way we defined s,
s(k) = 0. In either case s(k) ̸= sk (k ).
We started by assuming that there is a list of elements of Bω , s1 , s2 , . . .
From this list we constructed a sequence s which we proved cannot be on the
list. But it definitely is a sequence of 0’s and 1’s if all the si are sequences of
0’s and 1’s, i.e., s ∈ Bω . This shows in particular that there can be no list of
all elements of Bω , since for any such list we could also construct a sequence s
guaranteed to not be on the list, so the assumption that there is a list of all
sequences in Bω leads to a contradiction.
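
Although the argument concerns infinite lists, the construction of s is completely concrete: to decide s(n) we only need the nth element of the nth sequence. The following Python sketch (our own illustration, not part of the proof) represents each sequence as a function from positive integers to {0, 1} and builds the diagonal sequence:

    def diagonal(sequences):
        # given i -> s_i, return the sequence s with s(n) = 1 - s_n(n)
        return lambda n: 1 - sequences(n)(n)

    # a sample "list" of sequences: s_i(j) = 1 iff i divides j
    s = lambda i: (lambda j: 1 if j % i == 0 else 0)

    d = diagonal(s)
    print([d(n) for n in range(1, 6)])   # [0, 0, 0, 0, 0], since s_n(n) = 1 for every n
    # d differs from each s_i at place i, so d is not among the s_i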

This proof method is called “diagonalization” because it uses the diagonal


of the array to define s. Diagonalization need not involve the presence of an
array: we can show that sets are not countable by using a similar idea even
when no array and no actual diagonal is involved.

Theorem 4.18. ℘(Z+ ) is not countable.


Proof. We proceed in the same way, by showing that for every list of subsets
of Z+ there is a subset of Z+ which cannot be on the list. Suppose the follow-
ing is a given list of subsets of Z+ :

Z1 , Z2 , Z3 , . . .

We now define a set Z such that for any n ∈ Z+ , n ∈ Z iff n ∉ Zn :

       Z = { n ∈ Z+ | n ∉ Zn }

Z is clearly a set of positive integers, since by assumption each Zn is, and thus
Z ∈ ℘(Z+ ). But Z cannot be on the list. To show this, we’ll establish that for
each k ∈ Z+ , Z ̸= Zk .
So let k ∈ Z+ be arbitrary. We’ve defined Z so that for any n ∈ Z+ , n ∈ Z
/ Zn . In particular, taking n = k, k ∈ Z iff k ∈
iff n ∈ / Zk . But this shows that
Z ̸= Zk , since k is an element of one but not the other, and so Z and Zk have
different elements. Since k was arbitrary, Z is not on the list Z1 , Z2 , . . .

The preceding proof did not mention a diagonal, but you can think of it
as involving a diagonal if you picture it this way: Imagine the sets Z1 , Z2 , . . . ,
written in an array, where each element j ∈ Zi is listed in the j-th column.
Say the first four sets on that list are {1, 2, 3, . . . }, {2, 4, 6, . . . }, {1, 2, 5}, and
{3, 4, 5, . . . }. Then the array would begin with

       Z1 = { 1,  2,  3,  4,  5,  6,  ... }
       Z2 = {     2,      4,      6,  ... }
       Z3 = { 1,  2,          5           }
       Z4 = {         3,  4,  5,  6,  ... }
       ⋮

Then Z is the set obtained by going down the diagonal, leaving out any num-
bers that appear along the diagonal and including those j where the array has a
gap in the j-th row/column. In the above case, we would leave out 1 and 2,
include 3, leave out 4, etc.

4.7 Reduction
We showed ℘(Z+ ) to be uncountable by a diagonalization argument. We
already had a proof that Bω , the set of all infinite sequences of 0s and 1s, is
uncountable. Here’s another way we can prove that ℘(Z+ ) is uncountable:
Show that if ℘(Z+ ) is countable then Bω is also countable. Since we know Bω
is not countable, ℘(Z+ ) can’t be either. This is called reducing one problem
to another—in this case, we reduce the problem of enumerating Bω to the
problem of enumerating ℘(Z+ ). A solution to the latter—an enumeration of
℘(Z+ )—would yield a solution to the former—an enumeration of Bω .


How do we reduce the problem of enumerating a set B to that of enu-


merating a set A? We provide a way of turning an enumeration of A into an
enumeration of B. The easiest way to do that is to define a surjective function
f : A → B. If x1 , x2 , . . . enumerates A, then f ( x1 ), f ( x2 ), . . . would enumer-
ate B. In our case, we are looking for a surjective function f : ℘(Z+ ) → Bω .

Proof of Theorem 4.18 by reduction. Suppose that ℘(Z+ ) were countable, and
thus that there is an enumeration of it, Z1 , Z2 , Z3 , . . .
Define the function f : ℘(Z+ ) → Bω by letting f ( Z ) be the sequence s
such that s(n) = 1 iff n ∈ Z, and s(n) = 0 otherwise. This clearly defines
a function, since whenever Z ⊆ Z+ , any n ∈ Z+ either is an element of Z or
isn’t. For instance, the set 2Z+ = {2, 4, 6, . . . } of positive even numbers gets
mapped to the sequence 010101 . . . , the empty set gets mapped to 0000 . . .
and the set Z+ itself to 1111 . . . .
It also is surjective: Every sequence of 0s and 1s corresponds to some set of
positive integers, namely the one which has as its members those integers cor-
responding to the places where the sequence has 1s. More precisely, suppose
s ∈ Bω . Define Z ⊆ Z+ by:

Z = { n ∈ Z+ | s ( n ) = 1 }

Then f ( Z ) = s, as can be verified by consulting the definition of f .


Now consider the list

f ( Z1 ), f ( Z2 ), f ( Z3 ), . . .

Since f is surjective, every member of Bω must appear as a value of f for some


argument, and so must appear on the list. This list must therefore enumerate
all of Bω .
So if ℘(Z+ ) were countable, Bω would be countable. But Bω is uncount-
able (Theorem 4.17). Hence ℘(Z+ ) is uncountable.
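
The function f used in this reduction is just the passage from a set to its characteristic sequence. A Python sketch showing the first few bits of f ( Z ) for some sets Z (a finite stand-in for the infinite objects involved; illustrative only):

    def f(Z):
        # the characteristic sequence of a set Z of positive integers
        return lambda n: 1 if n in Z else 0

    evens = {2, 4, 6, 8}                         # a finite stand-in for 2Z+ = {2, 4, 6, ...}
    print([f(evens)(n) for n in range(1, 9)])    # [0, 1, 0, 1, 0, 1, 0, 1]
    print([f(set())(n) for n in range(1, 5)])    # [0, 0, 0, 0]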

It is easy to be confused about the direction the reduction goes in. For
instance, a surjective function g : Bω → B does not establish that B is uncount-
able. (Consider g : Bω → B defined by g(s) = s(1), the function that maps
a sequence of 0’s and 1’s to its first element. It is surjective, because some se-
quences start with 0 and some start with 1. But B is finite.) Note also that the
function f must be surjective, or otherwise the argument does not go through:
f ( x1 ), f ( x2 ), . . . would then not be guaranteed to include all the elements of B.
For instance,
       h(n) = 000 . . . 0   (n 0’s)

defines a function h : Z+ → Bω , but Z+ is countable.


4.8 Equinumerosity
We have an intuitive notion of “size” of sets, which works fine for finite sets.
But what about infinite sets? If we want to come up with a formal way of
comparing the sizes of two sets of any size, it is a good idea to start by defining
when sets are the same size. Here is Frege:
If a waiter wants to be sure that he has laid exactly as many knives
as plates on the table, he does not need to count either of them, if
he simply lays a knife to the right of each plate, so that every knife
on the table lies to the right of some plate. The plates and knives
are thus uniquely correlated to each other, and indeed through that
same spatial relationship. (Frege, 1884, §70)
The insight of this passage can be brought out through a formal definition:
Definition 4.19. A is equinumerous with B, written A ≈ B, iff there is a bijec-
tion f : A → B.

Proposition 4.20. Equinumerosity is an equivalence relation.

Proof. We must show that equinumerosity is reflexive, symmetric, and transi-


tive. Let A, B, and C be sets.
Reflexivity. The identity map Id A : A → A, where Id A ( x ) = x for all x ∈ A,
is a bijection. So A ≈ A.
Symmetry. Suppose A ≈ B, i.e., there is a bijection f : A → B. Since f is
bijective, its inverse f −1 exists and is also bijective. Hence, f −1 : B → A is
a bijection, so B ≈ A.
Transitivity. Suppose that A ≈ B and B ≈ C, i.e., there are bijections
f : A → B and g : B → C. Then the composition g ◦ f : A → C is bijective,
so that A ≈ C.

Proposition 4.21. If A ≈ B, then A is countable if and only if B is.

Proof. Suppose A ≈ B, so there is some bijection f : A → B, and suppose that


A is countable. Then either A = ∅ or there is a surjective function g : Z+ →
A. If A = ∅, then B = ∅ also (otherwise there would be an element y ∈ B but
no x ∈ A with f ( x ) = y). If, on the other hand, g : Z+ → A is surjective, then
f ◦ g : Z+ → B is surjective. To see this, let y ∈ B. Since f is surjective, there
is an x ∈ A such that f ( x ) = y. Since g is surjective, there is an n ∈ Z+ such
that g(n) = x. Hence,
( f ◦ g)(n) = f ( g(n)) = f ( x ) = y
and thus f ◦ g is surjective. We have that f ◦ g is an enumeration of B, and so
B is countable.
If B is countable, we obtain that A is countable by repeating the argument
with the bijection f −1 : B → A instead of f .


4.9 Sets of Different Sizes, and Cantor’s Theorem


We have offered a precise statement of the idea that two sets have the same
size. We can also offer a precise statement of the idea that one set is smaller
than another. Our definition of “is smaller than (or equinumerous)” will re-
quire, instead of a bijection between the sets, an injection from the first set to
the second. If such a function exists, the size of the first set is less than or
equal to the size of the second. Intuitively, an injection from one set to another
guarantees that the range of the function has at least as many elements as the
domain, since no two elements of the domain map to the same element of the
range.

Definition 4.22. A is no larger than B, written A ⪯ B, iff there is an injection


f : A → B.

It is clear that this is a reflexive and transitive relation, but that it is not
symmetric (this is left as an exercise). We can also introduce a notion which
states that one set is (strictly) smaller than another.

Definition 4.23. A is smaller than B, written A ≺ B, iff there is an injection f : A →


B but no bijection g : A → B, i.e., A ⪯ B and A ̸≈ B.

It is clear that this relation is irreflexive and transitive. (This is left as an


exercise.) Using this notation, we can say that a set A is countable iff A ⪯ N,
and that A is uncountable iff N ≺ A. This allows us to restate Theorem 4.18
as the observation that Z+ ≺ ℘(Z+ ). In fact, Cantor (1892) proved that this
last point is perfectly general:

Theorem 4.24 (Cantor). A ≺ ℘( A), for any set A.

Proof. The map f ( x ) = { x } is an injection f : A → ℘( A), since if x ̸= y,


then also { x } ̸= {y} by extensionality, and so f ( x ) ̸= f (y). So we have that
A ⪯ ℘( A).
We will now show that there cannot be a surjective function g : A → ℘( A),
let alone a bijective one, and hence that A ̸≈ ℘( A). For suppose that g : A →
℘( A). Since g is total, every x ∈ A is mapped to a subset g( x ) ⊆ A. We can
show that g cannot be surjective. To do this, we define a subset B ⊆ A which
by definition cannot be in the range of g. Let

       B = { x ∈ A | x ∉ g( x )}.

Since g( x ) is defined for all x ∈ A, B is clearly a well-defined subset of A.
But it cannot be in the range of g. Let x ∈ A be arbitrary; we will show
that B ≠ g( x ). If x ∈ g( x ), then x does not satisfy x ∉ g( x ), and so by the
definition of B, we have x ∉ B. If x ∈ B, it must satisfy the defining property
of B, i.e., x ∈ A and x ∉ g( x ). Since x was arbitrary, this shows that for each


x ∈ A, x ∈ g( x ) iff x ∉ B, and so g( x ) ≠ B. In other words, B cannot be in
the range of g, contradicting the assumption that g is surjective.
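
For a finite set A the diagonal construction can be carried out explicitly, which makes it easy to see why no g : A → ℘( A) can be surjective. A small Python sketch (the particular g used here is an arbitrary choice of ours):

    A = {0, 1, 2}
    g = {0: {0, 1}, 1: set(), 2: {0, 2}}     # some function from A to subsets of A

    B = {x for x in A if x not in g[x]}      # the diagonal set from the proof
    print(B)                                 # {1}
    print(any(g[x] == B for x in A))         # False: B is not in the range of g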

It’s instructive to compare the proof of Theorem 4.24 to that of Theorem 4.18.
There we showed that for any list Z1 , Z2 , . . . , of subsets of Z+ one can con-
struct a set Z of numbers guaranteed not to be on the list. It was guaranteed
not to be on the list because, for every n ∈ Z+ , n ∈ Zn iff n ∉ Z. This way,
there is always some number that is an element of one of Zn or Z but not the
other. We follow the same idea here, except the indices n are now elements
of A instead of Z+ . The set B is defined so that it is different from g( x ) for
each x ∈ A, because x ∈ g( x ) iff x ∉ B. Again, there is always an element
of A which is an element of one of g( x ) and B but not the other. And just as Z
therefore cannot be on the list Z1 , Z2 , . . . , B cannot be in the range of g.
The proof is also worth comparing with the proof of Russell’s Paradox,
Theorem 1.29. Indeed, Cantor’s Theorem was the inspiration for Russell’s
own paradox.

4.10 The Notion of Size, and Schröder-Bernstein


Here is an intuitive thought: if A is no larger than B and B is no larger than A,
then A and B are equinumerous. To be honest, if this thought were wrong, then
we could scarcely justify the thought that our defined notion of equinumeros-
ity has anything to do with comparisons of “sizes” between sets! Fortunately,
though, the intuitive thought is correct. This is justified by the Schröder-
Bernstein Theorem.

Theorem 4.25 (Schröder-Bernstein). If A ⪯ B and B ⪯ A, then A ≈ B.

In other words, if there is an injection from A to B, and an injection from B


to A, then there is a bijection from A to B.
This result, however, is really rather difficult to prove. Indeed, although
Cantor stated the result, others proved it.1 For now, you can (and must) take
it on trust.
Fortunately, Schröder-Bernstein is correct, and it vindicates our thinking of
the relations we defined, i.e., A ≈ B and A ⪯ B, as having something to do
with “size”. Moreover, Schröder-Bernstein is very useful. It can be difficult to
think of a bijection between two equinumerous sets. The Schröder-Bernstein
Theorem allows us to break the comparison down into cases so we only have
to think of an injection from the first to the second, and vice-versa.

1 For more on the history, see e.g., Potter (2004, pp. 165–6).

Part II

First-order Logic

Chapter 5

Introduction to First-Order Logic

5.1 First-Order Logic


You are probably familiar with first-order logic from your first introduction
to formal logic.1 You may know it as “quantificational logic” or “predicate
logic.” First-order logic, first of all, is a formal language. That means, it has
a certain vocabulary, and its expressions are strings from this vocabulary. But
not every string is permitted. There are different kinds of permitted expres-
sions: terms, formulae, and sentences. We are mainly interested in sentences
of first-order logic: they provide us with a formal analogue of sentences of
English, and about them we can ask the questions a logician typically is inter-
ested in. For instance:

• Does ψ follow from φ logically?

• Is φ logically true, logically false, or contingent?

• Are φ and ψ equivalent?

These questions are primarily questions about the “meaning” of sentences


of first-order logic. For instance, a philosopher would analyze the question
of whether ψ follows logically from φ as asking: is there a case where φ is
true but ψ is false (ψ doesn’t follow from φ), or does every case that makes φ
true also make ψ true (ψ does follow from φ)? But we haven’t been told yet
what a “case” is—that is the job of semantics. The semantics of first-order logic
provides a mathematically precise model of the philosopher’s intuitive idea
of “case,” and also—and this is important—of what it is for a sentence φ to be
true in a case. We call the mathematically precise model that we will develop
a structure. The relation which makes “true in” precise, is called the relation
of satisfaction. So what we will define is “φ is satisfied in M” (in symbols:
1 In fact, we more or less assume you are! If you’re not, you could review a more elementary

textbook, such as forall x (Magnus et al., 2021).


M ⊨ φ) for sentences φ and structures M. Once this is done, we can also give
precise definitions of the other semantical terms such as “follows from” or “is
logically true.” These definitions will make it possible to settle, again with
mathematical precision, whether, e.g., ∀ x ( φ( x ) ⊃ ψ( x )), ∃ x φ( x ) ⊨ ∃ x ψ( x ).
The answer will, of course, be “yes.” If you’ve already been trained to symbol-
ize sentences of English in first-order logic, you will recognize this as, e.g., the
symbolizations of, say, “All ants are insects, there are ants, therefore there are
insects.” That is obviously a valid argument, and so our mathematical model
of “follows from” for our formal language should give the same answer.
Another topic you probably remember from your first introduction to for-
mal logic is that there are derivations. If you have taken a first formal logic
course, your instructor will have made you practice finding such derivations,
perhaps even a derivation that shows that the above entailment holds. There
are many different ways to give derivations: you may have done something
called “natural deduction” or “truth trees,” but there are many others. The
purpose of derivation systems is to provide tools using which the logicians’
questions above can be answered: e.g., a natural deduction derivation in which
∀ x ( φ( x ) ⊃ ψ( x )) and ∃ x φ( x ) are premises and ∃ x ψ( x ) is the conclusion (last
line) verifies that ∃ x ψ( x ) logically follows from ∀ x ( φ( x ) ⊃ ψ( x )) and ∃ x φ( x ).
But why is that? On the face of it, derivation systems have nothing to do
with semantics: giving a formal derivation merely involves arranging sym-
bols in certain rule-governed ways; they don’t mention “cases” or “true in” at
all. The connection between derivation systems and semantics has to be estab-
lished by a meta-logical investigation. What’s needed is a mathematical proof,
e.g., that a formal derivation of ∃ x ψ( x ) from premises ∀ x ( φ( x ) ⊃ ψ( x )) and
∃ x φ( x ) is possible, if, and only if, ∀ x ( φ( x ) ⊃ ψ( x )) and ∃ x φ( x ) together en-
tail ∃ x ψ( x ). Before this can be done, however, a lot of painstaking work has
to be carried out to get the definitions of syntax and semantics correct.

5.2 Syntax
We first must make precise what strings of symbols count as sentences of first-
order logic. We’ll do this later; for now we’ll just proceed by example. The
basic building blocks—the vocabulary—of first-order logic divides into two
parts. The first part is the symbols we use to say specific things or to pick out
specific things. We pick out things using constant symbols, and we say stuff
about the things we pick out using predicate symbols. E.g., we might use a as
a constant symbol to pick out a single thing, and then say something about
it using the sentence P (a). If you have meanings for “a” and “P ” in mind,
you can read P (a) as a sentence of English (and you probably have done so
when you first learned formal logic). Once you have such simple sentences
of first-order logic, you can build more complex ones using the second part
of the vocabulary: the logical symbols (connectives and quantifiers). So, for


instance, we can form expressions like (P (a) & Q(b )) or ∃x P (x ).


In order to provide the precise definitions of semantics and the rules of
our derivation systems required for rigorous meta-logical study, we first of
all have to give a precise definition of what counts as a sentence of first-order
logic. The basic idea is easy enough to understand: there are some simple sen-
tences we can form from just predicate symbols and constant symbols, such
as P (a). And then from these we form more complex ones using the connec-
tives and quantifiers. But what exactly are the rules by which we are allowed
to form more complex sentences? These must be specified, otherwise we have
not defined “sentence of first-order logic” precisely enough. There are a few
issues. The first one is to get the right strings to count as sentences. The sec-
ond one is to do this in such a way that we can give mathematical proofs about
all sentences. Finally, we’ll have to also give precise definitions of some rudi-
mentary operations with sentences, such as “replace every x in φ by b.” The
trouble is that the quantifiers and variables we have in first-order logic make
it not entirely obvious how this should be done. E.g., should ∃x P (a) count as
a sentence? What about ∃x ∃x P (x )? What should the result of “replace x by b
in (P (x ) & ∃x P (x ))” be?

5.3 Formulae
Here is the approach we will use to rigorously specify sentences of first-order
logic and to deal with the issues arising from the use of variables. We first
define a different set of expressions: formulae. Once we’ve done that, we can
consider the role variables play in them—and on the basis of some other ideas,
namely those of “free” and “bound” variables, we can define what a sentence
is (namely, a formula without free variables). We do this not just because it
makes the definition of “sentence” more manageable, but also because it will
be crucial to the way we define the semantic notion of satisfaction.
Let’s define “formula” for a simple first-order language, one containing
only a single predicate symbol P and a single constant symbol a, and only the
logical symbols ∼, &, and ∃. Our full definitions will be much more general:
we’ll allow infinitely many predicate symbols and constant symbols. In fact,
we will also consider function symbols which can be combined with constant
symbols and variables to form “terms.” For now, a and the variables will be
our only terms. We do need infinitely many variables. We’ll officially use the
symbols v0 , v1 , . . . , as variables.

Definition 5.1. The set of formulae Frm is defined as follows:

1. P (a) and P (vi ) are formulae (i ∈ N).

2. If φ is a formula, then ∼ φ is formula.

3. If φ and ψ are formulae, then ( φ & ψ) is a formula.


4. If φ is a formula and x is a variable, then ∃ x φ is a formula.

5. Nothing else is a formula.

(1) tells us that P (a) and P (vi ) are formulae, for any i ∈ N. These are the
so-called atomic formulae. They give us something to start from. The other
clauses give us ways of forming new formulae from ones we have already
formed. So for instance, by (2), we get that ∼P (v2 ) is a formula, since P (v2 )
is already a formula by (1). Then, by (4), we get that ∃v2 ∼P (v2 ) is another
formula, and so on. (5) tells us that only strings we can form in this way count
as formulae. In particular, ∃v0 P (a) and ∃v0 ∃v0 P (a) do count as formulae,
and (∼P (a)) does not, because of the extraneous outer parentheses.
This way of defining formulae is called an inductive definition, and it allows
us to prove things about formulae using a version of proof by induction called
structural induction. These are discussed in a general way in appendix B.4 and
appendix B.5, which you should review before delving into the proofs later
on. Basically, the idea is that if you want to give a proof that something is true
for all formulae, you show first that it is true for the atomic formulae, and then
that if it’s true for any formula φ (and ψ), it’s also true for ∼ φ, ( φ & ψ), and
∃ x φ. For instance, this proves that it’s true for ∃v2 ∼P (v2 ): from the first part
you know that it’s true for the atomic formula P (v2 ). Then you get that it’s
true for ∼P (v2 ) by the second part, and then again that it’s true for ∃v2 ∼P (v2 )
itself. Since all formulae are inductively generated from atomic formulae, this
works for any of them.
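
Inductive definitions like Definition 5.1 translate directly into recursive code: a formula is represented by the clause used to build it, and properties of formulae are computed by structural recursion. The following Python sketch is entirely illustrative (the nested-tuple representation is our own, not the text's official notation):

    # Representation of formulae of the simple language:
    #   ('P', t)                   the atomic formula P(t), where t is 'a' or a variable ('v', i)
    #   ('not', phi)               ∼phi
    #   ('and', phi, psi)          (phi & psi)
    #   ('exists', ('v', i), phi)  ∃vi phi

    def is_term(t):
        return t == 'a' or (isinstance(t, tuple) and len(t) == 2
                            and t[0] == 'v' and isinstance(t[1], int) and t[1] >= 0)

    def is_formula(phi):
        # structural recursion mirroring clauses (1)-(4) of Definition 5.1
        if not isinstance(phi, tuple) or len(phi) == 0:
            return False
        if phi[0] == 'P':
            return len(phi) == 2 and is_term(phi[1])
        if phi[0] == 'not':
            return len(phi) == 2 and is_formula(phi[1])
        if phi[0] == 'and':
            return len(phi) == 3 and is_formula(phi[1]) and is_formula(phi[2])
        if phi[0] == 'exists':
            return len(phi) == 3 and is_term(phi[1]) and phi[1] != 'a' and is_formula(phi[2])
        return False

    # the formula ∃v2 ∼P(v2) discussed above:
    print(is_formula(('exists', ('v', 2), ('not', ('P', ('v', 2))))))   # True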

5.4 Satisfaction
We can already skip ahead to the semantics of first-order logic once we know
what formulae are: here, the basic definition is that of a structure. For our
simple language, a structure M has just three components: a non-empty set
|M| called the domain, what a picks out in M, and what P is true of in M.
The object picked out by a is denoted aM and the set of things P is true of
by P M . A structure M consists of just these three things: |M|, aM ∈ |M|
and P M ⊆ |M|. The general case will be more complicated, since there will
be many predicate symbols and constant symbols, the predicate symbols can
have more than one place, and there will also be function symbols.
This is enough to give a definition of satisfaction for formulae that don’t
contain variables. The idea is to give an inductive definition that mirrors the
way we have defined formulae. We specify when an atomic formula is satis-
fied in M, and then when, e.g., ∼ φ is satisfied in M on the basis of whether or
not φ is satisfied in M. E.g., we could define:

1. P (a) is satisfied in M iff aM ∈ P M .

2. ∼ φ is satisfied in M iff φ is not satisfied in M.


3. ( φ & ψ) is satisfied in M iff φ is satisfied in M, and ψ is satisfied in M as


well.

Let’s say that |M| = {0, 1, 2}, aM = 1, and P M = {1, 2}. This definition
would tell us that P (a) is satisfied in M (since aM = 1 ∈ {1, 2} = P M ). It tells
us further that ∼P (a) is not satisfied in M, and that in turn ∼∼P (a) is and
(∼P (a) & P (a)) is not satisfied, and so on.
The trouble comes when we want to give a definition for the quantifiers:
we’d like to say something like, “∃v0 P (v0 ) is satisfied iff P (v0 ) is satisfied.”
But the structure M doesn’t tell us what to do about variables. What we ac-
tually want to say is that P (v0 ) is satisfied for some value of v0 . To make this
precise we need a way to assign elements of |M| not just to a but also to v0 . To
this end, we introduce variable assignments. A variable assignment is simply
a function s that maps variables to elements of |M| (in our example, to one
of 0, 1, or 2). Since we don’t know beforehand which variables might appear
in a formula we can’t limit which variables s assigns values to. The simple
solution is to require that s assigns values to all variables v0 , v1 , . . . We’ll just
use only the ones we need.
Instead of defining satisfaction of formulae just relative to a structure, we’ll
define it relative to a structure M and a variable assignment s, and write M, s ⊨
φ for short. Our definition will now include an additional clause to deal with
atomic formulae containing variables:

1. M, s ⊨ P (a) iff aM ∈ P M .

2. M, s ⊨ P (vi ) iff s(vi ) ∈ P M .

3. M, s ⊨ ∼ φ iff not M, s ⊨ φ.

4. M, s ⊨ ( φ & ψ) iff M, s ⊨ φ and M, s ⊨ ψ.

Ok, this solves one problem: we can now say when M satisfies P (v0 ) for the
value s(v0 ). To get the definition right for ∃v0 P (v0 ) we have to do one more
thing: We want to have that M, s ⊨ ∃v0 P (v0 ) iff M, s′ ⊨ P (v0 ) for some way
s′ of assigning a value to v0 . But the value assigned to v0 does not necessarily
have to be the value that s(v0 ) picks out. We’ll introduce a notation for that:
if m ∈ |M|, then we let s[m/v0 ] be the assignment that is just like s (for all
variables other than v0 ), except to v0 it assigns m. Now our definition can be:

5. M, s ⊨ ∃vi φ iff M, s[m/vi ] ⊨ φ for some m ∈ |M|.

Does it work out? Let’s say we let s(vi ) = 0 for all i ∈ N. M, s ⊨ ∃v0 P (v0 ) iff
there is an m ∈ |M| so that M, s[m/v0 ] ⊨ P (v0 ). And there is: we can choose
m = 1 or m = 2. Note that this is true even if the value s(v0 ) assigned to v0 by
s itself—in this case, 0—doesn’t do the job. We have M, s[1/v0 ] ⊨ P (v0 ) but
not M, s ⊨ P (v0 ).
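
To see how these clauses work together, here is a small Python sketch of the toy semantics just described. The encoding of formulae as nested tuples and the names sat, domain, a_M, and P_M are our own choices for illustration; they are not part of the official definitions.

# Toy structure from the text: |M| = {0, 1, 2}, aM = 1, PM = {1, 2}.
domain = {0, 1, 2}
a_M = 1
P_M = {1, 2}

def sat(phi, s):
    """Satisfaction relative to a variable assignment s, given as a dict
    from variable names to elements of the domain."""
    op = phi[0]
    if op == "P(a)":                  # the atomic formula P(a)
        return a_M in P_M
    if op == "P":                     # an atomic formula P(v_i)
        return s[phi[1]] in P_M
    if op == "not":
        return not sat(phi[1], s)
    if op == "and":
        return sat(phi[1], s) and sat(phi[2], s)
    if op == "exists":                # phi has the form ("exists", x, psi)
        x, psi = phi[1], phi[2]
        return any(sat(psi, {**s, x: m}) for m in domain)
    raise ValueError("not a formula of the toy language")

s = {"v0": 0}                                    # s(v0) = 0, as in the text
print(sat(("P", "v0"), s))                       # False: 0 is not in PM
print(sat(("exists", "v0", ("P", "v0")), s))     # True: choose m = 1 or m = 2
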


If this looks confusing and cumbersome: it is. But the added complexity is
required to give a precise, inductive definition of satisfaction for all formulae,
and we need something like it to precisely define the semantic notions. There
are other ways of doing it, but they are all equally (in)elegant.

5.5 Sentences
Ok, now we have a (sketch of a) definition of satisfaction (“true in”) for struc-
tures and formulae. But it needs this additional bit—a variable assignment—
and what we wanted is a definition of sentences. How do we get rid of as-
signments, and what are sentences?
You probably remember a discussion in your first introduction to formal
logic about the relation between variables and quantifiers. A quantifier is al-
ways followed by a variable, and then in the part of the sentence to which that
quantifier applies (its “scope”), we understand that the variable is “bound”
by that quantifier. In formulae, we do not require that every variable have a
matching quantifier; variables without matching quantifiers are “free” or
“unbound.” We will take sentences to be all those formulae that have no free
variables.
Again, the intuitive idea of when an occurrence of a variable in a formula φ
is bound, which quantifier binds it, and when it is free, is not difficult to get.
You may have learned a method for testing this, perhaps involving counting
parentheses. We have to insist on a precise definition—and because we have
defined formulae by induction, we can give a definition of the free and bound
occurrences of a variable x in a formula φ also by induction. E.g., it might look
like this for our simplified language:

1. If φ is atomic, all occurrences of x in it are free (that is, the occurrence of


x in P ( x ) is free).

2. If φ is of the form ∼ψ, then an occurrence of x in ∼ψ is free iff the cor-


responding occurrence of x is free in ψ (that is, the free occurrences of
variables in ψ are exactly the corresponding occurrences in ∼ψ).

3. If φ is of the form (ψ & χ), then an occurrence of x in (ψ & χ) is free iff


the corresponding occurrence of x is free in ψ or in χ.

4. If φ is of the form ∃ x ψ, then no occurrence of x in φ is free; if it is of the


form ∃y ψ where y is a different variable than x, then an occurrence of x
in ∃y ψ is free iff the corresponding occurrence of x is free in ψ.

Once we have a precise definition of free and bound occurrences of vari-


ables, we can simply say: a sentence is any formula without free occurrences
of variables.


5.6 Semantic Notions


We mentioned above that when we consider whether M, s ⊨ φ holds, we (for
convenience) let s assign values to all variables, but only the values it assigns
to variables in φ are used. In fact, it’s only the values of free variables in φ
that matter. Of course, because we’re careful, we are going to prove this fact.
Since sentences have no free variables, s doesn’t matter at all when it comes
to whether or not they are satisfied in a structure. So, when φ is a sentence we
can define M ⊨ φ to mean “M, s ⊨ φ for all s,” which as it happens is true iff
M, s ⊨ φ for at least one s. We need to introduce variable assignments to get a
working definition of satisfaction for formulae, but for sentences, satisfaction
is independent of the variable assignments.
Once we have a definition of “M ⊨ φ,” we know what “case” and “true
in” mean as far as sentences of first-order logic are concerned. On the basis of
the definition of M ⊨ φ for sentences we can then define the basic semantic
notions of validity, entailment, and satisfiability. A sentence is valid, ⊨ φ,
if every structure satisfies it. It is entailed by a set of sentences, Γ ⊨ φ, if
every structure that satisfies all the sentences in Γ also satisfies φ. And a set of
sentences is satisfiable if some structure satisfies all sentences in it at the same
time.
Because formulae are inductively defined, and satisfaction is in turn de-
fined by induction on the structure of formulae, we can use induction to prove
properties of our semantics and to relate the semantic notions defined. We’ll
collect and prove some of these properties, partly because they are individu-
ally interesting, but mainly because many of them will come in handy when
we go on to investigate the relation between semantics and derivation sys-
tems. In order to do so, we’ll also have to define (precisely, i.e., by induction)
some syntactic notions and operations we haven’t mentioned yet.

5.7 Substitution
We’ll discuss an example to illustrate how things hang together, and how the
development of syntax and semantics lays the foundation for our more ad-
vanced investigations later. Our derivation systems should let us derive P (a)
from ∀v0 P (v0 ). Maybe we even want to state this as a rule of inference. How-
ever, to do so, we must be able to state it in the most general terms: not just
for P , a, and v0 , but for any formula φ, and term t, and variable x. (Recall
that constant symbols are terms, but we’ll consider also more complicated
terms built from constant symbols and function symbols.) So we want to be
able to say something like, “whenever you have derived ∀ x φ( x ) you are jus-
tified in inferring φ(t)—the result of removing ∀ x and replacing x by t.” But
what exactly does “replacing x by t” mean? What is the relation between φ( x )
and φ(t)? Does this always work?


To make this precise, we define the operation of substitution. Substitution


is actually tricky, because we can’t just replace all x’s in φ by t, and not every t
can be substituted for any x. We’ll deal with this, again, using inductive defi-
nitions. But once this is done, specifying an inference rule as “infer φ(t) from
∀ x φ( x )” becomes a precise definition. Moreover, we’ll be able to show that
this is a good inference rule in the sense that ∀ x φ( x ) entails φ(t). But to prove
this, we have to again prove something that may at first glance prompt you to
ask “why are we doing this?” That ∀ x φ( x ) entails φ(t) relies on the fact that
whether or not M ⊨ φ(t) holds depends only on the value of the term t, i.e.,
if we let m be whatever element of |M| is picked out by t, then M, s ⊨ φ(t) iff
M, s[m/x ] ⊨ φ( x ). This holds even when t contains variables, but we’ll have
to be careful with how exactly we state the result.

5.8 Models and Theories


Once we’ve defined the syntax and semantics of first-order logic, we can get
to work investigating the properties of structures and the semantic notions.
We can also define derivation systems, and investigate those. For a set of
sentences, we can ask: what structures make all the sentences in that set true?
Given a set of sentences Γ, a structure M that satisfies them is called a model
of Γ. We might start from Γ and try to find its models—what do they look like?
How big or small do they have to be? But we might also start with a single
structure or collection of structures and ask: what sentences are true in them?
Are there sentences that characterize these structures in the sense that they,
and only they, are true in them? These kinds of questions are the domain of
model theory. They also underlie the axiomatic method: describing a collection of
structures by a set of sentences, the axioms of a theory. This is made possible
by the observation that exactly those sentences entailed in first-order logic by
the axioms are true in all models of the axioms.
As a very simple example, consider preorders. A preorder is a relation R
on some set A which is both reflexive and transitive. A set A with a two-place
relation R ⊆ A × A on it is exactly what we would need to give a structure for
a first-order language with a single two-place relation symbol P : we would
set |M| = A and P M = R. Since R is a preorder, it is reflexive and transitive,
and we can find a set Γ of sentences of first-order logic that say this:

∀ v0 P ( v 0 , v0 )
∀v0 ∀v1 ∀v2 ((P (v0 , v1 ) & P (v1 , v2 )) ⊃ P (v0 , v2 ))

These sentences are just the symbolizations of “for any x, Rxx” (R is reflexive)
and “whenever Rxy and Ryz then also Rxz” (R is transitive). We see that
a structure M is a model of these two sentences Γ iff R (i.e., P M ), is a preorder
on A (i.e., |M|). In other words, the models of Γ are exactly the preorders. Any
property of all preorders that can be expressed in the first-order language with


just P as predicate symbol (like reflexivity and transitivity above), is entailed


by the two sentences in Γ and vice versa. So anything we can prove about
models of Γ we have proved about all preorders.
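
For a finite structure, whether it is a model of Γ can be checked directly, simply by running through the domain. The following Python sketch does this by brute force; the function name and the encoding of P M as a set of pairs are our own choices for illustration.

def is_model_of_preorder_axioms(domain, R):
    """Check the two sentences in Gamma on a finite structure, where R is
    the interpretation of P, given as a set of pairs over domain."""
    reflexive = all((x, x) in R for x in domain)
    transitive = all((x, z) in R
                     for x in domain for y in domain for z in domain
                     if (x, y) in R and (y, z) in R)
    return reflexive and transitive

A = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3), (1, 3)}
print(is_model_of_preorder_axioms(A, R))   # True: this R is a preorder on A
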
For any particular theory and class of models (such as Γ and all preorders),
there will be interesting questions about what can be expressed in the cor-
responding first-order language, and what cannot be expressed. There are
some properties of structures that are interesting for all languages and classes
of models, namely those concerning the size of the domain. One can al-
ways express, for instance, that the domain contains exactly n elements, for
any n ∈ Z+ . One can also express, using a set of infinitely many sentences,
that the domain is infinite. But one cannot express that the domain is fi-
nite, or that the domain is uncountable. These results about the limitations of
first-order languages are consequences of the compactness and Löwenheim–
Skolem theorems.

5.9 Soundness and Completeness


We’ll also introduce derivation systems for first-order logic. There are many
derivation systems that logicians have developed, but they all define the same
derivability relation between sentences. We say that Γ derives φ, Γ ⊢ φ, if
there is a derivation of a certain precisely defined sort. Derivations are always
finite arrangements of symbols—perhaps a list of sentences, or some more
complicated structure. The purpose of derivation systems is to provide a tool
to determine if a sentence is entailed by some set Γ. In order to serve that
purpose, it must be true that Γ ⊨ φ if, and only if, Γ ⊢ φ.
If Γ ⊢ φ but not Γ ⊨ φ, our derivation system would be too strong: it would
prove too much. The property that if Γ ⊢ φ then Γ ⊨ φ is called soundness, and it
is a minimal requirement on any good derivation system. On the other hand,
if Γ ⊨ φ but not Γ ⊢ φ, then our derivation system is too weak: it doesn't
prove enough. The property that if Γ ⊨ φ then Γ ⊢ φ is called completeness.
Soundness is usually relatively easy to prove (by induction on the structure of
derivations, which are inductively defined). Completeness is harder to prove.
Soundness and completeness have a number of important consequences.
If a set of sentences Γ derives a contradiction (such as φ & ∼ φ) it is called
inconsistent. Inconsistent Γs cannot have any models; they are unsatisfiable.
From completeness the converse follows: any Γ that is not inconsistent—or,
as we will say, consistent—has a model. In fact, this is equivalent to complete-
ness, and is the form of completeness we will actually prove. It is a deep and
perhaps surprising result: the mere fact that you cannot prove φ & ∼ φ from Γ
guarantees that there is a structure that is as Γ describes it. So completeness gives
an answer to the question: which sets of sentences have models? Answer: all
and only consistent sets do.
The soundness and completeness theorems have two important conse-


quences: the compactness and the Löwenheim–Skolem theorem. These are


important results in the theory of models, and can be used to establish many
interesting results. We’ve already mentioned two: first-order logic cannot ex-
press that the domain of a structure is finite or that it is uncountable.
Historically, all of this—how to define syntax and semantics of first-order
logic, how to define good derivation systems, how to prove that they are
sound and complete, getting clear about what can and cannot be expressed
in first-order languages—took a long time to figure out and get right. We now
know how to do it, but going through all the details can still be confusing
and tedious. But it’s also important, because the methods developed here for
the formal language of first-order logic are applied all over the place in logic,
computer science, and linguistics. So working through the details pays off in
the long run.

Chapter 6

Syntax of First-Order Logic

6.1 Introduction
In order to develop the theory and metatheory of first-order logic, we must
first define the syntax and semantics of its expressions. The expressions of
first-order logic are terms and formulae. Terms are formed from variables,
constant symbols, and function symbols. Formulae, in turn, are formed from
predicate symbols together with terms (these form the smallest, “atomic” for-
mulae), and then from atomic formulae we can form more complex ones us-
ing logical connectives and quantifiers. There are many different ways to set
down the formation rules; we give just one possible one. Other systems will
choose different symbols, will select different sets of connectives as primitive,
will use parentheses differently (or even not at all, as in the case of so-called
Polish notation). What all approaches have in common, though, is that the
formation rules define the set of terms and formulae inductively. If done prop-
erly, every expression can be formed in essentially only one way according to the
formation rules. Because the inductive definition yields uniquely readable
expressions, we can give meanings to these expressions using the
same method—inductive definition.

6.2 First-Order Languages


Expressions of first-order logic are built up from a basic vocabulary containing
variables, constant symbols, predicate symbols and sometimes function symbols.
From them, together with logical connectives, quantifiers, and punctuation
symbols such as parentheses and commas, terms and formulae are formed.
Informally, predicate symbols are names for properties and relations, con-
stant symbols are names for individual objects, and function symbols are names
for mappings. These, except for the identity predicate =, are the non-logical
symbols and together make up a language. Any first-order language L is de-


termined by its non-logical symbols. In the most general case, L contains


infinitely many symbols of each kind.
In the general case, we make use of the following symbols in first-order
logic:

1. Logical symbols

a) Logical connectives: ∼ (negation), & (conjunction), ∨ (disjunction),


⊃ (conditional), ∀ (universal quantifier), ∃ (existential quantifier).
b) The propositional constant for falsity ⊥.
c) The two-place identity predicate =.
d) A countably infinite set of variables: v0 , v1 , v2 , . . .

2. Non-logical symbols, making up the standard language of first-order logic

a) A countably infinite set of n-place predicate symbols for each n > 0:


A0n , A1n , A2n , . . .
b) A countably infinite set of constant symbols: c0 , c1 , c2 , . . . .
c) A countably infinite set of n-place function symbols for each n > 0:
f0n , f1n , f2n , . . .

3. Punctuation marks: (, ), and the comma.

Most of our definitions and results will be formulated for the full standard
language of first-order logic. However, depending on the application, we may
also restrict the language to only a few predicate symbols, constant symbols,
and function symbols.

Example 6.1. The language L A of arithmetic contains a single two-place pred-


icate symbol <, a single constant symbol 0, one one-place function symbol ′,
and two two-place function symbols + and ×.

Example 6.2. The language of set theory L Z contains only the single two-
place predicate symbol ∈.

Example 6.3. The language of orders L≤ contains only the two-place predi-
cate symbol ≤.

Again, these are conventions: officially, these are just aliases, e.g., <, ∈,
and ≤ are aliases for A20 , 0 for c0 , ′ for f01 , + for f02 , × for f12 .
In addition to the primitive connectives and quantifiers introduced above,
we also use the following defined symbols: ≡ (biconditional) and ⊤ (truth).
A defined symbol is not officially part of the language, but is introduced
as an informal abbreviation: it allows us to abbreviate formulas which would,


if we only used primitive symbols, get quite long. This is obviously an ad-
vantage. The bigger advantage, however, is that proofs become shorter. If a
symbol is primitive, it has to be treated separately in proofs. The more primi-
tive symbols, therefore, the longer our proofs.
You may be familiar with different terminology and symbols than the ones
we use above. Logic texts (and teachers) commonly use ∼, ¬, or ! for “nega-
tion”, ∧, ·, or & for “conjunction”. Commonly used symbols for the “condi-
tional” or “implication” are →, ⇒, and ⊃. Symbols for “biconditional,” “bi-
implication,” or “(material) equivalence” are ↔, ⇔, and ≡. The ⊥ symbol is
variously called “falsity,” “falsum,” “absurdity,” or “bottom.” The ⊤ symbol
is variously called “truth,” “verum,” or “top.”
It is conventional to use lower case letters (e.g., a, b, c) from the begin-
ning of the Latin alphabet for constant symbols (sometimes called names),
and lower case letters from the end (e.g., x, y, z) for variables. Quantifiers
combine with variables, e.g., x; notational variations include ∀ x, (∀ x ), ( x ),
Πx, ⋀x for the universal quantifier and ∃ x, (∃ x ), ( Ex ), Σx, ⋁x for the existen-
tial quantifier.
We might treat all the propositional operators and both quantifiers as prim-
itive symbols of the language. We might instead choose a smaller stock of
primitive symbols and treat the other logical operators as defined. “Truth
functionally complete” sets of Boolean operators include {∼, ∨}, {∼, &}, and
{∼, ⊃}—these can be combined with either quantifier for an expressively
complete first-order language.
You may be familiar with two other logical operators: the Sheffer stroke |
(named after Henry Sheffer), and Peirce’s arrow ↓, also known as Quine’s
dagger. When given their usual readings of “nand” and “nor” (respectively),
these operators are truth functionally complete by themselves.

6.3 Terms and Formulae


Once a first-order language L is given, we can define expressions built up
from the basic vocabulary of L. These include in particular terms and formulae.

Definition 6.4 (Terms). The set of terms Trm(L) of L is defined inductively


by:

1. Every variable is a term.

2. Every constant symbol of L is a term.

3. If f is an n-place function symbol and t1 , . . . , tn are terms, then f (t1 , . . . , tn )


is a term.

4. Nothing else is a term.

A term containing no variables is a closed term.


The constant symbols appear in our specification of the language and the
terms as a separate category of symbols, but they could instead have been in-
cluded as zero-place function symbols. We could then do without the second
clause in the definition of terms. We just have to understand f (t1 , . . . , tn ) as
f by itself if n = 0.
Definition 6.5 (Formulas). The set of formulae Frm(L) of the language L is
defined inductively as follows:
1. ⊥ is an atomic formula.
2. If R is an n-place predicate symbol of L and t1 , . . . , tn are terms of L,
then R(t1 , . . . , tn ) is an atomic formula.
3. If t1 and t2 are terms of L, then =(t1 , t2 ) is an atomic formula.
4. If φ is a formula, then ∼ φ is a formula.
5. If φ and ψ are formulae, then ( φ & ψ) is a formula.
6. If φ and ψ are formulae, then ( φ ∨ ψ) is a formula.
7. If φ and ψ are formulae, then ( φ ⊃ ψ) is a formula.
8. If φ is a formula and x is a variable, then ∀ x φ is a formula.
9. If φ is a formula and x is a variable, then ∃ x φ is a formula.
10. Nothing else is a formula.

The definitions of the set of terms and that of formulae are inductive defini-
tions. Essentially, we construct the set of formulae in infinitely many stages. In
the initial stage, we pronounce all atomic formulas to be formulas; this corre-
sponds to the first few cases of the definition, i.e., the cases for ⊥, R(t1 , . . . , tn )
and =(t1 , t2 ). “Atomic formula” thus means any formula of this form.
The other cases of the definition give rules for constructing new formu-
lae out of formulae already constructed. At the second stage, we can use
them to construct formulae out of atomic formulae. At the third stage, we
construct new formulas from the atomic formulas and those obtained in the
second stage, and so on. A formula is anything that is eventually constructed
at such a stage, and nothing else.
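
The inductive definition also corresponds closely to a recursive procedure for recognizing formulae. As a rough illustration, here is a Python sketch; the encoding of terms and formulae as nested tuples and all names are our own choices, and the arities of predicate symbols are not checked.

CONNECTIVES = {"not": 1, "and": 2, "or": 2, "implies": 2}

def is_term(t):
    """Terms: variables and constant symbols are strings; a function
    application f(t1, ..., tn) is a tuple ("fun", f, t1, ..., tn)."""
    if isinstance(t, str):
        return True
    return t[0] == "fun" and all(is_term(arg) for arg in t[2:])

def is_formula(phi):
    """Mirror the clauses of Definition 6.5, one stage at a time."""
    op = phi[0]
    if op == "false":                               # the formula ⊥
        return len(phi) == 1
    if op == "atom":                                # R(t1, ..., tn) or =(t1, t2)
        return all(is_term(t) for t in phi[2:])
    if op in CONNECTIVES:                           # ∼, &, ∨, ⊃
        return len(phi) == 1 + CONNECTIVES[op] and all(map(is_formula, phi[1:]))
    if op in ("forall", "exists"):                  # ∀x φ and ∃x φ
        return isinstance(phi[1], str) and is_formula(phi[2])
    return False                                    # nothing else is a formula

phi = ("exists", "v0", ("and", ("atom", "A1_1", "c1"), ("atom", "A1_0", "v0")))
print(is_formula(phi))   # True
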
By convention, we write = between its arguments and leave out the paren-
theses: t1 = t2 is an abbreviation for =(t1 , t2 ). Moreover, ∼=(t1 , t2 ) is abbre-
viated as t1 ̸= t2 . When writing a formula (ψ ∗ χ) constructed from ψ, χ
using a two-place connective ∗, we will often leave out the outermost pair of
parentheses and write simply ψ ∗ χ.
Some logic texts require that the variable x must occur in φ in order for
∃ x φ and ∀ x φ to count as formulae. Nothing bad happens if you don’t require
this, and it makes things easier.


Definition 6.6. Formulas constructed using the defined operators are to be


understood as follows:

1. ⊤ abbreviates ∼⊥.

2. φ ≡ ψ abbreviates ( φ ⊃ ψ) & (ψ ⊃ φ).

If we work in a language for a specific application, we will often write two-


place predicate symbols and function symbols between the respective terms,
e.g., t1 < t2 and (t1 + t2 ) in the language of arithmetic and t1 ∈ t2 in the
language of set theory. The successor function in the language of arithmetic
is even written conventionally after its argument: t′ . Officially, however, these
are just conventional abbreviations for A20 (t1 , t2 ), f02 (t1 , t2 ), A20 (t1 , t2 ) and f 01 (t),
respectively.

Definition 6.7 (Syntactic identity). The symbol ≡ expresses syntactic iden-


tity between strings of symbols, i.e., φ ≡ ψ iff φ and ψ are strings of symbols
of the same length which contain the same symbol in each place.

The ≡ symbol may be flanked by strings obtained by concatenation, e.g.,


φ ≡ (ψ ∨ χ) means: the string of symbols φ is the same string as the one
obtained by concatenating an opening parenthesis, the string ψ, the ∨ symbol,
the string χ, and a closing parenthesis, in this order. If this is the case, then we
know that the first symbol of φ is an opening parenthesis, φ contains ψ as a
substring (starting at the second symbol), that substring is followed by ∨, etc.
As terms and formulae are built up from basic elements via inductive def-
initions, we can use the following induction principles to prove things about
them.

Lemma 6.8 (Principle of induction on terms). Let L be a first-order language. If


some property P is such that

1. it holds for every variable v,

2. it holds for every constant symbol a of L, and

3. it holds for f (t1 , . . . , tn ) whenever it holds for t1 , . . . , tn and f is an n-place


function symbol of L

(assuming t1 , . . . , tn are terms of L), then P holds for every term in Trm(L).

Lemma 6.9 (Principle of induction on formulae). Let L be a first-order language.


If some property P holds for all the atomic formulae and is such that

1. it holds for ∼ φ whenever it holds for φ;

2. it holds for ( φ & ψ) whenever it holds for φ and ψ;


3. it holds for ( φ ∨ ψ) whenever it holds for φ and ψ;

4. it holds for ( φ ⊃ ψ) whenever it holds for φ and ψ;

5. it holds for ∃ x φ whenever it holds for φ;

6. it holds for ∀ x φ whenever it holds for φ;

(assuming φ and ψ are formulae of L), then P holds for all formulas in Frm(L).

6.4 Unique Readability


The way we defined formulae guarantees that every formula has a unique read-
ing, i.e., there is essentially only one way of constructing it according to our
formation rules for formulae and only one way of “interpreting” it. If this
were not so, we would have ambiguous formulae, i.e., formulae that have
more than one reading or interpretation—and that is clearly something we
want to avoid. But more importantly, without this property, most of the defi-
nitions and proofs we are going to give will not go through.
Perhaps the best way to make this clear is to see what would happen if we
had given bad rules for forming formulae that would not guarantee unique
readability. For instance, we could have forgotten the parentheses in the for-
mation rules for connectives, e.g., we might have allowed this:

If φ and ψ are formulae, then so is φ ⊃ ψ.

Starting from an atomic formula θ, this would allow us to form θ ⊃ θ. From


this, together with θ, we would get θ ⊃ θ ⊃ θ. But there are two ways to do
this:

1. We take θ to be φ and θ ⊃ θ to be ψ.

2. We take φ to be θ ⊃ θ and ψ to be θ.

Correspondingly, there are two ways to “read” the formula θ ⊃ θ ⊃ θ. It is of


the form ψ ⊃ χ where ψ is θ and χ is θ ⊃ θ, but it is also of the form ψ ⊃ χ
with ψ being θ ⊃ θ and χ being θ.
If this happens, our definitions will not always work. For instance, when
we define the main operator of a formula, we say: in a formula of the form
ψ ⊃ χ, the main operator is the indicated occurrence of ⊃. But if we can match
the formula θ ⊃ θ ⊃ θ with ψ ⊃ χ in the two different ways mentioned above,
then in one case we get the first occurrence of ⊃ as the main operator, and in
the second case the second occurrence. But we intend the main operator to
be a function of the formula, i.e., every formula must have exactly one main
operator occurrence.

Lemma 6.10. The number of left and right parentheses in a formula φ are equal.


Proof. We prove this by induction on the way φ is constructed. This requires


two things: (a) We have to prove first that all atomic formulas have the prop-
erty in question (the induction basis). (b) Then we have to prove that when
we construct new formulas out of given formulas, the new formulas have the
property provided the old ones do.
Let l ( φ) be the number of left parentheses, and r ( φ) the number of right
parentheses in φ, and l (t) and r (t) similarly the number of left and right
parentheses in a term t.

1. φ ≡ ⊥: φ has 0 left and 0 right parentheses.

2. φ ≡ R(t1 , . . . , tn ): l ( φ) = 1 + l (t1 ) + · · · + l (tn ) = 1 + r (t1 ) + · · · +


r (tn ) = r ( φ). Here we make use of the fact, left as an exercise, that
l (t) = r (t) for any term t.

3. φ ≡ t1 = t2 : l ( φ) = l (t1 ) + l (t2 ) = r (t1 ) + r (t2 ) = r ( φ).

4. φ ≡ ∼ψ: By induction hypothesis, l (ψ) = r (ψ). Thus l ( φ) = l (ψ) =


r ( ψ ) = r ( φ ).

5. φ ≡ (ψ ∗ χ): By induction hypothesis, l (ψ) = r (ψ) and l (χ) = r (χ).


Thus l ( φ) = 1 + l (ψ) + l (χ) = 1 + r (ψ) + r (χ) = r ( φ).

6. φ ≡ ∀ x ψ: By induction hypothesis, l (ψ) = r (ψ). Thus, l ( φ) = l (ψ) =


r ( ψ ) = r ( φ ).

7. φ ≡ ∃ x ψ: Similarly.

Definition 6.11 (Proper prefix). A string of symbols ψ is a proper prefix of a


string of symbols φ if concatenating ψ and a non-empty string of symbols
yields φ.

Lemma 6.12. If φ is a formula, and ψ is a proper prefix of φ, then ψ is not a formula.

Proof. Exercise.

Proposition 6.13. If φ is an atomic formula, then it satisfies one, and only one of the
following conditions.

1. φ ≡ ⊥.

2. φ ≡ R(t1 , . . . , tn ) where R is an n-place predicate symbol, t1 , . . . , tn are terms,


and each of R, t1 , . . . , tn is uniquely determined.

3. φ ≡ t1 = t2 where t1 and t2 are uniquely determined terms.

Proof. Exercise.


Proposition 6.14 (Unique Readability). Every formula satisfies one, and only one
of the following conditions.

1. φ is atomic.

2. φ is of the form ∼ψ.

3. φ is of the form (ψ & χ).

4. φ is of the form (ψ ∨ χ).

5. φ is of the form (ψ ⊃ χ).

6. φ is of the form ∀ x ψ.

7. φ is of the form ∃ x ψ.

Moreover, in each case ψ, or ψ and χ, are uniquely determined. This means that, e.g.,
there are no different pairs ψ, χ and ψ′ , χ′ so that φ is both of the form (ψ ⊃ χ) and
( ψ ′ ⊃ χ ′ ).

Proof. The formation rules require that if a formula is not atomic, it must start
with an opening parenthesis (, ∼, or a quantifier. On the other hand, every
formula that starts with one of the following symbols must be atomic: a pred-
icate symbol, a function symbol, a constant symbol, ⊥.
So we really only have to show that if φ is of the form (ψ ∗ χ) and also of
the form (ψ′ ∗′ χ′ ), then ψ ≡ ψ′ , χ ≡ χ′ , and ∗ = ∗′ .
So suppose both φ ≡ (ψ ∗ χ) and φ ≡ (ψ′ ∗′ χ′ ). Then either ψ ≡ ψ′ or not.
If it is, clearly ∗ = ∗′ and χ ≡ χ′ , since they then are substrings of φ that begin
in the same place and are of the same length. The other case is ψ ̸≡ ψ′ . Since
ψ and ψ′ are both substrings of φ that begin at the same place, one must be a
proper prefix of the other. But this is impossible by Lemma 6.12.

6.5 Main operator of a Formula


It is often useful to talk about the last operator used in constructing a for-
mula φ. This operator is called the main operator of φ. Intuitively, it is the
“outermost” operator of φ. For example, the main operator of ∼ φ is ∼, the
main operator of ( φ ∨ ψ) is ∨, etc.

Definition 6.15 (Main operator). The main operator of a formula φ is defined


as follows:

1. φ is atomic: φ has no main operator.

2. φ ≡ ∼ψ: the main operator of φ is ∼.

3. φ ≡ (ψ & χ): the main operator of φ is &.


4. φ ≡ (ψ ∨ χ): the main operator of φ is ∨.

5. φ ≡ (ψ ⊃ χ): the main operator of φ is ⊃.

6. φ ≡ ∀ x ψ: the main operator of φ is ∀.

7. φ ≡ ∃ x ψ: the main operator of φ is ∃.

In each case, we intend the specific indicated occurrence of the main oper-
ator in the formula. For instance, since the formula ((θ ⊃ α) ⊃ (α ⊃ θ )) is of
the form (ψ ⊃ χ) where ψ is (θ ⊃ α) and χ is (α ⊃ θ ), the second occurrence
of ⊃ is the main operator.
This is a recursive definition of a function which maps all non-atomic for-
mulae to their main operator occurrence. Because of the way formulae are de-
fined inductively, every formula φ satisfies one of the cases in Definition 6.15.
This guarantees that for each non-atomic formula φ a main operator exists.
Because each formula satisfies only one of these conditions, and because the
smaller formulae from which φ is constructed are uniquely determined in
each case, the main operator occurrence of φ is unique, and so we have de-
fined a function.
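
Unique readability is also what makes procedures that operate on formulae as strings of symbols well defined. For instance, in a non-atomic formula (ψ ∗ χ) written with all official parentheses, the main operator can be located by scanning for the unique two-place connective at parenthesis depth 1. Here is a Python sketch for a propositional fragment; the ASCII characters ~, &, v, > stand in for ∼, &, ∨, ⊃, atomic formulae are single letters, and the whole representation is our own simplification for illustration.

def main_operator_position(phi):
    """Locate the main operator of phi, written with all official parentheses.
    By Proposition 6.14 there is exactly one two-place connective at depth 1."""
    if phi.startswith("~"):
        return 0                      # the main operator is the initial negation
    if not phi.startswith("("):
        return None                   # atomic: no main operator
    depth = 0
    for i, c in enumerate(phi):
        if c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
        elif c in "&v>" and depth == 1:
            return i

phi = "((P>Q)>(Q>P))"
i = main_operator_position(phi)
print(i, phi[i])   # 6 >  (the second occurrence of the conditional is the main operator)
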
We call formulae by the names in Table 6.1 depending on which symbol
their main operator is. Recall, however, that defined operators do not officially
appear in formulae. They are just abbreviations, so officially they cannot be
the main operator of a formula. In proofs about all formulae they therefore do
not have to be treated separately.
Main operator Type of formula Example
none atomic (formula) ⊥, R ( t1 , . . . , t n ), t1 = t2
∼ negation ∼φ
& conjunction ( φ & ψ)
∨ disjunction ( φ ∨ ψ)
⊃ conditional ( φ ⊃ ψ)
≡ biconditional ( φ ≡ ψ)
∀ universal (formula) ∀x φ
∃ existential (formula) ∃x φ
Table 6.1: Main operator and names of formulae

6.6 Subformulae
It is often useful to talk about the formulae that “make up” a given formula.
We call these its subformulae. Any formula counts as a subformula of itself; a
subformula of φ other than φ itself is a proper subformula.

Definition 6.16 (Immediate Subformula). If φ is a formula, the immediate sub-


formulae of φ are defined inductively as follows:


1. Atomic formulae have no immediate subformulae.


2. φ ≡ ∼ψ: The only immediate subformula of φ is ψ.
3. φ ≡ (ψ ∗ χ): The immediate subformulae of φ are ψ and χ (∗ is any one
of the two-place connectives).
4. φ ≡ ∀ x ψ: The only immediate subformula of φ is ψ.
5. φ ≡ ∃ x ψ: The only immediate subformula of φ is ψ.
Definition 6.17 (Proper Subformula). If φ is a formula, the proper subformulae
of φ are defined recursively as follows:
1. Atomic formulae have no proper subformulae.
2. φ ≡ ∼ψ: The proper subformulae of φ are ψ together with all proper
subformulae of ψ.
3. φ ≡ (ψ ∗ χ): The proper subformulae of φ are ψ, χ, together with all
proper subformulae of ψ and those of χ.
4. φ ≡ ∀ x ψ: The proper subformulae of φ are ψ together with all proper
subformulae of ψ.
5. φ ≡ ∃ x ψ: The proper subformulae of φ are ψ together with all proper
subformulae of ψ.
Definition 6.18 (Subformula). The subformulae of φ are φ itself together with
all its proper subformulae.
Note the subtle difference in how we have defined immediate subformulae
and proper subformulae. In the first case, we have directly defined the imme-
diate subformulae of a formula φ for each possible form of φ. It is an explicit
definition by cases, and the cases mirror the inductive definition of the set
of formulae. In the second case, we have also mirrored the way the set of all
formulae is defined, but in each case we have also included the proper subfor-
mulae of the smaller formulae ψ, χ in addition to these formulae themselves.
This makes the definition recursive. In general, a definition of a function on an
inductively defined set (in our case, formulae) is recursive if the cases in the
definition of the function make use of the function itself. To be well defined,
we must make sure, however, that we only ever use the values of the function
for arguments that come “before” the one we are defining—in our case, when
defining “proper subformula” for (ψ ∗ χ) we only use the proper subformulae
of the “earlier” formulae ψ and χ.
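
The recursive character of Definition 6.17 can also be seen in a short Python sketch which computes proper subformulae by calling itself on the smaller formulae; the tuple encoding of formulae is the same informal one used in the earlier sketches.

def proper_subformulae(phi):
    """Proper subformulae of phi, following Definition 6.17.  Formulae are
    nested tuples such as ("not", psi), ("and", psi, chi), ("forall", x, psi);
    atomic formulae are tagged "atom", and the falsity symbol is ("false",)."""
    op = phi[0]
    if op in ("atom", "false"):
        return []
    if op == "not":
        return [phi[1]] + proper_subformulae(phi[1])
    if op in ("and", "or", "implies"):
        return ([phi[1], phi[2]]
                + proper_subformulae(phi[1]) + proper_subformulae(phi[2]))
    if op in ("forall", "exists"):
        return [phi[2]] + proper_subformulae(phi[2])

def subformulae(phi):
    """phi itself together with all its proper subformulae (Definition 6.18)."""
    return [phi] + proper_subformulae(phi)

phi = ("exists", "v0", ("and", ("atom", "P", "c1"), ("not", ("atom", "P", "v0"))))
for psi in subformulae(phi):
    print(psi)
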
Proposition 6.19. Suppose ψ is a subformula of φ and χ is a subformula of ψ. Then
χ is a subformula of φ. In other words, the subformula relation is transitive.
Proposition 6.20. Suppose φ is a formula with n connectives and quantifiers. Then
φ has at most 2n + 1 subformulas.


6.7 Formation Sequences


Defining formulae via an inductive definition, and the complementary tech-
nique of proving properties of formulae via induction, is an elegant and effi-
cient approach. However, it can also be useful to consider a more bottom-up,
step-by-step approach to the construction of formulae, which we do here us-
ing the notion of a formation sequence. To show how terms and formulae can
be introduced in this way without needing to refer to their inductive defini-
tions, we first introduce the notion of an arbitrary string of symbols drawn
from some language L.

Definition 6.21 (Strings). Suppose L is a first-order language. An L-string is


a finite sequence of symbols of L. Where the language L is clearly fixed by
the context, we will often refer to an L-string simply as a string.

Example 6.22. For any first-order language L, all L-formulae are L-strings,
but not conversely. For example,

)(v0 ⊃ ∃

is an L-string but not an L-formula.

Definition 6.23 (Formation sequences for terms). A finite sequence of L-strings


⟨t0 , . . . , tn ⟩ is a formation sequence for a term t if t ≡ tn and for all i ≤ n, either ti
is a variable or a constant symbol, or L contains a k-ary function symbol f and
there exist m1 , . . . , mk < i such that ti ≡ f (tm1 , . . . , tmk ). When it is necessary
to distinguish, we will refer to formation sequences for terms as term formation
sequences.

Example 6.24. The sequence

⟨c0 , v0 , f02 (c0 , v0 ), f01 (f02 (c0 , v0 ))⟩

is a formation sequence for the term f01 (f02 (c0 , v0 )), as is

⟨v0 , c0 , f02 (c0 , v0 ), f01 (f02 (c0 , v0 ))⟩.

Definition 6.25 (Formation sequences for formulas). A finite sequence of L-


strings ⟨ φ0 , . . . , φn ⟩ is a formation sequence for φ if φ ≡ φn and for all i ≤ n,
either φi is an atomic formula or there exist j, k < i and a variable x such that
one of the following holds:

1. φi ≡ ∼ φ j .

2. φi ≡ ( φ j & φk ).

3. φi ≡ ( φ j ∨ φk ).


4. φi ≡ ( φ j ⊃ φk ).
5. φi ≡ ∀ x φ j .
6. φi ≡ ∃ x φ j .
When it is necessary to distinguish, we will refer to formation sequences for
formulas as formula formation sequences.
Example 6.26.

⟨A10 (v0 ), A11 (c1 ), (A11 (c1 ) & A10 (v0 )), ∃v0 (A11 (c1 ) & A10 (v0 ))⟩

is a formation sequence of ∃v0 (A11 (c1 ) & A10 (v0 )), as is

⟨A10 (v0 ), A11 (c1 ), (A11 (c1 ) & A10 (v0 )), A11 (c1 ),
∀v1 A10 (v0 ), ∃v0 (A11 (c1 ) & A10 (v0 ))⟩.

As can be seen from the second example, formation sequences may contain
“junk”: formulae which are redundant or do not contribute to the construc-
tion.

Proposition 6.27. Every formula φ in Frm(L) has a formation sequence.

Proof. Suppose φ is atomic. Then the sequence ⟨ φ⟩ is a formation sequence


for φ. Now suppose that ψ and χ have formation sequences ⟨ψ0 , . . . , ψn ⟩ and
⟨χ0 , . . . , χm ⟩ respectively.
1. If φ ≡ ∼ψ, then ⟨ψ0 , . . . , ψn , ∼ψn ⟩ is a formation sequence for φ.
2. If φ ≡ (ψ & χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn & χm )⟩ is a formation
sequence for φ.
3. If φ ≡ (ψ ∨ χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn ∨ χm )⟩ is a formation
sequence for φ.
4. If φ ≡ (ψ ⊃ χ), then ⟨ψ0 , . . . , ψn , χ0 , . . . , χm , (ψn ⊃ χm )⟩ is a formation
sequence for φ.
5. If φ ≡ ∀ x ψ, then ⟨ψ0 , . . . , ψn , ∀ x ψn ⟩ is a formation sequence for φ.
6. If φ ≡ ∃ x ψ, then ⟨ψ0 , . . . , ψn , ∃ x ψn ⟩ is a formation sequence for φ.
By the principle of induction on formulae, every formula has a formation se-
quence.
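
The proof is in effect a recipe for computing a formation sequence, which we might sketch in Python as follows (using the same informal tuple encoding of formulae as in the earlier sketches).

def formation_sequence(phi):
    """Build a formation sequence for phi by recursion on its structure,
    following the proof of Proposition 6.27.  The result may contain
    repetitions, i.e., "junk", if the immediate subformulae share parts."""
    op = phi[0]
    if op in ("atom", "false"):
        return [phi]
    if op == "not":
        return formation_sequence(phi[1]) + [phi]
    if op in ("and", "or", "implies"):
        return formation_sequence(phi[1]) + formation_sequence(phi[2]) + [phi]
    if op in ("forall", "exists"):
        return formation_sequence(phi[2]) + [phi]

phi = ("exists", "v0", ("and", ("atom", "A1_1", "c1"), ("atom", "A1_0", "v0")))
for step in formation_sequence(phi):
    print(step)
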

We can also prove the converse. This is important because it shows that
our two ways of defining formulas are equivalent: they give the same results.
It also means that we can prove theorems about formulas by using ordinary
induction on the length of formation sequences.


Lemma 6.28. Suppose that ⟨ φ0 , . . . , φn ⟩ is a formation sequence for φn , and that


k ≤ n. Then ⟨ φ0 , . . . , φk ⟩ is a formation sequence for φk .

Proof. Exercise.

Theorem 6.29. Frm(L) is the set of all L-strings φ such that there exists a formula
formation sequence for φ.

Proof. Let F be the set of all strings of symbols in the language L that have a
formation sequence. We have seen in Proposition 6.27 that Frm(L) ⊆ F, so
now we prove the converse.
Suppose φ has a formation sequence ⟨ φ0 , . . . , φn ⟩. We prove that φ ∈
Frm(L) by strong induction on n. Our induction hypothesis is that every
string of symbols with a formation sequence of length m < n is in Frm(L). By
the definition of a formation sequence, either φ ≡ φn is atomic or there must
exist j, k < n such that one of the following is the case:
1. φ ≡ ∼ φ j .
2. φ ≡ ( φ j & φk ).
3. φ ≡ ( φ j ∨ φk ).
4. φ ≡ ( φ j ⊃ φk ).
5. φ ≡ ∀ x φ j .
6. φ ≡ ∃ x φ j .
Now we reason by cases. If φ is atomic then φn ∈ Frm(L). Suppose in-
stead that φ ≡ ( φ j & φk ). By Lemma 6.28, ⟨ φ0 , . . . , φ j ⟩ and ⟨ φ0 , . . . , φk ⟩ are
formation sequences for φ j and φk , respectively. Since these are proper ini-
tial subsequences of the formation sequence for φ, they both have length less
than n. Therefore by the induction hypothesis, φ j and φk are in Frm(L), and
by the definition of a formula, so is ( φ j & φk ). The other cases follow by par-
allel reasoning.

Formation sequences for terms have similar properties to those for formu-
lae.
Proposition 6.30. Trm(L) is the set of all L-strings t such that there exists a term
formation sequence for t.

Proof. Exercise.

There are two types of “junk” that can appear in formation sequences: re-
peated elements, and elements that are irrelevant to the construction of the
formula or term. We can eliminate both by looking at minimal formation
sequences.


Definition 6.31 (Minimal formation sequences). A formation sequence ⟨ φ0 , . . . , φn ⟩


for a formula φ is a minimal formation sequence for φ if for every other formation
sequence s for φ, the length of s is greater than or equal to n + 1.
Similarly, a formation sequence ⟨t0 , . . . , tn ⟩ for a term t is a minimal forma-
tion sequence for t if for every other formation sequence s for t, the length of s
is greater than or equal to n + 1.

Note that a formula or term can have more than one minimal formation
sequence, but they will contain exactly the same strings.

Proposition 6.32. The following are equivalent:

1. ψ is a subformula of φ.

2. ψ occurs in every formation sequence of φ.

3. ψ occurs in a minimal formation sequence of φ.

Proof. Exercise.

Historical Remarks Formation sequences were introduced by Raymond Smullyan


in his textbook First-Order Logic (Smullyan, 1968). Additional properties of
formation sequences were established by Zuckerman (1973).

6.8 Free Variables and Sentences


Definition 6.33 (Free occurrences of a variable). The free occurrences of a vari-
able in a formula are defined inductively as follows:

1. φ is atomic: all variable occurrences in φ are free.

2. φ ≡ ∼ψ: the free variable occurrences of φ are exactly those of ψ.

3. φ ≡ (ψ ∗ χ): the free variable occurrences of φ are those in ψ together


with those in χ.

4. φ ≡ ∀ x ψ: the free variable occurrences in φ are all of those in ψ except


for occurrences of x.

5. φ ≡ ∃ x ψ: the free variable occurrences in φ are all of those in ψ except


for occurrences of x.

Definition 6.34 (Bound Variables). An occurrence of a variable in a formula φ


is bound if it is not free.


Definition 6.35 (Scope). If ∀ x ψ is an occurrence of a subformula in a for-


mula φ, then the corresponding occurrence of ψ in φ is called the scope of
the corresponding occurrence of ∀ x. Similarly for ∃ x.
If ψ is the scope of a quantifier occurrence ∀ x or ∃ x in φ, then the free oc-
currences of x in ψ are bound in ∀ x ψ and ∃ x ψ. We say that these occurrences
are bound by the mentioned quantifier occurrence.

Example 6.36. Consider the following formula:

∃v0 A20 (v0 , v1 )

Here the scope of ∃v0 is the subformula ψ ≡ A20 (v0 , v1 ). The quantifier binds the occurrence of v0 in ψ,
but does not bind the occurrence of v1 . So v1 is a free variable in this case.
We can now see how this might work in a more complicated formula φ:

∀v0 (A10 (v0 ) ⊃ A20 (v0 , v1 )) ⊃ ∃v1 (A21 (v0 , v1 ) ∨ ∀v0 ∼A11 (v0 ))

Here ψ ≡ (A10 (v0 ) ⊃ A20 (v0 , v1 )), χ ≡ (A21 (v0 , v1 ) ∨ ∀v0 ∼A11 (v0 )), and θ ≡ ∼A11 (v0 ).

ψ is the scope of the first ∀v0 , χ is the scope of ∃v1 , and θ is the scope of
the second ∀v0 . The first ∀v0 binds the occurrences of v0 in ψ, ∃v1 binds the
occurrence of v1 in χ, and the second ∀v0 binds the occurrence of v0 in θ. The
first occurrence of v1 and the fourth occurrence of v0 are free in φ. The last
occurrence of v0 is free in θ, but bound in χ and φ.

Definition 6.37 (Sentence). A formula φ is a sentence iff it contains no free


occurrences of variables.
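
Since the definition is inductive, the free variables of a formula can be computed by a simple recursion, and being a sentence can then be tested directly. Here is a Python sketch; the tuple encoding is the informal one used in the earlier sketches, and for brevity the arguments of atomic formulae are assumed to be variables or constant symbols rather than complex terms.

def free_variables(phi):
    """The variables with free occurrences in phi, following Definition 6.33.
    Variables are strings starting with "v"; other strings are constant symbols."""
    op = phi[0]
    if op == "false":
        return set()
    if op == "atom":
        return {t for t in phi[2:] if t.startswith("v")}
    if op == "not":
        return free_variables(phi[1])
    if op in ("and", "or", "implies"):
        return free_variables(phi[1]) | free_variables(phi[2])
    if op in ("forall", "exists"):
        return free_variables(phi[2]) - {phi[1]}

def is_sentence(phi):
    """A sentence is a formula with no free occurrences of variables."""
    return free_variables(phi) == set()

phi = ("exists", "v0", ("atom", "A2_0", "v0", "v1"))
print(free_variables(phi))    # {'v1'}: v1 is free, v0 is bound
print(is_sentence(phi))       # False
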

6.9 Substitution
Definition 6.38 (Substitution in a term). We define s[t/x ], the result of sub-
stituting t for every occurrence of x in s, recursively:

1. s ≡ c: s[t/x ] is just s.

2. s ≡ y: s[t/x ] is also just s, provided y is a variable and y ̸≡ x.

3. s ≡ x: s[t/x ] is t.

4. s ≡ f (t1 , . . . , tn ): s[t/x ] is f (t1 [t/x ], . . . , tn [t/x ]).

Definition 6.39. A term t is free for x in φ if none of the free occurrences of x


in φ occur in the scope of a quantifier that binds a variable in t.

Example 6.40.


1. v8 is free for v1 in ∃v3 A24 (v3 , v1 )

2. f12 (v1 , v2 ) is not free for v0 in ∀v2 A24 (v0 , v2 )

Definition 6.41 (Substitution in a formula). If φ is a formula, x is a variable,


and t is a term free for x in φ, then φ[t/x ] is the result of substituting t for all
free occurrences of x in φ.

1. φ ≡ ⊥: φ[t/x ] is ⊥.

2. φ ≡ P(t1 , . . . , tn ): φ[t/x ] is P(t1 [t/x ], . . . , tn [t/x ]).

3. φ ≡ t1 = t2 : φ[t/x ] is t1 [t/x ] = t2 [t/x ].

4. φ ≡ ∼ψ: φ[t/x ] is ∼ψ[t/x ].

5. φ ≡ (ψ & χ): φ[t/x ] is (ψ[t/x ] & χ[t/x ]).

6. φ ≡ (ψ ∨ χ): φ[t/x ] is (ψ[t/x ] ∨ χ[t/x ]).

7. φ ≡ (ψ ⊃ χ): φ[t/x ] is (ψ[t/x ] ⊃ χ[t/x ]).

8. φ ≡ ∀y ψ: φ[t/x ] is ∀y ψ[t/x ], provided y is a variable other than x;


otherwise φ[t/x ] is just φ.

9. φ ≡ ∃y ψ: φ[t/x ] is ∃y ψ[t/x ], provided y is a variable other than x;


otherwise φ[t/x ] is just φ.

Note that substitution may be vacuous: If x does not occur in φ at all, then
φ[t/x ] is just φ.
The restriction that t must be free for x in φ is necessary to exclude cases
like the following. If φ ≡ ∃y x < y and t ≡ y, then φ[t/x ] would be ∃y y <
y. In this case the free variable y is “captured” by the quantifier ∃y upon
substitution, and that is undesirable. For instance, we would like it to be the
case that whenever ∀ x ψ holds, so does ψ[t/x ]. But consider ∀ x ∃y x < y (here
ψ is ∃y x < y). It is a sentence that is true about, e.g., the natural numbers:
for every number x there is a number y greater than it. If we allowed y as a
possible substitution for x, we would end up with ψ[y/x ] ≡ ∃y y < y, which
is false. We prevent this by requiring that none of the free variables in t would
end up being bound by a quantifier in φ.
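
As an informal illustration of both the substitution clauses and the reason for the restriction, here is a Python sketch (same tuple encoding as before). For simplicity the arguments of atomic formulae are variables or constant symbols, and we simply refuse to substitute whenever a quantifier of φ binds a variable of t, a slightly stricter test than “t is free for x in φ.”

def subst(phi, t, x, t_vars=frozenset()):
    """A sketch of phi[t/x]: substitute the term t for the free occurrences of
    the variable x in phi.  t_vars lists the variables occurring in t."""
    op = phi[0]
    if op == "false":
        return phi
    if op == "atom":
        return phi[:2] + tuple(t if a == x else a for a in phi[2:])
    if op == "not":
        return (op, subst(phi[1], t, x, t_vars))
    if op in ("and", "or", "implies"):
        return (op, subst(phi[1], t, x, t_vars), subst(phi[2], t, x, t_vars))
    if op in ("forall", "exists"):
        y, psi = phi[1], phi[2]
        if y == x:
            return phi        # the occurrences of x below this quantifier are bound
        if y in t_vars:
            raise ValueError("variable %s of t would be captured" % y)
        return (op, y, subst(psi, t, x, t_vars))

phi = ("exists", "y", ("atom", "<", "x", "y"))    # the formula ∃y x < y
print(subst(phi, "c0", "x"))                      # fine: ∃y c0 < y
try:
    subst(phi, "y", "x", {"y"})                   # y is not free for x in phi
except ValueError as err:
    print("blocked:", err)
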
We often use the following convention to avoid cumbersome notation: If
φ is a formula which may contain the variable x free, we also write φ( x ) to
indicate this. When it is clear which φ and x we have in mind, and t is a term
(assumed to be free for x in φ( x )), then we write φ(t) as short for φ[t/x ]. So
for instance, we might say, “we call φ(t) an instance of ∀ x φ( x ).” By this we
mean that if φ is any formula, x a variable, and t a term that’s free for x in φ,
then φ[t/x ] is an instance of ∀ x φ.

Chapter 7

Semantics of First-Order Logic

7.1 Introduction

Giving the meaning of expressions is the domain of semantics. The central


concept in semantics is that of satisfaction in a structure. A structure gives
meaning to the building blocks of the language: a domain is a non-empty
set of objects. The quantifiers are interpreted as ranging over this domain,
constant symbols are assigned elements in the domain, function symbols are
assigned functions from the domain to itself, and predicate symbols are as-
signed relations on the domain. The domain together with assignments to the
basic vocabulary constitutes a structure. Variables may appear in formulae,
and in order to give a semantics, we also have to assign elements of the do-
main to them—this is a variable assignment. The satisfaction relation, finally,
brings these together. A formula may be satisfied in a structure M relative to
a variable assignment s, written as M, s ⊨ φ. This relation is also defined by
induction on the structure of φ, using the truth tables for the logical connec-
tives to define, say, satisfaction of ( φ & ψ) in terms of satisfaction (or not) of φ
and ψ. It then turns out that the variable assignment is irrelevant if the for-
mula φ is a sentence, i.e., has no free variables, and so we can talk of sentences
being simply satisfied (or not) in structures.

On the basis of the satisfaction relation M ⊨ φ for sentences we can then


define the basic semantic notions of validity, entailment, and satisfiability.
A sentence is valid, ⊨ φ, if every structure satisfies it. It is entailed by a set
of sentences, Γ ⊨ φ, if every structure that satisfies all the sentences in Γ also
satisfies φ. And a set of sentences is satisfiable if some structure satisfies all
sentences in it at the same time. Because formulae are inductively defined,
and satisfaction is in turn defined by induction on the structure of formulae,
we can use induction to prove properties of our semantics and to relate the
semantic notions defined.


7.2 Structures for First-order Languages


First-order languages are, by themselves, uninterpreted: the constant symbols,
function symbols, and predicate symbols have no specific meaning attached
to them. Meanings are given by specifying a structure. It specifies the domain,
i.e., the objects which the constant symbols pick out, the function symbols
operate on, and the quantifiers range over. In addition, it specifies which con-
stant symbols pick out which objects, how a function symbol maps objects
to objects, and which objects the predicate symbols apply to. Structures are
the basis for semantic notions in logic, e.g., the notion of consequence, valid-
ity, satisfiability. They are variously called “structures,” “interpretations,” or
“models” in the literature.

Definition 7.1 (Structures). A structure M, for a language L of first-order logic


consists of the following elements:

1. Domain: a non-empty set, |M|

2. Interpretation of constant symbols: for each constant symbol c of L, an ele-


ment cM ∈ |M|

3. Interpretation of predicate symbols: for each n-place predicate symbol R of


L (other than =), an n-place relation RM ⊆ |M|n

4. Interpretation of function symbols: for each n-place function symbol f of


L, an n-place function f M : |M|n → |M|

Example 7.2. A structure M for the language of arithmetic consists of a set,


an element of |M|, 0M , as interpretation of the constant symbol 0, a one-place
function ′M : |M| → |M|, two two-place functions +M and ×M , both |M|2 →
|M|, and a two-place relation <M ⊆ |M|2 .
An obvious example of such a structure is the following:

1. |N| = N

2. 0N = 0

3. ′N (n) = n + 1 for all n ∈ N

4. +N (n, m) = n + m for all n, m ∈ N

5. ×N (n, m) = n · m for all n, m ∈ N

6. <N = {⟨n, m⟩ | n ∈ N, m ∈ N, n < m}

The structure N for L A so defined is called the standard model of arithmetic,


because it interprets the non-logical constants of L A exactly how you would
expect.
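
As a purely informal illustration (none of the Python names below are part of the official definition), the interpretations that make up the standard model can be packaged as ordinary data:

# Interpretations of the non-logical symbols of L_A in the standard model N.
# The domain is the set of natural numbers; we do not list it explicitly.
N = {
    "zero":  0,                          # the interpretation of 0
    "succ":  lambda n: n + 1,            # the interpretation of '
    "plus":  lambda n, m: n + m,         # the interpretation of +
    "times": lambda n, m: n * m,         # the interpretation of ×
    "less":  lambda n, m: n < m,         # < via its characteristic function
}

print(N["succ"](N["zero"]))              # 1: the value of the closed term 0'
print(N["less"](2, 3))                   # True: the pair ⟨2, 3⟩ is in the relation
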


However, there are many other possible structures for L A . For instance,
we might take as the domain the set Z of integers instead of N, and define the
interpretations of 0, ′, +, ×, < accordingly. But we can also define structures
for L A which have nothing even remotely to do with numbers.

Example 7.3. A structure M for the language L Z of set theory requires just a
set and a single two-place relation. So technically, e.g., the set of people plus
the relation “x is older than y” could be used as a structure for L Z , as well as
N together with n ≥ m for n, m ∈ N.
A particularly interesting structure for L Z in which the elements of the
domain are actually sets, and the interpretation of ∈ actually is the relation “x
is an element of y” is the structure HF of hereditarily finite sets:

1. |HF| = ∅ ∪ ℘(∅) ∪ ℘(℘(∅)) ∪ ℘(℘(℘(∅))) ∪ . . . ;

2. ∈HF = {⟨ x, y⟩ | x, y ∈ |HF| , x ∈ y}.

The stipulations we make as to what counts as a structure impact our logic.


For example, the choice to prevent empty domains ensures, given the usual
account of satisfaction (or truth) for quantified sentences, that ∃ x ( φ( x ) ∨ ∼ φ( x ))
is valid—that is, a logical truth. And the stipulation that all constant symbols
must refer to an object in the domain ensures that existential generaliza-
tion is a sound pattern of inference: φ( a), therefore ∃ x φ( x ). If we allowed
names to refer outside the domain, or to not refer, then we would be on our
way to a free logic, in which existential generalization requires an additional
premise: φ( a) and ∃ x x = a, therefore ∃ x φ( x ).

7.3 Covered Structures for First-order Languages


Recall that a term is closed if it contains no variables.

Definition 7.4 (Value of closed terms). If t is a closed term of the language L


and M is a structure for L, the value ValM (t) is defined as follows:

1. If t is just the constant symbol c, then ValM (c) = cM .

2. If t is of the form f (t1 , . . . , tn ), then

ValM (t) = f M (ValM (t1 ), . . . , ValM (tn )).

Definition 7.5 (Covered structure). A structure is covered if every element of


the domain is the value of some closed term.

Example 7.6. Let L be the language with constant symbols zero, one, two,
. . . , the binary predicate symbol <, and the binary function symbols + and
×. Then a structure M for L is the one with domain |M| = {0, 1, 2, . . .} and
assignments zeroM = 0, oneM = 1, twoM = 2, and so forth. For the binary
relation symbol <, the set <M is the set of all pairs ⟨c1 , c2 ⟩ ∈ |M|2 such that
c1 is less than c2 : for example, ⟨1, 3⟩ ∈ <M but ⟨2, 2⟩ ∉ <M . For the binary
function symbol +, define +M in the usual way—for example, +M (2, 3) maps
to 5, and similarly for the binary function symbol ×. Hence, the value of
four is just 4, and the value of ×(two, +(three, zero)) (or in infix notation,
two × (three + zero)) is

ValM (×(two, +(three, zero)))
= ×M (ValM (two), ValM (+(three, zero)))
= ×M (ValM (two), +M (ValM (three), ValM (zero)))
= ×M (twoM , +M (threeM , zeroM ))
= ×M (2, +M (3, 0))
= ×M (2, 3)
= 6
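
The computation just carried out is an instance of a simple recursion on terms, which we might sketch in Python as follows; the tuple encoding of closed terms and the dictionary names are our own choices for illustration.

# Closed terms as nested tuples, e.g. ("times", "two", ("plus", "three", "zero")).
constants = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4}
functions = {"plus": lambda n, m: n + m, "times": lambda n, m: n * m}

def val(t):
    """Value of a closed term, computed by recursion as in Definition 7.4."""
    if isinstance(t, str):                        # a constant symbol
        return constants[t]
    f, args = t[0], t[1:]                         # t has the form f(t1, ..., tn)
    return functions[f](*(val(s) for s in args))

print(val(("times", "two", ("plus", "three", "zero"))))   # 6
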

7.4 Satisfaction of a Formula in a Structure


The basic notion that relates expressions such as terms and formulae, on the
one hand, and structures on the other, are those of value of a term and satisfac-
tion of a formula. Informally, the value of a term is an element of a structure—
if the term is just a constant, its value is the object assigned to the constant
by the structure, and if it is built up using function symbols, the value is com-
puted from the values of constants and the functions assigned to the functions
in the term. A formula is satisfied in a structure if the interpretation given to
the predicates makes the formula true in the domain of the structure. This
notion of satisfaction is specified inductively: the specification of the struc-
ture directly states when atomic formulae are satisfied, and we define when a
complex formula is satisfied depending on the main connective or quantifier
and whether or not the immediate subformulae are satisfied.
The case of the quantifiers here is a bit tricky, as the immediate subformula
of a quantified formula has a free variable, and structures don’t specify the val-
ues of variables. In order to deal with this difficulty, we also introduce variable
assignments and define satisfaction not with respect to a structure alone, but
with respect to a structure plus a variable assignment.

Definition 7.7 (Variable Assignment). A variable assignment s for a structure M


is a function which maps each variable to an element of |M|, i.e., s : Var →
|M|.

A structure assigns a value to each constant symbol, and a variable assign-


ment to each variable. But we want to use terms built up from them to also


name elements of the domain. For this we define the value of terms induc-
tively. For constant symbols and variables the value is just as the structure or
the variable assignment specifies it; for more complex terms it is computed
recursively using the functions the structure assigns to the function symbols.

Definition 7.8 (Value of Terms). If t is a term of the language L, M is a struc-


ture for L, and s is a variable assignment for M, the value ValMs ( t ) is defined
as follows:

1. t ≡ c: ValMs (t) = cM .

2. t ≡ x: ValMs (t) = s( x ).

3. t ≡ f (t1 , . . . , tn ): ValMs (t) = f M (ValMs (t1 ), . . . , ValMs (tn )).

Definition 7.9 (x-Variant). If s is a variable assignment for a structure M, then


any variable assignment s′ for M which differs from s at most in what it as-
signs to x is called an x-variant of s. If s′ is an x-variant of s we write s′ ∼ x s.

Note that an x-variant of an assignment s does not have to assign something


different to x. In fact, every assignment counts as an x-variant of itself.

Definition 7.10. If s is a variable assignment for a structure M and m ∈ |M|,


then the assignment s[m/x] is the variable assignment defined by

    s[m/x](y) = m      if y ≡ x,
    s[m/x](y) = s(y)   otherwise.

In other words, s[m/x ] is the particular x-variant of s which assigns the


domain element m to x, and assigns the same things to variables other than x
that s does.
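
In the dictionary representation of assignments used in the sketch above (again, purely an illustration), s[m/x] can be computed by copying s and changing only the entry for x:

def variant(s, x, m):
    # Return the x-variant s[m/x]: like s, except that x is mapped to m.
    s_new = dict(s)      # copy, so the original assignment s is unchanged
    s_new[x] = m
    return s_new

s = {'x': 1, 'y': 1}
s2 = variant(s, 'x', 2)  # s2['x'] == 2, s2['y'] == s['y'] == 1, and s is unchanged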

Definition 7.11 (Satisfaction). Satisfaction of a formula φ in a structure M rel-


ative to a variable assignment s, in symbols: M, s ⊨ φ, is defined recursively
as follows. (We write M, s ⊭ φ to mean “not M, s ⊨ φ.”)

1. φ ≡ ⊥: M, s ⊭ φ.

2. φ ≡ R(t1, . . . , tn): M, s ⊨ φ iff ⟨Val^M_s(t1), . . . , Val^M_s(tn)⟩ ∈ R^M.

3. φ ≡ t1 = t2: M, s ⊨ φ iff Val^M_s(t1) = Val^M_s(t2).

4. φ ≡ ∼ψ: M, s ⊨ φ iff M, s ⊭ ψ.

5. φ ≡ (ψ & χ): M, s ⊨ φ iff M, s ⊨ ψ and M, s ⊨ χ.

6. φ ≡ (ψ ∨ χ): M, s ⊨ φ iff M, s ⊨ ψ or M, s ⊨ χ (or both).


7. φ ≡ (ψ ⊃ χ): M, s ⊨ φ iff M, s ⊭ ψ or M, s ⊨ χ (or both).

8. φ ≡ ∀ x ψ: M, s ⊨ φ iff for every element m ∈ |M|, M, s[m/x ] ⊨ ψ.

9. φ ≡ ∃ x ψ: M, s ⊨ φ iff for at least one element m ∈ |M|, M, s[m/x ] ⊨ ψ.

The variable assignments are important in the last two clauses. We cannot
define satisfaction of ∀ x ψ( x ) by “for all m ∈ |M|, M ⊨ ψ(m).” We cannot
define satisfaction of ∃ x ψ( x ) by “for at least one m ∈ |M|, M ⊨ ψ(m).” The
reason is that if m ∈ |M|, it is not a symbol of the language, and so ψ(m) is
not a formula (that is, ψ[m/x ] is undefined). We also cannot assume that we
have constant symbols or terms available that name every element of M, since
there is nothing in the definition of structures that requires it. In the standard
language, the set of constant symbols is countably infinite, so if |M| is not
countable there aren’t even enough constant symbols to name every object.
We solve this problem by introducing variable assignments, which allow
us to link variables directly with elements of the domain. Then instead of
saying that, e.g., ∃x ψ(x) is satisfied in M iff ψ(m) holds for at least one m ∈ |M|
(which, as we just saw, makes no sense), we say it is satisfied in M relative to s
iff ψ(x) is satisfied relative to s[m/x] for at least one m ∈ |M|.
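
The recursion in Definition 7.11 can likewise be turned into a brute-force satisfaction checker for structures with a finite domain. The sketch below is illustrative only: it assumes the same hypothetical tuple representation as before, with formulae such as ('pred', 'R', (t1, t2)), and it reuses the value and variant helpers from the earlier sketches. Infinite domains are of course out of reach for this kind of exhaustive check.

def satisfies(M, s, phi):
    # Return True iff M, s ⊨ phi, for a structure M with a finite domain.
    op = phi[0]
    if op == 'bot':                                   # ⊥ is never satisfied
        return False
    if op == 'pred':                                  # R(t1, ..., tn)
        args = tuple(value(t, M, s) for t in phi[2])
        return args in M['preds'][phi[1]]
    if op == 'eq':                                    # t1 = t2
        return value(phi[1], M, s) == value(phi[2], M, s)
    if op == 'not':
        return not satisfies(M, s, phi[1])
    if op == 'and':
        return satisfies(M, s, phi[1]) and satisfies(M, s, phi[2])
    if op == 'or':
        return satisfies(M, s, phi[1]) or satisfies(M, s, phi[2])
    if op == 'implies':
        return (not satisfies(M, s, phi[1])) or satisfies(M, s, phi[2])
    if op == 'forall':                                # ∀x ψ: every x-variant satisfies ψ
        return all(satisfies(M, variant(s, phi[1], m), phi[2]) for m in M['domain'])
    if op == 'exists':                                # ∃x ψ: some x-variant satisfies ψ
        return any(satisfies(M, variant(s, phi[1], m), phi[2]) for m in M['domain'])
    raise ValueError('not a formula')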

Example 7.12. Let L = { a, b, f , R} where a and b are constant symbols, f is a


two-place function symbol, and R is a two-place predicate symbol. Consider
the structure M defined by:

1. |M| = {1, 2, 3, 4}

2. a^M = 1

3. b^M = 2

4. f^M(x, y) = x + y if x + y ≤ 3, and = 3 otherwise.

5. R^M = {⟨1, 1⟩, ⟨1, 2⟩, ⟨2, 3⟩, ⟨2, 4⟩}

The function s( x ) = 1 that assigns 1 ∈ |M| to every variable is a variable


assignment for M.
Then

    Val^M_s(f(a, b)) = f^M(Val^M_s(a), Val^M_s(b)).

Since a and b are constant symbols, Val^M_s(a) = a^M = 1 and Val^M_s(b) = b^M = 2. So

    Val^M_s(f(a, b)) = f^M(1, 2) = 1 + 2 = 3.


To compute the value of f(f(a, b), a) we have to consider

    Val^M_s(f(f(a, b), a)) = f^M(Val^M_s(f(a, b)), Val^M_s(a)) = f^M(3, 1) = 3,

since 3 + 1 > 3. Since s(x) = 1 and Val^M_s(x) = s(x), we also have

    Val^M_s(f(f(a, b), x)) = f^M(Val^M_s(f(a, b)), Val^M_s(x)) = f^M(3, 1) = 3.

An atomic formula R(t1, t2) is satisfied if the tuple of values of its arguments,
i.e., ⟨Val^M_s(t1), Val^M_s(t2)⟩, is an element of R^M. So, e.g., we have
M, s ⊨ R(b, f(a, b)) since ⟨Val^M_s(b), Val^M_s(f(a, b))⟩ = ⟨2, 3⟩ ∈ R^M, but M, s ⊭
R(x, f(a, b)) since ⟨1, 3⟩ ∉ R^M.
To determine if a non-atomic formula φ is satisfied, you apply the clause
in the inductive definition that applies to its main connective. For instance,
the main connective in R( a, a) ⊃ ( R(b, x ) ∨ R( x, b)) is the ⊃, and

M, s ⊨ R( a, a) ⊃ ( R(b, x ) ∨ R( x, b)) iff


M, s ⊭ R( a, a) or M, s ⊨ R(b, x ) ∨ R( x, b)

Since M, s ⊨ R( a, a) (because ⟨1, 1⟩ ∈ RM ) we can’t yet determine the answer


and must first figure out if M, s ⊨ R(b, x ) ∨ R( x, b):

M, s ⊨ R(b, x ) ∨ R( x, b) iff
M, s ⊨ R(b, x ) or M, s ⊨ R( x, b)

And this is the case, since M, s ⊨ R( x, b) (because ⟨1, 2⟩ ∈ RM ).

Recall that an x-variant of s is a variable assignment that differs from s at


most in what it assigns to x. For every element of |M|, there is an x-variant
of s:

s1 = s[1/x ], s2 = s[2/x ],
s3 = s[3/x ], s4 = s[4/x ].

So, e.g., s2 ( x ) = 2 and s2 (y) = s(y) = 1 for all variables y other than x. These
are all the x-variants of s for the structure M, since |M| = {1, 2, 3, 4}. Note, in
particular, that s1 = s (s is always an x-variant of itself).
To determine if an existentially quantified formula ∃ x φ( x ) is satisfied, we
have to determine if M, s[m/x ] ⊨ φ( x ) for at least one m ∈ |M|. So,

M, s ⊨ ∃ x ( R(b, x ) ∨ R( x, b)),


since M, s[1/x ] ⊨ R(b, x ) ∨ R( x, b) (s[3/x ] would also fit the bill). But,

M, s ⊭ ∃ x ( R(b, x ) & R( x, b))

since, whichever m ∈ |M| we pick, M, s[m/x ] ⊭ R(b, x ) & R( x, b).


To determine if a universally quantified formula ∀ x φ( x ) is satisfied, we
have to determine if M, s[m/x ] ⊨ φ( x ) for all m ∈ |M|. So,

M, s ⊨ ∀ x ( R( x, a) ⊃ R( a, x )),

since M, s[m/x ] ⊨ R( x, a) ⊃ R( a, x ) for all m ∈ |M|. For m = 1, we have


M, s[1/x ] ⊨ R( a, x ) so the consequent is true; for m = 2, 3, and 4, we have
M, s[m/x ] ⊭ R( x, a), so the antecedent is false. But,

M, s ⊭ ∀ x ( R( a, x ) ⊃ R( x, a))

since M, s[2/x ] ⊭ R( a, x ) ⊃ R( x, a) (because M, s[2/x ] ⊨ R( a, x ) and M, s[2/x ] ⊭


R( x, a)).
For a more complicated case, consider

∀ x ( R( a, x ) ⊃ ∃y R( x, y)).

Since M, s[3/x ] ⊭ R( a, x ) and M, s[4/x ] ⊭ R( a, x ), the interesting cases where


we have to worry about the consequent of the conditional are only m = 1
and m = 2. Does M, s[1/x ] ⊨ ∃y R( x, y) hold? It does if there is at least one
n ∈ |M| so that M, s[1/x ][n/y] ⊨ R( x, y). In fact, if we take n = 1, we have
s[1/x ][n/y] = s[1/y] = s. Since s( x ) = 1, s(y) = 1, and ⟨1, 1⟩ ∈ RM , the
answer is yes.
To determine if M, s[2/x ] ⊨ ∃y R( x, y), we have to look at the variable as-
signments s[2/x ][n/y]. Here, for n = 1, this assignment is s2 = s[2/x ], which
does not satisfy R( x, y) (s2 ( x ) = 2, s2 (y) = 1, and ⟨2, 1⟩ ∈ / RM ). However,
consider s[2/x ][3/y] = s2 [3/y]. M, s2 [3/y] ⊨ R( x, y) since ⟨2, 3⟩ ∈ RM , and
so M, s2 ⊨ ∃y R( x, y).
So, for all m ∈ |M|, either M, s[m/x ] ⊭ R( a, x ) (if m = 3, 4) or M, s[m/x ] ⊨
∃y R( x, y) (if m = 1, 2), and so

M, s ⊨ ∀ x ( R( a, x ) ⊃ ∃y R( x, y)).

On the other hand,

M, s ⊭ ∃ x ( R( a, x ) & ∀y R( x, y)).

We have M, s[m/x ] ⊨ R( a, x ) only for m = 1 and m = 2. But for both


of these values of m, there is in turn an n ∈ |M|, namely n = 4, so that
M, s[m/x ][n/y] ⊭ R( x, y) and so M, s[m/x ] ⊭ ∀y R( x, y) for m = 1 and m = 2.
In sum, there is no m ∈ |M| such that M, s[m/x ] ⊨ R( a, x ) & ∀y R( x, y).
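
Purely as a check on the calculations in this example, the structure M and the assignment s can be written down in the hypothetical representation of the earlier sketches and the claims verified mechanically. This assumes the value, variant, and satisfies functions from those sketches; it is an illustration, not part of the text.

M = {
    'domain': {1, 2, 3, 4},
    'consts': {'a': 1, 'b': 2},
    'funcs':  {'f': lambda x, y: x + y if x + y <= 3 else 3},
    'preds':  {'R': {(1, 1), (1, 2), (2, 3), (2, 4)}},
}
s = {'x': 1, 'y': 1}                       # s assigns 1 to the variables we use

a, b, x = ('const', 'a'), ('const', 'b'), ('var', 'x')
f_ab = ('func', 'f', (a, b))

print(value(f_ab, M, s))                                        # 3
print(satisfies(M, s, ('pred', 'R', (b, f_ab))))                # True
print(satisfies(M, s, ('exists', 'x',
      ('or', ('pred', 'R', (b, x)), ('pred', 'R', (x, b))))))   # True
print(satisfies(M, s, ('forall', 'x',
      ('implies', ('pred', 'R', (a, x)), ('pred', 'R', (x, a))))))  # False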


7.5 Variable Assignments


A variable assignment s provides a value for every variable—and there are
infinitely many of them. This is of course not necessary. We require variable
assignments to assign values to all variables simply because it makes things a
lot easier. The value of a term t, and whether or not a formula φ is satisfied
in a structure with respect to s, only depend on the assignments s makes to
the variables in t and the free variables of φ. This is the content of the next
two propositions. To make the idea of “depends on” precise, we show that
any two variable assignments that agree on all the variables in t give the same
value, and that if two variable assignments agree on all the free variables of φ,
then φ is satisfied relative to one iff it is satisfied relative to the other.

Proposition 7.13. If the variables in a term t are among x1, . . . , xn, and s1(xi) =
s2(xi) for i = 1, . . . , n, then Val^M_{s1}(t) = Val^M_{s2}(t).

Proof. By induction on the complexity of t. For the base case, t can be a constant
symbol or one of the variables x1, . . . , xn. If t = c, then Val^M_{s1}(t) = c^M =
Val^M_{s2}(t). If t = xi, s1(xi) = s2(xi) by the hypothesis of the proposition, and so
Val^M_{s1}(t) = s1(xi) = s2(xi) = Val^M_{s2}(t).
For the inductive step, assume that t = f(t1, . . . , tk) and that the claim
holds for t1, . . . , tk. Then

    Val^M_{s1}(t) = Val^M_{s1}(f(t1, . . . , tk))
                  = f^M(Val^M_{s1}(t1), . . . , Val^M_{s1}(tk)).

For j = 1, . . . , k, the variables of tj are among x1, . . . , xn. By the induction
hypothesis, Val^M_{s1}(tj) = Val^M_{s2}(tj). So,

    Val^M_{s1}(t) = Val^M_{s1}(f(t1, . . . , tk))
                  = f^M(Val^M_{s1}(t1), . . . , Val^M_{s1}(tk))
                  = f^M(Val^M_{s2}(t1), . . . , Val^M_{s2}(tk))
                  = Val^M_{s2}(f(t1, . . . , tk)) = Val^M_{s2}(t).

Proposition 7.14. If the free variables in φ are among x1 , . . . , xn , and s1 ( xi ) =


s2 ( xi ) for i = 1, . . . , n, then M, s1 ⊨ φ iff M, s2 ⊨ φ.

Proof. We use induction on the complexity of φ. For the base case, where φ is
atomic, φ can be: ⊥, R(t1 , . . . , tk ) for a k-place predicate R and terms t1 , . . . , tk ,
or t1 = t2 for terms t1 and t2 . In the latter two cases, we only demonstrate
the forward direction of the biconditional, since the proof of the reverse is
symmetrical.


1. φ ≡ ⊥: both M, s1 ⊭ φ and M, s2 ⊭ φ.

2. φ ≡ R(t1, . . . , tk): let M, s1 ⊨ φ. Then

       ⟨Val^M_{s1}(t1), . . . , Val^M_{s1}(tk)⟩ ∈ R^M.

   For i = 1, . . . , k, Val^M_{s1}(ti) = Val^M_{s2}(ti) by Proposition 7.13. So we also
   have ⟨Val^M_{s2}(t1), . . . , Val^M_{s2}(tk)⟩ ∈ R^M, and hence M, s2 ⊨ φ.

3. φ ≡ t1 = t2: suppose M, s1 ⊨ φ. Then Val^M_{s1}(t1) = Val^M_{s1}(t2). So,

       Val^M_{s2}(t1) = Val^M_{s1}(t1)    (by Proposition 7.13)
                      = Val^M_{s1}(t2)    (since M, s1 ⊨ t1 = t2)
                      = Val^M_{s2}(t2)    (by Proposition 7.13),

   so M, s2 ⊨ t1 = t2.

Now assume M, s1 ⊨ ψ iff M, s2 ⊨ ψ for all formulae ψ less complex than φ.


The induction step proceeds by cases determined by the main operator of φ.
In each case, we only demonstrate the forward direction of the biconditional;
the proof of the reverse direction is symmetrical. In all cases except those for
the quantifiers, we apply the induction hypothesis to sub-formulae ψ of φ.
The free variables of ψ are among those of φ. Thus, if s1 and s2 agree on the
free variables of φ, they also agree on those of ψ, and the induction hypothesis
applies to ψ.

1. φ ≡ ∼ψ: if M, s1 ⊨ φ, then M, s1 ⊭ ψ, so by the induction hypothesis,


M, s2 ⊭ ψ, hence M, s2 ⊨ φ.

2. φ ≡ ψ & χ: if M, s1 ⊨ φ, then M, s1 ⊨ ψ and M, s1 ⊨ χ, so by induction


hypothesis, M, s2 ⊨ ψ and M, s2 ⊨ χ. Hence, M, s2 ⊨ φ.

3. φ ≡ ψ ∨ χ: if M, s1 ⊨ φ, then M, s1 ⊨ ψ or M, s1 ⊨ χ. By induction
hypothesis, M, s2 ⊨ ψ or M, s2 ⊨ χ, so M, s2 ⊨ φ.

4. φ ≡ ψ ⊃ χ: exercise.

5. φ ≡ ∃ x ψ: if M, s1 ⊨ φ, there is an m ∈ |M| so that M, s1 [m/x ] ⊨ ψ. Let


s1′ = s1 [m/x ] and s2′ = s2 [m/x ]. The free variables of ψ are among x1 ,
. . . , xn , and x. s1′ ( xi ) = s2′ ( xi ), since s1′ and s2′ are x-variants of s1 and s2 ,
respectively, and by hypothesis s1 ( xi ) = s2 ( xi ). s1′ ( x ) = s2′ ( x ) = m
by the way we have defined s1′ and s2′ . Then the induction hypothesis
applies to ψ and s1′ , s2′ , so M, s2′ ⊨ ψ. Hence, since s2′ = s2 [m/x ], there is
an m ∈ |M| such that M, s2 [m/x ] ⊨ ψ, and so M, s2 ⊨ φ.

6. φ ≡ ∀ x ψ: exercise.


By induction, we get that M, s1 ⊨ φ iff M, s2 ⊨ φ whenever the free variables


in φ are among x1 , . . . , xn and s1 ( xi ) = s2 ( xi ) for i = 1, . . . , n.

Sentences have no free variables, so any two variable assignments assign


the same things to all the (zero) free variables of any sentence. The proposition
just proved then means that whether or not a sentence is satisfied in a structure
relative to a variable assignment is completely independent of the assignment.
We’ll record this fact. It justifies the definition of satisfaction of a sentence in
a structure (without mentioning a variable assignment) that follows.

Corollary 7.15. If φ is a sentence and s a variable assignment, then M, s ⊨ φ iff


M, s′ ⊨ φ for every variable assignment s′ .

Proof. Let s′ be any variable assignment. Since φ is a sentence, it has no free


variables, and so every variable assignment s′ trivially assigns the same things
to all free variables of φ as does s. So the condition of Proposition 7.14 is
satisfied, and we have M, s ⊨ φ iff M, s′ ⊨ φ.

Definition 7.16. If φ is a sentence, we say that a structure M satisfies φ, M ⊨ φ,


iff M, s ⊨ φ for all variable assignments s.

If M ⊨ φ, we also simply say that φ is true in M. The notion of satisfaction


naturally extends from individual sentences to sets of sentences.

Definition 7.17. If Γ is a set of sentences, we say that a structure M satisfies Γ,


M ⊨ Γ, iff M ⊨ φ for all φ ∈ Γ.

Proposition 7.18. Let M be a structure, φ be a sentence, and s a variable assign-


ment. M ⊨ φ iff M, s ⊨ φ.

Proof. Exercise.

Proposition 7.19. Suppose φ( x ) only contains x free, and M is a structure. Then:

1. M ⊨ ∃ x φ( x ) iff M, s ⊨ φ( x ) for at least one variable assignment s.

2. M ⊨ ∀ x φ( x ) iff M, s ⊨ φ( x ) for all variable assignments s.

Proof. Exercise.


7.6 Extensionality
Extensionality, sometimes called relevance, can be expressed informally as fol-
lows: the only factors that bear upon the satisfaction of a formula φ in a struc-
ture M relative to a variable assignment s are the size of the domain and the
assignments made by M and s to the elements of the language that actually
appear in φ.
One immediate consequence of extensionality is that where two struc-
tures M and M′ agree on all the elements of the language appearing in a sen-
tence φ and have the same domain, M and M′ must also agree on whether or
not φ itself is true.

Proposition 7.20 (Extensionality). Let φ be a formula, and M1 and M2 be structures
with |M1| = |M2|, and s a variable assignment on |M1| = |M2|. If c^{M1} = c^{M2},
R^{M1} = R^{M2}, and f^{M1} = f^{M2} for every constant symbol c, relation symbol R,
and function symbol f occurring in φ, then M1, s ⊨ φ iff M2, s ⊨ φ.

Proof. First prove (by induction on t) that for every term, Val^{M1}_s(t) = Val^{M2}_s(t).
Then prove the proposition by induction on φ, making use of the claim just
proved for the induction basis (where φ is atomic).

Corollary 7.21 (Extensionality for Sentences). Let φ be a sentence and M1 , M2


as in Proposition 7.20. Then M1 ⊨ φ iff M2 ⊨ φ.

Proof. Follows from Proposition 7.20 by Corollary 7.15.

Moreover, the value of a term, and whether or not a structure satisfies


a formula, only depend on the values of its subterms.

Proposition 7.22. Let M be a structure, t and t′ terms, and s a variable assignment.
Then Val^M_s(t[t′/x]) = Val^M_{s[Val^M_s(t′)/x]}(t).

Proof. By induction on t.

1. If t is a constant, say, t ≡ c, then t[t′/x] = c, and Val^M_s(c) = c^M =
   Val^M_{s[Val^M_s(t′)/x]}(c).

2. If t is a variable other than x, say, t ≡ y, then t[t′/x] = y, and Val^M_s(y) =
   Val^M_{s[Val^M_s(t′)/x]}(y), since s ∼_x s[Val^M_s(t′)/x].

3. If t ≡ x, then t[t′/x] = t′. But Val^M_{s[Val^M_s(t′)/x]}(x) = Val^M_s(t′) by definition
   of s[Val^M_s(t′)/x].


4. If t ≡ f(t1, . . . , tn) then we have:

       Val^M_s(t[t′/x])
         = Val^M_s(f(t1[t′/x], . . . , tn[t′/x]))
               by definition of t[t′/x]
         = f^M(Val^M_s(t1[t′/x]), . . . , Val^M_s(tn[t′/x]))
               by definition of Val^M_s(f(. . .))
         = f^M(Val^M_{s[Val^M_s(t′)/x]}(t1), . . . , Val^M_{s[Val^M_s(t′)/x]}(tn))
               by induction hypothesis
         = Val^M_{s[Val^M_s(t′)/x]}(t)
               by definition of Val^M_{s[Val^M_s(t′)/x]}(f(. . .))

Proposition 7.23. Let M be a structure, φ a formula, t′ a term, and s a variable
assignment. Then M, s ⊨ φ[t′/x] iff M, s[Val^M_s(t′)/x] ⊨ φ.

Proof. Exercise.

The point of Propositions 7.22 and 7.23 is the following. Suppose we have
a term t or a formula φ and some term t′ , and we want to know the value
of t[t′ /x ] or whether or not φ[t′ /x ] is satisfied in a structure M relative to
a variable assignment s. Then we can either perform the substitution first
and then consider the value or satisfaction relative to M and s, or we can first
determine the value m = Val^M_s(t′) of t′ in M relative to s, change the variable
assignment to s[m/x ] and then consider the value of t in M and s[m/x ], or
whether M, s[m/x ] ⊨ φ. Propositions 7.22 and 7.23 guarantee that the answer
will be the same, whichever way we do it.
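
As a purely numerical illustration of Proposition 7.22 (a sketch, reusing the structure M, the assignment s, and the value and variant helpers from the earlier sketches around Example 7.12), one can check that both ways of computing a value agree:

# Let t be f(x, b) and t' be f(a, b); then t[t'/x] is f(f(a, b), b).
t_prime = ('func', 'f', (('const', 'a'), ('const', 'b')))
t       = ('func', 'f', (('var', 'x'), ('const', 'b')))
t_subst = ('func', 'f', (t_prime, ('const', 'b')))        # substitution done by hand

lhs = value(t_subst, M, s)                                 # value of t[t'/x] under s
rhs = value(t, M, variant(s, 'x', value(t_prime, M, s)))   # value of t under s[Val(t')/x]
print(lhs == rhs)                                          # True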

7.7 Semantic Notions


Given the definition of structures for first-order languages, we can define
some basic semantic properties of and relationships between sentences. The
simplest of these is the notion of validity of a sentence. A sentence is valid if
it is satisfied in every structure. Valid sentences are those that are satisfied re-
gardless of how the non-logical symbols in them are interpreted. Valid sentences
are therefore also called logical truths—they are true, i.e., satisfied, in any struc-
ture and hence their truth depends only on the logical symbols occurring in
them and their syntactic structure, but not on the non-logical symbols or their
interpretation.

Definition 7.24 (Validity). A sentence φ is valid, ⊨ φ, iff M ⊨ φ for every


structure M.


Definition 7.25 (Entailment). A set of sentences Γ entails a sentence φ, Γ ⊨ φ,


iff for every structure M with M ⊨ Γ, M ⊨ φ.

Definition 7.26 (Satisfiability). A set of sentences Γ is satisfiable if M ⊨ Γ for


some structure M. If Γ is not satisfiable it is called unsatisfiable.

Proposition 7.27. A sentence φ is valid iff Γ ⊨ φ for every set of sentences Γ.

Proof. For the forward direction, let φ be valid, and let Γ be a set of sentences.
Let M be a structure so that M ⊨ Γ. Since φ is valid, M ⊨ φ, hence Γ ⊨ φ.
For the contrapositive of the reverse direction, let φ be invalid, so there is
a structure M with M ⊭ φ. When Γ = {⊤}, since ⊤ is valid, M ⊨ Γ. Hence,
there is a structure M so that M ⊨ Γ but M ⊭ φ, hence Γ does not entail φ.

Proposition 7.28. Γ ⊨ φ iff Γ ∪ {∼ φ} is unsatisfiable.

Proof. For the forward direction, suppose Γ ⊨ φ and suppose to the contrary
that there is a structure M so that M ⊨ Γ ∪ {∼ φ}. Since M ⊨ Γ and Γ ⊨ φ,
M ⊨ φ. Also, since M ⊨ Γ ∪ {∼ φ}, M ⊨ ∼ φ, so we have both M ⊨ φ and
M ⊭ φ, a contradiction. Hence, there can be no such structure M, so Γ ∪ {∼ φ}
is unsatisfiable.
For the reverse direction, suppose Γ ∪ {∼ φ} is unsatisfiable. So for every
structure M, either M ⊭ Γ or M ⊨ φ. Hence, for every structure M with M ⊨ Γ,
M ⊨ φ, so Γ ⊨ φ.

Proposition 7.29. If Γ ⊆ Γ′ and Γ ⊨ φ, then Γ′ ⊨ φ.

Proof. Suppose that Γ ⊆ Γ′ and Γ ⊨ φ. Let M be a structure such that M ⊨ Γ′ ;


then M ⊨ Γ, and since Γ ⊨ φ, we get that M ⊨ φ. Hence, whenever M ⊨ Γ′ ,
M ⊨ φ, so Γ′ ⊨ φ.

Theorem 7.30 (Semantic Deduction Theorem). Γ ∪ { φ} ⊨ ψ iff Γ ⊨ φ ⊃ ψ.

Proof. For the forward direction, let Γ ∪ { φ} ⊨ ψ and let M be a structure so


that M ⊨ Γ. If M ⊨ φ, then M ⊨ Γ ∪ { φ}, so since Γ ∪ { φ} entails ψ, we get
M ⊨ ψ. Therefore, M ⊨ φ ⊃ ψ, so Γ ⊨ φ ⊃ ψ.
For the reverse direction, let Γ ⊨ φ ⊃ ψ and M be a structure so that
M ⊨ Γ ∪ { φ}. Then M ⊨ Γ, so M ⊨ φ ⊃ ψ, and since M ⊨ φ, M ⊨ ψ. Hence,
whenever M ⊨ Γ ∪ { φ}, M ⊨ ψ, so Γ ∪ { φ} ⊨ ψ.

Proposition 7.31. Let M be a structure, and φ( x ) a formula with one free variable x,
and t a closed term. Then:

1. φ(t) ⊨ ∃ x φ( x )

2. ∀ x φ( x ) ⊨ φ(t)


Proof. 1. Suppose M ⊨ φ(t). Let s be a variable assignment with s( x ) =


ValM (t). Then M, s ⊨ φ(t) since φ(t) is a sentence. By Proposition 7.23,
M, s ⊨ φ( x ). By Proposition 7.19, M ⊨ ∃ x φ( x ).

2. Exercise.

Chapter 8

Theories and Their Models

8.1 Introduction
The development of the axiomatic method is a significant achievement in the
history of science, and is of special importance in the history of mathemat-
ics. An axiomatic development of a field involves the clarification of many
questions: What is the field about? What are the most fundamental concepts?
How are they related? Can all the concepts of the field be defined in terms of
these fundamental concepts? What laws do, and must, these concepts obey?
The axiomatic method and logic were made for each other. Formal logic
provides the tools for formulating axiomatic theories, for proving theorems
from the axioms of the theory in a precisely specified way, and for studying the
properties of all systems satisfying the axioms in a systematic way.

Definition 8.1. A set of sentences Γ is closed iff, whenever Γ ⊨ φ then φ ∈ Γ.


The closure of a set of sentences Γ is { φ | Γ ⊨ φ}.
We say that Γ is axiomatized by a set of sentences ∆ if Γ is the closure of ∆.

We can think of an axiomatic theory as the set of sentences that is axiom-


atized by its set of axioms ∆. In other words, when we have a first-order lan-
guage which contains non-logical symbols for the primitives of the axiomat-
ically developed science we wish to study, together with a set of sentences
that express the fundamental laws of the science, we can think of the theory
as represented by all the sentences in this language that are entailed by the
axioms. This ranges from simple examples with only a single primitive and
simple axioms, such as the theory of partial orders, to complex theories such
as Newtonian mechanics.
The important logical facts that make this formal approach to the axiomatic
method so fruitful are the following. Suppose Γ is an axiom system for a
theory, i.e., a set of sentences.


1. We can state precisely when an axiom system captures an intended class


of structures. That is, if we are interested in a certain class of structures,
we will successfully capture that class by an axiom system Γ iff the struc-
tures are exactly those M such that M ⊨ Γ.

2. We may fail in this respect because there are M such that M ⊨ Γ, but M
is not one of the structures we intend. This may lead us to add axioms
which are not true in M.

3. If we are successful at least in the respect that Γ is true in all the intended
structures, then a sentence φ is true in all intended structures whenever
Γ ⊨ φ. Thus we can use logical tools (such as derivation methods) to
show that sentences are true in all intended structures simply by show-
ing that they are entailed by the axioms.

4. Sometimes we don’t have intended structures in mind, but instead start


from the axioms themselves: we begin with some primitives that we
want to satisfy certain laws which we codify in an axiom system. One
thing that we would like to verify right away is that the axioms do not
contradict each other: if they do, there can be no concepts that obey
these laws, and we have tried to set up an incoherent theory. We can
verify that this doesn’t happen by finding a model of Γ. And if there are
models of our theory, we can use logical methods to investigate them,
and we can also use logical methods to construct models.

5. The independence of the axioms is likewise an important question. It


may happen that one of the axioms is actually a consequence of the oth-
ers, and so is redundant. We can prove that an axiom φ in Γ is redundant
by proving Γ \ { φ} ⊨ φ. We can also prove that an axiom is not redun-
dant by showing that (Γ \ { φ}) ∪ {∼ φ} is satisfiable. For instance, this is
how it was shown that the parallel postulate is independent of the other
axioms of geometry.

6. Another important question is that of definability of concepts in a the-


ory: The choice of the language determines what the models of a theory
consist of. But not every aspect of a theory must be represented sepa-
rately in its models. For instance, every ordering ≤ determines a cor-
responding strict ordering <—given one, we can define the other. So it
is not necessary that a model of a theory involving such an order must
also contain the corresponding strict ordering. When is it the case, in
general, that one relation can be defined in terms of others? When is it
impossible to define a relation in terms of others (and hence we must add it
to the primitives of the language)?


8.2 Expressing Properties of Structures


It is often useful and important to express conditions on functions and rela-
tions, or more generally, to express that the functions and relations in a structure
satisfy certain conditions. For instance, we would like to have ways of distinguishing
those structures for a language which “capture” what we want the predicate
symbols to “mean” from those that do not. Of course we’re completely free
to specify which structures we “intend,” e.g., we can specify that the inter-
pretation of the predicate symbol ≤ must be an ordering, or that we are only
interested in interpretations of L in which the domain consists of sets and ∈
is interpreted by the “is an element of” relation. But can we do this with sen-
tences of the language? In other words, which conditions on a structure M can
we express by a sentence (or perhaps a set of sentences) in the language of M?
There are some conditions that we will not be able to express. For instance,
there is no sentence of L A which is only true in a structure M if |M| = N.
We cannot express “the domain contains only natural numbers.” But there
are “structural properties” of structures that we perhaps can express. Which
properties of structures can we express by sentences? Or, to put it another
way, which collections of structures can we describe as those making a sen-
tence (or set of sentences) true?
Definition 8.2 (Model of a set). Let Γ be a set of sentences in a language L.
We say that a structure M is a model of Γ if M ⊨ φ for all φ ∈ Γ.

Example 8.3. The sentence ∀ x x ≤ x is true in M iff ≤M is a reflexive relation.


The sentence ∀ x ∀y (( x ≤ y & y ≤ x ) ⊃ x = y) is true in M iff ≤M is anti-
symmetric. The sentence ∀ x ∀y ∀z (( x ≤ y & y ≤ z) ⊃ x ≤ z) is true in M iff
≤M is transitive. Thus, the models of
{ ∀ x x ≤ x,
∀ x ∀y (( x ≤ y & y ≤ x ) ⊃ x = y),
∀ x ∀y ∀z (( x ≤ y & y ≤ z) ⊃ x ≤ z) }
are exactly those structures in which ≤M is reflexive, anti-symmetric, and
transitive, i.e., a partial order. Hence, we can take them as axioms for the
first-order theory of partial orders.
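
To make the connection between axioms and models concrete, here is a small brute-force sketch (not part of the text): given a finite domain and a set of pairs interpreting ≤, it checks whether the resulting structure is a model of the three axioms above.

def is_partial_order(domain, leq):
    # leq is the set of pairs ⟨x, y⟩ interpreting ≤ on the finite domain.
    reflexive     = all((x, x) in leq for x in domain)
    antisymmetric = all(not ((x, y) in leq and (y, x) in leq) or x == y
                        for x in domain for y in domain)
    transitive    = all(not ((x, y) in leq and (y, z) in leq) or (x, z) in leq
                        for x in domain for y in domain for z in domain)
    return reflexive and antisymmetric and transitive

# The divisibility ordering on {1, 2, 3, 4, 6} is a partial order:
D = {1, 2, 3, 4, 6}
divides = {(x, y) for x in D for y in D if y % x == 0}
print(is_partial_order(D, divides))   # True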

8.3 Examples of First-Order Theories


Example 8.4. The theory of strict linear orders in the language L< is axioma-
tized by the set
{ ∀ x ∼ x < x,
∀ x ∀y (( x < y ∨ y < x ) ∨ x = y),
∀ x ∀y ∀z (( x < y & y < z) ⊃ x < z) }


It completely captures the intended structures: every strict linear order is a


model of this axiom system, and vice versa, if R is a strict linear order on a set X,
then the structure M with |M| = X and <^M = R is a model of this theory.

Example 8.5. The theory of groups in the language 1 (constant symbol), ·


(two-place function symbol) is axiomatized by
∀ x ( x · 1) = x
∀ x ∀y ∀z ( x · (y · z)) = (( x · y) · z)
∀ x ∃y ( x · y) = 1
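
Again, for a finite candidate structure the axioms can be checked by brute force. The following sketch (illustrative only) verifies that the integers modulo 3, with addition as the interpretation of · and 0 as the interpretation of 1, form a model of the three group axioms:

def is_group_model(domain, op, e):
    # Check the axioms: identity element, associativity, existence of inverses.
    identity = all(op(x, e) == x for x in domain)
    assoc    = all(op(x, op(y, z)) == op(op(x, y), z)
                   for x in domain for y in domain for z in domain)
    inverses = all(any(op(x, y) == e for y in domain) for x in domain)
    return identity and assoc and inverses

Z3 = {0, 1, 2}
print(is_group_model(Z3, lambda x, y: (x + y) % 3, 0))   # True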

Example 8.6. The theory of Peano arithmetic is axiomatized by the following


sentences in the language of arithmetic L A .
∀ x ∀y ( x ′ = y′ ⊃ x = y)
∀ x 0 ≠ x′
∀ x ( x + 0) = x
∀ x ∀y ( x + y′ ) = ( x + y)′
∀ x ( x × 0) = 0
∀ x ∀y ( x × y′ ) = (( x × y) + x )
∀ x ∀y ( x < y ≡ ∃z (z′ + x ) = y)

plus all sentences of the form

( φ(0) & ∀ x ( φ( x ) ⊃ φ( x ′ ))) ⊃ ∀ x φ( x )


Since there are infinitely many sentences of the latter form, this axiom sys-
tem is infinite. The latter form is called the induction schema. (Actually, the
induction schema is a bit more complicated than we let on here.)
The last axiom is an explicit definition of <.

Example 8.7. The theory of pure sets plays an important role in the founda-
tions (and in the philosophy) of mathematics. A set is pure if all its elements
are also pure sets. The empty set counts therefore as pure, but a set that has
something as an element that is not a set would not be pure. So the pure sets
are those that are formed just from the empty set and no “urelements,” i.e.,
objects that are not themselves sets.
The following might be considered as an axiom system for a theory of pure
sets:
∃ x ∼∃y y ∈ x
∀ x ∀y (∀z(z ∈ x ≡ z ∈ y) ⊃ x = y)
∀ x ∀y ∃z ∀u (u ∈ z ≡ (u = x ∨ u = y))
∀ x ∃y ∀z (z ∈ y ≡ ∃u (z ∈ u & u ∈ x ))


plus all sentences of the form

∃ x ∀y (y ∈ x ≡ φ(y))

The first axiom says that there is a set with no elements (i.e., ∅ exists); the
second says that sets are extensional; the third that for any sets X and Y, the
set { X, Y } exists; the fourth that for any set X, the set ∪ X exists, where ∪ X is
the union of all the elements of X.
The sentences mentioned last are collectively called the naive comprehension
scheme. It essentially says that for every φ( x ), the set { x | φ( x )} exists—so
at first glance a true, useful, and perhaps even necessary axiom. It is called
“naive” because, as it turns out, it makes this theory unsatisfiable: if you take
φ(y) to be ∼y ∈ y, you get the sentence

∃ x ∀y (y ∈ x ≡ ∼y ∈ y)

and this sentence is not satisfied in any structure.

Example 8.8. In the area of mereology, the relation of parthood is a fundamental


relation. Just like theories of sets, there are theories of parthood that axioma-
tize various conceptions (sometimes conflicting) of this relation.
The language of mereology contains a single two-place predicate sym-
bol P , and P ( x, y) “means” that x is a part of y. When we have this inter-
pretation in mind, a structure for this language is called a parthood structure.
Of course, not every structure for a single two-place predicate will really de-
serve this name. To have a chance of capturing “parthood,” P M must satisfy
some conditions, which we can lay down as axioms for a theory of parthood.
For instance, parthood is a partial order on objects: every object is a part (al-
beit an improper part) of itself; no two different objects can be parts of each
other; a part of a part of an object is itself part of that object. Note that in this
sense “is a part of” resembles “is a subset of,” but does not resemble “is an
element of” which is neither reflexive nor transitive.

∀ x P ( x, x )
∀ x ∀y ((P ( x, y) & P (y, x )) ⊃ x = y)
∀ x ∀y ∀z ((P ( x, y) & P (y, z)) ⊃ P ( x, z))

Moreover, any two objects have a mereological sum (an object that has these
two objects as parts, and is minimal in this respect).

∀ x ∀y ∃z ∀u (P (z, u) ≡ (P ( x, u) & P (y, u)))

These are only some of the basic principles of parthood considered by meta-
physicians. Further principles, however, quickly become hard to formulate or
write down without first introducing some defined relations. For instance,


most metaphysicians interested in mereology also view the following as a


valid principle: whenever an object x has a proper part y, it also has a part z
that has no parts in common with y, and so that the fusion of y and z is x.

8.4 Expressing Relations in a Structure


One main use formulae can be put to is to express properties and relations in
a structure M in terms of the primitives of the language L of M. By this we
mean the following: the domain of M is a set of objects. The constant symbols,
function symbols, and predicate symbols are interpreted in M by some objects
in |M|, functions on |M|, and relations on |M|. For instance, if A^2_0 is in L, then
M assigns to it a relation R = (A^2_0)^M. Then the formula A^2_0(v1, v2) expresses that
very relation, in the following sense: if a variable assignment s maps v1 to
a ∈ |M| and v2 to b ∈ |M|, then

    Rab iff M, s ⊨ A^2_0(v1, v2).

Note that we have to involve variable assignments here: we can’t just say “Rab
iff M ⊨ A^2_0(a, b)” because a and b are not symbols of our language: they are
elements of |M|.
Since we don’t just have atomic formulae, but can combine them using the
logical connectives and the quantifiers, more complex formulae can define
other relations which aren’t directly built into M. We’re interested in how to
do that, and specifically, which relations we can define in a structure.

Definition 8.9. Let φ(v1 , . . . , vn ) be a formula of L in which only v1 ,. . . , vn


occur free, and let M be a structure for L. φ(v1 , . . . , vn ) expresses the relation R ⊆
|M|^n iff
Ra1 . . . an iff M, s ⊨ φ(v1 , . . . , vn )
for any variable assignment s with s(vi ) = ai (i = 1, . . . , n).

Example 8.10. In the standard model of arithmetic N, the formula v1 < v2 ∨


v1 = v2 expresses the ≤ relation on N. The formula v2 = v1′ expresses the suc-
cessor relation, i.e., the relation R ⊆ N^2 where Rnm holds if m is the successor
of n. The formula v1 = v2′ expresses the predecessor relation. The formulae
∃v3 (v3 ≠ 0 & v2 = (v1 + v3)) and ∃v3 (v1 + v3′) = v2 both express the < re-
lation. This means that the predicate symbol < is actually superfluous in the
language of arithmetic; it can be defined.
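
As a small illustration (a sketch under the simplifying assumption that we only look at a finite initial segment of N, since a brute-force comparison cannot range over all natural numbers), the relation expressed by ∃v3 (v1 + v3′) = v2 can be compared directly with <:

N = range(20)   # a finite initial segment of the natural numbers, for illustration

# The relation expressed by ∃v3 (v1 + v3′) = v2, where ′ is successor:
expressed = {(a, b) for a in N for b in N
             if any(a + (c + 1) == b for c in N)}

# The relation < itself, restricted to the same segment:
less_than = {(a, b) for a in N for b in N if a < b}

print(expressed == less_than)   # True: the formula defines < from + and ′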

This idea is not just interesting in specific structures, but generally when-
ever we use a language to describe an intended model or models, i.e., when
we consider theories. These theories often contain only a few predicate sym-
bols as basic symbols, but in the domains they are used to describe, many


other relations play an important role. If these other relations can be system-
atically expressed by the relations that interpret the basic predicate symbols
of the language, we say we can define them in the language.

8.5 The Theory of Sets


Almost all of mathematics can be developed in the theory of sets. Developing
mathematics in this theory involves a number of things. First, it requires a set
of axioms for the relation ∈. A number of different axiom systems have been
developed, sometimes with conflicting properties of ∈. The axiom system
known as ZFC, Zermelo–Fraenkel set theory with the axiom of choice, stands
out: it is by far the most widely used and studied, because it turns out that its
axioms suffice to prove almost all the things mathematicians expect to be able
to prove. But before that can be established, it first is necessary to make clear
how we can even express all the things mathematicians would like to express.
For starters, the language contains no constant symbols or function symbols,
so it seems at first glance unclear that we can talk about particular sets (such as
∅ or N), can talk about operations on sets (such as X ∪ Y and ℘( X )), let alone
other constructions which involve things other than sets, such as relations and
functions.
To begin with, “is an element of” is not the only relation we are interested
in: “is a subset of” seems almost as important. But we can define “is a subset
of” in terms of “is an element of.” To do this, we have to find a formula φ( x, y)
in the language of set theory which is satisfied by a pair of sets ⟨ X, Y ⟩ iff X ⊆
Y. But X is a subset of Y just in case all elements of X are also elements of Y.
So we can define ⊆ by the formula

∀z (z ∈ x ⊃ z ∈ y)

Now, whenever we want to use the relation ⊆ in a formula, we could instead


use that formula (with x and y suitably replaced, and the bound variable z
renamed if necessary). For instance, extensionality of sets means that if any
sets x and y are contained in each other, then x and y must be the same set.
This can be expressed by ∀ x ∀y (( x ⊆ y & y ⊆ x ) ⊃ x = y), or, if we replace ⊆
by the above definition, by

∀ x ∀y ((∀z (z ∈ x ⊃ z ∈ y) & ∀z (z ∈ y ⊃ z ∈ x )) ⊃ x = y).

This is in fact one of the axioms of ZFC, the “axiom of extensionality.”


There is no constant symbol for ∅, but we can express “x is empty” by
∼∃y y ∈ x. Then “∅ exists” becomes the sentence ∃ x ∼∃y y ∈ x. This is an-
other axiom of ZFC. (Note that the axiom of extensionality implies that there
is only one empty set.) Whenever we want to talk about ∅ in the language of
set theory, we would write this as “there is a set that’s empty and . . . ” As an


example, to express the fact that ∅ is a subset of every set, we could write

∃ x (∼∃y y ∈ x & ∀z x ⊆ z)

where, of course, x ⊆ z would in turn have to be replaced by its definition.


To talk about operations on sets, such as X ∪ Y and ℘( X ), we have to use a
similar trick. There are no function symbols in the language of set theory, but
we can express the functional relations X ∪ Y = Z and ℘( X ) = Y by

∀u ((u ∈ x ∨ u ∈ y) ≡ u ∈ z)
∀u (u ⊆ x ≡ u ∈ y)

since the elements of X ∪ Y are exactly the sets that are either elements of X or
elements of Y, and the elements of ℘( X ) are exactly the subsets of X. However,
this doesn’t allow us to use x ∪ y or ℘( x ) as if they were terms: we can only
use the entire formulae that define the relations X ∪ Y = Z and ℘( X ) = Y.
In fact, we do not know that these relations are ever satisfied, i.e., we do not
know that unions and power sets always exist. For instance, the sentence
∀ x ∃y ℘( x ) = y is another axiom of ZFC (the power set axiom).
Now what about talk of ordered pairs or functions? Here we have to ex-
plain how we can think of ordered pairs and functions as special kinds of sets.
One way to define the ordered pair ⟨ x, y⟩ is as the set {{ x }, { x, y}}. But like
before, we cannot introduce a function symbol that names this set; we can
only define the relation ⟨ x, y⟩ = z, i.e., {{ x }, { x, y}} = z:

∀u (u ∈ z ≡ (∀v (v ∈ u ≡ v = x ) ∨ ∀v (v ∈ u ≡ (v = x ∨ v = y))))

This says that the elements u of z are exactly those sets which either have x
as their only element or have x and y as their only elements (in other words, those
sets that are either identical to { x } or identical to { x, y}). Once we have this,
we can say further things, e.g., that X × Y = Z:

∀z (z ∈ Z ≡ ∃ x ∃y ( x ∈ X & y ∈ Y & ⟨ x, y⟩ = z))
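
The coding of pairs as {{x}, {x, y}} can be tried out directly with Python's frozensets (an illustrative sketch, not part of the text); what matters is that two coded pairs are equal exactly when their first and second components agree:

def kpair(x, y):
    # The Kuratowski pair ⟨x, y⟩ = {{x}, {x, y}}.
    return frozenset({frozenset({x}), frozenset({x, y})})

def product(X, Y):
    # X × Y as the set of coded pairs ⟨x, y⟩ with x ∈ X and y ∈ Y.
    return {kpair(x, y) for x in X for y in Y}

print(kpair(1, 2) == kpair(1, 2))    # True
print(kpair(1, 2) == kpair(2, 1))    # False: order matters
print(kpair(1, 1))                   # frozenset({frozenset({1})}) codes ⟨1, 1⟩
print(len(product({1, 2}, {3, 4})))  # 4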

A function f : X → Y can be thought of as the relation f ( x ) = y, i.e., as


the set of pairs {⟨ x, y⟩ | f ( x ) = y}. We can then say that a set f is a function
from X to Y if (a) it is a relation ⊆ X × Y, (b) it is total, i.e., for all x ∈ X
there is some y ∈ Y such that ⟨ x, y⟩ ∈ f and (c) it is functional, i.e., whenever
⟨ x, y⟩, ⟨ x, y′ ⟩ ∈ f , y = y′ (because values of functions must be unique). So “ f
is a function from X to Y” can be written as:

∀u (u ∈ f ⊃ ∃ x ∃y ( x ∈ X & y ∈ Y & ⟨ x, y⟩ = u)) &


∀ x ( x ∈ X ⊃ (∃y (y ∈ Y & maps( f , x, y)) &
(∀y ∀y′ ((maps( f , x, y) & maps( f , x, y′ )) ⊃ y = y′ ))))


where maps( f , x, y) abbreviates ∃v (v ∈ f & ⟨ x, y⟩ = v) (this formula ex-


presses “ f ( x ) = y”).
It is now also not hard to express that f : X → Y is injective, for instance:

f : X → Y & ∀ x ∀ x ′ (( x ∈ X & x ′ ∈ X &


∃y (maps( f , x, y) & maps( f , x ′ , y))) ⊃ x = x ′ )
A function f : X → Y is injective iff, whenever f maps x, x ′ ∈ X to a single y,
x = x ′ . If we abbreviate this formula as inj( f , X, Y ), we’re already in a position
to state in the language of set theory something as non-trivial as Cantor’s
theorem: there is no injective function from ℘( X ) to X:
∀ X ∀Y (℘( X ) = Y ⊃ ∼∃ f inj( f , Y, X ))
One might think that set theory requires another axiom that guarantees
the existence of a set for every defining property. If φ( x ) is a formula of set
theory with the variable x free, we can consider the sentence
∃y ∀ x ( x ∈ y ≡ φ( x )).
This sentence states that there is a set y whose elements are all and only those
x that satisfy φ( x ). This schema is called the “comprehension principle.” It
looks very useful; unfortunately it is inconsistent. Take φ( x ) ≡ ∼ x ∈ x, then
the comprehension principle states
∃y ∀ x ( x ∈ y ≡ x ∉ x),
i.e., it states the existence of a set of all sets that are not elements of them-
selves. No such set can exist—this is Russell’s Paradox. ZFC, in fact, contains
a restricted—and consistent—version of this principle, the separation princi-
ple:
∀z ∃y ∀ x ( x ∈ y ≡ ( x ∈ z & φ( x ))).

8.6 Expressing the Size of Structures


There are some properties of structures we can express even without using
the non-logical symbols of a language. For instance, there are sentences which
are true in a structure iff the domain of the structure has at least, at most, or
exactly a certain number n of elements.
Proposition 8.11. The sentence

φ≥n ≡ ∃x1 ∃x2 . . . ∃xn
    (x1 ≠ x2 & x1 ≠ x3 & x1 ≠ x4 & · · · & x1 ≠ xn &
     x2 ≠ x3 & x2 ≠ x4 & · · · & x2 ≠ xn &
     . . .
     xn−1 ≠ xn)


is true in a structure M iff |M| contains at least n elements. Consequently, M ⊨


∼ φ≥n+1 iff |M| contains at most n elements.
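
Such sentences can also be generated mechanically. The following sketch (illustrative only, and assuming n ≥ 2) builds φ≥n as a string in the notation of this chapter:

def phi_geq(n):
    # Build the sentence φ≥n: "there are at least n elements" (assumes n >= 2).
    xs = ['x' + str(i) for i in range(1, n + 1)]
    quantifiers = ' '.join('∃' + x for x in xs)
    disequalities = [xs[i] + ' ≠ ' + xs[j]
                     for i in range(n) for j in range(i + 1, n)]
    return quantifiers + ' (' + ' & '.join(disequalities) + ')'

print(phi_geq(3))
# ∃x1 ∃x2 ∃x3 (x1 ≠ x2 & x1 ≠ x3 & x2 ≠ x3)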

Proposition 8.12. The sentence

φ=n ≡ ∃x1 ∃x2 . . . ∃xn
    (x1 ≠ x2 & x1 ≠ x3 & x1 ≠ x4 & · · · & x1 ≠ xn &
     x2 ≠ x3 & x2 ≠ x4 & · · · & x2 ≠ xn &
     . . .
     xn−1 ≠ xn &
     ∀y (y = x1 ∨ · · · ∨ y = xn))

is true in a structure M iff |M| contains exactly n elements.

Proposition 8.13. A structure is infinite iff it is a model of

{ φ ≥1 , φ ≥2 , φ ≥3 , . . . } .

There is no single purely logical sentence which is true in M iff |M| is


infinite. However, one can give sentences with non-logical predicate symbols
which only have infinite models (although not every infinite structure is a
model of them). The property of being a finite structure and the property of
being an uncountable structure cannot even be expressed with an infinite set of
sentences. These facts follow from the compactness and Löwenheim–Skolem
theorems.

Chapter 9

Natural Deduction

9.1 Introduction
Logics commonly have both a semantics and a derivation system. The seman-
tics concerns concepts such as truth, satisfiability, validity, and entailment.
The purpose of derivation systems is to provide a purely syntactic method
of establishing entailment and validity. They are purely syntactic in the sense
that a derivation in such a system is a finite syntactic object, usually a sequence
(or other finite arrangement) of sentences or formulae. Good derivation sys-
tems have the property that any given sequence or arrangement of sentences
or formulae can be verified mechanically to be “correct.”
The simplest (and historically first) derivation systems for first-order logic
were axiomatic. A sequence of formulae counts as a derivation in such a sys-
tem if each individual formula in it is either among a fixed set of “axioms”
or follows from formulae coming before it in the sequence by one of a fixed
number of “inference rules”—and it can be mechanically verified if a formula
is an axiom and whether it follows correctly from other formulae by one of the
inference rules. Axiomatic derivation systems are easy to describe—and also
easy to handle meta-theoretically—but derivations in them are hard to read
and understand, and are also hard to produce.
Other derivation systems have been developed with the aim of making it
easier to construct derivations or easier to understand derivations once they
are complete. Examples are natural deduction, truth trees, also known as
tableaux proofs, and the sequent calculus. Some derivation systems are de-
signed especially with mechanization in mind, e.g., the resolution method is
easy to implement in software (but its derivations are essentially impossible
to understand). Most of these other derivation systems represent derivations
as trees of formulae rather than sequences. This makes it easier to see which
parts of a derivation depend on which other parts.
So for a given logic, such as first-order logic, the different derivation sys-
tems will give different explications of what it is for a sentence to be a theorem


and what it means for a sentence to be derivable from some others. However
that is done (via axiomatic derivations, natural deductions, sequent deriva-
tions, truth trees, resolution refutations), we want these relations to match the
semantic notions of validity and entailment. Let’s write ⊢ φ for “φ is a theo-
rem” and “Γ ⊢ φ” for “φ is derivable from Γ.” However ⊢ is defined, we want
it to match up with ⊨, that is:
1. ⊢ φ if and only if ⊨ φ
2. Γ ⊢ φ if and only if Γ ⊨ φ
The “only if” direction of the above is called soundness. A derivation system is
sound if derivability guarantees entailment (or validity). Every decent deriva-
tion system has to be sound; unsound derivation systems are not useful at all.
After all, the entire purpose of a derivation is to provide a syntactic guarantee
of validity or entailment. We’ll prove soundness for the derivation systems
we present.
The converse “if” direction is also important: it is called completeness. A
complete derivation system is strong enough to show that φ is a theorem
whenever φ is valid, and that Γ ⊢ φ whenever Γ ⊨ φ. Completeness is harder
to establish, and some logics have no complete derivation systems. First-order
logic does. Kurt Gödel was the first one to prove completeness for a derivation
system of first-order logic in his 1929 dissertation.
Another concept that is connected to derivation systems is that of consis-
tency. A set of sentences is called inconsistent if anything whatsoever can be
derived from it, and consistent otherwise. Inconsistency is the syntactic coun-
terpart to unsatisfiablity: like unsatisfiable sets, inconsistent sets of sentences
do not make good theories, they are defective in a fundamental way. Con-
sistent sets of sentences may not be true or useful, but at least they pass that
minimal threshold of logical usefulness. For different derivation systems the
specific definition of consistency of sets of sentences might differ, but like ⊢,
we want consistency to coincide with its semantic counterpart, satisfiability.
We want it to always be the case that Γ is consistent if and only if it is satis-
fiable. Here, the “only if” direction amounts to completeness (consistency
guarantees satisfiability), and the “if” direction amounts to soundness
(satisfiability guarantees consistency). In fact, for classical first-order logic, the two
versions of soundness and completeness are equivalent.

9.2 Natural Deduction


Natural deduction is a derivation system intended to mirror actual reasoning
(especially the kind of regimented reasoning employed by mathematicians).
Actual reasoning proceeds by a number of “natural” patterns. For instance,
proof by cases allows us to establish a conclusion on the basis of a disjunc-
tive premise, by establishing that the conclusion follows from either of the


disjuncts. Indirect proof allows us to establish a conclusion by showing that


its negation leads to a contradiction. Conditional proof establishes a condi-
tional claim “if . . . then . . . ” by showing that the consequent follows from
the antecedent. Natural deduction is a formalization of some of these nat-
ural inferences. Each of the logical connectives and quantifiers comes with
two rules, an introduction and an elimination rule, and they each correspond
to one such natural inference pattern. For instance, ⊃Intro corresponds to
conditional proof, and ∨Elim to proof by cases. A particularly simple rule is
&Elim which allows the inference from φ & ψ to φ (or ψ).
One feature that distinguishes natural deduction from other derivation
systems is its use of assumptions. A derivation in natural deduction is a tree
of formulae. A single formula stands at the root of the tree of formulae, and
the “leaves” of the tree are formulae from which the conclusion is derived.
In natural deduction, some leaf formulae play a role inside the derivation but
are “used up” by the time the derivation reaches the conclusion. This corre-
sponds to the practice, in actual reasoning, of introducing hypotheses which
only remain in effect for a short while. For instance, in a proof by cases, we
assume the truth of each of the disjuncts; in conditional proof, we assume the
truth of the antecedent; in indirect proof, we assume the truth of the nega-
tion of the conclusion. This way of introducing hypothetical assumptions
and then doing away with them in the service of establishing an intermedi-
ate step is a hallmark of natural deduction. The formulas at the leaves of a
natural deduction derivation are called assumptions, and some of the rules of
inference may “discharge” them. For instance, if we have a derivation of ψ
from some assumptions which include φ, then the ⊃Intro rule allows us to
infer φ ⊃ ψ and discharge any assumption of the form φ. (To keep track of
which assumptions are discharged at which inferences, we label the inference
and the assumptions it discharges with a number.) The assumptions that re-
main undischarged at the end of the derivation are together sufficient for the
truth of the conclusion, and so a derivation establishes that its undischarged
assumptions entail its conclusion.
The relation Γ ⊢ φ based on natural deduction holds iff there is a deriva-
tion in which φ is the last sentence in the tree, and every leaf which is undis-
charged is in Γ. φ is a theorem in natural deduction iff there is a derivation in
which φ is the last sentence and all assumptions are discharged. For instance,
here is a derivation that shows that ⊢ ( φ & ψ) ⊃ φ:

     [φ & ψ]^1
    ----------- &Elim
         φ
    -------------- 1 ⊃Intro
     (φ & ψ) ⊃ φ

The label 1 indicates that the assumption φ & ψ is discharged at the ⊃Intro
inference.


A set Γ is inconsistent iff Γ ⊢ ⊥ in natural deduction. The rule ⊥ I makes it


so that from an inconsistent set, any sentence can be derived.
Natural deduction systems were developed by Gerhard Gentzen and Sta-
nisław Jaśkowski in the 1930s, and later developed by Dag Prawitz and Fred-
eric Fitch. Because its inferences mirror natural methods of proof, it is favored
by philosophers. The versions developed by Fitch are often used in introduc-
tory logic textbooks. In the philosophy of logic, the rules of natural deduc-
tion have sometimes been taken to give the meanings of the logical operators
(“proof-theoretic semantics”).

9.3 Rules and Derivations


Natural deduction systems are meant to closely parallel the informal reason-
ing used in mathematical proof (hence it is somewhat “natural”). Natural
deduction proofs begin with assumptions. Inference rules are then applied.
Assumptions are “discharged” by the ∼Intro, ⊃Intro, ∨Elim and ∃Elim in-
ference rules, and the label of the discharged assumption is placed beside the
inference for clarity.

Definition 9.1 (Assumption). An assumption is any sentence in the topmost


position of any branch.

Derivations in natural deduction are certain trees of sentences, where the


topmost sentences are assumptions, and if a sentence stands below one, two,
or three other sentences, it must follow correctly by a rule of inference. The sen-
tences at the top of the inference are called the premises and the sentence below
them the conclusion of the inference. The rules come in pairs, an introduction and
an elimination rule for each logical operator. They introduce a logical opera-
tor in the conclusion or remove a logical operator from a premise of the rule.
Some of the rules allow an assumption of a certain type to be discharged. To
indicate which assumption is discharged by which inference, we also assign
labels to both the assumption and the inference. This is indicated by writing
the assumption as “[φ]^n.”
It is customary to consider rules for all the logical operators &, ∨, ⊃, ∼,
and ⊥, even if some of those are defined.

9.4 Propositional Rules

Rules for &


      φ    ψ                 φ & ψ               φ & ψ
    ---------- &Intro       -------- &Elim      -------- &Elim
      φ & ψ                    φ                   ψ

Rules for ∨

        φ                       ψ
    ---------- ∨Intro       ---------- ∨Intro
      φ ∨ ψ                   φ ∨ ψ

                   [φ]^n     [ψ]^n
                     ⋮         ⋮
      φ ∨ ψ          χ         χ
    --------------------------------- n ∨Elim
                     χ

Rules for ⊃

     [φ]^n
       ⋮
       ψ                      φ ⊃ ψ     φ
    ---------- n ⊃Intro      --------------- ⊃Elim
      φ ⊃ ψ                        ψ

Rules for ∼

     [φ]^n
       ⋮
       ⊥                      ∼φ     φ
    --------- n ∼Intro       ------------ ∼Elim
       ∼φ                         ⊥

Rules for ⊥


      ⊥                      [∼φ]^n
    ----- ⊥I                    ⋮
      φ                         ⊥
                             ------ n ⊥C
                                φ

Note that ∼Intro and ⊥C are very similar: The difference is that ∼Intro derives
a negated sentence ∼ φ but ⊥C a positive sentence φ.
Whenever a rule indicates that some assumption may be discharged, we
take this to be a permission, but not a requirement. E.g., in the ⊃Intro rule, we
may discharge any number of assumptions of the form φ in the derivation of
the premise ψ, including zero.

9.5 Derivations
We’ve said what an assumption is, and we’ve given the rules of inference.
Derivations in natural deduction are inductively generated from these: each
derivation either is an assumption on its own, or consists of one, two, or three
derivations followed by a correct inference.

Definition 9.2 (Derivation). A derivation of a sentence φ from assumptions Γ


is a finite tree of sentences satisfying the following conditions:

1. The topmost sentences of the tree are either in Γ or are discharged by an


inference in the tree.

2. The bottommost sentence of the tree is φ.

3. Every sentence in the tree except the sentence φ at the bottom is a premise
of a correct application of an inference rule whose conclusion stands di-
rectly below that sentence in the tree.

We then say that φ is the conclusion of the derivation and Γ its undischarged
assumptions.
If a derivation of φ from Γ exists, we say that φ is derivable from Γ, or in
symbols: Γ ⊢ φ. If there is a derivation of φ in which every assumption is
discharged, we write ⊢ φ.
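
To make this definition a little more concrete, here is a hedged sketch (with made-up Python names, not part of the text) of derivations represented as trees, together with a checker for just three rules (&Intro, &Elim, and ⊃Intro); a full checker would handle the remaining rules and their discharge labels in the same style. The checker returns the conclusion and the set of undischarged assumptions of a derivation.

# Formulae: atoms are strings; ('and', A, B) and ('implies', A, B) are complex.
# Derivations: ('assume', A, n) is an assumption with label n; the other nodes
# record the rule used, the conclusion, the subderivation(s), and (for ⊃Intro)
# the label of the assumptions it discharges.

def check(d):
    kind = d[0]
    if kind == 'assume':
        return d[1], {(d[1], d[2])}
    if kind == 'and_intro':
        _, concl, d1, d2 = d
        a, open1 = check(d1)
        b, open2 = check(d2)
        assert concl == ('and', a, b)
        return concl, open1 | open2
    if kind == 'and_elim':
        _, concl, d1 = d
        a, open1 = check(d1)
        assert a[0] == 'and' and concl in (a[1], a[2])
        return concl, open1
    if kind == 'imp_intro':
        _, concl, d1, n = d
        b, open1 = check(d1)
        assert concl[0] == 'implies' and concl[2] == b
        # discharge every assumption of the antecedent that carries label n
        return concl, {(f, m) for (f, m) in open1
                       if not (f == concl[1] and m == n)}
    raise ValueError('unknown rule')

# The derivation of (φ & ψ) ⊃ φ given earlier, with φ, ψ as the atoms 'phi', 'psi':
leaf  = ('assume', ('and', 'phi', 'psi'), 1)
deriv = ('imp_intro', ('implies', ('and', 'phi', 'psi'), 'phi'),
         ('and_elim', 'phi', leaf), 1)
print(check(deriv))   # conclusion (φ & ψ) ⊃ φ with no undischarged assumptions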

Example 9.3. Every assumption on its own is a derivation. So, e.g., φ by itself
is a derivation, and so is ψ by itself. We can obtain a new derivation from
these by applying, say, the &Intro rule,
      φ    ψ
    ---------- &Intro
      φ & ψ


These rules are meant to be general: we can replace the φ and ψ in it with any
sentences, e.g., by χ and θ. Then the conclusion would be χ & θ, and so

      χ    θ
    ---------- &Intro
      χ & θ

is a correct derivation. Of course, we can also switch the assumptions, so that


θ plays the role of φ and χ that of ψ. Thus,

      θ    χ
    ---------- &Intro
      θ & χ

is also a correct derivation.


We can now apply another rule, say, ⊃Intro, which allows us to conclude
a conditional and allows us to discharge any assumption that is identical to
the antecedent of that conditional. So both of the following would be correct
derivations:

     [χ]^1    θ                          χ    [θ]^1
    ------------ &Intro                ------------ &Intro
        χ & θ                              χ & θ
    --------------- 1 ⊃Intro           --------------- 1 ⊃Intro
     χ ⊃ (χ & θ)                        θ ⊃ (χ & θ)

They show, respectively, that θ ⊢ χ ⊃ (χ & θ ) and χ ⊢ θ ⊃ (χ & θ ).


Remember that discharging of assumptions is a permission, not a require-
ment: we don’t have to discharge the assumptions. In particular, we can apply
a rule even if the assumptions are not present in the derivation. For instance,
the following is legal, even though there is no assumption φ to be discharged:

        ψ
    ---------- 1 ⊃Intro
      φ ⊃ ψ

9.6 Examples of Derivations


Example 9.4. Let’s give a derivation of the sentence ( φ & ψ) ⊃ φ.
We begin by writing the desired conclusion at the bottom of the derivation.

( φ & ψ) ⊃ φ

Next, we need to figure out what kind of inference could result in a sen-
tence of this form. The main operator of the conclusion is ⊃, so we’ll try to
arrive at the conclusion using the ⊃Intro rule. It is best to write down the as-
sumptions involved and label the inference rules as you progress, so it is easy
to see whether all assumptions have been discharged at the end of the proof.


     [φ & ψ]^1
          ⋮
          φ
    -------------- 1 ⊃Intro
     (φ & ψ) ⊃ φ

We now need to fill in the steps from the assumption φ & ψ to φ. Since we
only have one connective to deal with, &, we must use the &Elim rule. This
gives us the following proof:

     [φ & ψ]^1
    ------------ &Elim
          φ
    -------------- 1 ⊃Intro
     (φ & ψ) ⊃ φ

We now have a correct derivation of ( φ & ψ) ⊃ φ.

Example 9.5. Now let’s give a derivation of (∼ φ ∨ ψ) ⊃ ( φ ⊃ ψ).


We begin by writing the desired conclusion at the bottom of the derivation.

(∼ φ ∨ ψ) ⊃ ( φ ⊃ ψ)

To find a logical rule that could give us this conclusion, we look at the logical
connectives in the conclusion: ∼, ∨, and ⊃. We only care at the moment about
the first occurrence of ⊃ because it is the main operator of the sentence we want
to derive, while ∼, ∨ and the second occurrence of ⊃ are inside the scope
of another connective, so we will take care of those later. We therefore start
with the ⊃Intro rule. A correct application must look like this:

     [∼φ ∨ ψ]^1
           ⋮
         φ ⊃ ψ
    ---------------------- 1 ⊃Intro
     (∼φ ∨ ψ) ⊃ (φ ⊃ ψ)

This leaves us with two possibilities to continue. Either we can keep working
from the bottom up and look for another application of the ⊃Intro rule, or we
can work from the top down and apply a ∨Elim rule. Let us apply the latter.
We will use the assumption ∼ φ ∨ ψ as the leftmost premise of ∨Elim. For a
valid application of ∨Elim, the other two premises must be identical to the
conclusion φ ⊃ ψ, but each may be derived in turn from another assumption,
namely one of the two disjuncts of ∼ φ ∨ ψ. So our derivation will look like
this:


                      [∼φ]^2       [ψ]^2
                         ⋮            ⋮
     [∼φ ∨ ψ]^1        φ ⊃ ψ        φ ⊃ ψ
    ---------------------------------------- 2 ∨Elim
                       φ ⊃ ψ
    ----------------------------- 1 ⊃Intro
        (∼φ ∨ ψ) ⊃ (φ ⊃ ψ)
In each of the two branches on the right, we want to derive φ ⊃ ψ, which
is best done using ⊃Intro.

                  [∼φ]^2, [φ]^3           [ψ]^2, [φ]^4
                        ⋮                       ⋮
                        ψ                       ψ
                   ----------- 3 ⊃Intro    ----------- 4 ⊃Intro
     [∼φ ∨ ψ]^1       φ ⊃ ψ                   φ ⊃ ψ
    ----------------------------------------------------- 2 ∨Elim
                           φ ⊃ ψ
    ----------------------------- 1 ⊃Intro
        (∼φ ∨ ψ) ⊃ (φ ⊃ ψ)
For the two missing parts of the derivation, we need derivations of ψ from
∼ φ and φ in the middle, and from φ and ψ on the right. Let’s take the former
first. ∼ φ and φ are the two premises of ∼Elim:

[∼ φ]2 [ φ ]3
∼Elim

By using ⊥I , we can obtain ψ as a conclusion and complete the branch.

[ ψ ]2 , [ φ ]4
[∼ φ]2 [ φ ]3
⊥Intro
⊥ ⊥
I
ψ ψ
3 ⊃Intro 4 ⊃Intro
[∼ φ ∨ ψ]1 φ⊃ψ φ⊃ψ
2
φ⊃ψ
∨Elim
1 ⊃Intro
(∼ φ ∨ ψ) ⊃ ( φ ⊃ ψ)
Let’s now look at the rightmost branch. Here it’s important to realize that
the definition of derivation allows assumptions to be discharged but does not re-
quire them to be. In other words, if we can derive ψ from one of the assump-
tions φ and ψ without using the other, that’s ok. And to derive ψ from ψ is
trivial: ψ by itself is such a derivation, and no inferences are needed. So we
can simply delete the assumption φ.


[∼ φ]2 [ φ ]3
∼Elim
⊥ ⊥
I
ψ [ ψ ]2
3 ⊃Intro ⊃Intro
[∼ φ ∨ ψ]1 φ⊃ψ φ⊃ψ
2
φ⊃ψ
∨Elim
1 ⊃Intro
(∼ φ ∨ ψ) ⊃ ( φ ⊃ ψ)

Note that in the finished derivation, the rightmost ⊃Intro inference does not
actually discharge any assumptions.

Example 9.6. So far we have not needed the ⊥C rule. It is special in that it al-
lows us to discharge an assumption that isn’t a sub-formula of the conclusion
of the rule. It is closely related to the ⊥I rule. In fact, the ⊥I rule is a special
case of the ⊥C rule—there is a logic called “intuitionistic logic” in which only
⊥I is allowed. The ⊥C rule is a last resort when nothing else works. For in-
stance, suppose we want to derive φ ∨ ∼ φ. Our usual strategy would be to
attempt to derive φ ∨ ∼ φ using ∨Intro. But this would require us to derive
either φ or ∼ φ from no assumptions, and this can’t be done. ⊥C to the rescue!

[∼( φ ∨ ∼ φ)]1

1
⊥ ⊥C
φ ∨ ∼φ

Now we’re looking for a derivation of ⊥ from ∼( φ ∨ ∼ φ). Since ⊥ is the


conclusion of ∼Elim we might try that:

[∼( φ ∨ ∼ φ)]1 [∼( φ ∨ ∼ φ)]1

∼φ φ
∼Elim
1
⊥ ⊥C
φ ∨ ∼φ

Our strategy for finding a derivation of ∼ φ calls for an application of ∼Intro:

[∼( φ ∨ ∼ φ)]1 , [ φ]2


[∼( φ ∨ ∼ φ)]1


2
∼ φ ∼Intro φ
∼Elim
1
⊥ ⊥C
φ ∨ ∼φ


Here, we can get ⊥ easily by applying ∼Elim to the assumption ∼( φ ∨ ∼ φ)


and φ ∨ ∼ φ which follows from our new assumption φ by ∨Intro:

[ φ ]2 [∼( φ ∨ ∼ φ)]1
[∼( φ ∨ ∼ φ)]1 φ ∨ ∼ φ ∨Intro
∼Elim

2
∼φ ∼ Intro φ
∼Elim
1
⊥ ⊥C
φ ∨ ∼φ

On the right side we use the same strategy, except we get φ by ⊥C :

[ φ ]2 [∼ φ]3
[∼( φ ∨ ∼ φ)]1 φ ∨ ∼φ ∨ Intro [∼( φ ∨ ∼ φ)] 1 φ ∨ ∼ φ ∨Intro
∼Elim ∼Elim
⊥ ⊥ ⊥
2
∼φ ∼ Intro 3
φ C
∼Elim
1
⊥ ⊥C
φ ∨ ∼φ

9.7 Quantifier Rules

Rules for ∀

φ( a) ∀ x φ( x )
∀Intro ∀Elim
∀ x φ( x ) φ(t)

In the rules for ∀, t is a closed term (a term that does not contain any variables),
and a is a constant symbol which does not occur in the conclusion ∀ x φ( x ), or
in any assumption which is undischarged in the derivation ending with the
premise φ( a). We call a the eigenvariable of the ∀Intro inference.1

Rules for ∃

[φ( a)]n
φ(t)
∃Intro
∃ x φ( x )
∃ x φ( x ) χ
n
χ ∃Elim

1 We use the term “eigenvariable” even though a in the above rule is a constant. This has

historical reasons.


Again, t is a closed term, and a is a constant which does not occur in the
premise ∃ x φ( x ), in the conclusion χ, or any assumption which is undischarged
in the derivations ending with the two premises (other than the assumptions
φ( a)). We call a the eigenvariable of the ∃Elim inference.
The condition that an eigenvariable neither occur in the premises nor in
any assumption that is undischarged in the derivations leading to the premises
for the ∀Intro or ∃Elim inference is called the eigenvariable condition.
Recall the convention that when φ is a formula with the variable x free, we
indicate this by writing φ( x ). In the same context, φ(t) then is short for φ[t/x ].
So we could also write the ∃Intro rule as:
φ[t/x ]
∃Intro
∃x φ

Note that t may already occur in φ, e.g., φ might be P (t, x ). Thus, inferring
∃ x P (t, x ) from P (t, t) is a correct application of ∃Intro—you may “replace”
one or more, and not necessarily all, occurrences of t in the premise by the
bound variable x. However, the eigenvariable conditions in ∀Intro and ∃Elim
require that the constant symbol a does not occur in φ. So, you cannot cor-
rectly infer ∀ x P ( a, x ) from P ( a, a) using ∀Intro.
In ∃Intro and ∀Elim there are no restrictions, and the term t can be any-
thing, so we do not have to worry about any conditions. On the other hand,
in the ∃Elim and ∀Intro rules, the eigenvariable condition requires that the
constant symbol a does not occur anywhere in the conclusion or in an undis-
charged assumption. The condition is necessary to ensure that the system
is sound, i.e., only derives sentences from undischarged assumptions from
which they follow. Without this condition, the following would be allowed:

[ φ( a)]1
*∀Intro
∃ x φ( x ) ∀ x φ( x )
∃Elim
∀ x φ( x )

However, ∃ x φ( x ) ⊭ ∀ x φ( x ).
As the elimination rules for quantifiers only allow substituting closed terms
for variables, it follows that any formula that can be derived from a set of sen-
tences is itself a sentence.
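
The eigenvariable condition is easy to state as an explicit check. The following Python sketch is illustrative only: the function name is ours, and we assume each formula is represented simply by the set of constant symbols occurring in it.

def eigenvariable_ok(a, conclusion_constants, open_assumption_constants):
    """a must occur neither in the conclusion nor in any undischarged assumption."""
    if a in conclusion_constants:
        return False
    return all(a not in consts for consts in open_assumption_constants)

# ∀Intro from φ(a) to ∀x φ(x) with an undischarged assumption ψ(b): allowed.
print(eigenvariable_ok("a", set(), [{"b"}]))    # True
# Inferring ∀x P(a, x) from P(a, a): blocked, since a still occurs in the conclusion.
print(eigenvariable_ok("a", {"a"}, []))         # False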

9.8 Derivations with Quantifiers


Example 9.7. When dealing with quantifiers, we have to make sure not to
violate the eigenvariable condition, and sometimes this requires us to play
around with the order of carrying out certain inferences. In general, it helps
to try and take care of rules subject to the eigenvariable condition first (they
will be lower down in the finished proof).


Let’s see how we’d give a derivation of the formula ∃ x ∼ φ( x ) ⊃ ∼∀ x φ( x ).


Starting as usual, we write

∃ x ∼ φ( x ) ⊃ ∼∀ x φ( x )

We start by writing down what it would take to justify that last step using the
⊃Intro rule.
[∃ x ∼ φ( x )]1

∼∀ x φ( x )
1 ⊃Intro
∃ x ∼ φ( x ) ⊃ ∼∀ x φ( x )

Since there is no obvious rule to apply to ∼∀ x φ( x ), we will proceed by setting


up the derivation so we can use the ∃Elim rule. Here we must pay attention
to the eigenvariable condition, and choose a constant that does not appear in
∃ x ∼ φ( x ) or any assumptions that it depends on. (Since no constant symbols
appear, however, any choice will do fine.)

[∼ φ( a)]2

[∃ x ∼ φ( x )]1 ∼∀ x φ( x )
2 ∃Elim
∼∀ x φ( x )
1 ⊃Intro
∃ x ∼ φ( x ) ⊃ ∼∀ x φ( x )

In order to derive ∼∀ x φ( x ), we will attempt to use the ∼Intro rule: this re-
quires that we derive a contradiction, possibly using ∀ x φ( x ) as an additional
assumption. Of course, this contradiction may involve the assumption ∼ φ( a)
which will be discharged by the ∃Elim inference. We can set it up as follows:

[∼ φ( a)]2 , [∀ x φ( x )]3


3 ∼Intro
[∃ x ∼ φ( x )]1 ∼∀ x φ( x )
2 ∃Elim
∼∀ x φ( x )
1 ⊃Intro
∃ x ∼ φ( x ) ⊃ ∼∀ x φ( x )

It looks like we are close to getting a contradiction. The easiest rule to apply is
the ∀Elim, which has no eigenvariable conditions. Since we can use any term
we want to replace the universally quantified x, it makes the most sense to
continue using a so we can reach a contradiction.


[∀ x φ( x )]3
∀Elim
[∼ φ( a)]2 φ( a)
∼Elim

1
3 ∼Intro
[∃ x ∼ φ( x )] ∼∀ x φ( x )
2 ∃Elim
∼∀ x φ( x )
1 ⊃Intro
∃ x ∼ φ( x ) ⊃ ∼∀ x φ( x )

It is important, especially when dealing with quantifiers, to double check


at this point that the eigenvariable condition has not been violated. Since the
only rule we applied that is subject to the eigenvariable condition was ∃Elim,
and the eigenvariable a does not occur in any assumptions it depends on, this
is a correct derivation.

Example 9.8. Sometimes we may derive a formula from other formulae. In


these cases, we may have undischarged assumptions. It is important to keep
track of our assumptions as well as the end goal.
Let’s see how we’d give a derivation of the formula ∃ x χ( x, b) from the
assumptions ∃ x ( φ( x ) & ψ( x )) and ∀ x (ψ( x ) ⊃ χ( x, b)). Starting as usual, we
write the conclusion at the bottom.

∃ x χ( x, b)

We have two premises to work with. To use the first, i.e., to find
a derivation of ∃ x χ( x, b) from ∃ x ( φ( x ) & ψ( x )), we would use the ∃Elim rule.
Since it has an eigenvariable condition, we will apply that rule first. We get
the following:

[ φ( a) & ψ( a)]1

∃ x ( φ( x ) & ψ( x )) ∃ x χ( x, b)
1 ∃Elim
∃ x χ( x, b)

The two assumptions we are working with share ψ. It may be useful at this
point to apply &Elim to separate out ψ( a).

[ φ( a) & ψ( a)]1
&Elim
ψ( a)

∃ x ( φ( x ) & ψ( x )) ∃ x χ( x, b)
1 ∃Elim
∃ x χ( x, b)


The second assumption we have to work with is ∀ x (ψ( x ) ⊃ χ( x, b)). Since


there is no eigenvariable condition we can instantiate x with the constant sym-
bol a using ∀Elim to get ψ( a) ⊃ χ( a, b). We now have both ψ( a) ⊃ χ( a, b) and
ψ( a). Our next move should be a straightforward application of the ⊃Elim
rule.

∀ x (ψ( x ) ⊃ χ( x, b)) [ φ( a) & ψ( a)]1


∀Elim &Elim
ψ( a) ⊃ χ( a, b) ψ( a)
⊃Elim
χ( a, b)

∃ x ( φ( x ) & ψ( x )) ∃ x χ( x, b)
1 ∃Elim
∃ x χ( x, b)

We are so close! One application of ∃Intro and we have reached our goal.

∀ x (ψ( x ) ⊃ χ( x, b)) [ φ( a) & ψ( a)]1


∀Elim &Elim
ψ( a) ⊃ χ( a, b) ψ( a)
⊃Elim
χ( a, b)
∃Intro
∃ x ( φ( x ) & ψ( x )) ∃ x χ( x, b)
1 ∃Elim
∃ x χ( x, b)

Since we ensured at each step that the eigenvariable conditions were not vio-
lated, we can be confident that this is a correct derivation.

Example 9.9. Give a derivation of the formula ∼∀ x φ( x ) from the assump-


tions ∀ x φ( x ) ⊃ ∃y ψ(y) and ∼∃y ψ(y). Starting as usual, we write the target
formula at the bottom.

∼∀ x φ( x )

The last line of the derivation is a negation, so let’s try using ∼Intro. This will
require that we figure out how to derive a contradiction.

[∀ x φ( x )]1


1 ∼Intro
∼∀ x φ( x )

So far so good. We can use ∀Elim but it’s not obvious if that will help us get
to our goal. Instead, let’s use one of our assumptions. ∀ x φ( x ) ⊃ ∃y ψ(y)
together with ∀ x φ( x ) will allow us to use the ⊃Elim rule.


∀ x φ( x ) ⊃ ∃y ψ(y) [∀ x φ( x )]1
⊃Elim
∃y ψ(y)


1 ∼Intro
∼∀ x φ( x )
We now have one final assumption to work with, and it looks like this will
help us reach a contradiction by using ∼Elim.

∀ x φ( x ) ⊃ ∃y ψ(y) [∀ x φ( x )]1
⊃Elim
∼∃y ψ(y) ∃y ψ(y)
∼Elim

1 ∼Intro
∼∀ x φ( x )

9.9 Proof-Theoretic Notions


Just as we’ve defined a number of important semantic notions (validity, en-
tailment, satisfiability), we now define corresponding proof-theoretic notions.
These are not defined by appeal to satisfaction of sentences in structures, but
by appeal to the derivability or non-derivability of certain sentences from oth-
ers. It was an important discovery that these notions coincide. That they do is
the content of the soundness and completeness theorems.

Definition 9.10 (Theorems). A sentence φ is a theorem if there is a derivation


of φ in natural deduction in which all assumptions are discharged. We write
⊢ φ if φ is a theorem and ⊬ φ if it is not.

Definition 9.11 (Derivability). A sentence φ is derivable from a set of sentences Γ,


Γ ⊢ φ, if there is a derivation with conclusion φ and in which every assump-
tion is either discharged or is in Γ. If φ is not derivable from Γ we write Γ ⊬ φ.

Definition 9.12 (Consistency). A set of sentences Γ is inconsistent iff Γ ⊢ ⊥. If


Γ is not inconsistent, i.e., if Γ ⊬ ⊥, we say it is consistent.

Proposition 9.13 (Reflexivity). If φ ∈ Γ, then Γ ⊢ φ.

Proof. The assumption φ by itself is a derivation of φ where every undis-


charged assumption (i.e., φ) is in Γ.

Proposition 9.14 (Monotonicity). If Γ ⊆ ∆ and Γ ⊢ φ, then ∆ ⊢ φ.

Proof. Any derivation of φ from Γ is also a derivation of φ from ∆.

Proposition 9.15 (Transitivity). If Γ ⊢ φ and { φ} ∪ ∆ ⊢ ψ, then Γ ∪ ∆ ⊢ ψ.


Proof. If Γ ⊢ φ, there is a derivation δ0 of φ with all undischarged assumptions


in Γ. If { φ} ∪ ∆ ⊢ ψ, then there is a derivation δ1 of ψ with all undischarged
assumptions in { φ} ∪ ∆. Now consider:

∆, [ φ]1

δ1 Γ
δ0
ψ
1 ⊃Intro
φ⊃ψ φ
⊃Elim
ψ

The undischarged assumptions are now all among Γ ∪ ∆, so this shows Γ ∪ ∆ ⊢


ψ.

When Γ = { φ1 , φ2 , . . . , φk } is a finite set we may use the simplified nota-
tion φ1 , φ2 , . . . , φk ⊢ ψ for Γ ⊢ ψ; in particular, φ ⊢ ψ means that { φ} ⊢ ψ.
Note that if Γ ⊢ φ and φ ⊢ ψ, then Γ ⊢ ψ. It follows also that if φ1 , . . . , φn ⊢
ψ and Γ ⊢ φi for each i, then Γ ⊢ ψ.

Proposition 9.16. The following are equivalent.

1. Γ is inconsistent.

2. Γ ⊢ φ for every sentence φ.

3. Γ ⊢ φ and Γ ⊢ ∼ φ for some sentence φ.

Proof. Exercise.

Proposition 9.17 (Compactness). 1. If Γ ⊢ φ then there is a finite subset Γ0 ⊆


Γ such that Γ0 ⊢ φ.

2. If every finite subset of Γ is consistent, then Γ is consistent.

Proof. 1. If Γ ⊢ φ, then there is a derivation δ of φ from Γ. Let Γ0 be the set


of undischarged assumptions of δ. Since any derivation is finite, Γ0 can
only contain finitely many sentences. So, δ is a derivation of φ from a
finite Γ0 ⊆ Γ.

2. This is the contrapositive of (1) for the special case φ ≡ ⊥.


9.10 Derivability and Consistency


We will now establish a number of properties of the derivability relation. They
are independently interesting, but each will play a role in the proof of the
completeness theorem.

Proposition 9.18. If Γ ⊢ φ and Γ ∪ { φ} is inconsistent, then Γ is inconsistent.

Proof. Let the derivation of φ from Γ be δ1 and the derivation of ⊥ from Γ ∪


{ φ} be δ2 . We can then derive:
Γ, [ φ]1
Γ
δ2
δ1

1
∼ φ ∼Intro φ
∼Elim

In the new derivation, the assumption φ is discharged, so it is a derivation
from Γ.

Proposition 9.19. Γ ⊢ φ iff Γ ∪ {∼ φ} is inconsistent.

Proof. First suppose Γ ⊢ φ, i.e., there is a derivation δ0 of φ from undischarged


assumptions Γ. We obtain a derivation of ⊥ from Γ ∪ {∼ φ} as follows:
Γ
δ0
∼φ φ
∼Elim

Now assume Γ ∪ {∼ φ} is inconsistent, and let δ1 be the corresponding
derivation of ⊥ from undischarged assumptions in Γ ∪ {∼ φ}. We obtain
a derivation of φ from Γ alone by using ⊥C :

Γ, [∼ φ]1

δ1

1
⊥ ⊥
φ C

Proposition 9.20. If Γ ⊢ φ and ∼ φ ∈ Γ, then Γ is inconsistent.

Proof. Suppose Γ ⊢ φ and ∼ φ ∈ Γ. Then there is a derivation δ of φ from Γ.


Consider this simple application of the ∼Elim rule:


δ
∼φ φ
∼Elim

Since ∼ φ ∈ Γ and all undischarged assumptions are in Γ, this shows that Γ ⊢ ⊥.

Proposition 9.21. If Γ ∪ { φ} and Γ ∪ {∼ φ} are both inconsistent, then Γ is incon-


sistent.

Proof. There are derivations δ1 and δ2 of ⊥ from Γ ∪ { φ} and ⊥ from Γ ∪ {∼ φ},


respectively. We can then derive

Γ, [∼ φ]2 Γ, [ φ]1

δ2 δ1

⊥ ⊥
2
∼∼ φ ∼Intro 1
∼ φ ∼Intro
∼Elim

Since the assumptions φ and ∼ φ are discharged, this is a derivation of ⊥
from Γ alone. Hence Γ is inconsistent.

9.11 Derivability and the Propositional Connectives


We establish that the derivability relation ⊢ of natural deduction is strong
enough to establish some basic facts involving the propositional connectives,
such as that φ & ψ ⊢ φ and φ, φ ⊃ ψ ⊢ ψ (modus ponens). These facts are
needed for the proof of the completeness theorem.

Proposition 9.22. 1. Both φ & ψ ⊢ φ and φ & ψ ⊢ ψ

2. φ, ψ ⊢ φ & ψ.

Proof. 1. We can derive both

φ&ψ φ&ψ
&Elim &Elim
φ ψ

2. We can derive:
φ ψ
&Intro
φ&ψ

Proposition 9.23. 1. φ ∨ ψ, ∼ φ, ∼ψ is inconsistent.


2. Both φ ⊢ φ ∨ ψ and ψ ⊢ φ ∨ ψ.

Proof. 1. Consider the following derivation:

∼φ [ φ ]1 ∼ψ [ ψ ]1
∼Elim ∼Elim
φ∨ψ ⊥ ⊥
1 ∨Elim

This is a derivation of ⊥ from undischarged assumptions φ ∨ ψ, ∼ φ, and


∼ψ.

2. We can derive both

φ ψ
∨Intro ∨Intro
φ∨ψ φ∨ψ

Proposition 9.24. 1. φ, φ ⊃ ψ ⊢ ψ.

2. Both ∼ φ ⊢ φ ⊃ ψ and ψ ⊢ φ ⊃ ψ.

Proof. 1. We can derive:

φ⊃ψ φ
⊃Elim
ψ

2. This is shown by the following two derivations:

∼φ [ φ ]1
∼Elim
⊥ ⊥
I
ψ ψ
1 ⊃Intro ⊃Intro
φ⊃ψ φ⊃ψ

Note that ⊃Intro may, but does not have to, discharge the assumption φ.

9.12 Derivability and the Quantifiers


The completeness theorem also requires that the natural deduction rules yield
the facts about ⊢ established in this section.

Theorem 9.25. If c is a constant not occurring in Γ or φ( x ) and Γ ⊢ φ(c), then


Γ ⊢ ∀ x φ ( x ).


Proof. Let δ be a derivation of φ(c) from Γ. By adding a ∀Intro inference,


we obtain a derivation of ∀ x φ( x ). Since c does not occur in Γ or φ( x ), the
eigenvariable condition is satisfied.

Proposition 9.26. 1. φ(t) ⊢ ∃ x φ( x ).

2. ∀ x φ( x ) ⊢ φ(t).

Proof. 1. The following is a derivation of ∃ x φ( x ) from φ(t):

φ(t)
∃Intro
∃ x φ( x )

2. The following is a derivation of φ(t) from ∀ x φ( x ):

∀ x φ( x )
∀Elim
φ(t)

9.13 Soundness
A derivation system, such as natural deduction, is sound if it cannot derive
things that do not actually follow. Soundness is thus a kind of guaranteed
safety property for derivation systems. Depending on which proof-theoretic
property is in question, we would like to know, for instance, that

1. every derivable sentence is valid;

2. if a sentence is derivable from some others, it is also a consequence of


them;

3. if a set of sentences is inconsistent, it is unsatisfiable.

These are important properties of a derivation system. If any of them do not


hold, the derivation system is deficient—it would derive too much. Conse-
quently, establishing the soundness of a derivation system is of the utmost
importance.

Theorem 9.27 (Soundness). If φ is derivable from the undischarged assumptions


Γ, then Γ ⊨ φ.

Proof. Let δ be a derivation of φ. We proceed by induction on the number of


inferences in δ.
For the induction basis we show the claim if the number of inferences is 0.
In this case, δ consists only of a single sentence φ, i.e., an assumption. That
assumption is undischarged, since assumptions can only be discharged by


inferences, and there are no inferences. So, any structure M that satisfies all of
the undischarged assumptions of the proof also satisfies φ.
Now for the inductive step. Suppose that δ contains n inferences. The
premise(s) of the lowermost inference are derived using sub-derivations, each
of which contains fewer than n inferences. We assume the induction hypothe-
sis: The premises of the lowermost inference follow from the undischarged as-
sumptions of the sub-derivations ending in those premises. We have to show
that the conclusion φ follows from the undischarged assumptions of the entire
proof.
We distinguish cases according to the type of the lowermost inference.
First, we consider the possible inferences with only one premise.

1. Suppose that the last inference is ∼Intro: The derivation has the form

Γ, [ φ]n

δ1


∼ φ ∼Intro
n

By inductive hypothesis, ⊥ follows from the undischarged assumptions


Γ ∪ { φ} of δ1 . Consider a structure M. We need to show that, if M ⊨ Γ,
then M ⊨ ∼ φ. Suppose for reductio that M ⊨ Γ, but M ⊭ ∼ φ, i.e.,
M ⊨ φ. This would mean that M ⊨ Γ ∪ { φ}. This is contrary to our
inductive hypothesis. So, M ⊨ ∼ φ.

2. The last inference is &Elim: There are two variants: φ or ψ may be in-
ferred from the premise φ & ψ. Consider the first case. The derivation δ
looks like this:

Γ
δ1

φ&ψ
φ &Elim

By inductive hypothesis, φ & ψ follows from the undischarged assump-


tions Γ of δ1 . Consider a structure M. We need to show that, if M ⊨ Γ,
then M ⊨ φ. Suppose M ⊨ Γ. By our inductive hypothesis (Γ ⊨ φ & ψ),
we know that M ⊨ φ & ψ. By definition, M ⊨ φ & ψ iff M ⊨ φ and
M ⊨ ψ. (The case where ψ is inferred from φ & ψ is handled similarly.)

3. The last inference is ∨Intro: There are two variants: φ ∨ ψ may be in-
ferred from the premise φ or the premise ψ. Consider the first case. The
derivation has the form


Γ
δ1
φ
∨Intro
φ∨ψ

By inductive hypothesis, φ follows from the undischarged assumptions Γ


of δ1 . Consider a structure M. We need to show that, if M ⊨ Γ, then
M ⊨ φ ∨ ψ. Suppose M ⊨ Γ; then M ⊨ φ since Γ ⊨ φ (the inductive
hypothesis). So it must also be the case that M ⊨ φ ∨ ψ. (The case where
φ ∨ ψ is inferred from ψ is handled similarly.)

4. The last inference is ⊃Intro: φ ⊃ ψ is inferred from a subproof with


assumption φ and conclusion ψ, i.e.,

Γ, [ φ]n

δ1

ψ
n ⊃Intro
φ⊃ψ

By inductive hypothesis, ψ follows from the undischarged assumptions


of δ1 , i.e., Γ ∪ { φ} ⊨ ψ. Consider a structure M. The undischarged
assumptions of δ are just Γ, since φ is discharged at the last inference.
So we need to show that Γ ⊨ φ ⊃ ψ. For reductio, suppose that for
some structure M, M ⊨ Γ but M ⊭ φ ⊃ ψ. So, M ⊨ φ and M ⊭ ψ. But
by hypothesis, ψ is a consequence of Γ ∪ { φ}, i.e., M ⊨ ψ, which is a
contradiction. So, Γ ⊨ φ ⊃ ψ.

5. The last inference is ⊥I : Here, δ ends in

Γ
δ1

⊥ ⊥
φ I

By induction hypothesis, Γ ⊨ ⊥. We have to show that Γ ⊨ φ. Suppose


not; then for some M we have M ⊨ Γ and M ⊭ φ. But we always
have M ⊭ ⊥, so this would mean that Γ ⊭ ⊥, contrary to the induction
hypothesis.

6. The last inference is ⊥C : Exercise.

7. The last inference is ∀Intro: Then δ has the form


Γ
δ1

φ( a)
∀Intro
∀ x φ( x )

The premise φ( a) is a consequence of the undischarged assumptions Γ


by induction hypothesis. Consider some structure, M, such that M ⊨ Γ.
We need to show that M ⊨ ∀ x φ( x ). Since ∀ x φ( x ) is a sentence, this
means we have to show that for every variable assignment s, M, s ⊨ φ( x )
(Proposition 7.19). Since Γ consists entirely of sentences, M, s ⊨ ψ for all

ψ ∈ Γ by Definition 7.11. Let M′ be like M except that aM′ = s( x ).
Since a does not occur in Γ, M′ ⊨ Γ by Corollary 7.21. Since Γ ⊨ φ( a),
M′ ⊨ φ( a). Since φ( a) is a sentence, M′ , s ⊨ φ( a) by Proposition 7.18.
M′ , s ⊨ φ( x ) iff M′ ⊨ φ( a) by Proposition 7.23 (recall that φ( a) is just
φ( x )[ a/x ]). So, M′ , s ⊨ φ( x ). Since a does not occur in φ( x ), by Propo-
sition 7.20, M, s ⊨ φ( x ). But s was an arbitrary variable assignment, so
M ⊨ ∀ x φ ( x ).

8. The last inference is ∃Intro: Exercise.

9. The last inference is ∀Elim: Exercise.

Now let’s consider the possible inferences with several premises: ∨Elim,
&Intro, ⊃Elim, ∼Elim, and ∃Elim.

1. The last inference is &Intro. φ & ψ is inferred from the premises φ and ψ
and δ has the form

Γ1 Γ2

δ1 δ2

φ ψ
&Intro
φ&ψ

By induction hypothesis, φ follows from the undischarged assumptions Γ1


of δ1 and ψ follows from the undischarged assumptions Γ2 of δ2 . The
undischarged assumptions of δ are Γ1 ∪ Γ2 , so we have to show that
Γ1 ∪ Γ2 ⊨ φ & ψ. Consider a structure M with M ⊨ Γ1 ∪ Γ2 . Since M ⊨ Γ1 ,
it must be the case that M ⊨ φ as Γ1 ⊨ φ, and since M ⊨ Γ2 , M ⊨ ψ since
Γ2 ⊨ ψ. Together, M ⊨ φ & ψ.

2. The last inference is ∨Elim: Exercise.

3. The last inference is ⊃Elim. ψ is inferred from the premises φ ⊃ ψ and φ.


The derivation δ looks like this:


Γ1 Γ2
δ1 δ2
φ⊃ψ φ
⊃Elim
ψ

By induction hypothesis, φ ⊃ ψ follows from the undischarged assump-


tions Γ1 of δ1 and φ follows from the undischarged assumptions Γ2 of δ2 .
Consider a structure M. We need to show that, if M ⊨ Γ1 ∪ Γ2 , then
M ⊨ ψ. Suppose M ⊨ Γ1 ∪ Γ2 . Since Γ1 ⊨ φ ⊃ ψ, M ⊨ φ ⊃ ψ. Since
Γ2 ⊨ φ, we have M ⊨ φ. This means that M ⊨ ψ. (For if M ⊭ ψ, then since
M ⊨ φ, we’d have M ⊭ φ ⊃ ψ, contradicting M ⊨ φ ⊃ ψ.)

4. The last inference is ∼Elim: Exercise.

5. The last inference is ∃Elim: Exercise.

Corollary 9.28. If ⊢ φ, then φ is valid.

Corollary 9.29. If Γ is satisfiable, then it is consistent.

Proof. We prove the contrapositive. Suppose that Γ is not consistent. Then


Γ ⊢ ⊥, i.e., there is a derivation of ⊥ from undischarged assumptions in Γ. By
Theorem 9.27, any structure M that satisfies Γ must satisfy ⊥. Since M ⊭ ⊥
for every structure M, no M can satisfy Γ, i.e., Γ is not satisfiable.

9.14 Derivations with Identity predicate


Derivations with identity predicate require additional inference rules.

=Intro
t=t

t1 = t2 φ(t1 )
=Elim
φ(t2 )

t1 = t2 φ(t2 )
=Elim
φ(t1 )

In the above rules, t, t1 , and t2 are closed terms. The =Intro rule allows us
to derive any identity statement of the form t = t outright, from no assump-
tions.

Example 9.30. If s and t are closed terms, then φ(s), s = t ⊢ φ(t):


s=t φ(s)
=Elim
φ(t)


This may be familiar as the “principle of substitutability of identicals,” or Leib-


niz’ Law.

Example 9.31. We derive the sentence

∀ x ∀y (( φ( x ) & φ(y)) ⊃ x = y)

from the sentence

∃ x ∀y ( φ(y) ⊃ y = x )

We develop the derivation backwards:

∃ x ∀y ( φ(y) ⊃ y = x ) [ φ( a) & φ(b)]1

a=b
1 ⊃Intro
(( φ( a) & φ(b)) ⊃ a = b)
∀Intro
∀y (( φ( a) & φ(y)) ⊃ a = y)
∀Intro
∀ x ∀y (( φ( x ) & φ(y)) ⊃ x = y)

We’ll now have to use the main assumption: since it is an existential formula,
we use ∃Elim to derive the intermediary conclusion a = b.

[∀y ( φ(y) ⊃ y = c)]2


[ φ( a) & φ(b)]1

∃ x ∀y ( φ(y) ⊃ y = x ) a=b
2 ∃Elim
a=b
1 ⊃Intro
(( φ( a) & φ(b)) ⊃ a = b)
∀Intro
∀y (( φ( a) & φ(y)) ⊃ a = y)
∀Intro
∀ x ∀y (( φ( x ) & φ(y)) ⊃ x = y)

The sub-derivation on the top right is completed by using its assumptions


to show that a = c and b = c. This requires two separate derivations. The
derivation for a = c is as follows:

[∀y ( φ(y) ⊃ y = c)]2 [ φ( a) & φ(b)]1


∀Elim &Elim
φ( a) ⊃ a = c φ( a)
a=c ⊃Elim

From a = c and b = c we derive a = b by =Elim.


9.15 Soundness with Identity predicate


Proposition 9.32. Natural deduction with rules for = is sound.

Proof. Any formula of the form t = t is valid, since for every structure M,
M ⊨ t = t. (Note that we assume the term t to be closed, i.e., it contains no
variables, so variable assignments are irrelevant).
Suppose the last inference in a derivation is =Elim, i.e., the derivation has
the following form:
Γ1 Γ2

δ1 δ2

t1 = t2 φ ( t1 )
=Elim
φ ( t2 )

The premises t1 = t2 and φ(t1 ) are derived from undischarged assumptions Γ1


and Γ2 , respectively. We want to show that φ(t2 ) follows from Γ1 ∪ Γ2 . Con-
sider a structure M with M ⊨ Γ1 ∪ Γ2 . By induction hypothesis, M ⊨ φ(t1 )
and M ⊨ t1 = t2 . Therefore, ValM (t1 ) = ValM (t2 ). Let s be any variable
assignment, and m = ValM (t1 ) = ValM (t2 ). By Proposition 7.23, M, s ⊨ φ(t1 )
iff M, s[m/x ] ⊨ φ( x ) iff M, s ⊨ φ(t2 ). Since M ⊨ φ(t1 ), we have M ⊨ φ(t2 ).

Chapter 10

The Completeness Theorem

10.1 Introduction
The completeness theorem is one of the most fundamental results about logic.
It comes in two formulations, the equivalence of which we’ll prove. In its first
formulation it says something fundamental about the relationship between
semantic consequence and our derivation system: if a sentence φ follows from
some sentences Γ, then there is also a derivation that establishes Γ ⊢ φ. Thus,
the derivation system is as strong as it can possibly be without proving things
that don’t actually follow.
In its second formulation, it can be stated as a model existence result: ev-
ery consistent set of sentences is satisfiable. Consistency is a proof-theoretic
notion: it says that our derivation system is unable to produce certain deriva-
tions. But who’s to say that just because there are no derivations of a certain
sort from Γ, it’s guaranteed that there is a structure M? Before the complete-
ness theorem was first proved—in fact before we had the derivation systems
we now do—the great German mathematician David Hilbert held the view
that consistency of mathematical theories guarantees the existence of the ob-
jects they are about. He put it as follows in a letter to Gottlob Frege:

If the arbitrarily given axioms do not contradict one another with


all their consequences, then they are true and the things defined by
the axioms exist. This is for me the criterion of truth and existence.

Frege vehemently disagreed. The second formulation of the completeness the-


orem shows that Hilbert was right in at least the sense that if the axioms are
consistent, then some structure exists that makes them all true.
These aren’t the only reasons the completeness theorem—or rather, its
proof—is important. It has a number of important consequences, some of
which we’ll discuss separately. For instance, since any derivation that shows
Γ ⊢ φ is finite and so can only use finitely many of the sentences in Γ, it fol-
lows by the completeness theorem that if φ is a consequence of Γ, it is already


a consequence of a finite subset of Γ. This is called compactness. Equivalently,


if every finite subset of Γ is consistent, then Γ itself must be consistent.
Although the compactness theorem follows from the completeness theo-
rem via the detour through derivations, it is also possible to use the proof
of the completeness theorem to establish it directly. For what the proof does is
take a set of sentences with a certain property—consistency—and construct
a structure out of this set that has certain properties (in this case, that it satisfies
the set). Almost the very same construction can be used to directly establish
compactness, by starting from “finitely satisfiable” sets of sentences instead
of consistent ones. The construction also yields other consequences, e.g., that
any satisfiable set of sentences has a finite or countably infinite model. (This
result is called the Löwenheim–Skolem theorem.) In general, the construction
of structures from sets of sentences is used often in logic, and sometimes even
in philosophy.

10.2 Outline of the Proof


The proof of the completeness theorem is a bit complex, and upon first reading
it, it is easy to get lost. So let us outline the proof. The first step is a shift of
perspective, that allows us to see a route to a proof. When completeness is
thought of as “whenever Γ ⊨ φ then Γ ⊢ φ,” it may be hard to even come
up with an idea: for to show that Γ ⊢ φ we have to find a derivation, and
it does not look like the hypothesis that Γ ⊨ φ helps us for this in any way.
For some proof systems it is possible to directly construct a derivation, but we
will take a slightly different approach. The shift in perspective required is this:
completeness can also be formulated as: “if Γ is consistent, it is satisfiable.”
Perhaps we can use the information in Γ together with the hypothesis that it is
consistent to construct a structure that satisfies every sentence in Γ. After all,
we know what kind of structure we are looking for: one that is as Γ describes
it!
If Γ contains only atomic sentences, it is easy to construct a model for it.
Suppose the atomic sentences are all of the form P( a1 , . . . , an ) where the ai
are constant symbols. All we have to do is come up with a domain |M| and
an assignment for P so that M ⊨ P( a1 , . . . , an ). But that’s not very hard: put
|M| = N, ciM = i, and for every P( a1 , . . . , an ) ∈ Γ, put the tuple ⟨k1 , . . . , k n ⟩
into PM , where k i is the index of the constant symbol ai (i.e., ai ≡ cki ).
Now suppose Γ contains some formula ∼ψ, with ψ atomic. We might
worry that the construction of M interferes with the possibility of making ∼ψ
true. But here’s where the consistency of Γ comes in: if ∼ψ ∈ Γ, then ψ ∉ Γ, or
else Γ would be inconsistent. And if ψ ∉ Γ, then according to our construction
of M, M ⊭ ψ, so M ⊨ ∼ψ. So far so good.
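
As a quick illustration of this construction (a sketch under our own, hypothetical representation, not part of the proof), the extension of a predicate symbol can be read off directly from the atomic sentences in Γ: the domain is N, each ci is interpreted as i, and PM collects the corresponding index tuples.

def extension(gamma_atoms, P):
    """gamma_atoms: pairs (predicate name, tuple of constant indices), one for each
    atomic sentence like P(c_1, c_3) in Γ; returns the extension P^M ⊆ N^n."""
    return {indices for (pred, indices) in gamma_atoms if pred == P}

gamma_atoms = {("P", (1, 3)), ("P", (0, 0)), ("Q", (2,))}
P_M = extension(gamma_atoms, "P")
print((1, 3) in P_M)   # True:  P(c_1, c_3) ∈ Γ, so M ⊨ P(c_1, c_3)
print((3, 1) in P_M)   # False: P(c_3, c_1) ∉ Γ, so M ⊭ P(c_3, c_1), hence M ⊨ ∼P(c_3, c_1)
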
What if Γ contains complex, non-atomic formulas? Say it contains φ & ψ.
To make that true, we should proceed as if both φ and ψ were in Γ. And if


φ ∨ ψ ∈ Γ, then we will have to make at least one of them true, i.e., proceed as
if one of them was in Γ.
This suggests the following idea: we add additional formulae to Γ so as to
(a) keep the resulting set consistent and (b) make sure that for every possible
atomic sentence φ, either φ is in the resulting set, or ∼ φ is, and (c) such that,
whenever φ & ψ is in the set, so are both φ and ψ, if φ ∨ ψ is in the set, at least
one of φ or ψ is also, etc. We keep doing this (potentially forever). Call the set
of all formulae so added Γ∗ . Then our construction above would provide us
with a structure M for which we could prove, by induction, that it satisfies all
sentences in Γ∗ , and hence also all sentences in Γ since Γ ⊆ Γ∗ . It turns out that
guaranteeing (a) and (b) is enough. A set of sentences for which (b) holds is
called complete. So our task will be to extend the consistent set Γ to a consistent
and complete set Γ∗ .
There is one wrinkle in this plan: if ∃ x φ( x ) ∈ Γ we would hope to be able
to pick some constant symbol c and add φ(c) in this process. But how do we
know we can always do that? Perhaps we only have a few constant symbols
in our language, and for each one of them we have ∼ φ(c) ∈ Γ. We can’t also
add φ(c), since this would make the set inconsistent, and we wouldn’t know
whether M has to make φ(c) or ∼ φ(c) true. Moreover, it might happen that Γ
contains only sentences in a language that has no constant symbols at all (e.g.,
the language of set theory).
The solution to this problem is to simply add infinitely many constants at
the beginning, plus sentences that connect them with the quantifiers in the
right way. (Of course, we have to verify that this cannot introduce an incon-
sistency.)
Our original construction works well if we only have constant symbols in
the atomic sentences. But the language might also contain function symbols.
In that case, it might be tricky to find the right functions on N to assign to
these function symbols to make everything work. So here’s another trick: in-
stead of using i to interpret ci , just take the set of constant symbols itself as
the domain. Then M can assign every constant symbol to itself: ciM = ci . But
why not go all the way: let |M| be all terms of the language! If we do this,
there is an obvious assignment of functions (that take terms as arguments and
have terms as values) to function symbols: we assign to the function sym-
bol fin the function which, given n terms t1 , . . . , tn as input, produces the term
fin (t1 , . . . , tn ) as value.
The last piece of the puzzle is what to do with =. The predicate symbol =
has a fixed interpretation: M ⊨ t = t′ iff ValM (t) = ValM (t′ ). Now if we set
things up so that the value of a term t is t itself, then this structure will make
no sentence of the form t = t′ true unless t and t′ are one and the same term.
And of course this is a problem, since basically every interesting theory in a
language with function symbols will have as theorems sentences t = t′ where
t and t′ are not the same term (e.g., in theories of arithmetic: (0 + 0) = 0). To


solve this problem, we change the domain of M: instead of using terms as the
objects in |M|, we use sets of terms, and each set is so that it contains all those
terms which the sentences in Γ require to be equal. So, e.g., if Γ is a theory of
arithmetic, one of these sets will contain: 0, (0 + 0), (0 × 0), etc. This will be
the set we assign to 0, and it will turn out that this set is also the value of all
the terms in it, e.g., also of (0 + 0). Therefore, the sentence (0 + 0) = 0 will be
true in this revised structure.
So here’s what we’ll do. First we investigate the properties of complete
consistent sets, in particular we prove that a complete consistent set contains
φ & ψ iff it contains both φ and ψ, φ ∨ ψ iff it contains at least one of them,
etc. (Proposition 10.2). Then we define and investigate “saturated” sets of
sentences. A saturated set is one which contains conditionals that link each
quantified sentence to instances of it (Definition 10.5). We show that any con-
sistent set Γ can always be extended to a saturated set Γ′ (Lemma 10.6). If a set
is consistent, saturated, and complete it also has the property that it contains
∃ x φ( x ) iff it contains φ(t) for some closed term t and ∀ x φ( x ) iff it contains
φ(t) for all closed terms t (Proposition 10.7). We’ll then take the saturated con-
sistent set Γ′ and show that it can be extended to a saturated, consistent, and
complete set Γ∗ (Lemma 10.8). This set Γ∗ is what we’ll use to define our term
model M(Γ∗ ). The term model has the set of closed terms as its domain, and
the interpretation of its predicate symbols is given by the atomic sentences
in Γ∗ (Definition 10.9). We’ll use the properties of saturated, complete con-
sistent sets to show that indeed M(Γ∗ ) ⊨ φ iff φ ∈ Γ∗ (Lemma 10.12), and
thus in particular, M(Γ∗ ) ⊨ Γ. Finally, we’ll consider how to define a term
model if Γ contains = as well (Definition 10.16) and show that it satisfies Γ∗
(Lemma 10.19).

10.3 Complete Consistent Sets of Sentences


Definition 10.1 (Complete set). A set Γ of sentences is complete iff for any sen-
tence φ, either φ ∈ Γ or ∼ φ ∈ Γ.

Complete sets of sentences leave no questions unanswered. For any sen-


tence φ, Γ “says” if φ is true or false. The importance of complete sets extends
beyond the proof of the completeness theorem. A theory which is complete
and axiomatizable, for instance, is always decidable.
Complete consistent sets are important in the completeness proof since we
can guarantee that every consistent set of sentences Γ is contained in a com-
plete consistent set Γ∗ . A complete consistent set contains, for each sentence φ,
either φ or its negation ∼ φ, but not both. This is true in particular for atomic
sentences, so from a complete consistent set in a language suitably expanded
by constant symbols, we can construct a structure where the interpretation of
predicate symbols is defined according to which atomic sentences are in Γ∗ .
This structure can then be shown to make all sentences in Γ∗ (and hence also


all those in Γ) true. The proof of this latter fact requires that ∼ φ ∈ Γ∗ iff
φ ∉ Γ∗ , ( φ ∨ ψ) ∈ Γ∗ iff φ ∈ Γ∗ or ψ ∈ Γ∗ , etc.
In what follows, we will often tacitly use the properties of reflexivity, mono-
tonicity, and transitivity of ⊢ (see section 9.9).

Proposition 10.2. Suppose Γ is complete and consistent. Then:

1. If Γ ⊢ φ, then φ ∈ Γ.

2. φ & ψ ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ.

3. φ ∨ ψ ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ.

4. φ ⊃ ψ ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ.

Proof. Let us suppose for all of the following that Γ is complete and consistent.

1. If Γ ⊢ φ, then φ ∈ Γ.
Suppose that Γ ⊢ φ. Suppose to the contrary that φ ∉ Γ. Since Γ is com-
plete, ∼ φ ∈ Γ. By Proposition 9.20, Γ is inconsistent. This contradicts the
assumption that Γ is consistent. Hence, it cannot be the case that φ ∉ Γ,
so φ ∈ Γ.

2. φ & ψ ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ:


For the forward direction, suppose φ & ψ ∈ Γ. Then by Proposition 9.22,
item (1), Γ ⊢ φ and Γ ⊢ ψ. By (1), φ ∈ Γ and ψ ∈ Γ, as required.
For the reverse direction, let φ ∈ Γ and ψ ∈ Γ. By Proposition 9.22,
item (2), Γ ⊢ φ & ψ. By (1), φ & ψ ∈ Γ.

3. First we show that if φ ∨ ψ ∈ Γ, then either φ ∈ Γ or ψ ∈ Γ. Suppose


φ ∨ ψ ∈ Γ but φ ∉ Γ and ψ ∉ Γ. Since Γ is complete, ∼ φ ∈ Γ and
∼ψ ∈ Γ. By Proposition 9.23, item (1), Γ is inconsistent, a contradiction.
Hence, either φ ∈ Γ or ψ ∈ Γ.
For the reverse direction, suppose that φ ∈ Γ or ψ ∈ Γ. By Proposi-
tion 9.23, item (2), Γ ⊢ φ ∨ ψ. By (1), φ ∨ ψ ∈ Γ, as required.

4. Exercise.

10.4 Henkin Expansion


Part of the challenge in proving the completeness theorem is that the model
we construct from a complete consistent set Γ must make all the quantified
formulae in Γ true. In order to guarantee this, we use a trick due to Leon
Henkin. In essence, the trick consists in expanding the language by infinitely
many constant symbols and adding, for each formula with one free variable


φ( x ) a formula of the form ∃ x φ( x ) ⊃ φ(c), where c is one of the new constant


symbols. When we construct the structure satisfying Γ, this will guarantee
that each true existential sentence has a witness among the new constants.

Proposition 10.3. If Γ is consistent in L and L′ is obtained from L by adding


a countably infinite set of new constant symbols d0 , d1 , . . . , then Γ is consistent
in L′ .

Definition 10.4 (Saturated set). A set Γ of formulae of a language L is satu-


rated iff for each formula φ( x ) ∈ Frm(L) with one free variable x there is
a constant symbol c ∈ L such that ∃ x φ( x ) ⊃ φ(c) ∈ Γ.

The following definition will be used in the proof of the next theorem.

Definition 10.5. Let L′ be as in Proposition 10.3. Fix an enumeration φ0 ( x0 ),


φ1 ( x1 ), . . . of all formulae φi ( xi ) of L′ in which one variable (xi ) occurs free.
We define the sentences θn by induction on n.
Let c0 be the first constant symbol among the di we added to L which does
not occur in φ0 ( x0 ). Assuming that θ0 , . . . , θn−1 have already been defined,
let cn be the first among the new constant symbols di that occurs neither in θ0 ,
. . . , θn−1 nor in φn ( xn ).
Now let θn be the formula ∃ xn φn ( xn ) ⊃ φn (cn ).
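
The choice of the constants cn is pure bookkeeping and can be pictured in a few lines of code. The following Python sketch is illustrative only: formulas are plain strings, substitution is naive string replacement, and the function name is ours, so it is reliable only for toy inputs like the one shown.

def henkin_sentences(formulas, variables, new_constants):
    """formulas[n] is φ_n written with the free variable variables[n];
    new_constants enumerates the added constants d_0, d_1, ..."""
    thetas = []
    for phi, x in zip(formulas, variables):
        # c_n: the first new constant occurring neither in θ_0,...,θ_{n-1} nor in φ_n(x_n)
        c = next(d for d in new_constants
                 if all(d not in theta for theta in thetas) and d not in phi)
        thetas.append(f"∃{x} {phi} ⊃ {phi.replace(x, c)}")   # ∃x_n φ_n(x_n) ⊃ φ_n(c_n)
    return thetas

phis = ["P(x0)", "Q(x1, d0)"]     # d0 already occurs in the second formula
print(henkin_sentences(phis, ["x0", "x1"], [f"d{i}" for i in range(10)]))
# ['∃x0 P(x0) ⊃ P(d0)', '∃x1 Q(x1, d0) ⊃ Q(d1, d0)']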

Lemma 10.6. Every consistent set Γ can be extended to a saturated consistent set Γ′ .

Proof. Given a consistent set of sentences Γ in a language L, expand the lan-


guage by adding a countably infinite set of new constant symbols to form L′ .
By Proposition 10.3, Γ is still consistent in the richer language. Further, let θi
be as in Definition 10.5. Let

Γ0 = Γ
Γ n +1 = Γ n ∪ { θ n }

i.e., Γn+1 = Γ ∪ {θ0 , . . . , θn }, and let Γ′ = ⋃n Γn . Γ′ is clearly saturated.

If Γ′ were inconsistent, then for some n, Γn would be inconsistent (Exercise:


explain why). So to show that Γ′ is consistent it suffices to show, by induction
on n, that each set Γn is consistent.
The induction basis is simply the claim that Γ0 = Γ is consistent, which
is the hypothesis of the theorem. For the induction step, suppose that Γn is
consistent but Γn+1 = Γn ∪ {θn } is inconsistent. Recall that θn is ∃ xn φn ( xn ) ⊃
φn (cn ), where φn ( xn ) is a formula of L′ with only the variable xn free. By the
way we’ve chosen the cn (see Definition 10.5), cn does not occur in φn ( xn ) nor
in Γn .
If Γn ∪ {θn } is inconsistent, then Γn ⊢ ∼θn , and hence both of the following
hold:
Γn ⊢ ∃ xn φn ( xn ) Γn ⊢ ∼ φn (cn )


Since cn does not occur in Γn or in φn ( xn ), Theorem 9.25 applies. From Γn ⊢


∼ φn (cn ), we obtain Γn ⊢ ∀ xn ∼ φn ( xn ). Thus we have that both Γn ⊢ ∃ xn φn ( xn )
and Γn ⊢ ∀ xn ∼ φn ( xn ), so Γn itself is inconsistent. (Note that ∀ xn ∼ φn ( xn ) ⊢
∼∃ xn φn ( xn ).) Contradiction: Γn was supposed to be consistent. Hence Γn ∪
{θn } is consistent.

We’ll now show that complete, consistent sets which are saturated have the
property that they contain a universally quantified sentence iff they contain all
its instances, and an existentially quantified sentence iff they contain at
least one instance. We’ll use this to show that the structure we’ll generate from
a complete, consistent, saturated set makes all its quantified sentences true.

Proposition 10.7. Suppose Γ is complete, consistent, and saturated.

1. ∃ x φ( x ) ∈ Γ iff φ(t) ∈ Γ for at least one closed term t.

2. ∀ x φ( x ) ∈ Γ iff φ(t) ∈ Γ for all closed terms t.

Proof. 1. First suppose that ∃ x φ( x ) ∈ Γ. Because Γ is saturated, (∃ x φ( x ) ⊃


φ(c)) ∈ Γ for some constant symbol c. By Proposition 9.24, item (1), and
Proposition 10.2(1), φ(c) ∈ Γ.
For the other direction, saturation is not necessary: Suppose φ(t) ∈ Γ.
Then Γ ⊢ ∃ x φ( x ) by Proposition 9.26, item (1). By Proposition 10.2(1),
∃ x φ( x ) ∈ Γ.
2. Exercise.

10.5 Lindenbaum’s Lemma


We now prove a lemma that shows that any consistent set of sentences is con-
tained in some set of sentences which is not just consistent, but also complete.
The proof works by adding one sentence at a time, guaranteeing at each step
that the set remains consistent. We do this so that for every φ, either φ or ∼ φ
gets added at some stage. The union of all stages in that construction then
contains either φ or its negation ∼ φ and is thus complete. It is also consistent,
since we make sure at each stage not to introduce an inconsistency.

Lemma 10.8 (Lindenbaum’s Lemma). Every consistent set Γ in a language L can


be extended to a complete and consistent set Γ∗ .

Proof. Let Γ be consistent. Let φ0 , φ1 , . . . be an enumeration of all the sen-


tences of L. Define Γ0 = Γ, and
(
Γn ∪ { φn } if Γn ∪ { φn } is consistent;
Γ n +1 =
Γn ∪ {∼ φn } otherwise.


Let Γ∗ = ⋃n≥0 Γn .

Each Γn is consistent: Γ0 is consistent by definition. If Γn+1 = Γn ∪ { φn },


this is because the latter is consistent. If it isn’t, Γn+1 = Γn ∪ {∼ φn }. We have
to verify that Γn ∪ {∼ φn } is consistent. Suppose it’s not. Then both Γn ∪ { φn }
and Γn ∪ {∼ φn } are inconsistent. This means that Γn would be inconsistent
by Proposition 9.21, contrary to the induction hypothesis.
For every n and every i < n, Γi ⊆ Γn . This follows by a simple induction
on n. For n = 0, there are no i < 0, so the claim holds automatically. For
the inductive step, suppose it is true for n. We show that if i < n + 1 then
Γi ⊆ Γn+1 . We have Γn+1 = Γn ∪ { φn } or = Γn ∪ {∼ φn } by construction. So
Γn ⊆ Γn+1 . If i < n + 1, then Γi ⊆ Γn by inductive hypothesis (if i < n) or the
trivial fact that Γn ⊆ Γn (if i = n). We get that Γi ⊆ Γn+1 by transitivity of ⊆.
From this it follows that Γ∗ is consistent. Here’s why: Let Γ′ ⊆ Γ∗ be finite.
Each ψ ∈ Γ′ is also in Γi for some i. Let n be the largest of these. Since Γi ⊆ Γn
if i ≤ n, every ψ ∈ Γ′ is also ∈ Γn , i.e., Γ′ ⊆ Γn , and Γn is consistent. So, every
finite subset Γ′ ⊆ Γ∗ is consistent. By Proposition 9.17, Γ∗ is consistent.
Every sentence of Frm(L) appears on the list used to define Γ∗ . If φn ∉ Γ∗ ,
then that is because Γn ∪ { φn } was inconsistent. But then ∼ φn ∈ Γ∗ , so Γ∗ is
complete.
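
The stage-by-stage construction in this proof can be summarized as pseudocode. The sketch below is purely conceptual: is_consistent stands for a hypothetical consistency oracle (no such computable test exists in general), and all names are ours; it merely mirrors the structure of the argument.

def lindenbaum_stages(gamma, sentences, is_consistent, n_stages):
    stage = set(gamma)                       # Γ_0 = Γ
    for phi in sentences[:n_stages]:
        if is_consistent(stage | {phi}):     # Γ_{n+1} = Γ_n ∪ {φ_n} if that is consistent,
            stage = stage | {phi}
        else:                                # and Γ_{n+1} = Γ_n ∪ {∼φ_n} otherwise
            stage = stage | {"∼" + phi}
    return stage                             # Γ* is the union of all the stages

# Toy run: atomic sentences only, "consistent" = not containing both p and ∼p.
toy_consistent = lambda s: not any(p in s and "∼" + p in s for p in ("p", "q"))
print(lindenbaum_stages({"p"}, ["q", "p"], toy_consistent, 2))
# {'p', 'q'} (set order may vary): every listed sentence gets decided, consistency is preserved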

10.6 Construction of a Model


Right now we are not concerned about =, i.e., we only want to show that a
consistent set Γ of sentences not containing = is satisfiable. We first extend Γ
to a consistent, complete, and saturated set Γ∗ . In this case, the definition of a
model M(Γ∗ ) is simple: We take the set of closed terms of L′ as the domain.
We assign every constant symbol to itself, and make sure that more generally,
for every closed term t, ValM(Γ∗) (t) = t. The predicate symbols are assigned
extensions in such a way that an atomic sentence is true in M(Γ∗ ) iff it is in Γ∗ .
This will obviously make all the atomic sentences in Γ∗ true in M(Γ∗ ). The rest
are true provided the Γ∗ we start with is consistent, complete, and saturated.

Definition 10.9 (Term model). Let Γ∗ be a complete and consistent, saturated


set of sentences in a language L. The term model M(Γ∗ ) of Γ∗ is the structure
defined as follows:

1. The domain |M(Γ∗ )| is the set of all closed terms of L.


2. The interpretation of a constant symbol c is c itself: cM(Γ∗) = c.

3. The function symbol f is assigned the function which, given as argu-
ments the closed terms t1 , . . . , tn , has as value the closed term f (t1 , . . . , tn ):

f M(Γ∗) (t1 , . . . , tn ) = f (t1 , . . . , tn )


4. If R is an n-place predicate symbol, then

⟨t1 , . . . , tn ⟩ ∈ RM(Γ∗) iff R(t1 , . . . , tn ) ∈ Γ∗ .

We will now check that we indeed have ValM(Γ∗) (t) = t.

Lemma 10.10. Let M(Γ∗ ) be the term model of Definition 10.9, then ValM(Γ∗) (t) = t.

Proof. The proof is by induction on t, where the base case, when t is a con-
stant symbol, follows directly from the definition of the term model. For the

induction step assume t1 , . . . , tn are closed terms such that ValM(Γ∗) (ti ) = ti
and that f is an n-ary function symbol. Then

ValM(Γ∗) ( f (t1 , . . . , tn )) = f M(Γ∗) (ValM(Γ∗) (t1 ), . . . , ValM(Γ∗) (tn ))
= f M(Γ∗) (t1 , . . . , tn )
= f (t1 , . . . , tn ),

and so by induction this holds for every closed term t.
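
Definition 10.9 is concrete enough to mirror directly in code. In the following illustrative Python sketch (our own representation: closed terms as nested tuples, atomic sentences as predicate/argument pairs), every closed term evaluates to itself and the extension of a predicate symbol is read off from Γ∗.

def term_value(t):
    """In the term model M(Γ*), every closed term denotes itself (Lemma 10.10)."""
    return t

def predicate_extension(gamma_star_atoms, R):
    """R^{M(Γ*)}: the tuples ⟨t1,...,tn⟩ such that R(t1,...,tn) ∈ Γ*."""
    return {args for (pred, args) in gamma_star_atoms if pred == R}

def satisfies_atomic(gamma_star_atoms, R, args):
    return tuple(term_value(t) for t in args) in predicate_extension(gamma_star_atoms, R)

# Toy fragment of Γ*: the atomic sentences P(c) and Q(f(c), c).
c, fc = ("c",), ("f", ("c",))
gamma_star_atoms = {("P", (c,)), ("Q", (fc, c))}
print(satisfies_atomic(gamma_star_atoms, "P", (c,)))     # True:  P(c) ∈ Γ*
print(satisfies_atomic(gamma_star_atoms, "Q", (c, c)))   # False: Q(c, c) ∉ Γ*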

A structure M may make an existentially quantified sentence ∃ x φ( x ) true


without there being an instance φ(t) that it makes true. A structure M may
make all instances φ(t) of a universally quantified sentence ∀ x φ( x ) true, with-
out making ∀ x φ( x ) true. This is because in general not every element of |M|
is the value of a closed term (M may not be covered). This is the reason the sat-
isfaction relation is defined via variable assignments. However, for our term
model M(Γ∗ ) this wouldn’t be necessary—because it is covered. This is the
content of the next result.

Proposition 10.11. Let M(Γ∗ ) be the term model of Definition 10.9.

1. M(Γ∗ ) ⊨ ∃ x φ( x ) iff M(Γ∗ ) ⊨ φ(t) for at least one closed term t.

2. M(Γ∗ ) ⊨ ∀ x φ( x ) iff M(Γ∗ ) ⊨ φ(t) for all closed terms t.

Proof. 1. By Proposition 7.19, M(Γ∗ ) ⊨ ∃ x φ( x ) iff for at least one vari-


able assignment s, M(Γ∗ ), s ⊨ φ( x ). As |M(Γ∗ )| consists of the closed
terms of L, this is the case iff there is at least one closed term t such that
s( x ) = t and M(Γ∗ ), s ⊨ φ( x ). By Proposition 7.23, M(Γ∗ ), s ⊨ φ( x ) iff
M(Γ∗ ), s ⊨ φ(t), where s( x ) = t. By Proposition 7.18, M(Γ∗ ), s ⊨ φ(t) iff
M(Γ∗ ) ⊨ φ(t), since φ(t) is a sentence.

2. Exercise.

Lemma 10.12 (Truth Lemma). Suppose φ does not contain =. Then M(Γ∗ ) ⊨ φ
iff φ ∈ Γ∗ .


Proof. We prove both directions simultaneously, and by induction on φ.

1. φ ≡ ⊥: M(Γ∗ ) ⊭ ⊥ by definition of satisfaction. On the other hand,


⊥ ∉ Γ∗ since Γ∗ is consistent.

2. φ ≡ R(t1 , . . . , tn ): M(Γ∗ ) ⊨ R(t1 , . . . , tn ) iff ⟨t1 , . . . , tn ⟩ ∈ RM(Γ∗) (by
the definition of satisfaction) iff R(t1 , . . . , tn ) ∈ Γ∗ (by the construction
of M(Γ∗ )).

3. φ ≡ ∼ψ: M(Γ∗ ) ⊨ φ iff M(Γ∗ ) ⊭ ψ (by definition of satisfaction). By


induction hypothesis, M(Γ∗ ) ⊭ ψ iff ψ ∉ Γ∗ . Since Γ∗ is consistent and
complete, ψ ∉ Γ∗ iff ∼ψ ∈ Γ∗ .

4. φ ≡ ψ & χ: M(Γ∗ ) ⊨ φ iff we have both M(Γ∗ ) ⊨ ψ and M(Γ∗ ) ⊨ χ (by


definition of satisfaction) iff both ψ ∈ Γ∗ and χ ∈ Γ∗ (by the induction
hypothesis). By Proposition 10.2(2), this is the case iff (ψ & χ) ∈ Γ∗ .

5. φ ≡ ψ ∨ χ: M(Γ∗ ) ⊨ φ iff M(Γ∗ ) ⊨ ψ or M(Γ∗ ) ⊨ χ (by definition of


satisfaction) iff ψ ∈ Γ∗ or χ ∈ Γ∗ (by induction hypothesis). This is the
case iff (ψ ∨ χ) ∈ Γ∗ (by Proposition 10.2(3)).

6. φ ≡ ψ ⊃ χ: exercise.

7. φ ≡ ∀ x ψ( x ): exercise.

8. φ ≡ ∃ x ψ( x ): M(Γ∗ ) ⊨ φ iff M(Γ∗ ) ⊨ ψ(t) for at least one term t


(Proposition 10.11). By induction hypothesis, this is the case iff ψ(t) ∈ Γ∗
for at least one term t. By Proposition 10.7, this in turn is the case iff
∃ x ψ( x ) ∈ Γ∗ .

10.7 Identity
The construction of the term model given in the preceding section is enough
to establish completeness for first-order logic for sets Γ that do not contain =.
The term model satisfies every φ ∈ Γ∗ which does not contain = (and hence
all φ ∈ Γ). It does not work, however, if = is present. The reason is that Γ∗
then may contain a sentence t = t′ , but in the term model the value of any
term is that term itself. Hence, if t and t′ are different terms, their values in
the term model—i.e., t and t′ , respectively—are different, and so t = t′ is false.
We can fix this, however, using a construction known as “factoring.”

Definition 10.13. Let Γ∗ be a consistent and complete set of sentences in L.


We define the relation ≈ on the set of closed terms of L by

t ≈ t′ iff t = t′ ∈ Γ∗

Proposition 10.14. The relation ≈ has the following properties:


1. ≈ is reflexive.

2. ≈ is symmetric.

3. ≈ is transitive.

4. If t ≈ t′ , f is a function symbol, and t1 , . . . , ti−1 , ti+1 , . . . , tn are closed terms,


then

f (t1 , . . . , ti−1 , t, ti+1 , . . . , tn ) ≈ f (t1 , . . . , ti−1 , t′ , ti+1 , . . . , tn ).

5. If t ≈ t′ , R is a predicate symbol, and t1 , . . . , ti−1 , ti+1 , . . . , tn are closed terms,


then

R(t1 , . . . , ti−1 , t, ti+1 , . . . , tn ) ∈ Γ∗ iff


R ( t 1 , . . . , t i −1 , t ′ , t i +1 , . . . , t n ) ∈ Γ ∗ .

Proof. Since Γ∗ is consistent and complete, t = t′ ∈ Γ∗ iff Γ∗ ⊢ t = t′ . Thus it


is enough to show the following:

1. Γ∗ ⊢ t = t for all closed terms t.

2. If Γ∗ ⊢ t = t′ then Γ∗ ⊢ t′ = t.

3. If Γ∗ ⊢ t = t′ and Γ∗ ⊢ t′ = t′′ , then Γ∗ ⊢ t = t′′ .

4. If Γ∗ ⊢ t = t′ , then

Γ∗ ⊢ f (t1 , . . . , ti−1 , t, ti+1 , . . . , tn ) = f (t1 , . . . , ti−1 , t′ , ti+1 , . . . , tn )

for every n-place function symbol f and closed terms t1 , . . . , ti−1 , ti+1 ,
. . . , tn .

5. If Γ∗ ⊢ t = t′ and Γ∗ ⊢ R(t1 , . . . , ti−1 , t, ti+1 , . . . , tn ), then Γ∗ ⊢ R(t1 , . . . , ti−1 , t′ , ti+1 , . . . , tn )


for every n-place predicate symbol R and closed terms t1 , . . . , ti−1 , ti+1 ,
. . . , tn .

Definition 10.15. Suppose Γ∗ is a consistent and complete set in a language L,


t is a closed term, and ≈ as in the previous definition. Then:

[t]≈ = {t′ | t′ ∈ Trm(L), t ≈ t′ }

and Trm(L)/≈ = {[t]≈ | t ∈ Trm(L)}.
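
Since ≈ is an equivalence relation on the closed terms (Proposition 10.14), the classes [t]≈ can be computed, for any finite stock of terms and identity sentences, with a standard union-find. The following Python sketch is illustrative only; terms are plain strings and the function name is ours.

def equivalence_classes(terms, identities):
    """identities: pairs (s, t) standing for the sentences s = t in Γ*;
    returns the partition of `terms` into the classes [t]≈ (union-find)."""
    parent = {t: t for t in terms}
    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path compression
            t = parent[t]
        return t
    for s, t in identities:
        parent[find(s)] = find(t)
    classes = {}
    for t in terms:
        classes.setdefault(find(t), set()).add(t)
    return list(classes.values())

# Toy arithmetic fragment: Γ* contains 0 = (0 + 0) and (0 + 0) = (0 × 0).
terms = ["0", "(0 + 0)", "(0 × 0)", "1"]
print(equivalence_classes(terms, [("0", "(0 + 0)"), ("(0 + 0)", "(0 × 0)")]))
# [{'0', '(0 + 0)', '(0 × 0)'}, {'1'}]  (order of the classes may vary)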

Definition 10.16. Let M = M(Γ∗ ) be the term model for Γ∗ from Defini-
tion 10.9. Then M/≈ is the following structure:

1. |M/≈ | = Trm(L)/≈ .


2. cM/≈ = [c]≈

3. f M/≈ ([t1 ]≈ , . . . , [tn ]≈ ) = [ f (t1 , . . . , tn )]≈

4. ⟨[t1 ]≈ , . . . , [tn ]≈ ⟩ ∈ RM/≈ iff M ⊨ R(t1 , . . . , tn ), i.e., iff R(t1 , . . . , tn ) ∈ Γ∗ .

Note that we have defined f M/≈ and RM/≈ for elements of Trm(L)/≈ by
referring to them as [t]≈ , i.e., via representatives t ∈ [t]≈ . We have to make sure
that these definitions do not depend on the choice of these representatives, i.e.,
that for some other choices t′ which determine the same equivalence classes
([t]≈ = [t′ ]≈ ), the definitions yield the same result. For instance, if R is a one-
place predicate symbol, the last clause of the definition says that [t]≈ ∈ RM/≈
iff M ⊨ R(t). If for some other term t′ with t ≈ t′ , M ⊭ R(t), then the definition
would require [t′ ]≈ ∉ RM/≈ . If t ≈ t′ , then [t]≈ = [t′ ]≈ , but we can’t have both
[t]≈ ∈ RM/≈ and [t]≈ ∉ RM/≈ . However, Proposition 10.14 guarantees that
this cannot happen.

Proposition 10.17. M/≈ is well defined, i.e., if t1 , . . . , tn , t1′ , . . . , t′n are closed terms,
and ti ≈ ti′ then

1. [ f (t1 , . . . , tn )]≈ = [ f (t1′ , . . . , t′n )]≈ , i.e.,

f (t1 , . . . , tn ) ≈ f (t1′ , . . . , t′n )

and

2. M ⊨ R(t1 , . . . , tn ) iff M ⊨ R(t1′ , . . . , t′n ), i.e.,

R(t1 , . . . , tn ) ∈ Γ∗ iff R(t1′ , . . . , t′n ) ∈ Γ∗ .

Proof. Follows from Proposition 10.14 by induction on n.

As in the case of the term model, before proving the truth lemma we need
the following lemma.

Lemma 10.18. Let M = M(Γ∗ ), then ValM/≈ (t) = [t]≈ .

Proof. The proof is similar to that of Lemma 10.10.

Lemma 10.19. M/≈ ⊨ φ iff φ ∈ Γ∗ for all sentences φ.

Proof. By induction on φ, just as in the proof of Lemma 10.12. The only case
that needs additional attention is when φ ≡ t = t′ .

M/≈ ⊨ t = t′ iff [t]≈ = [t′ ]≈ (by definition of M/≈ )


iff t ≈ t′ (by definition of [t]≈ )
iff t = t′ ∈ Γ∗ (by definition of ≈).


Note that while M(Γ∗ ) is always countable and infinite, M/≈ may be fi-
nite, since it may turn out that there are only finitely many classes [t]≈ . This is
to be expected, since Γ may contain sentences which require any structure in
which they are true to be finite. For instance, ∀ x ∀y x = y is a consistent sen-
tence, but is satisfied only in structures with a domain that contains exactly
one element.

10.8 The Completeness Theorem


Let’s combine our results: we arrive at the completeness theorem.

Theorem 10.20 (Completeness Theorem). Let Γ be a set of sentences. If Γ is con-


sistent, it is satisfiable.

Proof. Suppose Γ is consistent. By Lemma 10.6, there is a saturated consistent


set Γ′ ⊇ Γ. By Lemma 10.8, there is a Γ∗ ⊇ Γ′ which is consistent and com-
plete. Since Γ′ ⊆ Γ∗ , for each formula φ( x ), Γ∗ contains a sentence of the
form ∃ x φ( x ) ⊃ φ(c) and so Γ∗ is saturated. If Γ does not contain =, then by
Lemma 10.12, M(Γ∗ ) ⊨ φ iff φ ∈ Γ∗ . From this it follows in particular that
for all φ ∈ Γ, M(Γ∗ ) ⊨ φ, so Γ is satisfiable. If Γ does contain =, then by
Lemma 10.19, for all sentences φ, M/≈ ⊨ φ iff φ ∈ Γ∗ . In particular, M/≈ ⊨ φ
for all φ ∈ Γ, so Γ is satisfiable.

Corollary 10.21 (Completeness Theorem, Second Version). For all Γ and sen-
tences φ: if Γ ⊨ φ then Γ ⊢ φ.

Proof. Note that the Γ’s in Corollary 10.21 and Theorem 10.20 are universally
quantified. To make sure we do not confuse ourselves, let us restate Theo-
rem 10.20 using a different variable: for any set of sentences ∆, if ∆ is consis-
tent, it is satisfiable. By contraposition, if ∆ is not satisfiable, then ∆ is incon-
sistent. We will use this to prove the corollary.
Suppose that Γ ⊨ φ. Then Γ ∪ {∼ φ} is unsatisfiable by Proposition 7.28.
Taking Γ ∪ {∼ φ} as our ∆, the previous version of Theorem 10.20 gives us that
Γ ∪ {∼ φ} is inconsistent. By Proposition 9.19, Γ ⊢ φ.

10.9 The Compactness Theorem


One important consequence of the completeness theorem is the compactness
theorem. The compactness theorem states that if each finite subset of a set
of sentences is satisfiable, the entire set is satisfiable—even if the set itself is
infinite. This is far from obvious. There is nothing that seems to rule out,
at first glance at least, the possibility of there being infinite sets of sentences
which are contradictory, but the contradiction only arises, so to speak, from
the infinite number. The compactness theorem says that such a scenario can


be ruled out: there are no unsatisfiable infinite sets of sentences each finite
subset of which is satisfiable. Like the completeness theorem, it has a version
related to entailment: if an infinite set of sentences entails something, already
a finite subset does.

Definition 10.22. A set Γ of formulae is finitely satisfiable iff every finite Γ0 ⊆ Γ


is satisfiable.

Theorem 10.23 (Compactness Theorem). The following hold for any set of sentences Γ
and any sentence φ:

1. Γ ⊨ φ iff there is a finite Γ0 ⊆ Γ such that Γ0 ⊨ φ.

2. Γ is satisfiable iff it is finitely satisfiable.

Proof. We prove (2). If Γ is satisfiable, then there is a structure M such that


M ⊨ φ for all φ ∈ Γ. Of course, this M also satisfies every finite subset of Γ, so
Γ is finitely satisfiable.
Now suppose that Γ is finitely satisfiable. Then every finite subset Γ0 ⊆ Γ
is satisfiable. By soundness (Corollary 9.29), every finite subset is consistent.
Then Γ itself must be consistent by Proposition 9.17. By completeness (Theo-
rem 10.20), since Γ is consistent, it is satisfiable.

Example 10.24. In every model M of a theory Γ, each term t of course picks


out an element of |M|. Can we guarantee that it is also true that every element
of |M| is picked out by some term or other? In other words, are there theo-
ries Γ all models of which are covered? The compactness theorem shows that
this is not the case if Γ has infinite models. Here’s how to see this: Let M be
an infinite model of Γ, and let c be a constant symbol not in the language of Γ.
Let ∆ be the set of all sentences c ̸= t for t a term in the language L of Γ, i.e.,

∆ = {c ̸= t | t ∈ Trm(L)}.

A finite subset of Γ ∪ ∆ can be written as Γ′ ∪ ∆′ , with Γ′ ⊆ Γ and ∆′ ⊆ ∆. Since


∆′ is finite, it can contain only finitely many terms. Let a ∈ |M| be an element
of |M| not picked out by any of them, and let M′ be the structure that is just
like M, but also cM′ = a. Since a ̸= ValM (t) for all t occurring in ∆′ , M′ ⊨ ∆′ .
Since M ⊨ Γ, Γ′ ⊆ Γ, and c does not occur in Γ, also M′ ⊨ Γ′ . Together,
M′ ⊨ Γ′ ∪ ∆′ for every finite subset Γ′ ∪ ∆′ of Γ ∪ ∆. So every finite subset
of Γ ∪ ∆ is satisfiable. By compactness, Γ ∪ ∆ itself is satisfiable. So there are
models M ⊨ Γ ∪ ∆. Every such M is a model of Γ, but is not covered, since
ValM (c) ̸= ValM (t) for all terms t of L.

Example 10.25. Consider a language L containing the predicate symbol <,


constant symbols 0, 1, and function symbols +, ×, and −. Let Γ be the set of
all sentences in this language true in the structure Q with domain Q and the


obvious interpretations. Γ is the set of all sentences of L true about the rational
numbers. Of course, in Q (and even in R), there are no numbers r which are
greater than 0 but less than 1/k for all k ∈ Z+ . Such a number, if it existed,
would be an infinitesimal: non-zero, but infinitely small. The compactness
theorem can be used to show that there are models of Γ in which infinitesimals
exist. We do not have a function symbol for division in our language (division
by zero is undefined, and function symbols have to be interpreted by total
functions). However, we can still express that r < 1/k, since this is the case iff
r · k < 1. Now let c be a new constant symbol and let ∆ be
{ 0 < c } ∪ { c × k < 1 | k ∈ Z+ }
(where k is the numeral (1 + (1 + · · · + (1 + 1) . . . )) with k 1's). For any finite subset ∆0
of ∆ there is a K such that all the sentences c × k < 1 in ∆0 have k < K.
If we expand Q to Q′ with cQ′ = 1/K, we have that Q′ ⊨ Γ0 ∪ ∆0 for
any finite Γ0 ⊆ Γ, and so Γ ∪ ∆ is finitely satisfiable (Exercise: prove this in
detail). By compactness, Γ ∪ ∆ is satisfiable. Any model S of Γ ∪ ∆ contains
an infinitesimal, namely cS .

Example 10.26. We know that first-order logic with the identity predicate can
express that the domain must have some minimal size: the sentence φ≥n
(which says “there are at least n distinct objects”) is true only in
structures where |M| has at least n objects. So if we take
∆ = { φ≥n | n ≥ 1 }
then any model of ∆ must be infinite. Thus, we can guarantee that a theory
only has infinite models by adding ∆ to it: the models of Γ ∪ ∆ are all and only
the infinite models of Γ.
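For concreteness, here is one standard way of writing out such a sentence (a worked example for n = 3, not spelled out in the text):

∃ x1 ∃ x2 ∃ x3 ( x1 ̸= x2 & x1 ̸= x3 & x2 ̸= x3 )

is true in a structure M just in case |M| contains at least three elements; φ≥n for other values of n is written analogously, with one inequation for each pair of variables.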
So first-order logic can express infinitude. The compactness theorem shows
that it cannot express finitude, however. For suppose some set of sentences Λ
were satisfied in all and only finite structures. Then ∆ ∪ Λ is finitely satisfiable.
Why? Suppose ∆′ ∪ Λ′ ⊆ ∆ ∪ Λ is finite with ∆′ ⊆ ∆ and Λ′ ⊆ Λ. Let n be the
largest number such that φ≥n ∈ ∆′ . Λ, being satisfied in all finite structures,
has a model M with finitely many but ≥ n elements. But then M ⊨ ∆′ ∪ Λ′ ,
so ∆ ∪ Λ is finitely satisfiable. By compactness, ∆ ∪ Λ has a model; since that
model satisfies ∆, it is infinite, contradicting the assumption that Λ is satisfied
only in finite structures.

10.10 A Direct Proof of the Compactness Theorem


We can prove the Compactness Theorem directly, without appealing to the
Completeness Theorem, using the same ideas as in the proof of the complete-
ness theorem. In the proof of the Completeness Theorem we started with a
consistent set Γ of sentences, expanded it to a consistent, saturated, and com-
plete set Γ∗ of sentences, and then showed that in the term model M(Γ∗ ) con-
structed from Γ∗ , all sentences of Γ are true, so Γ is satisfiable.


We can use the same method to show that a finitely satisfiable set of sen-
tences is satisfiable. We just have to prove the corresponding versions of
the results leading to the truth lemma where we replace “consistent” with
“finitely satisfiable.”

Proposition 10.27. Suppose Γ is complete and finitely satisfiable. Then:

1. ( φ & ψ) ∈ Γ iff both φ ∈ Γ and ψ ∈ Γ.

2. ( φ ∨ ψ) ∈ Γ iff either φ ∈ Γ or ψ ∈ Γ.

3. ( φ ⊃ ψ) ∈ Γ iff either φ ∉ Γ or ψ ∈ Γ.

Lemma 10.28. Every finitely satisfiable set Γ can be extended to a saturated finitely
satisfiable set Γ′ .

Proposition 10.29. Suppose Γ is complete, finitely satisfiable, and saturated.

1. ∃ x φ( x ) ∈ Γ iff φ(t) ∈ Γ for at least one closed term t.

2. ∀ x φ( x ) ∈ Γ iff φ(t) ∈ Γ for all closed terms t.

Lemma 10.30. Every finitely satisfiable set Γ can be extended to a complete and
finitely satisfiable set Γ∗ .

Theorem 10.31 (Compactness). Γ is satisfiable if and only if it is finitely satisfiable.

Proof. If Γ is satisfiable, then there is a structure M such that M ⊨ φ for all


φ ∈ Γ. Of course, this M also satisfies every finite subset of Γ, so Γ is finitely
satisfiable.
Now suppose that Γ is finitely satisfiable. By Lemma 10.28, there is a
finitely satisfiable, saturated set Γ′ ⊇ Γ. By Lemma 10.30, Γ′ can be extended
to a complete and finitely satisfiable set Γ∗ , and Γ∗ is still saturated. Construct
the term model M(Γ∗ ) as in Definition 10.9. Note that Proposition 10.11 did
not rely on the fact that Γ∗ is consistent (or complete or saturated, for that mat-
ter), but just on the fact that M(Γ∗ ) is covered. The proof of the Truth Lemma
(Lemma 10.12) goes through if we replace references to Proposition 10.2 and
Proposition 10.7 by references to Proposition 10.27 and Proposition 10.29.

10.11 The Löwenheim–Skolem Theorem


The Löwenheim–Skolem Theorem says that if a theory has an infinite model,
then it also has a model that is at most countably infinite. An immediate con-
sequence of this fact is that first-order logic cannot express that the size of
a structure is uncountable: any sentence or set of sentences satisfied in all
uncountable structures is also satisfied in some countable structure.


Theorem 10.32. If Γ is consistent then it has a countable model, i.e., it is satisfiable


in a structure whose domain is either finite or countably infinite.

Proof. If Γ is consistent, the structure M delivered by the proof of the com-


pleteness theorem has a domain |M| that is no larger than the set of the terms
of the language L. So M is at most countably infinite.

Theorem 10.33. If Γ is a consistent set of sentences in the language of first-order


logic without identity, then it has a countably infinite model, i.e., it is satisfiable in
a structure whose domain is infinite and countable.

Proof. If Γ is consistent and contains no sentences in which identity appears,


then the structure M delivered by the proof of the completeness theorem has a
domain |M| identical to the set of terms of the language L′ . So M is countably
infinite, since Trm(L′ ) is.

Example 10.34 (Skolem’s Paradox). Zermelo–Fraenkel set theory ZFC is a very


powerful framework in which practically all mathematical statements can be
expressed, including facts about the sizes of sets. So for instance, ZFC can
prove that the set R of real numbers is uncountable, it can prove Cantor’s
Theorem that the power set of any set is larger than the set itself, etc. If ZFC
is consistent, its models are all infinite, and moreover, they all contain ele-
ments about which the theory says that they are uncountable, such as the
element that makes true the theorem of ZFC that the power set of the natural
numbers exists. By the Löwenheim–Skolem Theorem, ZFC also has count-
able models—models that contain “uncountable” sets but which themselves
are countable.

Chapter 11

Beyond First-order Logic

11.1 Overview
First-order logic is not the only system of logic of interest: there are many ex-
tensions and variations of first-order logic. A logic typically consists of the
formal specification of a language, usually, but not always, a deductive sys-
tem, and usually, but not always, an intended semantics. But the technical use
of the term raises an obvious question: what do logics that are not first-order
logic have to do with the word “logic,” used in the intuitive or philosophical
sense? All of the systems described below are designed to model reasoning of
some form or another; can we say what makes them logical?
No easy answers are forthcoming. The word “logic” is used in different
ways and in different contexts, and the notion, like that of “truth,” has been
analyzed from numerous philosophical stances. For example, one might take
the goal of logical reasoning to be the determination of which statements are
necessarily true, true a priori, true independent of the interpretation of the
nonlogical terms, true by virtue of their form, or true by linguistic convention;
and each of these conceptions requires a good deal of clarification. Even if one
restricts one’s attention to the kind of logic used in mathematics, there is little
agreement as to its scope. For example, in the Principia Mathematica, Russell
and Whitehead tried to develop mathematics on the basis of logic, in the logi-
cist tradition begun by Frege. Their system of logic was a form of higher-type
logic similar to the one described below. In the end they were forced to intro-
duce axioms which, by most standards, do not seem purely logical (notably,
the axiom of infinity, and the axiom of reducibility), but one might nonetheless
hold that some forms of higher-order reasoning should be accepted as logical.
In contrast, Quine, whose ontology does not admit “propositions” as legiti-
mate objects of discourse, argues that second-order and higher-order logic are
really manifestations of set theory in sheep’s clothing; in other words, systems
involving quantification over predicates are not purely logical.
For now, it is best to leave such philosophical issues for a rainy day, and


simply think of the systems below as formal idealizations of various kinds of


reasoning, logical or otherwise.

11.2 Many-Sorted Logic


In first-order logic, variables and quantifiers range over a single domain. But
it is often useful to have multiple (disjoint) domains: for example, you might
want to have a domain of numbers, a domain of geometric objects, a domain
of functions from numbers to numbers, a domain of abelian groups, and so
on.
Many-sorted logic provides this kind of framework. One starts with a list
of “sorts”—the “sort” of an object indicates the “domain” it is supposed to
inhabit. One then has variables and quantifiers for each sort, and (usually)
an identity predicate for each sort. Functions and relations are also “typed”
by the sorts of objects they can take as arguments. Otherwise, one keeps the
usual rules of first-order logic, with versions of the quantifier-rules repeated
for each sort.
For example, to study international relations we might choose a language
with two sorts of objects, French citizens and German citizens. We might have
a unary relation, “drinks wine,” for objects of the first sort; another unary
relation, “eats wurst,” for objects of the second sort; and a binary relation,
“forms a multinational married couple,” which takes two arguments, where
the first argument is of the first sort and the second argument is of the second
sort. If we use variables a, b, c to range over French citizens and x, y, z to range
over German citizens, then

∀ a ∀ x (MarriedTo( a, x ) ⊃ (DrinksWine( a) ∨ ∼EatsWurst( x )))


asserts that if any French person is married to a German, either the French
person drinks wine or the German doesn’t eat wurst.
Many-sorted logic can be embedded in first-order logic in a natural way,
by lumping all the objects of the many-sorted domains together into one first-
order domain, using unary predicate symbols to keep track of the sorts, and
relativizing quantifiers. For example, the first-order language corresponding
to the example above would have unary predicate symbols “German” and
“French,” in addition to the other relations described, with the sort require-
ments erased. A sorted quantifier ∀ x φ, where x is a variable of the German
sort, translates to
∀ x (German( x ) ⊃ φ).
We need to add axioms that ensure that the sorts are separate—e.g., ∀ x ∼(German( x ) &
French( x ))—as well as axioms that guarantee that “drinks wine” only holds
of objects satisfying the predicate French( x ), etc. With these conventions and
axioms, it is not difficult to show that many-sorted sentences translate to first-
order sentences, and many-sorted derivations translate to first-order deriva-


tions. Also, many-sorted structures “translate” to corresponding first-order


structures and vice-versa, so we also have a completeness theorem for many-
sorted logic.
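To illustrate how this works (a worked example, not part of the original text), the two-sorted sentence about married couples above translates to the first-order sentence

∀ a (French( a) ⊃ ∀ x (German( x ) ⊃ (MarriedTo( a, x ) ⊃ (DrinksWine( a) ∨ ∼EatsWurst( x ))))),

which, together with the sort axioms, says about one-sorted structures what the original sentence says about two-sorted structures.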

11.3 Second-Order Logic


The language of second-order logic allows one to quantify not just over a do-
main of individuals, but over relations on that domain as well. Given a first-
order language L, for each k one adds variables R which range over k-ary
relations, and allows quantification over those variables. If R is a variable for
a k-ary relation, and t1 , . . . , tk are ordinary (first-order) terms, R(t1 , . . . , tk ) is
an atomic formula. Otherwise, the set of formulae is defined just as in the
case of first-order logic, with additional clauses for second-order quantifica-
tion. Note that we only have the identity predicate for first-order terms: if R
and S are relation variables of the same arity k, we can define R = S to be an
abbreviation for

∀ x1 . . . ∀ xk ( R( x1 , . . . , xk ) ≡ S( x1 , . . . , xk )).

The rules for second-order logic simply extend the quantifier rules to the
new second order variables. Here, however, one has to be a little bit careful
to explain how these variables interact with the predicate symbols of L, and
with formulae of L more generally. At the bare minimum, relation variables
count as terms, so one has inferences of the form

φ( R) ⊢ ∃ R φ( R)

But if L is the language of arithmetic with a constant relation symbol <, one
would also expect the following inference to be valid:

x < y ⊢ ∃ R R( x, y)

or for a given formula φ,

φ ( x1 , . . . , x k ) ⊢ ∃ R R ( x1 , . . . , x k )

More generally, we might want to allow inferences of the form

φ[λ⃗x. ψ(⃗x )/R] ⊢ ∃ R φ

where φ[λ⃗x. ψ(⃗x )/R] denotes the result of replacing every atomic formula of
the form R(t1 , . . . , tk ) in φ by ψ(t1 , . . . , tk ). This last rule is equivalent to having
a comprehension schema, i.e., an axiom of the form

∃ R ∀ x1 , . . . , xk ( φ( x1 , . . . , xk ) ≡ R( x1 , . . . , xk )),


one for each formula φ in the second-order language, in which R is not a free
variable. (Exercise: show that if R is allowed to occur in φ, this schema is
inconsistent!)
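As a simple worked instance of the schema (an illustration, not from the text): take k = 2 and let φ( x1 , x2 ) be the formula x1 = x2 . The corresponding comprehension axiom,

∃ R ∀ x1 ∀ x2 ( x1 = x2 ≡ R( x1 , x2 )),

asserts that the identity relation on the domain exists among the relations the second-order variables range over.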
When logicians refer to the “axioms of second-order logic” they usually
mean the minimal extension of first-order logic by second-order quantifier
rules together with the comprehension schema. But it is often interesting to
study weaker subsystems of these axioms and rules. For example, note that
in its full generality the axiom schema of comprehension is impredicative: it
allows one to assert the existence of a relation R( x1 , . . . , xk ) that is “defined”
by a formula with second-order quantifiers; and these quantifiers range over
the set of all such relations—a set which includes R itself! Around the turn of
the twentieth century, a common reaction to Russell’s paradox was to lay the
blame on such definitions, and to avoid them in developing the foundations
of mathematics. If one prohibits the use of second-order quantifiers in the
formula φ, one has a predicative form of comprehension, which is somewhat
weaker.
From the semantic point of view, one can think of a second-order structure
as consisting of a first-order structure for the language, coupled with a set of
relations on the domain over which the second-order quantifiers range (more
precisely, for each k there is a set of relations of arity k). Of course, if com-
prehension is included in the derivation system, then we have the added re-
quirement that there are enough relations in the “second-order part” to satisfy
the comprehension axioms—otherwise the derivation system is not sound!
One easy way to ensure that there are enough relations around is to take the
second-order part to consist of all the relations on the first-order part. Such
a structure is called full, and, in a sense, is really the “intended structure” for
the language. If we restrict our attention to full structures we have what is
known as the full second-order semantics. In that case, specifying a structure
boils down to specifying the first-order part, since the contents of the second-
order part follow from that implicitly.
To summarize, there is some ambiguity when talking about second-order
logic. In terms of the derivation system, one might have in mind either

1. A “minimal” second-order derivation system, together with some com-


prehension axioms.

2. The “standard” second-order derivation system, with full comprehen-


sion.

In terms of the semantics, one might be interested in either

1. The “weak” semantics, where a structure consists of a first-order part,


together with a second-order part big enough to satisfy the comprehen-
sion axioms.


2. The “standard” second-order semantics, in which one considers full struc-


tures only.

When logicians do not specify the derivation system or the semantics they
have in mind, they are usually referring to the second item on each list. The
advantage to using this semantics is that, as we will see, it gives us categorical
descriptions of many natural mathematical structures; at the same time, the
derivation system is quite strong, and sound for this semantics. The drawback
is that the derivation system is not complete for the semantics; in fact, no effec-
tively given derivation system is complete for the full second-order semantics.
On the other hand, we will see that the derivation system is complete for the
weakened semantics; this implies that if a sentence is not provable, then there
is some structure, not necessarily the full one, in which it is false.
The language of second-order logic is quite rich. One can identify unary
relations with subsets of the domain, and so in particular you can quantify
over these sets; for example, one can express induction for the natural num-
bers with a single axiom

∀ R (( R(0) & ∀ x ( R( x ) ⊃ R( x ′ ))) ⊃ ∀ x R( x )).

If one takes the language of arithmetic to have symbols 0, ′, +, × and <, one
can add the following axioms to describe their behavior:

1. ∀ x ∼ x ′ = 0

2. ∀ x ∀y ( x ′ = y′ ⊃ x = y)

3. ∀ x ( x + 0) = x

4. ∀ x ∀y ( x + y′ ) = ( x + y)′

5. ∀ x ( x × 0) = 0

6. ∀ x ∀y ( x × y′ ) = (( x × y) + x )

7. ∀ x ∀y ( x < y ≡ ∃z y = ( x + z′ ))

It is not difficult to show that these axioms, together with the axiom of induc-
tion above, provide a categorical description of the structure N, the standard
model of arithmetic, provided we are using the full second-order semantics.
Given any structure M in which these axioms are true, define a function f
from N to the domain of M using ordinary recursion on N, so that f (0) = 0M
and f ( x + 1) = ′M ( f ( x )). Using ordinary induction on N and the fact that ax-
ioms (1) and (2) hold in M, we see that f is injective. To see that f is surjective,
let P be the set of elements of |M| that are in the range of f . Since M is full, P is
in the second-order domain. By the construction of f , we know that 0M is in P,
and that P is closed under ′M . The fact that the induction axiom holds in M


(in particular, for P) guarantees that P is equal to the entire first-order domain
of M. This shows that f is a bijection. Showing that f is a homomorphism is
no more difficult, using ordinary induction on N repeatedly.
In set-theoretic terms, a function is just a special kind of relation; for ex-
ample, a unary function f can be identified with a binary relation R satisfying
∀ x ∃!y R( x, y). As a result, one can quantify over functions too. Using the full
semantics, one can then define the class of infinite structures to be the class of
structures M for which there is an injective function from the domain of M to
a proper subset of itself:

∃ f (∀ x ∀y ( f ( x ) = f (y) ⊃ x = y) & ∃y ∀ x f ( x ) ̸= y).

The negation of this sentence then defines the class of finite structures.
In addition, one can define the class of well-orderings, by adding the fol-
lowing to the definition of a linear ordering:

∀ P (∃ x P( x ) ⊃ ∃ x ( P( x ) & ∀y (y < x ⊃ ∼ P(y)))).

This asserts that every non-empty set has a least element, modulo the iden-
tification of “set” with “one-place relation”. For another example, one can
express the notion of connectedness for graphs, by saying that there is no non-
trivial separation of the vertices into disconnected parts:

∼∃ A (∃ x A( x ) & ∃y ∼ A(y) & ∀w ∀z (( A(w) & ∼ A(z)) ⊃ ∼ R(w, z))).

For yet another example, you might try as an exercise to define the class of
finite structures whose domain has even size. More strikingly, one can pro-
vide a categorical description of the real numbers as a complete ordered field
containing the rationals.
In short, second-order logic is much more expressive than first-order logic.
That’s the good news; now for the bad. We have already mentioned that there
is no effective derivation system that is complete for the full second-order
semantics. For better or for worse, many of the properties of first-order logic
are absent, including compactness and the Löwenheim–Skolem theorems.
On the other hand, if one is willing to give up the full second-order se-
mantics in favor of the weaker one, then the minimal second-order derivation
system is complete for this semantics. In other words, if we read ⊢ as “proves
in the minimal system” and ⊨ as “logically implies in the weaker semantics”,
we can show that whenever Γ ⊨ φ then Γ ⊢ φ. If one wants to include spe-
cific comprehension axioms in the derivation system, one has to restrict the
semantics to second-order structures that satisfy these axioms: for example, if
∆ consists of a set of comprehension axioms (possibly all of them), we have
that if Γ ∪ ∆ ⊨ φ, then Γ ∪ ∆ ⊢ φ. In particular, if φ is not provable using
the comprehension axioms we are considering, then there is a model of ∼ φ in
which these comprehension axioms nonetheless hold.


The easiest way to see that the completeness theorem holds for the weaker
semantics is to think of second-order logic as a many-sorted logic, as follows.
One sort is interpreted as the ordinary “first-order” domain, and then for each
k we have a domain of “relations of arity k.” We take the language to have
built-in relation symbols “truek ( R, x1 , . . . , xk ),” one for each k, meant to assert that
R holds of x1 , . . . , xk , where R is a variable of the sort “k-ary relation” and x1 ,
. . . , xk are objects of the first-order sort.
With this identification, the weak second-order semantics is essentially the
usual semantics for many-sorted logic; and we have already observed that
many-sorted logic can be embedded in first-order logic. Modulo the trans-
lations back and forth, then, the weaker conception of second-order logic is
really a form of first-order logic in disguise, where the domain contains both
“objects” and “relations” governed by the appropriate axioms.

11.4 Higher-Order Logic


Passing from first-order logic to second-order logic enabled us to talk about
sets of objects in the first-order domain, within the formal language. Why stop
there? For example, third-order logic should enable us to deal with sets of sets
of objects, or perhaps even sets which contain both objects and sets of objects.
And fourth-order logic will let us talk about sets of objects of that kind. As
you may have guessed, one can iterate this idea arbitrarily.
In practice, higher-order logic is often formulated in terms of functions
instead of relations. (Modulo the natural identifications, this difference is
inessential.) Given some basic “sorts” A, B, C, . . . (which we will now call
“types”), we can create new ones by stipulating

If σ and τ are finite types then so is σ → τ.

Think of types as syntactic “labels,” which classify the objects we want in our
domain; σ → τ describes those objects that are functions which take objects
of type σ to objects of type τ. For example, we might want to have a type Ω
of truth values, “true” and “false,” and a type N of natural numbers. In that
case, you can think of objects of type N → Ω as unary relations, or subsets
of N; objects of type N → N are functions from natural numbers to natu-
ral numbers; and objects of type (N → N) → N are “functionals,” that is,
higher-type functions that take functions to numbers.
As in the case of second-order logic, one can think of higher-order logic as
a kind of many-sorted logic, where there is a sort for each type of object we
want to consider. But it is usually clearer just to define the syntax of higher-
type logic from the ground up. For example, we can define a set of finite types
inductively, as follows:

1. N is a finite type.


2. If σ and τ are finite types, then so is σ → τ.

3. If σ and τ are finite types, so is σ × τ.

Intuitively, N denotes the type of the natural numbers, σ → τ denotes the


type of functions from σ to τ, and σ × τ denotes the type of pairs of objects,
one from σ and one from τ. We can then define a set of terms inductively, as
follows:

1. For each type σ, there is a stock of variables x, y, z, . . . of type σ

2. 0 is a term of type N

3. S (successor) is a term of type N → N

4. If s is a term of type σ, and t is a term of type N → (σ → σ ), then Rst is


a term of type N → σ

5. If s is a term of type τ → σ and t is a term of type τ, then s(t) is a term


of type σ

6. If s is a term of type σ and x is a variable of type τ, then λx. s is a term of


type τ → σ.

7. If s is a term of type σ and t is a term of type τ, then ⟨s, t⟩ is a term of


type σ × τ.

8. If s is a term of type σ × τ then p1 (s) is a term of type σ and p2 (s) is a


term of type τ.

Intuitively, Rst denotes the function defined recursively by

Rst (0) = s
Rst ( x + 1) = t( x, Rst ( x )),

⟨s, t⟩ denotes the pair whose first component is s and whose second compo-
nent is t, and p1 (s) and p2 (s) denote the first and second elements (“projec-
tions”) of s. Finally, λx. s denotes the function f defined by

f (x) = s

for any x of type σ; so item (6) gives us a form of comprehension, enabling us


to define functions using terms. Formulae are built up from identity predicate
statements s = t between terms of the same type, the usual propositional
connectives, and higher-type quantification. One can then take the axioms
of the system to be the basic equations governing the terms defined above,
together with the usual rules of logic with quantifiers and identity predicate.
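As a small worked example (an illustration, not from the text): addition on N can be defined from R, S, and λ. Let t be the term λn. λ p. S( p), of type N → (N → N). For a variable x of type N, the term Rxt then has type N → N, and the defining equations give Rxt (0) = x and Rxt (y + 1) = S( Rxt (y)), so Rxt (y) computes x + y. The term λx. Rxt, of type N → (N → N), is thus the addition function.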
If one augments the finite type system with a type Ω of truth values, one
has to include axioms which govern its use as well. In fact, if one is clever, one


can get rid of complex formulae entirely, replacing them with terms of type Ω!
The proof system can then be modified accordingly. The result is essentially
the simple theory of types set forth by Alonzo Church in the 1930s.
As in the case of second-order logic, there are different versions of higher-
type semantics that one might want to use. In the full version, variables of
type σ → τ range over the set of all functions from the objects of type σ to
objects of type τ. As you might expect, this semantics is too strong to ad-
mit a complete, effective derivation system. But one can consider a weaker
semantics, in which a structure consists of sets of elements Tτ for each type
τ, together with appropriate operations for application, projection, etc. If the
details are carried out correctly, one can obtain completeness theorems for the
kinds of derivation systems described above.
Higher-type logic is attractive because it provides a framework in which
we can embed a good deal of mathematics in a natural way: starting with N,
one can define real numbers, continuous functions, and so on. It is also partic-
ularly attractive in the context of intuitionistic logic, since the types have clear
“constructive” interpretations. In fact, one can develop constructive versions
of higher-type semantics (based on intuitionistic, rather than classical logic)
that clarify these constructive interpretations quite nicely, and are, in many
ways, more interesting than the classical counterparts.

11.5 Intuitionistic Logic


In contrast to second-order and higher-order logic, intuitionistic first-order
logic represents a restriction of the classical version, intended to model a more
“constructive” kind of reasoning. The following examples may serve to illus-
trate some of the underlying motivations.
Suppose someone came up to you one day and announced that they had
determined a natural number x, with the property that if x is prime, the Rie-
mann hypothesis is true, and if x is composite, the Riemann hypothesis is
false. Great news! Whether the Riemann hypothesis is true or not is one of
the big open questions of mathematics, and here they seem to have reduced
the problem to one of calculation, that is, to the determination of whether a
specific number is prime or not.
What is the magic value of x? They describe it as follows: x is the natural
number that is equal to 7 if the Riemann hypothesis is true, and 9 otherwise.
Angrily, you demand your money back. From a classical point of view, the
description above does in fact determine a unique value of x; but what you
really want is a value of x that is given explicitly.
To take another, perhaps less contrived example, consider the following
question. We know that it is possible to raise an irrational number to a rational
power, and get a rational result. For example, (√2)^2 = 2. What is less clear
is whether or not it is possible to raise an irrational number to an irrational


power, and get a rational result. The following theorem answers this in the
affirmative:

Theorem 11.1. There are irrational numbers a and b such that ab is rational.
Proof. Consider √2^√2. If this is rational, we are done: we can let a = b = √2.
Otherwise, it is irrational. Then we have

(√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2,

which is certainly rational. So, in this case, let a be √2^√2, and let b be √2.

Does this constitute a valid proof? Most mathematicians feel that it does.
But again, there is something a little bit unsatisfying here: we have proved the
existence of a pair of real numbers with a certain property, without being able
to say which pair of numbers it is. It is possible to prove the same result, but in
such a way that the pair a, b is given in the proof: take a = √3 and b = log₃ 4.
Then

a^b = √3^(log₃ 4) = 3^((1/2)·log₃ 4) = (3^(log₃ 4))^(1/2) = 4^(1/2) = 2,

since 3^(log₃ x) = x.
Intuitionistic logic is designed to model a kind of reasoning where moves
like the one in the first proof are disallowed. Proving the existence of an x
satisfying φ( x ) means that you have to give a specific x, and a proof that it
satisfies φ, like in the second proof. Proving that φ or ψ holds requires that
you can prove one or the other.
Formally speaking, intuitionistic first-order logic is what you get if you
restrict a derivation system for first-order logic in a certain way. Similarly,
there are intuitionistic versions of second-order or higher-order logic. From
the mathematical point of view, these are just formal deductive systems, but,
as already noted, they are intended to model a kind of mathematical reason-
ing. One can take this to be the kind of reasoning that is justified on a cer-
tain philosophical view of mathematics (such as Brouwer’s intuitionism); one
can take it to be a kind of mathematical reasoning which is more “concrete”
and satisfying (along the lines of Bishop’s constructivism); and one can argue
about whether or not the formal description captures the informal motiva-
tion. But whatever philosophical positions we may hold, we can study intu-
itionistic logic as a formally presented logic; and for whatever reasons, many
mathematical logicians find it interesting to do so.
There is an informal constructive interpretation of the intuitionist connec-
tives, usually known as the BHK interpretation (named after Brouwer, Heyt-
ing, and Kolmogorov). It runs as follows: a proof of φ & ψ consists of a proof
of φ paired with a proof of ψ; a proof of φ ∨ ψ consists of either a proof of φ,
or a proof of ψ, where we have explicit information as to which is the case;


a proof of φ ⊃ ψ consists of a procedure, which transforms a proof of φ to a


proof of ψ; a proof of ∀ x φ( x ) consists of a procedure which returns a proof
of φ( x ) for any value of x; and a proof of ∃ x φ( x ) consists of a value of x,
together with a proof that this value satisfies φ. One can describe the interpre-
tation in computational terms known as the “Curry–Howard isomorphism”
or the “formulae-as-types paradigm”: think of a formula as specifying a cer-
tain kind of data type, and proofs as computational objects of these data types
that enable us to see that the corresponding formula is true.
Intuitionistic logic is often thought of as being classical logic “minus” the
law of the excluded middle. The following theorem makes this more precise.

Theorem 11.2. Intuitionistically, the following axiom schemata are equivalent:

1. (∼ φ ⊃ ⊥) ⊃ φ.

2. φ ∨ ∼ φ

3. ∼∼ φ ⊃ φ

Obtaining instances of one schema from either of the others is a good exercise
in intuitionistic logic.
The first deductive systems for intuitionistic propositional logic, put forth
as formalizations of Brouwer’s intuitionism, are due, independently, to Kol-
mogorov, Glivenko, and Heyting. The first formalization of intuitionistic first-
order logic (and parts of intuitionist mathematics) is due to Heyting. Though
a number of classically valid schemata are not intuitionistically valid, many
are.
The double-negation translation describes an important relationship between
classical and intuitionist logic. It is defined inductively as follows (think of φ N
as the “intuitionist” translation of the classical formula φ):

φ N ≡ ∼∼ φ for atomic formulae φ
( φ & ψ) N ≡ ( φ N & ψ N )
( φ ∨ ψ) N ≡ ∼∼( φ N ∨ ψ N )
( φ ⊃ ψ) N ≡ ( φ N ⊃ ψ N )
(∀ x φ) N ≡ ∀ x φ N
(∃ x φ) N ≡ ∼∼∃ x φ N
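For example (a small worked instance of these clauses): for atomic p and q, ( p & q) N is (∼∼ p & ∼∼q), and ( p ∨ q) N is ∼∼(∼∼ p ∨ ∼∼q). Classically each of these is equivalent to the formula it translates, as item (1) of Theorem 11.3 below states.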

Kolmogorov and Glivenko had versions of this translation for propositional


logic; for predicate logic, it is due to Gödel and Gentzen, independently. We
have

Theorem 11.3. 1. φ ≡ φ N is provable classically


2. If φ is provable classically, then φ N is provable intuitionistically.

We can now envision the following dialogue. Classical mathematician:


“I’ve proved φ!” Intuitionist mathematician: “Your proof isn’t valid. What
you’ve really proved is φ N .” Classical mathematician: “Fine by me!” As far as
the classical mathematician is concerned, the intuitionist is just splitting hairs,
since the two are equivalent. But the intuitionist insists there is a difference.
Note that the above translation concerns pure logic only; it does not ad-
dress the question as to what the appropriate nonlogical axioms are for classi-
cal and intuitionistic mathematics, or what the relationship is between them.
But the following slight extension of the theorem above provides some useful
information:

Theorem 11.4. If Γ proves φ classically, Γ N proves φ N intuitionistically.

In other words, if φ is provable from some hypotheses classically, then φ N


is provable from their double-negation translations.
To show that a sentence or propositional formula is intuitionistically valid,
all you have to do is provide a proof. But how can you show that it is not
valid? For that purpose, we need a semantics that is sound, and preferably
complete. A semantics due to Kripke nicely fits the bill.
We can play the same game we did for classical logic: define the semantics,
and prove soundness and completeness. It is worthwhile, however, to note
the following distinction. In the case of classical logic, the semantics was the
“obvious” one, in a sense implicit in the meaning of the connectives. Though
one can provide some intuitive motivation for Kripke semantics, the latter
does not offer the same feeling of inevitability. In addition, the notion of a
classical structure is a natural mathematical one, so we can either take the
notion of a structure to be a tool for studying classical first-order logic, or take
classical first-order logic to be a tool for studying mathematical structures.
In contrast, Kripke structures can only be viewed as a logical construct; they
don’t seem to have independent mathematical interest.
A Kripke structure M = ⟨W, R, V ⟩ for a propositional language consists
of a set W, a partial order R on W with a least element, and a “monotone” as-
signment of propositional variables to the elements of W. The intuition is that
the elements of W represent “worlds,” or “states of knowledge”; an element
v ≥ u represents a “possible future state” of u; and the propositional variables
assigned to u are the propositions that are known to be true in state u. The
forcing relation M, w ⊩ φ then extends this relationship to arbitrary formulae
in the language; read M, w ⊩ φ as “φ is true in state w.” The relationship is
defined inductively, as follows:

1. M, w ⊩ pi iff pi is one of the propositional variables assigned to w.

2. M, w ⊮ ⊥.


3. M, w ⊩ ( φ & ψ) iff M, w ⊩ φ and M, w ⊩ ψ.

4. M, w ⊩ ( φ ∨ ψ) iff M, w ⊩ φ or M, w ⊩ ψ.

5. M, w ⊩ ( φ ⊃ ψ) iff, whenever w′ ≥ w and M, w′ ⊩ φ, then M, w′ ⊩ ψ.

It is a good exercise to try to show that ∼( p & q) ⊃ (∼ p ∨ ∼q) is not intuition-


istically valid, by cooking up a Kripke structure that provides a counterexam-
ple.
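For comparison, here is a small worked example of this kind of argument (not the exercise itself), treating ∼ φ as φ ⊃ ⊥: let W = {u, v} with u ≤ v, and assign p to v but not to u. Then M, u ⊮ p. But also M, u ⊮ ∼ p: since v ≥ u, M, v ⊩ p, and M, v ⊮ ⊥, the clause for ⊃ fails at u. So M, u ⊮ p ∨ ∼ p, and the law of the excluded middle is not intuitionistically valid.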

11.6 Modal Logics


Consider the following example of a conditional sentence:

If Jeremy is alone in that room, then he is drunk and naked and


dancing on the chairs.

This is an example of a conditional assertion that may be materially true but


nonetheless misleading, since it seems to suggest that there is a stronger link
between the antecedent and the consequent than simply that either the an-
tecedent is false or the consequent true.
claim is not only true in this particular world (where it may be trivially true,
because Jeremy is not alone in the room), but that, moreover, the conclusion
would have been true had the antecedent been true. In other words, one can
take the assertion to mean that the claim is true not just in this world, but in
any “possible” world; or that it is necessarily true, as opposed to just true in
this particular world.
Modal logic was designed to make sense of this kind of necessity. One ob-
tains modal propositional logic from ordinary propositional logic by adding a
box operator: if φ is a formula, so is □φ. Intuitively, □φ asserts
that φ is necessarily true, or true in any possible world. ♢φ is usually taken to
be an abbreviation for ∼□∼ φ, and can be read as asserting that φ is possibly
true. Of course, modality can be added to predicate logic as well.
Kripke structures can be used to provide a semantics for modal logic; in
fact, Kripke first designed this semantics with modal logic in mind. Rather
than restricting to partial orders, more generally one has a set of “possible
worlds,” P, and a binary “accessibility” relation R( x, y) between worlds. In-
tuitively, R( p, q) asserts that the world q is compatible with p; i.e., if we are
“in” world p, we have to entertain the possibility that the world could have
been like q.
Modal logic is sometimes called an “intensional” logic, as opposed to an
“extensional” one. The intended semantics for an extensional logic, like clas-
sical logic, will only refer to a single world, the “actual” one; while the seman-
tics for an “intensional” logic relies on a more elaborate ontology. In addition
to capturing necessity, one can use modality to capture other linguistic


constructions, reinterpreting □ and ♢ according to the application. For exam-


ple:

1. In provability logic, □φ is read “φ is provable” and ♢φ is read “φ is


consistent.”

2. In epistemic logic, one might read □φ as “I know φ” or “I believe φ.”

3. In temporal logic, one can read □φ as “φ is always true” and ♢φ as “φ is


sometimes true.”

One would like to augment logic with rules and axioms dealing with modal-
ity. For example, the system S4 consists of the ordinary axioms and rules of
propositional logic, together with the following axioms:

□( φ ⊃ ψ) ⊃ (□φ ⊃ □ψ)
□φ ⊃ φ
□φ ⊃ □□φ

as well as a rule, “from φ conclude □φ.” S5 adds the following axiom:

♢φ ⊃ □♢φ

Variations of these axioms may be suitable for different applications; for ex-
ample, S5 is usually taken to characterize the notion of logical necessity. And
the nice thing is that one can usually find a semantics for which the derivation
system is sound and complete by restricting the accessibility relation in the
Kripke structures in natural ways. For example, S4 corresponds to the class
of Kripke structures in which the accessibility relation is reflexive and transi-
tive. S5 corresponds to the class of Kripke structures in which the accessibility
relation is universal, which is to say that every world is accessible from every
other; so □φ holds if and only if φ holds in every world.
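As a quick worked check of one direction of this correspondence (an illustration, reading □φ as true at a world w iff φ is true at every world accessible from w): suppose the accessibility relation is transitive and □φ holds at w. If v is accessible from w and u is accessible from v, then by transitivity u is accessible from w, so φ holds at u. Hence □φ holds at every v accessible from w, i.e., □□φ holds at w. So transitive Kripke structures validate the S4 axiom □φ ⊃ □□φ.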

11.7 Other Logics


As you may have gathered by now, it is not hard to design a new logic. You
too can create your own syntax, make up a deductive system, and fashion
a semantics to go with it. You might have to be a bit clever if you want the
derivation system to be complete for the semantics, and it might take some
effort to convince the world at large that your logic is truly interesting. But, in
return, you can enjoy hours of good, clean fun, exploring your logic’s mathe-
matical and computational properties.
Recent decades have witnessed a veritable explosion of formal logics. Fuzzy
logic is designed to model reasoning about vague properties. Probabilistic
logic is designed to model reasoning about uncertainty. Default logics and

160
11.7. Other Logics

nonmonotonic logics are designed to model defeasible forms of reasoning,


which is to say, “reasonable” inferences that can later be overturned in the face
of new information. There are epistemic logics, designed to model reasoning
about knowledge; causal logics, designed to model reasoning about causal re-
lationships; and even “deontic” logics, which are designed to model reason-
ing about moral and ethical obligations. Depending on whether the primary
motivation for introducing these systems is philosophical, mathematical, or
computational, you may find such creatures studied under the rubric of math-
ematical logic, philosophical logic, artificial intelligence, cognitive science, or
elsewhere.
The list goes on and on, and the possibilities seem endless. We may never
attain Leibniz’ dream of reducing all of human reason to calculation—but that
can’t stop us from trying.

Part III

Turing Machines

Chapter 12

Turing Machine Computations

12.1 Introduction
What does it mean for a function, say, from N to N to be computable? Among
the first answers, and the most well known one, is that a function is com-
putable if it can be computed by a Turing machine. This notion was set out
by Alan Turing in 1936. Turing machines are an example of a model of compu-
tation—they are a mathematically precise way of defining the idea of a “com-
putational procedure.” What exactly that means is debated, but it is widely
agreed that Turing machines are one way of specifying computational proce-
dures. Even though the term “Turing machine” evokes the image of a physi-
cal machine with moving parts, strictly speaking a Turing machine is a purely
mathematical construct, and as such it idealizes the idea of a computational
procedure. For instance, we place no restriction on either the time or memory
requirements of a Turing machine: Turing machines can compute something
even if the computation would require more storage space or more steps than
there are atoms in the universe.
It is perhaps best to think of a Turing machine as a program for a spe-
cial kind of imaginary mechanism. This mechanism consists of a tape and a
read-write head. In our version of Turing machines, the tape is infinite in one
direction (to the right), and it is divided into squares, each of which may con-
tain a symbol from a finite alphabet. Such alphabets can contain any number of
different symbols, but we will mainly make do with three: ▷, 0, and 1. When
the mechanism is started, the tape is empty (i.e., each square contains the sym-
bol 0) except for the leftmost square, which contains ▷, and a finite number of
squares which contain the input. At any time, the mechanism is in one of a
finite number of states. At the outset, the head scans the leftmost square and
is in a specified initial state. At each step of the mechanism's run, the content
of the square currently scanned together with the state the mechanism is in
and the Turing machine program determine what happens next. The Turing
machine program is given by a partial function which takes as input a state q


Figure 12.1: A Turing machine executing its program.

and a symbol σ and outputs a triple ⟨q′ , σ′ , D ⟩. Whenever the mechanism is in


state q and reads symbol σ, it replaces the symbol on the current square with
σ′ , the head moves left, right, or stays put according to whether D is L, R, or
N, and the mechanism goes into state q′ .
For instance, consider the situation in Figure 12.1. The visible part of the
tape of the Turing machine contains the end-of-tape symbol ▷ on the leftmost
square, followed by three 1’s, a 0, and four more 1’s. The head is reading
the third square from the left, which contains a 1, and is in state q1 —we say
“the machine is reading a 1 in state q1 .” If the program of the Turing machine
returns, for input ⟨q1 , 1⟩, the triple ⟨q2 , 0, N ⟩, the machine would now replace
the 1 on the third square with a 0, leave the read/write head where it is, and
switch to state q2 . If then the program returns ⟨q3 , 0, R⟩ for input ⟨q2 , 0⟩, the
machine would now overwrite the 0 with another 0 (effectively, leaving the
content of the tape under the read/write head unchanged), move one square
to the right, and enter state q3 . And so on.
We say that the machine halts when it encounters some state, qn , and sym-
bol, σ such that there is no instruction for ⟨qn , σ ⟩, i.e., the transition function
for input ⟨qn , σ ⟩ is undefined. In other words, the machine has no instruction
to carry out, and at that point, it ceases operation. Halting is sometimes repre-
sented by a specific halt state h. This will be demonstrated in more detail later
on.
The beauty of Turing’s paper, “On computable numbers,” is that he presents
not only a formal definition, but also an argument that the definition captures
the intuitive notion of computability. From the definition, it should be clear
that any function computable by a Turing machine is computable in the in-
tuitive sense. Turing offers three types of argument that the converse is true,
i.e., that any function that we would naturally regard as computable is com-
putable by such a machine. They are (in Turing’s words):

1. A direct appeal to intuition.

2. A proof of the equivalence of two definitions (in case the new definition
has a greater intuitive appeal).


3. Giving examples of large classes of numbers which are computable.

Our goal is to try to define the notion of computability “in principle,” i.e.,
without taking into account practical limitations of time and space. Of course,
with the broadest definition of computability in place, one can then go on
to consider computation with bounded resources; this forms the heart of the
subject known as “computational complexity.”

Historical Remarks Alan Turing invented Turing machines in 1936. While


his interest at the time was the decidability of first-order logic, the paper has
been described as a definitive paper on the foundations of computer design.
In the paper, Turing focuses on computable real numbers, i.e., real numbers
whose decimal expansions are computable; but he notes that it is not hard to
adapt his notions to computable functions on the natural numbers, and so on.
Notice that this was a full five years before the first working general purpose
computer was built in 1941 (by the German Konrad Zuse in his parents' living
room), seven years before Turing and his colleagues at Bletchley Park built the
code-breaking Colossus (1943), nine years before the American ENIAC (1945),
twelve years before the first British general purpose computer—the Manch-
ester Small-Scale Experimental Machine—was built in Manchester (1948), and
thirteen years before the Americans first tested the BINAC (1949). The Manch-
ester SSEM has the distinction of being the first stored-program computer—
previous machines had to be rewired by hand for each new task.

12.2 Representing Turing Machines


Turing machines can be represented visually by state diagrams. The diagrams
are composed of state cells connected by arrows. Unsurprisingly, each state
cell represents a state of the machine. Each arrow represents an instruction
that can be carried out from that state, with the specifics of the instruction
written above or below the appropriate arrow. Consider the following ma-
chine, which has only two internal states, q0 and q1 , and one instruction:

[State diagram: start state q0 , with a single arrow from q0 to q1 labeled “0, 1, R”.]

Recall that the Turing machine has a read/write head and a tape with the
input written on it. The instruction can be read as if reading a 0 in state q0 , write
a 1, move right, and move to state q1 . This is equivalent to the transition function
mapping ⟨q0 , 0⟩ to ⟨q1 , 1, R⟩.

Example 12.1. Even Machine: The following Turing machine halts if, and only
if, there are an even number of 1’s on the tape (under the assumption that all


1’s come before the first 0 on the tape).

[State diagram: start state q0 ; an arrow from q0 to q1 labeled “1, 1, R”; an arrow from q1 back to q0 labeled “1, 1, R”; and a loop at q1 labeled “0, 0, R”.]

The state diagram corresponds to the following transition function:


δ(q0 , 1) = ⟨q1 , 1, R⟩,
δ(q1 , 1) = ⟨q0 , 1, R⟩,
δ(q1 , 0) = ⟨q1 , 0, R⟩

The above machine halts only when the input is an even number of strokes.
Otherwise, the machine (theoretically) continues to operate indefinitely. For
any machine and input, it is possible to trace through the configurations of the
machine in order to determine the output. We will give a formal definition
of configurations later. For now, we can intuitively think of configurations
as a series of diagrams showing the state of the machine at any point in time
during operation. Configurations show the content of the tape, the state of the
machine and the location of the read/write head.
Let us trace through the configurations of the even machine if it is started
with an input of four 1’s. In this case, we expect that the machine will halt.
We will then run the machine on an input of three 1’s, where the machine will
run forever.
The machine starts in state q0 , scanning the leftmost 1. We can represent
the initial state of the machine as follows:
▷1₀1110 . . .
The above configuration is straightforward. As can be seen, the machine starts
in state q0 , scanning the leftmost 1. This is represented by a subscript of the
state name on the first 1. The applicable instruction at this point is δ(q0 , 1) =
⟨q1 , 1, R⟩, and so the machine moves right on the tape and changes to state q1 .
▷11₁110 . . .
Since the machine is now in state q1 scanning a 1, we have to “follow” the
instruction δ(q1 , 1) = ⟨q0 , 1, R⟩. This results in the configuration
▷111₀10 . . .
As the machine continues, the rules are applied again in the same order, re-
sulting in the following two configurations:
▷1111₁0 . . .


▷11110₀ . . .
The machine is now in state q0 scanning a 0. Based on the transition diagram,
we can easily see that there is no instruction to be carried out, and thus the
machine has halted. This means that the input has been accepted.
Suppose next we start the machine with an input of three 1’s. The first few
configurations are similar, as the same instructions are carried out, with only
a small difference of the tape input:

▷1₀110 . . .

▷11₁10 . . .
▷111₀0 . . .
▷1110₁ . . .
The machine has now traversed past all the 1’s, and is reading a 0 in state q1 .
As shown in the diagram, there is an instruction of the form δ(q1 , 0) = ⟨q1 , 0, R⟩.
Since the tape is filled with 0 indefinitely to the right, the machine will con-
tinue to execute this instruction forever, staying in state q1 and moving ever
further to the right. The machine will never halt, and does not accept the
input.
It is important to note that not all machines will halt. If halting means that
the machine runs out of instructions to execute, then we can create a machine
that never halts simply by ensuring that there is an outgoing arrow for each
symbol at each state. The even machine can be modified to run indefinitely
by adding an instruction for scanning a 0 at q0 .
Example 12.2.

[State diagram: start state q0 ; arrows labeled “1, 1, R” from q0 to q1 and from q1 back to q0 ; and loops labeled “0, 0, R” at both q0 and q1 .]

Machine tables are another way of representing Turing machines. Machine


tables have the tape alphabet displayed on the x-axis, and the set of machine
states across the y-axis. Inside the table, at the intersection of each state and
symbol, is written the rest of the instruction—the new state, new symbol, and
direction of movement. Machine tables make it easy to determine in what
state, and for what symbol, the machine halts. Wherever there is a gap in the
table, there is a possible point for the machine to halt. Unlike state diagrams and
instruction sets, where the points at which the machine halts are not always
immediately obvious, any halting points are quickly identified by finding the
gaps in the machine table.


[Figure 12.2: A doubler machine (state diagram with states q0 , . . . , q5 ).]

Example 12.3. The machine table for the even machine is:

         0            1            ▷
q0                    1, q1 , R
q1       0, q1 , R    1, q0 , R

As we can see, the machine halts when scanning a 0 in state q0 .

So far we have only considered machines that read and accept input. How-
ever, Turing machines have the capacity to both read and write. An example
of such a machine (although there are many, many examples) is a doubler. A
doubler, when started with a block of n 1’s on the tape, outputs a block of 2n
1’s.

Example 12.4. Before building a doubler machine, it is important to come up


with a strategy for solving the problem. Since the machine (as we have formu-
lated it) cannot remember how many 1’s it has read, we need to come up with
a way to keep track of all the 1’s on the tape. One such way is to separate the
output from the input with a 0. The machine can then erase the first 1 from
the input, traverse over the rest of the input, leave a 0, and write two new 1’s.
The machine will then go back and find the second 1 in the input, and double
that one as well. For each 1 of input, it will write two 1’s of output. By
erasing the input as the machine goes, we can guarantee that no 1 is missed
or doubled twice. When the entire input is erased, there will be 2n 1’s left
on the tape. The state diagram of the resulting Turing machine is depicted in
Figure 12.2.

12.3 Turing Machines


The formal definition of what constitutes a Turing machine looks abstract,
but is actually simple: it merely packs into one mathematical structure all
the information needed to specify the workings of a Turing machine. This
includes (1) which states the machine can be in, (2) which symbols are allowed
to be on the tape, (3) which state the machine should start in, and (4) what the
instruction set of the machine is.

Definition 12.5 (Turing machine). A Turing machine M is a tuple ⟨ Q, Σ, q0 , δ⟩


consisting of

1. a finite set of states Q,

2. a finite alphabet Σ which includes ▷ and 0,

3. an initial state q0 ∈ Q,

4. a finite instruction set δ : Q × Σ ⇀ Q × Σ × { L, R, N }.
The partial function δ is also called the transition function of M.

We assume that the tape is infinite in one direction only. For this reason
it is useful to designate a special symbol ▷ as a marker for the left end of the
tape. This makes it easier for Turing machine programs to tell when they’re
“in danger” of running off the tape. We could assume that this symbol is never
overwritten, i.e., that δ(q, ▷) = ⟨q′ , ▷, x ⟩ if δ(q, ▷) is defined. Some textbooks do this; we do not. You can simply be careful when constructing your Turing
machine that it never overwrites ▷. Moreover, there are cases where allowing
such overwriting provides some convenient flexibility.

Example 12.6. Even Machine: The even machine is formally the quadruple
⟨ Q, Σ, q0 , δ⟩ where

Q = { q0 , q1 }
Σ = {▷, 0, 1},
δ(q0 , 1) = ⟨q1 , 1, R⟩,
δ(q1 , 1) = ⟨q0 , 1, R⟩,
δ(q1 , 0) = ⟨q1 , 0, R⟩.

12.4 Configurations and Computations


Recall tracing through the configurations of the even machine earlier. The
imaginary mechanism consisting of tape, read/write head, and Turing ma-
chine program is really just an intuitive way of visualizing what a Turing ma-
chine computation is. Formally, we can define the computation of a Turing

machine on a given input as a sequence of configurations—and a configuration


in turn is a sequence of symbols (corresponding to the contents of the tape
at a given point in the computation), a number indicating the position of the
read/write head, and a state. Using these, we can define what the Turing
machine M computes on a given input.

Definition 12.7 (Configuration). A configuration of Turing machine M = ⟨ Q, Σ, q0 , δ⟩


is a triple ⟨C, m, q⟩ where

1. C ∈ Σ∗ is a finite sequence of symbols from Σ,

2. m ∈ N is a number < len(C ), and

3. q ∈ Q

Intuitively, the sequence C is the content of the tape (symbols of all squares
from the leftmost square to the last non-blank or previously visited square),
m is the number of the square the read/write head is scanning (beginning
with 0 being the number of the leftmost square), and q is the current state of
the machine.

The potential input for a Turing machine is a sequence of symbols, usually


a sequence that encodes a number in some form. The initial configuration of
the Turing machine is that configuration in which we start the Turing machine
to work on that input: the tape contains the tape end marker immediately
followed by the input written on the squares to the right, the read/write head
is scanning the leftmost square of the input (i.e., the square to the right of the
left end marker), and the mechanism is in the designated start state q0 .

Definition 12.8 (Initial configuration). The initial configuration of M for input


I ∈ Σ∗ is
⟨▷ ⌢ I, 1, q0 ⟩.

The ⌢ symbol is for concatenation—the input string begins immediately to the right of the left end marker ▷.

Definition 12.9. We say that a configuration ⟨C, m, q⟩ yields the configuration


⟨C ′ , m′ , q′ ⟩ in one step (according to M), iff
1. the m-th symbol of C is σ,

2. the instruction set of M specifies δ(q, σ ) = ⟨q′ , σ′ , D ⟩,

3. the m-th symbol of C ′ is σ′ , and

4. a) D = L and m′ = m − 1 if m > 0, otherwise m′ = 0, or


b) D = R and m′ = m + 1, or
c) D = N and m′ = m,

5. if m′ = len(C ), then len(C ′ ) = len(C ) + 1 and the m′ -th symbol of C ′


is 0. Otherwise len(C ′ ) = len(C ).

6. for all i such that i < len(C ) and i ̸= m, C ′ (i ) = C (i ).

Definition 12.10. A run of M on input I is a sequence Ci of configurations of


M, where C0 is the initial configuration of M for input I, and each Ci yields
Ci+1 in one step.
We say that M halts on input I after k steps if Ck = ⟨C, m, q⟩, the mth symbol
of C is σ, and δ(q, σ ) is undefined. In that case, the output of M for input I is O,
where O is a string of symbols not ending in 0 such that C = ▷ ⌢ O ⌢ 0 j for
some j ∈ N. (0 j is a sequence of j 0’s.)

According to this definition, the output O of M always ends in a symbol


other than 0, or, if at time k the entire tape is filled with 0 (except for the
leftmost ▷), O is the empty string.
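Before moving on, here is a minimal sketch in Python of these definitions (the function names and the representation are ours): a configuration is a tape (a list of symbols), a head position, and a state; one step follows Definition 12.9, and the run stops when δ is undefined. On an odd input, the run below would of course never return, just as in the trace of the even machine above.

    def step(delta, config):
        # One step according to Definition 12.9; None signals that delta is
        # undefined for the current state and scanned symbol, i.e., a halt.
        tape, head, state = config
        key = (state, tape[head])
        if key not in delta:
            return None
        new_state, new_symbol, direction = delta[key]
        tape = list(tape)
        tape[head] = new_symbol
        if direction == "R":
            head += 1
        elif direction == "L":
            head = max(head - 1, 0)      # the head cannot move left of square 0
        if head == len(tape):            # a newly visited square contains a blank 0
            tape.append("0")
        return (tape, head, new_state)

    def run(delta, input_string, start_state="q0"):
        # The initial configuration of Definition 12.8: ▷ followed by the input,
        # head on square 1, machine in the start state.
        config = (["▷"] + list(input_string), 1, start_state)
        while True:
            following = step(delta, config)
            if following is None:
                return config            # the machine has halted
            config = following

    even_machine = {
        ("q0", "1"): ("q1", "1", "R"),
        ("q1", "1"): ("q0", "1", "R"),
        ("q1", "0"): ("q1", "0", "R"),
    }
    tape, head, state = run(even_machine, "1111")
    print("".join(tape), head, state)    # ▷11110 5 q0  (halts: four 1's is even)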

12.5 Unary Representation of Numbers


Turing machines work on sequences of symbols written on their tape. De-
pending on the alphabet a Turing machine uses, these sequences of symbols
can represent various inputs and outputs. Of particular interest, of course, are
Turing machines which compute arithmetical functions, i.e., functions of natu-
ral numbers. A simple way to represent positive integers is by coding them
as sequences of a single symbol 1. If n ∈ N, let 1n be the empty sequence if
n = 0, and otherwise the sequence consisting of exactly n 1’s.

Definition 12.11 (Computation). A Turing machine M computes the function


f : Nk → N iff M halts on input

1n1 0 1n2 0 . . . 0 1nk

(i.e., a block of n1 1’s, followed by a 0, followed by a block of n2 1’s, and so on, up to a block of nk 1’s) with output 1 f (n1 ,...,nk ) , i.e., a block of f (n1 , . . . , nk ) 1’s.

Example 12.12. Addition: Let’s build a machine that computes the function
f (n, m) = n + m. This requires a machine that starts with two blocks of 1’s of
length n and m on the tape, and halts with one block consisting of n + m 1’s.
The two input blocks of 1’s are separated by a 0, so one method would be to
write a stroke on the square containing the 0, and erase the last 1.
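As a sketch of this method (the state names and the exact table below are our reconstruction, following the strategy just described and the diagram in Figure 12.3), the addition machine can be written out and stepped through in a few lines of Python:

    adder = {                              # values are instructions (new state, new symbol, direction)
        ("q0", "1"): ("q0", "1", "R"),     # move right across the first block of 1's
        ("q0", "0"): ("q1", "1", "N"),     # overwrite the separating 0 with a 1
        ("q1", "1"): ("q1", "1", "R"),     # move right across the second block
        ("q1", "0"): ("q2", "0", "L"),     # reached the end; step back onto the last 1
        ("q2", "1"): ("q2", "0", "N"),     # erase it; (q2, 0) is undefined, so the machine halts
    }

    tape, head, state = ["▷"] + list("110111"), 1, "q0"   # input: two 1's, a 0, three 1's
    while (state, tape[head]) in adder:
        state, symbol, move = adder[(state, tape[head])]
        tape[head] = symbol
        head = head + 1 if move == "R" else max(head - 1, 0) if move == "L" else head
        if head == len(tape):
            tape.append("0")
    print("".join(tape))                   # ▷1111100: a block of 2 + 3 = 5 ones, then blanks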

In Example 12.4, we gave an example of a Turing machine that takes as


input a sequence of 1’s and halts with a sequence of twice as many 1’s on
the tape—the doubler machine. However, because the output contains 0’s to
the left of the doubled block of 1’s, it does not actually compute the function
f ( x ) = 2x, as you might have assumed. We’ll describe two ways of fixing
that.

[State diagram: states q0 , q1 , q2 ; q0 loops on 1, 1, R and goes to q1 on 0, 1, N; q1 loops on 1, 1, R and goes to q2 on 0, 0, L; q2 has a 1, 0, N instruction, after which no instruction applies.]
Figure 12.3: A machine computing f ( x, y) = x + y

[State diagram: states q0 –q8 implementing the strategy of Example 12.13.]
Figure 12.4: A machine computing f ( x ) = 2x

Example 12.13. The machine in Figure 12.4 computes the function f ( x ) = 2x.
Instead of erasing the input and writing two 1’s at the far right for every 1 in
the input as the machine from Example 12.4 does, this machine adds a single 1
to the right for every 1 in the input. It has to keep track of where the input
ends, so it leaves a 0 between the input and the added strokes, which it fills
with a 1 at the very end. And we have to “remember” where we are in the
input, so we temporarily replace a 1 in the input block by a 0.

Example 12.14. A second possibility for computing f ( x ) = 2x is to keep the


original doubler machine, but add states and instructions at the end which

[State diagram: states q6 –q14 implementing the block-moving procedure described in Example 12.14.]
Figure 12.5: Moving a block of 1’s to the left

move the doubled block of strokes to the far left of the tape. The machine
in Figure 12.5 does just this last part: started on a tape consisting of a block
of 0’s followed by a block of 1’s (and the head positioned anywhere in the
block of 0’s), it erases the 1’s one at a time and writes them at the beginning
of the tape. In order to be able to tell when it is done, it first marks the end
of the block of 1’s with a ▷ symbol, which gets deleted at the end. We’ve
started numbering the states at q6 , so they can be added to the doubler ma-
chine. All you’ll need is an additional instruction δ(q0 , 0) = ⟨q6 , 0, N ⟩, i.e., an
arrow from q0 to q6 labelled 0, 0, N. (There is one subtle problem: the resulting
machine does not work for input x = 0. We’ll leave this as an exercise.)

Definition 12.15. A Turing machine M computes the partial function f : Nk ⇀ N iff,

1. M halts on input 1n1 ⌢ 0 ⌢ . . . ⌢ 0 ⌢ 1nk with output 1m if f (n1 , . . . , nk ) = m.

2. M does not halt at all, or halts with an output that is not a single block of 1’s, if f (n1 , . . . , nk ) is undefined.

12.6 Halting States


Although we have defined our machines to halt only when there is no in-
struction to carry out, common representations of Turing machines have a
dedicated halting state h, such that h ∈ Q.
The idea behind a halting state is simple: when the machine has finished
operation (it is ready to accept input, or has finished writing the output), it
goes into a state h where it halts. Some machines have two halting states, one
that accepts input and one that rejects input.

Example 12.16. Halting States. To elucidate this concept, let us begin with an
alteration of the even machine. Instead of having the machine halt in state q0 if
the input is even, we can add an instruction to send the machine into a halting
state.
[State diagram: the even machine with an additional halting state h; the 1, 1, R arrows between q0 and q1 and the 0, 0, R loop at q1 are as before, and a new 0, 0, N arrow leads from q0 to h.]

Let us further expand the example. When the machine determines that the
input is odd, it never halts. We can alter the machine to include a reject state
by replacing the looping instruction with an instruction to go to a reject state r.

[State diagram: states q0 , q1 , h, r; the 1, 1, R arrows between q0 and q1 as before, a 0, 0, N arrow from q0 to the halting state h, and a 0, 0, N arrow from q1 to the reject state r.]

Adding a dedicated halting state can be advantageous in cases like this,


where it makes explicit when the machine accepts/rejects certain inputs. How-
ever, it is important to note that no computing power is gained by adding a
dedicated halting state. Similarly, a less formal notion of halting has its own

advantages. The definition of halting used so far in this chapter makes the proof of the unsolvability of the halting problem intuitive and easy to demonstrate. For this reason, we continue with our original definition.

12.7 Disciplined Machines


In section 12.6, we considered Turing machines that have a single, designated halting state h—such machines are guaranteed to halt, if they halt at all, in state h. In this way, machines with a single halting state are more “disciplined” than we allow Turing machines in general to be. There are other restrictions we might impose on the behavior of Turing machines. For instance, we also have not prohibited Turing machines from erasing the tape-end marker on square 0, or from attempting to move left from square 0. (Our definition states that the head simply stays on square 0 in this case; other definitions have the machine halt.) It is likewise sometimes desirable to be able to assume that a Turing machine, if it halts at all, halts on square 1.

Definition 12.17. A Turing machine M is disciplined iff

1. it has a designated single halting state h,

2. it halts, if it halts at all, while scanning square 1,

3. it never erases the ▷ symbol on square 0, and

4. it never attempts to move left from square 0.

We have already discussed that any Turing machine can be changed into
one with the same behavior but with a designated halting state. This is done
simply by adding a new state h, and adding an instruction δ(q, σ ) = ⟨h, σ, N ⟩
for any pair ⟨q, σ⟩ where the original δ is undefined. It is true, although te-
dious to prove, that any Turing machine M can be turned into a disciplined
Turing machine M′ which halts on the same inputs and produces the same
output. For instance, if the Turing machine halts and is not on square 1, we
can add some instructions to make the head move left until it finds the tape-
end marker, then move one square to the right, then halt. We’ll leave you to
think about how the other conditions can be dealt with.
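The halting-state part of this construction is entirely mechanical; a sketch (with our own helper name) of exactly the step described above:

    def add_halting_state(delta, states, symbols, h="h"):
        # Wherever delta is undefined, add an instruction that moves to the new
        # designated halting state h without changing the tape or moving the head.
        new_delta = dict(delta)
        for q in states:
            for sigma in symbols:
                if (q, sigma) not in delta:
                    new_delta[(q, sigma)] = (h, sigma, "N")
        return new_delta, states + [h]

    # e.g. add_halting_state(even_machine, ["q0", "q1"], ["▷", "0", "1"])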

Example 12.18. In Figure 12.6, we turn the addition machine from Example 12.12
into a disciplined machine.

Proposition 12.19. For every Turing machine M, there is a disciplined Turing ma-
chine M′ which halts with output O if M halts with output O, and does not halt if
M does not halt. In particular, any function f : Nn → N computable by a Turing
machine is also computable by a disciplined Turing machine.

[State diagram: the addition machine extended with states q3 and h; q2 erases the last 1 and moves left (1, 0, L) to q3 , which moves left across the 1’s (1, 1, L) and, on reading ▷, moves one square right (▷, ▷, R) into the halting state h.]
Figure 12.6: A disciplined addition machine

12.8 Combining Turing Machines

The examples of Turing machines we have seen so far have been fairly simple
in nature. But in fact, any problem that can be solved with any modern pro-
gramming language can also be solved with Turing machines. To build more
complex Turing machines, it is important to convince ourselves that we can
combine them, so we can build machines to solve more complex problems by
breaking the procedure into simpler parts. If we can find a natural way to
break a complex problem down into constituent parts, we can tackle the prob-
lem in several stages, creating several simple Turing machines and combining
them into one machine that can solve the problem. This point is especially
important when tackling the Halting Problem in the next section.
How do we combine Turing machines M = ⟨ Q, Σ, q0 , δ⟩ and M′ = ⟨ Q′ , Σ′ , q0′ , δ′ ⟩?
We now use the configuration of the tape after M has halted as the input con-
figuration of a run of machine M′ . To get a single Turing machine M ⌢ M′
that does this, do the following:

1. Renumber (or relabel) all the states Q′ of M′ so that M and M′ have no


states in common (Q ∩ Q′ = ∅).

2. The states of M ⌢ M′ are Q ∪ Q′ .

3. The tape alphabet is Σ ∪ Σ′ .

4. The start state is q0 .

5. The transition function is the function δ′′ given by:

   δ′′ (q, σ ) = δ(q, σ )         if q ∈ Q and δ(q, σ ) is defined,
   δ′′ (q, σ ) = δ′ (q, σ )       if q ∈ Q′ ,
   δ′′ (q, σ ) = ⟨q0′ , σ, N ⟩    if q ∈ Q and δ(q, σ ) is undefined.

The resulting machine uses the instructions of M when it is in a state q ∈ Q, and the instructions of M′ when it is in a state q ∈ Q′ . When it is in a state q ∈ Q
and is scanning a symbol σ for which M has no transition (i.e., M would have
halted), it enters the start state of M′ (and leaves the tape contents and head
position as it is).

Note that unless the machine M is disciplined, we don’t know where the
tape head is when M halts, so the halting configuration of M need not have
the head scanning square 1. When combining machines, it’s important to keep
this in mind.
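As a sketch (assuming machines are given as tuples (Q, Σ, q0 , δ) with δ a Python dictionary, and using a simple renaming to keep the state sets disjoint), the construction of M ⌢ M′ looks like this:

    def combine(machine1, machine2):
        (Q1, S1, start1, d1) = machine1
        (Q2, S2, start2, d2) = machine2
        rename = {q: q + "'" for q in Q2}          # step 1: make the state sets disjoint
        d2_renamed = {(rename[q], s): (rename[q1], s1, mv)
                      for (q, s), (q1, s1, mv) in d2.items()}
        delta = dict(d2_renamed)
        delta.update(d1)                           # M's instructions apply in M's states
        symbols = sorted(set(S1) | set(S2))        # step 3: union of the alphabets
        for q in Q1:                               # step 5, third case: where M would halt,
            for s in symbols:                      # jump to M''s start state instead
                if (q, s) not in d1:
                    delta[(q, s)] = (rename[start2], s, "N")
        return (Q1 + [rename[q] for q in Q2], symbols, start1, delta)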

Example 12.20. Combining Machines: We’ll design a machine which, when


started on input consisting of two blocks of 1’s of length n and m, halts with
a single block of 2(m + n) 1’s on the tape. In order to build this machine, we
can combine two machines we are already familiar with: the addition ma-
chine, and the doubler. We begin by drawing a state diagram for the addition
machine.

[State diagram: the addition machine of Figure 12.3, with states q0 , q1 , q2 .]

Instead of halting in state q2 , we want to continue operation in order to double


the output. Recall that the doubler machine erases the first stroke in the input
and writes two strokes in a separate output. Let’s add an instruction to make
sure the tape head is reading the first stroke of the output of the addition

machine.

[State diagram: the addition machine extended with states q3 and q4 ; from q2 a 1, 0, L instruction erases the last 1 and leads to q3 , which moves left across the 1’s (1, 1, L) and, on reading ▷, moves right (▷, ▷, R) into q4 , so that the head ends up on the first 1 of the sum.]

It is now easy to double the input—all we have to do is connect the doubler


machine onto state q4 . This requires renaming the states of the doubler ma-
chine so that they start at q4 instead of q0 —this way we don’t end up with two
starting states. The final diagram should look as in Figure 12.7.

Proposition 12.21. If M and M′ are disciplined and compute the functions f : Nk →


N and f ′ : N → N, respectively, then M ⌢ M′ is disciplined and computes f ′ ◦ f .

Proof. Since M is disciplined, when it halts with output f (n1 , . . . , nk ) = m, the


head is scanning square 1. If we now enter the start state of M′ , the machine
will halt with output f ′ (m), again scanning square 1. The other conditions of
Definition 12.17 are also satisfied.

12.9 Variants of Turing Machines


There are in fact many possible ways to define Turing machines, of which ours is only one. In some ways, our definition is more liberal than others. We allow arbitrary finite alphabets, whereas a more restricted definition might allow only two tape symbols, 1 and 0. We allow the machine to write a symbol to the tape and move at the same time, whereas other definitions allow either writing or moving, but not both in one step. We allow the possibility of writing without moving the tape head; other definitions leave out the N “instruction.” In other ways, our definition is more restrictive. We assumed that the tape is infinite in one direction only; other definitions allow the tape to be infinite both to the left and the right. In fact, one can even allow any number of separate tapes, or even an infinite grid of squares. We represent the instruction set of the Turing machine by a transition function; other definitions use a transition relation where the machine has more than one possible instruction in any given situation.

[State diagram: the addition machine (states q0 –q3 ) followed by the doubler machine relabelled to start at q4 (states q4 –q9 ).]
Figure 12.7: Combining adder and doubler machines

This last relaxation of the definition is particularly interesting. In our def-


inition, when the machine is in state q reading symbol σ, δ(q, σ) determines
what the new symbol, state, and tape head position is. But if we allow the
instruction set to be a relation between current state-symbol pairs ⟨q, σ ⟩ and
new state-symbol-direction triples ⟨q′ , σ′ , D ⟩, the action of the Turing machine
may not be uniquely determined—the instruction relation may contain both
⟨q, σ, q′ , σ′ , D ⟩ and ⟨q, σ, q′′ , σ′′ , D ′ ⟩. In this case we have a non-deterministic
Turing machine. These play an important role in computational complexity
theory.
There are also different conventions for when a Turing machine halts: we
say it halts when the transition function is undefined, other definitions require
the machine to be in a special designated halting state. We have explained in
section 12.6 why requiring a designated halting state is not a restriction which
impacts what Turing machines can compute. Since the tapes of our Turing
machines are infinite in one direction only, there are cases where a Turing
machine can’t properly carry out an instruction: if it reads the leftmost square

and is supposed to move left. According to our definition, it just stays put
instead of “falling off”, but we could have defined it so that it halts when that
happens. This definition is also equivalent: we could simulate the behavior
of a Turing machine that halts when it attempts to move left from square 0
by deleting every transition δ(q, ▷) = ⟨q′ , σ, L⟩—then instead of attempting to
move left on ▷ the machine halts.1
There are also different ways of representing numbers (and hence the input-
output function computed by a Turing machine): we use unary representa-
tion, but you can also use binary representation. This requires two symbols in
addition to 0 and ▷.
Now here is an interesting fact: none of these variations matters as to
which functions are Turing computable. If a function is Turing computable ac-
cording to one definition, it is Turing computable according to all of them.
We won’t go into the details of verifying this. Here’s just one example:
we gain no additional computing power by allowing a tape that is infinite
in both directions, or multiple tapes. The reason is, roughly, that a Turing
machine with a single one-way infinite tape can simulate multiple or two-way
infinite tapes. E.g., using additional states and instructions, we can “translate”
a program for a machine with multiple tapes or two-way infinite tape into
one with a single one-way infinite tape. The translated machine can use the
even squares for the squares of tape 1 (or the “positive” squares of a two-way
infinite tape) and the odd squares for the squares of tape 2 (or the “negative”
squares).
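The only bookkeeping this translation requires is a change of coordinates between squares; a sketch (our own helper name):

    def single_tape_square(tape, square):
        # Tape 0 (or the "positive" half of a two-way infinite tape) is stored on
        # the even squares, tape 1 (or the "negative" half) on the odd squares.
        return 2 * square + tape

    # squares 0, 1, 2, ... of tape 0 land on 0, 2, 4, ...; those of tape 1 on 1, 3, 5, ...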

12.10 The Church–Turing Thesis


Turing machines are supposed to be a precise replacement for the concept of
an effective procedure. Turing thought that anyone who grasped both the
concept of an effective procedure and the concept of a Turing machine would
have the intuition that anything that could be done via an effective procedure
could be done by Turing machine. This claim is given support by the fact
that all the other proposed precise replacements for the concept of an effective
procedure turn out to be extensionally equivalent to the concept of a Turing
machine —that is, they can compute exactly the same set of functions. This
claim is called the Church–Turing thesis.

Definition 12.22 (Church–Turing thesis). The Church–Turing Thesis states that


anything computable via an effective procedure is Turing computable.

The Church–Turing thesis is appealed to in two ways. The first kind of


use of the Church–Turing thesis is an excuse for laziness. Suppose we have a
1 This doesn’t quite work, since nothing prevents us from writing and reading ▷ on squares

other than square 0 (see Example 12.14). We can get around that by adding a second ▷′ symbol to
use instead for such a purpose.

description of an effective procedure to compute something, say, in “pseudo-


code.” Then we can invoke the Church–Turing thesis to justify the claim that
the same function is computed by some Turing machine, even if we have not
in fact constructed it.
The other use of the Church–Turing thesis is more philosophically interesting. It can be shown that there are functions which cannot be computed by Turing machines. From this, using the Church–Turing thesis, one can conclude that such a function cannot be effectively computed by any procedure whatsoever. For if there were such a procedure, by the Church–Turing thesis, it would follow that there would be a Turing machine for it. So if we can prove that there is no Turing machine that computes it, there also can’t be an effective procedure. In particular, the Church–Turing thesis is invoked to claim that the so-called halting problem not only cannot be solved by Turing machines, it cannot be effectively solved at all.

Chapter 13

Undecidability

13.1 Introduction
It might seem obvious that not every function, not even every arithmetical function, is computable. There are just too many of them, and some have behavior that is far too complicated: functions defined from the decay of radioactive particles, for instance, or from other chaotic or random behavior. Suppose we start counting 1-
second intervals from a given time, and define the function f (n) as the num-
ber of particles in the universe that decay in the n-th 1-second interval after
that initial moment. This seems like a candidate for a function we cannot ever
hope to compute.
But it is one thing to not be able to imagine how one would compute such
functions, and quite another to actually prove that they are uncomputable.
In fact, even functions that seem hopelessly complicated may, in an abstract
sense, be computable. For instance, suppose the universe is finite in time—
some day, in the very distant future the universe will contract into a single
point, as some cosmological theories predict. Then there is only a finite (but
incredibly large) number of seconds from that initial moment for which f (n)
is defined. And any function which is defined for only finitely many inputs is
computable: we could list the outputs in one big table, or code it in one very
big Turing machine state transition diagram.
We are often interested in special cases of functions whose values give the
answers to yes/no questions. For instance, the question “is n a prime num-
ber?” is associated with the function
isprime(n) = 1   if n is prime
isprime(n) = 0   otherwise.
We say that a yes/no question can be effectively decided, if the associated 1/0-
valued function is effectively computable.
To prove mathematically that there are functions which cannot be effectively computed, or problems that cannot be effectively decided, it is essential to

fix a specific model of computation, and show that there are functions it can-
not compute or problems it cannot decide. We can show, for instance, that not
every function can be computed by Turing machines, and not every problem
can be decided by Turing machines. We can then appeal to the Church–Turing
thesis to conclude that not only are Turing machines not powerful enough to
compute every function, but no effective procedure can.

The key to proving such negative results is the fact that we can assign
numbers to Turing machines themselves. The easiest way to do this is to enu-
merate them, perhaps by fixing a specific way to write down Turing machines
and their programs, and then listing them in a systematic fashion. Once we
see that this can be done, then the existence of Turing-uncomputable func-
tions follows by simple cardinality considerations: the set of functions from
N to N (in fact, even just from N to {0, 1}) are uncountable, but since we can
enumerate all the Turing machines, the set of Turing-computable functions is
only countably infinite.

We can also define specific functions and problems which we can prove
to be uncomputable and undecidable, respectively. One such problem is the
so-called Halting Problem. Turing machines can be finitely described by list-
ing their instructions. Such a description of a Turing machine, i.e., a Turing
machine program, can of course be used as input to another Turing machine.
So we can consider Turing machines that decide questions about other Tur-
ing machines. One particularly interesting question is this: “Does the given
Turing machine eventually halt when started on input n?” It would be nice if
there were a Turing machine that could decide this question: think of it as a
quality-control Turing machine which ensures that Turing machines don’t get
caught in infinite loops and such. The interesting fact, which Turing proved,
is that there cannot be such a Turing machine. There cannot be a single Turing
machine which, when started on input consisting of a description of a Turing
machine M and some number n, will always halt with either output 1 or 0
according to whether the machine M would have halted when started on input n or not.

Once we have examples of specific undecidable problems we can use them


to show that other problems are undecidable, too. For instance, one celebrated
undecidable problem is the question, “Is the first-order formula φ valid?”.
There is no Turing machine which, given as input a first-order formula φ, is
guaranteed to halt with output 1 or 0 according to whether φ is valid or not.
Historically, the question of finding a procedure to effectively solve this prob-
lem was called simply “the” decision problem; and so we say that the decision
problem is unsolvable. Turing and Church proved this result independently
at around the same time, so it is also called the Church–Turing Theorem.

[Two state diagrams: the even machine M with states q0 , q1 and alphabet {▷, 0, 1}, and the machine M′ with states s, h and alphabet {▷, 0, A}, obtained by replacing every 1 in the labels with A.]
Figure 13.1: Variants of the Even machine

13.2 Enumerating Turing Machines


We can show that the set of all Turing machines is countable. This follows
from the fact that each Turing machine can be finitely described. The set of
states and the tape vocabulary are finite sets. The transition function is a par-
tial function from Q × Σ to Q × Σ × { L, R, N }, and so likewise can be speci-
fied by listing its values for the finitely many argument pairs for which it is
defined.
This is true as far as it goes, but there is a subtlety. The definition
of Turing machines made no restriction on what elements the set of states
and tape alphabet can have. So, e.g., for every real number, there technically
is a Turing machine that uses that number as a state. However, the behavior
of the Turing machine is independent of which objects serve as states and
vocabulary. Consider the two Turing machines in Figure 13.1. These two
diagrams correspond to two machines, M with the tape alphabet Σ = {▷, 0, 1}
and set of states {q0 , q1 }, and M′ with alphabet Σ′ = {▷, 0, A} and states {s, h}.
But their instructions are otherwise the same: M will halt on a sequence of n
1’s iff n is even, and M′ will halt on a sequence of n A’s iff n is even. All
we’ve done is rename 1 to A, q0 to s, and q1 to h. This example generalizes:
we can think of Turing machines as the same as long as one results from the
other by such a renaming of symbols and states. In fact, we can simply think
of the symbols and states of a Turing machine as positive integers: instead of
σ0 think 1, instead of σ1 think 2, etc.; ▷ is 1, 0 is 2, etc. In this way, the Even
machine becomes the machine depicted in Figure 13.2. We might call a Turing
machine with states and symbols that are positive integers a standard machine,
and only consider standard machines from now on.1
1 The terminology “standard machine” is not standard.

[State diagram: the even machine with states renamed to 1 and 2 and symbols renamed to 1 (▷), 2 (0), 3 (1); the arrows between states 1 and 2 are labelled 3, 3, R, and state 2 has a 2, 2, R loop.]
Figure 13.2: A standard Even machine

We wanted to show that the set of Turing machines is countable, and with
the above considerations in mind, it is enough to show that the set of stan-
dard Turing machines is countable. Suppose we are given a standard Turing
machine M = ⟨ Q, Σ, q0 , δ⟩. How could we describe it using a finite string of
positive integers? We’ll first list the number of states, the states themselves,
the number of symbols, the symbols themselves, and the starting state. (Re-
member, all of these are positive integers, since M is a standard machine.)
What about δ? The set of possible arguments, i.e., pairs ⟨q, σ ⟩, is finite, since
Q and Σ are finite. So the information in δ is simply the finite list of all 5-
tuples ⟨q, σ, q′ , σ′ , d⟩ where δ(q, σ) = ⟨q′ , σ′ , D ⟩, and d is a number that codes
the direction D (say, 1 for L, 2 for R, and 3 for N).
In this way, every standard Turing machine can be described by a finite list
of positive integers, i.e., as a sequence s M ∈ (Z+ )∗ . For instance, the standard
Even machine is coded by the sequence

2, 1, 2,   3, 1, 2, 3,   1,   1, 3, 2, 3, 2,   2, 2, 2, 2, 2,   2, 3, 1, 3, 2.

Here the first group (2, 1, 2) describes Q, the second group (3, 1, 2, 3) describes Σ, the single 1 is the start state, and the remaining three groups of five code the instructions δ(1, 3) = ⟨2, 3, R⟩, δ(2, 2) = ⟨2, 2, R⟩, and δ(2, 3) = ⟨1, 3, R⟩.
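This listing convention is easy to mechanize. A sketch in Python (the helper name and the ordering of instructions are ours) which reproduces the sequence above for the standard Even machine:

    def describe(states, symbols, start, delta):
        # number of states, the states, number of symbols, the symbols, the start
        # state, then one 5-tuple per instruction, with directions coded as
        # 1 for L, 2 for R, 3 for N.
        direction_code = {"L": 1, "R": 2, "N": 3}
        seq = [len(states), *states, len(symbols), *symbols, start]
        for (q, sigma), (q2, sigma2, d) in sorted(delta.items()):
            seq += [q, sigma, q2, sigma2, direction_code[d]]
        return seq

    standard_even = {(1, 3): (2, 3, "R"), (2, 2): (2, 2, "R"), (2, 3): (1, 3, "R")}
    print(describe([1, 2], [1, 2, 3], 1, standard_even))
    # [2, 1, 2, 3, 1, 2, 3, 1, 1, 3, 2, 3, 2, 2, 2, 2, 2, 2, 2, 3, 1, 3, 2]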

Theorem 13.1. There are functions from N to N which are not Turing computable.

Proof. We know that the set of finite sequences of positive integers (Z+ )∗ is
countable (problem 4.7). This gives us that the set of descriptions of standard
Turing machines, as a subset of (Z+ )∗ , is itself enumerable. Every Turing
computable function from N to N is computed by some (in fact, many) Turing ma-
chines. By renaming its states and symbols to positive integers (in particular,
▷ as 1, 0 as 2, and 1 as 3) we can see that every Turing computable function is
computed by a standard Turing machine. This means that the set of all Turing
computable functions from N to N is also enumerable.
On the other hand, the set of all functions from N to N is not countable
(problem 4.21). If all functions were computable by some Turing machine,
we could enumerate the set of all functions by listing all the descriptions of

Turing machines that compute them. So there are some functions that are not
Turing computable.

13.3 Universal Turing Machines


In section 13.2 we discussed how every Turing machine can be described by
a finite sequence of integers. This sequence encodes the states, alphabet, start
state, and instructions of the Turing machine. We also pointed out that the
set of all of these descriptions is countable. Since the set of such descriptions
is countably infinite, this means that there is a surjective function from N to
these descriptions. Such a surjective function can be obtained, for instance,
using Cantor’s zig-zag method. It gives us a way of enumerating all (descrip-
tions) of Turing machines. If we fix one such enumeration, it now makes sense
to talk of the 1st, 2nd, . . . , eth Turing machine. These numbers are called in-
dices.

Definition 13.2. If M is the eth Turing machine (in our fixed enumeration), we
say that e is an index of M. We write Me for the eth Turing machine.

A machine may have more than one index, e.g., two descriptions of M
may differ in the order in which we list its instructions, and these different
descriptions will have different indices.
Importantly, it is possible to give the enumeration of Turing machine de-
scriptions in such a way that we can effectively compute the description of M
from its index, and to effectively compute an index of a machine M from its
description. By the Church–Turing thesis, it is then possible to find a Turing
machine which recovers the description of the Turing machine with index e
and writes the corresponding description on its tape as output. The descrip-
tion would be a sequence of blocks of 1’s (representing the positive integers in
the sequence describing Me ).
Given this, it now becomes natural to ask: what functions of Turing ma-
chine indices are themselves computable by Turing machines? What proper-
ties of Turing machine indices can be decided by Turing machines? An ex-
ample: the function that maps an index e to the number of states the Turing
machine with index e has, is computable by a Turing machine. Here’s what
such a Turing machine would do: started on a tape containing a single block
of e 1’s, it would first decode e into its description. The description is now
represented by a sequence of blocks of 1’s on the tape. Since the first element in this sequence is the number of states, all that has to be done now is to erase everything but the first block of 1’s and then halt.
A remarkable result is the following:

Theorem 13.3. There is a universal Turing machine U which, when started on


input ⟨e, n⟩

1. halts iff Me halts on input n, and

2. if Me halts with output m, so does U.

U thus computes the function f : N × N ⇀ N given by f (e, n) = m if Me started on input n halts with output m, and undefined otherwise.

Proof. To actually produce U is basically impossible, since it is an extremely


complicated machine. But we can describe in outline how it works, and then
invoke the Church–Turing thesis. When it starts, U’s tape contains a block of e
1’s followed by a block of n 1’s. It first “decodes” the index e to the right of the
input n. This produces a list of numbers (i.e., blocks of 1’s separated by 0’s)
that describes the instructions of machine Me . U then writes the number of the
start state of Me and the number 1 on the tape to the right of the description
of Me . (Again, these are represented in unary, as blocks of 1’s.) Next, it copies
the input (block of n 1’s) to the right—but it replaces each 1 by a block of three
1’s (remember, the number of the 1 symbol is 3, 1 being the number of ▷ and
2 being the number of 0). At the left end of this sequence of blocks (separated
by 0 symbols on the tape of U), it writes a single 1, the code for ▷.
U now has on its tape: the index e, the number n, the code number of the
start state (the “current state”), the number of the initial head position 1 (the
“current head position”), and the initial contents of the “tape” (a sequence
of blocks of 1’s representing the code numbers of the symbols of Me —the
“symbols”—separated by 0’s).
It now simulates what Me would do if started on input n, by doing the
following:

1. Find the number k of the “current head position” (at the beginning,
that’s 1),

2. Move to the kth block in the “tape” to see what the “symbol” there is,

3. Find the instruction matching the current “state” and “symbol,”

4. Move back to the kth block on the “tape” and replace the “symbol” there
with the code number of the symbol Me would write,

5. Move the head to where it records the current “state” and replace the
number there with the number of the new state,

6. Move to the place where it records the “tape position” and erase a 1 or
add a 1 (if the instruction says to move left or right, respectively).

7. Repeat.2
2 We’re glossing over some subtle difficulties here. E.g., U may need some extra space when

it increases the counter where it keeps track of the “current head position”—in that case it will
have to move the entire “tape” to the right.

If Me started on input n never halts, then U also never halts, so its output is
undefined.
If in step (3) it turns out that the description of Me contains no instruction
for the current “state”/“symbol” pair, then Me would halt. If this happens, U
erases the part of its tape to the left of the “tape.” For each block of three 1’s
(representing a 1 on Me ’s tape), it writes a 1 on the left end of its own tape, and
successively erases the “tape.” When this is done, U’s tape contains a single
block of 1’s of length m.
If U encounters something other than a block of three 1’s on the “tape,” it
immediately halts. Since U’s tape in this case does not contain a single block
of 1’s, its output is not a natural number, i.e., f (e, n) is undefined in this case.
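In modern terms, U is an interpreter. Here is a rough sketch in Python of the same idea (our own names; we pass the description sequence of a standard machine directly rather than an index, since recovering the description from an index is exactly the extra decoding step the proof describes). The optional step budget is only there so that experiments terminate; U itself simply never halts when Me does not.

    def universal(description, n, max_steps=None):
        # Decode the description sequence of a standard machine (as in section 13.2).
        k = description[0]                         # number of states
        s = description[1 + k]                     # number of symbols
        start = description[2 + k + s]
        delta = {}
        rest = description[3 + k + s:]
        for j in range(0, len(rest), 5):
            q, sigma, q2, sigma2, d = rest[j:j + 5]
            delta[(q, sigma)] = (q2, sigma2, d)    # d: 1 = L, 2 = R, 3 = N
        # Simulate on input 1^n, with symbols coded as 1 (▷), 2 (0), 3 (1).
        tape, head, state = [1] + [3] * n + [2], 1, start
        steps = 0
        while (state, tape[head]) in delta:
            if max_steps is not None and steps >= max_steps:
                return None                        # budget exhausted; U would keep going
            state, symbol, d = delta[(state, tape[head])]
            tape[head] = symbol
            head = head + 1 if d == 2 else max(head - 1, 0) if d == 1 else head
            if head == len(tape):
                tape.append(2)
            steps += 1
        return sum(1 for x in tape if x == 3)      # output length, if the result is a block of 1's

    # With the Even machine's description from section 13.2:
    # universal([2,1,2,3,1,2,3,1,1,3,2,3,2,2,2,2,2,2,2,3,1,3,2], 4) == 4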

13.4 The Halting Problem


Assume we have fixed some enumeration of Turing machine descriptions.
Each Turing machine thus receives an index: its place in the enumeration M1 ,
M2 , M3 , . . . of Turing machine descriptions.
We know that there must be non-Turing-computable functions: the set
of Turing machine descriptions—and hence the set of Turing machines—is
countable, but the set of all functions from N to N is not. But we can find
specific examples of non-computable functions as well. One such function is
the halting function.

Definition 13.4 (Halting function). The halting function h is defined as


h(e, n) = 0   if machine Me does not halt for input n
h(e, n) = 1   if machine Me halts for input n

Definition 13.5 (Halting problem). The Halting Problem is the problem of de-
termining (for any e, n) whether the Turing machine Me halts for an input of n
strokes.

We show that h is not Turing-computable by showing that a related func-


tion, s, is not Turing-computable. This proof relies on the fact that anything
that can be computed by a Turing machine can be computed by a disciplined
Turing machine (section 12.7), and the fact that two Turing machines can be
hooked together to create a single machine (section 12.8).

Definition 13.6. The function s is defined as


s(e) = 0   if machine Me does not halt for input e
s(e) = 1   if machine Me halts for input e

Lemma 13.7. The function s is not Turing computable.

Proof. We suppose, for contradiction, that the function s is Turing computable.


Then there would be a Turing machine S that computes s. We may assume,
without loss of generality, that when S halts, it does so while scanning the
first square (i.e., that it is disciplined). This machine can be “hooked up” to
another machine J, which halts if it is started on input 0 (i.e., if it reads 0 in the
initial state while scanning the square to the right of the end-of-tape symbol),
and otherwise wanders off to the right, never halting. S ⌢ J, the machine
created by hooking S to J, is a Turing machine, so it is Me for some e (i.e., it
appears somewhere in the enumeration). Start Me on an input of e 1s. There
are two possibilities: either Me halts or it does not halt.

1. Suppose Me halts for an input of e 1s. Then s(e) = 1. So S, when started


on e, halts with a single 1 as output on the tape. Then J starts with a 1
on the tape. In that case J does not halt. But Me is the machine S ⌢ J,
so it should do exactly what S followed by J would do (i.e., in this case,
wander off to the right and never halt). So Me cannot halt for an input
of e 1’s.

2. Now suppose Me does not halt for an input of e 1s. Then s(e) = 0, and
S, when started on input e, halts with a blank tape. J, when started on
a blank tape, immediately halts. Again, Me does what S followed by J
would do, so Me must halt for an input of e 1’s.

In each case we arrive at a contradiction with our assumption. This shows


there cannot be a Turing machine S: s is not Turing computable.

Theorem 13.8 (Unsolvability of the Halting Problem). The halting problem is


unsolvable, i.e., the function h is not Turing computable.

Proof. Suppose h were Turing computable, say, by a Turing machine H. We


could use H to build a Turing machine that computes s: First, make a copy
of the input (separated by a 0 symbol). Then move back to the beginning,
and run H. We can clearly make a machine that does the former (see prob-
lem 12.13), and if H existed, we would be able to “hook it up” to such a copier
machine to get a new machine which would determine if Me halts on input e,
i.e., computes s. But we’ve already shown that no such machine can exist.
Hence, h is also not Turing computable.

13.5 The Decision Problem


We say that first-order logic is decidable iff there is an effective method for
determining whether or not a given sentence is valid. As it turns out, there is
no such method: the problem of deciding validity of first-order sentences is
unsolvable.

In order to establish this important negative result, we prove that the de-
cision problem cannot be solved by a Turing machine. That is, we show that
there is no Turing machine which, whenever it is started on a tape that con-
tains a first-order sentence, eventually halts and outputs either 1 or 0 depend-
ing on whether the sentence is valid or not. By the Church–Turing thesis,
every function which is computable is Turing computable. So if this “validity
function” were effectively computable at all, it would be Turing computable.
If it isn’t Turing computable, then, it also cannot be effectively computable.
Our strategy for proving that the decision problem is unsolvable is to re-
duce the halting problem to it. This means the following: We have proved that
the function h(e, w) whose value is 1 if the Turing machine described by e halts on input w, and 0 otherwise, is not Turing computable. We
will show that if there were a Turing machine that decides validity of first-
order sentences, then there is also Turing machine that computes h. Since h
cannot be computed by a Turing machine, there cannot be a Turing machine
that decides validity either.
The first step in this strategy is to show that for every input w and a Turing
machine M, we can effectively describe a sentence τ ( M, w) representing the
instruction set of M and the input w and a sentence α( M, w) expressing “M
eventually halts” such that:

⊨ τ ( M, w) ⊃ α( M, w) iff M halts for input w.

The bulk of our proof will consist in describing these sentences τ ( M, w) and α( M, w)
and in verifying that τ ( M, w) ⊃ α( M, w) is valid iff M halts on input w.

13.6 Representing Turing Machines

In order to represent Turing machines and their behavior by a sentence of


first-order logic, we have to define a suitable language. The language consists
of two parts: predicate symbols for describing configurations of the machine,
and expressions for numbering execution steps (“moments”) and positions on
the tape.
We introduce two kinds of predicate symbols, both of them 2-place: For
each state q, a predicate symbol Qq , and for each tape symbol σ, a predicate
symbol Sσ . The former allow us to describe the state of M and the position of
its tape head, the latter allow us to describe the contents of the tape.
In order to express the positions of the tape head and the number of steps
executed, we need a way to express numbers. This is done using a constant
symbol 0, and a 1-place function ′, the successor function. By convention it is
written after its argument (and we leave out the parentheses).

For each number n there is a canonical term, the numeral for n, which represents it in L M : the numeral for 0 is the constant symbol 0, the numeral for 1 is 0′ , the numeral for 2 is 0′′ , and so on. More formally, the numeral for 0 is the term 0, and the numeral for n + 1 is the numeral for n followed by ′ . (In the formulas below, when a number appears in an argument position it stands for the corresponding numeral.)

The numeral for 0, i.e., the term 0, names the leftmost position on the tape as well as the time before the first execution step (the initial configuration). The numeral for 1, i.e., 0′ , names the square to the right of the leftmost square, and the time after the
first execution step, and so on.
We also introduce a predicate symbol < to express both the ordering of
tape positions (when it means “to the left of”) and execution steps (then it
means “before”).
Once we have the language in place, we list the “axioms” of τ ( M, w), i.e.,
the sentences which, taken together, describe the behavior of M when run on
input w. There will be sentences which lay down conditions on 0, ′, and <,
sentences that describes the input configuration, and sentences that describe
what the configuration of M is after it executes a particular instruction.

Definition 13.9. Given a Turing machine M = ⟨ Q, Σ, q0 , δ⟩, the language L M


consists of:

1. A two-place predicate symbol Qq ( x, y) for every state q ∈ Q. Intu-


itively, Qq (m, n) expresses “after n steps, M is in state q scanning the
mth square.”

2. A two-place predicate symbol Sσ ( x, y) for every symbol σ ∈ Σ. Intu-


itively, Sσ (m, n) expresses “after n steps, the mth square contains sym-
bol σ.”

3. A constant symbol 0

4. A one-place function symbol ′

5. A two-place predicate symbol <

The sentences describing the operation of the Turing machine M on input


w = σi1 . . . σik are the following:

1. Axioms describing numbers and <:

a) A sentence that says that every number is less than its successor:

∀x x < x′

b) A sentence that ensures that < is transitive:

∀ x ∀y ∀z (( x < y & y < z) ⊃ x < z)

2. Axioms describing the input configuration:


a) After 0 steps—before the machine starts—M is in the initial state q0 ,
scanning square 1:
Qq0 (1, 0)
b) The first k + 1 squares contain the symbols ▷, σi1 , . . . , σik :

S▷ (0, 0) & Sσi1 (1, 0) & · · · & Sσik (k, 0)

c) Otherwise, the tape is empty:

∀ x (k < x ⊃ S0 ( x, 0))

3. Axioms describing the transition from one configuration to the next:


For the following, let φ( x, y) be the conjunction of all sentences of the
form
∀z (((z < x ∨ x < z) & Sσ (z, y)) ⊃ Sσ (z, y′ ))
where σ ∈ Σ. We use φ(m, n) to express “other than at square m, the
tape after n + 1 steps is the same as after n steps.”
a) For every instruction δ(qi , σ ) = ⟨q j , σ′ , R⟩, the sentence:

∀ x ∀y ((Qqi ( x, y) & Sσ ( x, y)) ⊃


(Qq j ( x ′ , y′ ) & Sσ′ ( x, y′ ) & φ( x, y)))
This says that if, after y steps, the machine is in state qi scanning
square x which contains symbol σ, then after y + 1 steps it is scan-
ning square x + 1, is in state q j , square x now contains σ′ , and every
square other than x contains the same symbol as it did after y steps.
b) For every instruction δ(qi , σ ) = ⟨q j , σ′ , L⟩, the sentence:

∀ x ∀y ((Qqi ( x ′ , y) & Sσ ( x ′ , y)) ⊃


(Qq j ( x, y′ ) & Sσ′ ( x ′ , y′ ) & φ( x, y))) &
∀y ((Qqi (0, y) & Sσ (0, y)) ⊃
(Qq j (0, y′ ) & Sσ′ (0, y′ ) & φ(0, y)))
Take a moment to think about how this works: now we don’t start
with “if scanning square x . . . ” but: “if scanning square x + 1 . . . ” A
move to the left means that in the next step the machine is scanning
square x. But the square that is written on is x + 1. We do it this
way since we don’t have subtraction or a predecessor function.
Note that numbers of the form x + 1 are 1, 2, . . . , i.e., this doesn’t
cover the case where the machine is scanning square 0 and is sup-
posed to move left (which of course it can’t—it just stays put). That

special case is covered by the second conjunction: it says that if, af-
ter y steps, the machine is scanning square 0 in state qi and square 0
contains symbol σ, then after y + 1 steps it’s still scanning square 0,
is now in state q j , the symbol on square 0 is σ′ , and the squares
other than square 0 contain the same symbols they contained ofter
y steps.
c) For every instruction δ(qi , σ) = ⟨q j , σ′ , N ⟩, the sentence:

∀ x ∀y ((Qqi ( x, y) & Sσ ( x, y)) ⊃


(Qq j ( x, y′ ) & Sσ′ ( x, y′ ) & φ( x, y)))

Let τ ( M, w) be the conjunction of all the above sentences for Turing machine M
and input w.
In order to express that M eventually halts, we have to find a sentence that
says “after some number of steps, the transition function will be undefined.”
Let X be the set of all pairs ⟨q, σ ⟩ such that δ(q, σ ) is undefined. Let α( M, w)
then be the sentence
∃ x ∃y ⋁⟨q,σ⟩∈X (Qq ( x, y) & Sσ ( x, y))

If we use a Turing machine with a designated halting state h, it is even


easier: then the sentence α( M, w)

∃ x ∃y Qh ( x, y)

expresses that the machine eventually halts.
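Producing these sentences is a purely mechanical matter, which is what makes the reduction effective. A small sketch in Python (our own helper names; formulas are built as strings in the notation of this section, with the state and symbol subscripts simply written into the predicate names):

    def axiom_right(qi, sigma, qj, sigma2):
        # The axiom of clause (3a) for an instruction delta(qi, sigma) = <qj, sigma', R>.
        return (f"∀x ∀y ((Q{qi}(x, y) & S{sigma}(x, y)) ⊃ "
                f"(Q{qj}(x′, y′) & S{sigma2}(x, y′) & φ(x, y)))")

    def alpha(delta, states, symbols):
        # α(M, w): eventually a state-symbol pair without an instruction is reached.
        gaps = [(q, s) for q in states for s in symbols if (q, s) not in delta]
        disjuncts = [f"(Q{q}(x, y) & S{s}(x, y))" for q, s in gaps]
        return "∃x ∃y (" + " ∨ ".join(disjuncts) + ")"

    even_machine = {("q0", "1"): ("q1", "1", "R"),
                    ("q1", "1"): ("q0", "1", "R"),
                    ("q1", "0"): ("q1", "0", "R")}
    print(alpha(even_machine, ["q0", "q1"], ["▷", "0", "1"]))
    # ∃x ∃y ((Qq0(x, y) & S▷(x, y)) ∨ (Qq0(x, y) & S0(x, y)) ∨ (Qq1(x, y) & S▷(x, y)))

The full τ(M, w) adds, in the same way, the axioms for < and the input configuration, and the axioms for L and N instructions.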

Proposition 13.10. If m < k, then τ ( M, w) ⊨ m < k (where the m and k in the entailed formula are the corresponding numerals).

Proof. Exercise.

13.7 Verifying the Representation


In order to verify that our representation works, we have to prove two things.
First, we have to show that if M halts on input w, then τ ( M, w) ⊃ α( M, w) is
valid. Then, we have to show the converse, i.e., that if τ ( M, w) ⊃ α( M, w) is
valid, then M does in fact eventually halt when run on input w.
The strategy for proving these is very different. For the first result, we have
to show that a sentence of first-order logic (namely, τ ( M, w) ⊃ α( M, w)) is
valid. The easiest way to do this is to give a derivation. Our proof is supposed
to work for all M and w, though, so there isn’t really a single sentence for
which we have to give a derivation, but infinitely many. So the best we can do
is to prove by induction that, whatever M and w look like, and however many

steps it takes M to halt on input w, there will be a derivation of τ ( M, w) ⊃


α( M, w).
Naturally, our induction will proceed on the number of steps M takes be-
fore it reaches a halting configuration. In our inductive proof, we’ll estab-
lish that for each step n of the run of M on input w, τ ( M, w) ⊨ χ( M, w, n),
where χ( M, w, n) correctly describes the configuration of M run on w after n
steps. Now if M halts on input w after, say, n steps, χ( M, w, n) will describe
a halting configuration. We’ll also show that χ( M, w, n) ⊨ α( M, w), when-
ever χ( M, w, n) describes a halting configuration. So, if M halts on input w,
then for some n, M will be in a halting configuration after n steps. Hence,
τ ( M, w) ⊨ χ( M, w, n) where χ( M, w, n) describes a halting configuration, and
since in that case χ( M, w, n) ⊨ α( M, w), we get that τ ( M, w) ⊨ α( M, w), i.e.,
that ⊨ τ ( M, w) ⊃ α( M, w).
The strategy for the converse is very different. Here we assume that ⊨
τ ( M, w) ⊃ α( M, w) and have to prove that M halts on input w. From the hy-
pothesis we get that τ ( M, w) ⊨ α( M, w), i.e., α( M, w) is true in every structure
in which τ ( M, w) is true. So we’ll describe a structure 𝔐 in which τ ( M, w) is true: its domain will be N, and the interpretation of all the Qq and Sσ will be given by the configurations of M during a run on input w. So, e.g., 𝔐 ⊨ Qq (m, n) iff M, when run on input w for n steps, is in state q and scanning square m. Now since τ ( M, w) ⊨ α( M, w) by hypothesis, and since 𝔐 ⊨ τ ( M, w) by construction, 𝔐 ⊨ α( M, w). But 𝔐 ⊨ α( M, w) iff there is some n ∈ |𝔐| = N so that M, run on input w, is in a halting configuration after n steps.

Definition 13.11. Let χ( M, w, n) be the sentence

Qq (m, n) & Sσ0 (0, n) & · · · & Sσk (k, n) & ∀ x (k < x ⊃ S0 ( x, n))

where q is the state of M at time n, M is scanning square m at time n, square i


contains symbol σi at time n for 0 ≤ i ≤ k and k is the right-most non-blank
square of the tape at time 0, or the right-most square the tape head has visited
after n steps, whichever is greater.

Lemma 13.12. If M run on input w is in a halting configuration after n steps, then


χ( M, w, n) ⊨ α( M, w).

Proof. Suppose that M halts for input w after n steps. There is some state q,
square m, and symbol σ such that:

1. After n steps, M is in state q scanning square m on which σ appears.

2. The transition function δ(q, σ ) is undefined.

197
13. U NDECIDABILITY

χ( M, w, n) is the description of this configuration and will include the clauses


Qq (m, n) and Sσ (m, n). These clauses together imply α( M, w):
∃ x ∃y ⋁⟨q′ ,σ′ ⟩∈X (Qq′ ( x, y) & Sσ′ ( x, y)),

since Qq (m, n) & Sσ (m, n) ⊨ ⋁⟨q′ ,σ′ ⟩∈X (Qq′ (m, n) & Sσ′ (m, n)), as ⟨q, σ ⟩ ∈ X.

So if M halts for input w, then there is some n such that χ( M, w, n) ⊨


α( M, w). We will now show that for any time n, τ ( M, w) ⊨ χ( M, w, n).

Lemma 13.13. For each n, if M has not halted after n steps, τ ( M, w) ⊨ χ( M, w, n).

Proof. Induction basis: If n = 0, then the conjuncts of χ( M, w, 0) are also con-


juncts of τ ( M, w), so entailed by it.
Inductive hypothesis: If M has not halted before the nth step, then τ ( M, w) ⊨
χ( M, w, n). We have to show that (unless χ( M, w, n) describes a halting con-
figuration), τ ( M, w) ⊨ χ( M, w, n + 1).
Suppose n > 0 and after n steps, M started on w is in state q scanning
square m. Since M does not halt after n steps, there must be an instruction of
one of the following three forms in the program of M:

1. δ(q, σ ) = ⟨q′ , σ′ , R⟩

2. δ(q, σ ) = ⟨q′ , σ′ , L⟩

3. δ(q, σ ) = ⟨q′ , σ′ , N ⟩

We will consider each of these three cases in turn.

1. Suppose there is an instruction of the form (1). By Definition 13.9(3a),


this means that

∀ x ∀y ((Qq ( x, y) & Sσ ( x, y)) ⊃


(Qq′ ( x ′ , y′ ) & Sσ′ ( x, y′ ) & φ( x, y)))

is a conjunct of τ ( M, w). This entails the following sentence (universal


instantiation, m for x and n for y):

(Qq (m, n) & Sσ (m, n)) ⊃


(Qq′ (m′ , n′ ) & Sσ′ (m, n′ ) & φ(m, n)).

By induction hypothesis, τ ( M, w) ⊨ χ( M, w, n), i.e.,

Qq (m, n) & Sσ0 (0, n) & · · · & Sσk (k, n)&


∀ x (k < x ⊃ S0 ( x, n))

198
13.7. Verifying the Representation

Since after n steps, tape square m contains σ, the corresponding conjunct


is Sσ (m, n), so this entails:

Qq (m, n) & Sσ (m, n)

We now get

Qq′ (m′ , n′ ) & Sσ′ (m, n′ ) &


Sσ0 (0, n′ ) & · · · & Sσk (k, n′ ) &
∀ x (k < x ⊃ S0 ( x, n′ ))
as follows: The first line comes directly from the consequent of the pre-
ceding conditional, by modus ponens. Each conjunct in the middle
line—which excludes Sσm (m, n′ )—follows from the corresponding con-
junct in χ( M, w, n) together with φ(m, n).
If m < k, τ ( M, w) ⊢ m < k (Proposition 13.10) and by transitivity of <,
we have ∀ x (k < x ⊃ m < x ). If m = k, then ∀ x (k < x ⊃ m < x ) by
logic alone. The last line then follows from the corresponding conjunct
in χ( M, w, n), ∀ x (k < x ⊃ m < x ), and φ(m, n). If m < k, this already is
χ( M, w, n + 1).
Now suppose m = k. In that case, after n + 1 steps, the tape head has also visited square k + 1, which now is the right-most square visited. So χ( M, w, n + 1) has a new conjunct, S0 (k′ , n′ ), and the last conjunct is ∀ x (k′ < x ⊃ S0 ( x, n′ )). We have to verify that these two sentences are also implied.
We already have ∀ x (k < x ⊃ S0 ( x, n′ )). In particular, this gives us k < k′ ⊃ S0 (k′ , n′ ). From the axiom ∀ x x < x ′ we get k < k′ . By modus ponens, S0 (k′ , n′ ) follows.
Also, since τ ( M, w) ⊢ k < k′ , the axiom for transitivity of < gives us ∀ x (k′ < x ⊃ S0 ( x, n′ )). (We leave the verification of this as an exercise.)
2. Suppose there is an instruction of the form (2). Then, by Definition 13.9(3b),
∀ x ∀y ((Qq ( x ′ , y) & Sσ ( x ′ , y)) ⊃
(Qq′ ( x, y′ ) & Sσ′ ( x ′ , y′ ) & φ( x, y))) &
∀y ((Qq (0, y) & Sσ (0, y)) ⊃
(Qq′ (0, y′ ) & Sσ′ (0, y′ ) & φ(0, y)))

is a conjunct of τ ( M, w). If m > 0, then let l = m − 1 (i.e., m = l + 1).


The first conjunct of the above sentence entails the following:
(Qq (l′ , n) & Sσ (l′ , n)) ⊃
(Qq′ (l, n′ ) & Sσ′ (l′ , n′ ) & φ(l, n))


Otherwise, let l = m = 0 and consider the following sentence entailed


by the second conjunct:

((Qq (0, n) & Sσ (0, n)) ⊃
(Qq′ (0, n′ ) & Sσ′ (0, n′ ) & φ(0, n)))

Either sentence implies

Qq′ (l, n′ ) & Sσ′ (m, n′ ) &


Sσ0 (0, n′ ) & · · · & Sσk (k, n′ ) &
∀ x (k < x ⊃ S0 ( x, n′ ))

as before. (Note that in the first case, l′ ≡ l + 1 ≡ m and in the second
case l ≡ 0.) But this just is χ( M, w, n + 1).

3. Case (3) is left as an exercise.

We have shown that for any n, τ ( M, w) ⊨ χ( M, w, n).

Lemma 13.14. If M halts on input w, then τ ( M, w) ⊃ α( M, w) is valid.

Proof. By Lemma 13.13, we know that, for any time n, the description χ( M, w, n)
of the configuration of M at time n is entailed by τ ( M, w). Suppose M halts af-
ter k steps. At that point, it will be scanning square m, for some m ∈ N. Then
χ( M, w, k) describes a halting configuration of M, i.e., it contains as conjuncts
both Qq (m, k ) and Sσ (m, k ) with δ(q, σ ) undefined. Thus, by Lemma 13.12,
χ( M, w, k) ⊨ α( M, w). But since τ ( M, w) ⊨ χ( M, w, k), we have τ ( M, w) ⊨
α( M, w) and therefore τ ( M, w) ⊃ α( M, w) is valid.

To complete the verification of our claim, we also have to establish the


reverse direction: if τ ( M, w) ⊃ α( M, w) is valid, then M does in fact halt
when started on input w.

Lemma 13.15. If ⊨ τ ( M, w) ⊃ α( M, w), then M halts on input w.

Proof. Consider the L M -structure M with domain N which interprets 0 as 0,


′ as the successor function, and < as the less-than relation, and the predicates
Qq and Sσ as follows:

QqM = {⟨m, n⟩ | started on w, after n steps, M is in state q scanning square m}

SσM = {⟨m, n⟩ | started on w, after n steps, square m of M contains symbol σ}


In other words, we construct the structure M so that it describes what M


started on input w actually does, step by step. Clearly, M ⊨ τ ( M, w). If
⊨ τ ( M, w) ⊃ α( M, w), then also M ⊨ α( M, w), i.e.,
M ⊨ ∃ x ∃y ⋁⟨q,σ⟩∈ X (Qq ( x, y) & Sσ ( x, y)).

As |M| = N, there must be m, n ∈ N so that M ⊨ Qq (m, n) & Sσ (m, n) for


some q and σ such that δ(q, σ ) is undefined. By the definition of M, this means
that M started on input w after n steps is in state q and reading symbol σ, and
the transition function is undefined, i.e., M has halted.

13.8 The Decision Problem is Unsolvable


Theorem 13.16. The decision problem is unsolvable: There is no Turing machine D
which, when started on a tape that contains a sentence ψ of first-order logic as input,
eventually halts and outputs 1 if ψ is valid and 0 otherwise.

Proof. Suppose the decision problem were solvable, i.e., suppose there were
a Turing machine D. Then we could solve the halting problem as follows.
We construct a Turing machine E that, given as input the number e of Turing
machine Me and input w, computes the corresponding sentence τ ( Me , w) ⊃
α( Me , w) and halts, scanning the leftmost square on the tape. The machine
E ⌢ D would then, given input e and w, first compute τ ( Me , w) ⊃ α( Me , w)
and then run the decision problem machine D on that input. D halts with out-
put 1 iff τ ( Me , w) ⊃ α( Me , w) is valid and outputs 0 otherwise. By Lemma 13.15
and Lemma 13.14, τ ( Me , w) ⊃ α( Me , w) is valid iff Me halts on input w. Thus,
E ⌢ D, given input e and w, halts with output 1 iff Me halts on input w and
halts with output 0 otherwise. In other words, E ⌢ D would solve the halting
problem. But we know, by Theorem 13.8, that no such Turing machine can
exist.
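
The shape of this reduction can also be written out as a short program. The following is a minimal Python sketch, not part of the formal development: tau, alpha and decides_validity are hypothetical helpers (tau and alpha would have to construct the sentences of section 13.6 as strings, and, as the theorem shows, no decides_validity can actually exist).

    def solve_halting(e, w, decides_validity, tau, alpha):
        """Hypothetical reduction of the halting problem to validity.

        tau(e, w) and alpha(e, w) are assumed to build the sentences
        tau(M_e, w) and alpha(M_e, w) of section 13.6 as strings, and
        decides_validity plays the role of the decision-problem machine D.
        The theorem shows that no such decides_validity can exist.
        """
        # This is what the machine E computes from e and w ...
        sentence = "(" + tau(e, w) + " ⊃ " + alpha(e, w) + ")"
        # ... and running D on it answers the halting question, since the
        # conditional is valid iff M_e halts on input w.
        return decides_validity(sentence)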

Corollary 13.17. It is undecidable if an arbitrary sentence of first-order logic is sat-


isfiable.

Proof. Suppose satisfiability were decidable by a Turing machine S. Then we


could solve the decision problem as follows: Given a sentence ψ as input,
move ψ to the right one square. Return to square 1 and write the symbol ∼.
Now run the Turing machine S. It eventually halts with output either 1
(if ∼ψ is satisfiable) or 0 (if ∼ψ is unsatisfiable) on the tape. If there is a 1 on
square 1, erase it; if square 1 is empty, write a 1, then halt.
This Turing machine always halts, and its output is 1 iff ∼ψ is unsatisfiable
and 0 otherwise. Since ψ is valid iff ∼ψ is unsatisfiable, the machine outputs 1
iff ψ is valid, and 0 otherwise, i.e., it would solve the decision problem.
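
Stripped of the details of tape manipulation, the reduction in this proof is a one-line wrapper around the supposed satisfiability decider. A minimal Python sketch, assuming a hypothetical decider decides_sat (which, by the corollary, cannot exist):

    def decides_validity(psi, decides_sat):
        """psi is valid iff ∼psi is unsatisfiable.

        decides_sat stands in for the supposed satisfiability decider S;
        sentences are represented as strings and negation as a prefixed '∼'.
        """
        return not decides_sat("∼" + psi)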


So there is no Turing machine which always gives a correct “yes” or “no”


answer to the question “Is ψ a valid sentence of first-order logic?” However,
there is a Turing machine that always gives a correct “yes” answer—but sim-
ply does not halt if the answer is “no.” This follows from the soundness and
completeness theorem of first-order logic, and the fact that derivations can be
effectively enumerated.

Theorem 13.18. Validity of first-order sentences is semi-decidable: There is a Turing
machine E which, when started on a tape that contains a sentence ψ of first-order logic
as input, eventually halts with output 1 if ψ is valid, but does not halt otherwise.

Proof. All possible derivations of first-order logic can be generated, one after
another, by an effective algorithm. The machine E does this, and when it finds
a derivation that shows that ⊢ ψ, it halts with output 1. By the soundness
theorem, if E halts with output 1, it’s because ⊨ ψ. By the completeness the-
orem, if ⊨ ψ there is a derivation that shows that ⊢ ψ. Since E systematically
generates all possible derivations, it will eventually find one that shows ⊢ ψ,
so will eventually halt with output 1.
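
The proof describes a concrete, if hopelessly inefficient, procedure. Here is a minimal Python sketch of it, assuming hypothetical helpers enumerate_derivations() (a generator yielding every derivation, one after another) and derives(d, psi) (a check whether d is a correct derivation of psi):

    def semi_decide_validity(psi, enumerate_derivations, derives):
        """Halts with output 1 if psi is valid; runs forever otherwise.

        enumerate_derivations() is assumed to yield every derivation of the
        proof system, one after another; derives(d, psi) checks whether d is
        a correct derivation of psi. Both helpers are hypothetical.
        """
        for d in enumerate_derivations():   # never ends if psi is not valid
            if derives(d, psi):             # a derivation of psi was found
                return 1                    # by soundness, psi is valid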

13.9 Trakhtenbrot’s Theorem


In section 13.6 we defined sentences τ ( M, w) and α( M, w) for a Turing ma-
chine M and input string w. Then we showed in Lemma 13.14 and Lemma 13.15
that τ ( M, w) ⊃ α( M, w) is valid iff M, started on input w, eventually halts.
Since the Halting Problem is undecidable, this implies that validity and satisfi-
ability of sentences of first-order logic are undecidable (Theorem 13.16 and Corol-
lary 13.17).
But validity and satisfiability of sentences is defined for arbitrary struc-
tures, finite or infinite. You might suspect that it is easier to decide if a sen-
tence is satisfiable in a finite structure (or valid in all finite structures). We can
adapt the proof of the unsolvability of the decision problem so that it shows
this is not the case.
First, if you go back to the proof of Lemma 13.15, you’ll see that what
we did there is produce a model M of τ ( M, w) which describes exactly what
machine M does when started on input w. The domain of that model was N,
i.e., infinite. But if M actually halts on input w, we can build a finite model M′
in the same way. Suppose M started on input w halts after k steps. Take as
domain |M′ | the set {0, . . . , n}, where n is the larger of k and the length of w,
and let

′M′ ( x ) = x + 1 if x < n, and ′M′ ( x ) = n otherwise,

and ⟨ x, y⟩ ∈ <M′ iff x < y or x = y = n. Otherwise M′ is defined just like M.
By the definition of M′ , just like in the proof of Lemma 13.15, M′ ⊨ τ ( M, w).


And since we assumed that M halts on input w, M′ ⊨ α( M, w). So, M′ is a


finite model of τ ( M, w) & α( M, w) (note that we’ve replaced ⊃ with &).
We are halfway to a proof: we’ve shown that if M halts on input w, then
τ ( M, w) & α( M, w) has a finite model. Unfortunately, the converse of this does
not hold, i.e., there are Turing machines that don’t halt on some input w, but
τ ( M, w) & α( M, w) still has a finite model. For instance, consider the ma-
chine M with the single state q0 and instruction δ(q0 , 0) = ⟨q0 , 0, N ⟩. Started
on empty input w = Λ, this machine never halts: it is in an infinite loop, but
does not change the tape or move the head. All configurations are the same
(same state, same head position, same tape contents). We can define a finite
structure M′′ that satisfies τ ( M, Λ) & α( M, Λ) (exercise). We can, however,
change τ ( M, w) in a suitable way so that such structures are ruled out.
Consider the sentences describing the operation of the Turing machine M
on input w = σi1 . . . σik :

1. Axioms describing numbers and < (just like in the definition of τ ( M, w)


in section 13.6).

2. Axioms describing the input configuration: just like in the definition


of τ ( M, w).

3. Axioms describing the transition from one configuration to the next:


For the following, let φ( x, y) be as before, and let

ψ(y) ≡ ∀ x ( x < y ⊃ x ≠ y).

a) For every instruction δ(qi , σ ) = ⟨q j , σ′ , R⟩, the sentence:

∀ x ∀y ((Qqi ( x, y) & Sσ ( x, y)) ⊃


(Qq j ( x ′ , y′ ) & Sσ′ ( x, y′ ) & φ( x, y) & ψ(y′ )))

b) For every instruction δ(qi , σ ) = ⟨q j , σ′ , L⟩, the sentence

∀ x ∀y ((Qqi ( x ′ , y) & Sσ ( x ′ , y)) ⊃


(Qq j ( x, y′ ) & Sσ′ ( x ′ , y′ ) & φ( x, y))) &
∀y ((Qqi (0, y) & Sσ (0, y)) ⊃
(Qq j (0, y′ ) & Sσ′ (0, y′ ) & φ(0, y) & ψ(y′ )))

c) For every instruction δ(qi , σ ) = ⟨q j , σ′ , N ⟩, the sentence:

∀ x ∀y ((Qqi ( x, y) & Sσ ( x, y)) ⊃


(Qq j ( x, y′ ) & Sσ′ ( x, y′ ) & φ( x, y) & ψ(y′ )))


As you can see, the sentences describing the transitions of M are the
same as the corresponding sentence in τ ( M, w), except we add ψ(y′ ) at
the end. ψ(y′ ) ensures that the number y′ of the “next” configuration is
different from all previous numbers 0, 0′ , . . . .

Let τ ′ ( M, w) be the conjunction of all the above sentences for Turing ma-
chine M and input w.

Lemma 13.19. If M started on input w halts, then τ ′ ( M, w) & α( M, w) has a finite


model.

Proof. Let M′ be as in the proof of Lemma 13.15, except

|M′ | = {0, . . . , n},
′M′ ( x ) = x + 1 if x < n, and ′M′ ( x ) = n otherwise,
⟨ x, y⟩ ∈ <M′ iff x < y or x = y = n,

where n = max(k, len(w)) and k is the least number such that M started on
input w has halted after k steps. We leave the verification that M′ ⊨ τ ′ ( M, w) &
α( M, w) as an exercise.

Lemma 13.20. If τ ′ ( M, w) & α( M, w) has a finite model, then M started on input w


halts.

Proof. We show the contrapositive. Suppose that M started on w does not


halt. If τ ′ ( M, w) & α( M, w) has no model at all, we are done. So assume M is
a model of τ ′ ( M, w) & α( M, w). We have to show that it cannot be finite.
We can prove, just like in Lemma 13.13, that if M, started on input w, has
not halted after n steps, then τ ′ ( M, w) ⊨ χ( M, w, n) & ψ(n). Since M started
on input w does not halt, τ ′ ( M, w) ⊨ χ( M, w, n) & ψ(n) for all n ∈ N. Note
that by Proposition 13.10, τ ′ ( M, w) ⊨ k < n for all k < n. Also ψ(n) ⊨ k < n ⊃
k ≠ n. So, M ⊨ k ≠ n for all k < n, i.e., the infinitely many terms k must all
have different values in M. But this requires that |M| be infinite, so M cannot
be a finite model of τ ′ ( M, w) & α( M, w).

Theorem 13.21 (Trakhtenbrot’s Theorem). It is undecidable if an arbitrary sen-


tence of first-order logic has a finite model (i.e., is finitely satisfiable).

Proof. Suppose there were a Turing machine F that decides the finite satisfi-
ability problem. Then given any Turing machine M and input w, we could
compute the sentence τ ′ ( M, w) & α( M, w), and use F to decide if it has a finite
model. By Lemmata 13.19 and 13.20, it does iff M started on input w halts. So
we could use F to solve the halting problem, which we know is unsolvable.


Corollary 13.22. There can be no derivation system that is sound and complete for
finite validity, i.e., a derivation system which has ⊢ ψ iff M ⊨ ψ for every finite
structure M.

Proof. Exercise.

Part IV

Computability and Incompleteness

Chapter 14

Introduction to Incompleteness

14.1 Historical Background


In this section, we will briefly discuss historical developments that will help
put the incompleteness theorems in context. In particular, we will give a very
sketchy overview of the history of mathematical logic; and then say a few
words about the history of the foundations of mathematics.
The phrase “mathematical logic” is ambiguous. One can interpret the
word “mathematical” as describing the subject matter, as in, “the logic of
mathematics,” denoting the principles of mathematical reasoning; or as de-
scribing the methods, as in “the mathematics of logic,” denoting a mathemat-
ical study of the principles of reasoning. The account that follows involves
mathematical logic in both senses, often at the same time.
The study of logic began, essentially, with Aristotle, who lived approxi-
mately 384–322 BCE. His Categories, Prior analytics, and Posterior analytics in-
clude systematic studies of the principles of scientific reasoning, including a
thorough and systematic study of the syllogism.
Aristotle’s logic dominated scholastic philosophy through the middle ages;
indeed, as late as the eighteenth century, Kant maintained that Aristotle’s logic
was perfect and in no need of revision. But the theory of the syllogism is far
too limited to model anything but the most superficial aspects of mathemati-
cal reasoning. A century earlier, Leibniz, a contemporary of Newton’s, imag-
ined a complete “calculus” for logical reasoning, and made some rudimentary
steps towards designing such a calculus, essentially describing a version of
propositional logic.
The nineteenth century was a watershed for logic. In 1854 George Boole
wrote The Laws of Thought, with a thorough algebraic study of propositional
logic that is not far from modern presentations. In 1879 Gottlob Frege pub-
lished his Begriffsschrift (Concept writing) which extends propositional logic
with quantifiers and relations, and thus includes first-order logic. In fact,
Frege’s logical systems included higher-order logic as well, and more. In his


Basic Laws of Arithmetic, Frege set out to show that all of arithmetic could be
derived in his Begriffsschrift from purely logical assumptions. Unfortunately,
these assumptions turned out to be inconsistent, as Russell showed in 1902.
But setting aside the inconsistent axiom, Frege more or less invented mod-
ern logic singlehandedly, a startling achievement. Quantificational logic was
also developed independently by algebraically-minded thinkers after Boole,
including Peirce and Schröder.
Let us now turn to developments in the foundations of mathematics. Of
course, since logic plays an important role in mathematics, there is a good deal
of interaction with the developments just described. For example, Frege de-
veloped his logic with the explicit purpose of showing that all of mathematics
could be based solely on his logical framework; in particular, he wished to
show that mathematics consists of a priori analytic truths instead of, as Kant
had maintained, a priori synthetic ones.
Many take the birth of mathematics proper to have occurred with the
Greeks. Euclid’s Elements, written around 300 B.C., is already a mature rep-
resentative of Greek mathematics, with its emphasis on rigor and precision.
The definitions and proofs in Euclid’s Elements survive more or less intact
in high school geometry textbooks today (to the extent that geometry is still
taught in high schools). This model of mathematical reasoning has been held
to be a paradigm for rigorous argumentation not only in mathematics but in
branches of philosophy as well. (Spinoza even presented moral and religious
arguments in the Euclidean style, which is strange to see!)
Calculus was invented by Newton and Leibniz in the seventeenth century.
(A fierce priority dispute raged for centuries, but most scholars today hold
that the two developments were for the most part independent.) Calculus in-
volves reasoning about, for example, infinite sums of infinitely small quanti-
ties; these features fueled criticism by Bishop Berkeley, who argued that belief
in God was no less rational than the mathematics of his time. The methods of
calculus were widely used in the eighteenth century, for example by Leonhard
Euler, who used calculations involving infinite sums with dramatic results.
In the nineteenth century, mathematicians tried to address Berkeley’s crit-
icisms by putting calculus on a firmer foundation. Efforts by Cauchy, Weier-
strass, Bolzano, and others led to our contemporary definitions of limits, con-
tinuity, differentiation, and integration in terms of “epsilons and deltas,” in
other words, devoid of any reference to infinitesimals. Later in the century,
mathematicians tried to push further, and explain all aspects of calculus, in-
cluding the real numbers themselves, in terms of the natural numbers. (Kro-
necker: “God created the whole numbers, all else is the work of man.”) In
1872, Dedekind wrote “Continuity and the irrational numbers,” where he
showed how to “construct” the real numbers as sets of rational numbers (which,
as you know, can be viewed as pairs of natural numbers); in 1888 he wrote
“Was sind und was sollen die Zahlen” (roughly, “What are the natural num-


bers, and what should they be?”) which aimed to explain the natural numbers
in purely “logical” terms. In 1887 Kronecker wrote “Über den Zahlbegriff”
(“On the concept of number”) where he spoke of representing all mathemati-
cal objects in terms of the integers; in 1889 Giuseppe Peano gave formal, sym-
bolic axioms for the natural numbers.
The end of the nineteenth century also brought a new boldness in dealing
with the infinite. Before then, infinitary objects and structures (like the set of
natural numbers) were treated gingerly; “infinitely many” was understood
as “as many as you want,” and “approaches in the limit” was understood as
“gets as close as you want.” But Georg Cantor showed that it was possible to
take the infinite at face value. Work by Cantor, Dedekind, and others helped to
introduce the general set-theoretic understanding of mathematics that is now
widely accepted.
This brings us to twentieth century developments in logic and founda-
tions. In 1902 Russell discovered the paradox in Frege’s logical system. In 1904
Zermelo proved Cantor’s well-ordering principle, using the so-called “axiom
of choice”; the legitimacy of this axiom prompted a good deal of debate. Be-
tween 1910 and 1913 the three volumes of Russell and Whitehead’s Principia
Mathematica appeared, extending the Fregean program of establishing mathe-
matics on logical grounds. Unfortunately, Russell and Whitehead were forced
to adopt two principles that seemed hard to justify as purely logical: an axiom
of infinity and an axiom of “reducibility.” In the 1900’s Poincaré criticized the
use of “impredicative definitions” in mathematics, and in the 1910’s Brouwer
began proposing to refound all of mathematics in an “intuitionistic” basis,
which avoided the use of the law of the excluded middle (φ ∨ ∼ φ).
Strange days indeed! The program of reducing all of mathematics to logic
is now referred to as “logicism,” and is commonly viewed as having failed,
due to the difficulties mentioned above. The program of developing mathe-
matics in terms of intuitionistic mental constructions is called “intuitionism,”
and is viewed as posing overly severe restrictions on everyday mathemat-
ics. Around the turn of the century, David Hilbert, one of the most influen-
tial mathematicians of all time, was a strong supporter of the new, abstract
methods introduced by Cantor and Dedekind: “no one will drive us from the
paradise that Cantor has created for us.” At the same time, he was sensitive
to foundational criticisms of these new methods (oddly enough, now called
“classical”). He proposed a way of having one’s cake and eating it too:

1. Represent classical methods with formal axioms and rules; represent


mathematical questions as formulae in an axiomatic system.

2. Use safe, “finitary” methods to prove that these formal deductive sys-
tems are consistent.

Hilbert’s work went a long way toward accomplishing the first goal. In
1899, he had done this for geometry in his celebrated book Foundations of ge-


ometry. In subsequent years, he and a number of his students and collabo-


rators worked on other areas of mathematics to do what Hilbert had done
for geometry. Hilbert himself gave axiom systems for arithmetic and analy-
sis. Zermelo gave an axiomatization of set theory, which was expanded on by
Fraenkel, Skolem, von Neumann, and others. By the mid-1920s, there were
two approaches that laid claim to the title of an axiomatization of “all” of
mathematics, the Principia mathematica of Russell and Whitehead, and what
came to be known as Zermelo–Fraenkel set theory.
In 1921, Hilbert set out on a research project to establish the goal of proving
these systems to be consistent. He was aided in this project by several of
his students, in particular Bernays, Ackermann, and later Gentzen. The basic
idea for accomplishing this goal was to cast the question of the possibility of
a derivation of an inconsistency in mathematics as a combinatorial problem
about possible sequences of symbols, namely possible sequences of sentences
which meet the criterion of being a correct derivation of, say, φ & ∼ φ from
the axioms of an axiom system for arithmetic, analysis, or set theory. A proof
of the impossibility of such a sequence of symbols would—since it is itself
a mathematical proof—be formalizable in these axiomatic systems. In other
words, there would be some sentence Con which states that, say, arithmetic
is consistent. Moreover, this sentence should be provable in the systems in
question, especially if its proof requires only very restricted, “finitary” means.
The second aim, that the axiom systems developed would settle every
mathematical question, can be made precise in two ways. In one way, we
can formulate it as follows: For any sentence φ in the language of an axiom
system for mathematics, either φ or ∼ φ is provable from the axioms. If this
were true, then there would be no sentences which can neither be proved nor
refuted on the basis of the axioms, no questions which the axioms do not set-
tle. An axiom system with this property is called complete. Of course, for any
given sentence it might still be a difficult task to determine which of the two
alternatives holds. But in principle there should be a method to do so. In fact,
for the axiom and derivation systems considered by Hilbert, completeness
would imply that such a method exists—although Hilbert did not realize this.
The second way to interpret the question would be this stronger requirement:
that there be a mechanical, computational method which would determine,
for a given sentence φ, whether it is derivable from the axioms or not.
In 1931, Gödel proved the two “incompleteness theorems,” which showed
that this program could not succeed. There is no axiom system for mathemat-
ics which is complete; specifically, the sentence that expresses the consistency
of the axioms is a sentence which can neither be proved nor refuted.
This struck a lethal blow to Hilbert’s original program. However, as is so
often the case in mathematics, it also opened up exciting new avenues for re-
search. If there is no one, all-encompassing formal system of mathematics,
it makes sense to develop more circumscribed systems and investigate what


can be proved in them. It also makes sense to develop less restricted methods
of proof for establishing the consistency of these systems, and to find ways to
measure how hard it is to prove their consistency. Since Gödel showed that
(almost) every formal system has questions it cannot settle, it makes sense to
look for “interesting” questions a given formal system cannot settle, and to
figure out how strong a formal system has to be to settle them. To the present
day, logicians have been pursuing these questions in a new mathematical dis-
cipline, the theory of proofs.

14.2 Definitions
In order to carry out Hilbert’s project of formalizing mathematics and show-
ing that such a formalization is consistent and complete, the first order of busi-
ness would be that of picking a language, logical framework, and a system of
axioms. For our purposes, let us suppose that mathematics can be formalized
in a first-order language, i.e., that there is some set of constant symbols, func-
tion symbols, and predicate symbols which, together with the connectives and
quantifiers of first-order logic, allow us to express the claims of mathematics.
Most people agree that such a language exists: the language of set theory, in
which ∈ is the only non-logical symbol. That such a simple language is so
expressive is of course a very implausible claim at first sight, and it took a
lot of work to establish that practically all of mathematics can be expressed
in this very austere vocabulary. To keep things simple, for now, let’s restrict
our discussion to arithmetic, so the part of mathematics that just deals with
the natural numbers N. The natural language in which to express facts of
arithmetic is L A . L A contains a single two-place predicate symbol <, a sin-
gle constant symbol 0, one one-place function symbol ′, and two two-place
function symbols + and ×.

Definition 14.1. A set of sentences Γ is a theory if it is closed under entailment,


i.e., if Γ = { φ | Γ ⊨ φ}.

There are two easy ways to specify theories. One is as the set of sentences
true in some structure. For instance, consider the structure for L A in which
the domain is N and all non-logical symbols are interpreted as you would
expect.

Definition 14.2. The standard model of arithmetic is the structure N defined as


follows:

1. |N| = N

2. 0N = 0

3. ′N (n) = n + 1 for all n ∈ N


4. +N (n, m) = n + m for all n, m ∈ N


5. ×N (n, m) = n · m for all n, m ∈ N
6. <N = {⟨n, m⟩ | n ∈ N, m ∈ N, n < m}

Note the difference between × and ·: × is a symbol in the language of


arithmetic. Of course, we’ve chosen it to remind us of multiplication, but ×
is not the multiplication operation but a two-place function symbol (officially,
f12 ). By contrast, · is the ordinary multiplication function. When you see some-
thing like n · m, we mean the product of the numbers n and m; when you see
something like x × y we are talking about a term in the language of arith-
metic. In the standard model, the function symbol times is interpreted as the
function · on the natural numbers. For addition, we use + as both the function
symbol of the language of arithmetic, and the addition function on the natural
numbers. Here you have to use the context to determine what is meant.
Definition 14.3. The theory of true arithmetic is the set of sentences satisfied in
the standard model of arithmetic, i.e.,
TA = { φ | N ⊨ φ}.

TA is a theory, for whenever TA ⊨ φ, φ is satisfied in every structure which


satisfies TA. Since N ⊨ TA, N ⊨ φ, and so φ ∈ TA.
The other way to specify a theory Γ is as the set of sentences entailed by
some set of sentences Γ0 . In that case, Γ is the “closure” of Γ0 under entailment.
Specifying a theory this way is only interesting if Γ0 is explicitly specified, e.g.,
if the elements of Γ0 are listed. At the very least, Γ0 has to be decidable, i.e.,
there has to be a computable test for when a sentence counts as an element
of Γ0 or not. We call the sentences in Γ0 axioms for Γ, and Γ axiomatized by Γ0 .
Definition 14.4. A theory Γ is axiomatized by Γ0 iff
Γ = { φ | Γ0 ⊨ φ }

Definition 14.5. The theory Q axiomatized by the following sentences is known


as “Robinson’s Q” and is a very simple theory of arithmetic.
∀ x ∀y ( x ′ = y′ ⊃ x = y) (Q1 )
∀ x 0 ≠ x′ (Q2 )
∀ x ( x = 0 ∨ ∃y x = y′ ) (Q3 )
∀ x ( x + 0) = x (Q4 )
∀ x ∀y ( x + y′ ) = ( x + y)′ (Q5 )
∀ x ( x × 0) = 0 (Q6 )
∀ x ∀y ( x × y′ ) = (( x × y) + x ) (Q7 )
∀ x ∀y ( x < y ≡ ∃z (z′ + x ) = y) (Q8 )


The sentences Q1 , . . . , Q8 are the axioms of Q, so Q consists of all


sentences entailed by them:

Q = { φ | { Q1 , . . . , Q8 } ⊨ φ }.

Definition 14.6. Suppose φ( x ) is a formula in L A with free variables x and y1 ,


. . . , yn . Then any sentence of the form

∀y1 . . . ∀yn (( φ(0) & ∀ x ( φ( x ) ⊃ φ( x ′ ))) ⊃ ∀ x φ( x ))

is an instance of the induction schema.


Peano arithmetic PA is the theory axiomatized by the axioms of Q together
with all instances of the induction schema.

Every instance of the induction schema is true in N. This is easiest to see


if the formula φ only has one free variable x. Then φ( x ) defines a subset X φ
of N in N. X φ is the set of all n ∈ N such that N, s ⊨ φ( x ) when s( x ) = n. The
corresponding instance of the induction schema is

(( φ(0) & ∀ x ( φ( x ) ⊃ φ( x ′ ))) ⊃ ∀ x φ( x )).

If its antecedent is true in N, then 0 ∈ X φ and, whenever n ∈ X φ , so is n + 1.


Since 0 ∈ X φ , we get 1 ∈ X φ . With 1 ∈ X φ we get 2 ∈ X φ . And so on. So for
every n ∈ N, n ∈ X φ . But this means that ∀ x φ( x ) is satisfied in N.
Both Q and PA are axiomatized theories. The big question is, how strong
are they? For instance, can PA prove all the truths about N that can be ex-
pressed in L A ? Specifically, do the axioms of PA settle all the questions that
can be formulated in L A ?
Another way to put this is to ask: Is PA = TA? TA obviously does prove
(i.e., it includes) all the truths about N, and it settles all the questions that
can be formulated in L A , since if φ is a sentence in L A , then either N ⊨ φ or
N ⊨ ∼ φ, and so either TA ⊨ φ or TA ⊨ ∼ φ. Call such a theory complete.

Definition 14.7. A theory Γ is complete iff for every sentence φ in its language,
either Γ ⊨ φ or Γ ⊨ ∼ φ.

By the Completeness Theorem, Γ ⊨ φ iff Γ ⊢ φ, so Γ is complete iff for


every sentence φ in its language, either Γ ⊢ φ or Γ ⊢ ∼ φ.
Another question we are led to ask is this: Is there a computational pro-
cedure we can use to test if a sentence is in TA, in PA, or even just in Q? We
can make this more precise by defining when a set (e.g., a set of sentences) is
decidable.

Definition 14.8. A set X is decidable iff there is a computational procedure


which on input x returns 1 if x ∈ X and 0 otherwise.


So our question becomes: Is TA (PA, Q) decidable?


The answer to all these questions will be: no. None of these theories are
decidable. However, this phenomenon is not specific to these particular theo-
ries. In fact, any theory that satisfies certain conditions is subject to the same
results. One of these conditions, which Q and PA satisfy, is that they are ax-
iomatized by a decidable set of axioms.

Definition 14.9. A theory is axiomatizable if it is axiomatized by a decidable


set of axioms.

Example 14.10. Any theory axiomatized by a finite set of sentences is axioma-


tizable, since any finite set is decidable. Thus, Q, for instance, is axiomatizable.
Schematically axiomatized theories like PA are also axiomatizable. For to
test if ψ is among the axioms of PA, i.e., to compute the function χ X where
χ X (ψ) = 1 if ψ is an axiom of PA and = 0 otherwise, we can do the following:
First, check if ψ is one of the axioms of Q. If it is, the answer is “yes” and the
value of χ X (ψ) = 1. If not, test if it is an instance of the induction schema. This
can be done systematically; in this case, perhaps it’s easiest to see that it can be
done as follows: Any instance of the induction schema begins with a number
of universal quantifiers, and then a sub-formula that is a conditional. The
consequent of that conditional is ∀ x φ( x, y1 , . . . , yn ) where x and y1 , . . . , yn are
all the free variables of φ and the initial quantifiers of ψ bind the variables y1 ,
. . . , yn . Once we have extracted this φ and checked that its free variables match
the variables bound by the universal quantifiers at the front and ∀ x, we go on
to check that the antecedent of the conditional matches

φ(0, y1 , . . . , yn ) & ∀ x ( φ( x, y1 , . . . , yn ) ⊃ φ( x ′ , y1 , . . . , yn ))

Again, if it does, ψ is an instance of the induction schema, and if it doesn’t, ψ


isn’t.

In answering this question—and the more general question of which theo-


ries are complete or decidable—it will be useful to consider also the following
definition. Recall that a set X is countable iff it is empty or if there is a surjec-
tive function f : N → X. Such a function is called an enumeration of X.

Definition 14.11. A set X is called computably enumerable (c.e. for short) iff it
is empty or it has a computable enumeration.

In addition to axiomatizability, another condition on theories to which the


incompleteness theorems apply will be that they are strong enough to prove
basic facts about computable functions and decidable relations. By “basic
facts,” we mean sentences which express what the values of computable func-
tions are for each of their arguments. And by “strong enough” we mean that
the theories in question count these sentences among its theorems. For in-
stance, consider a prototypical computable function: addition. The value of


+ for arguments 2 and 3 is 5, i.e., 2 + 3 = 5. A sentence in the language of


arithmetic that expresses that the value of + for arguments 2 and 3 is 5 is:
(2 + 3) = 5. And, e.g., Q proves this sentence. More generally, we would
like there to be, for each computable function f ( x1 , x2 ) a formula φ f ( x1 , x2 , y)
in L A such that Q ⊢ φ f (n1 , n2 , m) whenever f (n1 , n2 ) = m. In this way, Q
proves that the value of f for arguments n1 , n2 is m. In fact, we require that
it proves a bit more, namely that no other number is the value of f for argu-
ments n1 , n2 . And the same goes for decidable relations. This is made precise
in the following two definitions.

Definition 14.12. A formula φ( x1 , . . . , xk , y) represents the function f : Nk →


N in Γ iff whenever f (n1 , . . . , nk ) = m, then

1. Γ ⊢ φ(n1 , . . . , nk , m), and

2. Γ ⊢ ∀y( φ(n1 , . . . , nk , y) ⊃ y = m).

Definition 14.13. A formula φ( x1 , . . . , xk ) represents the relation R ⊆ Nk in Γ iff,

1. whenever R(n1 , . . . , nk ), Γ ⊢ φ(n1 , . . . , nk ), and

2. whenever not R(n1 , . . . , nk ), Γ ⊢ ∼ φ(n1 , . . . , nk ).

A theory is “strong enough” for the incompleteness theorems to apply if


it represents all computable functions and all decidable relations. Q and its
extensions satisfy this condition, but it will take us a while to establish this—
it’s a non-trivial fact about the kinds of things Q can prove, and it’s hard
to show because Q has only a few axioms from which we’ll have to prove
all these facts. However, Q is a very weak theory. So although it’s hard to
prove that Q represents all computable functions, most interesting theories
are stronger than Q, i.e., prove more than Q does. And if Q proves some-
thing, any stronger theory does; since Q represents all computable functions,
every stronger theory does. This means that many interesting theories meet
this condition of the incompleteness theorems. So our hard work will pay
off, since it shows that the incompleteness theorems apply to a wide range of
theories. Certainly, any theory aiming to formalize “all of mathematics” must
prove everything that Q proves, since it should at the very least be able to cap-
ture the results of elementary computations. So any theory that is a candidate
for a theory of “all of mathematics” will be one to which the incompleteness
theorems apply.

14.3 Overview of Incompleteness Results


Hilbert expected that mathematics could be formalized in an axiomatizable
theory which it would be possible to prove complete and decidable. More-
over, he aimed to prove the consistency of this theory with very weak, “fini-


tary,” means, which would defend classical mathematics against the chal-
lenges of intuitionism. Gödel’s incompleteness theorems showed that these
goals cannot be achieved.
Gödel’s first incompleteness theorem showed that a version of Russell and
Whitehead’s Principia Mathematica is not complete. But the proof was actu-
ally very general and applies to a wide variety of theories. This means that it
wasn’t just that Principia Mathematica did not manage to completely capture
mathematics, but that no acceptable theory does. It took a while to isolate
the features of theories that suffice for the incompleteness theorems to apply,
and to generalize Gödel’s proof to apply make it depend only on these fea-
tures. But we are now in a position to state a very general version of the first
incompleteness theorem for theories in the language L A of arithmetic.

Theorem 14.14. If Γ is a consistent and axiomatizable theory in L A which repre-


sents all computable functions and decidable relations, then Γ is not complete.

To say that Γ is not complete is to say that for at least one sentence φ,
Γ ⊬ φ and Γ ⊬ ∼ φ. Such a sentence is called independent (of Γ). We can in
fact relatively quickly prove that there must be independent sentences. But
the power of Gödel’s proof of the theorem lies in the fact that it exhibits a
specific example of such an independent sentence. The intriguing construction
produces a sentence γΓ , called a Gödel sentence for Γ, which is unprovable
because in Γ, γΓ is equivalent to the claim that γΓ is unprovable in Γ. It does
so constructively, i.e., given an axiomatization of Γ and a description of the
derivation system, the proof gives a method for actually writing down γΓ .
The construction in Gödel’s proof requires that we find a way to express
in L A the properties of and operations on terms and formulae of L A itself.
These include properties such as “φ is a sentence,” “δ is a derivation of φ,”
and operations such as φ[t/x ]. This way must (a) express these properties
and relations via a “coding” of symbols and sequences thereof (which is what
terms, formulae, derivations, etc. are) as natural numbers (which is what L A
can talk about). It must (b) do this in such a way that Γ will prove the relevant
facts, so we must show that these properties are coded by decidable properties
of natural numbers and the operations correspond to computable functions on
natural numbers. This is called “arithmetization of syntax.”
Before we investigate how syntax can be arithmetized, however, we will
consider the condition that Γ is “strong enough,” i.e., represents all com-
putable functions and decidable relations. This requires that we give a precise
definition of “computable.” This can be done in a number of ways, e.g., via
the model of Turing machines, or as those functions computable by programs
in some general-purpose programming language. Since our aim is to repre-
sent these functions and relations in a theory in the language L A , however, it
is best to pick a simple definition of computability of just numerical functions.
This is the notion of recursive function. So we will first discuss the recursive


functions. We will then show that Q already represents all recursive functions
and relations. This will allow us to apply the incompleteness theorem to spe-
cific theories such as Q and PA, since we will have established that these are
examples of theories that are “strong enough.”
The end result of the arithmetization of syntax is a formula ProvΓ ( x ) which,
via the coding of formulae as numbers, expresses provability from the axioms
of Γ. Specifically, if φ is coded by the number n, and Γ ⊢ φ, then Γ ⊢ ProvΓ (n).
This “provability predicate” for Γ allows us also to express, in a certain sense,
the consistency of Γ as a sentence of L A : let the “consistency statement” for Γ
be the sentence ∼ProvΓ (n), where we take n to be the code of a contradiction,
e.g., of ⊥. The second incompleteness theorem states that consistent axioma-
tizable theories also do not prove their own consistency statements. The con-
ditions required for this theorem to apply are a bit more stringent than just
that the theory represents all computable functions and decidable relations,
but we will show that PA satisfies them.

14.4 Undecidability and Incompleteness


Gödel’s proof of the incompleteness theorems requires arithmetization of syn-
tax. But even without that we can obtain some nice results just on the assump-
tion that a theory represents all decidable relations. The proof is a diagonal
argument similar to the proof of the undecidability of the halting problem.

Theorem 14.15. If Γ is a consistent theory that represents every decidable relation,


then Γ is not decidable.

Proof. Suppose Γ were decidable. We show that if Γ represents every decid-


able relation, it must be inconsistent.
Decidable properties (one-place relations) are represented by formulae with
one free variable. Let φ0 ( x ), φ1 ( x ), . . . , be a computable enumeration of all
such formulae. Now consider the following set D ⊆ N:

D = {n | Γ ⊢ ∼ φn (n)}

The set D is decidable, since we can test if n ∈ D by first computing φn ( x ), and


from this ∼ φn (n). Obviously, substituting the term n for every free occurrence
of x in φn ( x ) and prefixing φn (n) by ∼ is a mechanical matter. By assumption,
Γ is decidable, so we can test if ∼ φn (n) ∈ Γ. If it is, n ∈ D, and if it isn’t, n ∉ D.
So D is likewise decidable.
Since Γ represents all decidable properties, it represents D. And the for-
mulae which represent D in Γ are all among φ0 ( x ), φ1 ( x ), . . . . So let d be a
number such that φd ( x ) represents D in Γ. If d ∉ D, then, since φd ( x ) repre-
sents D, Γ ⊢ ∼ φd (d). But that means that d meets the defining condition of D,
and so d ∈ D. This contradicts d ∉ D. So by indirect proof, d ∈ D.


Since d ∈ D, by the definition of D, Γ ⊢ ∼ φd (d). On the other hand, since


φd ( x ) represents D in Γ, Γ ⊢ φd (d). Hence, Γ is inconsistent.
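
The key step is that a decision procedure for Γ would yield a decision procedure for the diagonal set D. The following Python sketch spells this out; phi, substitute_numeral and in_Gamma are hypothetical helpers (and in_Gamma, the supposed decider for Γ, cannot in fact exist).

    def in_D(n, phi, substitute_numeral, in_Gamma):
        """Membership test for the diagonal set D = {n | Γ proves ∼phi_n(n)}.

        phi(n) is assumed to return the n-th formula with one free variable x,
        substitute_numeral(f, n) to replace x by the numeral for n, and
        in_Gamma to decide membership in Γ. All three are hypothetical; the
        theorem shows that in_Gamma cannot exist for such a Γ.
        """
        sentence = "∼" + substitute_numeral(phi(n), n)   # build ∼phi_n(n)
        return in_Gamma(sentence)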

The preceding theorem shows that no consistent theory that represents all
decidable relations can be decidable. We will show that Q does represent all
decidable relations; this means that all theories that include Q, such as PA and
TA, also do, and hence also are not decidable. (Since all these theories are true
in the standard model, they are all consistent.)
We can also use this result to obtain a weak version of the first incomplete-
ness theorem. Any theory that is axiomatizable and complete is decidable.
Consistent theories that are axiomatizable and represent all decidable proper-
ties then cannot be complete.

Theorem 14.16. If Γ is axiomatizable and complete, it is decidable.

Proof. Any inconsistent theory is decidable, since inconsistent theories contain


all sentences, so the answer to the question “is φ ∈ Γ” is always “yes,” i.e., can
be decided.
So suppose Γ is consistent, and furthermore is axiomatizable, and com-
plete. Since Γ is axiomatizable, it is computably enumerable. For we can
enumerate all the correct derivations from the axioms of Γ by a computable
function. From a correct derivation we can compute the sentence it derives,
and so together there is a computable function that enumerates all theorems
of Γ. A sentence is a theorem of Γ iff ∼ φ is not a theorem, since Γ is consistent
and complete. We can therefore decide if φ ∈ Γ as follows. Enumerate all
theorems of Γ. When φ appears on this list, we know that Γ ⊢ φ. When ∼ φ
appears on this list, we know that Γ ⊬ φ. Since Γ is complete, one of these
cases eventually obtains, so the procedure eventually produces an answer.
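
The decision procedure described in this proof can be written down directly. A minimal Python sketch, assuming a hypothetical generator enumerate_theorems() that lists the theorems of Γ (such an enumeration exists because Γ is axiomatizable) and a function neg that forms ∼φ:

    def decide(phi, enumerate_theorems, neg):
        """Decide membership in a consistent, complete, axiomatizable theory.

        enumerate_theorems() is assumed to yield every theorem of Γ (possibly
        with repetitions); neg(phi) builds ∼phi. By completeness, either phi
        or ∼phi eventually appears, so the loop terminates.
        """
        for theorem in enumerate_theorems():
            if theorem == phi:
                return True     # Γ ⊢ phi, so phi ∈ Γ
            if theorem == neg(phi):
                return False    # Γ ⊢ ∼phi, so by consistency phi ∉ Γ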

Corollary 14.17. If Γ is consistent, axiomatizable, and represents every decidable


property, it is not complete.

Proof. If Γ were complete, it would be decidable by the previous theorem


(since it is axiomatizable and consistent). But since Γ represents every de-
cidable property, it is not decidable, by the first theorem.

Once we have established that, e.g., Q, represents all decidable properties,


the corollary tells us that Q must be incomplete. However, its proof does not
provide an example of an independent sentence; it merely shows that such
a sentence must exist. For this, we have to arithmetize syntax and follow
Gödel’s original proof idea. And of course, we still have to show the first
claim, namely that Q does, in fact, represent all decidable properties.
It should be noted that not every interesting theory is incomplete or unde-
cidable. There are many theories that are sufficiently strong to describe inter-
esting mathematical facts that do not satisfy the conditions of Gödel’s result.


For instance, Pres = { φ ∈ L A+ | N ⊨ φ}, the set of sentences of the language


of arithmetic without × true in the standard model, is both complete and de-
cidable. This theory is called Presburger arithmetic, and proves all the truths
about natural numbers that can be formulated just with 0, ′, and +.

Chapter 15

Recursive Functions

15.1 Introduction
In order to develop a mathematical theory of computability, one has to, first
of all, develop a model of computability. We now think of computability as the
kind of thing that computers do, and computers work with symbols. But at
the beginning of the development of theories of computability, the paradig-
matic example of computation was numerical computation. Mathematicians
were always interested in number-theoretic functions, i.e., functions f : Nn →
N that can be computed. So it is not surprising that at the beginning of the
theory of computability, it was such functions that were studied. The most
familiar examples of computable numerical functions, such as addition, mul-
tiplication, exponentiation (of natural numbers) share an interesting feature:
they can be defined recursively. It is thus quite natural to attempt a general
definition of computable function on the basis of recursive definitions. Among
the many possible ways to define number-theoretic functions recursively, one
particularly simple pattern of definition here becomes central: so-called prim-
itive recursion.
In addition to computable functions, we might be interested in computable
sets and relations. A set is computable if we can compute the answer to
whether or not a given number is an element of the set, and a relation is com-
putable iff we can compute whether or not a tuple ⟨n1 , . . . , nk ⟩ is an element
of the relation. By considering the characteristic function of a set or relation,
discussion of computable sets and relations can be subsumed under that of
computable functions. Thus we can define primitive recursive relations as
well, e.g., the relation “n evenly divides m” is a primitive recursive relation.
Primitive recursive functions—those that can be defined using just primi-
tive recursion—are not, however, the only computable number-theoretic func-
tions. Many generalizations of primitive recursion have been considered, but
the most powerful and widely-accepted additional way of computing func-
tions is by unbounded search. This leads to the definition of partial recur-


sive functions, and a related definition of general recursive functions. General


recursive functions are computable and total, and the definition character-
izes exactly the partial recursive functions that happen to be total. Recursive
functions can simulate every other model of computation (Turing machines,
lambda calculus, etc.) and so represent one of the many accepted models of
computation.

15.2 Primitive Recursion


A characteristic of the natural numbers is that every natural number can be
reached from 0 by applying the successor operation +1 finitely many times—
any natural number is either 0 or the successor of . . . the successor of 0. One
way to specify a function h : N → N that makes use of this fact is this: (a) spec-
ify what the value of h is for argument 0, and (b) also specify how to, given
the value of h( x ), compute the value of h( x + 1). For (a) tells us directly what
h(0) is, so h is defined for 0. Now, using the instruction given by (b) for x = 0,
we can compute h(1) = h(0 + 1) from h(0). Using the same instructions for
x = 1, we compute h(2) = h(1 + 1) from h(1), and so on. For every natural
number x, we’ll eventually reach the step where we define h( x + 1) from h( x ),
and so h( x ) is defined for all x ∈ N.
For instance, suppose we specify h : N → N by the following two equa-
tions:

h (0) = 1
h ( x + 1) = 2 · h ( x )

If we already know how to multiply, then these equations give us the infor-
mation required for (a) and (b) above. By successively applying the second
equation, we get that

h(1) = 2 · h(0) = 2,
h(2) = 2 · h(1) = 2 · 2,
h(3) = 2 · h(2) = 2 · 2 · 2,
...

We see that the function h we have specified is h( x ) = 2^x .


The characteristic feature of the natural numbers guarantees that there is
only one function h that meets these two criteria. A pair of equations like
these is called a definition by primitive recursion of the function h. It is so-called
because we define h “recursively,” i.e., the definition, specifically the second
equation, involves h itself on the right-hand-side. It is “primitive” because in
defining h( x + 1) we only use the value h( x ), i.e., the immediately preceding
value. This is the simplest way of defining a function on N recursively.
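
The two equations translate directly into a recursive program. The following Python sketch is only an illustration of the definition, not part of the formal apparatus of this chapter:

    def h(x):
        """h(0) = 1 and h(x + 1) = 2 * h(x), computed by recursion."""
        if x == 0:
            return 1            # the first equation
        return 2 * h(x - 1)     # the second equation, using the preceding value

    # h agrees with exponentiation to base 2:
    assert [h(x) for x in range(6)] == [1, 2, 4, 8, 16, 32]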


We can define even more fundamental functions like addition and mul-
tiplication by primitive recursion. In these cases, however, the functions in
question are 2-place. We fix one of the argument places, and use the other for
the recursion. E.g., to define add( x, y) we can fix x and define the value first
for y = 0 and then for y + 1 in terms of y. Since x is fixed, it will appear on the
left and on the right side of the defining equations.

add( x, 0) = x
add( x, y + 1) = add( x, y) + 1

These equations specify the value of add for all x and y. To find add(2, 3), for
instance, we apply the defining equations for x = 2, using the first to find
add(2, 0) = 2, then using the second to successively find add(2, 1) = 2 + 1 =
3, add(2, 2) = 3 + 1 = 4, add(2, 3) = 4 + 1 = 5.
In the definition of add we used + on the right-hand-side of the second
equation, but only to add 1. In other words, we used the successor func-
tion succ(z) = z + 1 and applied it to the previous value add( x, y) to define
add( x, y + 1). So we can think of the recursive definition as given in terms of
a single function which we apply to the previous value. However, it doesn’t
hurt—and sometimes is necessary—to allow the function to depend not just
on the previous value but also on x and y. Consider:

mult( x, 0) = 0
mult( x, y + 1) = add(mult( x, y), x )

This is a primitive recursive definition of a function mult by applying the func-


tion add to both the preceding value mult( x, y) and the first argument x. It
also defines the function mult( x, y) for all arguments x and y. For instance,
mult(2, 3) is determined by successively computing mult(2, 0), mult(2, 1), mult(2, 2),
and mult(2, 3):

mult(2, 0) = 0
mult(2, 1) = mult(2, 0 + 1) = add(mult(2, 0), 2) = add(0, 2) = 2
mult(2, 2) = mult(2, 1 + 1) = add(mult(2, 1), 2) = add(2, 2) = 4
mult(2, 3) = mult(2, 2 + 1) = add(mult(2, 2), 2) = add(4, 2) = 6
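
Both definitions translate directly into recursive programs which compute each value only from the immediately preceding one. A small Python sketch, again only as an illustration:

    def add(x, y):
        """add(x, 0) = x and add(x, y + 1) = add(x, y) + 1."""
        if y == 0:
            return x
        return add(x, y - 1) + 1        # successor of the preceding value

    def mult(x, y):
        """mult(x, 0) = 0 and mult(x, y + 1) = add(mult(x, y), x)."""
        if y == 0:
            return 0
        return add(mult(x, y - 1), x)   # add applied to the preceding value and x

    assert add(2, 3) == 5 and mult(2, 3) == 6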

The general pattern then is this: to give a primitive recursive definition of


a function h( x0 , . . . , xk−1 , y), we provide two equations. The first defines the
value of h( x0 , . . . , xk−1 , 0) without reference to h. The second defines the value
of h( x0 , . . . , xk−1 , y + 1) in terms of h( x0 , . . . , xk−1 , y), the other arguments x0 ,
. . . , xk−1 , and y. Only the immediately preceding value of h may be used in
that second equation. If we think of the operations given by the right-hand-
sides of these two equations as themselves being functions f and g, then the


general pattern to define a new function h by primitive recursion is this:

h ( x 0 , . . . , x k −1 , 0 ) = f ( x 0 , . . . , x k −1 )
h( x0 , . . . , xk−1 , y + 1) = g( x0 , . . . , xk−1 , y, h( x0 , . . . , xk−1 , y))

In the case of add, we have k = 1 and f ( x0 ) = x0 (the identity function), and


g( x0 , y, z) = z + 1 (the 3-place function that returns the successor of its third
argument):

add( x0 , 0) = f ( x0 ) = x0
add( x0 , y + 1) = g( x0 , y, add( x0 , y)) = succ(add( x0 , y))

In the case of mult, we have f ( x0 ) = 0 (the constant function always return-


ing 0) and g( x0 , y, z) = add(z, x0 ) (the 3-place function that returns the sum
of its last and first argument):

mult( x0 , 0) = f ( x0 ) = 0
mult( x0 , y + 1) = g( x0 , y, mult( x0 , y)) = add(mult( x0 , y), x0 )

15.3 Composition
If f and g are two one-place functions of natural numbers, we can compose
them: h( x ) = g( f ( x )). The new function h( x ) is then defined by composition
from the functions f and g. We’d like to generalize this to functions of more
than one argument.
Here’s one way of doing this: suppose f is a k-place function, and g0 , . . . ,
gk−1 are k functions which are all n-place. Then we can define a new n-place
function h as follows:

h( x0 , . . . , xn−1 ) = f ( g0 ( x0 , . . . , xn−1 ), . . . , gk−1 ( x0 , . . . , xn−1 ))

If f and all gi are computable, so is h: To compute h( x0 , . . . , xn−1 ), first com-


pute the values yi = gi ( x0 , . . . , xn−1 ) for each i = 0, . . . , k − 1. Then feed these
values into f to compute h( x0 , . . . , xn−1 ) = f (y0 , . . . , yk−1 ).
This may seem like an overly restrictive characterization of what happens
when we compute a new function using some existing ones. For one thing,
sometimes we do not use all the arguments of a function, as when we de-
fined g( x, y, z) = succ(z) for use in the primitive recursive definition of add.
Suppose we are allowed use of the following functions:

Pin ( x0 , . . . , xn−1 ) = xi

The functions Pin are called projection functions: Pin is an n-place function. Then
g can be defined by
g( x, y, z) = succ( P23 ( x, y, z)).


Here the role of f is played by the 1-place function succ, so k = 1. And we


have one 3-place function P23 which plays the role of g0 . The result is a 3-place
function that returns the successor of the third argument.
The projection functions also allow us to define new functions by reorder-
ing or identifying arguments. For instance, the function h( x ) = add( x, x ) can
be defined by
h( x0 ) = add( P01 ( x0 ), P01 ( x0 )).
Here k = 2, n = 1, the role of f (y0 , y1 ) is played by add, and the roles of g0 ( x0 )
and g1 ( x0 ) are both played by P01 ( x0 ), the one-place projection function (aka
the identity function).
If f (y0 , y1 ) is a function we already have, we can define the function h( x0 , x1 ) =
f ( x1 , x0 ) by
h( x0 , x1 ) = f ( P12 ( x0 , x1 ), P02 ( x0 , x1 )).
Here k = 2, n = 2, and the roles of g0 and g1 are played by P12 and P02 , respec-
tively.
You may also worry that g0 , . . . , gk−1 are all required to have the same
arity n. (Remember that the arity of a function is the number of arguments;
an n-place function has arity n.) But adding the projection functions provides
the desired flexibility. For example, suppose f and g are 3-place functions and
h is the 2-place function defined by

h( x, y) = f ( x, g( x, x, y), y).

The definition of h can be rewritten with the projection functions, as

h( x, y) = f ( P02 ( x, y), g( P02 ( x, y), P02 ( x, y), P12 ( x, y)), P12 ( x, y)).

Then h is the composition of f with P02 , l, and P12 , where

l ( x, y) = g( P02 ( x, y), P02 ( x, y), P12 ( x, y)),

i.e., l is the composition of g with P02 , P02 , and P12 .
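
Generalized composition and the projection functions can themselves be written as a short higher-order program. The following Python sketch (the function names are ours, chosen for the example) shows composition with projections used to swap the arguments of a two-place function, as above:

    def proj(n, i):
        """The projection function P_i^n: an n-place function returning its i-th argument."""
        return lambda *xs: xs[i]

    def compose(f, *gs):
        """h(x_0,...,x_{n-1}) = f(g_0(x_0,...,x_{n-1}), ..., g_{k-1}(x_0,...,x_{n-1}))."""
        return lambda *xs: f(*(g(*xs) for g in gs))

    def f(y0, y1):                 # an arbitrary 2-place function to be "swapped"
        return 10 * y0 + y1

    h = compose(f, proj(2, 1), proj(2, 0))   # h(x0, x1) = f(x1, x0)
    assert h(3, 4) == f(4, 3)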

15.4 Primitive Recursion Functions


Let us record again how we can define new functions from existing ones using
primitive recursion and composition.

Definition 15.1. Suppose f is a k-place function (k ≥ 1) and g is a (k + 2)-


place function. The function defined by primitive recursion from f and g is the
(k + 1)-place function h defined by the equations

h ( x 0 , . . . , x k −1 , 0 ) = f ( x 0 , . . . , x k −1 )
h( x0 , . . . , xk−1 , y + 1) = g( x0 , . . . , xk−1 , y, h( x0 , . . . , xk−1 , y))


Definition 15.2. Suppose f is a k-place function, and g0 , . . . , gk−1 are k func-


tions which are all n-place. The function defined by composition from f and g0 ,
. . . , gk−1 is the n-place function h defined by

h( x0 , . . . , xn−1 ) = f ( g0 ( x0 , . . . , xn−1 ), . . . , gk−1 ( x0 , . . . , xn−1 )).

In addition to succ and the projection functions

Pin ( x0 , . . . , xn−1 ) = xi ,

for each natural number n and i < n, we will include among the primitive
recursive functions the function zero( x ) = 0.

Definition 15.3. The set of primitive recursive functions is the set of functions
from Nn to N, defined inductively by the following clauses:

1. zero is primitive recursive.

2. succ is primitive recursive.

3. Each projection function Pin is primitive recursive.

4. If f is a k-place primitive recursive function and g0 , . . . , gk−1 are n-


place primitive recursive functions, then the composition of f with g0 ,
. . . , gk−1 is primitive recursive.

5. If f is a k-place primitive recursive function and g is a k + 2-place primi-


tive recursive function, then the function defined by primitive recursion
from f and g is primitive recursive.

Put more concisely, the set of primitive recursive functions is the smallest
set containing zero, succ, and the projection functions Pjn , and which is closed
under composition and primitive recursion.
Another way of describing the set of primitive recursive functions is by
defining it in terms of “stages.” Let S0 denote the set of starting functions:
zero, succ, and the projections. These are the primitive recursive functions of
stage 0. Once a stage Si has been defined, let Si+1 be the set of all functions
you get by applying a single instance of composition or primitive recursion to
functions already in Si . Then
S = ⋃_{i∈N} Si

is the set of all primitive recursive functions.


Let us verify that add is a primitive recursive function.

Proposition 15.4. The addition function add( x, y) = x + y is primitive recursive.


Proof. We already have a primitive recursive definition of add in terms of two


functions f and g which matches the format of Definition 15.1:

add( x0 , 0) = f ( x0 ) = x0
add( x0 , y + 1) = g( x0 , y, add( x0 , y)) = succ(add( x0 , y))

So add is primitive recursive provided f and g are as well. f ( x0 ) = x0 =


P01 ( x0 ), and the projection functions count as primitive recursive, so f is prim-
itive recursive. The function g is the three-place function g( x0 , y, z) defined
by
g( x0 , y, z) = succ(z).
This does not yet tell us that g is primitive recursive, since g and succ are not
quite the same function: succ is one-place, and g has to be three-place. But we
can define g “officially” by composition as

g( x0 , y, z) = succ( P23 ( x0 , y, z))

Since succ and P23 count as primitive recursive functions, g does as well, since
it can be defined by composition from primitive recursive functions.
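The clauses of Definition 15.1 and Definition 15.2 translate directly into code. The following Python sketch (a hypothetical illustration; prim_rec, compose, and proj are our names) builds add exactly as in the proof above:

```python
def zero(x):
    return 0

def succ(x):
    return x + 1

def proj(n, i):
    return lambda *args: args[i]

def compose(f, *gs):
    return lambda *xs: f(*(g(*xs) for g in gs))

def prim_rec(f, g):
    """h(xs, 0) = f(xs); h(xs, y + 1) = g(xs, y, h(xs, y))."""
    def h(*args):
        *xs, y = args
        value = f(*xs)
        for i in range(y):
            value = g(*xs, i, value)
        return value
    return h

# add, exactly as in the proof of Proposition 15.4:
add = prim_rec(proj(1, 0), compose(succ, proj(3, 2)))
assert add(3, 4) == 7 and add(0, 5) == 5
```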

Proposition 15.5. The multiplication function mult( x, y) = x · y is primitive re-


cursive.

Proof. Exercise.

Example 15.6. Here’s our very first example of a primitive recursive defini-
tion:

h (0) = 1
h ( y + 1) = 2 · h ( y ).

This function cannot fit into the form required by Definition 15.1, since k = 0.
The definition also involves the constants 1 and 2. To get around the first
problem, let’s introduce a dummy argument and define the function h′ :

h ′ ( x0 , 0) = f ( x0 ) = 1
h′ ( x0 , y + 1) = g( x0 , y, h′ ( x0 , y)) = 2 · h′ ( x0 , y).

The function f ( x0 ) = 1 can be defined from succ and zero by composition:


f ( x0 ) = succ(zero( x0 )). The function g can be defined by composition from
g′ (z) = 2 · z and projections:

g( x0 , y, z) = g′ ( P23 ( x0 , y, z))

and g′ in turn can be defined by composition as

g′ (z) = mult( g′′ (z), P01 (z))


and

g′′ (z) = succ( f (z)),

where f is as above: f (z) = succ(zero(z)). Now that we have h′ , we can


use composition again to let h(y) = h′ ( P01 (y), P01 (y)). This shows that h can
be defined from the basic functions using a sequence of compositions and
primitive recursions, so h is primitive recursive.

15.5 Primitive Recursion Notations


One advantage to having the precise inductive description of the primitive re-
cursive functions is that we can be systematic in describing them. For exam-
ple, we can assign a “notation” to each such function, as follows. Use symbols
zero, succ, and Pin for zero, successor, and the projections. Now suppose h is
defined by composition from a k-place function f and n-place functions g0 ,
. . . , gk−1 , and we have assigned notations F, G0 , . . . , Gk−1 to the latter func-
tions. Then, using a new symbol Compk,n , we can denote the function h by
Compk,n [ F, G0 , . . . , Gk−1 ].
For functions defined by primitive recursion, we can use analogous no-
tations. Suppose the (k + 1)-ary function h is defined by primitive recursion
from the k-ary function f and the (k + 2)-ary function g, and the notations
assigned to f and g are F and G, respectively. Then the notation assigned to h
is Reck [ F, G ].
Recall that the addition function is defined by primitive recursion as

add( x0 , 0) = P01 ( x0 ) = x0
add( x0 , y + 1) = succ( P23 ( x0 , y, add( x0 , y))) = add( x0 , y) + 1

Here the role of f is played by P01 , and the role of g is played by succ( P23 ( x0 , y, z)),
which is assigned the notation Comp1,3 [succ, P23 ] as it is the result of defining
a function by composition from the 1-ary function succ and the 3-ary func-
tion P23 . With this setup, we can denote the addition function by

Rec1 [ P01 , Comp1,3 [succ, P23 ]].

Having these notations sometimes proves useful, e.g., when enumerating prim-
itive recursive functions.

15.6 Primitive Recursive Functions are Computable


Suppose a function h is defined by primitive recursion

h(⃗x, 0) = f (⃗x )
h(⃗x, y + 1) = g(⃗x, y, h(⃗x, y))


and suppose the functions f and g are computable. (We use ⃗x to abbreviate x0 ,
. . . , xk−1 .) Then h(⃗x, 0) can obviously be computed, since it is just f (⃗x ) which
we assume is computable. h(⃗x, 1) can then also be computed, since 1 = 0 + 1
and so h(⃗x, 1) is just

h(⃗x, 1) = g(⃗x, 0, h(⃗x, 0)) = g(⃗x, 0, f (⃗x )).

We can go on in this way and compute

h(⃗x, 2) = g(⃗x, 1, h(⃗x, 1)) = g(⃗x, 1, g(⃗x, 0, f (⃗x )))


h(⃗x, 3) = g(⃗x, 2, h(⃗x, 2)) = g(⃗x, 2, g(⃗x, 1, g(⃗x, 0, f (⃗x ))))
h(⃗x, 4) = g(⃗x, 3, h(⃗x, 3)) = g(⃗x, 3, g(⃗x, 2, g(⃗x, 1, g(⃗x, 0, f (⃗x )))))
..
.

Thus, to compute h(⃗x, y) in general, successively compute h(⃗x, 0), h(⃗x, 1), . . . ,
until we reach h(⃗x, y).
Thus, a primitive recursive definition yields a new computable function if
the functions f and g are computable. Composition of functions also results
in a computable function if the functions f and gi are computable.
Since the basic functions zero, succ, and Pin are computable, and compo-
sition and primitive recursion yield computable functions from computable
functions, this means that every primitive recursive function is computable.

15.7 Examples of Primitive Recursive Functions


We already have some examples of primitive recursive functions: the addition
and multiplication functions add and mult. The identity function id( x ) = x
is primitive recursive, since it is just P01 . The constant functions constn ( x ) = n
are primitive recursive since they can be defined from zero and succ by suc-
cessive composition. This is useful when we want to use constants in primi-
tive recursive definitions, e.g., if we want to define the function f ( x ) = 2 · x,
we can obtain it by composition from const2 ( x ) and multiplication as f ( x ) =
mult(const2 ( x ), P01 ( x )). We'll make use of this trick from now on.

Proposition 15.7. The exponentiation function exp( x, y) = x y is primitive recur-


sive.

Proof. We can define exp primitive recursively as

exp( x, 0) = 1
exp( x, y + 1) = mult( x, exp( x, y)).


Strictly speaking, this is not a recursive definition from primitive recursive


functions. Officially, though, we have:

exp( x, 0) = f ( x )
exp( x, y + 1) = g( x, y, exp( x, y)).

where

f ( x ) = succ(zero( x )) = 1
g( x, y, z) = mult( P03 ( x, y, z), P23 ( x, y, z)) = x · z

and so f and g are defined from primitive recursive functions by composi-


tion.

Proposition 15.8. The predecessor function pred(y) defined by


pred(y) = 0       if y = 0
pred(y) = y − 1   otherwise

is primitive recursive.

Proof. Note that

pred(0) = 0 and
pred(y + 1) = y.

This is almost a primitive recursive definition. It does not, strictly speaking, fit
into the pattern of definition by primitive recursion, since that pattern requires
at least one extra argument x. It is also odd in that it does not actually use
pred(y) in the definition of pred(y + 1). But we can first define pred′ ( x, y) by

pred′ ( x, 0) = zero( x ) = 0,
pred′ ( x, y + 1) = P13 ( x, y, pred′ ( x, y)) = y.

and then define pred from it by composition, e.g., as pred( x ) = pred′ (zero( x ), P01 ( x )).

Proposition 15.9. The factorial function fac( x ) = x ! = 1 · 2 · 3 · · · · · x is primitive


recursive.

Proof. The obvious primitive recursive definition is

fac(0) = 1
fac(y + 1) = fac(y) · (y + 1).


Officially, we have to first define a two-place function h

h( x, 0) = const1 ( x )
h( x, y + 1) = g( x, y, h( x, y))

where g( x, y, z) = mult( P23 ( x, y, z), succ( P13 ( x, y, z))) and then let

fac(y) = h( P01 (y), P01 (y)) = h(y, y).

From now on we’ll be a bit more laissez-faire and not give the official defini-
tions by composition and primitive recursion.

Proposition 15.10. Truncated subtraction, x −̇ y, defined by


x −̇ y = 0       if x < y
x −̇ y = x − y   otherwise

is primitive recursive.

Proof. We have:

x −̇ 0 = x
x −̇ (y + 1) = pred( x −̇ y)
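The recursion equations of this section can be mirrored directly in code. A minimal Python sketch (our own, not the official definitions by composition and primitive recursion):

```python
def pred(y):
    return 0 if y == 0 else y - 1

def monus(x, y):                     # truncated subtraction x -. y
    if y == 0:
        return x                     # x -. 0 = x
    return pred(monus(x, y - 1))     # x -. (y + 1) = pred(x -. y)

def exp(x, y):
    if y == 0:
        return 1                     # exp(x, 0) = 1
    return x * exp(x, y - 1)         # exp(x, y + 1) = mult(x, exp(x, y))

assert monus(5, 3) == 2 and monus(3, 5) == 0
assert exp(2, 10) == 1024
```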

Proposition 15.11. The distance between x and y, | x − y|, is primitive recursive.

Proof. We have | x − y| = ( x −̇ y) + (y −̇ x ), so the distance can be defined by


composition from + and −̇, which are primitive recursive.

Proposition 15.12. The maximum of x and y, max( x, y), is primitive recursive.

Proof. We can define max( x, y) by composition from + and −̇ by

max( x, y) = x + (y −̇ x ).

If x is the maximum, i.e., x ≥ y, then y −̇ x = 0, so x + (y −̇ x ) = x + 0 = x. If


y is the maximum, then y −̇ x = y − x, and so x + (y −̇ x ) = x + (y − x ) = y.

Proposition 15.13. The minimum of x and y, min( x, y), is primitive recursive.

Proof. Exercise.

Proposition 15.14. The set of primitive recursive functions is closed under the fol-
lowing two operations:


1. Finite sums: if f (⃗x, z) is primitive recursive, then so is the function


g(⃗x, y) = ∑_{z=0}^{y} f(⃗x, z).

2. Finite products: if f (⃗x, z) is primitive recursive, then so is the function


h(⃗x, y) = ∏_{z=0}^{y} f(⃗x, z).

Proof. For example, finite sums are defined recursively by the equations

g(⃗x, 0) = f (⃗x, 0)
g(⃗x, y + 1) = g(⃗x, y) + f (⃗x, y + 1).
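A Python sketch of the finite-sum recursion (bounded_sum is our name for g; the finite-product case is entirely analogous):

```python
def bounded_sum(f, xs, y):
    """g(xs, y) = sum of f(xs, z) for z = 0, ..., y, via the recursion above."""
    total = f(*xs, 0)                   # g(xs, 0) = f(xs, 0)
    for z in range(y):
        total = total + f(*xs, z + 1)   # g(xs, y + 1) = g(xs, y) + f(xs, y + 1)
    return total

assert bounded_sum(lambda x, z: x * z, (2,), 3) == 0 + 2 + 4 + 6
```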

15.8 Primitive Recursive Relations


Definition 15.15. A relation R(⃗x ) is said to be primitive recursive if its char-
acteristic function,

χR(⃗x) = 1 if R(⃗x)
χR(⃗x) = 0 otherwise,
is primitive recursive.

In other words, when one speaks of a primitive recursive relation R(⃗x ),


one is referring to a relation of the form χ R (⃗x ) = 1, where χ R is a primitive
recursive function which, on any input, returns either 1 or 0. For example,
the relation IsZero( x ), which holds if and only if x = 0, corresponds to the
function χIsZero , defined using primitive recursion by

χIsZero (0) = 1,
χIsZero ( x + 1) = 0.

It should be clear that one can compose relations with other primitive re-
cursive functions. So the following are also primitive recursive:

1. The equality relation, x = y, defined by IsZero(| x − y|)

2. The less-than-or-equal-to relation, x ≤ y, defined by IsZero( x −̇ y)

Proposition 15.16. The set of primitive recursive relations is closed under Boolean
operations, that is, if P(⃗x ) and Q(⃗x ) are primitive recursive, so are

1. ∼ P(⃗x )

2. P(⃗x ) & Q(⃗x )


3. P(⃗x ) ∨ Q(⃗x )

4. P(⃗x ) ⊃ Q(⃗x )

Proof. Suppose P(⃗x ) and Q(⃗x ) are primitive recursive, i.e., their characteristic
functions χ P and χQ are. We have to show that the characteristic functions of
∼ P(⃗x ), etc., are also primitive recursive.
χ∼P(⃗x) = 0   if χP(⃗x) = 1
χ∼P(⃗x) = 1   otherwise

We can define χ∼ P (⃗x ) as 1 −̇ χ P (⃗x ).


χP&Q(⃗x) = 1   if χP(⃗x) = χQ(⃗x) = 1
χP&Q(⃗x) = 0   otherwise

We can define χ P&Q (⃗x ) as χ P (⃗x ) · χQ (⃗x ) or as min(χ P (⃗x ), χQ (⃗x )). Similarly,

χ P∨Q (⃗x ) = max(χ P (⃗x ), χQ (⃗x )) and


χ P⊃Q (⃗x ) = max(1 −̇ χ P (⃗x ), χQ (⃗x )).
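A Python sketch of these characteristic-function constructions (the helper names chi_not, chi_and, and chi_or are ours):

```python
def chi_not(chi_p):
    return lambda *xs: 1 - chi_p(*xs)              # 1 -. chi_P(xs)

def chi_and(chi_p, chi_q):
    return lambda *xs: chi_p(*xs) * chi_q(*xs)

def chi_or(chi_p, chi_q):
    return lambda *xs: max(chi_p(*xs), chi_q(*xs))

is_zero = lambda x: 1 if x == 0 else 0
chi_eq = lambda x, y: is_zero(abs(x - y))          # x = y   as IsZero(|x - y|)
chi_le = lambda x, y: is_zero(max(x - y, 0))       # x <= y  as IsZero(x -. y)

assert chi_and(chi_eq, chi_le)(3, 3) == 1
assert chi_not(chi_eq)(3, 4) == 1 and chi_or(chi_eq, chi_le)(5, 2) == 0
```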

Proposition 15.17. The set of primitive recursive relations is closed under bounded
quantification, i.e., if R(⃗x, z) is a primitive recursive relation, then so are the relations

(∀z < y) R(⃗x, z) and


(∃z < y) R(⃗x, z).

(∀z < y) R(⃗x, z) holds of ⃗x and y if and only if R(⃗x, z) holds for every z less than y,
and similarly for (∃z < y) R(⃗x, z).

Proof. By convention, we take (∀z < 0) R(⃗x, z) to be true (for the trivial reason
that there are no z less than 0) and (∃z < 0) R(⃗x, z) to be false. A bounded
universal quantifier functions just like a finite product or iterated minimum,
i.e., if P(⃗x, y) ⇔ (∀z < y) R(⃗x, z) then χ P (⃗x, y) can be defined by

χ P (⃗x, 0) = 1
χ P (⃗x, y + 1) = min(χ P (⃗x, y), χ R (⃗x, y)).

Bounded existential quantification can similarly be defined using max. Al-


ternatively, it can be defined from bounded universal quantification, using
the equivalence (∃z < y) R(⃗x, z) ≡ ∼(∀z < y) ∼ R(⃗x, z). Note that, for ex-
ample, a bounded quantifier of the form (∃ x ≤ y) . . . x . . . is equivalent to
(∃ x < y + 1) . . . x . . . .
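The recursions in this proof can be written out as follows (a Python sketch; forall_below and exists_below are our names):

```python
def forall_below(chi_r):
    """chi of (forall z < y) R(xs, z), following the recursion in the proof."""
    def chi_p(*args):
        *xs, y = args
        value = 1                                  # (forall z < 0) R is true
        for z in range(y):
            value = min(value, chi_r(*xs, z))
        return value
    return chi_p

def exists_below(chi_r):
    """chi of (exists z < y) R(xs, z), via ~(forall z < y) ~R(xs, z)."""
    negated = forall_below(lambda *a: 1 - chi_r(*a))
    return lambda *args: 1 - negated(*args)

chi_proper_divisor = lambda x, z: 1 if z not in (0, 1, x) and x % z == 0 else 0
has_proper_divisor = exists_below(chi_proper_divisor)
assert has_proper_divisor(15, 15) == 1 and has_proper_divisor(7, 7) == 0
```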


Another useful primitive recursive function is the conditional function,


cond( x, y, z), defined by
cond(x, y, z) = y   if x = 0
cond(x, y, z) = z   otherwise.

This is defined recursively by

cond(0, y, z) = y,
cond( x + 1, y, z) = z.

One can use this to justify definitions of primitive recursive functions by cases
from primitive recursive relations:

Proposition 15.18. If g0 (⃗x ), . . . , gm (⃗x ) are primitive recursive functions, and R0 (⃗x ),
. . . Rm−1 (⃗x ) are primitive recursive relations, then the function f defined by

f(⃗x) = g0(⃗x)      if R0(⃗x)
f(⃗x) = g1(⃗x)      if R1(⃗x) and not R0(⃗x)
. . .
f(⃗x) = gm−1(⃗x)    if Rm−1(⃗x) and none of the previous hold
f(⃗x) = gm(⃗x)      otherwise

is also primitive recursive.

Proof. When m = 1, this is just the function defined by

f (⃗x ) = cond(χ∼ R0 (⃗x ), g0 (⃗x ), g1 (⃗x )).

For m greater than 1, one can just compose definitions of this form.

15.9 Bounded Minimization


It is often useful to define a function as the least number satisfying some prop-
erty or relation P. If P is decidable, we can compute this function simply by
trying out all the possible numbers, 0, 1, 2, . . . , until we find the least one satis-
fying P. This kind of unbounded search takes us out of the realm of primitive
recursive functions. However, if we’re only interested in the least number
less than some independently given bound, we stay primitive recursive. In other
words, and a bit more generally, suppose we have a primitive recursive rela-
tion R( x, z). Consider the function that maps x and y to the least z < y such
that R( x, z). It, too, can be computed, by testing whether R( x, 0), R( x, 1), . . . ,
R( x, y − 1). But why is it primitive recursive?


Proposition 15.19. If R(⃗x, z) is primitive recursive, so is the function m R (⃗x, y)


which returns the least z less than y such that R(⃗x, z) holds, if there is one, and y
otherwise. We will write the function m R as

(min z < y) R(⃗x, z).

Proof. Note that there can be no z < 0 such that R(⃗x, z) since there is no z < 0
at all. So m R (⃗x, 0) = 0.
In case the bound is of the form y + 1 we have three cases:
1. There is a z < y such that R(⃗x, z), in which case m R (⃗x, y + 1) = m R (⃗x, y).
2. There is no such z < y, but R(⃗x, y) holds, in which case m R (⃗x, y + 1) = y.
3. There is no z < y + 1 such that R(⃗x, z), in which case m R (⃗x, y + 1) = y + 1.
So we can define m R (⃗x, y) by primitive recursion as follows:

m R (⃗x, 0) = 0
m R (⃗x, y + 1) = m R (⃗x, y)   if m R (⃗x, y) ̸= y
m R (⃗x, y + 1) = y            if m R (⃗x, y) = y and R(⃗x, y)
m R (⃗x, y + 1) = y + 1        otherwise.

Note that there is a z < y such that R(⃗x, z) iff m R (⃗x, y) ̸= y.
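A Python sketch of the recursion for m_R (bounded_min is our name; it follows the three cases above):

```python
def bounded_min(chi_r):
    """m_R(xs, y): least z < y with R(xs, z), and y if there is none."""
    def m_r(*args):
        *xs, y = args
        value = 0                          # m_R(xs, 0) = 0
        for b in range(y):                 # compute m_R(xs, b + 1)
            if value != b:
                pass                       # witness already found below b
            elif chi_r(*xs, b) == 1:
                value = b                  # b itself is the least witness
            else:
                value = b + 1              # no witness below b + 1
        return value
    return m_r

chi_square_above = lambda n, z: 1 if z * z >= n else 0
# least z < 10 with z*z >= 20:
assert bounded_min(chi_square_above)(20, 10) == 5
```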

15.10 Primes
Bounded quantification and bounded minimization provide us with a good
deal of machinery to show that natural functions and relations are primitive
recursive. For example, consider the relation “x divides y”, written x | y. The
relation x | y holds if division of y by x is possible without remainder, i.e.,
if y is an integer multiple of x. (If it doesn't hold, i.e., the remainder when
dividing y by x is > 0, we write x ∤ y.) In other words, x | y iff for some z,
x · z = y. Obviously, any such z, if it exists, must be ≤ y. So, we have that
x | y iff for some z ≤ y, x · z = y. We can define the relation x | y by bounded
existential quantification from = and multiplication by

x | y ⇔ (∃z ≤ y) ( x · z) = y.

We’ve thus shown that x | y is primitive recursive.


A natural number x is prime if it is neither 0 nor 1 and is only divisible by
1 and itself. In other words, prime numbers are such that, whenever y | x,
either y = 1 or y = x. To test if x is prime, we only have to check if y | x for
all y ≤ x, since if y > x, then automatically y ∤ x. So, the relation Prime( x ),
which holds iff x is prime, can be defined by

Prime( x ) ⇔ x ≥ 2 & (∀y ≤ x ) (y | x ⊃ y = 1 ∨ y = x )


and is thus primitive recursive.


The primes are 2, 3, 5, 7, 11, etc. Consider the function p( x ) which returns
the xth prime in that sequence, i.e., p(0) = 2, p(1) = 3, p(2) = 5, etc. (For
convenience we will often write p( x ) as px, so p0 = 2, p1 = 3, etc.)
If we had a function nextPrime(x), which returns the first prime number
larger than x, p can be easily defined using primitive recursion:

p (0) = 2
p( x + 1) = nextPrime( p( x ))

Since nextPrime( x ) is the least y such that y > x and y is prime, it can be
easily computed by unbounded search. But it can also be defined by bounded
minimization, thanks to a result due to Euclid: there is always a prime number
between x and x ! + 1.

nextPrime(x) = (min y ≤ x ! + 1) (y > x & Prime(y)).

This shows that nextPrime( x ), and hence p( x ), are (not just computable but)
primitive recursive.
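Putting the pieces together, here is a Python sketch of divisibility, Prime, nextPrime, and p (our own code; it uses Python's factorial for the bound x! + 1, and the search stays within that bound, as in the text):

```python
from math import factorial

def divides(x, y):
    return any(x * z == y for z in range(y + 1))   # (exists z <= y) x*z = y

def is_prime(x):
    return x >= 2 and all(y == 1 or y == x
                          for y in range(1, x + 1) if divides(y, x))

def next_prime(x):
    bound = factorial(x) + 1                       # Euclid: a prime exists <= x! + 1
    return next(y for y in range(x + 1, bound + 1) if is_prime(y))

def p(x):
    return 2 if x == 0 else next_prime(p(x - 1))

assert [p(i) for i in range(6)] == [2, 3, 5, 7, 11, 13]
```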
(If you’re curious, here’s a quick proof of Euclid’s theorem. Suppose pn
is the largest prime ≤ x and consider the product p = p0 · p1 · · · · · pn of all
primes ≤ x. Either p + 1 is prime or there is a prime between x and p + 1.
Why? Suppose p + 1 is not prime. Then some prime number q | p + 1 where
q < p + 1. None of the primes ≤ x divide p + 1. (By definition of p, each
of the primes pi ≤ x divides p, i.e., with remainder 0. So, each of the primes
pi ≤ x divides p + 1 with remainder 1, and so pi ∤ p + 1.) Hence, q is a prime
> x and < p + 1. And p ≤ x !, so there is a prime > x and ≤ x ! + 1.)

15.11 Sequences
The set of primitive recursive functions is remarkably robust. But we will be
able to do even more once we have developed an adequate means of handling
sequences. We will identify finite sequences of natural numbers with natural
numbers in the following way: the sequence ⟨ a0 , a1 , a2 , . . . , ak ⟩ corresponds to
the number
p0^(a0+1) · p1^(a1+1) · p2^(a2+1) · · · · · pk^(ak+1).
We add one to the exponents to guarantee that, for example, the sequences
⟨2, 7, 3⟩ and ⟨2, 7, 3, 0, 0⟩ have distinct numeric codes. We can take both 0 and 1
to code the empty sequence; for concreteness, let Λ denote 0.
The reason that this coding of sequences works is the so-called Fundamen-
tal Theorem of Arithmetic: every natural number n ≥ 2 can be written in one
and only one way in the form
n = p0^a0 · p1^a1 · · · · · pk^ak


with ak ≥ 1. This guarantees that the mapping ⟨⟩( a0 , . . . , ak ) = ⟨ a0 , . . . , ak ⟩ is


injective: different sequences are mapped to different numbers; to each num-
ber only at most one sequence corresponds.
We’ll now show that the operations of determining the length of a se-
quence, determining its ith element, appending an element to a sequence, and
concatenating two sequences, are all primitive recursive.

Proposition 15.20. The function len(s), which returns the length of the sequence s,
is primitive recursive.

Proof. Let R(i, s) be the relation defined by

R(i, s) iff pi | s & pi+1 ∤ s.

R is clearly primitive recursive. Whenever s is the code of a non-empty se-


quence, i.e.,
s = p0^(a0+1) · · · · · pk^(ak+1),
R(i, s) holds if pi is the largest prime such that pi | s, i.e., i = k. The length of
s thus is i + 1 iff pi is the largest prime that divides s, so we can let
len(s) = 0                          if s = 0 or s = 1
len(s) = 1 + (min i < s) R(i, s)    otherwise

We can use bounded minimization, since there is only one i that satisfies R(i, s)
when s is a code of a sequence, and if i exists it is less than s itself.

Proposition 15.21. The function append(s, a), which returns the result of append-
ing a to the sequence s, is primitive recursive.

Proof. append can be defined by:


append(s, a) = 2^(a+1)                  if s = 0 or s = 1
append(s, a) = s · p_len(s)^(a+1)       otherwise.

Proposition 15.22. The function element(s, i ), which returns the ith element of s
(where the initial element is called the 0th), or 0 if i is greater than or equal to the
length of s, is primitive recursive.

Proof. Note that a is the ith element of s iff pi^(a+1) is the largest power of pi that
divides s, i.e., pi^(a+1) | s but pi^(a+2) ∤ s. So:

element(s, i) = 0                             if i ≥ len(s)
element(s, i) = (min a < s) (pi^(a+2) ∤ s)    otherwise.


Instead of using the official names for the functions defined above, we
introduce a more compact notation. We will use (s)i instead of element(s, i ),
and ⟨s0 , . . . , sk ⟩ to abbreviate

append(append(. . . append(Λ, s0 ) . . . ), sk ).

Note that if s has length k, the elements of s are (s)0 , . . . , (s)k−1 .

Proposition 15.23. The function concat(s, t), which concatenates two sequences, is
primitive recursive.

Proof. We want a function concat with the property that

concat(⟨ a0 , . . . , ak ⟩, ⟨b0 , . . . , bl ⟩) = ⟨ a0 , . . . , ak , b0 , . . . , bl ⟩.

We’ll use a “helper” function hconcat(s, t, n) which concatenates the first n


symbols of t to s. This function can be defined by primitive recursion as fol-
lows:

hconcat(s, t, 0) = s
hconcat(s, t, n + 1) = append(hconcat(s, t, n), (t)n )

Then we can define concat by

concat(s, t) = hconcat(s, t, len(t)).

We will write s ⌢ t instead of concat(s, t).
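A Python sketch of the powers-of-primes coding and the operations defined in this section (the helper nth_prime and the function names are ours):

```python
def nth_prime(i):
    primes, n = [], 2
    while len(primes) <= i:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes[i]

def code(seq):                           # <a_0, ..., a_k>
    result = 1
    for i, a in enumerate(seq):
        result *= nth_prime(i) ** (a + 1)
    return result

def length(s):
    if s in (0, 1):
        return 0
    i = 0
    while s % nth_prime(i) == 0:
        i += 1
    return i

def element(s, i):
    if i >= length(s):
        return 0
    a = 0
    while s % nth_prime(i) ** (a + 2) == 0:
        a += 1
    return a

def append(s, a):
    return 2 ** (a + 1) if s in (0, 1) else s * nth_prime(length(s)) ** (a + 1)

def concat(s, t):
    for i in range(length(t)):
        s = append(s, element(t, i))
    return s

assert code([2, 7, 3]) != code([2, 7, 3, 0, 0])
assert length(code([2, 7, 3])) == 3 and element(code([2, 7, 3]), 1) == 7
assert concat(code([1, 2]), code([3])) == code([1, 2, 3])
```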


It will be useful for us to be able to bound the numeric code of a sequence
in terms of its length and its largest element. Suppose s is a sequence of
length k, each element of which is less than or equal to some number x. Then
s has at most k prime factors, each at most pk−1 , and each raised to at most
x + 1 in the prime factorization of s. In other words, if we define
sequenceBound( x, k) = p_(k−1)^(k·(x+1)),

then the numeric code of the sequence s described above is at most sequenceBound( x, k).
Having such a bound on sequences gives us a way of defining new func-
tions using bounded search. For example, we can define concat using bounded
search. All we need to do is write down a primitive recursive specification of
the object (number of the concatenated sequence) we are looking for, and a
bound on how far to look. The following works:

concat(s, t) = (min v < sequenceBound(s + t, len(s) + len(t)))


(len(v) = len(s) + len(t) &
(∀i < len(s)) ((v)i = (s)i ) &
(∀ j < len(t)) ((v)len(s)+ j = (t) j ))


Proposition 15.24. The function subseq(s, i, n) which returns the subsequence of s


of length n beginning at the ith element, is primitive recursive.

Proof. Exercise.

15.12 Trees
Sometimes it is useful to represent trees as natural numbers, just like we can
represent sequences by numbers and properties of and operations on them by
primitive recursive relations and functions on their codes. We’ll use sequences
and their codes to do this. A tree can be either a single node (possibly with a
label) or else a node (possibly with a label) connected to a number of subtrees.
The node is called the root of the tree, and the subtrees it is connected to its
immediate subtrees.
We code trees recursively as a sequence ⟨k, d1 , . . . , dk ⟩, where k is the num-
ber of immediate subtrees and d1 , . . . , dk the codes of the immediate subtrees.
If the nodes have labels, they can be included after the immediate subtrees. So
a tree consisting just of a single node with label l would be coded by ⟨0, l ⟩, and
a tree consisting of a root (labelled l1 ) connected to two single nodes (labelled
l2 , l3 ) would be coded by ⟨2, ⟨0, l2 ⟩, ⟨0, l3 ⟩, l1 ⟩.
Proposition 15.25. The function SubtreeSeq(t), which returns the code of a se-
quence the elements of which are the codes of all subtrees of the tree with code t, is
primitive recursive.

Proof. First note that ISubtrees(t) = subseq(t, 1, (t)0 ) is primitive recursive


and returns the codes of the immediate subtrees of a tree t. Now we can
define a helper function hSubtreeSeq(t, n) which computes the sequence of all
subtrees which are n nodes removed from the root. The sequence of subtrees
of t which is 0 nodes removed from the root—in other words, begins at the root
of t—is the sequence consisting just of t. To obtain a sequence of all level n + 1
subtrees of t, we concatenate the level n subtrees with a sequence consisting
of all immediate subtrees of the level n subtrees. To get a list of all these, note
that if f ( x ) is a primitive recursive function returning codes of sequences, then
g f (s, k) = f ((s)0 ) ⌢ . . . ⌢ f ((s)k ) is also primitive recursive:

g(s, 0) = f ((s)0 )
g(s, k + 1) = g(s, k) ⌢ f ((s)k+1 )

For instance, if s is a sequence of trees, then h(s) = gISubtrees (s, len(s)) gives
the sequence of the immediate subtrees of the elements of s. We can use it to
define hSubtreeSeq by

hSubtreeSeq(t, 0) = ⟨t⟩
hSubtreeSeq(t, n + 1) = hSubtreeSeq(t, n) ⌢ h(hSubtreeSeq(t, n)).


The maximum level of subtrees in a tree coded by t, i.e., the maximum dis-
tance between the root and a leaf node, is bounded by the code t. So a se-
quence of codes of all subtrees of the tree coded by t is given by hSubtreeSeq(t, t).

15.13 Other Recursions


Using pairing and sequencing, we can justify more exotic (and useful) forms
of primitive recursion. For example, it is often useful to define two functions
simultaneously, such as in the following definition:

h0 (⃗x, 0) = f 0 (⃗x )
h1 (⃗x, 0) = f 1 (⃗x )
h0 (⃗x, y + 1) = g0 (⃗x, y, h0 (⃗x, y), h1 (⃗x, y))
h1 (⃗x, y + 1) = g1 (⃗x, y, h0 (⃗x, y), h1 (⃗x, y))

This is an instance of simultaneous recursion. Another useful way of defining


functions is to give the value of h(⃗x, y + 1) in terms of all the values h(⃗x, 0),
. . . , h(⃗x, y), as in the following definition:

h(⃗x, 0) = f (⃗x )
h(⃗x, y + 1) = g(⃗x, y, ⟨h(⃗x, 0), . . . , h(⃗x, y)⟩).

The following schema captures this idea more succinctly:

h(⃗x, y) = g(⃗x, y, ⟨h(⃗x, 0), . . . , h(⃗x, y − 1)⟩)

with the understanding that the last argument to g is just the empty sequence
when y is 0. In either formulation, the idea is that in computing the “successor
step,” the function h can make use of the entire sequence of values computed
so far. This is known as a course-of-values recursion. For a particular example,
it can be used to justify the following type of definition:
h(⃗x, y) = g(⃗x, y, h(⃗x, k(⃗x, y)))   if k(⃗x, y) < y
h(⃗x, y) = f(⃗x)                      otherwise

In other words, the value of h at y can be computed in terms of the value of h


at any previous value, given by k.
You should think about how to obtain these functions using ordinary prim-
itive recursion. One final version of primitive recursion is more flexible in that
one is allowed to change the parameters (side values) along the way:

h(⃗x, 0) = f (⃗x )
h(⃗x, y + 1) = g(⃗x, y, h(k (⃗x ), y))

This, too, can be simulated with ordinary primitive recursion. (Doing so is


tricky. For a hint, try unwinding the computation by hand.)
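A Python sketch of course-of-values recursion, using the Fibonacci numbers as a familiar example (the names course_of_values and fib are ours):

```python
def course_of_values(f0, g):
    """h(0) = f0; h(y + 1) = g(y, <h(0), ..., h(y)>)."""
    def h(y):
        history = [f0]                 # the sequence <h(0)>
        for n in range(y):
            history.append(g(n, history))
        return history[y]
    return h

# fib(y + 1) needs the two previous values, so it is a course-of-values recursion:
fib = course_of_values(0, lambda n, hist: 1 if n == 0 else hist[n] + hist[n - 1])
assert [fib(i) for i in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]
```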


15.14 Non-Primitive Recursive Functions


The primitive recursive functions do not exhaust the intuitively computable
functions. It should be intuitively clear that we can make a list of all the unary
primitive recursive functions, f 0 , f 1 , f 2 , . . . such that we can effectively com-
pute the value of f x on input y; in other words, the function g( x, y), defined
by
g( x, y) = f x (y)
is computable. But then so is the function

h( x ) = g( x, x ) + 1
= f x ( x ) + 1.

For each primitive recursive function f i , the value of h and f i differ at i. So h


is computable, but not primitive recursive; and one can say the same about g.
This is an “effective” version of Cantor’s diagonalization argument.
One can provide more explicit examples of computable functions that are
not primitive recursive. For example, let the notation g^n(x) denote g(g(. . . g(x))),
with n g's in all; and define a sequence g0, g1, . . . of functions by

g0(x) = x + 1
g_(n+1)(x) = g_n^x(x)

You can confirm that each function gn is primitive recursive. Each successive
function grows much faster than the one before; g1(x) is equal to 2x, g2(x) is
equal to 2^x · x, and g3(x) grows roughly like an exponential stack of x 2's.
Ackermann–Péter function is essentially the function G ( x ) = gx ( x ), and one
can show that this grows faster than any primitive recursive function.
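A Python sketch of this hierarchy (our own code; it is only feasible to evaluate for tiny arguments, since the values grow so quickly):

```python
def g(n, x):
    if n == 0:
        return x + 1
    value = x
    for _ in range(x):           # g_{n+1}(x) = g_n applied x times to x
        value = g(n - 1, value)
    return value

assert g(1, 5) == 10             # g_1(x) = 2x
assert g(2, 3) == 24             # g_2(x) = 2^x * x
```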
Let us return to the issue of enumerating the primitive recursive functions.
Remember that we have assigned symbolic notations to each primitive recur-
sive function; so it suffices to enumerate notations. We can assign a natural
number #( F ) to each notation F, recursively, as follows:

#(0) = ⟨0⟩
#( S ) = ⟨1⟩
#( Pin ) = ⟨2, n, i ⟩
#(Compk,l [ H, G0 , . . . , Gk−1 ]) = ⟨3, k, l, #( H ), #( G0 ), . . . , #( Gk−1 )⟩
#(Recl [ G, H ]) = ⟨4, l, #( G ), #( H )⟩

Here we are using the fact that every sequence of numbers can be viewed as
a natural number, using the codes from the last section. The upshot is that
every code is assigned a natural number. Of course, some sequences (and
hence some numbers) do not correspond to notations; but we can let f i be the
unary primitive recursive function with notation coded as i, if i codes such a


notation; and the constant 0 function otherwise. The net result is that we have
an explicit way of enumerating the unary primitive recursive functions.
(In fact, some functions, like the constant zero function, will appear more
than once on the list. This is not just an artifact of our coding, but also a result
of the fact that the constant zero function has more than one notation. We will
later see that one can not computably avoid these repetitions; for example,
there is no computable function that decides whether or not a given notation
represents the constant zero function.)
We can now take the function g( x, y) to be given by f x (y), where f x refers
to the enumeration we have just described. How do we know that g( x, y) is
computable? Intuitively, this is clear: to compute g( x, y), first “unpack” x, and
see if it is a notation for a unary function. If it is, compute the value of that
function on input y.
You may already be convinced that (with some work!) one can write
a program (say, in Java or C++) that does this; and now we can appeal to
the Church–Turing thesis, which says that anything that, intuitively, is com-
putable can be computed by a Turing machine.
Of course, a more direct way to show that g( x, y) is computable is to de-
scribe a Turing machine that computes it, explicitly. This would, in particular,
avoid the Church–Turing thesis and appeals to intuition. Soon we will have
built up enough machinery to show that g( x, y) is computable, appealing to a
model of computation that can be simulated on a Turing machine: namely, the
recursive functions.

15.15 Partial Recursive Functions


To motivate the definition of the recursive functions, note that our proof that
there are computable functions that are not primitive recursive actually estab-
lishes much more. The argument was simple: all we used was the fact that it
is possible to enumerate functions f 0 , f 1 , . . . such that, as a function of x and y,
f x (y) is computable. So the argument applies to any class of functions that can be
enumerated in such a way. This puts us in a bind: we would like to describe the
computable functions explicitly; but any explicit description of a collection of
computable functions cannot be exhaustive!
The way out is to allow partial functions to come into play. We will see
that it is possible to enumerate the partial computable functions. In fact, we
already pretty much know that this is the case, since it is possible to enumerate
Turing machines in a systematic way. We will come back to our diagonal
argument later, and explore why it does not go through when partial functions
are included.
The question is now this: what do we need to add to the primitive recur-
sive functions to obtain all the partial recursive functions? We need to do two
things:


1. Modify our definition of the primitive recursive functions to allow for


partial functions as well.

2. Add something to the definition, so that some new partial functions are
included.

The first is easy. As before, we will start with zero, successor, and projec-
tions, and close under composition and primitive recursion. The only differ-
ence is that we have to modify the definitions of composition and primitive
recursion to allow for the possibility that some of the terms in the definition
are not defined. If f and g are partial functions, we will write f ( x ) ↓ to mean
that f is defined at x, i.e., x is in the domain of f ; and f ( x ) ↑ to mean the
opposite, i.e., that f is not defined at x. We will use f ( x ) ≃ g( x ) to mean that
either f ( x ) and g( x ) are both undefined, or they are both defined and equal.
We will use these notations for more complicated terms as well. We will adopt
the convention that if h and g0 , . . . , gk all are partial functions, then

h( g0 (⃗x ), . . . , gk (⃗x ))

is defined if and only if each gi is defined at ⃗x, and h is defined at g0 (⃗x ),


. . . , gk (⃗x ). With this understanding, the definitions of composition and prim-
itive recursion for partial functions is just as above, except that we have to
replace “=” by “≃”.
What we will add to the definition of the primitive recursive functions to
obtain partial functions is the unbounded search operator. If f ( x, ⃗z) is any partial
function on the natural numbers, define µx f ( x, ⃗z) to be

the least x such that f (0, ⃗z), f (1, ⃗z), . . . , f ( x, ⃗z) are all defined, and
f ( x, ⃗z) = 0, if such an x exists

with the understanding that µx f ( x, ⃗z) is undefined otherwise. This defines


µx f ( x, ⃗z) uniquely.
Note that our definition makes no reference to Turing machines, or algo-
rithms, or any specific computational model. But like composition and prim-
itive recursion, there is an operational, computational intuition behind un-
bounded search. When it comes to the computability of a partial function,
arguments where the function is undefined correspond to inputs for which
the computation does not halt. The procedure for computing µx f ( x, ⃗z) will
amount to this: compute f (0, ⃗z), f (1, ⃗z), f (2, ⃗z) until a value of 0 is returned. If
any of the intermediate computations do not halt, however, neither does the
computation of µx f ( x, ⃗z).
If R( x, ⃗z) is any relation, µx R( x, ⃗z) is defined to be µx (1 −̇ χ R ( x, ⃗z)). In
other words, µx R( x, ⃗z) returns the least value of x such that R( x, ⃗z) holds. So,
if f ( x, ⃗z) is a total function, µx f ( x, ⃗z) is the same as µx ( f ( x, ⃗z) = 0). But note
that our original definition is more general, since it allows for the possibility


that f ( x, ⃗z) is not everywhere defined (whereas, in contrast, the characteristic


function of a relation is always total).
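A Python sketch of the unbounded search operator (our own illustration; we model "undefined" by the value None, so a diverging computation is only simulated, not actually non-halting):

```python
def mu(f):
    """Return the function zs -> (least x with f(x, zs) = 0), if it exists."""
    def search(*zs):
        x = 0
        while True:
            value = f(x, *zs)
            if value is None:    # f undefined at x: the search is undefined too
                return None      # (a real computation would simply not halt)
            if value == 0:
                return x
            x += 1
    return search

# least x with x*x >= z, obtained by applying mu to a regular (total) function:
isqrt_up = mu(lambda x, z: 0 if x * x >= z else 1)
assert isqrt_up(10) == 4
```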

Definition 15.26. The set of partial recursive functions is the smallest set of par-
tial functions from the natural numbers to the natural numbers (of various
arities) containing zero, successor, and projections, and closed under compo-
sition, primitive recursion, and unbounded search.

Of course, some of the partial recursive functions will happen to be total,


i.e., defined for every argument.

Definition 15.27. The set of recursive functions is the set of partial recursive
functions that are total.

A recursive function is sometimes called “total recursive” to emphasize


that it is defined everywhere.

15.16 The Normal Form Theorem


Theorem 15.28 (Kleene’s Normal Form Theorem). There is a primitive recur-
sive relation T (e, x, s) and a primitive recursive function U (s), with the following
property: if f is any partial recursive function, then for some e,

f ( x ) ≃ U (µs T (e, x, s))

for every x.

The proof of the normal form theorem is involved, but the basic idea is
simple. Every partial recursive function has an index e, intuitively, a number
coding its program or definition. If f ( x ) ↓, the computation can be recorded
systematically and coded by some number s, and the fact that s codes the
computation of f on input x can be checked primitive recursively using only
x and the definition e. Consequently, the relation T, “the function with index e
has a computation for input x, and s codes this computation,” is primitive
recursive. Given the full record of the computation s, the “upshot” of s is the
value of f ( x ), and it can be obtained from s primitive recursively as well.
The normal form theorem shows that only a single unbounded search is
required for the definition of any partial recursive function. Basically, we can
search through all numbers until we find one that codes a computation of
the function with index e for input x. We can use the numbers e as “names”
of partial recursive functions, and write φe for the function f defined by the
equation in the theorem. Note that any partial recursive function can have
more than one index—in fact, every partial recursive function has infinitely
many indices.


15.17 The Halting Problem


The halting problem in general is the problem of deciding, given the specifica-
tion e (e.g., program) of a computable function and a number n, whether the
computation of the function on input n halts, i.e., produces a result. Famously,
Alan Turing proved that this problem itself cannot be solved by a computable
function, i.e., the function
h(e, n) = 1   if computation e halts on input n
h(e, n) = 0   otherwise,

is not computable.
In the context of partial recursive functions, the role of the specification
of a program may be played by the index e given in Kleene’s normal form
theorem. If f is a partial recursive function, any e for which the equation in
the normal form theorem holds, is an index of f . Given a number e, the normal
form theorem states that

φe ( x ) ≃ U (µs T (e, x, s))

is partial recursive, and for every partial recursive f : N → N, there is an


e ∈ N such that φe ( x ) ≃ f ( x ) for all x ∈ N. In fact, for each such f there is
not just one, but infinitely many such e. The halting function h is defined by
h(e, x) = 1   if φe(x) ↓
h(e, x) = 0   otherwise.

Note that h(e, x ) = 0 if φe ( x ) ↑, but also when e is not the index of a partial
recursive function at all.

Theorem 15.29. The halting function h is not partial recursive.

Proof. If h were partial recursive, we could define


d(y) = 1            if h(y, y) = 0
d(y) = µx x ̸= x    otherwise.

Since no number x satisfies x ̸= x, there is no µx x ̸= x, and so d(y) ↑ iff


h(y, y) ̸= 0. From this definition it follows that

1. d(y) ↓ iff φy (y) ↑ or y is not the index of a partial recursive function.

2. d(y) ↑ iff φy (y) ↓.

If h were partial recursive, then d would be partial recursive as well. Thus,


by the Kleene normal form theorem, it has an index ed . Consider the value of
h(ed , ed ). There are two possible cases, 0 and 1.


1. If h(ed , ed ) = 1 then φed (ed ) ↓. But φed ≃ d, and d(ed ) is defined iff
h(ed , ed ) = 0. So h(ed , ed ) ̸= 1.

2. If h(ed , ed ) = 0 then either ed is not the index of a partial recursive func-


tion, or it is and φed (ed ) ↑. But again, φed ≃ d, and d(ed ) is undefined iff
φ ed ( e d ) ↓.

The upshot is that ed cannot, after all, be the index of a partial recursive func-
tion. But if h were partial recursive, d would be too, and so our definition of
ed as an index of it would be admissible. We must conclude that h cannot be
partial recursive.

15.18 General Recursive Functions


There is another way to obtain a set of total functions. Say a total function
f ( x, ⃗z) is regular if for every sequence of natural numbers ⃗z, there is an x such
that f ( x, ⃗z) = 0. In other words, the regular functions are exactly those func-
tions to which one can apply unbounded search, and end up with a total func-
tion. One can, conservatively, restrict unbounded search to regular functions:

Definition 15.30. The set of general recursive functions is the smallest set of
functions from the natural numbers to the natural numbers (of various ari-
ties) containing zero, successor, and projections, and closed under composi-
tion, primitive recursion, and unbounded search applied to regular functions.

Clearly every general recursive function is total. The difference between


Definition 15.30 and Definition 15.27 is that in the latter one is allowed to
use partial recursive functions along the way; the only requirement is that
the function you end up with at the end is total. So the word “general,” a
historic relic, is a misnomer; on the surface, Definition 15.30 is less general
than Definition 15.27. But, fortunately, the difference is illusory; though the
definitions are different, the set of general recursive functions and the set of
recursive functions are one and the same.

Chapter 16

Arithmetization of Syntax

16.1 Introduction
In order to connect computability and logic, we need a way to talk about the
objects of logic (symbols, terms, formulae, derivations), operations on them,
and their properties and relations, in a way amenable to computational treat-
ment. We can do this directly, by considering computable functions and re-
lations on symbols, sequences of symbols, and other objects built from them.
Since the objects of logical syntax are all finite and built from a countable set
of symbols, this is possible for some models of computation. But other models
of computation—such as the recursive functions—are restricted to numbers,
their relations and functions. Moreover, ultimately we also want to be able to
deal with syntax within certain theories, specifically, in theories formulated in
the language of arithmetic. In these cases it is necessary to arithmetize syntax,
i.e., to represent syntactic objects, operations on them, and their relations, as
numbers, arithmetical functions, and arithmetical relations, respectively. The
idea, which goes back to Leibniz, is to assign numbers to syntactic objects.
It is relatively straightforward to assign numbers to symbols as their “codes.”
Some symbols pose a bit of a challenge, since, e.g., there are infinitely many
variables, and even infinitely many function symbols of each arity n. But of
course it’s possible to assign numbers to symbols systematically in such a way
that, say, v2 and v3 are assigned different codes. Sequences of symbols (such
as terms and formulae) are a bigger challenge. But if we can deal with se-
quences of numbers purely arithmetically (e.g., by the powers-of-primes cod-
ing of sequences), we can extend the coding of individual symbols to coding
of sequences of symbols, and then further to sequences or other arrangements
of formulae, such as derivations. This extended coding is called “Gödel num-
bering.” Every term, formula, and derivation is assigned a Gödel number.
By coding sequences of symbols as sequences of their codes, and by choos-
ing a system of coding sequences that can be dealt with using computable
functions, we can then also deal with Gödel numbers using computable func-


tions. In practice, all the relevant functions will be primitive recursive. For
instance, computing the length of a sequence and computing the i-th element
of a sequence from the code of the sequence are both primitive recursive. If
the number coding the sequence is, e.g., the Gödel number of a formula φ,
we immediately see that the length of a formula and the (code of the) i-th
symbol in a formula can also be computed from the Gödel number of φ. It
is a bit harder to prove that, e.g., the property of being the Gödel number of
a correctly formed term or of a correct derivation is primitive recursive. It
is nevertheless possible, because the sequences of interest (terms, formulae,
derivations) are inductively defined.
As an example, consider the operation of substitution. If φ is a formula,
x a variable, and t a term, then φ[t/x ] is the result of replacing every free
occurrence of x in φ by t. Now suppose we have assigned Gödel numbers to φ,
x, t—say, k, l, and m, respectively. The same scheme assigns a Gödel number
to φ[t/x ], say, n. This mapping—of k, l, and m to n—is the arithmetical analog
of the substitution operation. When the substitution operation maps φ, x, t to
φ[t/x ], the arithmetized substitution function maps the Gödel numbers k, l,
m to the Gödel number n. We will see that this function is primitive recursive.
Arithmetization of syntax is not just of abstract interest, although it was
originally a non-trivial insight that languages like the language of arithmetic,
which do not come with mechanisms for “talking about” languages can, after
all, formalize complex properties of expressions. It is then just a small step to
ask what a theory in this language, such as Peano arithmetic, can prove about
its own language (including, e.g., whether sentences are provable or true).
This leads us to the famous limitative theorems of Gödel (about unprovability)
and Tarski (the undefinability of truth). But the trick of arithmetizing syntax
is also important in order to prove some important results in computability
theory, e.g., about the computational power of theories or the relationship be-
tween different models of computability. The arithmetization of syntax serves
as a model for arithmetizing other objects and properties. For instance, it is
similarly possible to arithmetize configurations and computations (say, of Tur-
ing machines). This makes it possible to simulate computations in one model
(e.g., Turing machines) in another (e.g., recursive functions).

16.2 Coding Symbols


The basic language L of first order logic makes use of the symbols

⊥ ∼ ∨ & ⊃ ∀ ∃ = ( ) ,

together with countable sets of variables and constant symbols, and countable
sets of function symbols and predicate symbols of arbitrary arity. We can as-
sign codes to each of these symbols in such a way that every symbol is assigned
a unique number as its code, and no two different symbols are assigned the


same number. We know that this is possible since the set of all symbols is
countable and so there is a bijection between it and the set of natural num-
bers. But we want to make sure that we can recover the symbol (as well as
some information about it, e.g., the arity of a function symbol) from its code
in a computable way. There are many possible ways of doing this, of course.
Here is one such way, which uses primitive recursive functions. (Recall that
⟨n0 , . . . , nk ⟩ is the number coding the sequence of numbers n0 , . . . , nk .)

Definition 16.1. If s is a symbol of L, let the symbol code cs be defined as fol-


lows:

1. If s is among the logical symbols, cs is given by the following table:

⊥ ∼ ∨ & ⊃ ∀
⟨0, 0⟩ ⟨0, 1⟩ ⟨0, 2⟩ ⟨0, 3⟩ ⟨0, 4⟩ ⟨0, 5⟩
∃ = ( ) ,
⟨0, 6⟩ ⟨0, 7⟩ ⟨0, 8⟩ ⟨0, 9⟩ ⟨0, 10⟩

2. If s is the i-th variable vi , then cs = ⟨1, i ⟩.

3. If s is the i-th constant symbol ci , then cs = ⟨2, i ⟩.

4. If s is the i-th n-ary function symbol fin , then cs = ⟨3, n, i ⟩.

5. If s is the i-th n-ary predicate symbol Pin , then cs = ⟨4, n, i ⟩.

Proposition 16.2. The following relations are primitive recursive:

1. Fn( x, n) iff x is the code of fin for some i, i.e., x is the code of an n-ary function
symbol.

2. Pred( x, n) iff x is the code of Pin for some i or x is the code of = and n = 2,
i.e., x is the code of an n-ary predicate symbol.

Definition 16.3. If s0 , . . . , sn−1 is a sequence of symbols, its Gödel number is


⟨ c s 0 , . . . , c s n −1 ⟩ .

Note that codes and Gödel numbers are different things. For instance, the
variable v5 has a code cv5 = ⟨1, 5⟩ = 2^2 · 3^6. But the variable v5 considered as
a term is also a sequence of symbols (of length 1). The Gödel number # v5 # of the
term v5 is ⟨cv5⟩ = 2^(cv5 + 1) = 2^(2^2 · 3^6 + 1).

Example 16.4. Recall that if k0 , . . . , k n−1 is a sequence of numbers, then the


code of the sequence ⟨k0 , . . . , k n−1 ⟩ in the power-of-primes coding is

2^(k0+1) · 3^(k1+1) · · · · · p_(n−1)^(k_(n−1)+1),


where pi is the i-th prime (starting with p0 = 2). So for instance, the formula
v0 = 0, or, more explicitly, =(v0 , c0 ), has the Gödel number

⟨c= , c( , cv0 , c, , cc0 , c) ⟩.

Here, c= is ⟨0, 7⟩ = 2^(0+1) · 3^(7+1), cv0 is ⟨1, 0⟩ = 2^(1+1) · 3^(0+1), etc. So # =(v0 , c0 )# is

2^(c= + 1) · 3^(c( + 1) · 5^(cv0 + 1) · 7^(c, + 1) · 11^(cc0 + 1) · 13^(c) + 1) =
2^(2·3^8 + 1) · 3^(2·3^9 + 1) · 5^(2^2·3 + 1) · 7^(2·3^11 + 1) · 11^(2^3·3 + 1) · 13^(2·3^10 + 1) =
2^13123 · 3^39367 · 5^13 · 7^354295 · 11^25 · 13^118099.
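A Python sketch of Definition 16.1 and Definition 16.3 for the logical symbols, variables, and constant symbols (the helper names are ours; function and predicate symbols are omitted to keep the sketch short):

```python
def nth_prime(i):
    primes, n = [], 2
    while len(primes) <= i:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes[i]

def code(seq):                            # powers-of-primes coding <a_0, ..., a_k>
    result = 1
    for i, a in enumerate(seq):
        result *= nth_prime(i) ** (a + 1)
    return result

LOGICAL = {'⊥': 0, '∼': 1, '∨': 2, '&': 3, '⊃': 4, '∀': 5,
           '∃': 6, '=': 7, '(': 8, ')': 9, ',': 10}

def symbol_code(s):
    if s in LOGICAL:
        return code([0, LOGICAL[s]])
    kind, i = s                           # e.g., ('v', 5) for the variable v_5
    return code([{'v': 1, 'c': 2}[kind], i])

def goedel_number(symbols):
    return code([symbol_code(s) for s in symbols])

assert symbol_code('=') == 2 ** 1 * 3 ** 8                 # c_= = <0, 7>
assert goedel_number([('v', 5)]) == 2 ** (symbol_code(('v', 5)) + 1)
```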

16.3 Coding Terms


A term is simply a certain kind of sequence of symbols: it is built up induc-
tively from constants and variables according to the formation rules for terms.
Since sequences of symbols can be coded as numbers—using a coding scheme
for the symbols plus a way to code sequences of numbers—assigning Gödel
numbers to terms is not difficult. The challenge is rather to show that the
property a number has if it is the Gödel number of a correctly formed term is
computable, or in fact primitive recursive.
Variables and constant symbols are the simplest terms, and testing whether
x is the Gödel number of such a term is easy: Var( x ) holds if x is # vi # for some i.
In other words, x is a sequence of length 1 and its single element ( x )0 is the
code of some variable vi , i.e., x is ⟨⟨1, i ⟩⟩ for some i. Similarly, Const( x ) holds
if x is # ci # for some i. Both of these relations are primitive recursive, since if
such an i exists, it must be < x:

Var( x ) ⇔ (∃i < x ) x = ⟨⟨1, i ⟩⟩


Const( x ) ⇔ (∃i < x ) x = ⟨⟨2, i ⟩⟩

Proposition 16.5. The relations Term( x ) and ClTerm( x ) which hold iff x is the
Gödel number of a term or a closed term, respectively, are primitive recursive.

Proof. A sequence of symbols s is a term iff there is a sequence s0 , . . . , sk−1 = s


of terms which records how the term s was formed from constant symbols
and variables according to the formation rules for terms. To express that such
a putative formation sequence follows the formation rules it has to be the case
that, for each i < k, either

1. si is a variable v j , or

2. si is a constant symbol c j , or

3. si is built from n terms t1 , . . . , tn occurring prior to place i using an n-


place function symbol f jn .


To show that the corresponding relation on Gödel numbers is primitive re-


cursive, we have to express this condition primitive recursively, i.e., using
primitive recursive functions, relations, and bounded quantification.
Suppose y is the number that codes the sequence s0 , . . . , sk−1 , i.e., y =
⟨# s0 # , . . . , # sk−1 # ⟩. It codes a formation sequence for the term with Gödel num-
ber x iff for all i < k:


1. Var((y)i ), or

2. Const((y)i ), or

3. there is an n and a number z = ⟨z1 , . . . , zn ⟩ such that each zl is equal to


some (y)i′ for i′ < i and

(y)i = # f jn (# ⌢ flatten(z) ⌢ # )# ,

and moreover (y)k−1 = x. (The function flatten(z) turns the sequence ⟨# t1 # , . . . , # tn # ⟩


into # t1 , . . . , tn # and is primitive recursive.)
The indices j, n, the Gödel numbers zl of the terms tl , and the code z of the
sequence ⟨z1 , . . . , zn ⟩, in (3) are all less than y. We can replace k above with
len(y). Hence we can express “y is the code of a formation sequence of the
term with Gödel number x” in a way that shows that this relation is primitive
recursive.
We now just have to convince ourselves that there is a primitive recursive
bound on y. But if x is the Gödel number of a term, it must have a formation
sequence with at most len( x ) terms (since every term in the formation se-
quence of s must start at some place in s, and no two subterms can start at the
same place). The Gödel number of each subterm of s is of course ≤ x. Hence,
k ( x +1)
there always is a formation sequence with code ≤ pk−1 , where k = len( x ).
For ClTerm, simply leave out the clause for variables.

Proposition 16.6. The function num(n) = # n# is primitive recursive.

Proof. We define num(n) by primitive recursion:

num(0) = # 0#
num(n + 1) = # ′(# ⌢ num(n) ⌢ # )# .

16.4 Coding Formulae


Proposition 16.7. The relation Atom( x ) which holds iff x is the Gödel number of
an atomic formula, is primitive recursive.

Proof. The number x is the Gödel number of an atomic formula iff one of the
following holds:


1. There are n, j < x, and z < x such that for each i < n, Term((z)i ) and

   x = # Pjn(# ⌢ flatten(z) ⌢ # )# .

2. There are z1 , z2 < x such that Term(z1 ), Term(z2 ), and

   x = # =(# ⌢ z1 ⌢ # ,# ⌢ z2 ⌢ # )# .

3. x = # ⊥# .

Proposition 16.8. The relation Frm( x ) which holds iff x is the Gödel number of
a formula is primitive recursive.

Proof. A sequence of symbols s is a formula iff there is a formation sequence s0 ,
. . . , sk−1 = s of formulae which records how s was formed from atomic formu-
lae according to the formation rules. The code of each si (and indeed the
code of the sequence ⟨s0 , . . . , sk−1 ⟩) is less than the code x of s.

Proposition 16.9. The relation FreeOcc( x, z, i ), which holds iff the i-th symbol of
the formula with Gödel number x is a free occurrence of the variable with Gödel num-
ber z, is primitive recursive.

Proof. Exercise.

Proposition 16.10. The property Sent( x ) which holds iff x is the Gödel number of
a sentence is primitive recursive.

Proof. A sentence is a formula without free occurrences of variables. So Sent( x )


holds iff

(∀i < len( x )) (∀z < x )


((∃ j < z) z = # v j # ⊃ ∼FreeOcc( x, z, i )).

16.5 Substitution
Recall that substitution is the operation of replacing all free occurrences of
a variable u in a formula φ by a term t, written φ[t/u]. This operation, when
carried out on Gödel numbers of variables, formulae, and terms, is primitive
recursive.

Proposition 16.11. There is a primitive recursive function Subst( x, y, z) with the


property that
Subst(# φ# , # t# , # u# ) = # φ[t/u]# .


Proof. We can then define a function hSubst by primitive recursion as follows:

hSubst( x, y, z, 0) = Λ
hSubst( x, y, z, i + 1) =
(
hSubst( x, y, z, i ) ⌢ y if FreeOcc( x, z, i )
append(hSubst( x, y, z, i ), ( x )i ) otherwise.

Subst( x, y, z) can now be defined as hSubst( x, y, z, len( x )).
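A Python sketch of the idea behind hSubst, operating on lists of symbol codes rather than on the coded numbers themselves (the names subst and free_occ are ours, and free_occ is assumed given):

```python
def subst(x, y, z, free_occ):
    """x: formula as a list of symbol codes; y: term as a list of symbol codes;
    z: code of a variable; free_occ(x, z, i) says whether the i-th symbol of x
    is a free occurrence of the variable coded by z."""
    result = []
    for i in range(len(x)):
        if free_occ(x, z, i):
            result.extend(y)          # replace the free occurrence by the term
        else:
            result.append(x[i])       # keep the symbol unchanged
    return result

# toy example: in "P(v0, v1)" replace the free occurrence of v0 by the term c0
formula = ['P', '(', 'v0', ',', 'v1', ')']
assert subst(formula, ['c0'], 'v0',
             lambda x, z, i: x[i] == z) == ['P', '(', 'c0', ',', 'v1', ')']
```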

Proposition 16.12. The relation FreeFor( x, y, z), which holds iff the term with Gödel
number y is free for the variable with Gödel number z in the formula with Gödel num-
ber x, is primitive recursive.

Proof. Exercise.

16.6 Derivations in Natural Deduction


In order to arithmetize derivations, we must represent derivations as num-
bers. Since derivations are trees of formulae where each inference carries one
or two labels, a recursive representation is the most obvious approach: we
represent a derivation as a tuple, the components of which are the number of
immediate sub-derivations leading to the premises of the last inference, the
representations of these sub-derivations, and the end-formula, the discharge
label of the last inference, and a number indicating the type of the last infer-
ence.

Definition 16.13. If δ is a derivation in natural deduction, then # δ# is defined


inductively as follows:

1. If δ consists only of the assumption φ, then # δ# is ⟨0, # φ# , n⟩. The num-


ber n is 0 if it is an undischarged assumption, and the numerical label
otherwise.

2. If δ ends in an inference with one, two, or three premises, then # δ# is

⟨1, # δ1 # , # φ# , n, k⟩,
⟨2, # δ1 # , # δ2 # , # φ# , n, k⟩, or
⟨3, # δ1 # , # δ2 # , # δ3 # , # φ# , n, k⟩,

respectively. Here δ1 , δ2 , δ3 are the sub-derivations ending in the premise(s)


of the last inference in δ, φ is the conclusion of the last inference in δ, n
is the discharge label of the last inference (0 if the inference does not dis-
charge any assumptions), and k is given by the following table according
to which rule was used in the last inference.


Rule: &Intro &Elim ∨Intro ∨Elim


k: 1 2 3 4
Rule: ⊃Intro ⊃Elim ∼Intro ∼Elim
k: 5 6 7 8
Rule: ⊥I ⊥C ∀Intro ∀Elim
k: 9 10 11 12
Rule: ∃Intro ∃Elim =Intro =Elim
k: 13 14 15 16

Example 16.14. Consider the very simple derivation

[ φ & ψ ]1
φ &Elim
1 ⊃Intro
( φ & ψ) ⊃ φ

The Gödel number of the assumption would be d0 = ⟨0, # φ & ψ# , 1⟩. The
Gödel number of the derivation ending in the conclusion of &Elim would
be d1 = ⟨1, d0 , # φ# , 0, 2⟩ (1 since &Elim has one premise, the Gödel num-
ber of conclusion φ, 0 because no assumption is discharged, and 2 is the
number coding &Elim). The Gödel number of the entire derivation then is
⟨1, d1 , # (( φ & ψ) ⊃ φ)# , 1, 5⟩, i.e.,

⟨1, ⟨1, ⟨0, # ( φ & ψ)# , 1⟩, # φ# , 0, 2⟩, # (( φ & ψ) ⊃ φ)# , 1, 5⟩.

Having settled on a representation of derivations, we must also show that


we can manipulate Gödel numbers of such derivations primitive recursively,
and express their essential properties and relations. Some operations are sim-
ple: e.g., given a Gödel number d of a derivation, EndFmla(d) = (d)(d)0 +1
gives us the Gödel number of its end-formula, DischargeLabel(d) = (d)(d)0 +2
gives us the discharge label and LastRule(d) = (d)(d)0 +3 the number indicat-
ing the type of the last inference. Some are much harder. We’ll at least sketch
how to do this. The goal is to show that the relation “δ is a derivation of φ
from Γ” is a primitive recursive relation of the Gödel numbers of δ and φ.
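To make the coding concrete, here is a small Python sketch (tuples in place of coded sequences, plain strings in place of Gödel numbers of formulae; both are simplifications, not the official coding). It mirrors the accessors just described and rebuilds the derivation of Example 16.14:

    def end_fmla(d):                 # EndFmla(d) = (d)_{(d)_0 + 1}
        return d[d[0] + 1]

    def discharge_label(d):          # DischargeLabel(d) = (d)_{(d)_0 + 2}
        return d[d[0] + 2]

    def last_rule(d):                # LastRule(d) = (d)_{(d)_0 + 3}
        return d[d[0] + 3]

    d0 = (0, "(φ & ψ)", 1)                      # the assumption, labelled 1
    d1 = (1, d0, "φ", 0, 2)                     # &Elim (k = 2), nothing discharged
    d  = (1, d1, "((φ & ψ) ⊃ φ)", 1, 5)         # ⊃Intro (k = 5), discharges label 1

    print(end_fmla(d))                          # ((φ & ψ) ⊃ φ)
    print(last_rule(d), discharge_label(d))     # 5 1
    print(end_fmla(d[1]))                       # φ, the end-formula of d1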

Proposition 16.15. The following relations are primitive recursive:

1. φ occurs as an assumption in δ with label n.

2. All assumptions in δ with label n are of the form φ (i.e., we can discharge the
assumption φ using label n in δ).

Proof. We have to show that the corresponding relations between Gödel num-
bers of formulae and Gödel numbers of derivations are primitive recursive.


1. We want to show that Assum( x, d, n), which holds if x is the Gödel num-
ber of an assumption of the derivation with Gödel number d labelled n,
is primitive recursive. This is the case if the derivation with Gödel num-
ber ⟨0, x, n⟩ is a sub-derivation of d. Note that the way we code deriva-
tions is a special case of the coding of trees introduced in section 15.12,
so the primitive recursive function SubtreeSeq(d) gives a sequence of
Gödel numbers of all sub-derivations of d (of length at most d). So we
can define
Assum( x, d, n) ⇔ (∃i < d) (SubtreeSeq(d))i = ⟨0, x, n⟩.

2. We want to show that Discharge( x, d, n), which holds if all assumptions
with label n in the derivation with Gödel number d are the formula
with Gödel number x, is primitive recursive. This relation holds iff (∀y < d) (Assum(y, d, n) ⊃
y = x ).

Proposition 16.16. The property Correct(d) which holds iff the last inference in the
derivation δ with Gödel number d is correct, is primitive recursive.

Proof. Here we have to show that for each rule of inference R the relation
FollowsByR (d) is primitive recursive, where FollowsByR (d) holds iff d is the
Gödel number of derivation δ, and the end-formula of δ follows by a correct
application of R from the immediate sub-derivations of δ.
A simple case is that of the &Intro rule. If δ ends in a correct &Intro infer-
ence, it looks like this:

δ1 δ2

φ ψ
&Intro
φ&ψ
Then the Gödel number d of δ is ⟨2, d1 , d2 , # ( φ & ψ)# , 0, k⟩ where EndFmla(d1 ) =
# φ# , EndFmla(d2 ) = # ψ# , n = 0, and k = 1. So we can define FollowsBy&Intro (d)
as

(d)0 = 2 & DischargeLabel(d) = 0 & LastRule(d) = 1 &


EndFmla(d) = # (# ⌢ EndFmla((d)1 ) ⌢ # &# ⌢ EndFmla((d)2 ) ⌢ # )# .
Another simple example is the =Intro rule. Here the premise is an empty
derivation, i.e., (d)1 = 0, and there is no discharge label, i.e., n = 0. However, φ must
be of the form t = t, for a closed term t. Here, a primitive recursive definition
is

(d)0 = 1 & (d)1 = 0 & DischargeLabel(d) = 0 &


(∃t < d) (ClTerm(t) & EndFmla(d) = # =(# ⌢ t ⌢ # ,# ⌢ t ⌢ # )# )


For a more complicated example, FollowsBy⊃Intro (d) holds iff the end-
formula of δ is of the form ( φ ⊃ ψ), where the end-formula of δ1 is ψ, and
any assumption in δ labelled n is of the form φ. We can express this primitive
recursively by

( d )0 = 1 &
(∃ a < d) (Discharge( a, (d)1 , DischargeLabel(d)) &
EndFmla(d) = (# (# ⌢ a ⌢ # ⊃# ⌢ EndFmla((d)1 ) ⌢ # )# ))

(Think of a as the Gödel number of φ).


For another example, consider ∃Intro. Here, the last inference in δ is correct
iff there is a formula φ, a closed term t and a variable x such that φ[t/x ] is
the end-formula of the derivation δ1 and ∃ x φ is the conclusion of the last
inference. So, FollowsBy∃Intro (d) holds iff

(d)0 = 1 & DischargeLabel(d) = 0 &


(∃ a < d) (∃ x < d) (∃t < d) (ClTerm(t) & Var( x ) &
Subst( a, t, x ) = EndFmla((d)1 ) & EndFmla(d) = (# ∃# ⌢ x ⌢ a)).

We then define Correct(d) as

Sent(EndFmla(d)) &
(LastRule(d) = 1 & FollowsBy&Intro (d)) ∨ · · · ∨
(LastRule(d) = 16 & FollowsBy=Elim (d)) ∨
(∃n < d) (∃ x < d) (d = ⟨0, x, n⟩).

The first line ensures that the end-formula of d is a sentence. The last line
covers the case where d is just an assumption.

Proposition 16.17. The relation Deriv(d) which holds if d is the Gödel number of a
correct derivation δ, is primitive recursive.

Proof. A derivation δ is correct if every one of its inferences is a correct ap-


plication of a rule, i.e., if every one of its sub-derivations ends in a correct
inference. So, Deriv(d) iff

(∀i < len(SubtreeSeq(d))) Correct((SubtreeSeq(d))i )

Proposition 16.18. The relation OpenAssum(z, d) that holds if z is the Gödel num-
ber of an undischarged assumption φ of the derivation δ with Gödel number d, is
primitive recursive.


Proof. An occurrence of an assumption is discharged if it occurs with label n


in a sub-derivation of δ that ends in a rule with discharge label n. So φ is
an undischarged assumption of δ if at least one of its occurrences is not dis-
charged in δ. We must be careful: δ may contain both discharged and undis-
charged occurrences of φ.
Consider a sequence δ0 , . . . , δk where δ0 = δ, δk is the assumption [ φ]n (for
some n), and δi+1 is an immediate sub-derivation of δi . If such a sequence
exists in which no δi ends in an inference with discharge label n, then φ is
an undischarged assumption of δ.
The primitive recursive function SubtreeSeq(d) provides us with a sequence
of Gödel numbers of all sub-derivations of δ. Any sequence of Gödel numbers
of sub-derivations of δ is a subsequence of it. Being a subsequence of is a prim-
itive recursive relation: Subseq(s, s′ ) holds iff (∀i < len(s)) (∃ j < len(s′ )) (s)i =
(s′ ) j . Being an immediate sub-derivation is as well: Subderiv(d, d′ ) iff (∃ j <
(d′ )0 ) d = (d′ ) j . So we can define OpenAssum(z, d) by

(∃s < SubtreeSeq(d)) (Subseq(s, SubtreeSeq(d)) & (s)0 = d &


(∃n < d) ((s)len(s)−̇1 = ⟨0, z, n⟩ &
(∀i < (len(s) −̇ 1)) (Subderiv((s)i+1 , (s)i ) &
DischargeLabel((s)i+1 ) ̸= n))).

Proposition 16.19. Suppose Γ is a primitive recursive set of sentences. Then the


relation PrfΓ ( x, y) expressing “x is the code of a derivation δ of φ from undischarged
assumptions in Γ and y is the Gödel number of φ” is primitive recursive.

Proof. Suppose “y ∈ Γ” is given by the primitive recursive predicate RΓ (y).


We have to show that PrfΓ ( x, y), which holds iff y is the Gödel number of
a sentence φ and x is the code of a natural deduction derivation with end-formula φ
and all undischarged assumptions in Γ, is primitive recursive.
By Proposition 16.17, the property Deriv( x ) which holds iff x is the Gödel
number of a correct derivation δ in natural deduction is primitive recursive.
Thus we can define PrfΓ ( x, y) by

PrfΓ ( x, y) ⇔ Deriv( x ) & EndFmla( x ) = y &


(∀z < x ) (OpenAssum(z, x ) ⊃ RΓ (z)).

Chapter 17

Representability in Q

17.1 Introduction
The incompleteness theorems apply to theories in which basic facts about
computable functions can be expressed and proved. We will describe a very
minimal such theory called “Q” (or, sometimes, “Robinson’s Q,” after Raphael
Robinson). We will say what it means for a function to be representable in Q,
and then we will prove the following:
A function is representable in Q if and only if it is computable.
For one thing, this provides us with another model of computability. But we
will also use it to show that the set { φ | Q ⊢ φ} is not decidable, by reducing
the halting problem to it. By the time we are done, we will have proved much
stronger things than this.
The language of Q is the language of arithmetic; Q consists of the fol-
lowing axioms (to be used in conjunction with the other axioms and rules of
first-order logic with identity predicate):

∀ x ∀y ( x ′ = y′ ⊃ x = y) (Q1 )
∀ x 0 ̸= x′ (Q2 )
∀ x ( x = 0 ∨ ∃y x = y′ ) (Q3 )
∀ x ( x + 0) = x (Q4 )
∀ x ∀y ( x + y′ ) = ( x + y)′ (Q5 )
∀ x ( x × 0) = 0 (Q6 )
∀ x ∀y ( x × y′ ) = (( x × y) + x ) (Q7 )
∀ x ∀y ( x < y ≡ ∃z (z′ + x ) = y) (Q8 )

For each natural number n, define the numeral n to be the term 0′′...′ where
there are n tick marks in all. So, 0 is the constant symbol 0 by itself, 1 is 0′ , 2 is
0′′ , etc.
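A one-line illustration, writing numerals as strings: the numeral for n is the constant symbol 0 followed by n strokes.

    def num(n):
        return "0" + "'" * n

    print(num(0), num(1), num(4))   # 0 0' 0''''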


As a theory of arithmetic, Q is extremely weak; for example, you can’t even


prove very simple facts like ∀ x x ̸= x ′ or ∀ x ∀y ( x + y) = (y + x ). But we will
see that much of the reason that Q is so interesting is because it is so weak. In
fact, it is just barely strong enough for the incompleteness theorem to hold.
Another reason Q is interesting is because it has a finite set of axioms.
A stronger theory than Q (called Peano arithmetic PA) is obtained by adding
a schema of induction to Q:

( φ(0) & ∀ x ( φ( x ) ⊃ φ( x ′ ))) ⊃ ∀ x φ( x )

where φ( x ) is any formula. If φ( x ) contains free variables other than x, we add


universal quantifiers to the front to bind all of them (so that the corresponding
instance of the induction schema is a sentence). For instance, if φ( x, y) also
contains the variable y free, the corresponding instance is

∀y (( φ(0) & ∀ x ( φ( x ) ⊃ φ( x ′ ))) ⊃ ∀ x φ( x ))

Using instances of the induction schema, one can prove much more from the
axioms of PA than from those of Q. In fact, it takes a good deal of work to
find “natural” statements about the natural numbers that can’t be proved in
Peano arithmetic!

Definition 17.1. A function f ( x0 , . . . , xk ) from the natural numbers to the nat-


ural numbers is said to be representable in Q if there is a formula φ f ( x0 , . . . , xk , y)
such that whenever f (n0 , . . . , nk ) = m, Q proves

1. φ f (n0 , . . . , nk , m)

2. ∀y ( φ f (n0 , . . . , nk , y) ⊃ m = y).

There are other ways of stating the definition; for example, we could equiv-
alently require that Q proves ∀y ( φ f (n0 , . . . , nk , y) ≡ y = m).

Theorem 17.2. A function is representable in Q if and only if it is computable.

There are two directions to proving the theorem. The left-to-right direction
is fairly straightforward once arithmetization of syntax is in place. The other
direction requires more work. Here is the basic idea: we pick “general recur-
sive” as a way of making “computable” precise, and show that every general
recursive function is representable in Q. Recall that a function is general re-
cursive if it can be defined from zero, the successor function succ, and the
projection functions Pin , using composition, primitive recursion, and regular
minimization. So one way of showing that every general recursive function is
representable in Q is to show that the basic functions are representable, and
whenever some functions are representable, then so are the functions defined
from them using composition, primitive recursion, and regular minimization.


In other words, we might show that the basic functions are representable, and
that the representable functions are “closed under” composition, primitive
recursion, and regular minimization. This guarantees that every general re-
cursive function is representable.
It turns out that the step where we would show that representable func-
tions are closed under primitive recursion is hard. In order to avoid this step,
we show first that in fact we can do without primitive recursion. That is, we
show that every general recursive function can be defined from basic func-
tions using composition and regular minimization alone. To do this, we show
that primitive recursion can actually be done by a specific regular minimiza-
tion. However, for this to work, we have to add some additional basic func-
tions: addition, multiplication, and the characteristic function of the identity
relation χ= . Then, we can prove the theorem by showing that all of these basic
functions are representable in Q, and the representable functions are closed
under composition and regular minimization.

17.2 Functions Representable in Q are Computable


We’ll prove that every function that is representable in Q is computable. We
first have to establish a lemma about functions representable in Q.

Lemma 17.3. If f ( x0 , . . . , xk ) is representable in Q by the formula φ f ( x0 , . . . , xk , y),
then
Q ⊢ φ f (n0 , . . . , nk , m) iff m = f (n0 , . . . , nk ).

Proof. The “if” part is Definition 17.1(1). The “only if” part is seen as follows:
Suppose Q ⊢ φ f (n0 , . . . , nk , m) but m ̸= f (n0 , . . . , nk ). Let l = f (n0 , . . . , nk ).
By Definition 17.1(1), Q ⊢ φ f (n0 , . . . , nk , l ). By Definition 17.1(2), Q ⊢ ∀y ( φ f (n0 , . . . , nk , y) ⊃
l = y). Using logic and the assumption that Q ⊢ φ f (n0 , . . . , nk , m), we get that
Q ⊢ l = m. On the other hand, by Lemma 17.14, Q ⊢ l ̸= m. So Q is incon-
sistent. But that is impossible, since Q is satisfied by the standard model (see
Definition 14.2), N ⊨ Q, and satisfiable theories are always consistent by the
Soundness Theorem (Corollary 9.29).

Lemma 17.4. Every function that is representable in Q is computable.

Proof. Let’s first give the intuitive idea for why this is true. To compute f , we
do the following. List all the possible derivations δ in the language of arith-
metic. This is possible to do mechanically. For each one, check if it is a deriva-
tion of a formula of the form φ f (n0 , . . . , nk , m) (the formula representing f in Q
from Lemma 17.3). If it is, m = f (n0 , . . . , nk ) by Lemma 17.3, and we’ve found
the value of f . The search terminates because Q ⊢ φ f (n0 , . . . , nk , f (n0 , . . . , nk )),
so eventually we find a δ of the right sort.


This is not quite precise because our procedure operates on derivations


and formulae instead of just on numbers, and we haven’t explained exactly
why “listing all possible derivations” is mechanically possible. But as we’ve
seen, it is possible to code terms, formulae, and derivations by Gödel num-
bers. We’ve also introduced a precise model of computation, the general re-
cursive functions. And we’ve seen that the relation PrfQ (d, y), which holds
iff d is the Gödel number of a derivation of the formula with Gödel num-
ber y from the axioms of Q, is (primitive) recursive. Other primitive recursive
functions we’ll need are num (Proposition 16.6) and Subst (Proposition 16.11).
From these, it is possible to define f by minimization; thus, f is recursive.
First, define

A ( n0 , . . . , n k , m ) =
Subst(Subst(. . . Subst(# φ f # , num(n0 ), # x0 # ),
. . . ), num(nk ), # xk # ), num(m), # y# )
This looks complicated, but it’s just the function A(n0 , . . . , nk , m) = # φ f (n0 , . . . , nk , m)# .
Now, consider the relation R(n0 , . . . , nk , s) which holds if (s)0 is the Gödel
number of a derivation from Q of φ f (n0 , . . . , nk , (s)1 ):
R ( n0 , . . . , n k , s ) iff PrfQ ((s)0 , A(n0 , . . . , nk , (s)1 ))
If we can find an s such that R(n0 , . . . , nk , s) holds, we have found a pair of
numbers—(s)0 and (s)1 —such that (s)0 is the Gödel number of a derivation
of φ f (n0 , . . . , nk , (s)1 ). So looking for s is like looking for the pair d and m
in the informal proof. And a computable function that “looks for” such an
s can be defined by regular minimization. Note that R is regular: for ev-
ery n0 , . . . , nk , there is a derivation δ of Q ⊢ φ f (n0 , . . . , nk , f (n0 , . . . , nk )), so
R(n0 , . . . , nk , s) holds for s = ⟨# δ# , f (n0 , . . . , nk )⟩. So, we can write f as
f (n0 , . . . , nk ) = (µs R(n0 , . . . , nk , s))1 .
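The following Python sketch shows the shape of this search. It is schematic only: prf_Q(d, y) and A(n0 , . . . , nk , m) are assumed as given (they would require the whole arithmetization machinery to implement), and a simple pairing function stands in for the sequence coding of chapter 15.

    def f_from_representation(prf_Q, A):
        def decode_pair(s):
            # stand-in for (s)_0 and (s)_1: invert the pairing (x+y)(x+y+1)/2 + x
            w = 0
            while (w + 1) * (w + 2) // 2 <= s:
                w += 1
            d = s - w * (w + 1) // 2
            return d, w - d

        def f(*ns):
            s = 0
            while True:                      # regular minimization: mu s [R(n0, ..., nk, s)]
                d, m = decode_pair(s)
                if prf_Q(d, A(*ns, m)):      # d codes a Q-derivation of phi_f(n0, ..., nk, m)
                    return m                 # so m = f(n0, ..., nk) by Lemma 17.3
                s += 1
        return f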

17.3 The Beta Function Lemma


In order to show that we can carry out primitive recursion if addition, multi-
plication, and χ= are available, we need to develop functions that handle se-
quences. (If we had exponentiation as well, our task would be easier.) When
we had primitive recursion, we could define things like the “n-th prime,”
and pick a fairly straightforward coding. But here we do not have primitive
recursion—in fact we want to show that we can do primitive recursion using
minimization—so we need to be more clever.
Lemma 17.5. There is a function β(d, i ) such that for every sequence a0 , . . . , an there
is a number d, such that for every i ≤ n, β(d, i ) = ai . Moreover, β can be defined
from the basic functions using just composition and regular minimization.


Think of d as coding the sequence ⟨ a0 , . . . , an ⟩, and β(d, i ) returning the


i-th element. (Note that this “coding” does not use the power-of-primes cod-
ing we’re already familiar with!). The lemma is fairly minimal; it doesn’t say
we can concatenate sequences or append elements, or even that we can com-
pute d from a0 , . . . , an using functions definable by composition and regular
minimization. All it says is that there is a “decoding” function such that every
sequence is “coded.”
The use of the notation β is Gödel’s. To repeat, the hard part of proving
the lemma is defining a suitable β using the seemingly restricted resources,
i.e., using just composition and minimization—however, we’re allowed to use
addition, multiplication, and χ= . There are various ways to prove this lemma,
but one of the cleanest is still Gödel’s original method, which used a number-
theoretic fact called Sunzi’s Theorem (traditionally, the “Chinese Remainder
Theorem”).

Definition 17.6. Two natural numbers a and b are relatively prime iff their great-
est common divisor is 1; in other words, they have no other divisors in com-
mon.

Definition 17.7. Natural numbers a and b are congruent modulo c, a ≡ b mod c,


iff c | ( a − b), i.e., a and b have the same remainder when divided by c.

Here is Sunzi’s Theorem:

Theorem 17.8. Suppose x0 , . . . , xn are (pairwise) relatively prime. Let y0 , . . . , yn be


any numbers. Then there is a number z such that

z ≡ y0 mod x0
z ≡ y1 mod x1
..
.
z ≡ yn mod xn .

Here is how we will use Sunzi’s Theorem: if x0 , . . . , xn are bigger than y0 ,


. . . , yn respectively, then we can take z to code the sequence ⟨y0 , . . . , yn ⟩. To
recover yi , we need only divide z by xi and take the remainder. To use this
coding, we will need to find suitable values for x0 , . . . , xn .
A couple of observations will help us in this regard. Given y0 , . . . , yn , let

j = max(n, y0 , . . . , yn ) + 1,


and let

x0 = 1 + j !
x1 = 1 + 2 · j !
x2 = 1 + 3 · j !
..
.
x n = 1 + ( n + 1) · j !

Then two things are true:

1. x0 , . . . , xn are relatively prime.

2. For each i, yi < xi .

To see that (1) is true, note that if p is a prime number and p | xi and p | xk ,
then p | 1 + (i + 1) j ! and p | 1 + (k + 1) j !. But then p divides their difference,

(1 + (i + 1) j !) − (1 + (k + 1) j !) = (i − k) j !.

Since p divides 1 + (i + 1) j !, it can’t divide j ! as well (otherwise, the first


division would leave a remainder of 1). So p divides i − k, since p divides
(i − k) j !. But i ̸= k, |i − k| is at most n, and we have chosen j > n, so p ≤ |i − k| < j.
Hence p is among the factors of j !, i.e., p | j !, again a contradiction. So there is no prime number dividing both
xi and xk . Clause (2) is easy: we have yi < j < j ! < xi .
Now let us prove the β function lemma. Remember that we can use 0,
successor, plus, times, χ= , projections, and any function defined from them
using composition and minimization applied to regular functions. We can
also use a relation if its characteristic function is so definable. As before we
can show that these relations are closed under Boolean combinations and
bounded quantification; for example:

not( x ) = χ= ( x, 0)
(min x ≤ z) R( x, y) = µx ( R( x, y) ∨ x = z)
(∃ x ≤ z) R( x, y) ⇔ R((min x ≤ z) R( x, y), y)

We can then show that all of the following are also definable without primitive
recursion:

1. The pairing function, J ( x, y) = 1/2 [( x + y)( x + y + 1)] + x;

2. the projection functions

K (z) = (min x ≤ z) (∃y ≤ z) z = J ( x, y),


L(z) = (min y ≤ z) (∃ x ≤ z) z = J ( x, y);


3. the less-than relation x < y;

4. the divisibility relation x | y;

5. the function rem( x, y) which returns the remainder when y is divided


by x.

Now define

β∗ (d0 , d1 , i ) = rem(1 + (i + 1)d1 , d0 ) and


β(d, i ) = β∗ (K (d), L(d), i ).

This is the function we want. Given a0 , . . . , an as above, let

j = max(n, a0 , . . . , an ) + 1,

and let d1 = j !. By (1) above, we know that 1 + d1 , 1 + 2d1 , . . . , 1 + (n + 1)d1


are relatively prime, and by (2) that all are greater than a0 , . . . , an . By Sunzi’s
Theorem there is a value d0 such that for each i,

d0 ≡ a i mod (1 + (i + 1)d1 )

and so (because d1 is greater than ai ),

ai = rem(1 + (i + 1)d1 , d0 ).

Let d = J (d0 , d1 ). Then for each i ≤ n, we have

β(d, i ) = β∗ (d0 , d1 , i )
= rem(1 + (i + 1)d1 , d0 )
= ai

which is what we need. This completes the proof of the β-function lemma.
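A small numerical check of the construction (a Python sketch, not part of the formal development): d0 is computed by the usual constructive proof of Sunzi's Theorem, using Python's pow(m, -1, mod) for modular inverses, and K, L are computed in closed form rather than by the bounded searches above.

    from math import factorial, isqrt

    def J(x, y):                              # the pairing function
        return (x + y) * (x + y + 1) // 2 + x

    def unJ(z):                               # the projections K(z), L(z)
        w = (isqrt(8 * z + 1) - 1) // 2
        x = z - w * (w + 1) // 2
        return x, w - x

    def beta(d, i):                           # beta(d, i) = rem(1 + (i+1) d1, d0)
        d0, d1 = unJ(d)
        return d0 % (1 + (i + 1) * d1)

    def code(seq):
        # find some d with beta(d, i) = seq[i]; Sunzi's Theorem guarantees one exists
        n = len(seq) - 1
        j = max([n] + list(seq)) + 1
        d1 = factorial(j)
        moduli = [1 + (i + 1) * d1 for i in range(n + 1)]   # pairwise relatively prime
        d0, m = 0, 1
        for a, mod in zip(seq, moduli):       # combine the congruences one by one
            inv = pow(m, -1, mod)             # m and mod are relatively prime
            d0 = (d0 + m * ((a - d0) * inv % mod)) % (m * mod)
            m *= mod
        return J(d0, d1)

    d = code([3, 1, 4, 1, 5])
    print([beta(d, i) for i in range(5)])     # [3, 1, 4, 1, 5]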

17.4 Simulating Primitive Recursion


Now we can show that definition by primitive recursion can be “simulated”
by regular minimization using the beta function. Suppose we have f (⃗x ) and
g(⃗x, y, z). Then the function h(⃗x, y) defined from f and g by primitive recur-
sion is

h(⃗x, 0) = f (⃗x )
h(⃗x, y + 1) = g(⃗x, y, h(⃗x, y)).

We need to show that h can be defined from f and g using just composition
and regular minimization, using the basic functions and functions defined
from them using composition and regular minimization (such as β).


Lemma 17.9. If h can be defined from f and g using primitive recursion, it can be
defined from f , g, the functions zero, succ, Pin , add, mult, χ= , using composition
and regular minimization.

Proof. First, define an auxiliary function ĥ(⃗x, y) which returns the least num-
ber d such that d codes a sequence which satisfies

1. (d)0 = f (⃗x ), and

2. for each i < y, (d)i+1 = g(⃗x, i, (d)i ),

where now (d)i is short for β(d, i ). In other words, ĥ returns the sequence
⟨h(⃗x, 0), h(⃗x, 1), . . . , h(⃗x, y)⟩. We can write ĥ as

ĥ(⃗x, y) = µd ( β(d, 0) = f (⃗x ) & (∀i < y) β(d, i + 1) = g(⃗x, i, β(d, i ))).

Note: no primitive recursion is needed here, just minimization. The function


we minimize is regular because of the beta function lemma Lemma 17.5.
But now we have
h(⃗x, y) = β(ĥ(⃗x, y), y),
so h can be defined from the basic functions using just composition and regu-
lar minimization.

17.5 Basic Functions are Representable in Q


First we have to show that all the basic functions are representable in Q. In the
end, we need to show how to assign to each k-ary basic function f ( x0 , . . . , xk−1 )
a formula φ f ( x0 , . . . , xk−1 , y) that represents it.
We will be able to represent zero, successor, plus, times, the characteristic
function for equality, and projections. In each case, the appropriate represent-
ing formula is entirely straightforward; for example, zero is represented by
the formula y = 0, successor is represented by the formula x0′ = y, and addi-
tion is represented by the formula ( x0 + x1 ) = y. The work involves showing
that Q can prove the relevant sentences; for example, saying that addition
is represented by the formula above involves showing that for every pair of
natural numbers m and n, Q proves

n + m = n + m and
∀y ((n + m) = y ⊃ y = n + m).

Proposition 17.10. The zero function zero( x ) = 0 is represented in Q by φzero ( x, y) ≡


y = 0.

Proposition 17.11. The successor function succ( x ) = x + 1 is represented in Q by


φsucc ( x, y) ≡ y = x ′ .


Proposition 17.12. The projection function Pin ( x0 , . . . , xn−1 ) = xi is represented


in Q by
φ Pin ( x0 , . . . , xn−1 , y) ≡ y = xi .

Proposition 17.13. The characteristic function of =,


χ= ( x0 , x1 ) = 1 if x0 = x1 , and χ= ( x0 , x1 ) = 0 otherwise,

is represented in Q by

φ χ = ( x0 , x1 , y ) ≡ ( x0 = x1 & y = 1) ∨ ( x0 ̸ = x1 & y = 0).

The proof requires the following lemma.

Lemma 17.14. Given natural numbers n and m, if n ̸= m, then Q ⊢ n ̸= m.

Proof. Use induction on n to show that for every m, if n ̸= m, then Q ⊢ n ̸= m.


In the base case, n = 0. If m is not equal to 0, then m = k + 1 for some
natural number k. We have an axiom that says ∀ x 0 ̸= x ′ . By a quantifier
axiom, replacing x by k, we can conclude 0 ̸= k′ . But k′ is just m.
In the induction step, we can assume the claim is true for n, and consider
n + 1. Let m be any natural number. There are two possibilities: either m = 0
or for some k we have m = k + 1. The first case is handled as above. In the
second case, suppose n + 1 ̸= k + 1. Then n ̸= k. By the induction hypothesis
for n we have Q ⊢ n ̸= k. We have an axiom that says ∀ x ∀y x ′ = y′ ⊃ x = y.

Using a quantifier axiom, we have n′ = k′ ⊃ n = k. Using propositional
logic, we can conclude, in Q, n ̸= k ⊃ n′ ̸= k′ . Using modus ponens, we can
conclude n′ ̸= k′ , which is what we want, since k′ is m.

Note that the lemma does not say much: in essence it says that Q can
prove that different numerals denote different objects. For example, Q proves
0′′ ̸= 0′′′ . But showing that this holds in general requires some care. Note also
that although we are using induction, it is induction outside of Q.

Proof of Proposition 17.13. If n = m, then n and m are the same term, and
χ= (n, m) = 1. But Q ⊢ (n = m & 1 = 1), so it proves φ= (n, m, 1). If n ̸= m,
then χ= (n, m) = 0. By Lemma 17.14, Q ⊢ n ̸= m and so also (n ̸= m & 0 = 0).
Thus Q ⊢ φ= (n, m, 0).
For the second part, we also have two cases. If n = m, we have to show
that Q ⊢ ∀y ( φ= (n, m, y) ⊃ y = 1). Arguing informally, suppose φ= (n, m, y),
i.e.,
( n = n & y = 1) ∨ ( n ̸ = n & y = 0)
The left disjunct implies y = 1 by logic; the right contradicts n = n which is
provable by logic.


Suppose, on the other hand, that n ̸= m. Then φ= (n, m, y) is

( n = m & y = 1) ∨ ( n ̸ = m & y = 0)

Here, the left disjunct contradicts n ̸= m, which is provable in Q by Lemma 17.14;


the right disjunct entails y = 0.

Proposition 17.15. The addition function add( x0 , x1 ) = x0 + x1 is represented


in Q by
φadd ( x0 , x1 , y) ≡ y = ( x0 + x1 ).

Lemma 17.16. Q ⊢ (n + m) = n + m

Proof. We prove this by induction on m. If m = 0, the claim is that Q ⊢ (n +


0) = n. This follows by axiom Q4 . Now suppose the claim for m; let’s prove
the claim for m + 1, i.e., prove that Q ⊢ (n + m + 1) = n + m + 1. Note that

m + 1 is just m′ , and n + m + 1 is just n + m ′ . By axiom Q5 , Q ⊢ (n + m′ ) =
(n + m)′ . By induction hypothesis, Q ⊢ (n + m) = n + m. So Q ⊢ (n + m′ ) =
n + m ′ .
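For instance, the case n = 2, m = 1 unwinds to a concrete derivation: by axiom Q5 , Q ⊢ (0′′ + 0′ ) = (0′′ + 0)′ ; by axiom Q4 , Q ⊢ (0′′ + 0) = 0′′ , hence Q ⊢ (0′′ + 0)′ = 0′′′ ; so Q ⊢ (0′′ + 0′ ) = 0′′′ , i.e., Q ⊢ (2 + 1) = 3.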

Proof of Proposition 17.15. The formula φadd ( x0 , x1 , y) representing add is y =


( x0 + x1 ). First we show that if add(n, m) = k, then Q ⊢ φadd (n, m, k), i.e.,
Q ⊢ k = (n + m). But since k = n + m, k just is n + m, and we’ve shown in
Lemma 17.16 that Q ⊢ (n + m) = n + m.
We also have to show that if add(n, m) = k, then

Q ⊢ ∀y ( φadd (n, m, y) ⊃ y = k).

Suppose we have (n + m) = y. Since

Q ⊢ (n + m) = n + m,

we can replace the left side with n + m and get n + m = y, for arbitrary y.

Proposition 17.17. The multiplication function mult( x0 , x1 ) = x0 · x1 is repre-


sented in Q by
φmult ( x0 , x1 , y) ≡ y = ( x0 × x1 ).

Proof. Exercise.

Lemma 17.18. Q ⊢ (n × m) = n · m

Proof. Exercise.


Recall that we use × for the function symbol of the language of arith-
metic, and · for the ordinary multiplication operation on numbers. So · can
appear between expressions for numbers (such as in m · n) while × appears
only between terms of the language of arithmetic (such as in (m × n)). Even
more confusingly, + is used for both the function symbol and the addition
operation. When it appears between terms—e.g., in (n + m)—it is the 2-place
function symbol of the language of arithmetic, and when it appears between
numbers—e.g., in n + m—it is the addition operation. This includes the case
n + m: this is the standard numeral corresponding to the number n + m.

17.6 Composition is Representable in Q


Suppose h is defined by

h( x0 , . . . , xl −1 ) = f ( g0 ( x0 , . . . , xl −1 ), . . . , gk−1 ( x0 , . . . , xl −1 )).

where we have already found formulae φ f , φ g0 , . . . , φ gk−1 representing the


functions f , and g0 , . . . , gk−1 , respectively. We have to find a formula φh rep-
resenting h.
Let’s start with a simple case, where all functions are 1-place, i.e., consider
h( x ) = f ( g( x )). If φ f (y, z) represents f , and φ g ( x, y) represents g, we need
a formula φh ( x, z) that represents h. Note that h( x ) = z iff there is a y such
that both z = f (y) and y = g( x ). (If h( x ) = z, then g( x ) is such a y; if such a
y exists, then since y = g( x ) and z = f (y), z = f ( g( x )).) This suggests that
∃y ( φ g ( x, y) & φ f (y, z)) is a good candidate for φh ( x, z). We just have to verify
that Q proves the relevant formulae.
Proposition 17.19. If h(n) = m, then Q ⊢ φh (n, m).

Proof. Suppose h(n) = m, i.e., f ( g(n)) = m. Let k = g(n). Then

Q ⊢ φ g (n, k )

since φ g represents g, and

Q ⊢ φ f (k, m)

since φ f represents f . Thus,

Q ⊢ φ g (n, k ) & φ f (k, m)

and consequently also

Q ⊢ ∃y ( φ g (n, y) & φ f (y, m)),

i.e., Q ⊢ φh (n, m).


Proposition 17.20. If h(n) = m, then Q ⊢ ∀z ( φh (n, z) ⊃ z = m).

Proof. Suppose h(n) = m, i.e., f ( g(n)) = m. Let k = g(n). Then

Q ⊢ ∀y ( φ g (n, y) ⊃ y = k )

since φ g represents g, and

Q ⊢ ∀z ( φ f (k, z) ⊃ z = m)

since φ f represents f . Using just a little bit of logic, we can show that also

Q ⊢ ∀z (∃y ( φ g (n, y) & φ f (y, z)) ⊃ z = m).

i.e., Q ⊢ ∀y ( φh (n, y) ⊃ y = m).

The same idea works in the more complex case where f and gi have arity
greater than 1.

Proposition 17.21. If φ f (y0 , . . . , yk−1 , z) represents f (y0 , . . . , yk−1 ) in Q, and φ gi ( x0 , . . . , xl −1 , y)


represents gi ( x0 , . . . , xl −1 ) in Q, then

∃ y 0 . . . ∃ y k − 1 ( φ g0 ( x 0 , . . . , x l − 1 , y 0 ) & · · · &
φ gk−1 ( x0 , . . . , xl −1 , yk−1 ) & φ f (y0 , . . . , yk−1 , z))

represents

h( x0 , . . . , xl −1 ) = f ( g0 ( x0 , . . . , xl −1 ), . . . , gk−1 ( x0 , . . . , xl −1 )).

Proof. Exercise.

17.7 Regular Minimization is Representable in Q


Let’s consider unbounded search. Suppose g( x, z) is regular and representable
in Q, say by the formula φ g ( x, z, y). Let f be defined by f (z) = µx [ g( x, z) =
0]. We would like to find a formula φ f (z, y) representing f . The value of f (z)
is that number x which (a) satisfies g( x, z) = 0 and (b) is the least such, i.e.,
for any w < x, g(w, z) ̸= 0. So the following is a natural choice:

φ f (z, y) ≡ φ g (y, z, 0) & ∀w (w < y ⊃ ∼ φ g (w, z, 0)).

In the general case, of course, we would have to replace z with z0 , . . . , zk .


The proof, again, will involve some lemmas about things Q is strong enough
to prove.


Lemma 17.22. For every constant symbol a and every natural number n,

Q ⊢ ( a′ + n) = ( a + n)′ .

Proof. The proof is, as usual, by induction on n. In the base case, n = 0, we


need to show that Q proves ( a′ + 0) = ( a + 0)′ . But we have:

Q ⊢ ( a ′ + 0) = a ′ by axiom Q4 (17.1)
Q ⊢ ( a + 0) = a by axiom Q4 (17.2)
Q ⊢ ( a + 0)′ = a′ by eq. (17.2) (17.3)
Q ⊢ ( a′ + 0) = ( a + 0)′ by eq. (17.1) and eq. (17.3)

In the induction step, we can assume that we have shown that Q ⊢ ( a′ + n) =


( a + n)′ . Since n + 1 is n′ , we need to show that Q proves ( a′ + n′ ) = ( a + n′ )′ .
We have:

Q ⊢ ( a′ + n′ ) = ( a′ + n)′ by axiom Q5 (17.4)


Q ⊢ ( a′ + n)′ = ( a + n′ )′ inductive hypothesis (17.5)
Q ⊢ ( a′ + n′ ) = ( a + n′ )′ by eq. (17.4) and eq. (17.5).

It is again worth mentioning that this is weaker than saying that Q proves
∀ x ∀y ( x ′ + y) = ( x + y)′ . Although this sentence is true in N, Q does not
prove it.

Lemma 17.23. Q ⊢ ∀ x ∼ x < 0.

Proof. We give the proof informally (i.e., only giving hints as to how to con-
struct the formal derivation).
We have to prove ∼ a < 0 for an arbitrary a. By the definition of <, we
need to prove ∼∃y (y′ + a) = 0 in Q. We’ll assume ∃y (y′ + a) = 0 and prove a
contradiction. Suppose (b′ + a) = 0. Using Q3 , we have that a = 0 ∨ ∃y a = y′ .
We distinguish cases.
Case 1: a = 0 holds. From (b′ + a) = 0, we have (b′ + 0) = 0. By axiom Q4
of Q, we have (b′ + 0) = b′ , and hence b′ = 0. But by axiom Q2 we also have
b′ ̸= 0, a contradiction.
Case 2: For some c, a = c′ . But then we have (b′ + c′ ) = 0. By axiom Q5 ,
we have (b′ + c)′ = 0, again contradicting axiom Q2 .

Lemma 17.24. For every natural number n,

Q ⊢ ∀ x ( x < n + 1 ⊃ ( x = 0 ∨ · · · ∨ x = n)).

Proof. We use induction on n. Let us consider the base case, when n = 0. In


that case, we need to show a < 1 ⊃ a = 0, for arbitrary a. Suppose a < 1.
Then by the defining axiom for <, we have ∃y (y′ + a) = 0′ (since 1 ≡ 0′ ).


Suppose b has that property, i.e., we have (b′ + a) = 0′ . We need to show


a = 0. By axiom Q3 , we have either a = 0 or that there is a c such that a = c′ .
In the former case, there is nothing to show. So suppose a = c′ . Then we have
(b′ + c′ ) = 0′ . By axiom Q5 of Q, we have (b′ + c)′ = 0′ . By axiom Q1 , we
have (b′ + c) = 0. But this means, by axiom Q8 , that c < 0, contradicting
Lemma 17.23.
Now for the inductive step. We prove the case for n + 1, assuming the case
for n. So suppose a < n + 2. Again using Q3 we can distinguish two cases:
a = 0 and for some c, a = c′ . In the first case, a = 0 ∨ · · · ∨ a = n + 1 follows
trivially. In the second case, we have c′ < n + 2, i.e., c′ < n + 1 ′ . By axiom Q8 ,
for some d, (d′ + c′ ) = n + 1 ′ . By axiom Q5 , (d′ + c)′ = n + 1 ′ . By axiom Q1 ,
(d′ + c) = n + 1, and so c < n + 1 by axiom Q8 . By inductive hypothesis,
c = 0 ∨ · · · ∨ c = n. From this, we get c′ = 0′ ∨ · · · ∨ c′ = n′ by logic, and so
a = 1 ∨ · · · ∨ a = n + 1 since a = c′ .

Lemma 17.25. For every natural number m,

Q ⊢ ∀y ((y < m ∨ m < y) ∨ y = m).

Proof. By induction on m. First, consider the case m = 0. Q ⊢ ∀y (y = 0 ∨


∃z y = z′ ) by Q3 . Let a be arbitrary. Then either a = 0 or for some b, a = b′ .
In the former case, we also have ( a < 0 ∨ 0 < a) ∨ a = 0. But if a = b′ ,
then (b′ + 0) = ( a + 0) by the logic of =. By Q4 , ( a + 0) = a, so we have
(b′ + 0) = a, and hence ∃z (z′ + 0) = a. By the definition of < in Q8 , 0 < a. If
0 < a, then also (0 < a ∨ a < 0) ∨ a = 0.
Now suppose we have

Q ⊢ ∀y ((y < m ∨ m < y) ∨ y = m)

and we want to show

Q ⊢ ∀y ((y < m + 1 ∨ m + 1 < y) ∨ y = m + 1)

Let a be arbitrary. By Q3 , either a = 0 or for some b, a = b′ . In the first case,


we have m′ + a = m + 1 by Q4 , and so a < m + 1 by Q8 .
Now consider the second case, a = b′ . By the induction hypothesis, (b <
m ∨ m < b) ∨ b = m.
The first disjunct b < m is equivalent (by Q8 ) to ∃z (z′ + b) = m. Suppose
c has this property. If (c′ + b) = m, then also (c′ + b)′ = m′ . By Q5 , (c′ + b)′ =
(c′ + b′ ). Hence, (c′ + b′ ) = m′ . We get ∃u (u′ + b′ ) = m + 1 by existentially
generalizing on c′ and keeping in mind that m′ ≡ m + 1. Hence, if b < m then
b′ < m + 1 and so a < m + 1.
Now suppose m < b, i.e., ∃z (z′ + m) = b. Suppose c is such a z, i.e.,
(c′ + m) = b. By logic, (c′ + m)′ = b′ . By Q5 , (c′ + m′ ) = b′ . Since a = b′ and

m′ ≡ m + 1, (c′ + m + 1) = a. By Q8 , m + 1 < a.


Finally, assume b = m. Then, by logic, b′ = m′ , and so a = m + 1.


Hence, from each disjunct of the case for m and b, we can obtain the corre-
sponding disjunct for m + 1 and a.

Proposition 17.26. If φ g ( x, z, y) represents g( x, z) in Q, then

φ f (z, y) ≡ φ g (y, z, 0) & ∀w (w < y ⊃ ∼ φ g (w, z, 0))

represents f (z) = µx [ g( x, z) = 0].

Proof. First we show that if f (n) = m, then Q ⊢ φ f (n, m), i.e.,

Q ⊢ φ g (m, n, 0) & ∀w (w < m ⊃ ∼ φ g (w, n, 0)).

Since φ g ( x, z, y) represents g( x, z) and g(m, n) = 0 if f (n) = m, we have

Q ⊢ φ g (m, n, 0).

If f (n) = m, then for every k < m, g(k, n) ̸= 0. So

Q ⊢ ∼ φ g (k, n, 0).

We get that

Q ⊢ ∀w (w < m ⊃ ∼ φ g (w, n, 0)). (17.6)

by Lemma 17.23 in case m = 0 and by Lemma 17.24 otherwise.


Now let’s show that if f (n) = m, then Q ⊢ ∀y ( φ f (n, y) ⊃ y = m). We
again sketch the argument informally, leaving the formalization to the reader.
Suppose φ f (n, b). From this we get (a) φ g (b, n, 0) and (b) ∀w (w < b ⊃
∼ φ g (w, n, 0)). By Lemma 17.25, (b < m ∨ m < b) ∨ b = m. We’ll show that
both b < m and m < b lead to a contradiction.
If m < b, then ∼ φ g (m, n, 0) from (b). But m = f (n), so g(m, n) = 0, and so
Q ⊢ φ g (m, n, 0) since φ g represents g. So we have a contradiction.
Now suppose b < m. Then since Q ⊢ ∀w (w < m ⊃ ∼ φ g (w, n, 0)) by
eq. (17.6), we get ∼ φ g (b, n, 0). This again contradicts (a).

17.8 Computable Functions are Representable in Q


Theorem 17.27. Every computable function is representable in Q.

Proof. For definiteness, and using the Church–Turing Thesis, let’s say that a
function is computable iff it is general recursive. The general recursive func-
tions are those which can be defined from the zero function zero, the successor


function succ, and the projection function Pin using composition, primitive re-
cursion, and regular minimization. By Lemma 17.9, any function h that can
be defined from f and g can also be defined using composition and regular
minimization from f , g, and zero, succ, Pin , add, mult, χ= . Consequently, a
function is general recursive iff it can be defined from zero, succ, Pin , add,
mult, χ= using composition and regular minimization.
We’ve furthermore shown that the basic functions in question are rep-
resentable in Q (Propositions 17.10 to 17.13, 17.15 and 17.17), and that any
function defined from representable functions by composition or regular min-
imization (Proposition 17.21, Proposition 17.26) is also representable. Thus
every general recursive function is representable in Q.

We have shown that the set of computable functions can be characterized


as the set of functions representable in Q. In fact, the proof is more general.
From the definition of representability, it is not hard to see that any theory
extending Q (or in which one can interpret Q) can represent the computable
functions. But, conversely, in any derivation system in which the notion of
derivation is computable, every representable function is computable. So,
for example, the set of computable functions can be characterized as the set
of functions representable in Peano arithmetic, or even Zermelo–Fraenkel set
theory. As Gödel noted, this is somewhat surprising. We will see that when
it comes to provability, questions are very sensitive to which theory you con-
sider; roughly, the stronger the axioms, the more you can prove. But across a
wide range of axiomatic theories, the representable functions are exactly the
computable ones; stronger theories do not represent more functions as long as
they are axiomatizable.

17.9 Representing Relations


Let us say what it means for a relation to be representable.
Definition 17.28. A relation R( x0 , . . . , xk ) on the natural numbers is representable
in Q if there is a formula φ R ( x0 , . . . , xk ) such that whenever R(n0 , . . . , nk ) is
true, Q proves φ R (n0 , . . . , nk ), and whenever R(n0 , . . . , nk ) is false, Q proves
∼ φ R ( n0 , . . . , n k ).

Theorem 17.29. A relation is representable in Q if and only if it is computable.

Proof. For the forwards direction, suppose R( x0 , . . . , xk ) is represented by the


formula φ R ( x0 , . . . , xk ). Here is an algorithm for computing R: on input n0 ,
. . . , nk , simultaneously search for a proof of φ R (n0 , . . . , nk ) and a proof of
∼ φ R (n0 , . . . , nk ). By our hypothesis, the search is bound to find one or the
other; if it is the first, report “yes,” and otherwise, report “no.”
In the other direction, suppose R( x0 , . . . , xk ) is computable. By definition,
this means that the function χ R ( x0 , . . . , xk ) is computable. By Theorem 17.2,


χ R is represented by a formula, say φχR ( x0 , . . . , xk , y). Let φ R ( x0 , . . . , xk ) be


the formula φχR ( x0 , . . . , xk , 1). Then for any n0 , . . . , nk , if R(n0 , . . . , nk ) is true,
then χ R (n0 , . . . , nk ) = 1, in which case Q proves φχR (n0 , . . . , nk , 1), and so
Q proves φ R (n0 , . . . , nk ). On the other hand, if R(n0 , . . . , nk ) is false, then
χ R (n0 , . . . , nk ) = 0. This means that Q proves
∀ y ( φ χ R ( n0 , . . . , n k , y ) ⊃ y = 0).
Since Q proves 0 ̸= 1, Q proves ∼ φχR (n0 , . . . , nk , 1), and so it proves ∼ φ R (n0 , . . . , nk ).

17.10 Undecidability
We call a theory T undecidable if there is no computational procedure which, af-
ter finitely many steps, unfailingly provides a correct answer to the ques-
tion “does T prove φ?” for any sentence φ in the language of T. So Q would
be decidable iff there were a computational procedure which decides, given a
sentence φ in the language of arithmetic, whether Q ⊢ φ or not. We can make
this more precise by asking: Is the relation ProvQ (y), which holds of y iff y is
the Gödel number of a sentence provable in Q, recursive? The answer is: no.
Theorem 17.30. Q is undecidable, i.e., the relation
ProvQ (y) ⇔ Sent(y) & ∃ x PrfQ ( x, y)
is not recursive.

Proof. Suppose it were. Then we could solve the halting problem as follows:
Given e and n, we know that φe (n) ↓ iff there is an s such that T (e, n, s), where
T is Kleene’s predicate from Theorem 15.28. Since T is primitive recursive it
is representable in Q by a formula ψT , that is, Q ⊢ ψT (e, n, s) iff T (e, n, s). If
Q ⊢ ψT (e, n, s) then also Q ⊢ ∃y ψT (e, n, y). If no such s exists, then Q ⊢
∼ψT (e, n, s) for every s. But Q is ω-consistent, i.e., if Q ⊢ ∼ φ(n) for every n ∈
N, then Q ⊬ ∃y φ(y). We know this because the axioms of Q are true in the
standard model N. So, Q ⊬ ∃y ψT (e, n, y). In other words, Q ⊢ ∃y ψT (e, n, y)
iff there is an s such that T (e, n, s), i.e., iff φe (n) ↓. From e and n we can
compute # ∃y ψT (e, n, y)# ; let g(e, n) be the primitive recursive function which
does that. So h(e, n) = 1 if ProvQ ( g(e, n)), and h(e, n) = 0 otherwise.
This would show that h is recursive if ProvQ is. But h is not recursive, by
Theorem 15.29, so ProvQ cannot be either.
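Schematically, the reduction looks as follows (a sketch only: dec_prov_Q stands for the supposed decision procedure for ProvQ and g_code for the primitive recursive function g; the point of the theorem is that dec_prov_Q cannot exist).

    def halting_decider(e, n, dec_prov_Q, g_code):
        # Q proves "exists y psi_T(e, n, y)" iff phi_e(n) halts, so a decision
        # procedure for Prov_Q would decide the halting problem
        return dec_prov_Q(g_code(e, n))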

Corollary 17.31. First-order logic is undecidable.

Proof. If first-order logic were decidable, provability in Q would be as well,


since Q ⊢ φ iff ⊢ ω ⊃ φ, where ω is the conjunction of the axioms of Q.

Chapter 18

Incompleteness and Provability

18.1 Introduction

Hilbert thought that a system of axioms for a mathematical structure, such


as the natural numbers, is inadequate unless it allows one to derive all true
statements about the structure. Combined with his later interest in formal
systems of deduction, this suggests that he thought that we should guarantee
that, say, the formal system we are using to reason about the natural numbers
is not only consistent, but also complete, i.e., every statement in its language
is either derivable or its negation is. Gödel’s first incompleteness theorem
shows that no such system of axioms exists: there is no complete, consistent,
axiomatizable formal system for arithmetic. In fact, no “sufficiently strong,”
consistent, axiomatizable mathematical theory is complete.
A more important goal of Hilbert’s, the centerpiece of his program for the
justification of modern (“classical”) mathematics, was to find finitary consis-
tency proofs for formal systems representing classical reasoning. With regard
to Hilbert’s program, then, Gödel’s second incompleteness theorem was a
much bigger blow. The second incompleteness theorem can be stated in vague
terms, like the first incompleteness theorem. Roughly speaking, it says that no
sufficiently strong theory of arithmetic can prove its own consistency. We will
have to take “sufficiently strong” to include a little bit more than Q.
The idea behind Gödel’s original proof of the incompleteness theorem can
be found in the Epimenides paradox. Epimenides, a Cretan, asserted that all
Cretans are liars; a more direct form of the paradox is the assertion “this sen-
tence is false.” Essentially, by replacing truth with derivability, Gödel was
able to formalize a sentence which, in a roundabout way, asserts that it it-
self is not derivable. If that sentence were derivable, the theory would then
be inconsistent. Gödel showed that the negation of that sentence is also not
derivable from the system of axioms he was considering. (For this second
part, Gödel had to assume that the theory T is what’s called “ω-consistent.”


ω-Consistency is related to consistency, but is a stronger property.1 A few


years after Gödel, Rosser showed that assuming simple consistency of T is
enough.)
The first challenge is to understand how one can construct a sentence that
refers to itself. For every formula φ in the language of Q, let ⌜φ⌝ denote the
numeral corresponding to # φ# . Think about what this means: φ is a formula in
the language of Q, # φ# is a natural number, and ⌜φ⌝ is a term in the language
of Q. So every formula φ in the language of Q has a name, ⌜φ⌝, which is a
term in the language of Q; this provides us with a conceptual framework in
which formulae in the language of Q can “say” things about other formulae.
The following lemma is known as the fixed-point lemma.

Lemma 18.1. Let T be any theory extending Q, and let ψ( x ) be any formula with
only the variable x free. Then there is a sentence φ such that T ⊢ φ ≡ ψ(⌜φ⌝).

The lemma asserts that given any property ψ( x ), there is a sentence φ that
asserts “ψ( x ) is true of me,” and T “knows” this.
How can we construct such a sentence? Consider the following version of
the Epimenides paradox, due to Quine:

“Yields falsehood when preceded by its quotation” yields false-


hood when preceded by its quotation.

This sentence is not directly self-referential. It simply makes an assertion


about the syntactic objects between quotes, and, in doing so, it is on par with
sentences like

1. “Robert” is a nice name.

2. “I ran.” is a short sentence.

3. “Has three words” has three words.

But what happens when one takes the phrase “yields falsehood when pre-
ceded by its quotation,” and precedes it with a quoted version of itself? Then
one has the original sentence! In short, the sentence asserts that it is false.

18.2 The Fixed-Point Lemma


The fixed-point lemma says that for any formula ψ( x ), there is a sentence φ
such that T ⊢ φ ≡ ψ(⌜φ⌝), provided T extends Q. In the case of the liar sen-
tence, we’d want φ to be equivalent (provably in T) to “⌜φ⌝ is false,” i.e., the
statement that # φ# is the Gödel number of a false sentence. To understand the
idea of the proof, it will be useful to compare it with Quine’s informal gloss
1 That is, any ω-consistent theory is consistent, but not vice versa.


of φ as, “‘yields a falsehood when preceded by its own quotation’ yields a


falsehood when preceded by its own quotation.” The operation of taking an
expression, and then forming a sentence by preceding this expression by its
own quotation may be called diagonalizing the expression, and the result its
diagonalization. So, the diagonalization of ‘yields a falsehood when preceded
by its own quotation’ is “‘yields a falsehood when preceded by its own quo-
tation’ yields a falsehood when preceded by its own quotation.” Now note
that Quine’s liar sentence is not the diagonalization of ‘yields a falsehood’ but
of ‘yields a falsehood when preceded by its own quotation.’ So the property
being diagonalized to yield the liar sentence itself involves diagonalization!
In the language of arithmetic, we form quotations of a formula with one
free variable by computing its Gödel numbers and then substituting the stan-
dard numeral for that Gödel number into the free variable. The diagonal-
ization of α( x ) is α(n), where n = # α( x )# . (From now on, let’s abbreviate
# α ( x )# as ⌜α ( x )⌝.) So if ψ ( x ) is “is a falsehood,” then “yields a falsehood if

preceded by its own quotation,” would be “yields a falsehood when applied


to the Gödel number of its diagonalization.” If we had a symbol diag for the
function diag(n) which computes the Gödel number of the diagonalization of
the formula with Gödel number n, we could write α( x ) as ψ(diag ( x )). And
Quine’s version of the liar sentence would then be the diagonalization of it,
i.e., α(⌜α( x )⌝) or ψ(diag (⌜ψ(diag ( x ))⌝)). Of course, ψ( x ) could now be any
other property, and the same construction would work. For the incomplete-
ness theorem, we’ll take ψ( x ) to be “x is not derivable in T.” Then α( x ) would
be “yields a sentence not derivable in T when applied to the Gödel number of
its diagonalization.”
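As a toy illustration of diagonalization (ordinary Python strings instead of Gödel numbers, so only an analogy):

    def diag(alpha):
        # replace the occurrence of "x" in alpha by a quotation of alpha itself
        return alpha.replace("x", repr(alpha))

    alpha = "the diagonalization of x is not derivable in T"
    print(diag(alpha))
    # prints (on one line):
    # the diagonalization of 'the diagonalization of x is not derivable in T'
    # is not derivable in T
    #
    # The printed sentence says that a certain diagonalization is not derivable,
    # and that diagonalization is the printed sentence itself.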
To formalize this in T, we have to find a way to formalize diag. The func-
tion diag(n) is computable, in fact, it is primitive recursive: if n is the Gödel
number of a formula α( x ), diag(n) returns the Gödel number of α(⌜α( x )⌝).
(Recall, ⌜α( x )⌝ is the standard numeral of the Gödel number of α( x ), i.e.,
# α ( x )# ). If diag were a function symbol in T representing the function diag,

we could take φ to be the formula ψ(diag (⌜ψ(diag ( x ))⌝)). Notice that

diag(# ψ(diag ( x ))# ) = # ψ(diag (⌜ψ(diag ( x ))⌝))#


= # φ# .

Assuming T can derive

diag (⌜ψ(diag ( x ))⌝) = ⌜φ⌝,

it can derive ψ(diag (⌜ψ(diag ( x ))⌝)) ≡ ψ(⌜φ⌝). But the left hand side is, by
definition, φ.
Of course, diag will in general not be a function symbol of T, and cer-
tainly is not one of Q. But, since diag is computable, it is representable in Q
by some formula θdiag ( x, y). So instead of writing ψ(diag ( x )) we can write


∃y (θdiag ( x, y) & ψ(y)). Otherwise, the proof sketched above goes through,
and in fact, it goes through already in Q.

Lemma 18.2. Let ψ( x ) be any formula with one free variable x. Then there is a
sentence φ such that Q ⊢ φ ≡ ψ(⌜φ⌝).

Proof. Given ψ( x ), let α( x ) be the formula ∃y (θdiag ( x, y) & ψ(y)) and let φ be
its diagonalization, i.e., the formula α(⌜α( x )⌝).
Since θdiag represents diag, and diag(# α( x )# ) = # φ# , Q can derive

θdiag (⌜α( x )⌝, ⌜φ⌝) (18.1)


∀y (θdiag (⌜α( x )⌝, y) ⊃ y = ⌜φ⌝). (18.2)

Now we show that Q ⊢ φ ≡ ψ(⌜φ⌝). We argue informally, using just logic


and facts derivable in Q.
First, suppose φ, i.e., α(⌜α( x )⌝). Going back to the definition of α( x ), we
see that α(⌜α( x )⌝) just is

∃y (θdiag (⌜α( x )⌝, y) & ψ(y)).

Consider such a y. Since θdiag (⌜α( x )⌝, y), by eq. (18.2), y = ⌜φ⌝. So, from ψ(y)
we have ψ(⌜φ⌝).
Now suppose ψ(⌜φ⌝). By eq. (18.1), we have

θdiag (⌜α( x )⌝, ⌜φ⌝) & ψ(⌜φ⌝).

It follows that

∃y (θdiag (⌜α( x )⌝, y) & ψ(y)).

But that’s just α(⌜α( x )⌝), i.e., φ.

You should compare this to the proof of the fixed-point lemma in com-
putability theory. The difference is that here we want to define a statement in
terms of itself, whereas there we wanted to define a function in terms of itself;
this difference aside, it is really the same idea.

18.3 The First Incompleteness Theorem


We can now describe Gödel’s original proof of the first incompleteness theo-
rem. Let T be any computably axiomatized theory in a language extending
the language of arithmetic, such that T includes the axioms of Q. This means
that, in particular, T represents computable functions and relations.
We have argued that, given a reasonable coding of formulas and proofs
as numbers, the relation PrfT ( x, y) is computable, where PrfT ( x, y) holds if


and only if x is the Gödel number of a derivation of the formula with Gödel
number y in T. In fact, for the particular theory that Gödel had in mind, Gödel
was able to show that this relation is primitive recursive, using the list of 45
functions and relations in his paper. The 45th relation, xBy, is just PrfT ( x, y)
for his particular choice of T. Remember that where Gödel uses the word
“recursive” in his paper, we would now use the phrase “primitive recursive.”
Since PrfT ( x, y) is computable, it is representable in T. We will use Prf T ( x, y)
to refer to the formula that represents it. Let Prov T (y) be the formula ∃ x Prf T ( x, y).
This describes the 46th relation, Bew(y), on Gödel’s list. As Gödel notes, this
is the only relation that “cannot be asserted to be recursive.” What he proba-
bly meant is this: from the definition, it is not clear that it is computable; and
later developments, in fact, show that it isn’t.
Let T be an axiomatizable theory containing Q. Then PrfT ( x, y) is decid-
able, hence representable in Q by a formula Prf T ( x, y). Let Prov T (y) be the
formula we described above. By the fixed-point lemma, there is a formula γT
such that Q (and hence T) derives

γT ≡ ∼Prov T (⌜γT ⌝). (18.3)

Note that γT says, in essence, “γT is not derivable in T.”

Lemma 18.3. If T is a consistent, axiomatizable theory extending Q, then T ⊬ γT .

Proof. Suppose T derives γT . Then there is a derivation, and so, for some
number m, the relation PrfT (m, # γT # ) holds. But then Q derives the sentence
Prf T (m, ⌜γT ⌝). So Q derives ∃ x Prf T ( x, ⌜γT ⌝), which is, by definition, Prov T (⌜γT ⌝).
By eq. (18.3), Q derives ∼γT , and since T extends Q, so does T. We have
shown that if T derives γT , then it also derives ∼γT , and hence it would be
inconsistent.

Definition 18.4. A theory T is ω-consistent if the following holds: if ∃ x φ( x ) is


any sentence and T derives ∼ φ(0), ∼ φ(1), ∼ φ(2), . . . then T does not prove
∃ x φ ( x ).

Note that every ω-consistent theory is also consistent. This follows simply
from the fact that if T is inconsistent, then T ⊢ φ for every φ. In particular, if T
is inconsistent, it derives both ∼ φ(n) for every n and also derives ∃ x φ( x ). So,
if T is inconsistent, it is ω-inconsistent. By contraposition, if T is ω-consistent,
it must be consistent.

Lemma 18.5. If T is an ω-consistent, axiomatizable theory extending Q, then T ⊬


∼ γT .

Proof. We show that if T derives ∼γT , then it is ω-inconsistent. Suppose T


derives ∼γT . If T is inconsistent, it is ω-inconsistent, and we are done. Oth-
erwise, T is consistent, so it does not derive γT by Lemma 18.3. Since there is


no derivation of γT in T, Q derives

∼Prf T (0, ⌜γT ⌝), ∼Prf T (1, ⌜γT ⌝), ∼Prf T (2, ⌜γT ⌝), . . .

and so does T. On the other hand, by eq. (18.3), ∼γT is equivalent to ∃ x Prf T ( x, ⌜γT ⌝).
So T is ω-inconsistent.

Theorem 18.6. Let T be any ω-consistent, axiomatizable theory extending Q. Then


T is not complete.

Proof. If T is ω-consistent, it is consistent, so T ⊬ γT by Lemma 18.3. By


Lemma 18.5, T ⊬ ∼γT . This means that T is incomplete, since it derives nei-
ther γT nor ∼γT .

18.4 Rosser’s Theorem


Can we modify Gödel’s proof to get a stronger result, replacing “ω-consistent”
with simply “consistent”? The answer is “yes,” using a trick discovered by
Rosser. Rosser’s trick is to use a “modified” derivability predicate RProv T (y)
instead of Prov T (y).

Theorem 18.7. Let T be any consistent, axiomatizable theory extending Q. Then T


is not complete.

Proof. Recall that Prov T (y) is defined as ∃ x Prf T ( x, y), where Prf T ( x, y) repre-
sents the decidable relation which holds iff x is the Gödel number of a deriva-
tion of the sentence with Gödel number y. The relation that holds between x
and y if x is the Gödel number of a refutation of the sentence with Gödel num-
ber y is also decidable. Let not( x ) be the primitive recursive function which
does the following: if x is the code of a formula φ, not( x ) is a code of ∼ φ.
Then RefT ( x, y) holds iff PrfT ( x, not(y)). Let Ref T ( x, y) represent it. Then, if
T ⊢ ∼ φ and δ is a corresponding derivation, Q ⊢ Ref T (⌜δ⌝, ⌜φ⌝). We define
RProv T (y) as

∃ x (Prf T ( x, y) & ∀z (z < x ⊃ ∼Ref T (z, y))).

Roughly, RProv T (y) says “there is a proof of y in T, and there is no shorter


refutation of y.” Assuming T is consistent, RProv T (y) is true of the same
numbers as Prov T (y); but from the point of view of provability in T (and we
now know that there is a difference between truth and provability!) the two
have different properties. If T is inconsistent, then the two do not hold of the
same numbers! (RProv T (y) is often read as “y is Rosser provable.” Since, as
just discussed, Rosser provability is not some special kind of provability—
in inconsistent theories, there are sentences that are provable but not Rosser
provable—this may be confusing. To avoid the confusion, you could instead
read it as “y is shmovable.”)


By the fixed-point lemma, there is a formula ρT such that

Q ⊢ ρT ≡ ∼RProv T (⌜ρT ⌝). (18.4)

In contrast to the proof of Theorem 18.6, here we claim that if T is consistent,


T doesn’t derive ρT , and T also doesn’t derive ∼ρT . (In other words, we don’t
need the assumption of ω-consistency.)
First, let’s show that T ⊬ ρ T . Suppose it did, so there is a derivation of ρ T
from T; let n be its Gödel number. Then Q ⊢ Prf T (n, ⌜ρ T ⌝), since Prf T repre-
sents PrfT in Q. Also, for each k < n, k is not the Gödel number of a deriva-
tion of ∼ρ T , since T is consistent. So for each k < n, Q ⊢ ∼Ref T (k, ⌜ρ T ⌝). By
Lemma 17.24, Q ⊢ ∀z (z < n ⊃ ∼Ref T (z, ⌜ρ T ⌝)). Thus,

Q ⊢ ∃ x (Prf T ( x, ⌜ρ T ⌝) & ∀z (z < x ⊃ ∼Ref T (z, ⌜ρ T ⌝))),

but that’s just RProv T (⌜ρ T ⌝). By eq. (18.4), Q ⊢ ∼ρ T . Since T extends Q, also
T ⊢ ∼ρ T . We’ve assumed that T ⊢ ρ T , so T would be inconsistent, contrary to
the assumption of the theorem.
Now, let’s show that T ⊬ ∼ρ T . Again, suppose it did, and suppose n is
the Gödel number of a derivation of ∼ρ T . Then RefT (n, # ρ T # ) holds, and since
Ref T represents RefT in Q, Q ⊢ Ref T (n, ⌜ρ T ⌝). We’ll again show that T would
then be inconsistent because it would also derive ρ T . Since

Q ⊢ ρ T ≡ ∼RProv T (⌜ρ T ⌝),

and since T extends Q, it suffices to show that

Q ⊢ ∼RProv T (⌜ρ T ⌝).

The sentence ∼RProv T (⌜ρ T ⌝), i.e.,

∼∃ x (Prf T ( x, ⌜ρ T ⌝) & ∀z (z < x ⊃ ∼Ref T (z, ⌜ρ T ⌝))),

is logically equivalent to

∀ x (Prf T ( x, ⌜ρ T ⌝) ⊃ ∃z (z < x & Ref T (z, ⌜ρ T ⌝))).

We argue informally using logic, making use of facts about what Q derives.
Suppose x is arbitrary and Prf T ( x, ⌜ρ T ⌝). We already know that T ⊬ ρ T , and
so for every k, Q ⊢ ∼Prf T (k, ⌜ρ T ⌝). Thus, for every k it follows that x ̸= k.
In particular, we have (a) that x ̸= n. We also have ∼( x = 0 ∨ x = 1 ∨
· · · ∨ x = n − 1) and so by Lemma 17.24, (b) ∼( x < n). By Lemma 17.25,
n < x. Since Q ⊢ Ref T (n, ⌜ρ T ⌝), we have n < x & Ref T (n, ⌜ρ T ⌝), and from that
∃z (z < x & Ref T (z, ⌜ρ T ⌝)). Since x was arbitrary we get, as required, that

∀ x (Prf T ( x, ⌜ρ T ⌝) ⊃ ∃z (z < x & Ref T (z, ⌜ρ T ⌝))).


18.5 Comparison with Gödel’s Original Paper


It is worthwhile to spend some time with Gödel’s 1931 paper. The introduc-
tion sketches the ideas we have just discussed. Even if you just skim through
the paper, it is easy to see what is going on at each stage: first Gödel describes
the formal system P (syntax, axioms, proof rules); then he defines the prim-
itive recursive functions and relations; then he shows that xBy is primitive
recursive, and argues that the primitive recursive functions and relations are
represented in P. He then goes on to prove the incompleteness theorem, as
above. In Section 3, he shows that one can take the unprovable assertion to
be a sentence in the language of arithmetic. This is the origin of the β-lemma,
which is what we also used to handle sequences in showing that the recursive
functions are representable in Q. Gödel doesn’t go so far as to isolate a minimal
set of axioms that suffice, but we now know that Q will do the trick.
in Section 4, he sketches a proof of the second incompleteness theorem.

18.6 The Derivability Conditions for PA


Peano arithmetic, or PA, is the theory extending Q with induction axioms for
all formulae. In other words, one adds to Q axioms of the form

( φ(0) & ∀ x ( φ( x ) ⊃ φ( x ′ ))) ⊃ ∀ x φ( x )

for every formula φ. Notice that this is really a schema, which is to say, in-
finitely many axioms (and it turns out that PA is not finitely axiomatizable).
But since one can effectively determine whether or not a string of symbols is
an instance of an induction axiom, the set of axioms for PA is computable. PA
is a much more robust theory than Q. For example, one can easily prove that
addition and multiplication are commutative, using induction in the usual
way. In fact, most finitary number-theoretic and combinatorial arguments can
be carried out in PA.
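For example, taking φ( x ) to be the formula (0 + x ) = x, the corresponding instance of the induction schema is

((0 + 0) = 0 & ∀ x ((0 + x ) = x ⊃ (0 + x ′ ) = x ′ )) ⊃ ∀ x (0 + x ) = x.

Both conjuncts of the antecedent are already derivable from the axioms of Q governing +, so PA derives ∀ x (0 + x ) = x; this is the first step of the usual inductive proof that addition is commutative.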
Since PA is computably axiomatized, the derivability predicate PrfPA ( x, y)
is computable and hence represented in Q (and so, in PA). As before, we will
take Prf PA ( x, y) to denote the formula representing the relation. Let ProvPA (y)
be the formula ∃ x PrfPA ( x, y), which, intuitively, says “y is derivable from the
axioms of PA.” The reason we need a little bit more than the axioms of Q is
we need to know that the theory we are using is strong enough to derive a
few basic facts about this derivability predicate. In fact, what we need are the
following facts:

P1. If PA ⊢ φ, then PA ⊢ ProvPA (⌜φ⌝).

P2. For all formulae φ and ψ,

PA ⊢ ProvPA (⌜φ ⊃ ψ⌝) ⊃ (ProvPA (⌜φ⌝) ⊃ ProvPA (⌜ψ⌝)).


P3. For every formula φ,

PA ⊢ ProvPA (⌜φ⌝) ⊃ ProvPA (⌜ProvPA (⌜φ⌝)⌝).

The only way to verify that these three properties hold is to describe the for-
mula ProvPA (y) carefully and use the axioms of PA to describe the relevant
formal derivations. Conditions (1) and (2) are easy; it is really condition (3)
that requires work. (Think about what kind of work it entails . . . ) Carrying
out the details would be tedious and uninteresting, so here we will ask you
to take it on faith that PA has the three properties listed above. A reasonable
choice of ProvPA (y) will also satisfy

P4. If PA ⊢ ProvPA (⌜φ⌝), then PA ⊢ φ.

But we will not need this fact.
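To see how the conditions are used, here is a small worked example. Suppose PA ⊢ φ ⊃ ψ. By P1, PA ⊢ ProvPA (⌜φ ⊃ ψ⌝), and combining this with the corresponding instance of P2 by modus ponens inside PA, PA ⊢ ProvPA (⌜φ⌝) ⊃ ProvPA (⌜ψ⌝). In other words, PA can verify that its derivable sentences are closed under modus ponens; exactly this pattern is used repeatedly in the proof of the second incompleteness theorem below.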


Incidentally, Gödel was lazy in the same way we are being now. At the
end of the 1931 paper, he sketches the proof of the second incompleteness
theorem, and promises the details in a later paper. He never got around to it;
since everyone who understood the argument believed that it could be carried
out, he did not need to fill in the details.

18.7 The Second Incompleteness Theorem


How can we express the assertion that PA doesn’t prove its own consistency?
Saying PA is inconsistent amounts to saying that PA ⊢ 0 = 1. So we can take
the consistency statement ConPA to be the sentence ∼ProvPA (⌜0 = 1⌝), and
then the following theorem does the job:

Theorem 18.8. If PA is consistent, then PA does not derive ConPA .

It is important to note that the theorem depends on the particular represen-


tation of ConPA (i.e., the particular representation of ProvPA (y)). All we will
use is that the representation of ProvPA (y) satisfies the three derivability con-
ditions, so the theorem generalizes to any theory with a derivability predicate
having these properties.
It is informative to read Gödel’s sketch of an argument, since the theorem
follows like a good punch line. It goes like this. Let γPA be the Gödel sentence
that we constructed in the proof of Theorem 18.6. We have shown “If PA is
consistent, then PA does not derive γPA .” If we formalize this in PA, we have
a proof of
ConPA ⊃ ∼ProvPA (⌜γPA ⌝).
Now suppose PA derives ConPA . Then it derives ∼ProvPA (⌜γPA ⌝). But since
γPA is a Gödel sentence, this is equivalent to γPA . So PA derives γPA .
But: we know that if PA is consistent, it doesn’t derive γPA ! So if PA is
consistent, it can’t derive ConPA .


To make the argument more precise, we will let γPA be the Gödel sentence
for PA and use the derivability conditions (P1)–(P3) to show that PA derives
ConPA ⊃ γPA . This will show that PA doesn’t derive ConPA . Here is a sketch
of the proof, in PA. (For simplicity, we drop the PA subscripts.)

γ ≡ ∼Prov(⌜γ⌝) (18.5)
γ is a Gödel sentence
γ ⊃ ∼Prov(⌜γ⌝) (18.6)
from eq. (18.5)
γ ⊃ (Prov(⌜γ⌝) ⊃ ⊥) (18.7)
from eq. (18.6) by logic
Prov(⌜γ ⊃ (Prov(⌜γ⌝) ⊃ ⊥)⌝) (18.8)
from eq. (18.7) by condition P1
Prov(⌜γ⌝) ⊃ Prov(⌜(Prov(⌜γ⌝) ⊃ ⊥)⌝) (18.9)
from eq. (18.8) by condition P2
Prov(⌜γ⌝) ⊃ (Prov(⌜Prov(⌜γ⌝)⌝) ⊃ Prov(⌜⊥⌝)) (18.10)
from eq. (18.9) by condition P2 and logic
Prov(⌜γ⌝) ⊃ Prov(⌜Prov(⌜γ⌝)⌝) (18.11)
by P3
Prov(⌜γ⌝) ⊃ Prov(⌜⊥⌝) (18.12)
from eq. (18.10) and eq. (18.11) by logic
Con ⊃ ∼Prov(⌜γ⌝) (18.13)
contraposition of eq. (18.12) and Con ≡ ∼Prov(⌜⊥⌝)
Con ⊃ γ
from eq. (18.5) and eq. (18.13) by logic

The uses of logic in the above are just elementary facts from propositional logic,
e.g., eq. (18.7) uses ⊢ ∼ φ ≡ ( φ ⊃ ⊥) and eq. (18.12) uses φ ⊃ (ψ ⊃ χ), φ ⊃
ψ ⊢ φ ⊃ χ. The use of condition P2 in eq. (18.9) and eq. (18.10) relies on
instances of P2, Prov(⌜φ ⊃ ψ⌝) ⊃ (Prov(⌜φ⌝) ⊃ Prov(⌜ψ⌝)). In the first one,
φ ≡ γ and ψ ≡ Prov(⌜γ⌝) ⊃ ⊥; in the second, φ ≡ Prov(⌜γ⌝) and ψ ≡ ⊥.
The more abstract version of the second incompleteness theorem is as fol-
lows:

Theorem 18.9. Let T be any consistent, axiomatized theory extending Q and let
Prov T (y) be any formula satisfying derivability conditions P1–P3 for T. Then T
does not derive ConT .

The moral of the story is that no “reasonable” consistent theory for math-
ematics can derive its own consistency statement. Suppose T is a theory of


mathematics that includes Q and Hilbert’s “finitary” reasoning (whatever that


may be). Then, the whole of T cannot derive the consistency statement of T,
and so, a fortiori, the finitary fragment can’t derive the consistency statement
of T either. In that sense, there cannot be a finitary consistency proof for “all
of mathematics.”
There is some leeway in interpreting the term “finitary,” and Gödel, in the
1931 paper, grants the possibility that something we may consider “finitary”
may lie outside the kinds of mathematics Hilbert wanted to formalize. But
Gödel was being charitable; today, it is hard to see how we might find some-
thing that can reasonably be called finitary but is not formalizable in, say, ZFC,
Zermelo–Fraenkel set theory with the axiom of choice.

18.8 Löb’s Theorem


The Gödel sentence for a theory T is a fixed point of ∼Prov T (y), i.e., a sen-
tence γ such that
T ⊢ ∼Prov T (⌜γ⌝) ≡ γ.
It is not derivable, because if T ⊢ γ, (a) by derivability condition (1), T ⊢
Prov T (⌜γ⌝), and (b) T ⊢ γ together with T ⊢ ∼Prov T (⌜γ⌝) ≡ γ gives T ⊢
∼Prov T (⌜γ⌝), and so T would be inconsistent. Now it is natural to ask about
the status of a fixed point of Prov T (y), i.e., a sentence δ such that

T ⊢ Prov T (⌜δ⌝) ≡ δ.

If it were derivable, T ⊢ Prov T (⌜δ⌝) by condition (1), but the same conclusion
follows if we apply modus ponens to the equivalence above. Hence, we don’t
get that T is inconsistent, at least not by the same argument as in the case of
the Gödel sentence. This of course does not show that T does derive δ.
We can make headway on this question if we generalize it a bit. The left-to-
right direction of the fixed point equivalence, Prov T (⌜δ⌝) ⊃ δ, is an instance
of a general schema called a reflection principle: Prov T (⌜φ⌝) ⊃ φ. It is called
that because it expresses, in a sense, that T can “reflect” about what it can
derive; basically it says, “If T can derive φ, then φ is true,” for any φ. This is
true for sound theories only, of course, and this suggests that theories will in
general not derive every instance of it. So which instances can a theory (strong
enough, and satisfying the derivability conditions) derive? Certainly all those
where φ itself is derivable. And that’s it, as the next result shows.

Theorem 18.10. Let T be an axiomatizable theory extending Q, and suppose Prov T (y)
is a formula satisfying conditions P1–P3 from section 18.7. If T derives Prov T (⌜φ⌝) ⊃
φ, then in fact T derives φ.

Put differently, if T ⊬ φ, then T ⊬ Prov T (⌜φ⌝) ⊃ φ. This result is known as


Löb’s theorem.


The heuristic for the proof of Löb’s theorem is a clever proof that Santa
Claus exists. (If you don’t like that conclusion, you are free to substitute any
other conclusion you would like.) Here it is:

1. Let X be the sentence, “If X is true, then Santa Claus exists.”

2. Suppose X is true.

3. Then what it says holds; i.e., we have: if X is true, then Santa Claus
exists.

4. Since we are assuming X is true, we can conclude that Santa Claus exists,
by modus ponens from (2) and (3).

5. We have succeeded in deriving (4), “Santa Claus exists,” from the as-
sumption (2), “X is true.” By conditional proof, we have shown: “If X is
true, then Santa Claus exists.”

6. But this is just the sentence X. So we have shown that X is true.

7. But then, by the argument (2)–(4) above, Santa Claus exists.

A formalization of this idea, replacing “is true” with “is derivable,” and “Santa
Claus exists” with φ, yields the proof of Löb’s theorem. The trick is to apply
the fixed-point lemma to the formula Prov T (y) ⊃ φ. The fixed point of that
corresponds to the sentence X in the preceding sketch.

Proof of Theorem 18.10. Suppose φ is a sentence such that T derives Prov T (⌜φ⌝) ⊃
φ. Let ψ(y) be the formula Prov T (y) ⊃ φ, and use the fixed-point lemma to
find a sentence θ such that T derives θ ≡ ψ(⌜θ⌝). Then each of the following


is derivable in T:
θ ≡ (Prov T (⌜θ⌝) ⊃ φ) (18.14)
θ is a fixed point of ψ(y)
θ ⊃ (Prov T (⌜θ⌝) ⊃ φ) (18.15)
from eq. (18.14)
Prov T (⌜θ ⊃ (Prov T (⌜θ⌝) ⊃ φ)⌝) (18.16)
from eq. (18.15) by condition P1
Prov T (⌜θ⌝) ⊃ Prov T (⌜Prov T (⌜θ⌝) ⊃ φ⌝) (18.17)
from eq. (18.16) using condition P2
Prov T (⌜θ⌝) ⊃ (Prov T (⌜Prov T (⌜θ⌝)⌝) ⊃ Prov T (⌜φ⌝)) (18.18)
from eq. (18.17) using P2 again
Prov T (⌜θ⌝) ⊃ Prov T (⌜Prov T (⌜θ⌝)⌝) (18.19)
by derivability condition P3
Prov T (⌜θ⌝) ⊃ Prov T (⌜φ⌝) (18.20)
from eq. (18.18) and eq. (18.19)
Prov T (⌜φ⌝) ⊃ φ (18.21)
by assumption of the theorem
Prov T (⌜θ⌝) ⊃ φ (18.22)
from eq. (18.20) and eq. (18.21)
(Prov T (⌜θ⌝) ⊃ φ) ⊃ θ (18.23)
from eq. (18.14)
θ (18.24)
from eq. (18.22) and eq. (18.23)
Prov T (⌜θ⌝) (18.25)
from eq. (18.24) by condition P1
φ
from eq. (18.21) and eq. (18.25)
With Löb’s theorem in hand, there is a short proof of the second incom-
pleteness theorem (for theories having a derivability predicate satisfying con-
ditions P1–P3): if T ⊢ Prov T (⌜⊥⌝) ⊃ ⊥, then T ⊢ ⊥. If T is consistent, T ⊬ ⊥.
So, T ⊬ Prov T (⌜⊥⌝) ⊃ ⊥, i.e., T ⊬ ConT . We can also apply it to show that δ,
the fixed point of Prov T ( x ), is derivable. For since
T ⊢ Prov T (⌜δ⌝) ≡ δ

in particular

T ⊢ Prov T (⌜δ⌝) ⊃ δ
and so by Löb’s theorem, T ⊢ δ.


18.9 The Undefinability of Truth


The notion of definability depends on having a formal semantics for the lan-
guage of arithmetic. We have described a set of formulas and sentences in
the language of arithmetic. The “intended interpretation” is to read such sen-
tences as making assertions about the natural numbers, and such an assertion
can be true or false. Let N be the structure with domain N and the standard in-
terpretation for the symbols in the language of arithmetic. Then N ⊨ φ means
“φ is true in the standard interpretation.”
Definition 18.11. A relation R( x1 , . . . , xk ) of natural numbers is definable in N
if and only if there is a formula φ( x1 , . . . , xk ) in the language of arithmetic
such that for every n1 , . . . , nk , R(n1 , . . . , nk ) if and only if N ⊨ φ(n1 , . . . , nk ).

Put differently, a relation is definable in N if and only if it is representable


in the theory TA, where TA = { φ | N ⊨ φ} is the set of true sentences of
arithmetic. (If this is not immediately clear to you, you should go back and
check the definitions and convince yourself that this is the case.)
Lemma 18.12. Every computable relation is definable in N.

Proof. It is easy to check that the formula representing a relation in Q defines


the same relation in N.

Now one can ask, is the converse also true? That is, is every relation defin-
able in N computable? The answer is no. For example:
Lemma 18.13. The halting relation is definable in N.

Proof. Let H be the halting relation, i.e.,

H = {⟨e, x ⟩ | ∃s T (e, x, s)}.

Let θ T define T in N. Then

H = {⟨e, x ⟩ | N ⊨ ∃s θ T (e, x, s)},

so ∃s θ T (z, x, s) defines H in N.

What about TA itself? Is it definable in arithmetic? That is: is the set


{#φ# | N ⊨ φ} definable in arithmetic? Tarski’s theorem answers this in the
negative.
Theorem 18.14. The set of true sentences of arithmetic is not definable in arithmetic.

Proof. Suppose θ ( x ) defined it, i.e., N ⊨ φ iff N ⊨ θ (⌜φ⌝). By the fixed-point


lemma, there is a formula φ such that Q ⊢ φ ≡ ∼θ (⌜φ⌝), and hence N ⊨ φ ≡
∼θ (⌜φ⌝). But then N ⊨ φ if and only if N ⊨ ∼θ (⌜φ⌝), which contradicts the
fact that θ (y) is supposed to define the set of true statements of arithmetic.


Tarski applied this analysis to a more general philosophical notion of truth.


Given any language L, Tarski argued that an adequate notion of truth for L
would have to satisfy, for each sentence X,

‘X’ is true if and only if X.

Tarski’s oft-quoted example, for English, is the sentence

‘Snow is white’ is true if and only if snow is white.

However, for any language strong enough to represent the diagonal function,
and any linguistic predicate T ( x ), we can construct a sentence X satisfying
“X if and only if not T (‘X’).” Given that we do not want a truth predicate
to declare some sentences to be both true and false, Tarski concluded that
one cannot specify a truth predicate for all sentences in a language without,
somehow, stepping outside the bounds of the language. In other words, a
truth predicate for a language cannot be defined in the language itself.

Part V

Methods

Appendix A

Proofs

A.1 Introduction
Based on your experiences in introductory logic, you might be comfortable
with a derivation system—probably a natural deduction or Fitch style deriva-
tion system, or perhaps a proof-tree system. You probably remember doing
proofs in these systems, either proving a formula or showing that a given argu-
ment is valid. In order to do this, you applied the rules of the system un-
til you got the desired end result. In reasoning about logic, we also prove
things, but in most cases we are not using a derivation system. In fact, most
of the proofs we consider are done in English (perhaps, with some symbolic
language thrown in) rather than entirely in the language of first-order logic.
When constructing such proofs, you might at first be at a loss—how do I prove
something without a derivation system? How do I start? How do I know if
my proof is correct?
Before attempting a proof, it’s important to know what a proof is and how
to construct one. As implied by the name, a proof is meant to show that some-
thing is true. You might think of this in terms of a dialogue—someone asks
you if something is true, say, if every prime other than two is an odd number.
To answer “yes” is not enough; they might want to know why. In this case,
you’d give them a proof.
In everyday discourse, it might be enough to gesture at an answer, or give
an incomplete answer. In logic and mathematics, however, we want rigorous
proof—we want to show that something is true beyond any doubt. This means
that every step in our proof must be justified, and the justification must be
cogent (i.e., the assumption you’re using is actually assumed in the statement
of the theorem you’re proving, the definitions you apply must be correctly
applied, the justifications appealed to must be correct inferences, etc.).
Usually, we’re proving some statement. We call the statements we’re prov-
ing by various names: propositions, theorems, lemmas, or corollaries. A
proposition is a basic proof-worthy statement: important enough to record,


but perhaps not particularly deep nor applied often. A theorem is a signifi-
cant, important proposition. Its proof often is broken into several steps, and
sometimes it is named after the person who first proved it (e.g., Cantor’s The-
orem, the Löwenheim–Skolem theorem) or after the fact it concerns (e.g., the
completeness theorem). A lemma is a proposition or theorem that is used
in the proof of a more important result. Confusingly, sometimes lemmas are
important results in themselves, and also named after the person who intro-
duced them (e.g., Zorn’s Lemma). A corollary is a result that easily follows
from another one.
A statement to be proved often contains assumptions that clarify which
kinds of things we’re proving something about. It might begin with “Let φ
be a formula of the form ψ ⊃ χ” or “Suppose Γ ⊢ φ” or something of the
sort. These are hypotheses of the proposition, theorem, or lemma, and you may
assume these to be true in your proof. They restrict what we’re proving, and
also introduce some names for the objects we’re talking about. For instance, if
your proposition begins with “Let φ be a formula of the form ψ ⊃ χ,” you’re
proving something about all formulas of a certain sort only (namely, condi-
tionals), and it’s understood that ψ ⊃ χ is an arbitrary conditional that your
proof will talk about.

A.2 Starting a Proof

But where do you even start?


You’ve been given something to prove, so this should be the last thing that
is mentioned in the proof (you can, obviously, announce that you’re going to
prove it at the beginning, but you don’t want to use it as an assumption). Write
what you are trying to prove at the bottom of a fresh sheet of paper—this way
you don’t lose sight of your goal.
Next, you may have some assumptions that you are able to use (this will
be made clearer when we talk about the type of proof you are doing in the next
section). Write these at the top of the page and make sure to flag that they are
assumptions (i.e., if you are assuming p, write “assume that p,” or “suppose
that p”). Finally, there might be some definitions in the question that you
need to know. You might be told to use a specific definition, or there might
be various definitions in the assumptions or conclusion that you are working
towards. Write these down and ensure that you understand what they mean.
How you set up your proof will also be dependent upon the form of the
question. The next section provides details on how to set up your proof based
on the type of sentence.


A.3 Using Definitions


We mentioned that you must be familiar with all definitions that may be used
in the proof, and that you can properly apply them. This is a really impor-
tant point, and it is worth looking at in a bit more detail. Definitions are used
to abbreviate properties and relations so we can talk about them more suc-
cinctly. The introduced abbreviation is called the definiendum, and what it
abbreviates is the definiens. In proofs, we often have to go back to how the
definiendum was introduced, because we have to exploit the logical structure
of the definiens (the long version of which the defined term is the abbrevia-
tion) to get through our proof. By unpacking definitions, you’re ensuring that
you’re getting to the heart of where the logical action is.
We’ll start with an example. Suppose you want to prove the following:

Proposition A.1. For any sets A and B, A ∪ B = B ∪ A.

In order to even start the proof, we need to know what it means for two sets
to be identical; i.e., we need to know what the “=” in that equation means for
sets. Sets are defined to be identical whenever they have the same elements.
So the definition we have to unpack is:

Definition A.2. Sets A and B are identical, A = B, iff every element of A is


an element of B, and vice versa.

This definition uses A and B as placeholders for arbitrary sets. What it


defines—the definiendum—is the expression “A = B” by giving the condition
under which A = B is true. This condition—“every element of A is an element
of B, and vice versa”—is the definiens.1 The definition specifies that A = B is
true if, and only if (we abbreviate this to “iff”) the condition holds.
When you apply the definition, you have to match the A and B in the
definition to the case you’re dealing with. In our case, it means that in order
for A ∪ B = B ∪ A to be true, each z ∈ A ∪ B must also be in B ∪ A, and
vice versa. The expression A ∪ B in the proposition plays the role of A in the
definition, and B ∪ A that of B. Since A and B are used both in the definition
and in the statement of the proposition we’re proving, but in different uses,
you have to be careful to make sure you don’t mix up the two. For instance, it
would be a mistake to think that you could prove the proposition by showing
that every element of A is an element of B, and vice versa—that would show
that A = B, not that A ∪ B = B ∪ A. (Also, since A and B may be any two
sets, you won’t get very far, because if nothing is assumed about A and B they
may well be different sets.)
1 In this particular case—and very confusingly!—when A = B, the sets A and B are just one

and the same set, even though we use different letters for it on the left and the right side. But the
ways in which that set is picked out may be different, and that makes the definition non-trivial.


Within the proof we are dealing with set-theoretic notions such as union,
and so we must also know the meanings of the symbol ∪ in order to under-
stand how the proof should proceed. And sometimes, unpacking the defini-
tion gives rise to further definitions to unpack. For instance, A ∪ B is defined
as {z | z ∈ A or z ∈ B}. So if you want to prove that x ∈ A ∪ B, unpacking
the definition of ∪ tells you that you have to prove x ∈ {z | z ∈ A or z ∈ B}.
Now you also have to remember that x ∈ {z | . . . z . . .} iff . . . x . . . . So, further
unpacking the definition of the {z | . . . z . . .} notation, what you have to show
is: x ∈ A or x ∈ B. So, “every element of A ∪ B is also an element of B ∪ A”
really means: “for every x, if x ∈ A or x ∈ B, then x ∈ B or x ∈ A.” If we fully
unpack the definitions in the proposition, we see that what we have to show
is this:

Proposition A.3. For any sets A and B: (a) for every x, if x ∈ A or x ∈ B, then
x ∈ B or x ∈ A, and (b) for every x, if x ∈ B or x ∈ A, then x ∈ A or x ∈ B.

What’s important is that unpacking definitions is a necessary part of con-


structing a proof. Properly doing it is sometimes difficult: you must be careful
to distinguish and match the variables in the definition and the terms in the
claim you’re proving. In order to be successful, you must know what the
question is asking and what all the terms used in the question mean—you
will often need to unpack more than one definition. In simple proofs such as
the ones below, the solution follows almost immediately from the definitions
themselves. Of course, it won’t always be this simple.
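Incidentally, for small finite sets you can check an identity like the one in Proposition A.1 mechanically before setting out to prove it. Here is a quick check in Python on two example sets chosen purely for illustration; it verifies a single instance and is, of course, not a proof:

    # Check A ∪ B = B ∪ A for two particular example sets.
    # This is a sanity check only---the proposition is about all sets.
    A = {1, 2, 3}
    B = {3, 4}
    print(A | B == B | A)   # prints True; | is union for Python sets

A check like this can catch a mis-stated claim early, but the proof still has to argue about arbitrary sets, as described above.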

A.4 Inference Patterns


Proofs are composed of individual inferences. When we make an inference,
we typically indicate that by using a word like “so,” “thus,” or “therefore.”
The inference often relies on one or two facts we already have available in our
proof—it may be something we have assumed, or something that we’ve con-
cluded by an inference already. To be clear, we may label these things, and in
the inference we indicate what other statements we’re using in the inference.
An inference will often also contain an explanation of why our new conclusion
follows from the things that come before it. There are some common patterns
of inference that are used very often in proofs; we’ll go through some below.
Some patterns of inference, like proofs by induction, are more involved (and
will be discussed later).
We’ve already discussed one pattern of inference: unpacking, or applying,
a definition. When we unpack a definition, we just restate something that
involves the definiendum by using the definiens. For instance, suppose that
we have already established in the course of a proof that D = E (a). Then we
may apply the definition of = for sets and infer: “Thus, by definition from (a),
every element of D is an element of E and vice versa.”


Somewhat confusingly, we often do not write the justification of an in-


ference when we actually make it, but before. Suppose we haven’t already
proved that D = E, but we want to. If D = E is the conclusion we aim for,
then we can restate this aim also by applying the definition: to prove D = E
we have to prove that every element of D is an element of E and vice versa. So
our proof will have the form: (a) prove that every element of D is an element
of E; (b) every element of E is an element of D; (c) therefore, from (a) and (b)
by definition of =, D = E. But we would usually not write it this way. Instead
we might write something like,

We want to show D = E. By definition of =, this amounts to


showing that every element of D is an element of E and vice versa.
(a) . . . (a proof that every element of D is an element of E) . . .
(b) . . . (a proof that every element of E is an element of D) . . .

Using a Conjunction
Perhaps the simplest inference pattern is that of drawing as conclusion one of
the conjuncts of a conjunction. In other words: if we have assumed or already
proved that p and q, then we’re entitled to infer that p (and also that q). This is
such a basic inference that it is often not mentioned. For instance, once we’ve
unpacked the definition of D = E we’ve established that every element of D is
an element of E and vice versa. From this we can conclude that every element
of E is an element of D (that’s the “vice versa” part).

Proving a Conjunction
Sometimes what you’ll be asked to prove will have the form of a conjunc-
tion; you will be asked to “prove p and q.” In this case, you simply have
to do two things: prove p, and then prove q. You could divide your proof
into two sections, and for clarity, label them. When you’re making your first
notes, you might write “(1) Prove p” at the top of the page, and “(2) Prove q”
in the middle of the page. (Of course, you might not be explicitly asked to
prove a conjunction but find that your proof requires that you prove a con-
junction. For instance, if you’re asked to prove that D = E you will find that,
after unpacking the definition of =, you have to prove: every element of D is
an element of E and every element of E is an element of D).

Proving a Disjunction
When what you are proving takes the form of a disjunction (i.e., it is a state-
ment of the form “p or q”), it is enough to show that one of the disjuncts is true.
However, it basically never happens that either disjunct just follows from the
assumptions of your theorem. More often, the assumptions of your theorem


are themselves disjunctive, or you’re showing that all things of a certain kind
have one of two properties, but some of the things have the one and others
have the other property. This is where proof by cases is useful (see below).

Conditional Proof
Many theorems you will encounter are in conditional form (i.e., show that if
p holds, then q is also true). These cases are nice and easy to set up—simply
assume the antecedent of the conditional (in this case, p) and prove the con-
clusion q from it. So if your theorem reads, “If p then q,” you start your proof
with “assume p” and at the end you should have proved q.
Conditionals may be stated in different ways. So instead of “If p then q,”
a theorem may state that “p only if q,” “q if p,” or “q, provided p.” These all
mean the same and require assuming p and proving q from that assumption.
Recall that a biconditional (“p if and only if (iff) q”) is really two conditionals
put together: if p then q, and if q then p. All you have to do, then, is two
instances of conditional proof: one for the first conditional and another one
for the second. Sometimes, however, it is possible to prove an “iff” statement
by chaining together a bunch of other “iff” statements so that you start with
“p” and end with “q”—but in that case you have to make sure that each step
really is an “iff.”

Universal Claims
Using a universal claim is simple: if something is true for anything, it’s true
for each particular thing. So if, say, the hypothesis of your proof is A ⊆ B, that
means (unpacking the definition of ⊆), that, for every x ∈ A, x ∈ B. Thus, if
you already know that z ∈ A, you can conclude z ∈ B.
Proving a universal claim may seem a little bit tricky. Usually these state-
ments take the following form: “If x has P, then it has Q” or “All Ps are Qs.”
Of course, it might not fit this form perfectly, and it takes a bit of practice to
figure out what you’re asked to prove exactly. But: we often have to prove
that all objects with some property have a certain other property.
The way to prove a universal claim is to introduce names or variables, for
the things that have the one property and then show that they also have the
other property. We might put this by saying that to prove something for all Ps
you have to prove it for an arbitrary P. And the name introduced is a name
for an arbitrary P. We typically use single letters as these names for arbitrary
things, and the letters usually follow conventions: e.g., we use n for natural
numbers, φ for formulae, A for sets, f for functions, etc.
The trick is to maintain generality throughout the proof. You start by as-
suming that an arbitrary object (“x”) has the property P, and show (based only
on definitions or what you are allowed to assume) that x has the property Q.
Because you have not stipulated what x is specifically, other than that it has the
property P, you can assert that everything with P has the property Q. In
short, x is a stand-in for all things with property P.

Proposition A.4. For all sets A and B, A ⊆ A ∪ B.

Proof. Let A and B be arbitrary sets. We want to show that A ⊆ A ∪ B. By


definition of ⊆, this amounts to: for every x, if x ∈ A then x ∈ A ∪ B. So let
x ∈ A be an arbitrary element of A. We have to show that x ∈ A ∪ B. Since
x ∈ A, x ∈ A or x ∈ B. Thus, x ∈ { x | x ∈ A ∨ x ∈ B}. But that, by definition
of ∪, means x ∈ A ∪ B.
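One can sanity-check this proposition on particular sets in Python as well, where <= tests the subset relation (again, this checks one instance only and proves nothing):

    # Check A ⊆ A ∪ B for example sets; <= is the subset test for Python sets.
    A = {1, 2}
    B = {2, 5}
    print(A <= A | B)   # prints True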

Proof by Cases
Suppose you have a disjunction as an assumption or as an already established
conclusion—you have assumed or proved that p or q is true. You want to
prove r. You do this in two steps: first you assume that p is true, and prove r,
then you assume that q is true and prove r again. This works because we
assume or know that one of the two alternatives holds. The two steps establish
that either one is sufficient for the truth of r. (If both are true, we have not one
but two reasons for why r is true. It is not necessary to separately prove that
r is true assuming both p and q.) To indicate what we’re doing, we announce
that we “distinguish cases.” For instance, suppose we know that x ∈ B ∪ C.
B ∪ C is defined as { x | x ∈ B or x ∈ C }. In other words, by definition, x ∈ B
or x ∈ C. We would prove that x ∈ A from this by first assuming that x ∈ B,
and proving x ∈ A from this assumption, and then assume x ∈ C, and again
prove x ∈ A from this. You would write “We distinguish cases” under the
assumption, then “Case (1): x ∈ B” underneath, and “Case (2): x ∈ C” halfway
down the page. Then you’d proceed to fill in the top half and the bottom half
of the page.
Proof by cases is especially useful if what you’re proving is itself disjunc-
tive. Here’s a simple example:

Proposition A.5. Suppose B ⊆ D and C ⊆ E. Then B ∪ C ⊆ D ∪ E.

Proof. Assume (a) that B ⊆ D and (b) C ⊆ E. By definition, any x ∈ B is also


∈ D (c) and any x ∈ C is also ∈ E (d). To show that B ∪ C ⊆ D ∪ E, we have to
show that if x ∈ B ∪ C then x ∈ D ∪ E (by definition of ⊆). x ∈ B ∪ C iff x ∈ B
or x ∈ C (by definition of ∪). Similarly, x ∈ D ∪ E iff x ∈ D or x ∈ E. So, we
have to show: for any x, if x ∈ B or x ∈ C, then x ∈ D or x ∈ E.

So far we’ve only unpacked definitions! We’ve reformulated our


proposition without ⊆ and ∪ and are left with trying to prove a
universal conditional claim. By what we’ve discussed above, this
is done by assuming that x is something about which we assume
the “if” part is true, and we’ll go on to show that the “then” part is


true as well. In other words, we’ll assume that x ∈ B or x ∈ C and


show that x ∈ D or x ∈ E.2

Suppose that x ∈ B or x ∈ C. We have to show that x ∈ D or x ∈ E. We


distinguish cases.
Case 1: x ∈ B. By (c), x ∈ D. Thus, x ∈ D or x ∈ E. (Here we’ve made the
inference discussed in the preceding subsection!)
Case 2: x ∈ C. By (d), x ∈ E. Thus, x ∈ D or x ∈ E.

Proving an Existence Claim


When asked to prove an existence claim, the question will usually be of the
form “prove that there is an x such that . . . x . . . ”, i.e., that there is some object that
has the property described by “. . . x . . . ”. In this case you’ll have to identify a
suitable object and show that it has the required property. This sounds straightfor-
ward, but a proof of this kind can be tricky. Typically it involves constructing
or defining an object and proving that the object so defined has the required
property. Finding the right object may be hard, proving that it has the re-
quired property may be hard, and sometimes it’s even tricky to show that
you’ve succeeded in defining an object at all!
Generally, you’d write this out by specifying the object, e.g., “let x be . . . ”
(where . . . specifies which object you have in mind), possibly proving that . . .
in fact describes an object that exists, and then go on to show that x has the
property Q. Here’s a simple example.

Proposition A.6. Suppose that x ∈ B. Then there is an A such that A ⊆ B and


A ̸= ∅.

Proof. Assume x ∈ B. Let A = { x }.

Here we’ve defined the set A by enumerating its elements. Since


we assume that x is an object, and we can always form a set by
enumerating its elements, we don’t have to show that we’ve suc-
ceeded in defining a set A here. However, we still have to show
that A has the properties required by the proposition. The proof
isn’t complete without that!

Since x ∈ A, A ̸= ∅.

This relies on the definition of A as { x } and the obvious facts that


x ∈ { x } and x ∉ ∅.

Since x is the only element of { x }, and x ∈ B, every element of A is also


an element of B. By definition of ⊆, A ⊆ B.
2 This paragraph just explains what we’re doing—it’s not part of the proof, and you don’t

have to go into all this detail when you write down your own proofs.


Using Existence Claims


Suppose you know that some existence claim is true (you’ve proved it, or it’s
a hypothesis you can use), say, “for some x, x ∈ A” or “there is an x ∈ A.” If
you want to use it in your proof, you can just pretend that you have a name
for one of the things which your hypothesis says exist. Since A contains at
least one thing, there are things to which that name might refer. You might of
course not be able to pick one out or describe it further (other than that it is
∈ A). But for the purpose of the proof, you can pretend that you have picked
it out and give a name to it. It’s important to pick a name that you haven’t
already used (or that appears in your hypotheses), otherwise things can go
wrong. In your proof, you indicate this by going from “for some x, x ∈ A” to
“Let a ∈ A.” Now you can reason about a, use some other hypotheses, etc.,
until you come to a conclusion, p. If p no longer mentions a, p is independent
of the assumption that a ∈ A, and you’ve shown that it follows just from the
assumption “for some x, x ∈ A.”

Proposition A.7. If A ̸= ∅, then A ∪ B ̸= ∅.

Proof. Suppose A ̸= ∅. So for some x, x ∈ A.

Here we first just restated the hypothesis of the proposition. This


hypothesis, i.e., A ̸= ∅, hides an existential claim, which you get
to only by unpacking a few definitions. The definition of = tells
us that A = ∅ iff every x ∈ A is also ∈ ∅ and every x ∈ ∅ is also
∈ A. Negating both sides, we get: A ̸= ∅ iff either some x ∈ A
is ∉ ∅ or some x ∈ ∅ is ∉ A. Since nothing is ∈ ∅, the second
disjunct can never be true, and “x ∈ A and x ∉ ∅” reduces to just
x ∈ A. So A ̸= ∅ iff for some x, x ∈ A. That’s an existence claim.
Now we use that existence claim by introducing a name for one of
the elements of A:

Let a ∈ A.

Now we’ve introduced a name for one of the things ∈ A. We’ll


continue to argue about a, but we’ll be careful to only assume that
a ∈ A and nothing else:

Since a ∈ A, a ∈ A ∪ B, by definition of ∪. So for some x, x ∈ A ∪ B, i.e.,


A ∪ B ̸= ∅.

In that last step, we went from “a ∈ A ∪ B” to “for some x, x ∈


A ∪ B.” That doesn’t mention a anymore, so we know that “for
some x, x ∈ A ∪ B” follows from “for some x, x ∈ A alone.” But
that means that A ∪ B ̸= ∅.


It’s maybe good practice to keep bound variables like “x” separate from
hypothetical names like a, like we did. In practice, however, we often don’t
and just use x, like so:

Suppose A ̸= ∅, i.e., there is an x ∈ A. By definition of ∪, x ∈


A ∪ B. So A ∪ B ̸= ∅.

However, when you do this, you have to be extra careful that you use different
x’s and y’s for different existential claims. For instance, the following is not a
correct proof of “If A ̸= ∅ and B ̸= ∅ then A ∩ B ̸= ∅” (which is not true).

Suppose A ̸= ∅ and B ̸= ∅. So for some x, x ∈ A and also for


some x, x ∈ B. Since x ∈ A and x ∈ B, x ∈ A ∩ B, by definition
of ∩. So A ∩ B ̸= ∅.

Can you spot where the incorrect step occurs and explain why the result does
not hold?
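If you want a concrete counterexample showing that the claimed result fails, any two disjoint nonempty sets will do. Here is one such pair, checked in Python (the particular sets are just an illustration):

    # A and B are both nonempty, but they have no elements in common,
    # so A ∩ B is empty.
    A = {1}
    B = {2}
    print(A != set(), B != set())   # prints: True True
    print(A & B)                    # prints: set(), i.e., the empty set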

A.5 An Example
Our first example is the following simple fact about unions and intersections
of sets. It will illustrate unpacking definitions, proofs of conjunctions, of uni-
versal claims, and proof by cases.

Proposition A.8. For any sets A, B, and C, A ∪ ( B ∩ C ) = ( A ∪ B) ∩ ( A ∪ C )

Let’s prove it!

Proof. We want to show that for any sets A, B, and C, A ∪ ( B ∩ C ) = ( A ∪ B) ∩


( A ∪ C)

First we unpack the definition of “=” in the statement of the propo-


sition. Recall that proving sets identical means showing that the
sets have the same elements. That is, all elements of A ∪ ( B ∩ C )
are also elements of ( A ∪ B) ∩ ( A ∪ C ), and vice versa. The “vice
versa” means that also every element of ( A ∪ B) ∩ ( A ∪ C ) must
be an element of A ∪ ( B ∩ C ). So in unpacking the definition, we
see that we have to prove a conjunction. Let’s record this:

By definition, A ∪ ( B ∩ C ) = ( A ∪ B) ∩ ( A ∪ C ) iff every element of A ∪ ( B ∩ C )


is also an element of ( A ∪ B) ∩ ( A ∪ C ), and every element of ( A ∪ B) ∩ ( A ∪ C )
is an element of A ∪ ( B ∩ C ).

Since this is a conjunction, we must prove each conjunct separately.


Let’s start with the first: let’s prove that every element of A ∪ ( B ∩
C ) is also an element of ( A ∪ B) ∩ ( A ∪ C ).


This is a universal claim, and so we consider an arbitrary element


of A ∪ ( B ∩ C ) and show that it must also be an element of ( A ∪
B) ∩ ( A ∪ C ). We’ll pick a variable to call this arbitrary element by,
say, z. Our proof continues:

First, we prove that every element of A ∪ ( B ∩ C ) is also an element of ( A ∪


B) ∩ ( A ∪ C ). Let z ∈ A ∪ ( B ∩ C ). We have to show that z ∈ ( A ∪ B) ∩ ( A ∪ C ).

Now it is time to unpack the definition of ∪ and ∩. For instance,


the definition of ∪ is: A ∪ B = {z | z ∈ A or z ∈ B}. When we
apply the definition to “A ∪ ( B ∩ C ),” the role of the “B” in the
definition is now played by “B ∩ C,” so A ∪ ( B ∩ C ) = {z | z ∈
A or z ∈ B ∩ C }. So our assumption that z ∈ A ∪ ( B ∩ C ) amounts
to: z ∈ {z | z ∈ A or z ∈ B ∩ C }. And z ∈ {z | . . . z . . .} iff . . . z . . . ,
i.e., in this case, z ∈ A or z ∈ B ∩ C.

By the definition of ∪, either z ∈ A or z ∈ B ∩ C.

Since this is a disjunction, it will be useful to apply proof by cases.


We take the two cases, and show that in each one, the conclusion
we’re aiming for (namely, “z ∈ ( A ∪ B) ∩ ( A ∪ C )”) obtains.

Case 1: Suppose that z ∈ A.

There’s not much more to work from based on our assumptions.


So let’s look at what we have to work with in the conclusion. We
want to show that z ∈ ( A ∪ B) ∩ ( A ∪ C ). Based on the definition
of ∩, if we want to show that z ∈ ( A ∪ B) ∩ ( A ∪ C ), we have to
show that it’s in both ( A ∪ B) and ( A ∪ C ). But z ∈ A ∪ B iff z ∈ A
or z ∈ B, and we already have (as the assumption of case 1) that
z ∈ A. By the same reasoning—switching C for B—z ∈ A ∪ C.
This argument went in the reverse direction, so let’s record our
reasoning in the direction needed in our proof.

Since z ∈ A, z ∈ A or z ∈ B, and hence, by definition of ∪, z ∈ A ∪ B.


Similarly, z ∈ A ∪ C. But this means that z ∈ ( A ∪ B) ∩ ( A ∪ C ), by definition
of ∩.

This completes the first case of the proof by cases. Now we want
to derive the conclusion in the second case, where z ∈ B ∩ C.

Case 2: Suppose that z ∈ B ∩ C.

Again, we are working with the intersection of two sets. Let’s ap-
ply the definition of ∩:

Since z ∈ B ∩ C, z must be an element of both B and C, by definition of ∩.


It’s time to look at our conclusion again. We have to show that z is


in both ( A ∪ B) and ( A ∪ C ). And again, the solution is immediate.
Since z ∈ B, z ∈ ( A ∪ B). Since z ∈ C, also z ∈ ( A ∪ C ). So, z ∈ ( A ∪ B) ∩
( A ∪ C ).
Here we applied the definitions of ∪ and ∩ again, but since we’ve
already recalled those definitions, and already showed that if z is
in one of two sets it is in their union, we don’t have to be as explicit
in what we’ve done.
We’ve completed the second case of the proof by cases, so now we
can assert our first conclusion.
So, if z ∈ A ∪ ( B ∩ C ) then z ∈ ( A ∪ B) ∩ ( A ∪ C ).
Now we just want to show the other direction, that every element
of ( A ∪ B) ∩ ( A ∪ C ) is an element of A ∪ ( B ∩ C ). As before, we
prove this universal claim by assuming we have an arbitrary ele-
ment of the first set and show it must be in the second set. Let’s
state what we’re about to do.
Now, assume that z ∈ ( A ∪ B) ∩ ( A ∪ C ). We want to show that z ∈ A ∪ ( B ∩
C ).
We are now working from the hypothesis that z ∈ ( A ∪ B) ∩ ( A ∪
C ). It hopefully isn’t too confusing that we’re using the same z here
as in the first part of the proof. When we finished that part, all the
assumptions we’ve made there are no longer in effect, so now we
can make new assumptions about what z is. If that is confusing to
you, just replace z with a different variable in what follows.
We know that z is in both A ∪ B and A ∪ C, by definition of ∩. And
by the definition of ∪, we can further unpack this to: either z ∈ A
or z ∈ B, and also either z ∈ A or z ∈ C. This looks like a proof
by cases again—except the “and” makes it confusing. You might
think that this amounts to there being three possibilities: z is either
in A, B or C. But that would be a mistake. We have to be careful,
so let’s consider each disjunction in turn.
By definition of ∩, z ∈ A ∪ B and z ∈ A ∪ C. By definition of ∪, z ∈ A or
z ∈ B. We distinguish cases.
Since we’re focusing on the first disjunction, we haven’t gotten our
second disjunction (from unpacking A ∪ C) yet. In fact, we don’t
need it yet. The first case is z ∈ A, and an element of a set is also
an element of the union of that set with any other. So case 1 is easy:
Case 1: Suppose that z ∈ A. It follows that z ∈ A ∪ ( B ∩ C ).


Now for the second case, z ∈ B. Here we’ll unpack the second ∪
and do another proof-by-cases:
Case 2: Suppose that z ∈ B. Since z ∈ A ∪ C, either z ∈ A or z ∈ C. We
distinguish cases further:
Case 2a: z ∈ A. Then, again, z ∈ A ∪ ( B ∩ C ).
Ok, this was a bit weird. We didn’t actually need the assumption
that z ∈ B for this case, but that’s ok.
Case 2b: z ∈ C. Then z ∈ B and z ∈ C, so z ∈ B ∩ C, and consequently,
z ∈ A ∪ ( B ∩ C ).
This concludes both proofs-by-cases and so we’re done with the
second half.
So, if z ∈ ( A ∪ B) ∩ ( A ∪ C ) then z ∈ A ∪ ( B ∩ C ).
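Having finished the proof, it can be reassuring to check the identity on a few concrete sets, e.g., in Python (one arbitrarily chosen instance; the proof is what establishes the general claim):

    # Check A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) on example sets.
    A = {1, 2}
    B = {2, 3}
    C = {3, 4}
    print(A | (B & C) == (A | B) & (A | C))   # prints True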

A.6 Another Example


Proposition A.9. If A ⊆ C, then A ∪ (C \ A) = C.

Proof. Suppose that A ⊆ C. We want to show that A ∪ (C \ A) = C.


We begin by observing that this is a conditional statement. It is
tacitly universally quantified: the proposition holds for all sets A
and C. So A and C are variables for arbitrary sets. To prove such a
statement, we assume the antecedent and prove the consequent.
We continue by using the assumption that A ⊆ C. Let’s unpack
the definition of ⊆: the assumption means that all elements of A
are also elements of C. Let’s write this down—it’s an important
fact that we’ll use throughout the proof.
By the definition of ⊆, since A ⊆ C, for all z, if z ∈ A, then z ∈ C.
We’ve unpacked all the definitions that are given to us in the as-
sumption. Now we can move onto the conclusion. We want to
show that A ∪ (C \ A) = C, and so we set up a proof similarly
to the last example: we show that every element of A ∪ (C \ A) is
also an element of C and, conversely, every element of C is an ele-
ment of A ∪ (C \ A). We can shorten this to: A ∪ (C \ A) ⊆ C and
C ⊆ A ∪ (C \ A). (Here we’re doing the opposite of unpacking a
definition, but it makes the proof a bit easier to read.) Since this is
a conjunction, we have to prove both parts. To show the first part,
i.e., that every element of A ∪ (C \ A) is also an element of C, we
assume that z ∈ A ∪ (C \ A) for an arbitrary z and show that z ∈ C.
By the definition of ∪, we can conclude that z ∈ A or z ∈ C \ A
from z ∈ A ∪ (C \ A). You should now be getting the hang of this.


A ∪ (C \ A) = C iff A ∪ (C \ A) ⊆ C and C ⊆ A ∪ (C \ A). First we prove


that A ∪ (C \ A) ⊆ C. Let z ∈ A ∪ (C \ A). So, either z ∈ A or z ∈ (C \ A).

We’ve arrived at a disjunction, and from it we want to prove that


z ∈ C. We do this using proof by cases.

Case 1: z ∈ A. Since for all z, if z ∈ A, z ∈ C, we have that z ∈ C.

Here we’ve used the fact recorded earlier which followed from the
hypothesis of the proposition that A ⊆ C. The first case is com-
plete, and we turn to the second case, z ∈ (C \ A). Recall that
C \ A denotes the difference of the two sets, i.e., the set of all ele-
ments of C which are not elements of A. But any element of C not
in A is in particular an element of C.

Case 2: z ∈ (C \ A). This means that z ∈ C and z ∉ A. So, in particular, z ∈ C.

Great, we’ve proved the first direction. Now for the second direc-
tion. Here we prove that C ⊆ A ∪ (C \ A). So we assume that
z ∈ C and prove that z ∈ A ∪ (C \ A).

Now let z ∈ C. We want to show that z ∈ A or z ∈ C \ A.

Since all elements of A are also elements of C, and C \ A is the set of


all things that are elements of C but not A, it follows that z is either
in A or in C \ A. This may be a bit unclear if you don’t already
know why the result is true. It would be better to prove it step-by-
step. It will help to use a simple fact which we can state without
proof: z ∈ A or z ∉ A. This is called the “principle of excluded
middle:” for any statement p, either p is true or its negation is true.
(Here, p is the statement that z ∈ A.) Since this is a disjunction, we
can again use proof-by-cases.

Either z ∈ A or z ∉ A. In the former case, z ∈ A ∪ (C \ A). In the latter case,
z ∈ C and z ∉ A, so z ∈ C \ A. But then z ∈ A ∪ (C \ A).

Our proof is complete: we have shown that A ∪ (C \ A) = C.
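A quick mechanical check on sample sets can complement (though never replace) such a proof. In Python, - is set difference and <= is the subset test:

    # Check that A ∪ (C \ A) = C for example sets with A ⊆ C.
    A = {1, 2}
    C = {1, 2, 3}
    print(A <= C)              # prints True: the hypothesis A ⊆ C holds here
    print(A | (C - A) == C)    # prints True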

A.7 Proof by Contradiction


In the first instance, proof by contradiction is an inference pattern that is used
to prove negative claims. Suppose you want to show that some claim p is false,
i.e., you want to show ∼ p. The most promising strategy is to (a) suppose that
p is true, and (b) show that this assumption leads to something you know to
be false. “Something known to be false” may be a result that conflicts with—
contradicts—p itself, or some other hypothesis of the overall claim you are


considering. For instance, a proof of “if q then ∼ p” involves assuming that


q is true and proving ∼ p from it. If you prove ∼ p by contradiction, that means
assuming p in addition to q. If you can prove ∼q from p, you have shown that
the assumption p leads to something that contradicts your other assumption q,
since q and ∼q cannot both be true. Of course, you have to use other inference
patterns in your proof of the contradiction, as well as unpacking definitions.
Let’s consider an example.

Proposition A.10. If A ⊆ B and B = ∅, then A has no elements.

Proof. Suppose A ⊆ B and B = ∅. We want to show that A has no elements.

Since this is a conditional claim, we assume the antecedent and


want to prove the consequent. The consequent is: A has no ele-
ments. We can make that a bit more explicit: it’s not the case that
there is an x ∈ A.

A has no elements iff it’s not the case that there is an x such that x ∈ A.

So we’ve determined that what we want to prove is really a nega-


tive claim ∼ p, namely: it’s not the case that there is an x ∈ A. To
use proof by contradiction, we have to assume the corresponding
positive claim p, i.e., there is an x ∈ A, and prove a contradiction
from it. We indicate that we’re doing a proof by contradiction by
writing “by way of contradiction, assume” or even just “suppose
not,” and then state the assumption p.

Suppose not: there is an x ∈ A.

This is now the new assumption we’ll use to obtain a contradic-


tion. We have two more assumptions: that A ⊆ B and that B = ∅.
The first gives us that x ∈ B:

Since A ⊆ B, x ∈ B.

But since B = ∅, every element of B (e.g., x) must also be an ele-


ment of ∅.

Since B = ∅, x ∈ ∅. This is a contradiction, since by definition ∅ has no


elements.

This already completes the proof: we’ve arrived at what we need


(a contradiction) from the assumptions we’ve set up, and this means
that the assumptions can’t all be true. Since the first two assump-
tions (A ⊆ B and B = ∅) are not contested, it must be the last
assumption introduced (there is an x ∈ A) that must be false. But
if we want to be thorough, we can spell this out.


Thus, our assumption that there is an x ∈ A must be false, hence, A has no


elements by proof by contradiction.

Every positive claim is trivially equivalent to a negative claim: p iff ∼∼ p.


So proofs by contradiction can also be used to establish positive claims “indi-
rectly,” as follows: To prove p, read it as the negative claim ∼∼ p. If we can
prove a contradiction from ∼ p, we’ve established ∼∼ p by proof by contradic-
tion, and hence p.
In the last example, we aimed to prove a negative claim, namely that A
has no elements, and so the assumption we made for the purpose of proof
by contradiction (i.e., that there is an x ∈ A) was a positive claim. It gave
us something to work with, namely the hypothetical x ∈ A about which we
continued to reason until we got to x ∈ ∅.
When proving a positive claim indirectly, the assumption you’d make for
the purpose of proof by contradiction would be negative. But very often you
can easily reformulate a positive claim as a negative claim, and a negative
claim as a positive claim. Our previous proof would have been essentially the
same had we proved “A = ∅” instead of the negative consequent “A has no
elements.” (By definition of =, “A = ∅” is a general claim, since it unpacks to
“every element of A is an element of ∅ and vice versa”.) But it is easily seen
to be equivalent to the negative claim “not: there is an x ∈ A.”
So it is sometimes easier to work with ∼ p as an assumption than it is to
prove p directly. Even when a direct proof is just as simple or even simpler
(as in the next examples), some people prefer to proceed indirectly. If the dou-
ble negation confuses you, think of a proof by contradiction of some claim as
a proof of a contradiction from the opposite claim. So, a proof by contradic-
tion of ∼ p is a proof of a contradiction from the assumption p; and proof by
contradiction of p is a proof of a contradiction from ∼ p.
Proposition A.11. A ⊆ A ∪ B.

Proof. We want to show that A ⊆ A ∪ B.


On the face of it, this is a positive claim: every x ∈ A is also in
A ∪ B. The negation of that is: some x ∈ A is ∉ A ∪ B. So we can
showing that it leads to a contradiction.
Suppose not, i.e., A ⊈ A ∪ B.
We have a definition of A ⊆ A ∪ B: every x ∈ A is also ∈ A ∪ B.
To understand what A ⊈ A ∪ B means, we have to use some ele-
mentary logical manipulation on the unpacked definition: it’s false
that every x ∈ A is also ∈ A ∪ B iff there is some x ∈ A that is

∉ A ∪ B. (This is a place where you want to be very careful: many stu-
dents’ attempted proofs by contradiction fail because they analyze


the negation of a claim like “all As are Bs” incorrectly.) In other


words, A ⊈ A ∪ B iff there is an x such that x ∈ A and x ∉ A ∪ B.
From then on, it’s easy.
So, there is an x ∈ A such that x ∉ A ∪ B. By definition of ∪, x ∈ A ∪ B
iff x ∈ A or x ∈ B. Since x ∈ A, we have x ∈ A ∪ B. This contradicts the
assumption that x ∉ A ∪ B.

Proposition A.12. If A ⊆ B and B ⊆ C then A ⊆ C.

Proof. Suppose A ⊆ B and B ⊆ C. We want to show A ⊆ C.


Let’s proceed indirectly: we assume the negation of what we want
to establish.
Suppose not, i.e., A ⊈ C.
As before, we reason that A ⊈ C iff not every x ∈ A is also ∈ C,
i.e., some x ∈ A is ∉ C. Don’t worry, with practice you won’t have
to think hard anymore to unpack negations like this.
In other words, there is an x such that x ∈ A and x ∉ C.
Now we can use this to get to our contradiction. Of course, we’ll
have to use the other two assumptions to do it.
Since A ⊆ B, x ∈ B. Since B ⊆ C, x ∈ C. But this contradicts x ∉ C.

Proposition A.13. If A ∪ B = A ∩ B then A = B.

Proof. Suppose A ∪ B = A ∩ B. We want to show that A = B.


The beginning is now routine:
Assume, by way of contradiction, that A ̸= B.
Our assumption for the proof by contradiction is that A ̸= B. Since
A = B iff A ⊆ B and B ⊆ A, we get that A ̸= B iff A ⊈ B or B ⊈ A.
(Note how important it is to be careful when manipulating nega-
tions!) To prove a contradiction from this disjunction, we use a
proof by cases and show that in each case, a contradiction follows.
A ̸= B iff A ⊈ B or B ⊈ A. We distinguish cases.
In the first case, we assume A ⊈ B, i.e., for some x, x ∈ A but x ∉ B.
A ∩ B is defined as those elements that A and B have in common,
so if something isn’t in one of them, it’s not in the intersection.
A ∪ B is A together with B, so anything in either is also in the
union. This tells us that x ∈ A ∪ B but x ∉ A ∩ B, and hence that
A ∩ B ̸= A ∪ B.


Case 1: A ⊈ B. Then for some x, x ∈ A but x ∉ B. Since x ∉ B,
x ∉ A ∩ B. Since x ∈ A, x ∈ A ∪ B. So, A ∩ B ̸= A ∪ B, contradicting the
assumption that A ∩ B = A ∪ B.
Case 2: B ⊈ A. Then for some y, y ∈ B but y ∉ A. As before, we have
y ∈ A ∪ B but y ∉ A ∩ B, and so A ∩ B ̸= A ∪ B, again contradicting A ∩ B =
A ∪ B.

A.8 Reading Proofs


Proofs you find in textbooks and articles very seldom give all the details we
have so far included in our examples. Authors often do not draw attention
to when they distinguish cases or when they give an indirect proof, and they
may not mention that they use a definition. So when you read a proof in a textbook,
you will often have to fill in those details for yourself in order to understand
the proof. Doing this is also good practice to get the hang of the various moves
you have to make in a proof. Let’s look at an example.

Proposition A.14 (Absorption). For all sets A, B,

A ∩ ( A ∪ B) = A

Proof. If z ∈ A ∩ ( A ∪ B), then z ∈ A, so A ∩ ( A ∪ B) ⊆ A. Now suppose


z ∈ A. Then also z ∈ A ∪ B, and therefore also z ∈ A ∩ ( A ∪ B).

The preceding proof of the absorption law is very condensed. There is no


mention of any definitions used, no “we have to prove that” before we prove
it, etc. Let’s unpack it. The proposition proved is a general claim about any
sets A and B, and when the proof mentions A or B, these are variables for
arbitrary sets. The general claim the proof establishes is what’s required to
prove identity of sets, i.e., that every element of the left side of the identity is
an element of the right and vice versa.

“If z ∈ A ∩ ( A ∪ B), then z ∈ A, so A ∩ ( A ∪ B) ⊆ A.”

This is the first half of the proof of the identity: it establishes that if an
arbitrary z is an element of the left side, it is also an element of the right, i.e.,
A ∩ ( A ∪ B) ⊆ A. Assume that z ∈ A ∩ ( A ∪ B). Since z is an element of
the intersection of two sets iff it is an element of both sets, we can conclude
that z ∈ A and also z ∈ A ∪ B. In particular, z ∈ A, which is what we
wanted to show. Since that’s all that has to be done for the first half, we know
that the rest of the proof must be a proof of the second half, i.e., a proof that
A ⊆ A ∩ ( A ∪ B ).

“Now suppose z ∈ A. Then also z ∈ A ∪ B, and therefore also


z ∈ A ∩ ( A ∪ B).”


We start by assuming that z ∈ A, since we are showing that, for any z, if


z ∈ A then z ∈ A ∩ ( A ∪ B). To show that z ∈ A ∩ ( A ∪ B), we have to show
(by definition of “∩”) that (i) z ∈ A and also (ii) z ∈ A ∪ B. Here (i) is just
our assumption, so there is nothing further to prove, and that’s why the proof
does not mention it again. For (ii), recall that z is an element of a union of sets
iff it is an element of at least one of those sets. Since z ∈ A, and A ∪ B is the
union of A and B, this is the case here. So z ∈ A ∪ B. We’ve shown both (i)
z ∈ A and (ii) z ∈ A ∪ B, hence, by definition of “∩,” z ∈ A ∩ ( A ∪ B). The
proof doesn’t mention those definitions; it’s assumed the reader has already
internalized them. If you haven’t, you’ll have to go back and remind yourself
what they are. Then you’ll also have to recognize why it follows from z ∈ A
that z ∈ A ∪ B, and from z ∈ A and z ∈ A ∪ B that z ∈ A ∩ ( A ∪ B).
Here’s another version of the proof above, with everything made explicit:

Proof. [By definition of = for sets, to prove A ∩ ( A ∪ B) = A we have to show (a)
A ∩ ( A ∪ B) ⊆ A and (b) A ⊆ A ∩ ( A ∪ B). (a): By definition of ⊆, we have
to show that if z ∈ A ∩ ( A ∪ B), then z ∈ A.] If z ∈ A ∩ ( A ∪ B), then
z ∈ A [since by definition of ∩, z ∈ A ∩ ( A ∪ B) iff z ∈ A and z ∈ A ∪ B],
so A ∩ ( A ∪ B) ⊆ A. [(b): By definition of ⊆, we have to show that if z ∈ A,
then z ∈ A ∩ ( A ∪ B).] Now suppose [(1)] z ∈ A. Then also [(2)] z ∈ A ∪ B
[since by (1) z ∈ A or z ∈ B, which by definition of ∪ means z ∈ A ∪ B], and
therefore also z ∈ A ∩ ( A ∪ B) [since the definition of ∩ requires that z ∈ A,
i.e., (1), and z ∈ A ∪ B, i.e., (2)].
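If you want a quick sanity check of such identities while reading a proof, you can
try them out on concrete sets. The following Python snippet is only an illustration
(it tests one particular choice of A and B, which of course proves nothing in general),
but it can catch misremembered laws:

```python
# Two arbitrary example sets; any finite choices would do.
A = {1, 2, 3}
B = {3, 4}

assert A <= A | B          # Proposition A.11: A ⊆ A ∪ B
assert A & (A | B) == A    # Proposition A.14 (Absorption): A ∩ (A ∪ B) = A
```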

A.9 I Can’t Do It!


We all get to a point where we feel like giving up. But you can do it. Your
instructor and teaching assistant, as well as your fellow students, can help.
Ask them for help! Here are a few tips to help you avoid a crisis, and what to
do if you feel like giving up.
To make sure you can solve problems successfully, do the following:
1. Start as far in advance as possible. We get busy throughout the semester,
and many of us struggle with procrastination, so one of the best things you
can do is to start your homework assignments early. That way, if you’re
stuck, you have time to look for a solution (that isn’t crying).
2. Talk to your classmates. You are not alone. Others in the class may also
struggle—but they may struggle with different things. Talking it out
with your peers can give you a different perspective on the problem that
might lead to a breakthrough. Of course, don’t just copy their solution:
ask them for a hint, or explain where you get stuck and ask them for the
next step. And when you do get it, reciprocate. Helping someone else
along, and explaining things will help you understand better, too.


3. Ask for help. You have many resources available to you—your instructor
and teaching assistant are there for you and want you to succeed. They
should be able to help you work out a problem and identify where in
the process you’re struggling.

4. Take a break. If you’re stuck, it might be because you’ve been staring at the
problem for too long. Take a short break, have a cup of tea, or work on
a different problem for a while, then return to the problem with a fresh
mind. Sleep on it.

Notice how these strategies require that you’ve started to work on the
proof well in advance? If you’ve started the proof at 2am the day before it’s
due, these might not be so helpful.
This might sound like doom and gloom, but solving a proof is a challenge
that pays off in the end. Some people do this as a career—so there must be
something to enjoy about it. Like basically everything, solving problems and
doing proofs is something that requires practice. You might see classmates
who find this easy: they’ve probably just had lots of practice already. Try not
to give in too easily.
If you do run out of time (or patience) on a particular problem: that’s ok. It
doesn’t mean you’re stupid or that you will never get it. Find out (from your
instructor or another student) how it is done, and identify where you went
wrong or got stuck, so you can avoid doing that the next time you encounter
a similar issue. Then try to do it without looking at the solution. And next
time, start (and ask for help) earlier.

A.10 Other Resources


There are many books on how to do proofs in mathematics which may be
useful. Check out How to Read and do Proofs: An Introduction to Mathemati-
cal Thought Processes (Solow, 2013) and How to Prove It: A Structured Approach
(Velleman, 2019) in particular. The Book of Proof (Hammack, 2013) and Math-
ematical Reasoning (Sandstrum, 2019) are books on proof that are freely avail-
able online. Philosophers might find More Precisely: The Math you need to do
Philosophy (Steinhart, 2018) to be a good primer on mathematical reasoning.
There are also various shorter guides to proofs available on the internet;
e.g., “Introduction to Mathematical Arguments” (Hutchings, 2003) and “How
to write proofs” (Cheng, 2004).

Motivational Videos
Feel like you have no motivation to do your homework? Feeling down? These
videos might help!

• https://www.youtube.com/watch?v=ZXsQAXx_ao0


• https://www.youtube.com/watch?v=BQ4yd2W50No

• https://www.youtube.com/watch?v=StTqXEQ2l-Y

Appendix B

Induction

B.1 Introduction
Induction is an important proof technique which is used, in different forms,
in almost all areas of logic, theoretical computer science, and mathematics. It
is needed to prove many of the results in logic.
Induction is often contrasted with deduction, and characterized as the in-
ference from the particular to the general. For instance, if we observe many
green emeralds, and nothing that we would call an emerald that’s not green,
we might conclude that all emeralds are green. This is an inductive infer-
ence, in that it proceeds from many particular cases (this emerald is green,
that emerald is green, etc.) to a general claim (all emeralds are green). Math-
ematical induction is also an inference that concludes a general claim, but it is
of a very different kind than this “simple induction.”
Very roughly, an inductive proof in mathematics concludes that all math-
ematical objects of a certain sort have a certain property. In the simplest case,
the mathematical objects an inductive proof is concerned with are natural
numbers. In that case an inductive proof is used to establish that all natural
numbers have some property, and it does this by showing that

1. 0 has the property, and

2. whenever a number k has the property, so does k + 1.

Induction on natural numbers can then also often be used to prove general
claims about mathematical objects that can be assigned numbers. For instance,
finite sets each have a finite number n of elements, and if we can use induction
to show that every number n has the property “all finite sets of size n are . . . ”
then we will have shown something about all finite sets.
Induction can also be generalized to mathematical objects that are induc-
tively defined. For instance, expressions of a formal language such as those of
first-order logic are defined inductively. Structural induction is a way to prove


results about all such expressions. Structural induction, in particular, is very


useful—and widely used—in logic.

B.2 Induction on N
In its simplest form, induction is a technique used to prove results for all nat-
ural numbers. It uses the fact that by starting from 0 and repeatedly adding 1
we eventually reach every natural number. So to prove that something is true
for every number, we can (1) establish that it is true for 0 and (2) show that
whenever it is true for a number n, it is also true for the next number n + 1. If
we abbreviate “number n has property P” by P(n) (and “number k has prop-
erty P” by P(k), etc.), then a proof by induction that P(n) for all n ∈ N consists
of:
1. a proof of P(0), and

2. a proof that, for any k, if P(k ) then P(k + 1).


To make this crystal clear, suppose we have both (1) and (2). Then (1) tells us
that P(0) is true. If we also have (2), we know in particular that if P(0) then
P(0 + 1), i.e., P(1). This follows from the general statement “for any k, if P(k )
then P(k + 1)” by putting 0 for k. So by modus ponens, we have that P(1).
From (2) again, now taking 1 for k, we have: if P(1) then P(2). Since we’ve
just established P(1), by modus ponens, we have P(2). And so on. For any
number n, after doing this n times, we eventually arrive at P(n). So (1) and (2)
together establish P(n) for any n ∈ N.
Let’s look at an example. Suppose we want to find out how many different
sums we can throw with n dice. Although it might seem silly, let’s start with
0 dice. If you have no dice there’s only one possible sum you can “throw”:
no dots at all, which sums to 0. So the number of different possible throws
is 1. If you have only one die, i.e., n = 1, there are six possible values, 1
through 6. With two dice, we can throw any sum from 2 through 12, that’s
11 possibilities. With three dice, we can throw any number from 3 to 18, i.e.,
16 different possibilities. 1, 6, 11, 16: looks like a pattern: maybe the answer
is 5n + 1? Of course, 5n + 1 is the maximum possible, because there are only
5n + 1 numbers between n, the lowest value you can throw with n dice (all
1’s) and 6n, the highest you can throw (all 6’s).
Theorem B.1. With n dice one can throw all 5n + 1 possible values between n and
6n.

Proof. Let P(n) be the claim: “It is possible to throw any number between n
and 6n using n dice.” To use induction, we prove:
1. The induction basis P(1), i.e., with just one die, you can throw any num-
ber between 1 and 6.


2. The induction step, for all k, if P(k) then P(k + 1).

(1) is proved by inspecting a 6-sided die. It has 6 sides, and every num-
ber between 1 and 6 shows up on one of the sides. So it is possible to throw
any number between 1 and 6 using a single die.
To prove (2), we assume the antecedent of the conditional, i.e., P(k ). This
assumption is called the inductive hypothesis. We use it to prove P(k + 1). The
hard part is to find a way of thinking about the possible values of a throw of
k + 1 dice in terms of the possible values of throws of k dice plus of throws of
the extra k + 1-st die—this is what we have to do, though, if we want to use
the inductive hypothesis.
The inductive hypothesis says we can get any number between k and 6k
using k dice. If we throw a 1 with our (k + 1)-st die, this adds 1 to the total.
So we can throw any value between k + 1 and 6k + 1 by throwing k dice and
then rolling a 1 with the (k + 1)-st die. What’s left? The values 6k + 2 through
6k + 6. We can get these by rolling k 6s and then a number between 2 and 6
with our (k + 1)-st die. Together, this means that with k + 1 dice we can throw
any of the numbers between k + 1 and 6(k + 1), i.e., we’ve proved P(k + 1)
using the assumption P(k), the inductive hypothesis.
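Because the claim is about finitely many throws for each n, small cases can also be
checked by brute force. Here is a short Python sketch (an illustration, not a
replacement for the induction, since it only covers a few values of n) that enumerates
all throws of n dice and compares the obtainable sums with the range from n to 6n:

```python
from itertools import product

def throwable_sums(n):
    """All sums obtainable by throwing n six-sided dice (brute force)."""
    return {sum(dice) for dice in product(range(1, 7), repeat=n)}

for n in range(1, 5):
    sums = throwable_sums(n)
    assert sums == set(range(n, 6 * n + 1))   # Theorem B.1 for this n
    assert len(sums) == 5 * n + 1             # exactly 5n + 1 possible values
```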

Very often we use induction when we want to prove something about a


series of objects (numbers, sets, etc.) that is itself defined “inductively,” i.e.,
by defining the (n + 1)-st object in terms of the n-th. For instance, we can
define the sum sn of the natural numbers up to n by

s0 = 0
sn+1 = sn + (n + 1)

This definition gives:

s0 = 0,
s1 = s0 + 1 = 1,
s2 = s1 + 2 = 1 + 2 = 3,
s3 = s2 + 3 = 1 + 2 + 3 = 6, etc.

Now we can prove, by induction, that sn = n(n + 1)/2.

Proposition B.2. sn = n(n + 1)/2.

Proof. We have to prove (1) that s0 = 0 · (0 + 1)/2 and (2) if sk = k(k + 1)/2
then sk+1 = (k + 1)(k + 2)/2. (1) is obvious. To prove (2), we assume the
inductive hypothesis: sk = k (k + 1)/2. Using it, we have to show that sk+1 =
(k + 1)(k + 2)/2.


What is sk+1 ? By the definition, sk+1 = sk + (k + 1). By inductive hypoth-


esis, sk = k(k + 1)/2. We can substitute this into the previous equation, and
then just need a bit of arithmetic of fractions:

sk+1 = k(k + 1)/2 + (k + 1)
     = k(k + 1)/2 + 2(k + 1)/2
     = (k(k + 1) + 2(k + 1))/2
     = (k + 2)(k + 1)/2.
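The recursive definition of sn translates directly into code, which makes it easy to
spot-check the closed form for small n (a sanity check only; it is the induction above
that establishes the claim for all n):

```python
def s(n):
    """s_n from the recursive definition: s_0 = 0, s_{n+1} = s_n + (n + 1)."""
    return 0 if n == 0 else s(n - 1) + n

for n in range(50):
    assert s(n) == n * (n + 1) // 2   # the closed form of Proposition B.2
```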

The important lesson here is that if you’re proving something about some
inductively defined sequence an , induction is the obvious way to go. And
even if it isn’t (as in the case of the possibilities of dice throws), you can use
induction if you can somehow relate the case for k + 1 to the case for k.

B.3 Strong Induction


In the principle of induction discussed above, we prove P(0) and also if P(k),
then P(k + 1). In the second part, we assume that P(k) is true and use this
assumption to prove P(k + 1). Equivalently, of course, we could assume P(k −
1) and use it to prove P(k)—the important part is that we be able to carry out
the inference from any number to its successor; that we can prove the claim in
question for any number under the assumption it holds for its predecessor.
There is a variant of the principle of induction in which we don’t just as-
sume that the claim holds for the predecessor k − 1 of k, but for all numbers
smaller than k, and use this assumption to establish the claim for k. This also
gives us the claim P(n) for all n ∈ N. For once we have established P(0), we
have thereby established that P holds for all numbers less than 1. And if we
know that if P(l ) for all l < k, then P(k), we know this in particular for k = 1.
So we can conclude P(1). With this we have proved P(0) and P(1), i.e., P(l )
for all l < 2, and since we have also the conditional, if P(l ) for all l < 2, then
P(2), we can conclude P(2), and so on.
In fact, if we can establish the general conditional “for all k, if P(l ) for all
l < k, then P(k ),” we do not have to establish P(0) anymore, since it follows
from it. For remember that a general claim like “for all l < k, P(l )” is true if
there are no l < k. This is a case of vacuous quantification: “all As are Bs” is
true if there are no As, ∀ x ( φ( x ) ⊃ ψ( x )) is true if no x satisfies φ( x ). In this
case, the formalized version would be “∀l (l < k ⊃ P(l ))”—and that is true if
there are no l < k. And if k = 0 that’s exactly the case: no l < 0, hence “for all
l < 0, P(l )” is true, whatever P is. A proof of “if P(l ) for all l < k, then P(k)”
thus automatically establishes P(0).


This variant is useful if establishing the claim for k can’t be made to just
rely on the claim for k − 1 but may require the assumption that it is true for
one or more l < k.

B.4 Inductive Definitions


In logic we very often define kinds of objects inductively, i.e., by specifying
rules for what counts as an object of the kind to be defined which explain how
to get new objects of that kind from old objects of that kind. For instance,
we often define special kinds of sequences of symbols, such as the terms and
formulae of a language, by induction. For a simple example, consider strings
consisting of letters a, b, c, d, the symbol ◦, and brackets [ and ], such
as “[[c ◦ d][”, “[a[]◦]”, “a” or “[[a ◦ b] ◦ d]”. You probably feel that there’s
something “wrong” with the first two strings: the brackets don’t “balance” at
all in the first, and you might feel that the “◦” should “connect” expressions
that themselves make sense. The third and fourth string look better: for every
“[” there’s a closing “]” (if there are any at all), and for any ◦ we can find “nice”
expressions on either side, surrounded by a pair of parentheses.
We would like to precisely specify what counts as a “nice term.” First of
all, every letter by itself is nice. Anything that’s not just a letter by itself should
be of the form “[t ◦ s]” where s and t are themselves nice. Conversely, if t and
s are nice, then we can form a new nice term by putting a ◦ between them and
surround them by a pair of brackets. We might use these operations to define
the set of nice terms. This is an inductive definition.

Definition B.3 (Nice terms). The set of nice terms is inductively defined as fol-
lows:

1. Any letter a, b, c, d is a nice term.

2. If s1 and s2 are nice terms, then so is [s1 ◦ s2 ].

3. Nothing else is a nice term.

This definition tells us that something counts as a nice term iff it can be
constructed according to the two conditions (1) and (2) in some finite number
of steps. In the first step, we construct all nice terms just consisting of letters
by themselves, i.e.,
a, b, c, d
In the second step, we apply (2) to the terms we’ve constructed. We’ll get

[a ◦ a], [a ◦ b], [b ◦ a], . . . , [d ◦ d]

for all combinations of two letters. In the third step, we apply (2) again, to any
two nice terms we’ve constructed so far. We get new nice terms such as [a ◦ [a ◦


a]]—where s1 is a from step 1 and s2 is [a ◦ a] from step 2—and [[b ◦ c] ◦ [d ◦ b]]


constructed out of the two terms [b ◦ c] and [d ◦ b] from step 2. And so on.
Clause (3) rules out that anything not constructed in this way sneaks into the
set of nice terms.
Note that we have not yet proved that every sequence of symbols that
“feels” nice is nice according to this definition. However, it should be clear
that everything we can construct does in fact “feel nice”: brackets are bal-
anced, and ◦ connects parts that are themselves nice.
The key feature of inductive definitions is that if you want to prove some-
thing about all nice terms, the definition tells you which cases you must con-
sider. For instance, if you are told that t is a nice term, the inductive definition
tells you what t can look like: t can be a letter, or it can be [s1 ◦ s2 ] for some pair
of nice terms s1 and s2 . Because of clause (3), those are the only possibilities.
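The inductive definition also translates directly into a recursive test. The following
Python sketch (the function name is ours, and terms are written without spaces; trying
every possible split is slow on long strings, but fine for an illustration) decides
whether a string over a, b, c, d, ◦, [, ] is a nice term by checking the two clauses
of Definition B.3:

```python
LETTERS = {"a", "b", "c", "d"}

def is_nice(t):
    """Is the string t a nice term in the sense of Definition B.3?"""
    if t in LETTERS:                      # clause (1): a letter by itself
        return True
    if len(t) < 5 or t[0] != "[" or t[-1] != "]":
        return False                      # too short, or not of the form [...]
    inner = t[1:-1]                       # clause (2): try to read t as [s1 ◦ s2]
    return any(
        inner[i] == "◦" and is_nice(inner[:i]) and is_nice(inner[i + 1:])
        for i in range(len(inner))
    )

assert is_nice("a") and is_nice("[[a◦b]◦d]")       # the "good" examples above
assert not is_nice("[[c◦d][") and not is_nice("[a[]◦]")
```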
When proving claims about all of an inductively defined set, the strong
form of induction becomes particularly important. For instance, suppose we
want to prove that for every nice term of length n, the number of [ in it is <
n/2. This can be seen as a claim about all n: for every n, the number of [ in
any nice term of length n is < n/2.

Proposition B.4. For any n, the number of [ in a nice term of length n is < n/2.

Proof. To prove this result by (strong) induction, we have to show that the
following conditional claim is true:

If for every l < k, any nice term of length l has < l/2 [’s, then any
nice term of length k has < k/2 [’s.

To show this conditional, assume that its antecedent is true, i.e., assume that
for any l < k, nice terms of length l contain < l/2 [’s. We call this assumption
the inductive hypothesis. We want to show the same is true for nice terms of
length k.
So suppose t is a nice term of length k. Because nice terms are inductively
defined, we have two cases: (1) t is a letter by itself, or (2) t is [s1 ◦ s2 ] for some
nice terms s1 and s2 .

1. t is a letter. Then k = 1, and the number of [ in t is 0. Since 0 < 1/2, the


claim holds.

2. t is [s1 ◦ s2 ] for some nice terms s1 and s2 . Let’s let l1 be the length of s1
and l2 be the length of s2 . Then the length k of t is l1 + l2 + 3 (the lengths
of s1 and s2 plus three symbols [, ◦, ]). Since l1 + l2 + 3 is always greater
than l1 , l1 < k. Similarly, l2 < k. That means that the induction hypothe-
sis applies to the terms s1 and s2 : the number m1 of [ in s1 is < l1 /2, and
the number m2 of [ in s2 is < l2 /2.


The number of [ in t is the number of [ in s1 , plus the number of [ in s2 ,


plus 1, i.e., it is m1 + m2 + 1. Since m1 < l1 /2 and m2 < l2 /2 we have:

m1 + m2 + 1 < l1 /2 + l2 /2 + 1 = (l1 + l2 + 2)/2 < (l1 + l2 + 3)/2 = k/2.

In each case, we’ve shown that the number of [ in t is < k/2 (on the basis of
the inductive hypothesis). By strong induction, the proposition follows.
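As with the dice example, the induction can be supplemented by a brute-force check of
small cases. The sketch below (illustrative only; it builds nice terms, written without
spaces, up to a small construction depth) generates terms by repeatedly applying
clause (2) of Definition B.3 and verifies the bound just proved:

```python
def nice_terms(depth):
    """All nice terms obtainable from the letters by applying clause (2)
    of Definition B.3 at most `depth` times."""
    terms = {"a", "b", "c", "d"}
    for _ in range(depth):
        terms = terms | {f"[{s1}◦{s2}]" for s1 in terms for s2 in terms}
    return terms

for t in nice_terms(2):
    assert t.count("[") < len(t) / 2      # Proposition B.4
```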

B.5 Structural Induction


So far we have used induction to establish results about all natural numbers.
But a corresponding principle can be used directly to prove results about all
elements of an inductively defined set. This is often called structural induction,
because it depends on the structure of the inductively defined objects.
Generally, an inductive definition is given by (a) a list of “initial” elements
of the set and (b) a list of operations which produce new elements of the set
from old ones. In the case of nice terms, for instance, the initial objects are the
letters. We only have one operation:

o (s1 , s2 ) = [s1 ◦ s2 ]

You can even think of the natural numbers N themselves as being given by an
inductive definition: the initial object is 0, and the operation is the successor
function x + 1.
In order to prove something about all elements of an inductively defined
set, i.e., that every element of the set has a property P, we must:

1. Prove that the initial objects have P

2. Prove that for each operation o, if the arguments have P, so does the
result.

For instance, in order to prove something about all nice terms, we would
prove that it is true about all letters, and that it is true about [s1 ◦ s2 ] provided
it is true of s1 and s2 individually.

Proposition B.5. The number of [ equals the number of ] in any nice term t.

Proof. We use structural induction. Nice terms are inductively defined, with
letters as initial objects and the operation o for constructing new nice terms
out of old ones.

1. The claim is true for every letter, since the number of [ in a letter by itself
is 0 and the number of ] in it is also 0.


2. Suppose the number of [ in s1 equals the number of ], and the same is


true for s2 . The number of [ in o (s1 , s2 ), i.e., in [s1 ◦ s2 ], is the sum of the
number of [ in s1 and s2 plus one. The number of ] in o (s1 , s2 ) is the sum
of the number of ] in s1 and s2 plus one. Thus, the number of [ in o (s1 , s2 )
equals the number of ] in o (s1 , s2 ).
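Again, the claim is easy to spot-check on concrete terms (a sanity check on a few
examples, not a proof):

```python
# Some nice terms from this chapter, written without spaces.
for t in ["a", "[a◦b]", "[[a◦b]◦d]", "[[b◦c]◦[d◦b]]"]:
    assert t.count("[") == t.count("]")   # Proposition B.5
```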

Let’s give another proof by structural induction: a proper initial segment


of a string t of symbols is any string s that agrees with t symbol by symbol,
read from the left, but t is longer. So, e.g., [ a ◦ is a proper initial segment of
[ a ◦ b], but neither are [b ◦ (they disagree at the second symbol) nor [ a ◦ b]
(they are the same length).
Proposition B.6. Every proper initial segment of a nice term t has more [’s than ]’s.

Proof. By induction on t:
1. t is a letter by itself: Then t has no proper initial segments.

2. t = [s1 ◦ s2 ] for some nice terms s1 and s2 . If r is a proper initial segment


of t, there are a number of possibilities:

a) r is just [: Then r has one more [ than it does ].


b) r is [r1 where r1 is a proper initial segment of s1 : Since s1 is a nice
term, by induction hypothesis, r1 has more [ than ] and the same is
true for [r1 .
c) r is [s1 or [s1 ◦ : By the previous result, the number of [ and ] in s1
are equal; so the number of [ in [s1 or [s1 ◦ is one more than the
number of ].
d) r is [s1 ◦ r2 where r2 is a proper initial segment of s2 : By induction
hypothesis, r2 contains more [ than ]. By the previous result, the
number of [ and of ] in s1 are equal. So the number of [ in [s1 ◦ r2 is
greater than the number of ].
e) r is [s1 ◦ s2 : By the previous result, the number of [ and ] in s1 are
equal, and the same for s2 . So there is one more [ in [s1 ◦ s2 than
there are ].
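This, too, can be checked mechanically on an example (the particular term below is just
one instance, of course):

```python
t = "[[b◦c]◦[d◦b]]"   # a nice term from section B.4, written without spaces

# Every proper (non-empty) initial segment has more ['s than ]'s.
for i in range(1, len(t)):
    prefix = t[:i]
    assert prefix.count("[") > prefix.count("]")   # Proposition B.6
```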

B.6 Relations and Functions


When we have defined a set of objects (such as the natural numbers or the nice
terms) inductively, we can also define relations on these objects by induction.
For instance, consider the following idea: a nice term t1 is a subterm of a nice
term t2 if it occurs as a part of it. Let’s use a symbol for it: t1 ⊑ t2 . Every nice
term is a subterm of itself, of course: t ⊑ t. We can give an inductive definition
of this relation as follows:


Definition B.7. The relation of a nice term t1 being a subterm of t2 , t1 ⊑ t2 , is


defined by induction on t2 as follows:
1. If t2 is a letter, then t1 ⊑ t2 iff t1 = t2 .
2. If t2 is [s1 ◦ s2 ], then t1 ⊑ t2 iff t1 = t2 , t1 ⊑ s1 , or t1 ⊑ s2 .

This definition, for instance, will tell us that a ⊑ [b ◦ a]. For (2) says that
a ⊑ [b ◦ a] iff a = [b ◦ a], or a ⊑ b, or a ⊑ a. The first two are false: a
clearly isn’t identical to [b ◦ a], and by (1), a ⊑ b iff a = b, which is also false.
However, also by (1), a ⊑ a iff a = a, which is true.
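Definition B.7 is also directly executable. The Python sketch below (the helper names
are ours; terms are written without spaces, and the split helper relies on the unique
readability fact discussed next, splitting a composite term at the ◦ that sits at
bracket depth 1) computes the subterm relation by recursion on t2:

```python
LETTERS = {"a", "b", "c", "d"}

def split(t):
    """For t = [s1 ◦ s2], return (s1, s2): the splitting ◦ is the unique
    occurrence at bracket depth 1 (unique readability, Proposition B.9)."""
    depth = 0
    for i, c in enumerate(t):
        if c == "[":
            depth += 1
        elif c == "]":
            depth -= 1
        elif c == "◦" and depth == 1:
            return t[1:i], t[i + 1:-1]

def is_subterm(t1, t2):
    """t1 ⊑ t2, following the induction on t2 in Definition B.7."""
    if t2 in LETTERS:                               # clause (1)
        return t1 == t2
    s1, s2 = split(t2)                              # clause (2)
    return t1 == t2 or is_subterm(t1, s1) or is_subterm(t1, s2)

assert is_subterm("a", "[b◦a]")          # the example worked out above
assert not is_subterm("c", "[b◦a]")
assert is_subterm("[b◦a]", "[[b◦a]◦d]")
```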
It’s important to note that the success of this definition depends on a fact
that we haven’t proved yet: every nice term t is either a letter by itself, or there
are uniquely determined nice terms s1 and s2 such that t = [s1 ◦ s2 ]. “Uniquely
determined” here means that if t = [s1 ◦ s2 ] it isn’t also = [r1 ◦ r2 ] with s1 ̸= r1
or s2 ̸= r2 . If this were the case, then clause (2) may come in conflict with
itself: reading t2 as [s1 ◦ s2 ] we might get t1 ⊑ t2 , but if we read t2 as [r1 ◦ r2 ]
we might get not t1 ⊑ t2 . Before we prove that this can’t happen, let’s look at
an example where it can happen.
Definition B.8. Define bracketless terms inductively by
1. Every letter is a bracketless term.
2. If s1 and s2 are bracketless terms, then s1 ◦ s2 is a bracketless term.
3. Nothing else is a bracketless term.

Bracketless terms are, e.g., a, b ◦ d, b ◦ a ◦ b. Now if we defined “subterm”


for bracketless terms the way we did above, the second clause would read
If t2 = s1 ◦ s2 , then t1 ⊑ t2 iff t1 = t2 , t1 ⊑ s1 , or t1 ⊑ s2 .
Now b ◦ a ◦ b is of the form s1 ◦ s2 with

s1 = b and s2 = a ◦ b.

It is also of the form r1 ◦ r2 with

r1 = b ◦ a and r2 = b.

Now is a ◦ b a subterm of b ◦ a ◦ b? The answer is yes if we go by the first


reading, and no if we go by the second.
The property that the way a nice term is built up from other nice terms is
unique is called unique readability. Since inductive definitions of relations for
such inductively defined objects are important, we have to prove that it holds.
Proposition B.9. Suppose t is a nice term. Then either t is a letter by itself, or there
are uniquely determined nice terms s1 , s2 such that t = [s1 ◦ s2 ].


Proof. If t is a letter by itself, the condition is satisfied. So assume t isn’t a letter


by itself. We can tell from the inductive definition that then t must be of the
form [s1 ◦ s2 ] for some nice terms s1 and s2 . It remains to show that these are
uniquely determined, i.e., if t = [r1 ◦ r2 ], then s1 = r1 and s2 = r2 .
So suppose t = [s1 ◦ s2 ] and also t = [r1 ◦ r2 ] for nice terms s1 , s2 , r1 , r2 . We
have to show that s1 = r1 and s2 = r2 . First, s1 and r1 must be identical, for
otherwise one is a proper initial segment of the other. But by Proposition B.6,
that is impossible if s1 and r1 are both nice terms. But if s1 = r1 , then clearly
also s2 = r2 .

We can also define functions inductively: e.g., we can define the function f
that maps any nice term to the maximum depth of nested [. . . ] in it as follows:
Definition B.10. The depth of a nice term, f (t), is defined inductively as fol-
lows:

f (t) = 0 if t is a letter,
f (t) = max( f (s1 ), f (s2 )) + 1 if t = [s1 ◦ s2 ].

For instance,
f ([a ◦ b]) = max( f (a), f (b)) + 1 = max(0, 0) + 1 = 1, and
f ([[a ◦ b] ◦ c]) = max( f ([a ◦ b]), f (c)) + 1 = max(1, 0) + 1 = 2.
Here, of course, we assume that s1 and s2 are nice terms, and make use
of the fact that every nice term is either a letter or of the form [s1 ◦ s2 ]. It
is again important that it can be of this form in only one way. To see why,
consider again the bracketless terms we defined earlier. The corresponding
“definition” would be:
g(t) = 0 if t is a letter,
g(t) = max( g(s1 ), g(s2 )) + 1 if t = s1 ◦ s2 .
Now consider the bracketless term a ◦ b ◦ c ◦ d. It can be read in more than
one way, e.g., as s1 ◦ s2 with
s1 = a and s2 = b ◦ c ◦ d,

or as r1 ◦ r2 with

r1 = a ◦ b and r2 = c ◦ d.
Calculating g according to the first way of reading it would give
g(s1 ◦ s2 ) = max( g(a), g(b ◦ c ◦ d)) + 1 = max(0, 2) + 1 = 3


while according to the other reading we get

g(r1 ◦ r2 ) = max( g(a ◦ b), g(c ◦ d)) + 1 = max(1, 1) + 1 = 2

But a function must always yield a unique value; so our “definition” of g


doesn’t define a function at all.
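For nice terms, by contrast, unique readability holds, so the recursion of Definition
B.10 really does determine a function, and it can be computed mechanically. A sketch
(using the same depth-1 splitting idea as in the subterm example above; terms again
written without spaces):

```python
LETTERS = {"a", "b", "c", "d"}

def split(t):
    """Split a composite nice term [s1 ◦ s2] into (s1, s2) at the ◦ of
    bracket depth 1; well defined by unique readability (Proposition B.9)."""
    depth = 0
    for i, c in enumerate(t):
        if c == "[":
            depth += 1
        elif c == "]":
            depth -= 1
        elif c == "◦" and depth == 1:
            return t[1:i], t[i + 1:-1]

def f(t):
    """The depth of a nice term, by the recursion of Definition B.10."""
    if t in LETTERS:
        return 0
    s1, s2 = split(t)
    return max(f(s1), f(s2)) + 1

assert f("[a◦b]") == 1
assert f("[[a◦b]◦c]") == 2   # the two values computed above
```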

Appendix C

Biographies

C.1 Georg Cantor


An early biography of Georg Cantor
(GAY-org KAHN-tor) claimed that he was
born and found on a ship that was sail-
ing for Saint Petersburg, Russia, and that
his parents were unknown. This, how-
ever, is not true, although he was indeed born
in Saint Petersburg in 1845.
Cantor received his doctorate in
mathematics at the University of Berlin
in 1867. He is known for his work in
set theory, and is credited with found-
ing set theory as a distinctive research
discipline. He was the first to prove
that there are infinite sets of different
sizes. His theories, and especially his
theory of infinities, caused much debate
among mathematicians at the time, and
his work was controversial.
Figure C.1: Georg Cantor
Cantor’s religious beliefs and his
mathematical work were inextricably
tied; he even claimed that the theory of transfinite numbers had been com-
municated to him directly by God. In later life, Cantor suffered from mental
illness. Beginning in 1894, and more frequently towards his later years, Can-
tor was hospitalized. The heavy criticism of his work, including a falling out
with the mathematician Leopold Kronecker, led to depression and a lack of
interest in mathematics. During depressive episodes, Cantor would turn to
philosophy and literature, and even published a theory that Francis Bacon
was the author of Shakespeare’s plays.


Cantor died on January 6, 1918, in a sanatorium in Halle.

Further Reading For full biographies of Cantor, see Dauben (1990) and Grattan-
Guinness (1971). Cantor’s radical views are also described in the BBC Radio 4
program A Brief History of Mathematics (du Sautoy, 2014). If you’d like to hear
about Cantor’s theories in rap form, see Rose (2012).

C.2 Alonzo Church


Alonzo Church was born in Washing-
ton, DC on June 14, 1903. In early
childhood, an air gun incident left
Church blind in one eye. He finished
preparatory school in Connecticut in
1920 and began his university education
at Princeton that same year. He com-
pleted his doctoral studies in 1927. Af-
ter a couple years abroad, Church re-
turned to Princeton. Church was known
to be exceedingly polite and careful.
His blackboard writing was immaculate,
and he would preserve important pa-
pers by carefully covering them in Duco
cement (a clear glue). Outside of his aca-
demic pursuits, he enjoyed reading sci-
ence fiction magazines and was not afraid to write to the editors if he spotted
any inaccuracies in the writing.
Figure C.2: Alonzo Church
Church’s academic achievements were great. Together with his students
Stephen Kleene and Barkley Rosser, he developed a theory of effective calcu-
lability, the lambda calculus, independently of Alan Turing’s development of
the Turing machine. The two definitions of computability are equivalent, and
give rise to what is now known as the Church–Turing Thesis, that a function of
the natural numbers is effectively computable if and only if it is computable
via Turing machine (or lambda calculus). He also proved what is now known
as Church’s Theorem: The decision problem for the validity of first-order for-
mulas is unsolvable.
Church continued his work into old age. In 1967 he left Princeton for
UCLA, where he was professor until his retirement in 1990. Church passed
away on August 1, 1995 at the age of 92.

Further Reading For a brief biography of Church, see Enderton (2019). Church’s
original writings on the lambda calculus and the Entscheidungsproblem (Church’s
Thesis) are Church (1936a,b). Aspray (1984) records an interview with Church


about the Princeton mathematics community in the 1930s. Church wrote a se-
ries of book reviews of the Journal of Symbolic Logic from 1936 until 1979. They
are all archived on John MacFarlane’s website (MacFarlane, 2015).

C.3 Gerhard Gentzen


Gerhard Gentzen is known primarily
as the creator of structural proof the-
ory, and specifically the creation of the
natural deduction and sequent calculus
derivation systems. He was born on
November 24, 1909 in Greifswald, Ger-
many. Gerhard was homeschooled for
three years before attending preparatory
school, where he was behind most of his
classmates in terms of education. De-
spite this, he was a brilliant student and
showed a strong aptitude for mathematics. His interests were varied, and he,
for instance, also wrote poems for his mother and plays for the school theatre.
Figure C.3: Gerhard Gentzen
Gentzen began his university studies at the University of Greifswald, but
moved around to Göttingen, Munich, and Berlin. He received his doctorate in
1933 from the University of Göttingen under Hermann Weyl. (Paul Bernays
supervised most of his work, but was dismissed from the university by the
Nazis.) In 1934, Gentzen began work as an assistant to David Hilbert. That
same year he developed the sequent calculus and natural deduction deriva-
tion systems, in his papers Untersuchungen über das logische Schließen I–II [In-
vestigations Into Logical Deduction I–II]. He proved the consistency of the Peano
axioms in 1936.
Gentzen’s relationship with the Nazis is complicated. At the same time his
mentor Bernays was forced to leave Germany, Gentzen joined the university
branch of the SA, the Nazi paramilitary organization. Like many Germans, he
was a member of the Nazi party. During the war, he served as a telecommuni-
cations officer for the air intelligence unit. However, in 1942 he was released
from duty due to a nervous breakdown. It is unclear whether or not Gentzen’s
loyalties lay with the Nazi party, or whether he joined the party in order to en-
sure academic success.
In 1943, Gentzen was offered an academic position at the Mathematical
Institute of the German University of Prague, which he accepted. However, in
1945 the citizens of Prague revolted against German occupation. Soviet forces
arrived in the city and arrested all the professors at the university. Because of
his membership in Nazi organizations, Gentzen was taken to a forced labour
camp. He died of malnutrition while in his cell on August 4, 1945 at the age
of 35.


Further Reading For a full biography of Gentzen, see Menzler-Trott (2007).


An interesting read about mathematicians under Nazi rule, which gives a brief
note about Gentzen’s life, is given by Segal (2014). Gentzen’s papers on logical
deduction are available in the original german (Gentzen, 1935a,b). English
translations of Gentzen’s papers have been collected in a single volume by
Szabo (1969), which also includes a biographical sketch.

C.4 Kurt Gödel


Kurt Gödel (GER-dle) was born on
April 28, 1906 in Brünn in the Austro-
Hungarian empire (now Brno in the
Czech Republic). Due to his inquisitive
and bright nature, young Kurtele was
often called “Der kleine Herr Warum”
(Little Mr. Why) by his family. He ex-
celled in academics from primary school
onward, where he got less than the high-
est grade only in mathematics. Gödel
was often absent from school due to
poor health and was exempt from phys-
ical education. He was diagnosed with
rheumatic fever during his childhood.
Throughout his life, he believed this per-
manently affected his heart despite med-
ical assessment saying otherwise.
Figure C.4: Kurt Gödel
Gödel began studying at the Univer-
sity of Vienna in 1924 and completed his
doctoral studies in 1929. He first intended to study physics, but his interests
soon moved to mathematics and especially logic, in part due to the influence
of the philosopher Rudolf Carnap. His dissertation, written under the super-
vision of Hans Hahn, proved the completeness theorem of first-order predi-
cate logic with identity (Gödel, 1929). Only a year later, he obtained his most
famous results—the first and second incompleteness theorems (published in
Gödel 1931). During his time in Vienna, Gödel was heavily involved with
the Vienna Circle, a group of scientifically-minded philosophers that included
Carnap, whose work was especially influenced by Gödel’s results.
In 1938, Gödel married Adele Nimbursky. His parents were not pleased:
not only was she six years older than him and already divorced, but she
worked as a dancer in a nightclub. Social pressures did not affect Gödel, how-
ever, and they remained happily married until his death.
After Nazi Germany annexed Austria in 1938, Gödel and Adele emigrated
to the United States, where he took up a position at the Institute for Advanced


Study in Princeton, New Jersey. Despite his introversion and eccentric nature,
Gödel’s time at Princeton was collaborative and fruitful. He published essays
in set theory, philosophy and physics. Notably, he struck up a particularly
strong friendship with his colleague at the IAS, Albert Einstein.
In his later years, Gödel’s mental health deteriorated. His wife’s hospi-
talization in 1977 meant she was no longer able to cook his meals for him.
Having suffered from mental health issues throughout his life, he succumbed
to paranoia. Deathly afraid of being poisoned, Gödel refused to eat. He died
of starvation on January 14, 1978, in Princeton.

Further Reading For a complete biography of Gödel’s life, see


John Dawson (1997). For further biographical pieces, as well as essays about
Gödel’s contributions to logic and philosophy, see Wang (1990), Baaz et al.
(2011), Takeuti et al. (2003), and Sigmund et al. (2007).
Gödel’s PhD thesis is available in the original German (Gödel, 1929). The
original text of the incompleteness theorems is (Gödel, 1931). All of Gödel’s
published and unpublished writings, as well as a selection of correspondence,
are available in English in his Collected Papers Feferman et al. (1986, 1990).
For a detailed treatment of Gödel’s incompleteness theorems, see Smith
(2013). For an informal, philosophical discussion of Gödel’s theorems, see
Mark Linsenmayer’s podcast (Linsenmayer, 2014).

C.5 Emmy Noether


Emmy Noether (NER-ter) was born in Erlangen, Germany, on March 23, 1882,
to an upper-middle class scholarly family. Hailed as the “mother of modern
algebra,” Noether made groundbreaking contributions to both mathematics
and physics, despite significant barriers to women’s education. In Germany at
the time, young girls were meant to be educated in arts and were not allowed
to attend college preparatory schools. However, after auditing classes at the
Universities of Göttingen and Erlangen (where her father was professor of
mathematics), Noether was eventually able to enroll as a student at Erlangen
in 1904, when their policy was updated to allow female students. She received
her doctorate in mathematics in 1907.
Despite her qualifications, Noether experienced much resistance during
her career. From 1908–1915, she taught at Erlangen without pay. During this
time, she caught the attention of David Hilbert, one of the world’s foremost
mathematicians of the time, who invited her to Göttingen. However, women
were prohibited from obtaining professorships, and she was only able to lec-
ture under Hilbert’s name, again without pay. During this time she proved
what is now known as Noether’s theorem, which is still used in theoretical
physics today. Noether was finally granted the right to teach in 1919. Hilbert’s


response to continued resistance of his university colleagues reportedly was:


“Gentlemen, the faculty senate is not a bathhouse.”
In the later 1920s, she concentrated
on work in abstract algebra, and her con-
tributions revolutionized the field. In
her proofs she often made use of the so-
called ascending chain condition, which
states that there is no infinite strictly in-
creasing chain of certain sets. For in-
stance, certain algebraic structures now
known as Noetherian rings have the
property that there are no infinite se-
quences of ideals I1 ⊊ I2 ⊊ . . . . The
condition can be generalized to any par-
tial order (in algebra, it concerns the spe-
cial case of ideals ordered by the subset
relation), and we can also consider the
dual descending chain condition, where
every strictly decreasing sequence in a
partial order eventually ends. If a par-
tial order satisfies the descending chain
condition, it is possible to use induction along this order in a similar way in
which we can use induction along the < order on N. Such orders are called
well-founded or Noetherian, and the corresponding proof principle Noetherian
induction.
Figure C.5: Emmy Noether
Noether was Jewish, and when the Nazis came to power in 1933, she was
dismissed from her position. Luckily, Noether was able to emigrate to the
United States for a temporary position at Bryn Mawr, Pennsylvania. During
her time there she also lectured at Princeton, although she found the univer-
sity to be unwelcoming to women (Dick, 1981, 81). In 1935, Noether under-
went an operation to remove a uterine tumour. She died from an infection as
a result of the surgery, and was buried at Bryn Mawr.

Further Reading For a biography of Noether, see Dick (1981). The Perime-
ter Institute for Theoretical Physics has their lectures on Noether’s life and
influence available online (Institute, 2015). If you’re tired of reading, Stuff You
Missed in History Class has a podcast on Noether’s life and influence (Frey and
Wilson, 2015). The collected works of Noether are available in the original
German (Jacobson, 1983).

C.6 Rózsa Péter


Rózsa Péter was born Rózsa Politzer, in Budapest, Hungary, on February 17,
1905. She is best known for her work on recursive functions, which was es-
sential for the creation of the field of recursion theory.
Péter was raised during harsh polit-
ical times—WWI raged when she was
a teenager—but was able to attend the
affluent Maria Terezia Girls’ School in
Budapest, from where she graduated
in 1922. She then studied at Pázmány
Péter University (later renamed Loránd
Eötvös University) in Budapest. She
began studying chemistry at the insis-
tence of her father, but later switched
to mathematics, and graduated in 1927.
Although she had the credentials to
teach high school mathematics, the eco-
nomic situation at the time was dire as
the Great Depression affected the world
economy. During this time, Péter took
odd jobs as a tutor and private teacher
of mathematics.
Figure C.6: Rózsa Péter
She eventually returned to university to take up graduate
studies in mathematics. She had originally planned to work in number the-
ory, but after finding out that her results had already been proven, she almost
gave up on mathematics altogether. She was encouraged to work on Gödel’s
incompleteness theorems, and unknowingly proved several of his results in
different ways. This restored her confidence, and Péter went on to write her
first papers on recursion theory, inspired by David Hilbert’s foundational pro-
gram. She received her PhD in 1935, and in 1937 she became an editor for the
Journal of Symbolic Logic.
Péter’s early papers are widely credited as founding contributions to the
field of recursive function theory. In Péter (1935a), she investigated the rela-
tionship between different kinds of recursion. In Péter (1935b), she showed
that a certain recursively defined function is not primitive recursive. This
simplified an earlier result due to Wilhelm Ackermann. Péter’s simplified
function is what’s now often called the Ackermann function—and sometimes,
more properly, the Ackermann–Péter function. She wrote the first book on re-
cursive function theory (Péter, 1951).
Despite the importance and influence of her work, Péter did not obtain a
full-time teaching position until 1945. During the Nazi occupation of Hungary
during World War II, Péter was not allowed to teach due to anti-Semitic laws.
In 1944 the government created a Jewish ghetto in Budapest; the ghetto was
cut off from the rest of the city and attended by armed guards. Péter was
forced to live in the ghetto until 1945 when it was liberated. She then went on


to teach at the Budapest Teachers Training College, and from 1955 onward at
Eötvös Loránd University. She was the first female Hungarian mathematician
to become an Academic Doctor of Mathematics, and the first woman to be
elected to the Hungarian Academy of Sciences.
Péter was known as a passionate teacher of mathematics, who preferred
to explore the nature and beauty of mathematical problems with her students
rather than to merely lecture. As a result, she was affectionately called “Aunt
Rosa” by her students. Péter died in 1977 at the age of 71.

Further Reading For more biographical reading, see (O’Connor and Robert-
son, 2014) and (Andrásfai, 1986). Tamassy (1994) conducted a brief interview
with Péter. For a fun read about mathematics, see Péter’s book Playing With
Infinity (Péter, 2010).

C.7 Julia Robinson


Julia Bowman Robinson was an Amer-
ican mathematician. She is known
mainly for her work on decision prob-
lems, and most famously for her con-
tributions to the solution of Hilbert’s
tenth problem. Robinson was born in
St. Louis, Missouri, on December 8,
1919. Robinson recalls being intrigued
by numbers already as a child (Reid,
1986, 4). At age nine she contracted scar-
let fever and suffered from several re-
current bouts of rheumatic fever. This
forced her to spend much of her time
in bed, putting her behind in her educa-
tion. Although she was able to catch up
with the help of private tutors, the phys-
ical effects of her illness had a lasting im-
pact on her life.
Figure C.7: Julia Robinson
Despite her childhood struggles, Robinson graduated high school with
several awards in mathematics and the sciences. She started her university
career at San Diego State College, and transferred to the University of Cali-
fornia, Berkeley, as a senior. There she was influenced by the mathematician
Raphael Robinson. They became good friends, and married in 1941. As a
spouse of a faculty member, Robinson was barred from teaching in the math-
ematics department at Berkeley. Although she continued to audit mathemat-
ics classes, she hoped to leave university and start a family. Not long after
her wedding, however, Robinson contracted pneumonia. She was told that


there was substantial scar tissue build up on her heart due to the rheumatic
fever she suffered as a child. Due to the severity of the scar tissue, the doctor
predicted that she would not live past forty and she was advised not to have
children (Reid, 1986, 13).
Robinson was depressed for a long time, but eventually decided to con-
tinue studying mathematics. She returned to Berkeley and completed her PhD
in 1948 under the supervision of Alfred Tarski. The first-order theory of the
real numbers had been shown to be decidable by Tarski, and from Gödel’s
work it followed that the first-order theory of the natural numbers is unde-
cidable. It was a major open problem whether the first-order theory of the
rationals is decidable or not. In her thesis (1949), Robinson proved that it was
not.
Interested in decision problems, Robinson next attempted to find a solu-
tion to Hilbert’s tenth problem. This problem was one of a famous list of
23 mathematical problems posed by David Hilbert in 1900. The tenth prob-
lem asks whether there is an algorithm that will answer, in a finite amount of
time, whether or not a polynomial equation with integer coefficients, such as
3x2 − 2y + 3 = 0, has a solution in the integers. Such questions are known as
Diophantine problems. After some initial successes, Robinson joined forces with
Martin Davis and Hilary Putnam, who were also working on the problem.
They succeeded in showing that exponential Diophantine problems (where
the unknowns may also appear as exponents) are undecidable, and showed
that a certain conjecture (later called “J.R.”) implies that Hilbert’s tenth prob-
lem is undecidable (Davis et al., 1961). Robinson continued to work on the
problem throughout the 1960s. In 1970, the young Russian mathematician
Yuri Matijasevich finally proved the J.R. hypothesis. The combined result
is now called the Matijasevich–Robinson–Davis–Putnam theorem, or MRDP
theorem for short. Matijasevich and Robinson became friends and collabo-
rated on several papers. In a letter to Matijasevich, Robinson once wrote that
“actually I am very pleased that working together (thousands of miles apart)
we are obviously making more progress than either one of us could alone”
(Matijasevich, 1992, 45).
Robinson was the first female president of the American Mathematical So-
ciety, and the first woman to be elected to the National Academy of Sciences.
She died on July 30, 1985 at the age of 65 after being diagnosed with leukemia.

Further Reading Robinson’s mathematical papers are available in her Col-


lected Works (Robinson, 1996), which also includes a reprint of her National
Academy of Sciences biographical memoir (Feferman, 1994). Robinson’s older
sister Constance Reid published an “Autobiography of Julia,” based on inter-
views (Reid, 1986), as well as a full memoir (Reid, 1996). A short documentary
about Robinson and Hilbert’s tenth problem was directed by George Csicsery
(Csicsery, 2016). For a brief memoir about Yuri Matijasevich’s collaborations


with Robinson, and her influence on his work, see (Matijasevich, 1992).

C.8 Bertrand Russell

Bertrand Russell is hailed as one of the


founders of modern analytic philoso-
phy. Born May 18, 1872, Russell was
not only known for his work in philoso-
phy and logic, but wrote many popular
books in various subject areas. He was
also an ardent political activist through-
out his life.
Russell was born in Trellech, Mon-
mouthshire, Wales. His parents were
members of the British nobility. They
were free-thinkers, and even made
friends with the radicals in Boston at the
time. Unfortunately, Russell’s parents
died when he was young, and Russell
was sent to live with his grandparents.
There, he was given a religious upbring-
ing (something his parents had wanted
to avoid at all costs). His grandmother
was very strict in all matters of morality. During adolescence he was mostly
homeschooled by private tutors.
Figure C.8: Bertrand Russell
Russell’s influence in analytic philosophy, and especially logic, is tremen-
dous. He studied mathematics and philosophy at Trinity College, Cambridge,
where he was influenced by the mathematician and philosopher Alfred North
Whitehead. In 1910, Russell and Whitehead published the first volume of
Principia Mathematica, where they championed the view that mathematics is
reducible to logic. He went on to publish hundreds of books, essays and po-
litical pamphlets. In 1950, he won the Nobel Prize for literature.
Russell was deeply entrenched in politics and social activism. During
World War I he was arrested and sent to prison for six months due to pacifist
activities and protest. While in prison, he was able to write and read, and
claims to have found the experience “quite agreeable.” He remained a pacifist
throughout his life, and was again incarcerated for attending a nuclear disar-
mament rally in 1961. He also survived a plane crash in 1948, where the only
survivors were those sitting in the smoking section. As such, Russell claimed
that he owed his life to smoking. Russell was married four times, but had a
reputation for carrying on extra-marital affairs. He died on February 2, 1970
at the age of 97 in Penrhyndeudraeth, Wales.


Further Reading Russell wrote an autobiography in three parts, spanning


his life from 1872–1967 (Russell, 1967, 1968, 1969). The Bertrand Russell Re-
search Centre at McMaster University is home of the Bertrand Russell archives.
See their website at Duncan (2015), for information on the volumes of his col-
lected works (including searchable indexes), and archival projects. Russell’s
paper On Denoting (Russell, 1905) is a classic of 20th century analytic philoso-
phy.
The Stanford Encyclopedia of Philosophy entry on Russell (Irvine, 2015)
has sound clips of Russell speaking on Desire and Political theory. Many video
interviews with Russell are available online. To see him talk about smoking
and being involved in a plane crash, e.g., see Russell (n.d.). Some of Russell’s
works, including his Introduction to Mathematical Philosophy are available as
free audiobooks on LibriVox (n.d.).

C.9 Alfred Tarski


Alfred Tarski was born on January 14,
1901 in Warsaw, Poland (then part of
the Russian Empire). Described as
“Napoleonic,” Tarski was boisterous,
talkative, and intense. His energy was
often reflected in his lectures—he once
set fire to a wastebasket while disposing
of a cigarette during a lecture, and was
forbidden from lecturing in that build-
ing again.
Tarski had a thirst for knowledge
from a young age. Although later in
life he would tell students that he stud-
ied logic because it was the only class in
which he got a B, his high school records
show that he got A’s across the board—
even in logic. He studied at the Univer-
sity of Warsaw from 1918 to 1924. Tarski
first intended to study biology, but became interested in mathematics, philos-
ophy, and logic, as the university was the center of the Warsaw School of Logic
and Philosophy. Tarski earned his doctorate in 1924 under the supervision of
Stanisław Leśniewski.

Figure C.9: Alfred Tarski

Before emigrating to the United States in 1939, Tarski completed some of
his most important work while working as a secondary school teacher in War-
saw. His work on logical consequence and logical truth were written during
this time. In 1939, Tarski was visiting the United States for a lecture tour. Dur-
ing his visit, Germany invaded Poland, and because of his Jewish heritage,


Tarski could not return. His wife and children remained in Poland until the
end of the war, but were then able to emigrate to the United States as well.
Tarski taught at Harvard, the College of the City of New York, and the Insti-
tute for Advanced Study at Princeton, and finally the University of California,
Berkeley. There he founded the multidisciplinary program in Logic and the
Methodology of Science. Tarski died on October 26, 1983 at the age of 82.

Further Reading For more on Tarski’s life, see the biography Alfred Tarski:
Life and Logic (Feferman and Feferman, 2004). Tarski’s seminal works on logi-
cal consequence and truth are available in English in (Corcoran, 1983). All of
Tarski’s original works have been collected into a four-volume series (Tarski,
1981).

C.10 Alan Turing


Alan Turing was born in Maida Vale, London, on June 23, 1912. He is consid-
ered the father of theoretical computer science. Turing’s interest in the phys-
ical sciences and mathematics started at a young age. However, as a boy his
interests were not represented well in his schools, where emphasis was placed
on literature and classics. Consequently, he did poorly in school and was rep-
rimanded by many of his teachers.
Turing attended King’s College, Cam-
bridge as an undergraduate, where he
studied mathematics. In 1936 Turing de-
veloped (what is now called) the Turing
machine as an attempt to precisely de-
fine the notion of a computable function
and to prove the undecidability of the
decision problem. He was beaten to the
result by Alonzo Church, who proved
the result via his own lambda calculus.
Turing’s paper was still published with
reference to Church’s result. Church
invited Turing to Princeton, where he
spent 1936–1938, and obtained a doctor-
ate under Church.

Figure C.10: Alan Turing

Despite his interest in logic, Turing’s
earlier interests in physical sciences re-
mained prevalent. His practical skills were put to work during his service
with the British cryptanalytic department at Bletchley Park during World
War II. Turing was a central figure in cracking the cypher used by German
naval communications—the Enigma code. Turing’s expertise in statistics and
cryptography, together with the introduction of electronic machinery, gave


the team the ability to crack the code by creating a de-crypting machine called
a “bombe.” His ideas also helped in the creation of the world’s first pro-
grammable electronic computer, the Colossus, also used at Bletchley Park to
break the German Lorenz cypher.
Turing was gay. Nevertheless, in 1942 he proposed to Joan Clarke, one
of his teammates at Bletchley Park, but later broke off the engagement and
confessed to her that he was homosexual. He had several lovers throughout
his lifetime, although homosexual acts were then criminal offences in the UK.
In 1952, Turing’s house was burgled by a friend of his lover at the time, and
when filing a police report, Turing admitted to having a homosexual relation-
ship, under the impression that the government was on its way to legalizing
homosexual acts. This was not true, and he was charged with gross indecency.
Instead of going to prison, Turing opted for a hormone treatment that reduced
libido. Turing was found dead on June 8, 1954, of a cyanide overdose—most
likely suicide. He was given a royal pardon by Queen Elizabeth II in 2013.

Further Reading For a comprehensive biography of Alan Turing, see Hodges


(2014). Turing’s life and work inspired a play, Breaking the Code, which was
produced in 1996 for TV starring Derek Jacobi as Turing. The Imitation Game,
an Academy Award-nominated film starring Benedict Cumberbatch and Keira
Knightley, is also loosely based on Alan Turing’s life and time at Bletchley
Park (Tyldum, 2014).
Radiolab (2012) has several podcasts on Turing’s life and work. BBC Hori-
zon’s documentary The Strange Life and Death of Dr. Turing is available to watch
online (Sykes, 1992). (Theelen, 2012) is a short video of a working LEGO Tur-
ing Machine—made to honour Turing’s centenary in 2012.
Turing’s original paper on Turing machines and the decision problem is
Turing (1937).

C.11 Ernst Zermelo


Ernst Zermelo was born on July 27, 1871 in Berlin, Germany. He had five
sisters, though his family suffered from poor health and only three survived
to adulthood. His parents also passed away when he was young, leaving
him and his siblings orphans when he was seventeen. Zermelo had a deep
interest in the arts, and especially in poetry. He was known for being sharp,
witty, and critical. His most celebrated mathematical achievements include
the introduction of the axiom of choice (in 1904), and his axiomatization of set
theory (in 1908).
Zermelo’s interests at university were varied. He took courses in physics,
mathematics, and philosophy. Under the supervision of Hermann Schwarz,
Zermelo completed his dissertation Investigations in the Calculus of Variations
in 1894 at the University of Berlin. In 1897, he decided to pursue more studies


at the University of Göttingen, where he was heavily influenced by the foun-


dational work of David Hilbert. In 1899 he became eligible for professorship,
but did not get one until eleven years later—possibly due to his strange de-
meanour and “nervous haste.”
Zermelo finally received a paid pro-
fessorship at the University of Zurich in
1910, but was forced to retire in 1916 due
to tuberculosis. After his recovery, he
was given an honorary professorship
at the University of Freiburg in 1921.
During this time he worked on founda-
tional mathematics. He became irritated
with the works of Thoralf Skolem and
Kurt Gödel, and publicly criticized their
approaches in his papers. He was dis-
missed from his position at Freiburg in
1935, due to his unpopularity and his
opposition to Hitler’s rise to power in
Germany.
The later years of Zermelo’s life were
marked by isolation. After his dismissal
in 1935, he abandoned mathematics. He
moved to the country where he lived
modestly. He married in 1944, and became completely dependent on his wife
as he was going blind. Zermelo lost his sight completely by 1951. He passed
away in Günterstal, Germany, on May 21, 1953.

Figure C.11: Ernst Zermelo

Further Reading For a full biography of Zermelo, see Ebbinghaus (2015).


Zermelo’s seminal 1904 and 1908 papers are available to read in the original
German (Zermelo, 1904, 1908). Zermelo’s collected works, including his writ-
ing on physics, are available in English translation in (Ebbinghaus et al., 2010;
Ebbinghaus and Kanamori, 2013).

Appendix D

Problems

Problems for Chapter 1


Problem 1.1. Prove that there is at most one empty set, i.e., show that if A and
B are sets without elements, then A = B.

Problem 1.2. List all subsets of { a, b, c, d}.

Problem 1.3. Show that if A has n elements, then ℘( A) has 2ⁿ elements.

Problem 1.4. Prove that if A ⊆ B, then A ∪ B = B.

Problem 1.5. Prove rigorously that if A ⊆ B, then A ∩ B = A.

Problem 1.6. Show that if A is a set and A ∈ B, then A ⊆ ⋃B.

Problem 1.7. Prove that if A ⊊ B, then B \ A ̸= ∅.

Problem 1.8. Using Definition 1.23, prove that ⟨ a, b⟩ = ⟨c, d⟩ iff both a = c
and b = d.

Problem 1.9. List all elements of {1, 2, 3}³.

Problem 1.10. Show, by induction on k, that for all k ≥ 1, if A has n elements,
then Aᵏ has nᵏ elements.

Problems for Chapter 2


Problem 2.1. List the elements of the relation ⊆ on the set ℘({ a, b, c}).

Problem 2.2. Give examples of relations that are (a) reflexive and symmetric
but not transitive, (b) reflexive and anti-symmetric, (c) anti-symmetric, transi-
tive, but not reflexive, and (d) reflexive, symmetric, and transitive. Do not use
relations on numbers or sets.


Problem 2.3. Show that ≡n is an equivalence relation, for any n ∈ Z+ , and


that N/≡n has exactly n members.

Problem 2.4. Give a proof of Proposition 2.25.

Problem 2.5. Consider the less-than-or-equal-to relation ≤ on the set {1, 2, 3, 4}


as a graph and draw the corresponding diagram.

Problem 2.6. Show that the transitive closure of R is in fact transitive.
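
For experimenting with problem 2.6 on small examples, the following Python sketch (not part of the text's formal development; it simply represents a relation as a set of ordered pairs) computes the transitive closure of a finite relation by repeatedly adding relative products until nothing new appears.

```python
def compose(R, S):
    """Relative product R | S: all pairs (a, c) such that, for some b,
    (a, b) is in R and (b, c) is in S."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}

def transitive_closure(R):
    """Smallest transitive relation containing the finite relation R."""
    closure = set(R)
    while True:
        new_pairs = compose(closure, closure) - closure
        if not new_pairs:
            return closure
        closure |= new_pairs

# The successor relation on {1, 2, 3, 4} has the strict order < as its closure.
R = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(R)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```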

Problems for Chapter 3


Problem 3.1. Show that if f : A → B has a left inverse g, then f is injective.

Problem 3.2. Show that if f : A → B has a right inverse h, then f is surjective.

Problem 3.3. Prove Proposition 3.18. You have to define f −1 , show that it
is a function, and show that it is an inverse of f , i.e., f −1 ( f ( x )) = x and
f ( f −1 (y)) = y for all x ∈ A and y ∈ B.

Problem 3.4. Prove Proposition 3.19.

Problem 3.5. Show that if f : A → B and g : B → C are both injective, then


g ◦ f : A → C is injective.

Problem 3.6. Show that if f : A → B and g : B → C are both surjective, then


g ◦ f : A → C is surjective.

Problem 3.7. Suppose f : A → B and g : B → C. Show that the graph of g ◦ f


is R f | R g .

Problem 3.8. Given a partial function f : A ⇀ B, define the partial function g : B ⇀ A by: for
any y ∈ B, if there is a unique x ∈ A such that f ( x ) = y, then g(y) = x;
otherwise g(y) ↑. Show that if f is injective, then g( f ( x )) = x for all x ∈
dom( f ), and f ( g(y)) = y for all y ∈ ran( f ).

Problems for Chapter 4


Problem 4.1. Define an enumeration of the positive squares 1, 4, 9, 16, . . .

Problem 4.2. Show that if A and B are countable, so is A ∪ B. To do this,


suppose there are surjective functions f : Z+ → A and g : Z+ → B, and define
a surjective function h : Z+ → A ∪ B and prove that it is surjective. Also
consider the cases where A or B = ∅.

Problem 4.3. Show that if B ⊆ A and A is countable, so is B. To do this,
suppose there is a surjective function f : Z+ → A. Define a surjective func-
tion g : Z+ → B and prove that it is surjective. What happens if B = ∅?

Problem 4.4. Show by induction on n that if A1 , A2 , . . . , An are all countable,


so is A1 ∪ · · · ∪ An . You may assume the fact that if two sets A and B are
countable, so is A ∪ B.

Problem 4.5. According to Definition 4.4, a set A is enumerable iff A = ∅ or


there is a surjective f : Z+ → A. It is also possible to define “countable set”
precisely by: a set is countable iff there is an injective function g : A → Z+ .
Show that the definitions are equivalent, i.e., show that there is an injective
function g : A → Z+ iff either A = ∅ or there is a surjective f : Z+ → A.

Problem 4.6. Show that (Z+ )ⁿ is countable, for every n ∈ N.

Problem 4.7. Show that (Z+ )∗ is countable. You may assume problem 4.6.

Problem 4.8. Give an enumeration of the set of all non-negative rational num-
bers.

Problem 4.9. Show that Q is countable. Recall that any rational number can
be written as a fraction z/m with z ∈ Z, m ∈ N+ .

Problem 4.10. Define an enumeration of B∗ .

Problem 4.11. Recall from your introductory logic course that each possible
truth table expresses a truth function. In other words, the truth functions are
all functions from Bᵏ to B for some k. Prove that the set of all truth functions
is enumerable.

Problem 4.12. Show that the set of all finite subsets of an arbitrary infinite
countable set is countable.

Problem 4.13. A subset of N is said to be cofinite iff it is the complement of


a finite subset of N; that is, A ⊆ N is cofinite iff N \ A is finite. Let I be the set
whose elements are exactly the finite and cofinite subsets of N. Show that I is
countable.

Problem 4.14. Show that the countable union of countable sets is countable.
That is, whenever A1 , A2 , . . . are sets, and each Ai is countable, then the union
⋃∞i=1 Ai of all of them is also countable. [NB: this is hard!]

Problem 4.15. Let f : A × B → N be an arbitrary pairing function. Show that


the inverse of f is an enumeration of A × B.


Problem 4.16. Specify a function that encodes N³.
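
By way of illustration for problems 4.15 and 4.16, the sketch below uses the familiar Cantor pairing function (one of many possible pairing functions, and not necessarily the one defined in the text) and iterates it to encode triples.

```python
def pair(x, y):
    """Cantor pairing function: a bijection from N x N to N."""
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    """Inverse of pair: recover (x, y) from z."""
    w = 0                                   # find w = x + y
    while (w + 1) * (w + 2) // 2 <= z:
        w += 1
    y = z - w * (w + 1) // 2
    return (w - y, y)

def triple(x, y, z):
    """One way to encode a triple: pair twice."""
    return pair(pair(x, y), z)

# Round-trip check on a few values.
for n in range(20):
    assert pair(*unpair(n)) == n
print(triple(2, 5, 1))
```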

Problem 4.17. Show that ℘(N) is uncountable by a diagonal argument.

Problem 4.18. Show that the set of functions f : Z+ → Z+ is uncountable


by an explicit diagonal argument. That is, show that if f 1 , f 2 , . . . , is a list of
functions and each f i : Z+ → Z+ , then there is some f : Z+ → Z+ not on this
list.
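
The following sketch is not a proof of problem 4.18 (a proof must handle an arbitrary infinite list), but it shows mechanically how a diagonal value is read off from a finite stub of such a list: the constructed function differs from the i-th listed function at argument i.

```python
def diagonal(fs):
    """Given Python functions f_1, ..., f_k (standing in for the start of an
    infinite list), return a function differing from the i-th one at i."""
    def d(n):
        if 1 <= n <= len(fs):
            return fs[n - 1](n) + 1
        return 1  # arbitrary value beyond the listed prefix
    return d

fs = [lambda n: n, lambda n: 2 * n, lambda n: n * n]
d = diagonal(fs)
print([d(i) for i in range(1, 4)])   # [2, 5, 10]: differs from f_i at i
```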

Problem 4.19. Show that if there is an injective function g : B → A, and B is


uncountable, then so is A. Do this by showing how you can use g to turn an
enumeration of A into one of B.

Problem 4.20. Show that the set of all sets of pairs of positive integers is un-
countable by a reduction argument.

Problem 4.21. Show that the set X of all functions f : N → N is uncountable


by a reduction argument (Hint: give a surjective function from X to Bω .)

Problem 4.22. Show that Nω , the set of infinite sequences of natural numbers,
is uncountable by a reduction argument.

Problem 4.23. Let P be the set of functions from the set of positive integers
to the set {0}, and let Q be the set of partial functions from the set of positive
integers to the set {0}. Show that P is countable and Q is not. (Hint: reduce
the problem of enumerating Bω to enumerating Q).

Problem 4.24. Let S be the set of all surjective functions from the set of posi-
tive integers to the set {0,1}, i.e., S consists of all surjective f : Z+ → B. Show
that S is uncountable.

Problem 4.25. Show that the set R of all real numbers is uncountable.

Problem 4.26. Show that if A ≈ C and B ≈ D, and A ∩ B = C ∩ D = ∅, then


A ∪ B ≈ C ∪ D.

Problem 4.27. Show that if A is infinite and countable, then A ≈ N.

Problem 4.28. Show that there cannot be an injection g : ℘( A) → A, for any


set A. Hint: Suppose g : ℘( A) → A is injective. Consider D = { g( B) | B ⊆
A and g( B) ∉ B}. Let x = g( D ). Use the fact that g is injective to derive a
contradiction.

Problems for Chapter 6
Problem 6.1. Prove Lemma 6.8.

Problem 6.2. Prove that for any term t, l (t) = r (t).

Problem 6.3. Prove Lemma 6.12.

Problem 6.4. Prove Proposition 6.13 (Hint: Formulate and prove a version of
Lemma 6.12 for terms.)

Problem 6.5. Prove Proposition 6.19.

Problem 6.6. Prove Proposition 6.20.

Problem 6.7. Prove Lemma 6.28.

Problem 6.8. Prove Proposition 6.30. Hint: use a similar strategy to that used
in the proof of Theorem 6.29.

Problem 6.9. Prove Proposition 6.32.

Problem 6.10. Give an inductive definition of the bound variable occurrences


along the lines of Definition 6.33.

Problems for Chapter 7


Problem 7.1. Is N, the standard model of arithmetic, covered? Explain.

Problem 7.2. Let L = {c, f , A} with one constant symbol, one one-place func-
tion symbol and one two-place predicate symbol, and let the structure M be
given by

1. |M| = {1, 2, 3}

2. cM = 3

3. f M (1) = 2, f M (2) = 3, f M (3) = 2

4. AM = {⟨1, 2⟩, ⟨2, 3⟩, ⟨3, 3⟩}

(a) Let s(v) = 1 for all variables v. Find out whether

M, s ⊨ ∃ x ( A( f (z), c) ⊃ ∀y ( A(y, x ) ∨ A( f (y), x )))

Explain why or why not.


(b) Give a different structure and variable assignment in which the formula
is not satisfied.


Problem 7.3. Complete the proof of Proposition 7.14.

Problem 7.4. Prove Proposition 7.18.

Problem 7.5. Prove Proposition 7.19.

Problem 7.6. Suppose L is a language without function symbols. Given a


structure M, c a constant symbol and a ∈ |M|, define M[ a/c] to be the struc-
ture that is just like M, except that cM[ a/c] = a. Define M ||= φ for sentences φ
by:

1. φ ≡ ⊥: not M ||= φ.

2. φ ≡ R(d1 , . . . , dn ): M ||= φ iff ⟨d1M , . . . , dnM ⟩ ∈ RM .

3. φ ≡ d1 = d2 : M ||= φ iff d1M = d2M .

4. φ ≡ ∼ψ: M ||= φ iff not M ||= ψ.

5. φ ≡ (ψ & χ): M ||= φ iff M ||= ψ and M ||= χ.

6. φ ≡ (ψ ∨ χ): M ||= φ iff M ||= ψ or M ||= χ (or both).

7. φ ≡ (ψ ⊃ χ): M ||= φ iff not M ||= ψ or M ||= χ (or both).

8. φ ≡ ∀ x ψ: M ||= φ iff for all a ∈ |M|, M[ a/c] ||= ψ[c/x ], if c does not
occur in ψ.

9. φ ≡ ∃ x ψ: M ||= φ iff there is an a ∈ |M| such that M[ a/c] ||= ψ[c/x ],


if c does not occur in ψ.

Let x1 , . . . , xn be all free variables in φ, c1 , . . . , cn constant symbols not in φ,


a1 , . . . , an ∈ |M|, and s( xi ) = ai .
Show that M, s ⊨ φ iff M[ a1 /c1 , . . . , an /cn ] ||= φ[c1 /x1 ] . . . [cn /xn ].
(This problem shows that it is possible to give a semantics for first-order
logic that makes do without variable assignments.)

Problem 7.7. Suppose that f is a function symbol not in φ( x, y). Show that
there is a structure M such that M ⊨ ∀ x ∃y φ( x, y) iff there is an M′ such that
M′ ⊨ ∀ x φ( x, f ( x )).
(This problem is a special case of what’s known as Skolem’s Theorem;
∀ x φ( x, f ( x )) is called a Skolem normal form of ∀ x ∃y φ( x, y).)

Problem 7.8. Carry out the proof of Proposition 7.20 in detail.

Problem 7.9. Prove Proposition 7.23.

Problem 7.10. 1. Show that Γ ⊨ ⊥ iff Γ is unsatisfiable.

2. Show that Γ ∪ { φ} ⊨ ⊥ iff Γ ⊨ ∼ φ.

3. Suppose c does not occur in φ or Γ. Show that Γ ⊨ ∀ x φ iff Γ ⊨ φ[c/x ].

Problem 7.11. Complete the proof of Proposition 7.31.

Problems for Chapter 8


Problem 8.1. Find formulae in L A which define the following relations:

1. n is between i and j;

2. n evenly divides m (i.e., m is a multiple of n);

3. n is a prime number (i.e., no number other than 1 and n evenly di-


vides n).

Problem 8.2. Suppose the formula φ(v1 , v2 ) expresses the relation R ⊆ |M|2
in a structure M. Find formulas that express the following relations:

1. the inverse R−1 of R;

2. the relative product R | R;

Can you find a way to express R+ , the transitive closure of R?

Problem 8.3. Let L be the language containing a 2-place predicate symbol <
only (no other constant symbols, function symbols or predicate symbols—
except of course =). Let N be the structure such that |N| = N, and <N =
{⟨n, m⟩ | n < m}. Prove the following:
1. {0} is definable in N;

2. {1} is definable in N;

3. {2} is definable in N;

4. for each n ∈ N, the set {n} is definable in N;

5. every finite subset of |N| is definable in N;

6. every co-finite subset of |N| is definable in N (where X ⊆ N is co-finite


iff N \ X is finite).

Problem 8.4. Show that the comprehension principle is inconsistent by giving


a derivation that shows

∃y ∀ x ( x ∈ y ≡ x ∉ x ) ⊢ ⊥.

It may help to first show ( A ⊃ ∼ A) & (∼ A ⊃ A) ⊢ ⊥.


Problems for Chapter 9


Problem 9.1. Give derivations that show the following:

1. φ & (ψ & χ) ⊢ ( φ & ψ) & χ.

2. φ ∨ (ψ ∨ χ) ⊢ ( φ ∨ ψ) ∨ χ.

3. φ ⊃ (ψ ⊃ χ) ⊢ ψ ⊃ ( φ ⊃ χ).

4. φ ⊢ ∼∼ φ.

Problem 9.2. Give derivations that show the following:

1. ( φ ∨ ψ) ⊃ χ ⊢ φ ⊃ χ.

2. ( φ ⊃ χ) & (ψ ⊃ χ) ⊢ ( φ ∨ ψ) ⊃ χ.

3. ⊢ ∼( φ & ∼ φ).

4. ψ ⊃ φ ⊢ ∼ φ ⊃ ∼ψ.

5. ⊢ ( φ ⊃ ∼ φ) ⊃ ∼ φ.

6. ⊢ ∼( φ ⊃ ψ) ⊃ ∼ψ.

7. φ ⊃ χ ⊢ ∼( φ & ∼χ).

8. φ & ∼χ ⊢ ∼( φ ⊃ χ).

9. φ ∨ ψ, ∼ψ ⊢ φ.

10. ∼ φ ∨ ∼ψ ⊢ ∼( φ & ψ).

11. ⊢ (∼ φ & ∼ψ) ⊃ ∼( φ ∨ ψ).

12. ⊢ ∼( φ ∨ ψ) ⊃ (∼ φ & ∼ψ).

Problem 9.3. Give derivations that show the following:

1. ∼( φ ⊃ ψ) ⊢ φ.

2. ∼( φ & ψ) ⊢ ∼ φ ∨ ∼ψ.

3. φ ⊃ ψ ⊢ ∼ φ ∨ ψ.

4. ⊢ ∼∼ φ ⊃ φ.

5. φ ⊃ ψ, ∼ φ ⊃ ψ ⊢ ψ.

6. ( φ & ψ) ⊃ χ ⊢ ( φ ⊃ χ) ∨ (ψ ⊃ χ).

7. ( φ ⊃ ψ) ⊃ φ ⊢ φ.

8. ⊢ ( φ ⊃ ψ) ∨ (ψ ⊃ χ).

(These all require the ⊥C rule.)

Problem 9.4. Give derivations that show the following:

1. ⊢ (∀ x φ( x ) & ∀y ψ(y)) ⊃ ∀z ( φ(z) & ψ(z)).

2. ⊢ (∃ x φ( x ) ∨ ∃y ψ(y)) ⊃ ∃z ( φ(z) ∨ ψ(z)).

3. ∀ x ( φ( x ) ⊃ ψ) ⊢ ∃y φ(y) ⊃ ψ.

4. ∀ x ∼ φ( x ) ⊢ ∼∃ x φ( x ).

5. ⊢ ∼∃ x φ( x ) ⊃ ∀ x ∼ φ( x ).

6. ⊢ ∼∃ x ∀y (( φ( x, y) ⊃ ∼ φ(y, y)) & (∼ φ(y, y) ⊃ φ( x, y))).

Problem 9.5. Give derivations that show the following:

1. ⊢ ∼∀ x φ( x ) ⊃ ∃ x ∼ φ( x ).

2. (∀ x φ( x ) ⊃ ψ) ⊢ ∃y ( φ(y) ⊃ ψ).

3. ⊢ ∃ x ( φ( x ) ⊃ ∀y φ(y)).

(These all require the ⊥C rule.)

Problem 9.6. Prove Proposition 9.16.

Problem 9.7. Prove that Γ ⊢ ∼ φ iff Γ ∪ { φ} is inconsistent.

Problem 9.8. Complete the proof of Theorem 9.27.

Problem 9.9. Prove that = is both symmetric and transitive, i.e., give deriva-
tions of ∀ x ∀y ( x = y ⊃ y = x ) and ∀ x ∀y ∀z(( x = y & y = z) ⊃ x = z).

Problem 9.10. Give derivations of the following formulae:

1. ∀ x ∀y (( x = y & φ( x )) ⊃ φ(y))

2. ∃ x φ( x ) & ∀y ∀z (( φ(y) & φ(z)) ⊃ y = z) ⊃ ∃ x ( φ( x ) & ∀y ( φ(y) ⊃ y = x ))


Problems for Chapter 10


Problem 10.1. Complete the proof of Proposition 10.2.

Problem 10.2. Complete the proof of Proposition 10.11.

Problem 10.3. Complete the proof of Lemma 10.12.

Problem 10.4. Complete the proof of Proposition 10.14.

Problem 10.5. Complete the proof of Lemma 10.18.

Problem 10.6. Use Corollary 10.21 to prove Theorem 10.20, thus showing that
the two formulations of the completeness theorem are equivalent.

Problem 10.7. In order for a derivation system to be complete, its rules must
be strong enough to prove every unsatisfiable set inconsistent. Which of the
rules of derivation were necessary to prove completeness? Are any of these
rules not used anywhere in the proof? In order to answer these questions,
make a list or diagram that shows which of the rules of derivation were used
in which results that lead up to the proof of Theorem 10.20. Be sure to note
any tacit uses of rules in these proofs.

Problem 10.8. Prove (1) of Theorem 10.23.

Problem 10.9. In the standard model of arithmetic N, there is no element k ∈


|N| which satisfies every formula n < x (where n is 0′...′ with n ′’s). Use
the compactness theorem to show that the set of sentences in the language of
arithmetic which are true in the standard model of arithmetic N are also true
in a structure N′ that contains an element which does satisfy every formula
n < x.

Problem 10.10. Prove Proposition 10.27. Avoid the use of ⊢.

Problem 10.11. Prove Lemma 10.28. (Hint: The crucial step is to show that if
Γn is finitely satisfiable, so is Γn ∪ {θn }, without any appeal to derivations or
consistency.)

Problem 10.12. Prove Proposition 10.29.

Problem 10.13. Prove Lemma 10.30. (Hint: the crucial step is to show that if
Γn is finitely satisfiable, then either Γn ∪ { φn } or Γn ∪ {∼ φn } is finitely satisfi-
able.)

Problem 10.14. Write out the complete proof of the Truth Lemma (Lemma 10.12)
in the version required for the proof of Theorem 10.31.

Problems for Chapter 12
Problem 12.1. Choose an arbitrary input and trace through the configurations
of the doubler machine in Example 12.4.
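
Tracing configurations by hand (problems 12.1 and 12.7) is error-prone, so a small simulator can help check your work. The sketch below is not the text's formalism: it assumes a machine is given as a dictionary from (state, symbol) pairs to (state, symbol, move) triples, with 0 as the blank, and it is loaded with a trivial made-up machine rather than the doubler of Example 12.4, whose transition table is in the text.

```python
def run(delta, tape, state, steps):
    """Print successive configurations of a Turing machine.

    delta maps (state, symbol) to (state, symbol, move), where move is
    'L', 'R' or 'N'.  The machine halts when delta has no entry for the
    current (state, symbol) pair."""
    tape = list(tape)
    head = 0
    for _ in range(steps):
        print(state, ''.join(tape), 'head at', head)
        symbol = tape[head]
        if (state, symbol) not in delta:
            print('halted')
            return
        state, tape[head], move = delta[(state, symbol)]
        if move == 'R':
            head += 1
            if head == len(tape):
                tape.append('0')        # extend the tape with blanks
        elif move == 'L':
            head = max(head - 1, 0)
    print('(stopped tracing after', steps, 'steps)')

# A toy machine (not the doubler): move right past a block of 1's, then halt.
delta = {
    ('q0', '▷'): ('q0', '▷', 'R'),
    ('q0', '1'): ('q0', '1', 'R'),
}
run(delta, '▷111', 'q0', 10)
```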

Problem 12.2. Design a Turing-machine with alphabet {▷, 0, A, B} that accepts,


i.e., halts on, any string of A’s and B’s where the number of A’s is the same as
the number of B’s and all the A’s precede all the B’s, and rejects, i.e., does not
halt on, any string where the number of A’s is not equal to the number of B’s
or the A’s do not precede all the B’s. (E.g., the machine should accept AABB,
and AAABBB, but reject both AAB and AABBAABB.)

Problem 12.3. Design a Turing-machine with alphabet {▷, 0, A, B} that takes


as input any string α of A’s and B’s and duplicates them to produce an output
of the form αα. (E.g. input ABBA should result in output ABBAABBA).

Problem 12.4. Alphabetical?: Design a Turing-machine with alphabet {▷, 0, A, B}


that when given as input a finite sequence of A’s and B’s checks to see if all
the A’s appear to the left of all the B’s or not. The machine should leave the
input string on the tape, and either halt if the string is “alphabetical”, or loop
forever if the string is not.

Problem 12.5. Alphabetizer: Design a Turing-machine with alphabet {▷, 0, A, B}


that takes as input a finite sequence of A’s and B’s and rearranges them so that all
the A’s are to the left of all the B’s. (e.g., the sequence BABAA should be-
come the sequence AAABB, and the sequence ABBABB should become the
sequence AABBBB).

Problem 12.6. Give a definition for when a Turing machine M computes the
function f : Nk → Nm .

Problem 12.7. Trace through the configurations of the machine from Exam-
ple 12.12 for input ⟨3, 2⟩. What happens if the machine computes 0 + 0?

Problem 12.8. In Example 12.14 we described a machine consisting of a com-


bination of the doubler machine from Figure 12.4 and the mover machine from
Figure 12.5. What happens if you start this combined machine on input x = 0,
i.e., on an empty tape? How would you fix the machine so that in this case the
machine halts with output 2x = 0? (You should be able to do this by adding
one state and one transition.)

Problem 12.9. Subtraction: Design a Turing machine that when given an input
of two non-empty strings of strokes of length n and m, where n > m, computes
the function f (n, m) = n − m.


Problem 12.10. Equality: Design a Turing machine to compute the following
function:

equality(n, m) = 1 if n = m, and equality(n, m) = 0 if n ≠ m,

where n and m ∈ Z+ .

Problem 12.11. Design a Turing machine to compute the function min( x, y)


where x and y are positive integers represented on the tape by strings of 1’s
separated by a 0. You may use additional symbols in the alphabet of the ma-
chine.
The function min selects the smallest value from its arguments, so min(3, 5) =
3, min(20, 16) = 16, and min(4, 4) = 4, and so on.

Problem 12.12. Give a disciplined machine that computes f ( x ) = x + 1.

Problem 12.13. Find a disciplined machine which, when started on input 1ⁿ
produces output 1ⁿ ⌢ 0 ⌢ 1ⁿ.

Problem 12.14. Give a disciplined Turing machine computing f ( x ) = x + 2


by taking the machine M from problem 12.12 and constructing M ⌢ M.

Problems for Chapter 13


Problem 13.1. Can you think of a way to describe Turing machines that does
not require that the states and alphabet symbols are explicitly listed? You may
define your own notion of “standard” machine, but say something about why
every Turing machine can be computed by a “standard” machine in your new
sense.

Problem 13.2. The Three Halting (3-Halt) problem is the problem of giving a
decision procedure to determine whether or not an arbitrarily chosen Turing
Machine halts for an input of three 1’s on an otherwise blank tape. Prove that
the 3-Halt problem is unsolvable.

Problem 13.3. Show that if the halting problem is solvable for Turing machine
and input pairs Me and n where e ̸= n, then it is also solvable for the cases
where e = n.

Problem 13.4. We proved that the halting problem is unsolvable if the input
is a number e, which identifies a Turing machine Me via an enumeration of all
Turing machines. What if we allow the description of Turing machines from
section 13.2 directly as input? Can there be a Turing machine which decides
the halting problem but takes as input descriptions of Turing machines rather
than indices? Explain why or why not.

Problem 13.5. Show that the partial function s′ defined by

s′ (e) = 1 if machine Me halts for input e, and
s′ (e) is undefined if machine Me does not halt for input e,

is Turing computable.

Problem 13.6. Prove Proposition 13.10. (Hint: use induction on k − m).

Problem 13.7. Complete case (3) of the proof of Lemma 13.13.

Problem 13.8. Give a derivation of Sσi (i, n′ ) from Sσi (i, n) and φ(m, n) (as-
suming i ̸= m, i.e., either i < m or m < i).

Problem 13.9. Give a derivation of ∀ x (k′ < x ⊃ S0 ( x, n′ )) from ∀ x (k < x ⊃
S0 ( x, n′ )), ∀ x x < x ′ , and ∀ x ∀y ∀z (( x < y & y < z) ⊃ x < z).

Problem 13.10. Let M be a Turing machine with the single state q0 and single
instruction δ(q0 , 0) = ⟨q0 , 0, N ⟩. Let |M′′ | = {0, 1, 2}, ′M′′ (0) = ′M′′ (1) = 1 and
′M′′ (2) = 2, and <M′′ = {⟨0, 1⟩, ⟨1, 1⟩, ⟨2, 2⟩}. Define Qq0M′′ , S0M′′ , and S▷M′′
so that τ ( M, Λ) and α( M, Λ) become true and explain why they are. Hint:
Observe that δ(q0 , ▷) is undefined. Ensure that

Qq0 (1, n) & S▷ (0, n) & ∀ x (0 < x ⊃ S0 ( x, n)) for all n ∈ N


∃y (Qq0 (0, y) & S▷ (0, y))

are both true in M′′ .

Problem 13.11. Complete the proof of Lemma 13.19 by proving that M′ ⊨


τ ( M, w) & E( M, w).

Problem 13.12. Complete the proof of Lemma 13.20 by proving that if M,


started on input w, has not halted after n steps, then τ ′ ( M, w) ⊨ ψ(n).

Problem 13.13. Prove Corollary 13.22. Observe that ψ is satisfied in every


finite structure iff ∼ψ is not finitely satisfiable. Explain why finite satisfiability
is semi-decidable in the sense of Theorem 13.18. Use this to argue that if there
were a derivation system for finite validity, then finite satisfiability would be
decidable.

Problems for Chapter 14


Problem 14.1. Show that TA = { φ | N ⊨ φ} is not axiomatizable. You may
assume that TA represents all decidable properties.


Problems for Chapter 15


Problem 15.1. Prove Proposition 15.5 by showing that the primitive recursive
definition of mult can be put into the form required by Definition 15.1 and
showing that the corresponding functions f and g are primitive recursive.
Problem 15.2. Give the complete primitive recursive notation for mult.
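
For problems 15.1 and 15.2 it may help to think of composition and primitive recursion as operations that build new functions from old ones. The Python sketch below is an informal stand-in for the official notation of Definition 15.1 (whose exact conventions may differ); it builds add and mult from zero, successor, and projections, and reading off the nesting of Comp and PrimRec in mult gives the kind of notation problem 15.2 asks for.

```python
def Comp(f, *gs):
    """Composition: h(x...) = f(g_1(x...), ..., g_k(x...))."""
    return lambda *xs: f(*(g(*xs) for g in gs))

def PrimRec(f, g):
    """Primitive recursion: h(x..., 0) = f(x...),
    h(x..., y+1) = g(x..., y, h(x..., y))."""
    def h(*args):
        *xs, y = args
        value = f(*xs)
        for i in range(y):
            value = g(*xs, i, value)
        return value
    return h

zero = lambda *xs: 0
succ = lambda x: x + 1
P = lambda n, i: lambda *xs: xs[i]    # projection onto the i-th of n arguments

# add(x, 0) = x,  add(x, y+1) = succ(add(x, y))
add = PrimRec(P(1, 0), Comp(succ, P(3, 2)))
# mult(x, 0) = 0, mult(x, y+1) = add(x, mult(x, y))
mult = PrimRec(zero, Comp(add, P(3, 0), P(3, 2)))
print(add(3, 4), mult(3, 4))   # 7 12
```
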
Problem 15.3. Prove Proposition 15.13.
Problem 15.4. Show that
f ( x, y) = 2^(2^(···^(2^x))) (an exponential tower of y 2’s, with x as the topmost exponent)
is primitive recursive.
Problem 15.5. Show that integer division d( x, y) = ⌊ x/y⌋ (i.e., division, where
you disregard everything after the decimal point) is primitive recursive. When
y = 0, we stipulate d( x, y) = 0. Give an explicit definition of d using primitive
recursion and composition.
Problem 15.6. Show that the three place relation x ≡ y mod n (congruence
modulo n) is primitive recursive.
Problem 15.7. Suppose R(⃗x, z) is primitive recursive. Define the function m′R (⃗x, y)
which returns the least z less than y such that R(⃗x, z) holds, if there is one, and
0 otherwise, by primitive recursion from χ R .
Problem 15.8. Define integer division d( x, y) using bounded minimization.
Problem 15.9. Show that there is a primitive recursive function sconcat(s)
with the property that
sconcat(⟨s0 , . . . , sk ⟩) = s0 ⌢ . . . ⌢ sk .
Problem 15.10. Show that there is a primitive recursive function tail(s) with
the property that
tail(Λ) = 0 and
tail(⟨s0 , . . . , sk ⟩) = ⟨s1 , . . . , sk ⟩.
Problem 15.11. Prove Proposition 15.24.
Problem 15.12. The definition of hSubtreeSeq in the proof of Proposition 15.25
in general includes repetitions. Give an alternative definition which guaran-
tees that the code of a subtree occurs only once in the resulting list.
Problem 15.13. Define the remainder function r ( x, y) by course-of-values re-
cursion. (If x, y are natural numbers and y > 0, r ( x, y) is the number less
than y such that x = z × y + r ( x, y) for some z. For definiteness, let’s say that
if y = 0, r ( x, 0) = 0.)

Problems for Chapter 16
Problem 16.1. Show that the function flatten(z), which turns the sequence
⟨# t1 # , . . . , # tn # ⟩ into # t1 , . . . , tn # , is primitive recursive.

Problem 16.2. Give a detailed proof of Proposition 16.8 along the lines of the
first proof of Proposition 16.5.

Problem 16.3. Prove Proposition 16.9. You may make use of the fact that any
substring of a formula which is a formula is a sub-formula of it.

Problem 16.4. Prove Proposition 16.12.

Problem 16.5. Define the following properties as in Proposition 16.16:

1. FollowsBy⊃Elim (d),

2. FollowsBy=Elim (d),

3. FollowsBy∨Elim (d),

4. FollowsBy∀Intro (d).

For the last one, you will have to also show that you can test primitive recur-
sively if the last inference of the derivation with Gödel number d satisfies the
eigenvariable condition, i.e., the eigenvariable a of the ∀Intro inference occurs
neither in the end-formula of d nor in an open assumption of d. You may use
the primitive recursive predicate OpenAssum from Proposition 16.18 for this.

Problems for Chapter 17


Problem 17.1. Show that the relations x < y, x | y, and the function rem( x, y)
can be defined without primitive recursion. You may use 0, successor, plus,
times, χ= , projections, and bounded minimization and quantification.

Problem 17.2. Prove that y = 0, y = x ′ , and y = xi represent zero, succ, and


Pin , respectively.

Problem 17.3. Prove Lemma 17.18.

Problem 17.4. Use Lemma 17.18 to prove Proposition 17.17.

Problem 17.5. Using the proofs of Proposition 17.20 and Proposition 17.20 as
a guide, carry out the proof of Proposition 17.21 in detail.

Problem 17.6. Show that if R is representable in Q, so is χ R .


Problems for Chapter 18


Problem 18.1. A formula φ( x ) is a truth definition if Q ⊢ ψ ≡ φ(⌜ψ⌝) for all
sentences ψ. Show that no formula is a truth definition by using the fixed-
point lemma.

Problem 18.2. Every ω-consistent theory is consistent. Show that the con-
verse does not hold, i.e., that there are consistent but ω-inconsistent theories.
Do this by showing that Q ∪ {∼γQ } is consistent but ω-inconsistent.

Problem 18.3. Two sets A and B of natural numbers are said to be computably
inseparable if there is no decidable set X such that A ⊆ X and B ⊆ X̄ (where X̄ is the
complement, N \ X, of X). Let T be a consistent axiomatizable extension of
Q. Suppose A is the set of Gödel numbers of sentences provable in T and B
the set of Gödel numbers of sentences refutable in T. Prove that A and B are
computably inseparable.

Problem 18.4. Show that PA derives γPA ⊃ ConPA .

Problem 18.5. Let T be a computably axiomatized theory, and let Prov T be


a derivability predicate for T. Consider the following four statements:

1. If T ⊢ φ, then T ⊢ Prov T (⌜φ⌝).

2. T ⊢ φ ⊃ Prov T (⌜φ⌝).

3. If T ⊢ Prov T (⌜φ⌝), then T ⊢ φ.

4. T ⊢ Prov T (⌜φ⌝) ⊃ φ

Under what conditions are each of these statements true?

Problem 18.6. Show that Q(n) ⇔ n ∈ {# φ# | Q ⊢ φ} is definable in arith-


metic.

Problem 18.7. Suppose you are asked to prove that A ∩ B ̸= ∅. Unpack all
the definitions occurring here, i.e., restate this in a way that does not mention
“∩”, “=”, or “∅”.

Problem 18.8. Prove indirectly that A ∩ B ⊆ A.

Problem 18.9. Expand the following proof of A ∪ ( A ∩ B) = A, where you


mention all the inference patterns used, why each step follows from assump-
tions or claims established before it, and where we have to appeal to which
definitions.

Proof. If z ∈ A ∪ ( A ∩ B) then z ∈ A or z ∈ A ∩ B. If z ∈ A ∩ B, z ∈ A. Any


z ∈ A is also ∈ A ∪ ( A ∩ B).

Problem 18.10. Define the set of supernice terms by

1. Any letter a, b, c, d is a supernice term.

2. If s is a supernice term, then so is [s].

3. If s1 and s2 are supernice terms, then so is [s1 ◦ s2 ].

4. Nothing else is a supernice term.

Show that the number of [ in a supernice term t of length n is ≤ n/2 + 1.
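
Before proving the bound in problem 18.10, it can be reassuring to test it on examples. The following Python sketch (purely illustrative, and no substitute for the inductive proof) generates all supernice terms up to a small nesting depth according to the definition above and checks the claimed bound on each.

```python
def supernice(depth):
    """All supernice terms built by applying clauses 2 and 3 of the
    definition at most `depth` times."""
    terms = set('abcd')                                    # clause 1
    for _ in range(depth):
        singles = {'[' + s + ']' for s in terms}           # clause 2
        pairs = {'[' + s + '◦' + t + ']'                   # clause 3
                 for s in terms for t in terms}
        terms |= singles | pairs
    return terms

terms = supernice(2)
# Spot-check the bound: the number of '[' in a term of length n is <= n/2 + 1.
for t in terms:
    assert t.count('[') <= len(t) / 2 + 1, t
print(len(terms), 'terms checked')
```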

Problem 18.11. Prove by structural induction that no nice term starts with ].

Problem 18.12. Give an inductive definition of the function l, where l (t) is the
number of symbols in the nice term t.

Problem 18.13. Prove by structural induction on nice terms t that f (t) < l (t)
(where l (t) is the number of symbols in t and f (t) is the depth of t as defined
in Definition B.10).

Photo Credits

Georg Cantor, p. 331: Portrait of Georg Cantor by Otto Zeth courtesy of the
Universitätsarchiv, Martin-Luther Universität Halle–Wittenberg. UAHW Rep. 40-
VI, Nr. 3 Bild 102.
Alonzo Church, p. 332: Portrait of Alonzo Church, undated, photogra-
pher unknown. Alonzo Church Papers; 1924–1995, (C0948) Box 60, Folder 3.
Manuscripts Division, Department of Rare Books and Special Collections, Prince-
ton University Library. © Princeton University. The Open Logic Project has
obtained permission to use this image for inclusion in non-commercial OLP-
derived materials. Permission from Princeton University is required for any
other use.
Gerhard Gentzen, p. 333: Portrait of Gerhard Gentzen playing ping-pong
courtesy of Ekhart Mentzler-Trott.
Kurt Gödel, p. 334: Portrait of Kurt Gödel, ca. 1925, photographer un-
known. From the Shelby White and Leon Levy Archives Center, Institute for
Advanced Study, Princeton, NJ, USA, on deposit at Princeton University Li-
brary, Manuscript Division, Department of Rare Books and Special Collec-
tions, Kurt Gödel Papers, (C0282), Box 14b, #110000. The Open Logic Project
has obtained permission from the Institute’s Archives Center to use this image
for inclusion in non-commercial OLP-derived materials. Permission from the
Archives Center is required for any other use.
Emmy Noether, p. 336: Portrait of Emmy Noether, ca. 1922, courtesy of the
Abteilung für Handschriften und Seltene Drucke, Niedersächsische Staats-
und Universitätsbibliothek Göttingen, Cod. Ms. D. Hilbert 754, Bl. 14 Nr. 73.
Restored from an original scan by Joel Fuller.
Rózsa Péter, p. 337: Portrait of Rózsa Péter, undated, photographer un-
known. Courtesy of Béla Andrásfai.
Julia Robinson, p. 338: Portrait of Julia Robinson, unknown photographer,
courtesy of Neil D. Reid. The Open Logic Project has obtained permission to
use this image for inclusion in non-commercial OLP-derived materials. Per-
mission is required for any other use.
Bertrand Russell, p. 340: Portrait of Bertrand Russell, ca. 1907, courtesy of
the William Ready Division of Archives and Research Collections, McMaster
University Library. Bertrand Russell Archives, Box 2, f. 4.


Alfred Tarski, p. 341: Passport photo of Alfred Tarski, 1939. Cropped and
restored from a scan of Tarski’s passport by Joel Fuller. Original courtesy
of Bancroft Library, University of California, Berkeley. Alfred Tarski Papers,
Banc MSS 84/49. The Open Logic Project has obtained permission to use this
image for inclusion in non-commercial OLP-derived materials. Permission
from Bancroft Library is required for any other use.
Alan Turing, p. 342: Portrait of Alan Mathison Turing by Elliott & Fry, 29
March 1951, NPG x82217, © National Portrait Gallery, London. Used under a
Creative Commons BY-NC-ND 3.0 license.
Ernst Zermelo, p. 344: Portrait of Ernst Zermelo, ca. 1922, courtesy of the
Abteilung für Handschriften und Seltene Drucke, Niedersächsische Staats-
und Universitätsbibliothek Göttingen, Cod. Ms. D. Hilbert 754, Bl. 6 Nr. 25.

Bibliography

Andrásfai, Béla. 1986. Rózsa (Rosa) Péter. Periodica Polytechnica Electrical Engi-
neering 30(2-3): 139–145. URL http://www.pp.bme.hu/ee/article/view/
4651.

Aspray, William. 1984. The Princeton mathematics community in the 1930s:


Alonzo Church. URL http://www.princeton.edu/mudd/finding_aids/
mathoral/pmc05.htm. Interview.

Baaz, Matthias, Christos H. Papadimitriou, Hilary W. Putnam, Dana S. Scott,


and Charles L. Harper Jr. 2011. Kurt Gödel and the Foundations of Mathematics:
Horizons of Truth. Cambridge: Cambridge University Press.

Cantor, Georg. 1892. Über eine elementare Frage der Mannigfaltigkeitslehre.


Jahresbericht der deutschen Mathematiker-Vereinigung 1: 75–8.

Cheng, Eugenia. 2004. How to write proofs: A quick guide.


URL https://eugeniacheng.com/wp-content/uploads/2017/02/
cheng-proofguide.pdf.

Church, Alonzo. 1936a. A note on the Entscheidungsproblem. The Journal of


Symbolic Logic 1: 40–41.

Church, Alonzo. 1936b. An unsolvable problem of elementary number theory.


American Journal of Mathematics 58: 345–363.

Corcoran, John. 1983. Logic, Semantics, Metamathematics. Indianapolis: Hack-


ett, 2nd ed.

Csicsery, George. 2016. Zala films: Julia Robinson and Hilbert’s tenth problem.
URL http://www.zalafilms.com/films/juliarobinson.html.

Dauben, Joseph. 1990. Georg Cantor: His Mathematics and Philosophy of the Infi-
nite. Princeton: Princeton University Press.

Davis, Martin, Hilary Putnam, and Julia Robinson. 1961. The decision prob-
lem for exponential Diophantine equations. Annals of Mathematics 74(3):
425–436. URL http://www.jstor.org/stable/1970289.


Dick, Auguste. 1981. Emmy Noether 1882–1935. Boston: Birkhäuser.

du Sautoy, Marcus. 2014. A brief history of mathematics: Georg Cantor. URL


http://www.bbc.co.uk/programmes/b00ss1j0. Audio Recording.

Duncan, Arlene. 2015. The Bertrand Russell Research Centre. URL http:
//russell.mcmaster.ca/.

Ebbinghaus, Heinz-Dieter. 2015. Ernst Zermelo: An Approach to his Life and


Work. Berlin: Springer-Verlag.

Ebbinghaus, Heinz-Dieter, Craig G. Fraser, and Akihiro Kanamori. 2010. Ernst


Zermelo. Collected Works, vol. 1. Berlin: Springer-Verlag.

Ebbinghaus, Heinz-Dieter and Akihiro Kanamori. 2013. Ernst Zermelo: Col-


lected Works, vol. 2. Berlin: Springer-Verlag.

Enderton, Herbert B. 2019. Alonzo Church: Life and Work. In The Collected
Works of Alonzo Church, eds. Tyler Burge and Herbert B. Enderton. Cam-
bridge, MA: MIT Press.

Feferman, Anita and Solomon Feferman. 2004. Alfred Tarski: Life and Logic.
Cambridge: Cambridge University Press.

Feferman, Solomon. 1994. Julia Bowman Robinson 1919–1985. Bio-


graphical Memoirs of the National Academy of Sciences 63: 1–28. URL
http://www.nasonline.org/publications/biographical-memoirs/
memoir-pdfs/robinson-julia.pdf.

Feferman, Solomon, John W. Dawson Jr., Stephen C. Kleene, Gregory H.


Moore, Robert M. Solovay, and Jean van Heijenoort. 1986. Kurt Gödel: Col-
lected Works. Vol. 1: Publications 1929–1936. Oxford: Oxford University Press.

Feferman, Solomon, John W. Dawson Jr., Stephen C. Kleene, Gregory H.


Moore, Robert M. Solovay, and Jean van Heijenoort. 1990. Kurt Gödel: Col-
lected Works. Vol. 2: Publications 1938–1974. Oxford: Oxford University Press.

Frege, Gottlob. 1884. Die Grundlagen der Arithmetik: Eine logisch mathematische
Untersuchung über den Begriff der Zahl. Breslau: Wilhelm Koebner. Transla-
tion in Frege (1953).

Frege, Gottlob. 1953. Foundations of Arithmetic, ed. J. L. Austin. Oxford: Basil


Blackwell & Mott, 2nd ed.

Frey, Holly and Tracy V. Wilson. 2015. Stuff you missed in history class:
Emmy Noether, mathematics trailblazer. URL https://www.iheart.
com/podcast/stuff-you-missed-in-history-cl-21124503/episode/
emmy-noether-mathematics-trailblazer-30207491/. Podcast audio.


Gentzen, Gerhard. 1935a. Untersuchungen über das logische Schließen I.


Mathematische Zeitschrift 39: 176–210. English translation in Szabo (1969),
pp. 68–131.

Gentzen, Gerhard. 1935b. Untersuchungen über das logische Schließen II.


Mathematische Zeitschrift 39: 405–431. English translation in Szabo
(1969), pp. 68–131.

Gödel, Kurt. 1929. Über die Vollständigkeit des Logikkalküls [On the com-
pleteness of the calculus of logic]. Dissertation, Universität Wien. Reprinted
and translated in Feferman et al. (1986), pp. 60–101.

Gödel, Kurt. 1931. Über formal unentscheidbare Sätze der Principia Mathe-
matica und verwandter Systeme I [On formally undecidable propositions
of Principia Mathematica and related systems I]. Monatshefte für Mathematik
und Physik 38: 173–198. Reprinted and translated in Feferman et al. (1986),
pp. 144–195.

Grattan-Guinness, Ivor. 1971. Towards a biography of Georg Cantor. Annals


of Science 27(4): 345–391.

Hammack, Richard. 2013. Book of Proof. Richmond, VA: Virginia Com-


monwealth University. URL http://www.people.vcu.edu/~rhammack/
BookOfProof/BookOfProof.pdf.

Hodges, Andrew. 2014. Alan Turing: The Enigma. London: Vintage.

Hutchings, Michael. 2003. Introduction to mathematical arguments. URL


https://math.berkeley.edu/~hutching/teach/proofs.pdf.

Institute, Perimeter. 2015. Emmy Noether: Her life, work, and influence. URL
https://www.youtube.com/watch?v=tNNyAyMRsgE. Video Lecture.

Irvine, Andrew David. 2015. Sound clips of Bertrand Russell speaking. URL
http://plato.stanford.edu/entries/russell/russell-soundclips.
html.

Jacobson, Nathan. 1983. Emmy Noether: Gesammelte Abhandlungen—Collected


Papers. Berlin: Springer-Verlag.

John Dawson, Jr. 1997. Logical Dilemmas: The Life and Work of Kurt Gödel. Boca
Raton: CRC Press.

LibriVox. n.d. Bertrand Russell. URL https://librivox.org/author/1508?


primary_key=1508&search_category=author&search_page=1&search_
form=get_results. Collection of public domain audiobooks.


Linsenmayer, Mark. 2014. The partially examined life: Gödel on


math. URL http://www.partiallyexaminedlife.com/2014/06/16/
ep95-godel/. Podcast audio.

MacFarlane, John. 2015. Alonzo Church’s JSL reviews. URL http://


johnmacfarlane.net/church.html.

Magnus, P. D., Tim Button, J. Robert Loftis, Aaron Thomas-Bolduc, Robert


Trueman, and Richard Zach. 2021. Forall x: Calgary. An Introduction to For-
mal Logic. Calgary: Open Logic Project, f21 ed. URL https://forallx.
openlogicproject.org/.

Matijasevich, Yuri. 1992. My collaboration with Julia Robinson. The Mathemat-


ical Intelligencer 14(4): 38–45.

Menzler-Trott, Eckart. 2007. Logic’s Lost Genius: The Life of Gerhard Gentzen.
Providence: American Mathematical Society.

O’Connor, John J. and Edmund F. Robertson. 2014. Rózsa Péter. URL http:
//www-groups.dcs.st-and.ac.uk/~history/Biographies/Peter.html.

Péter, Rózsa. 1935a. Über den Zusammenhang der verschiedenen Begriffe der
rekursiven Funktion. Mathematische Annalen 110: 612–632.

Péter, Rózsa. 1935b. Konstruktion nichtrekursiver Funktionen. Mathematische


Annalen 111: 42–60.

Péter, Rózsa. 1951. Rekursive Funktionen. Budapest: Akademiai Kiado. English


translation in (Péter, 1967).

Péter, Rózsa. 1967. Recursive Functions. New York: Academic Press.

Péter, Rózsa. 2010. Playing with Infinity. New York: Dover.


URL https://books.google.ca/books?id=6V3wNs4uv_4C&lpg=PP1&ots=
BkQZaHcR99&lr&pg=PP1#v=onepage&q&f=false.

Potter, Michael. 2004. Set Theory and its Philosophy. Oxford: Oxford University
Press.

Radiolab. 2012. The Turing problem. URL http://www.radiolab.org/story/


193037-turing-problem/. Podcast audio.

Reid, Constance. 1986. The autobiography of Julia Robinson. The College Math-
ematics Journal 17: 3–21.

Reid, Constance. 1996. Julia: A Life in Mathematics. Cambridge: Cam-


bridge University Press. URL https://books.google.ca/books?id=
lRtSzQyHf9UC&lpg=PP1&pg=PP1#v=onepage&q&f=false.


Robinson, Julia. 1949. Definability and decision problems in arithmetic.


The Journal of Symbolic Logic 14(2): 98–114. URL http://www.jstor.org/
stable/2266510.

Robinson, Julia. 1996. The Collected Works of Julia Robinson. Providence: Amer-
ican Mathematical Society.

Rose, Daniel. 2012. A song about Georg Cantor. URL https://www.youtube.


com/watch?v=QUP5Z4Fb5k4. Audio Recording.

Russell, Bertrand. 1905. On denoting. Mind 14: 479–493.

Russell, Bertrand. 1967. The Autobiography of Bertrand Russell, vol. 1. London:


Allen and Unwin.

Russell, Bertrand. 1968. The Autobiography of Bertrand Russell, vol. 2. London:


Allen and Unwin.

Russell, Bertrand. 1969. The Autobiography of Bertrand Russell, vol. 3. London:


Allen and Unwin.

Russell, Bertrand. n.d. Bertrand Russell on smoking. URL https://www.


youtube.com/watch?v=80oLTiVW_lc. Video Interview.

Sandstrum, Ted. 2019. Mathematical Reasoning: Writing and Proof. Allendale,


MI: Grand Valley State University. URL https://scholarworks.gvsu.edu/
books/7/.

Segal, Sanford L. 2014. Mathematicians under the Nazis. Princeton: Princeton


University Press.

Sigmund, Karl, John Dawson, Kurt Mühlberger, Hans Magnus Enzensberger,


and Juliette Kennedy. 2007. Kurt Gödel: Das Album–The Album. The Math-
ematical Intelligencer 29(3): 73–76.

Smith, Peter. 2013. An Introduction to Gödel’s Theorems. Cambridge: Cambridge


University Press.

Smullyan, Raymond M. 1968. First-Order Logic. New York, NY: Springer.


Corrected reprint, New York, NY: Dover, 1995.

Solow, Daniel. 2013. How to Read and Do Proofs. Hoboken, NJ: Wiley.

Steinhart, Eric. 2018. More Precisely: The Math You Need to Do Philosophy. Pe-
terborough, ON: Broadview, 2nd ed.

Sykes, Christopher. 1992. BBC Horizon: The strange life and death of Dr. Tur-
ing. URL https://www.youtube.com/watch?v=gyusnGbBSHE.


Szabo, Manfred E. 1969. The Collected Papers of Gerhard Gentzen. Amsterdam:


North-Holland.

Takeuti, Gaisi, Nicholas Passell, and Mariko Yasugi. 2003. Memoirs of a Proof
Theorist: Gödel and Other Logicians. Singapore: World Scientific.

Tamassy, Istvan. 1994. Interview with Róza Péter. Modern Logic 4(3): 277–280.

Tarski, Alfred. 1981. The Collected Works of Alfred Tarski, vol. I–IV. Basel:
Birkhäuser.

Theelen, Andre. 2012. Lego turing machine. URL https://www.youtube.


com/watch?v=FTSAiF9AHN4.

Turing, Alan M. 1937. On computable numbers, with an application to the


“Entscheidungsproblem”. Proceedings of the London Mathematical Society, 2nd
Series 42: 230–265.

Tyldum, Morten. 2014. The imitation game. Motion picture.

Velleman, Daniel J. 2019. How to Prove It: A Structured Approach. Cambridge:


Cambridge University Press, 3rd ed.

Wang, Hao. 1990. Reflections on Kurt Gödel. Cambridge: MIT Press.

Zermelo, Ernst. 1904. Beweis, daß jede Menge wohlgeordnet werden kann.
Mathematische Annalen 59: 514–516. English translation in (Ebbinghaus
et al., 2010, pp. 115–119).

Zermelo, Ernst. 1908. Untersuchungen über die Grundlagen der Mengen-


lehre I. Mathematische Annalen 65(2): 261–281. English translation in
(Ebbinghaus et al., 2010, pp. 189-229).

Zuckerman, Martin M. 1973. Formation sequences for propositional formulas.


Notre Dame Journal of Formal Logic 14(1): 134–138.
