
Notes for formal methods seminar

Part I. Logic

Richard Pettigrew
Department of Philosophy
University of Bristol
Chapter 1

Basic set theory

Those familiar with basic set-theoretical concepts and notation should skip
this chapter. Even those who are not might profitably skip it and return
here when they encounter notation they don’t understand.

1.1 Notation
A set and its members
• A set is a collection of objects.
• The objects in a set are called its elements or members.
• If S is a set and a is a member of S, then we write a ∈ S.
• If S is a set and a is not a member of S, then we write a ∉ S.

Some sets of particular interest

• We write ∅ to denote the empty set. Thus, for all a, a ∉ ∅.
• If S and T are both sets, we say that T is a subset of S if every element of T is an element of S: in this case, we write T ⊆ S. Note that, for every set S, ∅ ⊆ S and S ⊆ S.
• If a1, a2, ..., an are objects, we write {a1, a2, ..., an} for the set whose members are all and only the ai's.
• Suppose S is a set and φ is a property. Then we write {x ∈ S : φ(x)} to denote the subset of S whose elements are all and only the elements of S that have the property φ. Thus, for all a,

    a ∈ {x ∈ S : φ(x)}   iff   (a ∈ S and φ(a))


• If S is a set, we write ℘(S) to denote its power set: that is, the set that contains all and only the subsets of S. Thus,

    T ∈ ℘(S)   iff   T ⊆ S

In particular, for any property φ,

    {x ∈ S : φ(x)} ∈ ℘(S).

But, for an infinite set S, there are many members of ℘(S) that are not of the form {x ∈ S : φ(x)}.

Ordered n-tuples and Cartesian products

• If a and b are objects, we write (a, b) to denote the ordered pair whose first element is a and whose second element is b.

• Extending this notation, (a, b, c) is the ordered triple whose first element is a, second is b, and third is c.

• Generalizing this notation, (a1, ..., an) is the ordered n-tuple whose first element is a1, whose second is a2, whose third is a3, and so on.

• If S and T are sets, then S × T is the set of all ordered pairs (a, b) such that a ∈ S and b ∈ T.

• If S1, ..., Sn are sets, then S1 × ... × Sn is the set of all ordered n-tuples (a1, ..., an) such that a1 ∈ S1, ..., an ∈ Sn. We call S1 × ... × Sn the Cartesian product of S1, ..., Sn.

• If S is a set, then S^n = S × ... × S (n times). That is, S^n is the set of ordered n-tuples (a1, ..., an) such that a1, ..., an ∈ S.

Relations

• An n-place relation over a set S is a subset of S^n: that is, it is a set of n-tuples whose elements lie in S.

• We have names for various types of two-place (or binary) relation. Suppose R is a two-place relation over S:

  – R is reflexive iff (∀x ∈ S) Rxx.
  – R is symmetric iff (∀x, y ∈ S)(Rxy → Ryx).
  – R is transitive iff (∀x, y, z ∈ S)((Rxy & Ryz) → Rxz).

• We say that a binary relation R is an equivalence relation if R is reflexive, symmetric, and transitive.

• Given an equivalence relation R over a set S, and given a ∈ S, we define the equivalence class of a as follows:

    [a] = {x ∈ S : Rax}

• It is a theorem that, if R is an equivalence relation over S, then the equivalence classes of R partition S into mutually exclusive subsets. That is:

  – for all a ∈ S, there is b ∈ S (namely, a itself) such that a ∈ [b]; and
  – for all a, b ∈ S, either [a] = [b] or [a] ∩ [b] = ∅.
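For readers who like to see the definitions computed, here is a small Python sketch (ours, not part of the notes): it checks the three properties for a relation on a finite set and computes equivalence classes. The function names and the example relation ("same parity") are our own choices.

```python
# Illustrative sketch: equivalence relations on a finite set.

def is_equivalence(S, R):
    """R is a set of ordered pairs over S; test reflexivity, symmetry, transitivity."""
    reflexive = all((x, x) in R for x in S)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, w) in R
                     for (x, y) in R
                     for (z, w) in R
                     if y == z)
    return reflexive and symmetric and transitive

def equivalence_class(S, R, a):
    """[a] = {x in S : R a x}."""
    return {x for x in S if (a, x) in R}

S = {0, 1, 2, 3, 4}
R = {(x, y) for x in S for y in S if x % 2 == y % 2}   # same parity

assert is_equivalence(S, R)
# The equivalence classes partition S into {0, 2, 4} and {1, 3}.
print(sorted(equivalence_class(S, R, 0)), sorted(equivalence_class(S, R, 1)))
```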

Functions

• If S and T are sets, then f : S → T means that f is a function from S into T. That is, for each a ∈ S, f(a) ∈ T. We call f(a) the value of f at a.

• If f : S → T, then S is called the domain of f and T is called the codomain of f.

• If f : S → T, then the range of f is the subset of T that contains the values of f at the various elements of S. That is, the range of f is

    {x ∈ T : there is y ∈ S such that f(y) = x}

• If f : S → T, then we say that f is one-one or injective if, for every a, b ∈ S, if a ≠ b, then f(a) ≠ f(b).

• If f : S → T, then we say that f is onto or surjective if, for every b ∈ T, there is a ∈ S such that f(a) = b. That is, the range of f is T.

• If f : S → T, then we say that f is a one-one correspondence, or bijection, if f is one-one and onto: that is, injective and surjective.
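Similarly, for a function between small finite sets, represented as a Python dictionary, injectivity and surjectivity can be checked directly from the definitions. This sketch is ours; the helper names are not from the notes.

```python
# Illustrative sketch: injective, surjective, bijective for finite functions.

def is_injective(f):
    values = list(f.values())
    return len(values) == len(set(values))          # no two arguments share a value

def is_surjective(f, codomain):
    return set(f.values()) == set(codomain)         # the range is the whole codomain

def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)

f = {1: 'a', 2: 'b', 3: 'c'}        # f : {1, 2, 3} -> {'a', 'b', 'c'}
print(is_injective(f), is_surjective(f, {'a', 'b', 'c'}), is_bijective(f, {'a', 'b', 'c'}))
```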
Chapter 2

What is a logic?

In this part of the notes, we will discuss four logics: classical proposi-
tional logic, classical first-order logic, modal propositional logic, and classical
second-order logic. In chapters 3, 4, 6, and ??, we will present them. In
the case of the former three, we will prove two central results about each,
namely, their soundness and completeness theorems. In chapter 5, we will
present some further results about classical first-order logic that reveal its
weaknesses. In chapter ??, we will see how these may be overcome, but at
a serious cost. This will build on work on the logical study of arithmetic
presented in chapter ??.

To begin, we give a very brief definition of what will count as a logic. At


first, it will seem very general and rather abstract. However, when we see
the definition of the various logics in chapters 3, 4, 6, and ??, we will begin
to see why the generality is desirable.
A logic comprises three components:
• A language
• A semantics
• A proof theory
The language of a logic consists of a set of symbols—its alphabet—and a
set of some of the finite strings of those symbols—its well-formed formulae.
Usually, we specify which strings are to count as well-formed formulae by
specifying a grammar for our language; this will be the case in the systems
defined below.
The semantics of a logic comprises three components: a specification of
the sort of mathematical object that will count as a model relative to which


the well-formed formulae of the logic can be assigned truth values; a set of truth values; and a function that takes any model and any well-formed formula to a truth value—this is usually called a valuation function. The latter allows us to define the relation M ⊨ φ that holds between certain models and certain well-formed formulae.
The proof theory of a logic identifies the sets of well-formed formulae S and the well-formed formulae φ for which φ can be proved or derived from S—in symbols, S ⊢ φ. There are various ways to specify this. One condition is that the way of identifying it must be ‘effective’—that is, it must consist of a mechanical procedure that a computer may carry out.
Chapter 3

Classical Propositional Logic

Enough abstract definitions. Let’s meet our first example of a formal system.
It’s the system of classical propositional logic (abbreviated CPL). Remember
that, in order to specify it, we must specify its language, semantics, and proof
theory.

3.1 The language


3.1.1 The alphabet
The alphabet of propositional logic consists of the following symbols:

1. Propositional letters p, q, r, ... (and with subscripts)

2. Propositional connectives: → and ¬

3. Punctuation: ( and )

The propositional connectives and punctuation are the logical symbols of the
language. The propositional letters are its non-logical symbols.
The strings of this formal system are those (finite) sequences composed
of symbols from the alphabet.
Those who have studied logic before may be concerned that our language
lacks the propositional connectives ∧, ∨, and ↔. However, this is not a
problem since there is a natural sense in which these can be defined in terms
of the symbols we do have. We do this in the next section, once we have
specified the grammar for propositional logic.


3.1.2 The grammar


Next, we must say which of these strings are to count as the well-formed
formulae of propositional logic. To do this, we describe the grammar of the
language, which tells us how symbols may be combined in a well-formed
way. By analogy, the grammar of English tells us how the symbols of its
alphabet (‘a’, ‘b’, ‘c’, etc.) may be combined to produce a sentence that
is meaningful: for instance, it tells us that an adjective must precede a
noun, and an adverb must precede or come after a verb. The grammar of
propositional logic consists of similar rules.
We start by stating so-called ‘closure rules’ for well-formed formulae:

(1) Any propositional letter is a well-formed formula.

(2) If φ and ψ are well-formed formulae, then ¬φ and (φ → ψ) are well-formed formulae.

Henceforth, we will abbreviate ‘well-formed formula’ to ‘wff’.


This tells us, of a large set of strings, that they are wffs: for instance, it tells us that ((p → q) → q) is a wff. (Why?) But it doesn't tell us of any string that it is not a wff: for instance, it doesn't tell us that p →) is not a wff. To stipulate this, we want to say that all formulae are obtained by starting with the propositional letters in (1) and applying the rules of generation in (2). We do this by saying the following: the set of wffs is the smallest set of strings of which (1) and (2) are true.
An important point: we define the set of wffs here by saying that it is the smallest set that includes certain things (the propositional letters) and is closed under certain operations (the operations of negation and the conditional). This mode of definition is known as recursive definition. The same mode of definition suffices to define the natural numbers: the set of natural numbers is the smallest set that includes zero and is closed under the operation of adding one. Thus, the theory of the language of propositional logic is as strong as the theory of arithmetic. This is important, since logical facts are often cited in attempts to provide a foundation for arithmetic.
Another consequence of this definition of the set of well-formed formulae
is that we can prove theorems about them using mathematical induction.
Suppose I can prove the following two facts:

(a) Each propositional letter has property Φ.

(b) If wffs φ and ψ have property Φ, then the wffs ¬φ and (φ → ψ) have property Φ.

Then it follows by mathematical induction that all wffs have property Φ.


We are now in a position to define the other familiar propositional connectives, ∧, ∨, and ↔. By definition, if φ and ψ are wffs, we let

    (φ ∧ ψ) =df. ¬(φ → ¬ψ)
    (φ ∨ ψ) =df. (¬φ → ψ)
    (φ ↔ ψ) =df. ((φ → ψ) ∧ (ψ → φ))

When we come to describe the semantics of propositional logic, we will see


that the wffs on the right-hand side of these equations have the requisite
semantic properties.
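To make the recursive definition concrete, here is a minimal Python sketch (ours, not part of the notes): wffs are represented as a small class hierarchy, and ∧, ∨, and ↔ are implemented exactly as the abbreviations just given. The names Atom, Not, and Implies are our own choices.

```python
# Illustrative sketch: wffs of CPL and the defined connectives.

from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    name: str            # a propositional letter, e.g. "p"

@dataclass(frozen=True)
class Not:
    sub: object          # ¬φ

@dataclass(frozen=True)
class Implies:
    left: object         # (φ → ψ)
    right: object

def And(phi, psi):       # (φ ∧ ψ) =df. ¬(φ → ¬ψ)
    return Not(Implies(phi, Not(psi)))

def Or(phi, psi):        # (φ ∨ ψ) =df. (¬φ → ψ)
    return Implies(Not(phi), psi)

def Iff(phi, psi):       # (φ ↔ ψ) =df. ((φ → ψ) ∧ (ψ → φ))
    return And(Implies(phi, psi), Implies(psi, phi))

p, q = Atom("p"), Atom("q")
example = Implies(Implies(p, q), q)   # the wff ((p → q) → q) from the text
print(example)
```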

3.2 The semantics


In the previous section, we described the language of classical propositional
logic (abbreviated ‘CPL’). In this section, we describe its semantics. As
noted above, we need to say what sort of mathematical object counts as a
model, we need to specify a truth-value set, and we need to define a function
that takes a model and a wff to an element of the truth-value set.
In propositional logic, our truth values are True and False. Thus, our
set of truth values is {T, F }.
And a model is a truth value assignment:

Definition 3.2.1 (Propositional model) A propositional model M =


(I) is a function I that takes each propositional letter to a truth value. That
is, I : {p, q, r, ...} → {T, F}.

Given a propositional model M = (I), we extend I to produce an interpretation—also denoted I—which takes each wff to its truth value in the model M.

Definition 3.2.2 (Interpretation) Given a propositional model M = (I),


we extend I to give an interpretation recursively as follows:
Suppose φ and ψ are wffs and suppose further that we have already defined I(φ) and I(ψ). Then

• Let I(φ → ψ) = T if I(φ) = F or I(ψ) = T, and F otherwise.

• Let I(¬φ) = T if I(φ) = F, and F otherwise.

Some notation

(1) If φ is a wff and M is a propositional model, we write M ⊨ φ if I(φ) = T. In this case, we say that φ is true in the model M.

(2) We write S ⊨ φ if M ⊨ φ for any model M such that M ⊨ ψ for all ψ in S. In this case, we say that φ is a logical consequence of S.

(3) We write ⊨ φ if φ is a wff and M ⊨ φ for any propositional model M. In this case, we say that φ is logically valid.

(4) If S is a set of wffs and M is a model, we write M ⊨ S if M ⊨ φ for all φ in S.

(5) If S is a set of wffs, we say that S is satisfiable if there is a propositional model M such that M ⊨ S.

(6) We say that S is finitely satisfiable if every finite subset of S is satisfiable. That is, for all finite T ⊆ S, there is M such that M ⊨ T.

(7) Given a propositional model M, let

    Th(M) = {φ : φ is a wff and M ⊨ φ}

Th(M) is called the theory of the model M.
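Since a propositional model is just an assignment of truth values to letters, the valuation clauses and the relation S ⊨ φ can be checked mechanically over the finitely many letters that occur in the formulae involved. The following Python sketch (ours, continuing the previous one) does this by enumerating assignments; `value` and `entails` are our own names.

```python
# Illustrative sketch: evaluating wffs and checking S ⊨ φ by model enumeration.

from itertools import product

def letters(phi):
    if isinstance(phi, Atom):
        return {phi.name}
    if isinstance(phi, Not):
        return letters(phi.sub)
    return letters(phi.left) | letters(phi.right)

def value(phi, I):
    """Extend the assignment I (a dict letter -> bool) to the wff phi."""
    if isinstance(phi, Atom):
        return I[phi.name]
    if isinstance(phi, Not):
        return not value(phi.sub, I)
    return (not value(phi.left, I)) or value(phi.right, I)   # clause for →

def entails(S, phi):
    """S ⊨ φ: every assignment making all of S true makes φ true."""
    names = sorted(set().union(letters(phi), *map(letters, S)))
    for bits in product([True, False], repeat=len(names)):
        I = dict(zip(names, bits))
        if all(value(psi, I) for psi in S) and not value(phi, I):
            return False
    return True

p, q = Atom("p"), Atom("q")
print(entails([Implies(p, q), p], q))     # modus ponens is valid: True
print(entails([], Implies(p, p)))         # ⊨ (p → p): logically valid
```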

3.3 The proof theory


Many different combinations of sets of axioms and sets of rules of inference give rise to the same proof relation ⊢ in propositional logic. We choose a particular set. We will present them in the course of giving a definition of a proof of a wff φ from a set of assumptions S.

Definition 3.3.1 (Proof) Suppose S is a set of wffs and suppose that P = (φ1, ..., φn) is a sequence of wffs. Then we say that P is a proof of φn from assumptions S if, for every φi in P, one of the following is true:

(1) φi is in S,

(2) φi is an axiom. That is, φi is one of the following:

    • (φ → (ψ → φ)), for some wffs φ and ψ,
    • ((φ → (ψ → θ)) → ((φ → ψ) → (φ → θ))), for some wffs φ, ψ, and θ,
    • ((¬ψ → ¬φ) → (φ → ψ)), for some wffs φ and ψ, or

(3) There are j1, j2 < i such that φj1 is ψ → φi and φj2 is ψ. In this case, we say that φi is inferred from φj1 and φj2 by modus ponens.

Some notation

(1) We write S ⊢CPL φ if there is a proof of φ from assumptions S. In this case, we say that φ is a syntactic consequence of S.

(2) We write ⊢CPL φ if there is a proof of φ from the empty set of assumptions, ∅. In this case, we say that φ is a theorem of propositional logic.

(3) We say that S is consistent if there is no proof of ⊥ =df. ¬(p → (p → p)) from assumptions S; we say that S is inconsistent if there is such a proof.
Throughout this chapter, we will drop the subscript on ⊢CPL, since we will always be talking about proofs within classical propositional logic. However, in future chapters, when we wish to compare provability in different systems, we might introduce it again.
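The definition of a proof is itself ‘effective’ in the sense of Chapter 2: whether a given sequence of wffs is a proof can be checked mechanically. The sketch below (ours, continuing the earlier ones) checks a list of annotated lines; for brevity it verifies only the first axiom schema and modus ponens, so it illustrates the idea rather than being a full checker.

```python
# Illustrative sketch: checking an annotated proof from assumptions.

def is_axiom1(phi):
    # an instance of (φ → (ψ → φ))
    return (isinstance(phi, Implies)
            and isinstance(phi.right, Implies)
            and phi.left == phi.right.right)

def check_proof(assumptions, lines):
    """lines: list of (wff, justification); justification is
    ("assumption",), ("axiom1",), or ("mp", i, j) with i, j earlier indices."""
    for i, (phi, just) in enumerate(lines):
        if just[0] == "assumption":
            if phi not in assumptions:
                return False
        elif just[0] == "axiom1":
            if not is_axiom1(phi):
                return False
        elif just[0] == "mp":
            j1, j2 = just[1], just[2]
            major, minor = lines[j1][0], lines[j2][0]
            ok = (j1 < i and j2 < i
                  and isinstance(major, Implies)
                  and major.left == minor and major.right == phi)
            if not ok:
                return False
        else:
            return False
    return True

p, q = Atom("p"), Atom("q")
proof = [
    (Implies(p, Implies(q, p)), ("axiom1",)),   # axiom instance
    (p, ("assumption",)),
    (Implies(q, p), ("mp", 0, 1)),              # modus ponens on lines 0 and 1
]
print(check_proof([p], proof))   # True: a proof of (q → p) from the assumption p
```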
We complete our discussion of the syntax of our system by stating the following important theorems about our formal system:

Theorem 3.3.2 If S is inconsistent, then S ⊢ φ, for any wff φ.

Proof. Suppose S ⊢ ⊥ =df. ¬(p → (p → p)). Then

    S ⊢ ¬(p → (p → p))                                            Assumption
    S ⊢ ¬(p → (p → p)) → (¬φ → ¬(p → (p → p)))                    Axiom
    S ⊢ (¬φ → ¬(p → (p → p)))                                     Modus ponens
    S ⊢ (¬φ → ¬(p → (p → p))) → ((p → (p → p)) → φ)               Axiom
    S ⊢ ((p → (p → p)) → φ)                                       Modus ponens
    S ⊢ (p → (p → p))                                             Axiom
    S ⊢ φ                                                         Modus ponens

Theorem 3.3.3 (Deduction Theorem) If S ∪ {φ} ⊢ ψ, then S ⊢ φ → ψ.

Theorem 3.3.4 (Proof by Contradiction) If S ∪ {¬φ} ⊢ ⊥, then S ⊢ φ.

Proof. By the Deduction Theorem, if S ∪ {¬φ} ⊢ ⊥, then S ⊢ ¬φ → ⊥.

    S ⊢ ¬φ → ¬(p → (p → p))                                       Assumption
    S ⊢ (¬φ → ¬(p → (p → p))) → ((p → (p → p)) → φ)               Axiom
    S ⊢ ((p → (p → p)) → φ)                                       Modus ponens
    S ⊢ (p → (p → p))                                             Axiom
    S ⊢ φ                                                         Modus ponens

as required. □

3.4 Questions

(i) Show that (p → (q → p)) is a wff of propositional logic.

(ii) Show, using mathematical induction, that no wff of propositional logic includes →) within it.

(iii) Show that (p →) is not a wff.

(iv) Show, by appealing to mathematical induction, that (p → qp) is not a wff.

(v) Show that

    I(φ ∧ ψ) = T if I(φ) = T and I(ψ) = T, and F otherwise,

where φ ∧ ψ is as defined above. Do likewise for φ ∨ ψ and φ ↔ ψ.

3.5 The soundness and completeness theorems


This completes our description of the language, semantics, and proof theory
of propositional logic. We turn now to the metatheory. There are three sorts
of results in the metatheory of a formal system: those that concern the proof
theory of the system, those that concern its model theory, and those that
establish the links between the two. We begin with the two fundamental
results in the latter category: the Soundness and Completeness Theorems.

Definition 3.5.1 (Soundness) We say that a logic is sound with respect to a semantics if all syntactic consequences are also semantic consequences. That is, if

    S ⊢ φ   implies   S ⊨ φ

Definition 3.5.2 (Completeness) We say that a logic is complete with respect to a semantics if every semantic consequence is a syntactic consequence. That is, if

    S ⊨ φ   implies   S ⊢ φ

It is the task of the next two sections to prove that propositional logic is sound and complete with respect to the semantics we have just described: that is, a wff φ can be derived from a set of assumptions S if, and only if, φ is true in all models in which each wff in S is true.

3.5.1 Soundness
Theorem 3.5.3 (Soundness theorem for propositional logic) If S is a set of wffs and φ is a single wff, then

    S ⊢ φ   implies   S ⊨ φ

Proof. We will prove the following:

    For any n, if (φ1, ..., φn) is a proof of φn from a set of assumptions S, then S ⊨ φn.

We call this our inductive hypothesis.


Our proof then has two stages:

• First, we will show that the inductive hypothesis holds for n = 1: this is called the Base Case.

• Then, we will show that, if it holds for all k < n, then it holds for n: this is called the Inductive Step.

Proofs that employ this strategy are called proofs by mathematical induction.

Base Case We show that, if P = (φ1) is a proof of φ1 from assumptions S, then S ⊨ φ1. If P = (φ1) is a proof of φ1 from assumptions S, then either φ1 is an axiom or φ1 is in S.

• Suppose φ1 is an axiom. It is easy to see that M ⊨ φ1, for any model M. (Try it!) So, a fortiori, S ⊨ φ1, as required.

• Suppose φ1 is in S. Then clearly, if M ⊨ S, then M ⊨ φ1. That is, S ⊨ φ1, as required.

Inductive Step Assume the following: For any k < n and any set of wffs S′, if P′ = (φ′1, ..., φ′k) is a proof of φ′k from assumptions S′, then S′ ⊨ φ′k. We call this the inductive hypothesis. We now prove the following: For any set of wffs S, if P = (φ1, ..., φn) is a proof of φn from assumptions S, then S ⊨ φn.
This time there are three options: φn is an axiom, φn is in S, or φn is derived from φi and φj by modus ponens. The first two options are dealt with exactly as in the Base Case. Thus, we turn to the third option:

• Suppose i, j < n and φn is derived from φi and φj by modus ponens: that is, φi = ψ → φn and φj = ψ. Then (φ1, ..., φi) is a proof of φi from assumptions S and (φ1, ..., φj) is a proof of φj from assumptions S. Thus, by the inductive hypothesis, S ⊨ ψ → φn and S ⊨ ψ. It follows immediately that S ⊨ φn, as required.

This completes our induction. □

3.5.2 Completeness

Theorem 3.5.4 (Completeness theorem for propositional logic) If S is a set of wffs and φ is a single wff, then

    S ⊨ φ   implies   S ⊢ φ

Our strategy is not to prove the theorem directly. Rather, we're going to state a theorem that is equivalent to it (Theorem 3.5.5), prove that it is equivalent (Proposition 3.5.6), and prove that instead. Here is the theorem that is equivalent to the completeness theorem.

Theorem 3.5.5 (The alternative Completeness theorem) Suppose S is a set of wffs. If S is consistent, then S is satisfiable.

And here is the proof that it is equivalent to the completeness theorem.

Proposition 3.5.6 Theorems 3.5.4 and 3.5.5 are equivalent.

Proof. First, we assume that Theorem 3.5.4 is true and prove that Theorem 3.5.5 follows. Then, we assume that Theorem 3.5.5 is true and prove that Theorem 3.5.4 follows.

• Suppose Theorem 3.5.4 is true. We want to show that Theorem 3.5.5 follows. To that end, suppose that S is consistent. We must show that there is a model M such that M ⊨ S.
  S is consistent. Thus, S ⊬ ⊥. Thus, by the contrapositive of Theorem 3.5.4, it follows that S ⊭ ⊥. That is, it is not the case that every model that makes S true also makes ⊥ true. Thus, there is a model in which S is true and ⊥ is false. A fortiori, there is a model in which S is true, as required.
  Thus, Theorem 3.5.4 entails Theorem 3.5.5.

• Now, suppose Theorem 3.5.5 holds. And suppose that S ⊨ φ. Then there is no model of S ∪ {¬φ}. Thus, by the contrapositive of Theorem 3.5.5, S ∪ {¬φ} is not consistent. That is,

    S ∪ {¬φ} ⊢ ⊥

It follows from this, by Theorem 3.3.4 above, that

    S ⊢ φ

as required. Thus, Theorem 3.5.5 entails Theorem 3.5.4.

Thus, Theorem 3.5.4 and Theorem 3.5.5 are equivalent. □


It follows that, in order to prove Theorem 3.5.4, it will suffice to prove
Theorem 3.5.5.

The structure of the proof


In this section, we describe the structure of the proof of Theorem 3.5.5. In
the next section, we fill in the details.
The proof proceeds by proving the following two propositions:

Proposition 3.5.7 (Lindenbaum Lemma for Propositional Logic) If


S is a consistent set of w↵s, then there is a set S1 of w↵s with the following
properties:

(a) S ✓ S1 .
(That is, every formula in S is in S1 .)

(b) S1 is maximal.
(That is, for every w↵ , either is in S1 or ¬ is in S1 .)

(c) S∞ is consistent.
    (That is, S∞ ⊬ ⊥.)

Proposition 3.5.8 If S∞ is a maximal consistent set of wffs, then there is a propositional model M such that, for any wff ψ,

    M ⊨ ψ   iff   ψ is in S∞

We might consider Proposition 3.5.7 as laying the foundations on which the model will be built. In Proposition 3.5.8, we build the model on top of those foundations.

Laying the foundations

Proof of Proposition 3.5.7. Suppose S is a consistent set of wffs. We are going to construct S∞ by augmenting S infinitely many times, as follows.

1. First, we list all the wffs of propositional logic: φ1, φ2, φ3, ....

2. Then, we define Sn recursively as follows:

   • First, we let S0 = S.

   • Second, we let

       Sn+1 = Sn ∪ {φn}   if Sn ∪ {φn} is consistent
              Sn ∪ {¬φn}  otherwise

3. Finally, we let

       S∞ = ⋃_{n≥0} Sn = {φ : φ is in Sn for some n}

To complete our proof, we need to show that S∞, which we just constructed, is maximal, consistent, and contains S. By construction, it contains S and is maximal. Thus, we need only show that S∞ is consistent. Suppose not. Then S∞ ⊢ ⊥. But since proofs are finite sequences of formulae, there must be a finite subset S′ of S∞ such that S′ ⊢ ⊥. But then there is n such that Sn ⊢ ⊥. But it is easy to prove by induction that, if S is consistent, then each Sn is consistent: by assumption, S0 = S is consistent; and, if Sn is consistent, then either Sn ∪ {φn} or Sn ∪ {¬φn} is consistent (or possibly both). Thus, we have a contradiction. So, S∞ is consistent. This completes the proof of Proposition 3.5.7. We have laid the foundations for our model. □
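The Lindenbaum construction can be simulated directly when only finitely many wffs are in play. The sketch below (ours, reusing the helpers from the earlier sketches) enumerates wffs up to a fixed nesting depth and adds each wff or its negation, whichever preserves consistency; as a stand-in for proof-theoretic consistency it tests satisfiability by enumerating models, which is legitimate for CPL given the soundness and completeness theorems of this chapter.

```python
# Illustrative sketch: a finite Lindenbaum-style extension of a consistent set.

from itertools import product

def satisfiable(S):
    names = sorted(set().union(*map(letters, S))) if S else []
    return any(all(value(psi, dict(zip(names, bits))) for psi in S)
               for bits in product([True, False], repeat=len(names)))

def enumerate_wffs(atoms, depth):
    """All wffs over the given letters with at most `depth` rounds of connectives."""
    current = [Atom(a) for a in atoms]
    for _ in range(depth):
        current = current + [Not(x) for x in current] + \
                  [Implies(x, y) for x in current for y in current]
    return current

def lindenbaum(S, atoms, depth=1):
    extension = list(S)
    for phi in enumerate_wffs(atoms, depth):
        if satisfiable(extension + [phi]):
            extension.append(phi)
        else:
            extension.append(Not(phi))   # safe: one of φ, ¬φ preserves consistency
    return extension

S_inf = lindenbaum([Implies(Atom("p"), Atom("q")), Atom("p")], ["p", "q"])
print(Atom("q") in S_inf)   # q must end up in the extension: True
```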

Building the model

We must turn now to proving Proposition 3.5.8; that is, we must now build the model on the foundations we have laid.
Suppose S∞ is a maximal consistent set of wffs. We now use S∞ to construct a propositional model M = (I) with the following property: for all wffs ψ,

    M ⊨ ψ   iff   ψ is in S∞

We define I as follows: for all propositional letters p,

    I(p) = T if p is in S∞, and F otherwise.

We now prove that, when we extend I recursively to give an interpretation—also denoted I—we get that, for all wffs ψ,

    I(ψ) = T   iff   ψ is in S∞

We proceed by induction on the complexity of ψ:

Base Case If ψ is a propositional letter, then, by definition, I(ψ) = T iff ψ is in S∞.

Inductive Step There are two cases:

• ψ is ¬φ.

  (i) First, we show that, if I(¬φ) = T, then ¬φ is in S∞. Suppose I(¬φ) = T. Then I(φ) = F. By the inductive hypothesis, it follows that φ is not in S∞. By the maximality of S∞, it follows that ¬φ is in S∞, as required.

  (ii) Second, we show that, if ¬φ is in S∞, then I(¬φ) = T. Suppose ¬φ is in S∞. Then, by the consistency of S∞, φ is not in S∞, since

      φ, ¬φ ⊢ ⊥

  for any wff φ. (Show this!) Thus, by the inductive hypothesis, I(φ) = F. So I(¬φ) = T.

• ψ is (φ → θ).

  (i) First, we show that, if I(φ → θ) = T, then (φ → θ) is in S∞. Suppose I(φ → θ) = T. Then I(φ) = F or I(θ) = T. By the inductive hypothesis, it follows that φ is not in S∞ or θ is in S∞. Now suppose, for contradiction, that ¬(φ → θ) is in S∞. If φ is not in S∞, then by maximality ¬φ is in S∞, and we have inconsistency, since ¬φ, ¬(φ → θ) ⊢ ⊥; if θ is in S∞, we likewise have inconsistency, since θ, ¬(φ → θ) ⊢ ⊥. Either way, this contradicts the consistency of S∞. Thus, by the maximality of S∞, (φ → θ) is in S∞.

  (ii) Second, we show that, if (φ → θ) is in S∞, then I(φ → θ) = T. Suppose (φ → θ) is in S∞. Then, since S∞ is consistent, it cannot be the case that both φ and ¬θ are in S∞, since

      φ, ¬θ, φ → θ ⊢ ⊥

  Thus, φ is not in S∞ or ¬θ is not in S∞; in the latter case, θ is in S∞ by maximality. By the inductive hypothesis, it follows that I(φ) = F or I(θ) = T. In either case, I(φ → θ) = T, as required.

This completes our proof of Proposition 3.5.8, and that completes our proof of the Completeness Theorem for Propositional Logic.
Chapter 4

Classical first-order logic

In propositional logic, we represent propositions as truth-functional combinations of further propositions. In first-order logic, we delve a little more deeply into the logical structure of the propositional letters out of which all formulae of propositional logic are ultimately constructed. As before, we meet the language first, then the semantics and the proof theory, in that order. Finally, we marry the two with soundness and completeness theorems.

4.1 The language


4.1.1 The alphabet
The alphabet of first-order logic consists of the following symbols:

1. Constants: a, b, c, ... (and with subscripts)

2. Free variables: a, b, c, ... (and with subscripts)

3. Bound variables: u, v, w, x, y, z (and with subscripts)

4. Function symbols: f^1, f^2, ..., g^1, g^2, ... (and with subscripts)

5. Relation symbols: P^0, P^1, P^2, ..., R^0, R^1, R^2, ... (and with subscripts)

6. Propositional connectives: → and ¬

7. The universal quantifier: ∀

8. Punctuation: ( and ) and ,


The superscripts on the function and relation symbols indicate their arity. Thus, f^n is an n-ary function symbol. That is, it represents an n-place function. Similarly, P^n is an n-ary relation symbol. That is, it represents an n-place relation.
The quantifier, propositional connectives, punctuation, and free and bound variables are the logical symbols of the language. The constants, function, and relation symbols are its non-logical symbols.

Definition 4.1.1 (Signature) A first-order signature Σ is a set of non-logical symbols: that is, a set of constant symbols, function symbols, and relation symbols.

The strings of this formal system are those (finite) sequences composed of symbols from the alphabet. Given a signature Σ, the Σ-strings are the strings whose non-logical symbols are drawn from Σ.

4.1.2 The grammar


Next, we must say which of these strings are to count as the well-formed
formulae of first-order logic. These are considerably more complicated than
in the propositional case.

Terms First, we identify a set of strings that we call the terms of the
language. Roughly, these are the naming phrases of the language; they are
the strings that are used to denote objects.
Definition 4.1.2 (Term)
(1) Every constant and every free variable is a term.

(2) If t1, ..., tn are terms and f^n is a function symbol, then f^n(t1, ..., tn) is a term.

(3) There are no other terms.

Note that, in (2), the number of terms t1, ..., tn must match the superscript on the function symbol. As noted above, we call that superscript the arity of the function symbol.
Atomic formulae Next, we pick out the atomic formulae of the language.
These will be the building blocks of the well-formed formulae. Roughly, they
assert that a particular relation holds between particular objects.
Definition 4.1.3 (Atomic formula)

(1) If t1, ..., tn are terms and R^n is a relation symbol, then R^n(t1, ..., tn) is an atomic formula.

(2) There are no other atomic formulae.

As in the definition of terms above (Definition 4.1.2), note that the number of terms t1, ..., tn must match the superscript on the relation symbol. Again, we call that superscript the arity of the relation symbol.
Well-formed formulae Finally, we are in a position to pick out the well-
formed formulae from amongst the strings of symbols of the first-order lan-
guage.

Definition 4.1.4 (Well-formed formula) The set of well-formed formulae (or wffs) of first-order logic is the smallest set that has the following closure properties:

(1) Atomic formulae are well-formed formulae.

(2) If φ and ψ are well-formed formulae, then ¬φ and (φ → ψ) are well-formed formulae.

(3) If φ is a well-formed formula, then ∀xφ[x/a] is a well-formed formula, where φ[x/a] is the string obtained from the well-formed formula φ by substituting the bound variable x in place of the free variable a throughout.

Definition 4.1.5 (Σ-wff) Given a signature Σ, a Σ-wff is a wff whose non-logical symbols are drawn from Σ.

Those who have studied logic before may be concerned that our language lacks the existential quantifier ∃. However, as in the case of the propositional connectives, this is not a problem since there is a natural sense in which this can be defined in terms of the symbols we do have. Thus, by definition, if φ is a wff, we let

    ∃xφ[x/a] =df. ¬∀x¬φ[x/a]

When we come to describe the semantics of first-order logic, we will see that the wff on the right-hand side of this equation has the requisite semantic properties.

4.2 The semantics

In the previous section, we described the language of classical first-order logic (FOL). As noted above, we now need to say what sort of mathematical object counts as a model over a given signature Σ, we need to specify a truth-value set, and we need to define a function that takes a Σ-model and a Σ-wff to an element of the truth-value set.

4.2.1 First-order models

Definition 4.2.1 (Σ-model) Given a signature Σ, a Σ-model is a pair M = (D, I), where D is a set and I is a function defined as follows:

(1) For each constant symbol c in Σ, I(c) ∈ D.

(2) For each free variable a, I(a) ∈ D.

(3) For each function symbol f^n in Σ, I(f^n) : D^n → D.

(4) For each relation symbol R^n in Σ, I(R^n) ⊆ D^n.

We call D the domain of M and we call I the interpretation function of M.

4.2.2 Truth-value set

We have said what it is to be a model of first-order logic. Before we can complete our description of first-order semantics, we must say what the truth-value set is. It is {T, F}: that is, it contains only two truth values, True and False.

4.2.3 Extending the interpretation function

With this in hand, we can complete our description of the semantics for first-order logic by saying, given a Σ-wff φ and a model M over signature Σ, which truth value is assigned to φ interpreted in M.
We do this by extending the interpretation function I of M in three stages: by the third stage, given a wff φ, there is a value I(φ) ∈ {T, F}.

• First step First, we extend I so that, for each term t, it gives I(t) ∈ D.

  (i) We already have that, for any constant symbol c or free variable a, I(c) ∈ D and I(a) ∈ D.

  (ii) Now suppose t1, ..., tn are terms and f^n is a function symbol. And suppose that we have already extended I to give values for I(t1), ..., I(tn) ∈ D. Then let

      I(f^n(t1, ..., tn)) =df. I(f^n)(I(t1), ..., I(tn))

  That is, given f^n(t1, ..., tn), I takes the n-tuple (I(t1), ..., I(tn)) and applies the function I(f^n) to it.

• Second step Next, we extend I so that, for each atomic formula φ, it gives I(φ) ∈ {T, F}. Suppose t1, ..., tn are terms and R^n is a relation symbol. Suppose further that we have already extended I to give values for I(t1), ..., I(tn) ∈ D. Then let

      I(R^n(t1, ..., tn)) =df. T if (I(t1), ..., I(tn)) ∈ I(R^n), and F if (I(t1), ..., I(tn)) ∉ I(R^n).

• Third step Finally, we extend I so that, for each wff φ, it gives I(φ) ∈ {T, F}. Before we do this, we need to introduce some notation: given an interpretation function I, a free variable a, and d ∈ D, let I_a^d be the interpretation function obtained from I by setting I(a) = d.
  Now, suppose φ and ψ are wffs and suppose further that we have already extended I to give values for I(φ) and I(ψ) ∈ {T, F}. Then

  (i) Let I(φ → ψ) =df. T if I(φ) = F or I(ψ) = T, and F otherwise.

  (ii) Let I(¬φ) =df. T if I(φ) = F, and F otherwise.

  (iii) Let I(∀xφ[x/a]) =df. T if, for every d ∈ D, I_a^d(φ) = T, and F otherwise.
Definition 4.2.2 Suppose M = (D, I) is a model, φ is a wff containing free variables a1, ..., an, and d1, ..., dn are elements of D. Then we write

    M ⊨ φ[d1/a1, ..., dn/an]

to mean that

    M_{a1,...,an}^{d1,...,dn} ⊨ φ

where M_{a1,...,an}^{d1,...,dn} = (D, I_{a1,...,an}^{d1,...,dn}): recall that I_{a1,...,an}^{d1,...,dn} differs from I only in that it takes I(ai) = di for i = 1, 2, ..., n.
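When the domain is finite, the three-stage extension of I can be carried out by direct computation. The following Python sketch (ours, independent of the earlier propositional sketches) evaluates terms and wffs in a small finite model; the tuple encoding of terms and formulae, and the example signature (a constant `zero`, a function `s`, a relation `Leq`), are invented purely for illustration. Quantified formulae are handled exactly as in clause (iii), by varying the value assigned to the chosen free variable.

```python
# Illustrative sketch: evaluating first-order wffs in a finite model.
# Terms:    ("var", "a") | ("const", "c") | ("func", "f", t1, ..., tn)
# Formulae: ("rel", "R", t1, ..., tn) | ("not", φ) | ("imp", φ, ψ) | ("all", "a", φ)

def eval_term(t, I):
    kind = t[0]
    if kind in ("var", "const"):
        return I[t[1]]
    func = I[t[1]]                                   # I(f^n) : D^n -> D
    return func(*[eval_term(s, I) for s in t[2:]])

def holds(phi, D, I):
    kind = phi[0]
    if kind == "rel":
        args = tuple(eval_term(t, I) for t in phi[2:])
        return args in I[phi[1]]                     # I(R^n) ⊆ D^n
    if kind == "not":
        return not holds(phi[1], D, I)
    if kind == "imp":
        return (not holds(phi[1], D, I)) or holds(phi[2], D, I)
    if kind == "all":                                # ∀x φ[x/a]: every d ∈ D works
        a, body = phi[1], phi[2]
        return all(holds(body, D, {**I, a: d}) for d in D)
    raise ValueError(kind)

# Example model: domain {0, 1, 2}, a constant "zero", a successor-like
# function "s" (capped at 2), and a binary relation "Leq" interpreted as ≤.
D = {0, 1, 2}
I = {"zero": 0,
     "s": lambda d: min(d + 1, 2),
     "Leq": {(x, y) for x in D for y in D if x <= y}}

phi = ("all", "a", ("rel", "Leq", ("const", "zero"), ("var", "a")))   # ∀x Leq(zero, x)
print(holds(phi, D, I))    # True: 0 ≤ d for every d in the domain
```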

4.3 The proof theory

Many different combinations of sets of axioms and sets of rules of inference give rise to the same theorems in first-order logic. We choose a particular set. We will present them in the course of giving a definition of a proof of a wff φ from a set of assumptions S.

Definition 4.3.1 (Proof) Suppose S is a set of wffs and suppose that P = (φ1, ..., φn) is a sequence of wffs. Then we say that P is a proof of φn from assumptions S if, for every φi in P, one of the following is true:

(1) φi is in S,

(2) φi is an axiom. That is, φi is one of the following:

    • (φ → (ψ → φ)), for some wffs φ and ψ,
    • ((φ → (ψ → θ)) → ((φ → ψ) → (φ → θ))), for some wffs φ, ψ, and θ,
    • ((¬ψ → ¬φ) → (φ → ψ)), for some wffs φ and ψ, or
    • (∀xφ[x/a] → φ[t/a]), for some wff φ and term t.

(3) There are j1, j2 < i such that φj1 is ψ → φi and φj2 is ψ. In this case, we say that φi is inferred from φj1 and φj2 by modus ponens.

(4) There is j < i such that φi is ∀xφj[x/a], (φ1, ..., φj) is a proof of φj from assumptions S, and a does not occur in any wff in S. In this case, we say that φi is inferred from φj by Universal Generalization.

This time, we write S ⊢FOL φ if there is a proof of φ from assumptions S in classical first-order logic. As in the propositional case, we drop the subscript throughout this chapter, but may introduce it again in future chapters.

4.4 The soundness and completeness theorems

This completes our description of the formal system of first-order logic and its semantics. We turn now to the soundness and completeness theorems.

4.4.1 Soundness

Theorem 4.4.1 (Soundness theorem for first-order logic) If S is a set of wffs and φ is a single wff, then

    S ⊢ φ   implies   S ⊨ φ

Proof. As in the propositional case, we will prove the following:

    For any n, if (φ1, ..., φn) is a proof of φn from a set of assumptions S, then S ⊨ φn.

Base Case This is exactly as in the propositional case, except that we must also show that

    M ⊨ (∀xφ[x/a] → φ[t/a])

for any model M, wff φ, and term t.

Inductive Step Assume the following: For any k < n and any set of wffs S′, if P′ = (φ′1, ..., φ′k) is a proof of φ′k from assumptions S′, then S′ ⊨ φ′k. We call this the inductive hypothesis. We now prove the following: For any set of wffs S, if P = (φ1, ..., φn) is a proof of φn from assumptions S, then S ⊨ φn.
This time there are four options: φn is an axiom, φn is in S, φn is derived from φi and φj by modus ponens, or φn is derived from φi by Universal Generalization. The first two options are dealt with exactly as in the Base Case. The third is dealt with as in the propositional case. Thus, we turn to the final option:

• Suppose i < n and φn is inferred from φi by Universal Generalization. That is, φn is ∀xφi[x/a], a does not occur in S, and (φ1, ..., φi) is a proof of φi from the assumptions S. Thus, by the inductive hypothesis, S ⊨ φi. So suppose M ⊨ S. We must show that I(∀xφi[x/a]) = T. Thus, we must show that, for every d ∈ D, I_a^d(φi) = T. Since a does not occur in any wff in S and since I(ψ) = T for every ψ in S, it follows that I_a^d(ψ) = T for every ψ in S. Thus, M_a^d = (D, I_a^d) ⊨ S. So M_a^d = (D, I_a^d) ⊨ φi. Thus, M ⊨ ∀xφi[x/a]. That is, M ⊨ φn.

This completes our induction. □

4.4.2 Completeness

This result was first proved by Kurt Gödel in his doctoral thesis in 1929. The proof that we give is due to Leon Henkin, who published it in 1949.

Theorem 4.4.2 (Completeness theorem for first-order logic) If S is a set of wffs and φ is a single wff, then

    S ⊨ φ   implies   S ⊢ φ



As in the case of propositional logic, our strategy is not to prove the theorem directly. Rather, we're going to state a theorem that is equivalent to it (Theorem 4.4.3), prove that it is equivalent, and prove that instead. Here is the theorem that is equivalent to the completeness theorem. It is the same as in the propositional case.

Theorem 4.4.3 (The alternative Completeness theorem) Suppose S is a set of wffs of first-order logic. If S is consistent, then S is satisfiable.

The proof that they are equivalent is exactly as in the propositional case.

The structure of the proof

The structure of the proof is exactly as in the propositional case. That is, we prove the following two lemmas:

Proposition 4.4.4 (Lindenbaum Lemma for First-order Logic) If S is a consistent set of Σ-wffs, then there is a signature Σ∞ such that Σ ⊆ Σ∞ and a set S∞ of Σ∞-wffs with the following properties:

(a) S ⊆ S∞.

(b) S∞ is maximal.

(c) S∞ is consistent.

(d) S∞ is existentially witnessed.
    (That is, if ψ is a Σ∞-wff and ∃xψ[x/a] is in S∞, then there is a constant c in Σ∞ such that ψ[c/a] is in S∞.)

Proposition 4.4.5 If S∞ is a maximal, consistent, and existentially witnessed set of Σ∞-wffs, then there is a model M over Σ∞ such that, for any Σ∞-wff ψ,

    M ⊨ ψ   iff   ψ is in S∞
Laying the foundations

Proof of Proposition 4.4.4. Suppose S is a consistent set of Σ-wffs. We are going to construct S∞ by extending Σ to Σ∞ and then augmenting S in two steps.

1. First, we extend the signature Σ by adding new constants c1, c2, c3, ..., such that no ci occurs in any formula in S. Call the resulting signature Σ∞.

2. Next, we list all the Σ∞-wffs: φ1, φ2, φ3, ....

3. Then, we extend S in the following way:

   (a) S0 = S
   (b) Sn+1 = Sn ∪ {(∃xφn+1[x/a] → φn+1[cj/a])}, where cj is the first of the new constants that does not occur in the formulae φ1, ..., φn+1.

   Then let

       S+ = ⋃_{n≥0} Sn = {φ : φ is in Sn for some n}

   (This step guarantees that the set S∞, which we are constructing at the moment, is existentially witnessed.)

4. Finally, beginning this time with S+ instead of S, we construct a maximal consistent set S∞ such that S+ ⊆ S∞ by the same recursive process used in the propositional case. That is, we enumerate all Σ∞-wffs and add either them or their negations one at a time, retaining consistency throughout.

Having constructed S∞, we need to show that it is maximal, consistent, and existentially witnessed. The proof that it is maximal and consistent is analogous to the propositional case.

(d) Suppose ∃xψ[x/a] is in S∞. Then, by the definition of S+, we have that (∃xψ[x/a] → ψ[cj/a]) is in S∞ for some constant cj. Thus, ψ[cj/a] is in S∞. That is, S∞ is existentially witnessed.

This completes the proof of Proposition 4.4.4. We have laid the foundations for our model. □

Building the model

We must turn now to proving Proposition 4.4.5; that is, we must now build the model on the foundations we have laid.
Suppose S∞ is a maximal, consistent, and existentially witnessed set of Σ∞-wffs. We now use S∞ to construct a model M over Σ∞ such that, for all Σ∞-wffs ψ,

    M ⊨ ψ   iff   ψ is in S∞

We define this model as follows: M = (D, I), where

• D is the set of terms that occur in S∞.
  (Thus, each of the constants in the original signature Σ will be in D; also, each new constant c1, c2, ...; also, each n-ary function symbol of Σ∞ applied to every list of n of those constants; and so on.)

• I(c) = c, if c is a constant symbol in Σ∞.

• I(a) = a, if a is a free variable.

• I(R^n) = {(t1, ..., tn) : t1, ..., tn are terms and R^n(t1, ..., tn) is in S∞}, if R^n is a relation symbol in Σ∞.

• I(f^n)(t1, ..., tn) = f^n(t1, ..., tn), if f^n is a function symbol in Σ∞.

This completes our description of the model. Now we show that it has the property required of it by Proposition 4.4.5.
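For a signature with constants but no function symbols, this term model is easy to picture: the domain is the set of constants occurring in S∞, each constant names itself, and a relation holds of a tuple of constants exactly when the corresponding atomic wff is in S∞. The sketch below (ours) builds that interpretation from a finite set of atomic facts; the facts and names are purely illustrative.

```python
# Illustrative sketch: a term model built from atomic facts (constants only).

def term_model(s_infinity):
    """s_infinity: a set of atomic facts, each a tuple (R, c1, ..., cn)."""
    domain = {c for fact in s_infinity for c in fact[1:]}
    interp_const = {c: c for c in domain}                       # I(c) = c
    interp_rel = {}
    for fact in s_infinity:
        interp_rel.setdefault(fact[0], set()).add(fact[1:])     # (t1, ..., tn) ∈ I(R)
    return domain, interp_const, interp_rel

facts = {("Loves", "alice", "bob"), ("Happy", "bob")}
D, Ic, Ir = term_model(facts)
print(sorted(D))                        # ['alice', 'bob']
print(("alice", "bob") in Ir["Loves"])  # True: M ⊨ Loves(alice, bob) iff it is in S∞
```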

Showing that it does what we want

We must prove the following proposition. From this, Proposition 4.4.5 follows.

Proposition 4.4.6 For any Σ∞-wff φ,

    M = (D, I) ⊨ φ   iff   φ is in S∞

Proof. We prove this by induction on the construction of the Σ∞-wffs.

Base Case If t1, ..., tn are terms, and R^n is a relation symbol of arity n,

    M ⊨ R^n(t1, ..., tn)   iff   (t1, ..., tn) is in I(R^n)
                           iff   R^n(t1, ..., tn) is in S∞

Inductive Step Suppose that, for all formulae φ containing n or fewer connectives and quantifiers,

    M ⊨ φ   iff   φ is in S∞

Call this the inductive hypothesis.



• Suppose (ψ → θ) has n + 1 connectives and quantifiers. Then ψ and θ have n or fewer. Thus, by the inductive hypothesis,

    M ⊨ ψ   iff   ψ is in S∞
    and
    M ⊨ θ   iff   θ is in S∞

  Thus,

    M ⊨ (ψ → θ)   iff   M ⊭ ψ or M ⊨ θ
                  iff   ψ is not in S∞ or θ is in S∞
                  iff   (ψ → θ) is in S∞

• Suppose ¬ψ has n + 1 connectives and quantifiers. Then ψ has n. Thus, by the inductive hypothesis,

    M ⊨ ψ   iff   ψ is in S∞

  Thus,

    M ⊨ ¬ψ   iff   M ⊭ ψ
             iff   ψ is not in S∞
             iff   ¬ψ is in S∞

• Suppose ∀xψ[x/a] has n + 1 connectives and quantifiers. Then, for any term t, ψ[t/a] has n. Thus, by the inductive hypothesis,

    M ⊨ ψ[t/a]   iff   ψ[t/a] is in S∞

  Thus,

    M ⊨ ∀xψ[x/a]   iff   M ⊨ ψ[t/a] for every term t in D
                   iff   ψ[t/a] is in S∞ for every term t in D
                   iff   ∀xψ[x/a] is in S∞

  The final ‘iff’ requires some further justification:

  (i) (‘If’) First, suppose that ∀xψ[x/a] is in S∞. Then, if there were a term t such that ¬ψ[t/a] is also in S∞, then S∞ would be inconsistent. Thus, ψ[t/a] is in S∞ for every term t in D.

  (ii) (‘Only if’) Second, suppose that ∀xψ[x/a] is not in S∞. Then, since S∞ is maximal, it follows that ¬∀xψ[x/a] is in S∞. Thus, ∃x¬ψ[x/a] is in S∞. Since S∞ is a maximal consistent extension of S+, there is a constant cj such that the wff (∃x¬ψ[x/a] → ¬ψ[cj/a]) is in S∞. It follows that ¬ψ[cj/a] is in S∞. Thus, it is not the case that ψ[t/a] is in S∞ for every t in D. By contraposition, this establishes that, if ψ[t/a] is in S∞ for every t in D, then ∀xψ[x/a] is in S∞, as required.

This completes our proof of Proposition 4.4.6. □

This in turn completes our proof of Proposition 4.4.5. □

The alternative Completeness Theorem then follows from Proposition 4.4.5. Given a model M = (D, I) over Σ∞ of S∞, we can create a Σ-model M′ = (D′, I′) of S: D′ = D and I′ is the restriction of I to the signature Σ. This completes our proof of the Completeness Theorem for First-Order Logic.
Chapter 5

The metatheory of
first-order logic

5.0.3 Compactness

Suppose S is a set of wffs in the language of first-order logic. Recall that S is finitely satisfiable if, for every finite subset S′ of S, S′ is satisfiable. The compactness theorem tells us that this is sufficient to make S itself satisfiable.

Theorem 5.0.7 (Compactness theorem for first-order logic) If S is finitely satisfiable, then S is satisfiable.

If this doesn’t surprise you on first reading it, you may wish to think a little
more about it. It’s an extremely surprising result; as we shall see, it has
extremely surprising consequences.

Proof of Theorem 5.0.7. We proceed by proving the contrapositive of the theorem. That is, we will prove that, if S is not satisfiable, then there is a finite subset of S that is not satisfiable. Thus, suppose that S is not satisfiable. That is, there is no model M such that M ⊨ S. Thus, it follows from the alternative version of the completeness theorem (Theorem 4.4.3) that S is inconsistent. That is, there is a proof of a contradiction from the assumption of S. Since every proof is finite, there is a finite subset S′ of S such that there is a proof of a contradiction from S′. Thus, S′ is inconsistent. By the soundness theorem, it follows that S′ is not satisfiable. This completes the proof. □


5.0.4 Consequences of Compactness

One use to which we wish to put logic is to characterize types of structures. Let's consider two examples.
First, the following formulae characterize the structure shared by all two-element sets, in the sense that, in all models M = (D, I) of these formulae, D is a set containing exactly two elements:

1. ∃x∃y(x ≠ y)

2. ∀x∀y∀z(x = y ∨ y = z ∨ x = z)

Second, the following formulae characterize the linear ordering structures, in the sense that, in all models M = (D, I) of these formulae, I(R) is a linear ordering of the domain D:

1. ∀x Rxx

2. ∀x∀y((Rxy ∧ Ryx) → x = y)

3. ∀x∀y∀z((Rxy ∧ Ryz) → Rxz)

4. ∀x∀y(Rxy ∨ Ryx)

If we add the following formula, they characterize the dense linear ordering structures:

5. ∀x∀y(Rxy → ∃z(Rxz ∧ Rzy))

And if we add another formula, they characterize the dense linear orderings with no endpoints:

6. ∀x(∃y Rxy ∧ ∃z Rzx)
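For a finite structure, whether it satisfies formulae like these can be checked by brute force, since each quantifier ranges over a finite domain. The following Python sketch (ours) checks the four linear-ordering axioms against a small example; the domain, the relation, and the function name are illustrative only.

```python
# Illustrative sketch: checking the linear-ordering axioms in a finite structure.

def is_linear_order(D, R):
    refl    = all((x, x) in R for x in D)                               # axiom 1
    antisym = all(not ((x, y) in R and (y, x) in R) or x == y
                  for x in D for y in D)                                # axiom 2
    trans   = all(not ((x, y) in R and (y, z) in R) or (x, z) in R
                  for x in D for y in D for z in D)                     # axiom 3
    total   = all((x, y) in R or (y, x) in R for x in D for y in D)     # axiom 4
    return refl and antisym and trans and total

D = {1, 2, 3}
R = {(x, y) for x in D for y in D if x <= y}
print(is_linear_order(D, R))   # True: ≤ on {1, 2, 3} is a linear order
```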

The Natural Number Structure

At the end of the nineteenth century, the mathematician Richard Dedekind wished to characterize the natural number structure. That is, he wished to provide a set A of formulae such that M ⊨ A if, and only if, M has the structure of the natural numbers. Since logic was still in its infancy when Dedekind wrote, he did not give his project so precise a formulation—indeed, it was another mathematician, Giuseppe Peano, who, a year later, first wrote Dedekind's conditions in something that we would recognize as logical notation. However, this was certainly his intention.

However, it is a consequence of the Compactness Theorem that this aim could not be achieved using first-order logic. That is, there is no set of formulae A in first-order logic such that M ⊨ A if, and only if, M has the structure of the natural numbers. It can be achieved using second-order logic, and that is essentially what Dedekind did in his seminal treatise Was sind und was sollen die Zahlen?. That is, there is a set of formulae in second-order logic whose models are exactly the natural number structures. But there is no such set of first-order formulae, and this is a consequence of the Compactness Theorem, as I will now show.
First, what is the natural number structure? For Dedekind, a natural number structure consisted of a set N, a distinguished zero element 0, and a successor function S, which takes each number to the next. Thus, if A is a set of wffs that hopes to characterize this structure, the signature of these wffs must include a constant symbol 0 and a one-place function symbol S. In such a signature, for any number n, we let n̄ denote the term

    S S ... S 0   (with n occurrences of S).

Now suppose, for the sake of a contradiction, that the set of first-order wffs A characterizes the natural number structure. Then the natural numbers themselves may be used to provide a model for A: we call it N = (N, I_N), where N is the set of natural numbers, I_N(0) is zero, and I_N(S) is the function that takes a number to the number that immediately follows it. Thus, by hypothesis, N ⊨ A.
Next, we introduce a new constant symbol c and a new set of formulae, which we call A*:

    A* = A ∪ {c ≠ n̄ : n = 0, 1, 2, ...}

Now, there is no guarantee that N = (N, I_N) ⊨ A*. However, we can use N to show that every finite subset of A* has a model. And then we can use the Compactness Theorem to show that A* has a model. Since any model of A* does not have the structure of the natural numbers, and since any model of A* is a model of A, it follows that A has a model that does not have the structure of the natural numbers.
Suppose B is a finite subset of A*. Then let k be the greatest number such that the formula c ≠ k̄ is in B. Then extend I_N to an interpretation I*_N by letting

    I*_N(c) = k + 1

Then let N* = (N, I*_N). Then N* ⊨ B. After all, for any φ ∈ B, either

• φ is in A, in which case N* ⊨ φ, since N* ⊨ A; or,

• φ is c ≠ n̄ for some n ≤ k, in which case N* ⊨ φ, since I*_N(c) = k + 1 ≠ n = I_N(n̄).

Thus, N* ⊨ B, for any finite subset B of A*. By the Compactness Theorem, it follows that A* is satisfiable; that is, there is a model M0 = (D0, I0) of A*. And, thus, there is an element I0(c) of M0 such that I0(c) ≠ I0(n̄), for all numbers n. Thus, M0 is not a natural number structure. It includes, in Dedekind's words, “alien intruders”; in particular, I0(c). But, of course, since M0 ⊨ A* and A ⊆ A*, M0 ⊨ A. Thus, it is not the case that every model of A is a natural number structure, contrary to our supposition. M0 is a model of A and M0 does not have the structure of the natural numbers. Thus, there is no first-order characterization of the natural number structure.

The Real Number Structure

A similar trick shows that there is no first-order characterization of the real number structure. Suppose that C is a set of first-order formulae that purports to give such a characterization. Amongst its signature there must be constant symbols 0 and 1, a relation symbol <, and function symbols +, −, ×, and /. It is a minimum requirement on C characterizing the real number structure that R = (R, I_R) ⊨ C, where R is the set of real numbers and I_R is the natural interpretation of the non-logical symbols in the signature of C.
Next, let 1/n̄ denote the term 1/(1 + 1 + ... + 1), with n occurrences of 1 in the denominator. Then extend the signature of C by adding a new constant c and let

    C* = C ∪ {c < 1/n̄ : n = 1, 2, 3, ...}

Then, as above, suppose B is a finite subset of C*. Let k be the greatest number such that the formula c < 1/k̄ is in B. And extend I_R to I*_R by letting

    I*_R(c) = I_R(1/(k + 1))

Then R* = (R, I*_R) ⊨ B. So, by Compactness, there is a model M1 = (D1, I1) of C*. This model is also a model of C, but it does not have the structure of the real numbers, since it contains an element I1(c) that is smaller than I1(1/n̄), for all n = 1, 2, .... Thus, there is no first-order characterization of the real number structure.

The models M0 and M1 are known as non-standard models of arithmetic and real analysis respectively. The study of their structure is of great mathematical interest.

Finitude

A similar trick to that used above shows that there is no set of first-order formulae F such that M = (D, I) ⊨ F if, and only if, D is a finite set. Thus, just as there is no first-order characterization of the natural or real number structures, there is no first-order characterization of finitude.

5.0.5 The Löwenheim-Skolem Theorems

In the previous section, we used the Compactness Theorem to show that there is no first-order characterization of the natural number structure, nor of the real number structure. In this section, we state the Löwenheim-Skolem theorems, which provide a different route to slightly weaker versions of these results.

Definition 5.0.8 (Submodels and extensions) Suppose Σ1, Σ2 are signatures and Σ1 ⊆ Σ2. And suppose M1 = (D1, I1) is a Σ1-model and M2 = (D2, I2) is a Σ2-model. Then we say that M1 is a submodel of M2 if

(1) D1 ⊆ D2

(2) For each constant c in Σ1, I1(c) = I2(c)

(3) For each free variable a, I1(a) = I2(a)

(4) For each n-place function symbol f in Σ1, I1(f) = I2(f)|D1^n, where I2(f)|D1^n is the function I2(f), whose domain is D2^n, restricted to D1^n.

(5) For each n-place relation symbol R in Σ1, I1(R) = I2(R) ∩ D1^n.

In this case, we say that M2 is an extension of M1.

Definition 5.0.9 (Elementary submodels and extensions) We say that M1 is an elementary submodel of M2 if

(1) M1 is a submodel of M2

(2) For every Σ1-wff φ containing free variables a1, ..., an and every d1, ..., dn in D1,

    M1 ⊨ φ[d1, ..., dn]   iff   M2 ⊨ φ[d1, ..., dn]

(For the definition of this square-brackets notation, see Definition 4.2.2.)

In this case, we say that M2 is an elementary extension of M1.

We now state a result that gives rise to a useful test, called the Tarski-Vaught test, which provides necessary and sufficient conditions for two Σ-models M1 and M2 to be such that M1 is an elementary submodel of M2. First, a definition.

Definition 5.0.10 Suppose M = (D, I) is a Σ-model. Suppose further that A ⊆ D, φ is a Σ-wff with free variables amongst b, a1, ..., an, and d1, ..., dn ∈ D. Then φ defines A with parameters d1, ..., dn if

    A = {d ∈ D : M ⊨ φ[d, d1, ..., dn]}

Theorem 5.0.11 (Tarski-Vaught test) Suppose M1 and M2 are Σ-models, M1 is a submodel of M2, and D1 ⊆ D2. Then M1 is an elementary submodel of M2 if, and only if, for every non-empty subset A ⊆ D2 for which there is a Σ-wff that defines A with parameters from D1, A contains an element of D1.

Proof. Suppose M1 and M2 are as they are in the hypothesis of the theorem.
(‘Only if.’) Suppose M1 is an elementary submodel of M2. That is, M1 is a submodel of M2 and, for all Σ-wffs φ containing free variables a1, ..., an, and all d1, ..., dn in D1,

    M1 ⊨ φ[d1, ..., dn]   iff   M2 ⊨ φ[d1, ..., dn]

Now suppose that, for some Σ-wff φ and some d1, ..., dn ∈ D1,

    A = {d ∈ D2 : M2 ⊨ φ[d, d1, ..., dn]}

and suppose that A is non-empty. Then there is d ∈ D2 such that

    M2 ⊨ φ[d, d1, ..., dn]

Therefore,

    M2 ⊨ ∃xφ[x, d1, ..., dn]

(Of course, this is a slight abuse of notation. What we ought to say is that, if ψ is ∃xφ[x/a], then M2 ⊨ ψ[d1, ..., dn]; but this is less perspicuous.) Thus, since M1 is an elementary submodel of M2,

    M1 ⊨ ∃xφ[x, d1, ..., dn]

So, there is d′ ∈ D1 such that

    M1 ⊨ φ[d′, d1, ..., dn]

So, again since M1 is an elementary submodel of M2,

    M2 ⊨ φ[d′, d1, ..., dn]

Thus, d′ ∈ A, as required.

(‘If.’) Now suppose that, for every non-empty subset A ⊆ D2 for which there is a Σ-wff that defines A with parameters from D1, A contains an element of D1. Then we proceed by induction on the construction of Σ-wffs to show that, for any Σ-wff φ with free variables a1, ..., an, and any d1, ..., dn ∈ D1,

    M1 ⊨ φ[d1, ..., dn]   iff   M2 ⊨ φ[d1, ..., dn]          (5.1)

Base Case If φ is an atomic formula, (5.1) holds since M1 is a submodel of M2.

Inductive Step If (5.1) holds of φ and ψ, then clearly it holds of ¬φ and φ → ψ. The remaining case is ∃xφ[x/a]. If M1 ⊨ ∃xφ[x, d1, ..., dn], then clearly M2 ⊨ ∃xφ[x, d1, ..., dn], by reasoning similar to that used in the ‘Only if.’ case above. Thus, suppose

    M2 ⊨ ∃xφ[x, d1, ..., dn]

Then there is d′ ∈ D2 such that

    M2 ⊨ φ[d′, d1, ..., dn]

Thus,

    d′ ∈ A =df. {d ∈ D2 : M2 ⊨ φ[d, d1, ..., dn]}

But, by hypothesis, there is d″ ∈ A ∩ D1. Thus, by the induction hypothesis (5.1),

    M1 ⊨ φ[d″, d1, ..., dn]

so

    M1 ⊨ ∃xφ[x, d1, ..., dn]

as required.
This completes the induction, and thus the proof of ‘If.’, and thereby the proof of the theorem. □
With this definition and theorem in hand, we can state and prove the
Löwenheim-Skolem theorems.

Theorem 5.0.12 (Löwenheim-Skolem theorems) Suppose Σ is a signature and suppose that M = (D, I) is an infinite Σ-model. Then, if κ is an infinite set and the set of constant and function symbols in Σ is of size less than or equal to κ, then there is an infinite Σ-model M′ = (D′, I′) such that

(1) D′ ≃ κ (that is, D′ is the same size as κ); and

(2) i. If κ ≤ |D|, then M′ is an elementary submodel of M.
       (This is known as the downwards Löwenheim-Skolem theorem.)
    ii. If |D| < κ, then M is an elementary submodel of M′.
       (This is known as the upwards Löwenheim-Skolem theorem.)

Proof. Our strategy is quite di↵erent in the two cases described by (2)i and
(2)ii respectively. We begin with (2)ii, which is the more straightforward,
requiring only basic compactness arguments of the sort we encountered in
section 5.0.4. The proof of (2)i is given after this; it requires more sophisti-
cated ideas, and appeals to the Tarski-Vaught test from above.
(2)ii Suppose σ and M are as they are in the hypotheses of the theorem.
Suppose that κ is infinite and at least as large as the set of constant
and function symbols in σ. Finally, suppose that D < κ.
We begin by extending σ to σ 0 by adding a set of new constant symbols
{cα : α 2 κ}. Then consider the following set of σ 0 -w↵s:

S = Th(M) ∪ {cα 6= cβ : α, β 2 κ and α 6= β}

Then, since M is infinite, we can use M to construct a model for any finite
subset of S. Thus, by Compactness, there is a σ 0 -model M0 = (D0 , I 0 ) of S.
It is clear that we can restrict the interpretation function of M0 to give a
σ-model M0 |σ of Th(M). Then the two required facts follow:

(I) D0 ' κ

(II) M is an elementary submodel of M0 |σ

To prove (I), note that

κ ≤ D0 ≤ κ + D ≤ κ + κ ' κ

This final equality is an elementary theorem of the arithmetic of infinite


sets.
To prove (II), we simply observe that M0 |= S, and so M0 |σ |= Th(M). (Strictly speaking, to conclude that M is an elementary submodel of M0 |σ , rather than merely elementarily equivalent to it, one should build S from the elementary diagram of M, that is, from Th(M) in the signature expanded with a constant for each element of D; we gloss over this detail here.)
(2)i Suppose σ and M = (D, I) are as they are in the hypotheses of the
theorem. Suppose that κ is infinite and at least as large as the set of
constant and function symbols in σ. Finally, suppose that κ ≤ D. Then let
A be a subset of D that is the same size as κ. Our strategy is to produce
the smallest elementary submodel of M whose domain contains A. It will
turn out that the domain of this minimal elementary submodel is the same
size as A, as required.
Suppose ' is a -w↵ whose free variables are among b, a1 , ..., an . Then
there is a function
f'(b,a1 ,...,an ) : Dn ! D
such that, for any d1 , ..., dn 2 D, exactly one of the following holds:
(a) M |= '[f'(b,a1 ,...,an ) (d1 , ..., dn ), d1 , ..., dn ];

(b) M 6|= ' [d, d1 , ..., dn ], for all d 2 D.

Then, given a subset B ✓ D, let

F (B) = {d 2 D : M |= '[d, d1 , ..., dn ] where ' is a -w↵ and d1 , ..., dn 2 B}

That is, F (B) is the union of all subsets of D for which there is a -w↵ that
defines that subset with parameters in B. Let

F n (B) = F (F (. . . F (B) . . .))   (n applications of F )

And let

F∞(B) = F 1 (B) ∪ F 2 (B) ∪ F 3 (B) ∪ · · ·

that is, F∞(B) is the union of all the F n (B) for n ≥ 1.

Then, given A ✓ D such that A ' κ, I claim two things:

(I) F∞(A) ' κ

(II) The σ-model M0 = (F∞(A), I) (with I restricted to F∞(A)) is an elementary submodel of M

To prove (I), we use some basic cardinal arithmetic. Suppose B ✓ D and
κ ' B; then

F (B) ' ⋃n≥1 ({' : ' has n + 1 free variables} × B n )

and since the set of constant and function symbols in σ is less than or equal
to κ, it follows that

κ ≤ F (B) ≤ ⋃n≥1 (κ × κn ) ' ⋃n≥1 κ ' ℵ0 × κ ' κ

Thus, it follows that F (A) ' κ, and thus F (F (A)) ' κ, and so on. Then

F∞(A) = ⋃n≥1 F n (A) ' Σn≥1 κ ' ℵ0 × κ ' κ

as required.
To prove (II), we use the Tarski-Vaught test. Suppose that C ✓ D is non-
empty and suppose that ' defines C with parameters d1 , ..., dn from F∞(A).
We must show that C contains an element of F∞(A). Let k be the least
number such that d1 , ..., dn 2 F k (A). Then it is clear that

C ✓ F (F k (A)) = F k+1 (A) ✓ F∞(A)

and so, since C is non-empty, C contains an element of F∞(A).

Thus, by the Tarski-Vaught test, M0 = (F∞(A), I) is an elementary sub-
model of M = (D, I), as required.
This completes the proof of the Löwenheim-Skolem theorems. 2
Chapter 6

Modal propositional logic

There is not just one modal propositional logic. Rather, there are many. In
this chapter, I describe some of the main ones.
Recall that, in order to describe a logical system, you need to describe
three things: its language, its axioms, and its rules of inference. I start by
describing the language that all modal propositional logics share (§6.1.1).
Then I’ll give the axioms and rules of inference for K, the weakest system
of modal logic (§6.1.2). Then I’ll give the axioms for the stronger theories
B, T , S4, and S5 (§6.1.3). The rules of inference for these stronger systems
are the same as for K.

6.1 The syntax


6.1.1 The language shared by all modal propositional logics
The language shared by all modal propositional logics is an extension of the
language of ordinary propositional logic.
Thus, there are the following familiar components:

(1) Propositional letters: p, q, r, ... (and with subscripts)

(2) Propositional connectives: ! and ¬.1

(3) Punctuation: ( and ).

Plus the following new component:


1
As in the propositional and first-order case, it helps to have as few logical symbols as
possible. Thus, we choose ! and ¬, and we define ^, _, and $ in terms of those.


(4) The necessity operator: 2

And the following grammatical rules tell us how we can string these com-
ponents together to make w↵s. The first three are familiar from classical
propositional logic. The final one tells us how we might add the new symbol
2 into our formulae.

(i) Every propositional letter is a w↵.

(ii) If ' and ψ are w↵s, then ¬' is a w↵ and ' ! ψ is a w↵.

(iii) If ' is a w↵, then 2' is a w↵.
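As a quick sketch of how this grammar might be represented on a computer (a Python sketch; the class names are our own, not part of the logic):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Letter:            # clause (i): a propositional letter p, q, r, ...
        name: str

    @dataclass(frozen=True)
    class Not:               # clause (ii): the negation of a w↵
        sub: object

    @dataclass(frozen=True)
    class Implies:           # clause (ii): a conditional between two w↵s
        left: object
        right: object

    @dataclass(frozen=True)
    class Box:               # clause (iii): the necessitation of a w↵
        sub: object

    # For instance, the w↵ 2(p ! q) is built by applying the clauses in order:
    example = Box(Implies(Letter("p"), Letter("q")))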

6.1.2 The axioms and rules of inference for K


K is the minimal formal system for modal propositional logic—that is, it is
a subsystem of all others. We present it first.

The axioms of K
The axioms of K belong to two groups:

(A) First: if ', ψ, and θ are w↵s, then the following w↵s are axioms of K:

• (' ! (ψ ! '))
• ((' ! (ψ ! θ)) ! ((' ! ψ) ! (' ! θ)))
• ((¬' ! ¬ψ) ! (ψ ! '))

(B) Second: if ' and ψ are w↵s, then the following is an axiom of K:

• (2(' ! ψ) ! (2' ! 2ψ))

Sometimes, this is called the Axiom of Distributivity.

The rules of inference of K


The system K has two rules of inference:

(C) The first is the familiar Rule of Modus Ponens (or !-elimination)
from ordinary propositional logic:

• From (' ! ψ) and ', infer ψ.

(D) This rule is known as the Rule of Necessitation:



• From ', infer 2'.


Thus, when you have established ', you may then infer that ' is
necessarily true.
If there is a proof of ' from assumptions in S using these axioms and rules
of inference, we write S `K '. Since every w↵, axiom, and rule of inference
of classical propositional logic also belongs to K, CPL is a subsystem of
K—that is, S `CP L ' entails S `K '.
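To make the schematic character of the axioms vivid, here is a small Python sketch that generates instances of them (a sketch only: w↵s are represented as strings, with "[]" standing in for the necessity operator and "->" for the conditional; the function names are our own):

    def ax1(phi, psi):
        return f"({phi} -> ({psi} -> {phi}))"

    def ax2(phi, psi, theta):
        return f"(({phi} -> ({psi} -> {theta})) -> (({phi} -> {psi}) -> ({phi} -> {theta})))"

    def ax3(phi, psi):
        return f"((~{phi} -> ~{psi}) -> ({psi} -> {phi}))"

    def distributivity(phi, psi):
        return f"([]({phi} -> {psi}) -> ([]{phi} -> []{psi}))"

    # Substituting any w↵s for phi, psi, theta yields an axiom of K, e.g.:
    print(distributivity("p", "q"))   # ([](p -> q) -> ([]p -> []q))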

6.1.3 The axioms of the stronger systems


We now introduce a variety of systems that are stronger than K. The rules
of inference for the stronger systems remain the same as for K. But they
all have new axioms over and above those in K. I list them here:
• T = K + (2' ! ')
• B = K + (2' ! ') + (¬' ! 2¬2')
• S4 = K + (2' ! ') + (2' ! 22')
• S5 = K + (2' ! ') + (¬' ! 2¬2') + (2' ! 22')
If L is K, T , B, S4, or S5, and S is a set of w↵s in modal propositional
logic, and ' is a w↵, then we write
S `L '
if one can prove ' from S using only the axioms and rules of inference of L.
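The relationships between the systems can be summarised compactly; the following Python snippet (a sketch, using the same string conventions as the earlier sketch, with "phi" as a schematic letter) records which extra schemas each system adds to K:

    T_SCHEMA    = "([]phi -> phi)"
    B_SCHEMA   = "(~phi -> []~[]phi)"
    FOUR_SCHEMA = "([]phi -> [][]phi)"

    EXTRA_AXIOMS = {
        "K":  [],
        "T":  [T_SCHEMA],
        "B":  [T_SCHEMA, B_SCHEMA],
        "S4": [T_SCHEMA, FOUR_SCHEMA],
        "S5": [T_SCHEMA, B_SCHEMA, FOUR_SCHEMA],
    }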
This completes my description of various modal propositional logics: we
have met K—the weakest system—as well as T , B, S4, and S5—each of
which is stronger than K. Next, we’re going to look at a particular sort of
semantics for these theories that was proposed by Saul Kripke.

6.2 The semantics


6.2.1 The definition of a Kripke model
A Kripke model consists of three components:
(1) A set of ‘possible worlds’, which we call W .
While we call these ‘possible worlds’, they need not be worlds in any
recognizable sense. They can be anything. For instance, as we will
see, they can be sets of w↵s.

(2) An accessibility relation, which we call R.


This is a relation that may or may not hold between two possible
worlds, w1 and w2 , from W . You might think of w1 Rw2 as meaning
that world w2 is possible from world w1 .

(3) An interpretation, which we call I.


This is a function that takes as its inputs a possible world w and a
propositional letter p (for instance), and gives as its output F or T .

• If I(p, w) = T , then the propositional letter p is true at world w


• If I(p, w) = F , then the propositional letter p is false at world w.

We denote such a Kripke model as M = hW, R, Ii: this notation just means
that the model is the ordered triple whose first element is W (the set
of possible worlds), whose second element is R (the accessibility relation
between worlds), and whose third element is I (the interpretation).
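Concretely, a finite Kripke model can be written down as plain data; here is a small Python sketch (the particular worlds, relation, and valuation are invented purely for illustration):

    W = {"w1", "w2"}                              # the set of possible worlds
    R = {("w1", "w2"), ("w2", "w2")}              # the accessibility relation
    I = {("p", "w1"): True, ("p", "w2"): False}   # the interpretation of letters at worlds

    M = (W, R, I)   # the Kripke model as an ordered triple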

6.2.2 Extending the interpretation to complex w↵s


We have said what it means for a propositional letter to be true at a world
in a given model, and what it means for a propositional letter to be false at
a world in a given model. But what does it mean for a more complex w↵ to
be true or false at a world in a model? We say that now.
At the moment, I is a function that takes only propositional letters and
worlds to F or T . Now we show how to extend any such function to a
function that takes as its input any w↵ and any world, and gives as its
output F or T . And we abuse notation a bit and use I to denote this
extended function as well.
Suppose ' and ψ are w↵s and w is a world and suppose that we have defined
I(', w) and I(ψ, w). Then

• I(¬', w) = T if I(', w) = F ; and I(¬', w) = F if I(', w) = T .

• I((' ! ψ), w) = T if I(', w) = F or I(ψ, w) = T ; otherwise, I((' ! ψ), w) = F .

• I(2', w) = T if I(', w0 ) = T for all w0 2 W such that wRw0 ; otherwise, I(2', w) = F .

Remarks The first two are the same as in the case of classical propositional
logic. However, the third is all new. It says this: 2' is true at a world
if ' is true at all worlds that are accessible from that world. When you

remember that w1 Rw2 means w2 is possible from w1 , it shouldn’t be too


difficult to convince yourself that this is the correct definition of when a
proposition is necessarily true at a world.
Our interpretation function tells us only when a w↵ is true at a world
in a Kripke model. It does not tell us when a w↵ is true in a Kripke model.
We define this now:
Definition 6.2.1 (Truth in a model) Suppose ' is a w↵ and M = hW, R, Ii
is a Kripke model. Then we say that ' is true in M (written: M |= ') if
For all worlds w 2 W , I(', w) = T
And we write M |= S if M |= ' for all ' in S.
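The three clauses above, together with Definition 6.2.1, translate directly into a short evaluator. The following Python sketch (our own encoding: a w↵ is a nested tuple such as ("letter", "p"), ("not", phi), ("imp", phi, psi), or ("box", phi)) computes I(', w) and then truth in the model:

    def holds(phi, w, W, R, I):
        """Compute I(phi, w), following the clauses for letters, negation,
        the conditional, and the necessity operator."""
        if phi[0] == "letter":
            return I[(phi[1], w)]
        if phi[0] == "not":
            return not holds(phi[1], w, W, R, I)
        if phi[0] == "imp":
            return (not holds(phi[1], w, W, R, I)) or holds(phi[2], w, W, R, I)
        if phi[0] == "box":
            return all(holds(phi[1], v, W, R, I) for v in W if (w, v) in R)

    def true_in_model(phi, W, R, I):
        """Definition 6.2.1: phi is true in the model iff true at every world."""
        return all(holds(phi, w, W, R, I) for w in W)

    # Example: p holds only at w2, and every world can see only w2, so []p is
    # true at both worlds and hence true in the model.
    W = {"w1", "w2"}
    R = {("w1", "w2"), ("w2", "w2")}
    I = {("p", "w1"): False, ("p", "w2"): True}
    print(true_in_model(("box", ("letter", "p")), W, R, I))   # True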

6.2.3 Important classes of Kripke models


Certain classes of Kripke models will prove to be important to us when
we come to prove the soundness and completeness of our systems of modal
propositional logic. I just lay down our abbreviations for them now:
(1) U is the class of all Kripke models.
(2) R is the class of all Kripke models in which the accessibility relation
is reflexive: that is, for all worlds w in W , wRw.
In other words, in a reflexive model, every world is possible from itself.
(3) S is the class of all Kripke models in which the accessibility relation is
symmetric: that is, for all worlds w1 and w2 , if w1 Rw2 , then w2 Rw1 .
In other words, in a symmetric model, if one world is possible from
another, that other is possible from the first.
(4) T is the class of all Kripke models in which the accessibility relation is
transitive: that is, for all worlds w1 , w2 , and w3 , if w1 Rw2 and w2 Rw3 ,
then w1 Rw3 .
In other words, in a transitive model, if one world is possible from
another and that other is possible from a third, then the first is possible
from the third.
Then we write RT for the class of models in which the accessibility relation
is both reflexive and transitive; and we write ST for the class of models in
which the accessibility relation is both symmetric and transitive; and so on.
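These frame conditions are easy to test mechanically on a finite model; here is a Python sketch (assuming W is a finite set of worlds and R a set of ordered pairs):

    def is_reflexive(W, R):
        return all((w, w) in R for w in W)

    def is_symmetric(W, R):
        return all((w2, w1) in R for (w1, w2) in R)

    def is_transitive(W, R):
        return all((u, x) in R
                   for (u, v) in R for (vv, x) in R if v == vv)

    # For instance, R = {("w1","w1"), ("w2","w2")} on W = {"w1","w2"} passes all
    # three tests, so a Kripke model with this accessibility relation lies in RST.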
With these definitions in hand, we can define the notions of logical con-
sequence and satisfiability for modal propositional logic:

Definition 6.2.2 (Logical consequence) If C is a class of Kripke mod-


els, we write S |=C ' if, for all Kripke models M in C, if M |= S, then
M |= '.
Definition 6.2.3 If C is a class of Kripke models and S is a set of w↵s, S
is C-satisfiable if there is a Kripke model M in C such that M |= S.

6.3 The soundness and completeness theorems


In this section, we ask how the relation of provability in a given system of
modal propositional logic is connected to the relation of logical consequence
relative to a class of models. To state the connection, we lay down the
following definitions:
Definition 6.3.1 (Soundness and completeness) Suppose L is a system of modal proposi-
tional logic and suppose C is a class of Kripke models. Then
(1) We say that L is sound with respect to C if
S `L ' implies S |=C '

(2) We say that L is complete with respect to C if


S |=C ' implies S `L '

The following theorem describes the connections between the relations


of provability in a given system of modal propositional logic and the relation
of logical consequence relative to a class of models.
Theorem 6.3.2 (Soundness and completeness for K, T , B, S4, and S5)

(1) K is sound and complete with respect to the class of all Kripke models.
(2) T is sound and complete with respect to R.
(3) B is sound and complete with respect to RS.
(4) S4 is sound and complete with respect to RT .
(5) S5 is sound and complete with respect to RST .
To make sure we understand what this means, let’s take (4). This means
that ' may be proved from S using the axioms and rules of inference of S4 if, and only if, all Kripke
models that are both reflexive and transitive and in which each w↵ in S is
true are also models in which ' is true.

6.3.1 The soundness results


Here’s the soundness portion of the soundness and completeness theorems
to remind us what we’re proving:

Theorem 6.3.3 (Soundness for K, T , B, S4, and S5) Suppose S is a


set of w↵s and ' is a w↵:

(1) S `K ' implies S |=U '

(2) S `T ' implies S |=R '

(3) S `B ' implies S |=RS '

(4) S `S4 ' implies S |=RT '

(5) S `S5 ' implies S |=RST '

Here we wish to show that, if ' is provable from S in a given modal propo-
sitional logic, then ' is a logical consequence of S relative to the relevant
class of models: that is, if S proves ' in a given system, then ' is true in all
models in the relevant class in which all w↵s in S are true. Thus, it suffices
to show two things:

(1) Every axiom of the given modal propositional logic is true in all models
in the relevant class.

(2) (a) Given a Kripke model M, the Rule of Modus Ponens preserves
the property of being true in M: that is, if ' ! ψ and ' are true
in M, then ψ is true in M.
(b) Given a Kripke model M, the Rule of Necessitation preserves the
property of being true in M: that is, if ' is true in M, then 2'
is true in M.

We prove (2) first, and then prove (1) for each system and the relevant class
of models given above:

(2) (a) Proof. This is straightforward: if I(ψ ! θ, w) = T and
I(ψ, w) = T , then it is clear that I(θ, w) = T as well.
Thus, the Rule of Modus Ponens preserves the property of being
true in M.

(b) Proof. Suppose that ψ is true in M: that is, for all worlds w,
I(ψ, w) = T . Then given a world w 2 W , it is certainly true
that, at all worlds w0 accessible from w, I(ψ, w0 ) = T , and thus
I(2ψ, w) = T , as required.
Thus, the Rule of Necessitation preserves the property of being
true in a model.
Next, we show that
(1) The axioms of K are true in all Kripke models.
(2) The axioms of T are true in all models in R.
(3) The axioms of B are true in all models in RS.
(4) The axioms of S4 are true in all models in RT .
(5) The axioms of S5 are true in all models in RST .
It suffices to show that
(i) (2(' ! ψ) ! (2' ! 2ψ)) is true in all models.
(ii) (2' ! ') is true in all reflexive models.
(iii) (¬' ! 2¬2') is true in all symmetric models.
(iv) (2' ! 22') is true in all transitive models.
(i) Proof. Suppose I(2(' ! ψ), w) = T . Then, for all worlds w0 that
are accessible from w, I(', w0 ) = F or I(ψ, w0 ) = T . Now, if it is also
the case that I(2', w) = T and thus that, for all worlds w0 that are
accessible from w, I(', w0 ) = T , then it follows that, at all worlds w0
that are accessible from w, I(ψ, w0 ) = T and thus that I(2ψ, w) = T ,
as required.
(ii) Proof. Suppose the accessibility relation is reflexive. And suppose
I(2', w) = T . Then, since the accessibility relation is reflexive, wRw,
it follows that I(', w) = T .
(iii) Proof. Suppose the accessibility relation is symmetric. And suppose
I(', w) = F . And suppose that wRw0 . Then it suffices to show
that I(2', w0 ) = F (because it will follow that I(2¬2', w) = T ,
as required). Now, since the accessibility relation is symmetric, we
have w0 Rw, and thus, since I(', w) = F , it is not true that ' is true
at every world accessible from w0 : in particular, it is not true at w.
Thus, I(2', w0 ) = F , as required.

(iv) Proof. Suppose the accessibility relation is transitive. And suppose


I(2', w) = T . Then we must show that, for all worlds w0 such that
wRw0 , I(2', w0 ) = T . Suppose w00 is a world and w0 Rw00 . Then, since
the accessibility relation is transitive, wRw00 . Thus, since I(2', w) =
T , I(', w00 ) = T . And thus I(2', w0 ) = T . And thus I(22', w) = T ,
as required.
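The claims (i)-(iv) can also be sanity-checked by brute force on small models. The following Python sketch (not part of the proof) checks claim (ii) for the letter p over every model with two worlds: whenever the accessibility relation is reflexive, 2p ! p is true at every world.

    from itertools import product

    W = ["w1", "w2"]
    pairs = [(u, v) for u in W for v in W]

    def T_axiom_holds_everywhere(R, val):
        """Check that []p -> p is true at every world, where val[w] = I(p, w)."""
        for w in W:
            box_p = all(val[v] for v in W if (w, v) in R)   # I([]p, w)
            if box_p and not val[w]:
                return False
        return True

    ok = True
    for keep in product([False, True], repeat=len(pairs)):       # every relation on W
        R = {pr for pr, k in zip(pairs, keep) if k}
        if not all((w, w) in R for w in W):                      # keep only reflexive R
            continue
        for vals in product([False, True], repeat=len(W)):       # every valuation of p
            ok = ok and T_axiom_holds_everywhere(R, dict(zip(W, vals)))
    print(ok)   # True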

6.3.2 The completeness results


Having proved the soundness portion of the soundness and completeness
theorem, we now prove the completeness portion. Here it is to remind us:
Theorem 6.3.4 (Completeness for K, T , B, S4, and S5) Suppose S is
a set of w↵s and ' is a w↵:
(1) S |=U ' implies S `K '

(2) S |=R ' implies S `T '

(3) S |=RS ' implies S `B '

(4) S |=RT ' implies S `S4 '

(5) S |=RST ' implies S `S5 '


To prove this theorem, we prove the following equivalent theorem. To state
it, we require the following definition:
Definition 6.3.5 (L-consistent) Suppose L is a system of modal proposi-
tional logic and S is a set of w↵s. Then S is L-consistent if S 6`L ?.
Thus, a set of w↵s is L-consistent if one cannot derive a contradiction from
it in L.
Theorem 6.3.6 (Completeness for K, T , B, S4, and S5) Suppose S is
a set of w↵s:
(1’) If S is K-consistent, then there is M in U such that M |= S.

(2’) If S is T -consistent, then there is M in R such that M |= S.

(3’) If S is B-consistent, then there is M in RS such that M |= S.

(4’) If S is S4-consistent, then there is M in RT such that M |= S.

(5’) If S is S5-consistent, then there is M in RST such that M |= S.



The following lemma shows that (1) and (1’) are equivalent, (2) and (2’) are
equivalent, and so on.

Lemma 6.3.7 Suppose L is a system of modal propositional logic and C is


a class of models. Then the following two propositions are equivalent:

(1) For all S and ', if S |=C ' then S `L '.

(2) For all S, if S is L-consistent, then S is C-satisfiable.

Proof. The proof is exactly similar to that given of Proposition 3.5.6 above.
2
Having shown that, in order to prove Theorem 6.3.4, it will suffice to
prove Theorem 6.3.6, we now prove Theorem 6.3.6. The key result we need
to get that is the following lemma.

Lemma 6.3.8 (The existence of canonical models) Suppose S is a set


of w↵s. Then, if S is K-consistent, there is a Kripke model CM(S) such
that CM(S) |= S. We call this model the canonical model for S.

We devote the next section to proving this, and in the section following, we
apply it to prove Theorem 6.3.6.

Constructing canonical models


In this section, we prove that, if S is a K-consistent set of w↵s, there is a
model, called the canonical model of S (written: CM(S)), such that each
w↵ in S is true in CM(S).
I said above that, while we call the members of W ‘possible worlds’, they
needn’t be anything like what we would call a ‘possible world’ in philosophy.
And this is particularly true in the case of canonical models for K-consistent
sets of w↵s: the ‘possible worlds’ in a canonical model are all sets of w↵s.
Bearing this in mind should hopefully make things clearer as we proceed.
More precisely, each world w in the canonical model for S is a maximal
and K-consistent set of w↵s that includes each member of S. Thus, each
world w in our canonical model is the same sort of thing as the models we
created for sets of propositional w↵s in the chapter on classical propositional
logic.

Definition 6.3.9 (Canonical model) Suppose S is a set of w↵s, and sup-


pose that S is K-consistent. Then define CM(S) = hW, R, Ii as follows:

• The set of worlds W is defined as follows:

W = {Γ : Γ is maximal and K-consistent and S ✓ Γ}

• The accessibility relation R is defined as follows: Suppose w1 and w2


are in W (therefore, they are sets of w↵s!). Then

w1 Rw2 i↵, for all w↵s ψ, if 2ψ 2 w1 , then ψ 2 w2 .

• The interpretation I is defined as follows: Suppose p is a propositional


letter and w is a world (a set of w↵s!) in W . Then

I(p, w) = T if p 2 w, and I(p, w) = F if p 62 w.
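The shape of this definition of R is easy to see on toy sets of w↵s. In the following Python sketch the worlds are just finite sets of strings with "[]" standing in for the necessity operator (in the canonical model itself they are infinite maximal K-consistent sets, so this is only an illustration of the clause, not of the construction):

    def canonically_accesses(w1, w2):
        """w1 R w2 iff, for every w↵ psi, if []psi is in w1 then psi is in w2."""
        return all(phi[2:] in w2 for phi in w1 if phi.startswith("[]"))

    w1 = {"[]p", "[]q", "r"}
    w2 = {"p", "q", "s"}
    print(canonically_accesses(w1, w2))   # True: p and q are both in w2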

Having defined the canonical model CM(S) for a K-consistent set of w↵s
S, we must show that it has the property we claimed that it had in Lemma
6.3.8. This is a consequence of the following theorem. The proof of this
theorem takes up a lot of space, but it is not difficult. It is simply an
induction on the lengths of the w↵s.

Theorem 6.3.10 (Fundamental Theorem of Canonical Models) Suppose


S is a K-consistent set of w↵s. Then, for any w↵ ψ and any world w in
CM(S),

I(ψ, w) = T if ψ 2 w
I(ψ, w) = F if ψ 62 w

Proof. We prove this by mathematical induction. That is, first, we prove


that it is true for all w↵s containing no connectives; we call this the Base
Case. Then we show that, for any number n, if it is true for all w↵s
containing n or fewer connectives, then it is true for all w↵s containing n + 1
connectives; we call this the Inductive Step.
Base Case It is true by the definition of CM(S) that, for all w↵s ψ that
contain no connectives—that is, the propositional letters—and for all worlds
w,

I(ψ, w) = T if ψ 2 w
I(ψ, w) = F if ψ 62 w

Inductive Step Now, suppose that, for all w↵s ψ with n or fewer connec-
tives and for all worlds w

I(ψ, w) = T if ψ 2 w
I(ψ, w) = F if ψ 62 w

We call this the inductive hypothesis.
And now suppose that ψ is a w↵ with n + 1 connectives. Then we must
show that, for all worlds w,

I(ψ, w) = T if ψ 2 w
I(ψ, w) = F if ψ 62 w

There are three cases to consider; the first two are familiar from the case of
classical propositional logic:

(1) ψ is ¬'. Then, since ψ contains n + 1 connectives, ' contains n


connectives. Thus, by the inductive hypothesis, for all worlds w

I(', w) = T if ' 2 w
I(', w) = F if ' 62 w

Now, suppose w is a world and suppose ψ 2 w. That is, ¬' 2 w. Then,


since w is K-consistent, ' 62 w. Thus, I(', w) = F , so I(¬', w) = T ,
as required. Similarly, if ψ 62 w, then ' 2 w, so I(', w) = T , so
I(¬', w) = F , as required.

(2) ψ is ' ! χ. Then, since ψ contains n + 1 connectives, ' and χ
contain n or fewer connectives. Thus, by the inductive hypothesis, for
all worlds w,

I(', w) = T if ' 2 w, and I(', w) = F if ' 62 w;
I(χ, w) = T if χ 2 w, and I(χ, w) = F if χ 62 w.

Then suppose ψ 2 w. Then it is not the case that ' and ¬χ are both in w.
Thus, either ¬' 2 w, in which case I(', w) = F and thus I(ψ, w) = T ,
or χ 2 w, in which case I(χ, w) = T and thus I(ψ, w) = T , as required.
Similarly, if ψ 62 w, then ¬ψ 2 w, so ' and ¬χ are in w, and so
I(', w) = T and I(χ, w) = F , and thus I(ψ, w) = F .

(3) ψ is 2'. Then, since ψ contains n + 1 connectives, ' contains n


connectives. Thus, by the inductive hypothesis, for all worlds w

I(', w) = T if ' 2 w
I(', w) = F if ' 62 w

Now suppose ψ = 2' 2 w. We must show that, for all worlds w0 2 W
such that wRw0 , ' 2 w0 . This follows immediately from the definition
of R in CM(S). Recall that w1 Rw2 i↵, for every w↵ χ, if 2χ 2 w1 ,
then χ 2 w2 . Thus, if wRw0 and 2' 2 w, then ' 2 w0 , so by inductive
hypothesis I(', w0 ) = T . Thus, I(2', w) = T .
It is considerably more difficult to prove that, if 2' 62 w, then I(2', w) =
F . We omit the proof here.

This completes our proof. 2


In the next section, we apply this result to prove Theorem 6.3.6.

Proof of Theorem 6.3.6


We will prove each component of Theorem 6.3.6 in turn.

(1’) Proof. This is the content of Lemma 6.3.8. If S is K-consistent,


CM(S) is a Kripke model and CM(S) |= S, as required.

(2’) Proof. Suppose S is T -consistent. Then, since

T = K + (2' ! ')

it follows that S [ {2' ! '} is K-consistent. Thus, by the Funda-


mental Theorem of Canonical Models

CM(S [ {2' ! '}) |= S [ {2' ! '}

Thus, in particular

CM(S [ {2' ! '}) |= S

Thus, to complete our proof, we need only show that the accessibility
relation in CM(S [ {2' ! '}) is reflexive.
By the Fundamental Theorem of Canonical Models, (2' ! ') is true
at every world in CM(S [ {2' ! '}). We use this to show that, in
CM(S [ {2' ! '}), the accessibility relation is reflexive.

Suppose w is a world in CM(S [{2' ! '}). Then we must show that


wRw. By the definition of R in CM(S [ {2' ! '}), this requires us
to show that, for any w↵ ψ, if 2ψ 2 w, then ψ 2 w. And this is clearly
so: 2ψ ! ψ 2 w, and w is maximal and K-consistent, so if 2ψ 2 w, then ψ 2 w.

(3’) Proof. Suppose S is B-consistent. Then, since

B = K + (2' ! ') + (¬' ! 2¬2')

it follows that S [ {2' ! ', ¬' ! 2¬2'} is K-consistent. Thus, by


the Fundamental Theorem of Canonical Models

CM(S [ {2' ! ', ¬' ! 2¬2'}) |= S [ {2' ! ', ¬' ! 2¬2'}

Thus, in particular

CM(S [ {2' ! ', ¬' ! 2¬2'}) |= S

Thus, to complete our proof, we need only show that the accessibility
relation in CM(S [ {2' ! ', ¬' ! 2¬2'}) is both reflexive and
symmetric.
By the Fundamental Theorem of Canonical Models, (2' ! ') is true
at every world in CM(S [ {2' ! ', ¬' ! 2¬2'}) and from our
proof of (2’), we know that it follows from this that the accessibility
relation is reflexive.
Also, by the Fundamental Theorem of Canonical Models, ¬' ! 2¬2'
is true at every world in CM(S [ {2' ! ', ¬' ! 2¬2'}). We
use this to show that, in CM(S [ {2' ! ', ¬' ! 2¬2'}), the
accessibility relation is symmetric as well.
Suppose w1 and w2 are worlds in CM(S [ {2' ! ', ¬' ! 2¬2'}).
And suppose that w1 Rw2 . Thus, for any w↵ ψ, if 2ψ 2 w1 , then
ψ 2 w2 . Now, we must show that w2 Rw1 . That is, we must show
that, for any w↵ ψ, if 2ψ 2 w2 , then ψ 2 w1 . Suppose 2ψ 2 w2 . And
suppose (for a contradiction) that ψ 62 w1 . Then, since each world
is maximal and K-consistent, ¬ψ 2 w1 . Thus, since ¬ψ ! 2¬2ψ is
in w1 , it follows that 2¬2ψ 2 w1 . Thus, since w1 Rw2 , it follows
that ¬2ψ 2 w2 ; but this gives a contradiction, since we have assumed
that 2ψ 2 w2 and that w2 is K-consistent. Thus, ψ 2 w1 , as required,
and so w2 Rw1 .

(4’) Suppose S is S4-consistent. Then, since

S4 = K + (2' ! ') + (2' ! 22')

it follows that S [ {2' ! ', 2' ! 22'} is K-consistent. Thus, by


the Fundamental Theorem of Canonical Models

CM(S [ {2' ! ', 2' ! 22'}) |= S [ {2' ! ', 2' ! 22'}

Thus, in particular

CM(S [ {2' ! ', 2' ! 22'}) |= S

Thus, to complete our proof, we need only show that the accessibility
relation in CM(S [ {2' ! ', 2' ! 22'}) is both reflexive and
transitive.
By the Fundamental Theorem of Canonical Models, 2' ! ' is true
at every world in CM(S [ {2' ! ', 2' ! 22'}) and from our proof
of (2’), we know that it follows from this that the accessibility relation
is reflexive.
Also, by the Fundamental Theorem of Canonical Models, 2' ! 22'
is true at every world in CM(S [{2' ! ', 2' ! 22'}). We use this
to show that, in CM(S [ {2' ! ', 2' ! 22'}), the accessibility
relation is transitive as well.
Suppose w1 , w2 , and w3 are worlds in CM(S[{2' ! ', 2' ! 22'}).
And suppose that

(i) w1 Rw2 : that is, for all ψ, if 2ψ 2 w1 , then ψ 2 w2 ; and

(ii) w2 Rw3 : that is, for all ψ, if 2ψ 2 w2 , then ψ 2 w3 .

We must show that w1 Rw3 : that is, we must show that, for all ψ,
if 2ψ 2 w1 , then ψ 2 w3 . Thus, suppose 2ψ 2 w1 . Then, since
2ψ ! 22ψ 2 w1 , it follows that 22ψ 2 w1 . Then, by (i), it follows
that 2ψ 2 w2 . And, by (ii), it follows that ψ 2 w3 , as required.

(5’) Suppose S is S5-consistent. Then, since

S5 = K + (2' ! ') + (¬' ! 2¬2') + (2' ! 22')

it follows that S [ {2' ! ', ¬' ! 2¬2', 2' ! 22'} is K-


consistent. Thus, by the Fundamental Theorem of Canonical Models

CM(S [ {2' ! ', ¬' ! 2¬2', 2' ! 22'}) |= S



Thus, to complete our proof, we need only show that the accessibility
relation in CM(S [ {2' ! ', ¬' ! 2¬2', 2' ! 22'}) is reflexive,
symmetric, and transitive.
By our proof of (2’), since 2' ! ' is true at every world in CM(S [
{2' ! ', ¬' ! 2¬2', 2' ! 22'}) the accessibility relation is
reflexive.
By our proof of (3’), since ¬' ! 2¬2' is true at every world in
CM(S [ {2' ! ', ¬' ! 2¬2', 2' ! 22'}) the accessibility rela-
tion is symmetric.
By our proof of (4’), since 2' ! 22' is true at every world in CM(S[
{2' ! ', ¬' ! 2¬2', 2' ! 22'}) the accessibility relation is
transitive.

This completes our proof of the completeness portion of the soundness and
completeness theorems. Thus, we have proved the soundness and complete-
ness theorems for K, B, T , S4, and S5. 2
