0% found this document useful (0 votes)
47 views21 pages

Basic-St Lectures

This document provides an introduction to basic set theory concepts including: 1. It defines the language and axioms of Zermelo-Fraenkel set theory (ZF), including axioms for set existence, extensionality, pairing, union, powerset, comprehension, infinity, foundation, and replacement. 2. It introduces important set-theoretic concepts like subsets, unions, powersets, the empty set, successor sets, and well-foundedness. 3. The document is intended as a quick introduction requiring no background in logic or set theory for undergraduate mathematics students. It will cover additional topics like well-orderings, ordinals, cardinals, and the continuum hypothesis

Uploaded by

wen wen
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
47 views21 pages

Basic-St Lectures

This document provides an introduction to basic set theory concepts including: 1. It defines the language and axioms of Zermelo-Fraenkel set theory (ZF), including axioms for set existence, extensionality, pairing, union, powerset, comprehension, infinity, foundation, and replacement. 2. It introduces important set-theoretic concepts like subsets, unions, powersets, the empty set, successor sets, and well-foundedness. 3. The document is intended as a quick introduction requiring no background in logic or set theory for undergraduate mathematics students. It will cover additional topics like well-orderings, ordinals, cardinals, and the continuum hypothesis

Uploaded by

wen wen
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

A QUICK INTRODUCTION TO BASIC SET THEORY

ANUSH TSERUNYAN

This note is intended to quickly introduce basic notions in set theory to (undergraduate) students
in mathematics not necessarily specializing in logic, thus requiring no background. It uses the
sections on well-orderings, ordinals and cardinal arithmetic of [Kun83], as well as some parts from
[Mos06].

Contents
1. Zermelo–Fraenkel set theory 2
2. Well-orderings 6
3. Ordinals 7
4. Transfinite induction 9
5. The set of natural numbers 10
6. Equinumerosity 11
7. Finite/infinite sets and Axiom of Choice 14
8. Cardinals and cardinality 16
9. Uncountable cardinals and the Continuum Hypothesis 16
10. Cofinality and Kőnig’s theorem 17
11. Classes 18
Exercises 19
References 21

Date: June 7, 2022.


1
1. Zermelo–Fraenkel set theory
The language of ZFC. ZF stands for Zermelo–Fraenkel set theory and ZFC stands for Zermelo–
Fraenkel set theory with Choice (the latter being an extra axiom added to ZF).
To describe the axioms of ZFC we need to fix a language (formally speaking, a first order logic
language). The language of ZF (same for ZFC) has only one special symbol, namely the binary
relation symbol ∈, in addition to the standard symbols that every other first order language has:
the symbols =, ¬, ∧, ∨, →, ∀, ∃, (, ) and variables x0 , x1 , x2 , ....
To keep things informal and easy to read, we also use other letters such as x, y, z or even A, B, C , F
for variables, as well as extra symbols like ,, <, ↔, etc.
Using these symbols and variables we form statements about sets such as ∀x(x = x), ∃x(x , x),
∃x(x ∈ y ∧ y < z). The variables are interpreted as sets and the symbols are interpreted the expected
way: = as “equality”, ∈ as “belongs to” (set membership), ¬ as “negation”, ∧ as “conjunction” (and),
∨ as “disjunction” (or), → as “implication”, ∀ as “for all”, and ∃ as “there exists”. For example,
∀x((x ∈ y ∧ y ∈ z) → (x ∈ z)) is interpreted as “for all sets x, if the set x is a member of the set y
and the set y is a member of the set z, then the set x is a member of the set z”. In mathematical
logic, we refer to these kinds of statements as formulas (in the language of ZF). Here is their formal
definition, although a nonpedantic reader can safely skip it.

Definition 1.1. Recursively define the notion of a formula in the language of ZF as follows:
(i) For variables x, y, x = y is a formula;
(ii) For variables x, y, x ∈ y is a formula;
(iii) If φ is a formula, then so is ¬(φ);
(iv) If φ and ψ are formulas, then so are (φ) ∧ (ψ), (φ) ∨ (ψ) and (φ) → (ψ);
(v) If φ is a formula and x a variable, then ∀x(φ) and ∃x(φ) are formulas.

When writing a formula, one doesn’t have to religiously follow the rules about parentheses
described in the definition above as long as there is no ambiguity in reading the formula; for
example, ¬x = y is technically not a formula, but it’s ok to write it this way since there is no
ambiguity: it is clearly the same as ¬(x = y). Also, when we want to emphasize that a formula φ
says something about a set (variable) x, we write φ(x), although φ may also contain other variables;
for example, φ(x) may be x ∈ y. Lastly, we use abbreviations

∀x ∈ X φ(x) ..⇐⇒ ∀x x ∈ X → φ(x)




∃x ∈ X φ(x) ..⇐⇒ ∃x x ∈ X ∧ φ(x) .




Axioms of ZFC. We now list all of the axioms of ZFC for reference, although we discuss only
some of them here and leave the detailed discussion of the rest for later. Axioms 0–7 form the
system ZF and ZFC is just ZF together with Axiom 9 (Choice).

Axiom 0. Set existence:


∃x (x = x).
Since x = x vacuously holds, this axiom just says that there is a set (one has to start with
something).
2
Axiom 1. Extensionality:
∀x ∀y (
∀z (z ∈ x ↔ z ∈ y)

x=y
).
This says that if two sets have the same members then they are equal; in other words, sets are
uniquely determined by the members they contain.

Axiom 2. Pairing:
∀u ∀v ∃p ∀w (
w∈p

(w = u ∨ w = v)
).
This says that for any sets u, v, there is a set p that contains both u, v as its members and no other
members. We denote this set p by {u, v} and call it the (unordered) pairing of u, v.

Axiom 3. Union:
∀C ∃U ∀x (
x∈U

∃S ∈ C (x ∈ S )
).
This says that for each set C (think of C as a collection of sets), there is a set U whose members
are exactly the elements that belong to some set S in this collection C . We denote this set U by C
S
and call it the union of the sets in C .

Axiom 4. Powerset:
∀X ∃P ∀Y (
Y∈P

Y⊆X
),
where Y ⊆ X is an abbreviation for ∀z(z ∈ Y → z ∈ X). This says that for any set X there is a set P
(think of it as a collection of sets) whose elements are exactly the subsets of X. We denote this set
P by P(X) and call it the powerset of X.

Axiom schema 5. Comprehension: for every formula φ, the following is an axiom:


3
∀x ∃y ∀z (
z∈y

(z ∈ x ∧ φ(z))
).

This says that for each set X there is a set Y whose element are exactly those elements of X that
satisfy φ. We denote this set Y by {z ∈ X : φ(z)} and call it the subset of X defined by φ.
Note that this is not one axiom! This is an infinite collection of axioms, one for each φ. Further-
more, note that φ(z) may also contain variables other than z, X, Y and we think of them as parameters.

Axiom 6. Infinity:

∃X (
∅∈X

∀x ∈ X (S (x) ∈ X)
),

where ∅ and S (x) are defined as follows. It can be deduced from the above axioms that there is a
set with no elements, which we denote by ∅ and call the emptyset. Furthermore, it can be deduced
that for every set x, there is a set y whose members are exactly the elements of x and x itself, i.e.,
y ..= x ∪ {x}. We denote this set y by S (x) and call it the successor of x.
Call a set X inductive if ∅ ∈ X and for each x ∈ X, its successor S (x) is also an element of X.
Infinity axiom states that there exists an inductive set.

Axiom 7. Foundation:

∀X (
X,∅

∃y ∈ X ¬∃z ∈ X (z ∈ y)
),

where we again use some easy-to-unravel abbreviations.


We call an element y ∈ X an ∈-minimal (or membership-minimal) element of X if there is no
element z ∈ X such that z ∈ y (in particular, y < y). Foundation axiom states that the binary relation
∈ is well-founded, i.e., every set X has an ∈-minimal element. It is rarely used in set theory and not
at all in other areas of mathematics. Thus, as an exercise, we try to circumvent its uses below in the
text and the reader is urged to do so as well.

Axiom schema 8. Replacement: for every formula φ, the following is an axiom:


4
∀X [
∀x ∈ X ∃!y φ(x, y)

∃Y ∀x ∈ X ∃y ∈ Y φ(x, y)
],
where ∃!y abbreviates “there exists a unique y”.
We say that a formula φ(x, y) defines a function on a set X if for each x ∈ X, there is exactly one
set y such that φ(x, y) holds. In this case, we say that a set Y contains the φ-image of X if for all
x ∈ X, all y with φ(x, y) are elements of Y.
Replacement axiom for φ says that for every set X, if φ(x, y) defines a function on X, then there
is a set Y containing the φ-image of X.

Axiom 9. Choice:
∀C [
∅<C

[
∃f : C → C ∀S ∈ C ( f (S ) ∈ S )
],
where for sets X, Y, the notation f : X → Y means that f is a function from X to Y. The notion of a
function, as well as that of an ordered pair (x, y) and Cartesian product X × Y, can be defined using
the axioms above, and we leave it as an exercise (see Exercise 1).
Choice axiom (more commonly referred to as Axiom of Choice, abbreviated, AC) states that for
any set C with ∅ < C (think of C as a collection of nonempty sets), there is a function f that assigns
to each set S ∈ C an element of S .

Notation 1.2. For a set X and sets x0 , x1 , . . . , xn , we write X = {x0 , x1 , . . . xn } to mean that for all sets
x,
x ∈ X ↔ (x = x0 ) ∨ (x = x1 ) ∨ · · · ∨ (x = xn ) .


Furthermore, for a formula φ, we write X = {x : φ(x)} to mean that for all sets x,
x ∈ X ↔ φ(x).
Caution 1.3. For a given formula φ, we don’t know a priori if there exists a set X such that
X = {x : φ(x)}. In fact, it doesn’t exist for some φ (see the discussion of Russel’s paradox in
Section 11). It is tempting to say that the existence of such a set X follows from Comprehension
axiom, but the latter only gives the existence of sets of the form {x ∈ Y : φ(x)} and not {x : φ(x)}.
To apply Comprehension, one has to first find (prove existence of) an appropriate set Y.
Below we abandon using the awkward formal (symbolic) language of ZFC and start writing our
definitions and theorems in ordinary (mathematical) English, with the understanding that if we had
to, we could translate everything to the formal language.
5
2. Well-orderings
Definition 2.1. A binary relation < on a set A is called an ordering (or a strict ordering) if for all
x, y, z ∈ A,
(i) (Irreflexivity) x ≮ x,
(ii) (Transitivity) x < y < z ⇒ x < z.
We will refer to the pair (A, <) as an ordering or an ordered set if < is an ordering of A.
Observation 2.2. Conditions (i) and (ii) imply that orderings are asymmetric, i.e.
x < y ⇒ y ≮ x.
Remark 2.3. The term ordering is also often used for a binary relation ⩽ on a set A that is reflexive,
anti-symmetric, and transitive. This is why, the notion in Definition 2.1 is often referred to as a
strict ordering. The two different notions are easily obtained from one another: indeed, given a
nonstrict ordering ⩽, one obtains its strict version by defining a < b ..⇔ a ⩽ b and a , b; conversely,
given a strict ordering, its nonstrict version is defined by a ⩽ b ..⇔ a < b or a = b. In this note, we
will only use strict orderings and hence we simply call it ordering throughout.
For an ordering <, we write x ⩽ y to mean “x < y or x = y”.
Definition 2.4. An ordering < on a set A is called linear (or total) if in addition we have:
(iii) (Totality) For all x, y ∈ A, either x = y or x < y or y < x.
A linear ordering < is called a well-ordering if
(iv) (Well-foundedness) every subset of A has a least element x, i.e. for all y ∈ A, x ⩽ y. Note that
such x is necessarily unique.
Definition 2.5. Let (A, <) be an ordering. For a ∈ A, let pred(a, A, <) denote the set of predecessors
of a, i.e.
pred(a, A, <) ..= {b ∈ A : b < a} .
We simply write pred(a), when (A, <) is clear from the context. A subset B ⊆ A is called an initial
segment if it is closed downward, i.e. for all x ∈ B, pred(x, A, <) ⊆ B. Call B a proper initial
segment if B , A.
Lemma 2.6. If (A, <) is a well-ordering, then any proper initial segment B is equal to pred(a) for
some a ∈ A; in other words, B has a supremum.
Proof. Take a to be the least element of A \ B. □
Definition 2.7. For ordered sets (A, <A ), (B, <B ), call a function f : A → B an isomorphism of
(A, <A ) with (B, <B ) (or just an order-isomorphism) if it is a bijection and for all a0 , a1 ∈ A,
a0 <A a1 ⇐⇒ f (a0 ) <B f (a1 ).
Say that (A, <A ) and (B, <B ) are isomorphic, and denote (A, <A ) ≃ (B, <B ), if there is an isomorphism
of (A, <A ) with (B, <B ).
Notation. If (A, <) is a well-ordering and B ⊆ A, then (B, <|B ) is also a well-ordering, where
<|B ..= < ∩ B × B. We will abuse the notation and write (B, <) below.
For well-orderings (A, <A ) and (B, <B ), write (A, <A ) ≺ (B, <B ) if
(A, <A ) ≃ (pred(b), <B ),
for some b ∈ B. Write (A, <A ) ⪯ (B, <B ) if (A, <A ) ≺ (B, <B ) or (A, <A ) ≃ (B, <B ).
6
Observation 2.8. (A, <A ) ⪯ (B, <B ) is equivalent to the existence of an isomorphism of (A, <A ) with
an initial segment of (B, <B ).
Lemma 2.9. For well-orderings (A, <A ), (B, <B ), there is at most one isomorphism of (A, <A ) with
an initial segment of (B, <B ).
Proof. Let B′ , B′′ be initial segments of B (possibly equal) and suppose towards a contradiction
that there are order-isomorphisms f : A → B′ , g : A → B′′ such that f , g. Let a be the <A -least
element in A for which f (a) , g(a). Without loss of generality, assume f (a) <B g(a). Then
f (a) ∈ B′′ , so we can apply g−1 to the last inequality and get a′ ..= g−1 ( f (a)) <A a. Hence, by the
choice of a, f (a′ ) = g(a′ ) = f (a), contradicting the injectivity of f . □
Corollary 2.10. (A, <) ⊀ (A, <) for any well-ordering (A, <).
Proof. The identity map a 7→ a is already an isomorphism of (A, <A ) with its initial segment, so
there cannot be any other, by Lemma 2.9. □
Observation 2.11. For any well-orderings (A, <A ), (B, <B ), (C, <C ),
(a) (A, <A ) ≺ (B, <B ) ⪯ (C, <C ) =⇒ (A, <A ) ≺ (C, <C );
(b) (A, <A ) ⪯ (B, <B ) ≺ (C, <C ) =⇒ (A, <A ) ≺ (C, <C ).
Remark 2.12. Recall that functions are simply sets of pairs satisfying a certain condition. More
precisely, a function f : A → B is a subset of A × B satisfying the following: for every a ∈ A
there is exactly one b ∈ B such that (a, b) ∈ f . (We commonly write f (a) = b instead of awkward
(a, b) ∈ f .) Thus, for example, the notation f ⊆ g makes sense for functions f, g.
Theorem 2.13. For well-orderings (A, <A ) and (B, <B ), exactly one of the following holds:
(i) (A, <A ) ≃ (B, <B );
(ii) (A, <A ) ≺ (B, <B );
(iii) (B, <B ) ≺ (A, <A ).
Proof. That these statements are mutually exclusive follows from Lemma 2.9 via Observation 2.11.
To show that one of them holds, let
f ..= (a, b) ∈ A × B : (pred(a), <A ) ≃ (pred(b), <B ) .


It is straightforward to check that f is a function from some initial segment A′ of A to an initial


segment B′ of B. In fact, f is an isomorphism of (A′ , <A ) with (B′ , <B ), so it is enough to show
that A′ = A or B′ = B. But A′ and B′ cannot both be proper since otherwise A′ = pred(a) and
B′ = pred(b) for some a ∈ A \ A′ , b ∈ B \ B′ , whence (a, b) ∈ f by the definition of f , so a ∈ A′ , a
contradiction. □
3. Ordinals
Simply put, ordinals are a generalization of natural numbers, suitable for listing (counting) infinite
sets. They are certain well-ordered sets that serve as canonical representatives of isomorphism
classes of well-ordered sets, meaning that every well-ordered set is isomorphic to exactly one
ordinal (see Theorem 3.12). Besides, ordinals are very convenient to work with: for example, the
relation ≺ between ordinals coincides with ∈.
In ordinary mathematics, it is customary to denote sets by capital letters A, B, ... and elements by
small letters a, b, .... In set theory however, sets themselves are often elements of other sets under
consideration, so one has to abandon this tradition, as we do below.
Definition 3.1. A set x is called transitive if all of its elements are also its subsets, i.e. for all y ∈ x,
y ⊆ x.
7
For example, the sets ∅, {∅} , {∅, {∅}} , {∅, {∅} , {{∅}}} are transitive (check!), but the set {{∅}} isn’t
(why?).
Definition 3.2. A set α is called an ordinal if it is transitive and well-ordered by ∈.
The transitive sets ∅, {∅} , {∅, {∅}} mentioned above are all ordinals; however, the set {∅, {∅} , {{∅}}}
isn’t an ordinal since ∈ is not an ordering on it1. We usually denote ordinals by Greek letters α, β, γ,
etc.
Lemma 3.3. Let α be an ordinal.
(a) α < α.
(b) For any y ∈ α, y = pred(y, α, ∈).
(c) The least element of α is ∅ and we denote 0 ..= ∅.
(d) Every y ∈ α is itself an ordinal.
Proof. For part (a), recall that since ∈ is an ordering on α, it must be irreflexive. Thus, for all y ∈ α,
y < y. Hence, if α ∈ α, taking y = α would give a contradiction. Part (b) follows from the transitivity
of α and Axiom 1 (Extensionality). Part (c) follows directly from (b) since if y is the least element
of α, then pred(y, α, ∈) = ∅. Part (d) is left as an exercise. □
For ordinals α, we will write α instead of (α, ∈) even when we mean to view α as a well-ordered
set.
Lemma 3.4. Let α, β be ordinals.
(a) If α ≃ β, then α = β.
(b) ∈ is a total order on ordinals, i.e. exactly one of the following holds: either α = β, or α ∈ β, or
β ∈ α.
(c) If α ⊊ β, then α ∈ β.
Proof. Part (a) is left as an exercise. For part (b), apply Theorem 2.13 and use (b) of Lemma 3.3
and (a) of the current lemma. Finally, (c) follows from (b) since α ⊊ β and β ∈ α cannot both hold:
otherwise, α ⊊ β ⊆ α and hence α ⊊ α, a contradiction. □
In light of the above lemma, we often write α < β instead of α ∈ β, for ordinals α, β. We also use
the notation α ⩽ β to mean α = β or α < β.
Lemma 3.5.
(a) Every nonempty set of ordinals C has an ∈-least element, i.e. there is α ∈ C such that for all
β ∈ C not equal to α, we have α ∈ β. We denote this α by min C.
(b) For every mathematical statement (formula) φ(x), if there is an ordinal α for which φ(α) holds,
then there is a least such ordinal.
(c) Every transitive set of ordinals is itself an ordinal.
Proof. For (a), fix an ordinal γ ∈ C. If γ ∩ C = ∅ then γ is the ∈-least element in C and we are done.
Otherwise, since γ ∩ C is a subset of γ and γ is well-ordered by ∈, there is an ∈-least element α in
γ ∩ C. Clearly, α would also be the ∈-least element in C.
The proof of (b) is identical to that of (a); in fact, the latter follows from the former by letting
φ(x) be x ∈ C.
Part (c) follows from (b) of Lemma 3.4 and (a) of the current lemma. □
1
The author thanks Minh Tran for this quick example.
8
Remark 3.6. Part (b) above allows us to freely use definitions like “Let κ be the least ordinal such
that φ(κ)”, without having to spot a set (the set C in part (a)) that would contain this κ as we would
have to do if we used (a) instead.

We now define a successor operation for ordinals.


Definition 3.7. For a set x, we define the successor S (x) of x to be the set x ∪ {x}.
Note that x ∈ S (x) and x ⊆ S (x).
Lemma 3.8. For an ordinal α, S (α) is the least ordinal greater than α.
Proof. First note that S (α) is a transitive set of ordinals and thus is an ordinal by (c) of Lemma 3.5.
By definition, α < S (α). Now, assuming β > α, we show that S (α) ⩽ β. Indeed, by the transitivity
of β, we have α ⊆ β and thus S (α) ⊆ β. Therefore, either S (α) = β or S (α) < β by (c) of
Lemma 3.4. □
Definition 3.9. An ordinal α , 0 is called a successor if there is an ordinal β such that α = S (β);
otherwise, we would say that α is a limit ordinal.
Thus, there are three types of ordinals: 0, successor ordinals, and limit ordinals.
Definition 3.10. For a set of ordinals C, define sup C = C ..= α∈C α.
S S

Lemma 3.11. Let C be a set of ordinals.


(a) sup C is the least ordinal β such that for all α ∈ C, α ⩽ β.
(b) If C is transitive (and hence an ordinal), then sup C < C if and only if C is a limit ordinal or ∅.
(c) min C = C ..= α∈C α.
T T

Proof. Left as an exercise. □


Theorem 3.12. Any well-ordering (A, <A ) is isomorphic to exactly one ordinal.
Proof. Uniqueness follows from (a) of Lemma 3.4. For existence, let
A′ = a ∈ A : there is an ordinal α such that pred(a, A, <A ) ≃ α .


By uniqueness, the ordinal α in the definition of A′ is unique for each a ∈ A. Thus, by Axiom 8
(Replacement), there is a set C and a function f : A′ → C such that for all a ∈ A′ , f (a) is an
ordinal and pred(a, A, <A ) ≃ f (a). By shrinking C, we may assume that f is surjective. Note
that A′ is an initial segment of A and hence C is transitive. Thus, by (c) of Lemma 3.5, C is an
ordinal. It is also easy to see that f is an isomorphism between (A′ , <A ) and C. Now if A′ , A,
then A′ = pred(a, A, <A ) for some a ∈ A \ A′ . But then pred(a, A, <A ) ≃ C and hence a ∈ A′ , a
contradiction. Thus, A′ = A and we are done. □
Definition 3.13. For a well-ordering (A, <A ) the unique ordinal α with (A, <A ) ≃ α is called the
order type of (A, <A ) and denoted by tp(A, <A ).
4. Transfinite induction
The usual proof of the induction theorem for N given in an undergraduate course uses the fact
that the usual ordering on N is a well-ordering. The same proof works for any well-ordered set:
Theorem 4.1 (Transfinite induction). Let (A, <) be a well-ordered set and let P ⊆ A be a set such
that for any a ∈ A, pred(a, A, <) ⊆ P implies a ∈ P. Then P = A.
Proof. Assume for contradiction that P , A and let a be the least element of A \ P. By our choice,
pred(a, A, <) ⊆ P, and hence a ∈ P, a contradiction. □
9
Remark 4.2. In the proofs using transfinite induction on ordinals, the cases of 0, successor ordinals
and limit ordinals are usually considered separately. An example is the proof of Theorem 4.4 below.
It is common in mathematics to define sequences (indexed by N) inductively.2 Such an example
is the Fibonacci sequence. In set theory however, we often want to inductively define sequences
that are indexed by a larger well-ordered set (typically an ordinal). The next theorem shows that
this indeed can be done. For notational simplicity, we state it for ordinals, but the same proof would
work for arbitrary well-ordered sets.
First, note (realize?) that a sequence indexed by an ordinal κ, is nothing but a function defined
on κ. The idea of inductive definition of such function is that in order to define its value at a point
α ∈ κ, we use its values at the predecessors of α. To make this precise, we need the following:
Definition 4.3. For sets A, B, a partial function from A to B is a usual function from a subset of A
to B. We denote this by f : A ⇀ B, and we denote the domain of f by dom( f ). We denote the set
of partial functions from A to B by Partial(A, B).
Theorem 4.4 (Transfinite recursion). For every ordinal κ, set A, and map3 F : Partial(κ, A) × κ → A,
there is a unique function f : κ → A such that for every α ∈ κ,
f (α) = F( f |α , α).
Proof. We prove by transfinite induction on S (κ) that for all γ ⩽ κ, there is a unique function
fγ : γ → A satisfying
fγ (α) = F( fγ |α , α), (∗)
for every α < γ. Granted this, we would take f = fκ and be done.
For γ = 0, put fγ = ∅ and note that this is the only function defined on ∅ in general. For γ a
limit Sordinal, it follows from uniqueness that for all α < β < γ, fα ⊆ fβ (see Remark 2.12). Thus
fγ = β<γ fβ is a function from γ to A and it clearly satisfies (∗) as so do the functions fβ , β < γ.
The uniqueness of fγ also follows from the uniqueness of fβ for each β < γ.
For the successor case, let γ = S (β). For α ∈ γ, put
F( fβ , β) if α = β
(
fγ (α) = .
fβ (α) otherwise
It is clear that fγ satisfies (∗) and that it is the unique such function. □
Remark 4.5. The proof of Theorem 4.4 shows that its conclusion holds even if the map F is only
defined at points ( f, α) ∈ Partial(κ, A) × κ with α = dom( f ), which is what usually happens in
practice.

5. The set of natural numbers


Having defined 0 and the successor operation, we can attempt to define natural numbers as
follows:
0 ..= ∅, 1 ..= S (0), 2 ..= S (1), ..., n + 1 ..= S (n), ...
But wait, I just used induction on the set of natural numbers while trying to define natural numbers!!!
Not good.
It is still ok to define 1 ..= S (∅), 2 ..= S (S (∅)), even 7 ..= S (S (S (S (S (S (S (∅))))))), but doing so,
we won’t be able to define all of the natural numbers at once. We can try a top-to-bottom approach
instead; that is, isolate which ordinals we would like to be called natural numbers.
2
The correct word here is “recursively”.
3
One can relax the hypothesis by replacing sets with classes, see Section 11.
10
Definition 5.1. An ordinal α is called a natural number, if all β ⩽ α are successor ordinals or 0.
Now we want to define the set of natural numbers. It is tempting to say “Put N ..= {α : α is a natural number}”,
but the problem is that this may not be a set! In other words, this type of definitions is not allowed in
general as it may lead to contradictions (see section Classes below). What is allowed is postulated
by Axiom 5 (Comprehension) and in order to use this axiom schema, we need to know a priori that
there exists a set y that contains all of the natural numbers. Using ∅ and set-theoretic operations like
pairing, union, powerset, etc., we won’t be able to construct such a set. This is why the existence
of such a set is outright postulated in ZF and is called Axiom of Infinity (Axiom 6). It says the
following:
There is a set y such that ∅ ∈ y and for every x ∈ y, S (x) ∈ y.
Definition 5.2. Call a set y inductive, if it satisfies ∅ ∈ y and for every x ∈ y, S (x) ∈ y.
Lemma 5.3. Every inductive y contains all of the natural numbers.
Proof. Assume for contradiction that there is a natural number n (in the sense of Definition 5.1) that
is not in y, and let m be the least ordinal in S (n) that does not belong to y.4 m cannot be ∅ as ∅ ∈ y,
so m must be a successor ordinal. Thus m = S (k) for some natural number k. By the choice of m,
k ∈ y and hence m ∈ y, a contradiction. □
Using this lemma and Axiom 6 (Infinity), we can apply Axiom 5 (Comprehension) and get
ω ..= {α ∈ y : α is a natural number} .
We will also use N for ω when we don’t necessarily want to view it as an ordinal.
Proposition 5.4.
(a) ω is a limit ordinal and is the least such.
(b) ω is an inductive set.
(c) ω is the ⊆-least inductive set.
Proof. Left as an exercise. □
Remark 5.5. Part (c) of this proposition is the statement of the (usual) induction theorem for natural
numbers. It says in particular that if P ⊆ ω has the property that 0 ∈ P and n ∈ P ⇒ S (n) ∈ P, for
all n ∈ ω, then P = ω.

6. Equinumerosity
There are two ways to figure out whether the number of students is equal to the number of chairs
in the classroom:
(1) Count the students and count the chairs, and see if those numbers are equal.
(2) Ask the students to each pick a chair and sit on it. If there are standing students or empty chairs,
then the answer is no; otherwise, it is yes.
In set theory we take the second approach because that’s the one that works for infinite sets.
Definition 6.1. Two sets A, B are called equinumerous, noted A ≡ B, if there is a bijection between
them. We write A ⊑ B if A is equinumerous with a subset of B, i.e. A injects into B. Finally, we
write A ⊏ B if A ⊑ B but A . B.
4
We used Axiom 5 (Comprehension) here.
11
We leave it as an exercise5 to prove that

N≡Z≡Q

and
R ≡ (0, 1) ≡ [0, 1] ≡ 2N ≡ P(N).

Notation: For sets A, B, a function f : A → B and a subset C ⊆ A, let f ′′C or f ′′ (C) denote the
image of C under f , i.e.
f ′′C = {b ∈ B : ∃a ∈ C(b = f (a))} .
Here is a very easy but relevant lemma:

Lemma 6.2. Let A, B be nonempty sets. Any injection f : A → B has a surjective left inverse
g : B → A, i.e. g ◦ f = idA .

Proof. Fix a ∈ A and define g : B → A by setting, for b ∈ B,



 f −1 (b) if b ∈ f ′′ A

g(b) = 

a
 otherwise.

The following can be safely considered as the first theorem of set theory.

Theorem 6.3 (Cantor). Let A be a set.


(a) There is no surjection f : A ↠ P(A).
(b) P(A) @ A.

Proof. Part (b) follows from (a) by Lemma 6.2. For (a), assume that there is such surjection f and
let B = {a ∈ A : a < f (a)}. Since f is surjective, there is b ∈ A such that f (b) = B. Now either
b ∈ f (b) or b < f (b), and both cases yield a contradiction. □

Definition 6.4. A set A is said to be countable if A ⊑ N.

Cantor’s theorem in particular implies that P(N) (and hence also R) is uncountable.

The following theorem helps when proving equinumerosity of sets.

Theorem 6.5 (Cantor–Schröder–Bernstein). For sets A, B, if A ⊑ B and B ⊑ A, then A ≡ B.

Proof. Prototypical case: Let A = B = N ∪ {∞}, f (n) ..= g(n) ..= n + 1, for n ∈ N, and f (∞) ..=
g(∞) ..= ∞. To distinguish the two copies of N ∪ {∞}, denote the elements n of A and B by nA and
nB , respectively. We define a bijection h : A → B as follows:

 f (a)
 if a = 2n or a = ∞
h(a) ..= 

g (a) if a = 2n + 1.
 −1

5
Read Theorem 6.5 first.
12
Schematically, from 0A 0B we obtain 0A 7 0B
f g f
w ' g−1 '
1A 1B 1A 1B
f g
w '
2A 2B 2A 7 2B
f g f
w ' g−1 '
3A 3B 3A 3B
f g

.. x .. & .. .. .. ..
. . . . . .
f
∞A o / ∞B ∞A f / ∞B
g
General case: Let f : A → B and g : B → A be injections. We prove by reducing this to the
prototypical case as follows: we will obtain partitions
G G
A= An ⊔ A∞ and B = Bn ⊔ B∞
n∈N n∈N

such that f ′′ An = Bn+1 and g′′ Bn = An+1 for each n ∈ N, as well as f ′′ A∞ = B∞ , so we define a
desired bijection h : A → B just like in the prototypical example, namely, for a ∈ A,

 f (a)
 if a ∈ A2n or a ∈ A∞
h(a) = 
.

.
g−1 (a) if a ∈ A2n+1

and have basically the same picture:

from A0 B0 obtain A0 3; B0
f g f
s{ #+ g−1 #+
A1 B1 A1 B1
f g
s{ #+
A2 B2 A2 3; B2
f g f
s{ #+ g−1 #+
A3 B3 A3 B3
f g

.. t| .. "* .. .. .. ..
. . . . . .
f
+3 +3
A∞ ks B∞ A∞ f B∞
g

To carve out such partitions, we define recursively decreasing sequences (Ān )n∈N and ( B̄n )n∈N by
Ā0 ..= A, B̄0 ..= B,
B̄n+1 ..= f ′′ Ān , and Ān+1 ..= g′′ B̄n ,
and let An ..= Ān \ Ān+1 , Bn ..= B̄n \ B̄n+1 , as well as
\ \
A∞ ..= Ān and B∞ ..= B̄n .
n∈N n∈N

It remains to verify that the partitions A = n∈N An ⊔ A∞ and B = n∈N Bn ⊔ B∞ are as desired.
F F
The injectivity of f implies that f ′′ commutes with set-subtraction and intersections, so we have
13
f ′′ An = f ′′ (Ān \ Ān+1 ) = f ′′ Ān \ f ′′ Ān+1 = B̄n+1 \ B̄n+2 = Bn+1 and
 
\  \ \ \
f ′′ A∞ = f ′′  Ān  = f ′′ Ān = B̄n+1 = B̄n = B∞ .


n∈N n∈N n∈N n∈N
Similarly, we have the analogous statements for g, which finishes the proof. □

7. Finite/infinite sets and Axiom of Choice


Which sets should be called finite? Intuitively, a set is finite if when we count its elements, we
get a natural number. Here is a formal version of this definition:
Definition 7.1. A set A is said to be finite if it is equinumerous with some natural number. Otherwise,
we say that it is infinite.
There are a few other intuitive definitions of infinite and below we explore the relations between
them.
Definition 7.2 (Dedekind). A set A is called Dedekind infinite if it is equinumerous with a proper
subset of itself, i.e. there is an injection f : A → A with f ′′ A ⊊ A. Otherwise, A is said to be
Dedekind finite.
Proposition 7.3. Finite sets are Dedekind finite.
Proof. It is enough to prove that natural numbers are Dedekind finite, and we do this by induction on
ω. It is trivial that ∅ is Dedekind finite since it does not have proper subsets. Assume n is Dedekind
finite and show that so is n + 1. It is enough to show that every injection from n + 1 to n + 1 is
surjective. Let f : n + 1 → n + 1 be an injection. We first claim that without loss of generality we
may assume that f ′′ n ⊆ n: otherwise, f (n) < n and m ..= f −1 (n) < n, and by swapping the values of
f (m) and f (n) we will achieve f ′′ n ⊆ n. Since f |n : n → n is injective, it must also be surjective by
the induction hypothesis. Thus f (n) must be equal to n, and hence f is surjective as well. □
One could also define infinite sets to be exactly those sets A, for which ω ⊑ A. The following
proposition shows that this notion coincides with Dedekind infinite.
Proposition 7.4. A set A is Dedekind infinite if and only if ω ⊑ A. In particular, ω is Dedekind
infinite.
Proof. ⇒: Let f : A → A be an injection such that f ′′ A ⊊ A. For each n ∈ N, put Bn ..=
( f n )′′ (A \ f ′′ A). Because f is injective, it respects the setminus operation and thus, f ′′ Bn =
( f n )′′ (A) \ ( f n+1 )′′ (A), so, the sets Bn are pairwise disjoint. Fix x ∈ B0 and define g : ω → A by
n 7→ f n (x). This g is injective because g(n) ∈ Bn .
⇐: Instead of a proof, I will tell you the story of Hilbert’s hotel. It has ω-many rooms and all of
them are occupied. However, a new guest comes and demands a room. To resolve the situation, the
hotel manager simply asks all of the tenants to move to the room next to their current room with
one higher number, thus making room number 0 available for the new guest. □
There is yet another notion of infinite for a set A, namely, the existence of a surjection A ↠ ω.
By Lemma 6.2, ω ⊑ A implies A ↠ ω.
To summarize, we have mentioned the following notions of infinite for a set A:
(1) A is not equinumerous with a natural number.
(2) A is Dedekind infinite.
(3) ω ⊑ A.
(4) A ↠ ω.
14
So far, we have shown that (2)⇒(1), (2)⇔(3), and (3)⇒(4). Our intuition (at least mine) tells us
that all of these notions should be equivalent. Let’s attempt to prove for example that (1)⇒(3).
Attempt to prove (1)⇒(3). Suppose that A is not equinumerous with a natural number, and try to
recursively define an embedding f : ω → A as follows: fix a0 ∈ A, and for n ∈ ω, put

some element from A \ ( f |n )′′ n if A \ ( f |n )′′ n , ∅
  
f (n) .= 
.


a0
 otherwise.

The first condition must hold since otherwise f |n would be a bijection from n to A, contradicting
our hypothesis. Thus f is injective and we are done.
However, something is wrong with the above recursive definition. Namely, the right side is
′′ 
not a (predefined) function. The expression “some element from A \ ( f |n ) n ” does not define a
correspondence n 7→ an ∈ A \ ( f |n )′′ n for all n ∈ ω at once. What would fix our problem6 is if we

had a “choice” function c : P(A) → A such that for each nonempty B ⊆ A, c(B) ∈ B. Then we
would define our f by
f (n) = c(A \ ( f |n )′′ n),
for n ∈ ω, and this f would clearly work.
Similarly, if we had such a choice function c : P(A) → A, we could prove that (4)⇒(3) as
follows: if f : A → ω is a surjection, then define g : ω → A by a 7→ c( f −1 (a)). Clearly g is
injective.
Thus, the existence of a choice function c : P(A) → A would prove the equivalence of (1)-(4).
Does such a function always exist? This is not postulated by any of the axioms of ZF, and in fact,
the existence of such a choice function cannot be proven from ZF7. Hence, we need an extra axiom
(Axiom 9), called Axiom of Choice (AC), that states the following:
If A is a set whose elements are nonempty
S sets, then there exists a choice function; that is, a
function f : A → A such that for every x ∈ A, f (x) ∈ x.
The proof system ZF+AC is called the Zermelo–Fraenkel set theory with Choice and denoted by
ZFC. Below, all of the statements or exercises that use AC are marked with (AC).
Using the above attempts to prove (1)⇒(3) and (4)⇒(3), we now get
Proposition 7.5 (AC). For a set A, the following are equivalent:
(1) A is not equinumerous with a natural number.
(2) A is Dedekind infinite.
(3) ω ⊑ A.
(4) A ↠ ω.
For this and many other reasons, most mathematicians work in ZFC rather than in ZF.
Remark 7.6. To prove (1)⇒(3) and (4)⇒(3), one does not need the full power of AC: it is enough to
assume a weaker axiom, called Dependant Choice (DC), which (roughly speaking) gives a function
that makes a choice depending on the previously made choices. This is exactly what we used in the
attempt to prove (1)⇒(3).

6
I’m not saying this is absolutely necessary; one could get away with something a bit weaker.
7
This is a highly nontrivial theorem.
15
8. Cardinals and cardinality
Definition 8.1. An ordinal α is called a cardinal if there is no ordinal β < α such that β ≡ α.
Proposition 7.3 implies the following:
Corollary 8.2.
(a) Every natural number is a cardinal.
(b) ω is a cardinal.
Definition 8.3. Let A be a set and κ a cardinal. A set A is said to have cardinality κ if A ≡ κ. We
denote this by |A| = κ. We say that A has cardinality if it has cardinality λ for some cardinal λ.
Does every set have cardinality? The following lemma gives a reformulation of this question:
Lemma 8.4. A set A has cardinality if and only if it can be well-ordered.
Proof. ⇒: If there is a bijection f : A → κ for some cardinal κ, just copy the well-ordering of κ to
A via f .
⇐: If there is a well-ordering <A on A, then by Theorem 3.12, there is an ordinal α such that
(A, <A ) ≃ α. Put
κ = min {β ⩽ α : A ≡ β} .
It is clear that κ is a cardinal and |A| = κ. □
Thus, the real question is whether every set can be well-ordered or not. Think of R for example:
the natural ordering of R is a linear ordering but not a well-ordering. In fact, it cannot be proved
in ZF that R can be well-ordered. More generally, it turns out that the statement “every set can be
well-ordered” is equivalent to AC and is known as Zermelo’s theorem. We state this in Theorem 8.6
together with an equivalent formulation called Zorn’s lemma. First, we need the following:
Definition 8.5. Let A be a set and < be an ordering on it. An element a ∈ A is called an upper
bound for a subset B ⊆ A if for any b ∈ B, b ⩽ a.8 B is called a chain if < |B is a linear ordering on
B. An element a ∈ A is called maximal if there is no element b ∈ A with a < b.
Theorem 8.6. Each of the following statements is equivalent to AC:
1. Zorn’s Lemma: For every ordered set (A, <) whose every chain has an upper bound, there is a
maximal element in A.
2. Zermelo’s Theorem: Every set can be well-ordered.
Proof. The proof is outlined in Exercises 9 to 11. □
Corollary 8.7 (AC). Every set has cardinality.

9. Uncountable cardinals and the Continuum Hypothesis


We now consider the question of whether there exists uncountable cardinals; more generally,
given a cardinal κ, is there a cardinal greater than κ? By Cantor’s theorem, we have κ ⊏ P(κ). By
AC, P(κ) has cardinality and thus λ ..= |P(κ)| > κ. This shows that the answer to the question is
positive in ZFC.
Can we prove this in ZF? The answer is still yes, and it follows from the following theorem:
Theorem 9.1 (Hartogs). For every set A, there is a cardinal κ such that κ @ A. We denote the least
such κ by χ(A).
8
As usual, ⩽ means < or =.
16
Proof. Put
n o
WO(A) ..= (B, <B ) ∈ P(A) × P(A2 ) : B ⊆ A and <B is a well-ordering on B ,
and, recalling Definition 3.13, put
χ(A) = tp(B, <B ) : (B, <B ) ∈ WO(A) ;
this set exists by Axiom 8 (Replacement). Clearly χ(A) is a transitive set of ordinals, and hence an
ordinal itself, by (c) of Lemma 3.5. To show that χ(A) @ A, assume for contradiction that there is an
injection f : χ(A) → A and put B = f ′′ χ(A). Let <B be the push-forward of ∈ |χ(A) to a relation on
B, i.e. for b1 , b2 ∈ B,
b1 <B b2 ⇐⇒ f −1 (b1 ) ∈ f −1 (b2 ).
By definition, f is an isomorphism from χ(A), ∈ to (B, <B ), and hence <B is a well-ordering on B.

Therefore, (B, <B ) ∈ WO(A) and thus tp(B, <B ) ∈ χ(A). But by uniqueness, tp(B, <B ) = χ(A), so
χ(A) ∈ χ(A), contradicting part (a) of Lemma 3.3.
A similar argument shows that χ(A) is the least ordinal with χ(A) @ A, and hence a cardinal. □
Remark 9.2. There is a version of Hartogs’ theorem that doesn’t require Axiom 8 (Replacement). It
is stated in Exercise 13.
Notation 9.3. For a cardinal κ, put κ+ ..= χ(κ). Thus κ+ is the least cardinal greater than κ. In general,
for every ordinal α, we can define the αth infinite cardinal ℵα by recursion on α as follows:
ω if α = 0



 +
ℵα =  if α = S (β) .

ℵβ
 sup ℵ if α is a limit


β<α β

For small cardinals such as ℵ0 , ℵ1 , ℵ2 , it is also common to use the notation ω0 , ω1 , ω2 , instead.
Continuum hypothesis. One of the most notorious questions in set theory (or in mathematics in
general) is whether ω1 = |P(N)|; equivalently, whether ω1 = |R|. The positive answer to this is
known as the Continuum Hypothesis (CH). In 1940, Gödel showed that it is consistent with ZFC
that CH holds. On the other hand, Cohen proved in 1963 that the failure of CH is also consistent
with ZFC, thus showing the independence of CH from ZFC. To do this, Cohen introduced his
famous method of forcing, which, together with its numerous variations, has since been the main
method of proving independence results.

10. Cofinality and Kőnig’s theorem


Definition 10.1. For κ a cardinal and β an ordinal, a sequence (xα )α<β of ordinals xα ∈ κ is called
cofinal in κ if for every γ < κ, there is an index α < β with xα ⩾ γ. We refer to β as the length of
this sequence and define the cofinality of κ to be the length of the shortest cofinal sequence in κ, i.e.
cof(κ) ..= the least ordinal β such that there is a cofinal sequence (xα )α<β in κ.
For example, cof(ω) = ω, cof(ℵω ) = ω, but cof(ω1 ) = ω1 (the proof of this is outlined in the
exercises below).
Definition 10.2. A cardinal κ is called regular if cof(κ) = κ; otherwise, it is called singular.
In the examples above, w and w1 are regular. In general, for an infinite cardinal κ, κ+ is always
regular (this is also outlined in the exercises below). However, ℵω is singular.
Cantor’s theorem implies that κ ⊏ 2κ and thus κ ⊏ κκ for an infinite cardinal κ. What about κcof(κ) ?
Do we still have κ ⊏ κcof(κ) ? The following theorem generalizes Cantor’s theorem and gives an
affirmative answer to the latter question.
17
F
Notation 10.3. LetF I be a set
S and (Ai )i∈I be a sequence of sets. We denote by i∈I Ai the disjoint
union of Ai , i.e. i∈I Ai ..= i∈I {i} × Ai .
Theorem
F 10.4 (Kőnig). Let I be a set and (Ai )i∈I , (Bi )i∈I be sequences. If Ai ⊏ Bi for all i ∈ I, then
Q
A
i∈I i ⊏ i∈I Bi .

We prove this theorem below, but first note that this is indeed a generalization of Cantor’s theorem
since applying it to I = ω, Ai = 1 and Bi = 2 (for all i ∈ ω), gives ω ⊏ 2ω . Moreover, we also get:
Corollary 10.5. κ ⊏ κcof(κ) , for every infinite cardinal κ. In particular, ℵω ⊏ ℵωω .
Proof. Put I = cof(κ) and let (Ai )i∈I be a cofinal sequence inSκ, and thus, = κ. Putting Bi = κ,
S
F i∈I AiQ
it is clear that Ai ⊏ Bi . Hence, Kőnig’s theorem implies κ = i∈I Ai ⊑ i∈I Ai ⊏ i∈I Bi = κcof(κ) . □
F Q
Proof of Theorem 10.4. First, we show that i∈I Ai ⊑ i∈I Bi . Using AC, we get a sequence ( fi )i∈I
of injections fi : Ai ,→ Bi . Note that because Bi @ Ai , Q none of these fi is surjective, so Bi \ fi′′ Ai , ∅.
Applying AC again, we F get a sequence
Q x = (xi )i∈I ∈ i∈I Bi such that xi ∈ Bi \ fi′′ Ai for each i ∈ I.
Finally, we define f : i∈I Ai → i∈I Bi by

 fi (a) if j = i

f (i, a)( j) ..= 

x j
 otherwise.
F Q
It is easy to check that f is injective and thus i∈I Ai ⊑ i∈I Bi .
F Q
Now suppose
Q towards a contradiction that there is a surjection g : i∈I Ai ↠ i∈I Bi . We will
define x ∈ i∈I Bi that is not in the image of g.
The main point is that the map gi : Ai → Bi defined by a 7→ g(i, a)(i) cannot be surjective by the
′′
F so Bi \ gi AiF
hypothesis, , ∅ and we can choose a point from it. To this end, fix a choice function
π : P ( i∈I Bi ) \ {∅} → i∈I Bi . Now for each i ∈ I,  we define
 the value x(i) such that it ensures x
is not in the g-image of {i} × Ai , namely: x(i) ..= π Bi \ g′′i Ai . It is clear that x is not in the image
of g, for if x = g(i, a) for some i ∈ I and a ∈ Ai , then x(i) , gi (a) = g(i, a)(i). □

11. Classes
This section can be safely skipped by an impatient reader. It was mainly written to satisfy the
curiosity of those readers (in particular, the author’s significant other), who have heard the phrase
“It’s a proper class, not a set.” and have always wondered what it means.
Given a mathematical statement (formula) φ(x), one may wonder if there exists a set y such that
y = {x : φ(x)} ,
i.e. y contains all of the sets (in the world) that satisfy φ(x). The answer of course depends on φ.
For example, if φ(x) is the statement “x ∈ z”, for some set z, then such a set y exists, namely y = z.
Another such example is φ(x) ≡ “∀y ∈ x(y , y)” as {x : φ(x)} = ∅. On the other hand, if φ(x) is the
statement “x < x”, then such y doesn’t exist since if it did, then either y ∈ y or y < y, and both yield
a contradiction. (This is the famous Russell’s paradox.)
However, when writing proofs or theorems, it is often convenient to refer to {x : φ(x)} as if it was
a set (even when it isn’t) and call it a class. This is just abuse of language and is not part of formal
mathematics. The word class stands for the totality of all sets satisfying the formula φ, and it can be
eliminated from any statement that contains it, although this usually makes statements longer. For
example, for a class C = {x : φ(x)}, the statement “x ∈ C” really means “x satisfies φ”. A perhaps
more convincing example is the statement “The class ON of all ordinals is well-ordered by ∈.” as it
is equivalent to the statement of (b) of Lemma 3.4 together with (b) of Lemma 3.5.
18
The classes that are not equal to sets (like the class defined by “x < x”) are called proper classes.
The intuition is that proper classes are “too big” to be sets. The existence of such classes comes
from the fact that it is not postulated by ZF (or ZFC) that for a given formula φ(x), there must be a
set y such that y = {x : φ(x)}. What is postulated is Axiom Schema of Comprehension (Axiom 5),
which we recall here:
For every set y, there is a set z such that z = {x ∈ y : φ(x)}.
Thus, to avoid ending up with a proper class when trying to define a set, we make sure that all of
the elements we want to put in our set a priori belong to some (perhaps larger) set. In rare cases
when this is not possible, we may need to use Axiom 8 (Replacement) instead, as we did in the proof
of Theorem 3.12.
Proposition 11.1. The following classes are proper:
(a) R = {x : x < x};
(b) V = {x : x = x};
(c) ON = {x : x is an ordinal}.
Proof. Part (a) is left as an exercise. For (b), note that if V was a set, then by Axiom 5 (Compre-
hension), R = {x ∈ V : x < x} would also be a set, which is a contradiction. Part (c) is left as an
exercise. □
Exercises
1. Let x, y, A, B be sets.
(a) Show that {x} is a set.
(b) Show that (x, y) ..= {{x} , {x, y}} is a set and write a formula φ(z) that holds if and only if
z is an ordered pair. Moreover, write formulas φ0 (z, x) and φ1 (z, y) such that φ0 (z, x) and
φ1 (z, y) hold if and only if z = (x, y). In other words, φ0 defines the function z 7→ x and φ1
defines the function z 7→ y.
(c) Show that A × B ..= {(x, y) : x ∈ A ∧ y ∈ B} is a set.
(d) Define the notion of a function f : A → B as a certain subset of A × B, i.e. write down
which sets are called functions from A to B.

2. Show that the powerset of a transitive set is transitive.

3. Prove part (d) of Lemma 3.3 and part (a) of Lemma 3.4. Do not use any of the later parts (b)-(c)
of Lemma 3.4 in your proofs.
4. Prove that there does NOT exist a set that contains all of the ordinals.

5. Prove Lemma 3.11.


6. Prove Proposition 5.4.
7. Prove that N ≡ Z ≡ Q.

8. Prove that R ≡ (0, 1) ≡ [0, 1] ≡ 2N ≡ P(N).


9. Prove that AC implies Zorn’s Lemma.
Outline: Let (A, <) be as in the statement of Zorn’s Lemma and assume for contradiction that
no element is maximal. Then for every chain C ⊆ A, the set U(C) = {a ∈ A : ∀b ∈ C(b < a)}
19
of strict upper bounds is nonempty. Hence, denoting the set of all chains in A (including the
empty set) by Chains(A), AC provides a function f : Chains(A) → A mapping each chain C to
an element in U(C). Using f , obtain a contradiction by defining an injection of χ(A) into A by
transfinite induction.
10. Prove that Zorn’s Lemma implies Zermelo’s Theorem.
Hint: For a set A, consider (WO(A), ≺), where WO(A) is as in the proof of Theorem 9.1.
11. Prove that Zermelo’s Theorem implies AC.
Caution: It is easy to accidentally use AC in your proof. Make sure you don’t.

12. (a) Prove without using AC that for a cardinal κ ⩾ ω, |κ × κ| = κ.


Hint: Define a well-ordering <2 of κ × κ (using the well-ordering of κ) such that the
cardinality of every proper initial segment of (κ × κ, <2 ) is less than κ (think of how you
would do it for κ = ω). Conclude that tp(κ × κ, <2 ) ⩽ κ.
(b) (AC) Conclude that if Aα are sets of cardinality at most κ, for α < κ, then | α<κ Aα | ⩽ κ.
S
Pinpoint exactly where you use AC.
(c) (AC) Show that for any infinite cardinal κ, κ+ is regular. In particular, ω1 is regular.

13. Prove the following version of Hartogs’ theorem that doesn’t use Axiom 8 (Replacement):
Theorem 11.2 (Hartogs). For every set A, there is a well-ordered set (H(A), <H(A) ) such that
H(A) @ A and (H(A), <H(A) ) is ⪯-least such well-ordering, i.e. if (B, <B ) is another well-ordering
with the property that B @ A, then (H(A), <H(A) ) ⪯ (B, <B ).

Outline: Let WO(A) be as in Theorem 9.1 and denote by H(A) the quotient of WO(A) by the
equivalence relation ≃, i.e. H(A) = WO(A)/ ≃. For (B, <B ) ∈ WO(A), let [(B, <B )] denote
the ≃-equivalence class of (B, <B ) in H(A). Define an ordering <H(A) on H(A) as follows: for
[(B, <B )], [(C, <C )] ∈ H(A),
[(B, <B )]<H(A) [(C, <C )] ⇐⇒ (B, <B ) ≺ (C, <C ).
Show that <H(A) is actually a well-ordering and verify that (H(A), <H(A) ) satisfies the conclusion
of the theorem.
14. Show that R ..= {x : x < x} is a proper class.
15. An open interval in ω1 is a set of the form (α, β) ..= {γ ∈ ω1 : α < γ < β} or [0, α) ..= α, for some
α < β < ω1 . The topology generated by open intervals is naturally called the open interval
topology.
(AC) Prove that the open interval topology on ω1 is sequentially compact (i.e. every sequence
has a convergent subsequence), but not compact (in the sense of open covers).
16. Let X be a second-countable topological space.
(a) Show that X has at most continuum many open subsets.
(b) Let α, β, γ denote ordinals. A sequence of sets (Aα )α<γ is called monotone if it is either
increasing (i.e. α < β ⇒ Aα ⊆ Aβ , for all α, β < γ) or decreasing (i.e. α < β ⇒ Aα ⊇ Aβ ,
for all α, β < γ); call it strictly monotone, if all of the inclusions are strict.
20
Prove that any strictly monotone sequence (Uα )α<γ of open subsets of X has countable
length, i.e. γ is countable.
Hint: Use the same idea as in the proof of (a).
(c) Show that every monotone sequence (Uα )α<ω1 open subsets of X eventually stabilizes, i.e.
there is γ < ω1 such that for all α < ω1 with α ⩾ γ, we have Uα = Uγ .
Hint: Use the regularity of ω1 .
(d) Conclude that parts (a), (b) and (c) are also true for closed sets.

References
[Kun83] K. Kunen, Set Theory: An Introduction to Independence Proofs, Studies in Logic and the Foundations of
Mathematics, vol. 102, Elsevier, 1983.
[Mos06] Y. N. Moschovakis, Notes on Set Theory, 2nd ed., Undergraduate Texts in Mathematics, Springer, 2006.

Department of Mathematics, University of Illinois at Urbana-Champaign, IL, 61801, USA


Email address: anush@illinois.edu

21

You might also like