Ramsey Theory
Ramsey Theory
Lent 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
What happens when we cut up a mathematical structure into many ‘pieces’ ? How
big must the original structure be in order to guarantee that at least one of the pieces
has a specific property of interest? These are the kinds of questions at the heart of
Ramsey theory. A prototypical result in the area is van der Waerden’s theorem, which
states that whenever we partition the natural numbers into finitely many classes, there
is a class that contains arbitrarily long arithmetic progressions.
The course will cover both classical material and more recent developments in the
subject. Some of the classical results that I shall cover include Ramsey’s theorem, van
der Waerden’s theorem and the Hales–Jewett theorem; I shall discuss some applications
of these results as well. More recent developments that I hope to cover include the
properties of non-Ramsey graphs, topics in geometric Ramsey theory, and finally,
connections between Ramsey theory and topological dynamics. I will also indicate a
number of open problems.
Pre-requisites
There are almost no pre-requisites and the material presented in this course will, by
and large, be self-contained. However, students familiar with the basic notions of graph
theory and point-set topology are bound to find the course easier.
1
Contents III Ramsey Theory
Contents
0 Introduction 3
3 Partition Regularity 27
Index 52
2
0 Introduction III Ramsey Theory
0 Introduction
Vaguely, the main idea of Ramsey theory is as follows — we have some object
X that comes with some structure, and then we break X up into finitely many
pieces. Can we find a piece that retains some of the structure of X? Usually, we
phrase this in terms of colourings. We pick a list of finitely many colours, and
colour each element of X. Then we want to find a monochromatic subset of X
that satisfies some properties.
The most classical example of this is a graph colouring problem. We take a
graph, say
3
0 Introduction III Ramsey Theory
3x + 6y + 2z = 0
2x + 7y + 3z = 0.
4
1 Graph Ramsey theory III Ramsey Theory
N = {1, 2, 3, 4, · · · }.
We also write
[n] = {1, 2, 3, · · · , n}.
Notation. For a set X, we write X (r) for the subsets of X of size r. The
elements are known as r-sets.
We all know what a graph is, hopefully, but for concreteness we provide a
definition:
Definition (Graph). A graph G is a pair (V, E), where E ⊆ V (2) .
In particular, our graphs are not directed.
We are mostly interested in graphs with V = N or [n]. In this case, we will
write an edge {i, j} as ij, and always assume that i < j. More generally, in N(r) ,
we write i1 i2 · · · ir for an r-set, and implicitly assume i1 < i2 < · · · < ir .
Definition (k-colouring). A k-colouring of a set X is a map c : X → [k].
Of course, we often replace k with some actual set of colours, e.g. {red, blue}.
We make more obvious definitions:
Definition (Monochromatic set). Let X be a set with a k-colouring. We say a
subset Y ⊆ X is monochromatic if the colouring restricted to Y is constant.
The first question we are interested is in the following — if N(2) is k-coloured,
can we find a complete infinite subgraph that is monochromatic? In other words,
is there an infinite subset X ⊆ N such that X (2) ⊆ N(2) is monochromatic?
We begin by trying some examples. The correct way to read these examples
is to stare at the colouring, and then try to find a monochromatic subset yourself,
instead of immediately reading on to see what the monochromatic subset is.
5
1 Graph Ramsey theory III Ramsey Theory
a2
a1
6
1 Graph Ramsey theory III Ramsey Theory
...
7
1 Graph Ramsey theory III Ramsey Theory
8
1 Graph Ramsey theory III Ramsey Theory
i j k ` m n
which contradicts the fact that c1 is DIFF. So we know X4 is also coloured DIFF
under c4 .
We are almost there. We need to deal with edges that nest in the sense of
(3)
(iii) and (iv). We look at c5 : X4 → {LSAME, LDIFF} given by
(
LSAME c(ij) = c(ik)
c5 (ijk) =
LDIFF otherwise
9
1 Graph Ramsey theory III Ramsey Theory
but we still have to do the same analysis and it just complicates matters more.
There is a generalization of this to r-sets. One way we can rewrite the
theorem is to say that the colour is uniquely determined by some subset of the
vertices. The cases (i), (ii), (iii), (iv) correspond to no vertices, all vertices, first
vertex, and second vertex respectively. Then for r-sets, we have 2r possibilities,
one for each subset of the r-coordinates.
Theorem (Higher dimensional canonical Ramsey theorem). Let c : N(r) → N
be a colouring. Then there exists D ⊆ [r] and an infinite subset X ⊆ N such
that for all x, y ∈ X (r) , we have c(x) = c(y) if {i : xi = yi } ⊇ D, where
x = {x1 < x2 < · · · < xr } (and similarly for y).
10
1 Graph Ramsey theory III Ramsey Theory
We can write
V = [n], E = {12, 23, 34, · · · , (n − 1)n}.
Finally, we have
V = [n], E = V (2) .
11
1 Graph Ramsey theory III Ramsey Theory
a1
12
1 Graph Ramsey theory III Ramsey Theory
Next we randomly pick a point in one of the red or blue sets, and try to move on
from there. Suppose we landed in the red one. Now note that if we find a blue
Kn in the red set, then we are done. But on the other hand, we only need a red
Kn−1 , and not a full blown Kn . When we moved to this red set, the problem is
no longer symmetric.
Thus, to figure out a good bound, it is natural to consider an asymmetric
version of the problem.
Definition (Off-diagonal Ramsey number). We define R(n, m) = R(Kn , Km )
to be the minimum N ∈ N such that whenever we red-blue colour the edges of
KN , we either get a red Kn or a blue Km .
Clearly we have
R(n, m) ≤ R(max{n, m}).
In particular, they are finite. Once we make this definition, it is easy to find a
bound on R.
Theorem. We have
v(Kn ) \ {v} = A ∪ B,
where each vertex A is joined by a red edge to v, and each vertex in B is joined
by a blue edge to v. Then
It follows that either |A| ≥ R(n − 1, m) or |B| ≥ R(n, m − 1). We wlog it is the
former. Then by definition of R(n − 1, m), we know A contains either a blue
copy of Km or a red copy of Kn−1 . In the first case, we are done, and in the
second case, we just add v to the red Kn−1 .
The last formula is just a straightforward property of binomial coefficients.
In particular, we find
4n
2n − 1
R(n) ≤ ≤√ .
n−2 n
We genuinely have no idea whether ∼ 4n is the correct growth rate, i.e. if there
is some ε such that R(n) ≤ (4 − ε)n . However, we do know that for any c > 0,
we eventually have
4n
R(n) ≤ c .
n
13
1 Graph Ramsey theory III Ramsey Theory
That was an upper bound. Do we have a lower bound? In particular, does R(n)
have to grow exponentially? The answer is yes, and the answer is a very classical
construction of Erdös.
√ n
Theorem. We have R(n) ≥ 2 for sufficiently large n ∈ N.
The proof is remarkable in that before this was shown, we had no sensible
bound at all. However, the proof is incredibly simple, and revolutionized how
we think about colourings.
√ n
Proof. Let N ≤ 2 . We perform a red-blue colouring of KN randomly, where
each edge is coloured red independently of the others with probability 12 .
We let XR be the number of red copies of Kn in such a colouring. Then
since expectation is linear, we know the expected value is
(n2 )
N 1
[XR ] =
n 2
n (n2 )
eN 1
≤
n 2
e n √ n2 √ −n2
≤ 2 2
ne n
=
n
1
<
2
for sufficiently large n.
Similarly, we have [XB ] < 12 . So the expected number of monochromatic Kn
is < 1. So in particular there must be some colouring with no monochromatic
Kn .
Recall the bound
m+n−2
R(m, n) ≤ .
m−1
If we think of m as being fixed, then this tells us
R(m, n) ∼ (n + m)m−1 .
For example, if m is 3, then we have
n+1 n(n + 1)
R(3, n) ≤ = ∼ n2 .
2 2
We can sort-of imagine where this bound came from. Suppose we randomly pick
a vertex v1 . Then if it is connected to at least n other vertices by a red edge,
then we are done — if there is even one red edge among those n things, then we
have a red triangle. Otherwise, all edges are blue, and we’ve found a complete
blue Kn .
So this is connected to at most n − 1 things by a red edge. So if our graph
is big enough, we can pick some v2 connected to v1 by a blue edge, and do the
same thing to v2 . We keep going on, and by the time we reach vn , we would
have found v1 , · · · , vn all connected to each other by blue edges, and we are
done. So we have K(3, n) ∼ n2 .
14
1 Graph Ramsey theory III Ramsey Theory
v1 v2 v3
But this argument is rather weak, because we didn’t use that large pool of blue
edges we’ve found at v1 . So in fact this time we can do better than n2 .
Theorem. We have
100n2
R(3, n) ≤
log n
for sufficiently large n ∈ N.
Here the 100 is, obviously, just some random big number.
Proof. Let N ≥ 100n2 / log n, and consider a red-blue colouring of the edges of
KN with no red K3 . We want to find a blue Kn in such a colouring.
We may assume that
(i) No vertex v has ≥ n red edges incident to it, as argued just now.
(ii) If we have v1 , v2 , v3 such that v1 v2 and v1 v3 are red, then v2 v3 is blue:
v2
v1 v3
We now let
Claim.
log n
E[Xv ] >
10
for each vertex v.
To see this, let
A = {u : uv is red}
and let
B = {u : uv is blue}.
15
1 Graph Ramsey theory III Ramsey Theory
then from the properties we’ve noted down, we know that |A| < n and A(2) is
blue. So we know very well what is happening in A, and nothing about what is
in B.
We fix a set S ⊆ B such that S ∈ F, i.e. S (2) is blue. What can we say about
W if we condition on B ∩ W = S?
A
T
a1
S
B
Let T ⊆ A be the set of vertices that are joined to S only by blue edges. Write
|T | = x. Then if B ∩ W = S, then either W ⊆ S ∪ T , or W ⊆ S ∪ {v}. So there
are exactly 2x + 1 choices of W . So we know
n 2x
E[Xv | W ∩ B = S] ≥ + (E[|random subset of T|])
2x + 1 2x + 1
n 2x−1 x
= x + x .
2 +1 2 +1
Now if
log n
x< ,
2
then
n n log n
≥√ ≥
2x + 1 n+1 10
for all sufficiently large n. On the other hand, if
log n
x≥ ,
2
then
2x−1 x 1 log n log n
x
≥ · ≥ .
2 +1 4 2 10
So we are done.
Claim. X
Xv ≤ 2n|W |.
v∈V
16
1 Graph Ramsey theory III Ramsey Theory
17
2 Ramsey theory on the integers III Ramsey Theory
By the pigeonhole principle, there must be two things that are the same colour,
say
We now cut the universe into blocks into 5 things. Again by the pigeonhole
principle, there must be two blocks that look the same. Say it’s this block again.
18
2 Ramsey theory on the integers III Ramsey Theory
Now we have two sequences, and the point at the end belongs to both of the
two sequences. And no matter what colour it is, we are done.
For k = 3, we can still find such a block, but now that third point could be a
third colour, say, green. This will not stop us. We can now look at these big
blocks, and we know that eventually, these big blocks must repeat themselves.
Here we did the case m = 2, and we used the pigeonhole principle. When we do
m > 2, we will use van der Waerden theorem for smaller m inductively.
We now come up with names to describe the scenario we had above.
Definition (Focused progression). We say a collection of arithmetic progressions
A1 , A2 , · · · , Ar of length m with
Ai = {ai , ai + di , · · · , ai + (m − 1)di }
19
2 Ramsey theory on the integers III Ramsey Theory
0
We view this k-colouring of [n] as a k 2n colouring of blocks of length 2n0 , of
0
which there are W (m − 1, k 2n ).
0
Then by definition of W (m − 1, k 2n ), we can find blocks
Bs , Bs+t , · · · , Bs+(m−2)t
which are coloured identically. By the inductive hypothesis, we know each Bs
contains r − 1 colour-focused AP’s of length m − 1, say A1 , .., Ar−1 with first
terms a1 , · · · , ar and common difference d1 , · · · , dr−1 , and also their focus f ,
because the length of Bs is 2n0 , not just n0 .
Since we assumed there is no monochromatic progression of length n, we can
assume f has a different colour than the Ai .
Now consider A01 , A02 , · · · , A0r−1 , where A0i has first term ai , common difference
di + 2n0 t, and length m − 1. This difference sends us to the next block, and then
the next term in the AP. We also pick A0r to consist of the all the focus of the
blocks Bi , namely
A0r = {f, f + 2n0 t, · · · , f + 2n0 t(m − 2)}
These progressions are monochromatic with distinct colours, and focused at
f + (2n0 t)(m − 1). So we are done.
This argument, where one looks at the induced colourings of blocks, is called
a product argument.
The bounds we obtain from this argument are, as one would expect, terrible.
We have 4k k
..
W (3, k) ≤ k . ,
where the tower of k’s has length k − 1.
Now we can generalize in a different way. What can we say about monochro-
matic structures a k-colouring of Nd ? What is the right generalization of van
der Waerden theorem?
To figure out the answer, we need to find the right generalization of arithmetic
progressions.
Definition (Homothetic copy). Given a finite S ⊆ Nd , a homothetic copy of S
is a set of the form
`S + M,
where `, M ∈ Nd and ` 6= 0.
Example. An arithmetic progression of length m is just a homothetic copy of
[m] = {1, 2, · · · , m}.
Thus, the theorem we want to say is the following:
Theorem (Gallai). Whenever Nd is k-coloured, there exists a monochromatic
(homothetic) copy of S for each finite S ⊆ Nd .
In order to prove this, we prove the beautiful Hales–Jewett theorem, which
is in some sense the abstract core of the argument we used for van der Waerden
theorem.
We need a bit of set up. As usual, we write
[m]n = {(x1 , · · · , xn ) : xi ∈ [m]}.
Here is the important definition:
20
2 Ramsey theory on the integers III Ramsey Theory
It is easy to see that any line has exactly [m] elements. We write L− and L+
for the first and last point of the line, i.e. the points where the active coordinates
are all 1 and m respectively. It is clear that any line L is uniquely determined
by L− and its active coordinates.
Example. In [3]3 , we have the line
L = {(1, 2, 1), (2, 2, 2), (3, 2, 3)}.
This is a line with I = {1, 3} and a2 = 2. The first and last points are (1, 2, 1)
and (3, 2, 3).
Then we have the following theorem:
Theorem (Hales–Jewett theorem). For all m, k ∈ N, there exists N = HJ(m, k)
such that whenever [m]N is k-coloured, there exists a monochromatic line.
Note that this theorem implies van der Waerden’s theorem easily. The idea
is to embed [m]N into N linearly, so that a monochromatic line in [m]N gives an
arithmetic progression of length m. Explicitly, given a k-colouring c : N → [k],
we define c0 : [m]N → [k] by
c0 (x1 , x2 , · · · , xN ) = c(x1 + x2 + · · · + xN ).
Now a monochromatic line gives us an arithmetic progression of length m. For
example, if the line is
L = {(1, 2, 1), (2, 2, 2), (3, 2, 3)},
then we get the monochromatic progression 4, 6, 8 of length 3. In general, the
monochromatic AP defined has d = |I|.
The proof is essentially what we did for van der Waerden theorem. We are
going to build a lot of almost-lines that point at the same vertex, and then no
matter what colour the last vertex is, we are happy.
21
2 Ramsey theory on the integers III Ramsey Theory
22
2 Ramsey theory on the integers III Ramsey Theory
0
Consider the line L01 , L02 , · · · , L0r−1 in [m]n+n , where
(L0i )− = (L− −
i , L ),
and the active coordinate set is Ii ∪ I, where I is the active coordinate set of L.
Also consider the line F with F − = (f, L− ) and active coordinate set I.
Then we see that L01 , · · · , L0r−1 , F are r colour-focused lines with focus
(f, L+ ).
¯ A ∩ [a, b]
d(A) = lim sup .
(b−a)→∞ |b − a|
Clearly, in any finite k-colouring of N, there exists a colour class with positive
density. Thus, we want to know if merely a positive density implies the existence
of progressions. Remarkably, the answer is yes!
Theorem (Szemerédi theorem). Let δ > 0 and m ∈ N. Then there exists some
N = S(m, δ) ∈ N such that any subset A ⊆ [N ] with |A| ≥ δN contains an
m-term arithmetic progression.
23
2 Ramsey theory on the integers III Ramsey Theory
The proof of this theorem is usually the subject of an entire lecture course,
so we are not going to attempt to prove this. Even the particular case m = 3 is
very hard.
This theorem has a lot of very nice Ramsey consequences. In the case of
graph colourings, we asked ourselves what happens if we colour a graph with
infinitely many colours. Suppose we now have a colouring c : N → N. Can we
still find a monochromatic progression of length m? Of course not, because c
can be injective.
Proof. We set
1
δ= .
m2 (m + 1)2
We let N = S(m, δ). We write
[N ] = A1 ∪ A2 ∪ · · · ∪ Ak ,
N2 2 2 N2 1 2 2 N2
− k|A i | m ≥ − (δN ) m = − δN 2 m2 ≥ 0.
(m + 1)2 (m + 1)2 δ (m + 1)2
P
So we are done. Here the first inequality follows from the fact that |Ai | = [N ]
and each |Ai | < δN .
Our next theorem will mix arithmetic and graph-theoretic properties. Con-
sider a colouring c : N(2) → [2]. As before, we say a set X is monochromatic if
c|X (2) is constant. Now we want to try to find a monochromatic set with some
arithmetic properties.
The first obvious question to ask is — can we find a monochromatic 3-term
arithmetic progression? The answer is no. For example, we can define c(ij) to be
the parity of largest power of 2 dividing j −i, and then there is no monochromatic
3-term arithmetic progression.
What if we make some concessions — can we find a blue 10-AP, or if not,
find an infinite red set? The answer is again no. This construction is slightly
more involved. To construct a counterexample, we can make progressively larger
24
2 Ramsey theory on the integers III Ramsey Theory
red cliques, and take all other edges blue. If we double the size of the red cliques
every time, it is not hard to check that there is no blue 4-AP, and no infinite
red set.
···
What if we further relax our condition, and only require an arbitrarily large red
set?
The idea is that since we have lots of red edges, we can try to find a point with
a lot of red edges connected to it, and we hope to find a progression in it.
We say X, Y ⊆ N form an (r, k)-structure if
(i) They are disjoint
(ii) X is red;
(iii) Y is an arithmetic progression;
···
25
2 Ramsey theory on the integers III Ramsey Theory
26
3 Partition Regularity III Ramsey Theory
3 Partition Regularity
In the previous chapter, the problems we studied were mostly “linear” in nature.
We had some linear system, namely that encoding the fact that a sequence is an
AP, and we wanted to know if it had a monochromatic solution. More generally,
we can define the following:
Definition (Partition regularity). We say an m × n matrix A over Q is partition
regular if whenever N is finitely coloured, there exists an x ∈ Nn such that
Ax = 0 and x is monochromatic, i.e. all coordinates of x have the same colour.
Recall that N does not include 0.
There is no loss in generality by assuming A in fact has entries in Z, by
scaling A, but sometimes it is (notationally) convenient to consider cases where
the entries are rational.
The question of the chapter is the following — when is a matrix partition
regular? We begin by looking at some examples.
x3 − x2 = x2 − x1 ,
or equivalently
x3 + x1 = 2x2 .
This implies that 1 −2 1 is partition regular. But this is actually not a
very interesting thing to say, because x1 = x2 = x3 is always a solution to this
equation. So this falls into the previous “trivial” case.
On the second example sheet we saw a stronger version of van der Waerden.
It says we can always find a monochromatic set of the form
27
3 Partition Regularity III Ramsey Theory
By including this variable, we can write down the property of being a progression
in a non-trivial format by
d
1 1 0 −1 a=0
2 1 −1 0 x2
x3
for s = 1, · · · , d. In particular,
X
c(i) = 0
i∈B1
What does this mean? Let’s look at the matrices we’ve seen so far.
Example. 1 1 −1 has the columns property by picking B1 = {1, 3} and
B2 = {2}.
Example. 2 3 −5 has the columns property by picking B1 = {1, 2, 3}.
28
3 Partition Regularity III Ramsey Theory
has the column property. Indeed, we take B1 = {1, 3, · · · , m + 2}, and since
{c(3) , · · · , c(m+2) } spans all of Rm , we know picking B2 = {2} works.
Example. ` −1 has the columns property iff ` = 1. In particular, 1 1
does not have a columns property.
Given these examples, it is natural to conjecture the following:
For a fixed prime p, we let d(x) denote the last non-zero digit of x in base p,
i.e. if
x = dn pn + dn−1 pn−1 + · · · + d0 ,
then
L(x) = min{i : di 6= 0}
and d(x) = dL(x) . We now prove the easy direction of the theorem.
Proposition. If a2 , · · · , an ∈ Q\{0} and a1 a2 · · · an is partition regular,
then X
ai = 0
i∈I
29
3 Partition Regularity III Ramsey Theory
xi ≡ d (mod pL+1 ).
Then some Ip has to occur infinitely often, and then we are done. Note that this
is merely about the fact that if a fixed number is 0 mod n for arbitrarily large n,
then it must be zero. This is not some deep local-global correspondence number
theoretic fact.
It turns out this is essentially the only way we know for proving this theorem.
One possible variation is to use the “first non-zero digit” to do the colouring,
but this is harder.
Let’s now try and do the other direction. Before we do that, we start by
doing a warm up. Last time, we proved that if we had 1 λ , then this is
partition regular iff λ = −1. We now prove a three element version.
Proposition. The equation 1 λ −1 is partition regular for all λ ∈ Q.
30
3 Partition Regularity III Ramsey Theory
Proof. We may wlog λ > 0. If λ = 0, then this is trivial, and if λ < 0, then we
can multiply the whole equation by −1.
Say
r
λ= .
s
The idea is to try to find solutions of this in long arithmetic progressions.
Suppose N is k-coloured. We let
{a, a + d, · · · , a + nd}
Proof. One direction was done. To do the other direction, we recall that we had
a really easy case of, say,
2 3 −5 ,
because we can just make all the variables the same?
In the general case, we can’t quite do this, but we may try to solve this
equation with the least number of variables possible. In fact, we shall find some
monochromatic x, y, z, and then assign each of x1 , · · · , xn to be one of x, y, z.
We know X
ai = 0.
i∈I
31
3 Partition Regularity III Ramsey Theory
But, since X
ai = 0,
i∈I
this is equivalent to
Since all these coefficients were non-zero, we can divide out by ai0 , and we are
done by the previous case.
Note that our proof of the theorem shows that if an equation is partition
regular for all last-digit base p colourings, then it is partition regular for all
colourings. This might sound like an easier thing to prove that the full-blown
Rado’s theorem, but it turns out the only proof we have for this implication is
Rado’s theorem.
We now prove the general case. We first do the easy direction, because it is
largely the same as the single-equation case.
Proposition. If A is an m × n matrix with rational entries which is partition
regular, then A has the columns property.
Proof. We again wlog all entries of A are integers. Let the columns of A be
c(1) , · · · , c(n) . Given a prime p, we consider the (p − 1)-colouring of N, where
x is coloured d(x), the last non-zero digit in the base p expansion. Since A is
partition regular, we obtain a monochromatic solution.
We then get a monochromatic x1 , · · · , xn such that Ax = 0, i.e.
X
xi c(i) = 0.
Last time, with the benefit of hindsight, we were able to choose some large prime
p that made the argument work. So we use the trick we mentioned after the
proof last time.
Since there are finitely many possible partitions of [n], we may assume that
this particular partition is generated by infinitely many primes p. Call these
primes P. We introduce some notation. We say two vectors u, v ∈ Zm satisfy
u ≡ v (mod p) if ui ≡ vi (mod p) for all i = 1, · · · , m.
Now we know that X
xi c(i) = 0.
32
3 Partition Regularity III Ramsey Theory
Similarly, for higher s, we find that for each base p colouring, we have
X X
pt dc(i) + xi c(i) ≡ 0 (mod pt+1 )
i∈Bs i∈B1 ∪...∪Bs
This is not exactly immediate, because the values of xi in (∗) may change as we
change our p. But it is still some easy linear algebra.
Suppose this were not true. Since we are living in a Euclidean space, we have
an inner product, and we can find some v ∈ Zm such that
and * +
X
v, c(i) 6= 0.
i∈Bs
Equivalently, we have
* +
X
(i)
v, c ≡0 (mod p),
i∈Bs
but this is a contradiction. So we showed that A has the columns property with
respect to the partition [n] = B1 ∪ · · · ∪ Br .
33
3 Partition Regularity III Ramsey Theory
For each j, the set {cxj + λj+1 xj+1 + · · · + λm xm : λi ∈ [−p, p]} is called a row
of S.
Example. What does a (2, p, 1) set look like? It has the form
In other words, this is just an arithmetic progression with its common difference.
Example. A (2, p, 3)-set has the form
The idea of an (m, p, c) set is that we “iterate” this process many times, and
so an (m, p, c)-set is an “iterated AP’s and various patterns of their common
differences”.
Our proof strategy is to show that that whenever we finitely-colour N, we can
always find an (m, p, c)-set, and given any matrix A with the columns property
and any (m, p, c)-set (for suitable m, p, c), there will always be a solution in
there.
34
3 Partition Regularity III Ramsey Theory
n
Since c is fixed, we can pick this so that c is large. By van der Waerden, we
find some set monochromatic
for all λi ∈ [−p, p], which is easily seen to be the case, provided N ≥ (m −
1)pM .
Note that the argument itself is quite similar to that in the 1 λ −1 case.
Recall that Schur’s theorem said that whenever we finitely-colour N, we can
find a monochromatic {x, y, x + y}. More generally, for x1 , x2 , · · · , xm ∈ N, we
let ( )
X
F S(x1 , · · · , xm ) = xi : I ⊆ [n], I 6= ∅ .
i∈I
then we can find these as well. For example, we can restrict attention to
{2n : n ∈ N}, and use the finite sum theorem. This is the same idea as we had
when we used van der Waerden to find geometric progressions.
But what if we want both? Can we have F S(x1 , · · · , xm ) ∪ F P (x1 , · · · , xm )
in the same colour class? The answer is actually not known! Even the case
when m = 2, i.e. finding a monochromatic set of the form {x, y, x + y, xy} is
open. Until mid 2016, we did not know if we can find {x + y, xy} monochromatic
(x, y > 2).
To finish he proof of Rado’s theorem, we need the following proposition:
Proposition. If A is a rational matrix with the columns property, then there
is some m, p, c ∈ N such that Ax = 0 has a solution inside any (m, p, c) set, i.e.
all entries of the solution lie in the (m, p, c) set.
In the case of a single equation, we reduced the general problem to the case
of 3 variables only. Here we are going to do something similar — we will use the
columns property to reduce the solution to something much smaller.
35
3 Partition Regularity III Ramsey Theory
[n] = B1 ∪ · · · ∪ Br
for some qis ∈ Q. These qis only depend on the matrix. In other words, we have
X
dis c(i) = 0,
where
−qis
i ∈ B1 ∪ · · · Bs−1
dis = 1 i ∈ Bs
0 otherwise
For a fixed s, if we scan these coefficients starting from i = n and then keep
decreasing i, then the first non-zero coefficient we see is 1, which is good, because
it looks like what we see in an (m, p, c) set.
Now we try to write down a general solution with r many free variables.
Given x1 , · · · , xr ∈ Nr , we look at
r
X
yi = dis xs .
s=1
Now take m = r, and pick c large enough such that cdis ∈ Z for all i, s, and
finally, p = max{cdis : i, s ∈ Q}.
Thus, if we consider the (m, p, c)-set on generators (xm , · · · , x1 ) and yi as
defined above, then we have Ay = 0 and hence A(cy) = 0. Since cy is integral,
and lies in the (m, p, c) set, we are done!
We have thus proved Rado’s theorem.
Theorem (Rado’s theorem). A matrix A is partition regular iff it has the
column property.
So we have a complete characterization of all partition regular matrices.
Note that Rado’s theorem reduces Schur’s theorem, van der Waerden’s
theorem, finite sums theorem etc. to just checking if certain matrices have the
columns property, which are rather straightforward computations.
More interestingly, we can prove some less obvious “theoretical” results.
36
3 Partition Regularity III Ramsey Theory
37
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
C = {c : Z → [k]} ∼
= [k]Z .
We endow this with the product topology. Since [k] has the discrete topology,
our basic open sets have the form
38
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
We observe that this map is invertible, with inverse given by the right shift. This
is one good reason to work over Z instead of N. Moreover, we see that L is nice
and uniformly continuous, since if
1
ρ(c1 , c2 ) ≤ ,
n+1
then
1
ρ(Lc1 , Lc2 ) ≤ .
n
Similarly, L−1 is uniformly continuous. So it follows that (C, L) is a dynamical
system.
Instead of proving Hindman’s theorem directly, we begin by re-proving van
der Waerden’s theorem using this approach, to understand better how it works.
Theorem (Topological van der Waerden). Let (X, T ) be a dynamical system.
Then there exists an ε > 0 such that whenever r ∈ N, then we can find x ∈ X
and n ∈ N such that ρ(x, T in x) < ε for all i = 1, · · · , r.
In words, this says if we flow around X under T , then eventually some point
must go back near to itself, and must do that as many times as we wish to, in
regular intervals.
To justify calling this the topological van der Waerden theorem, we should
be able to easily deduce van der Waerden’s theorem from this. It is not difficult
to imagine this is vaguely related to van der Waerden in some sense, but it is
not obvious how we can actually do the deduction. After all, topological van
der Waerden theorem talks about the whole space of all colourings, and we do
not get to control what x we get back from it. What we want is to start with a
single colouring c and try to say something about this particular c.
Now of course, we cannot restrict to the sub-system consisting only of c itself,
because, apart from it being really silly, the function L does not restrict to a
map {c} → {c}. We might think of considering the set of all translates of c.
While this is indeed closed under L, this is not a dynamical system, since it is
not compact! The solution is to take the closure of that.
Definition (Orbital closure). Let (X, T ) be a dynamical system, and x ∈ X.
The orbital closure x̄ of x is the set cl{T s x : s ∈ Z}.
39
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
for all i = 0, · · · , r.
We are not quite done, because we just know that x ∈ c̄, and not that
x = Lk x for some k.
But this is not bad. Since x ∈ c̄, we can find some s ∈ Z such that
1
ρ(T s c, x) ≤ .
rn + 1
This means x and T s c agree on the first rn + 1 elements. So we know
Then the orbital closure has just two points, c and Lc.
What if we instead had the following?
It turns out these sequences tell us everything we want to know about orbital
closures.
40
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
which implies
So we are done.
For ⇐, if Seq(c0 ) ⊆ Seq(c), then for all n ∈ N, there exists sn ∈ Z such that
Then we have
Lsi c → c0 .
So we have c0 ∈ c̄.
We now return to talking about topological dynamics. We saw that in our
last example, the orbital closure of c has three “kinds” of things — translates of
c, and the all-red and all-blue ones. The all-red and all-blue ones are different,
because they seem to have “lost” the information of c. In fact, the orbital closure
of each of them is just that colouring itself. This is a strict subset of c̄. These
are phenomena we don’t want.
Definition (Minimal dynamical system). We say (X, T ) is minimal if x̄ = X
for all x ∈ X. We say x ∈ X is minimal if (x̄, T ) is a minimal dynamical system.
Proposition. Every dynamical system (X, T ) has a minimal point.
Thus, most of the time, when we want to prove something about dynamical
systems, we can just pass to a minimal subsystem, and work with it.
Proof. Let U = {x̄ : x ∈ X}. Thus is a family of closed sets, ordered by inclusion.
We want to apply Zorn’s lemma to obtain a minimal element. Consider a chain
S in U. If
x̄1 ⊇ x̄2 ⊇ · · · ⊇ x̄n ,
then their intersection is x̄n , which is in particular non-empty. So any finite
collection in S has non-empty intersection. Since X is compact, we know
\
x̄ 6= ∅.
x̄∈S
We pick \
z∈ x̄ 6= ∅.
x̄∈S
41
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
Proof. Suppose not. Then there exists ε > 0 and points xi , yi ∈ X such that
min ρ(T s xi , yi ) ≥ ε.
|s|≤i
ρ(T s x, y) ≥ ε
ρ(T s1 y, T s2 y) < ε,
42
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
Note that this is a different statement from the theorem, because we are not
starting at x. In fact, this is a triviality. Indeed, let (x0 , n) be as guaranteed by
the hypothesis. That is, ρ(x0 , T in x0 ) < ε for all 1 ≤ i ≤ r. Then pick y = x0
and x = T −n x0 .
The next goal is to show that there is nothing special about this y.
Claim. For all ε > 0 and for all z ∈ X, there exists xz ∈ X and n ∈ N for
which ρ(T in xz , z) < ε.
The idea is just to shift the picture so that y gets close to z, and see where
we send x to. We will use continuity to make sure we don’t get too far away.
We choose m as in the previous lemma for 2ε . Since T −m , T −m+1 , · · · , T m
are all uniformly continuous, we can choose ε0 such that ρ(a, b) < ε0 implies
ρ(T s a, T s b) < 2ε for all |s| ≤ m.
Given z ∈ X, we obtain x and y by applying our first claim to ε0 . Then we
can find s ∈ Z with |s| ≤ m such that ρ(T s y, z) < 2ε . Consider xz = T s x. Then
Claim. For all ε > 0 and z ∈ X, there exists x ∈ X, n ∈ N and ε0 > 0 such
that T in (B(x, ε0 )) ⊆ B(z, ε) for all 1 ≤ i ≤ r.
We choose ε0 by continuity, using the previous claim.
The idea is that we repeatedly apply the final claim, and then as we keep
moving back, eventually two points will be next to each other.
We pick z0 ∈ X and set ε0 = 2ε . By the final claim, there exists z1 ∈ X and
some n1 ∈ N such that T in1 (B(z1 , ε1 )) ⊆ B(z0 , ε0 ) for some 0 < ε1 ≤ ε0 and all
1 ≤ i ≤ r.
Inductively, we find zs ∈ X, ns ∈ N and some 0 < εs ≤ εs−1 such that
for all 1 ≤ i ≤ r.
By compactness, (zs ) has a convergent subsequence, and in particular, there
exists i < j ∈ N such that ρ(zi , zj ) < 2ε .
Now take x = zj , and
n = nj + nj−1 + · · · + ni+1
Then
T `n (B(x, εj )) ⊆ B(zi , εi ).
for all 1 ≤ ` ≤ r. But we know
ε
ρ(zi , zj ) ≤ ,
2
and
ε
εi ≤ ε0 ≤ .
2
43
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
So we have
ρ(T `n x, x) ≤ ε
for all 1 ≤ ` ≤ r.
In the remaining of the chapter, we will actually prove Hindman’s theorem.
We will again first prove a “topological” version, but unlike the topological van
der Waerden theorem, it will look nothing like the actually Hindman’s theorem.
So we still need to do some work to deduce the actual version from the topological
version.
To state topological Hindman, we need the following definition:
Definition (Proximal points). We say x, y ∈ X are proximal if
inf ρ(T s x, T s y) = 0.
s∈Z
Example. Two colourings c1 , c2 ∈ C are proximal iff there are arbitrarily large
intervals where they agree.
We will first show how this implies Hindman’s theorem, and then prove this.
To do the translation, we need to, of course, translate concepts in dynamical
systems to concepts about colourings. We previously showed that for colourings
c, c0 , we have c0 ∈ c̄ iff Seq(c0 ) ⊆ Seq(c). The next thing to figure out is when
c : Z → [k] is minimal. By definition, this means for any x ∈ c̄, we have x̄ = c̄.
Equivalently, we have Seq(x) = Seq(c).
We introduce some notation.
Notation. For an interval I ⊆ Z, we say I = {a, a + 1, · · · , b}, we write c(I) for
the sequence (c(a), c(a + 1), · · · , c(b)).
Clearly, we have
44
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
Proof. Suppose c has the bounded gaps property. We want to show that for all
x ∈ c̄, we have Seq(x) = Seq(c). We certainly got one containment Seq(x) ⊆
Seq(c). Consider any interval I ⊆ Z. We want to show that c(I) ∈ Seq(x). By
the bounded gaps property, there is some M such that any interval U ⊆ Z of
length > M satisfies c(I) 4 c(U). But since x ∈ c̄, we know that
x([0, · · · , M ]) = c([t, t + 1, · · · , t + M ])
for some t ∈ Z. So c(I) 4 x([0, · · · , M ]), and this implies Seq(c) ⊆ Seq(x).
For the other direction, suppose c does not have the bounded gaps property.
So there is some bad interval I ⊆ Z. Then for any n ∈ N, there is some tn such
that
c(I) 64 c([tn − n, tn − n + 1, · · · , tn + n]).
Now consider xn = Ltn (c). By passing to a subsequence if necessary, we may
assume that xn → x. Clearly, c(I) 6∈ Seq(x). So we have found something in
Seq(c) \ Seq(x). So c 6∈ x̄.
It turns out once we have equated minimality with the bounded gaps property,
it is not hard to deduce Hindman’s theorem.
Theorem (Hindman’s theorem). If c : N → [k] is a k-colouring, then there
exists an infinite A ⊆ N such that F S(A) is monochromatic.
Proof. We extend the colouring c to a colouring of Z by defining x : Z → [k + 1]
with
x(i) = x(−i) = c(i)
if x 6= 0, and x(0) = k + 1. Then it suffices to find an infinite an infinite A ⊆ Z
such that F S(A) is monochromatic with respect to x.
We apply topological Hindman to (x̄, L) to find a minimal colouring y such
that x and y are proximal. Then either
We wlog it is the first case, i.e. x and y are positively proximal. We now build
up this infinite set A we want one by one.
Let the colour of y(0) be red. We shall in fact pick 0 < a1 < a2 < · · ·
inductively so that x(t) = y(t) = red for all t ∈ F S(a1 , a2 , · · · ).
By the bounded gaps property of y, there exists some M1 such that for any
I ⊆ Z of length ≥ M1 , then it contains y([0]), i.e. y([0]) 4 y(I).
Since x and y are positively proximal, there exists I ⊆ Z such that |I| ≥ M
and min(I) > 0 such that x(I) = y(I). Then we pick a1 ∈ I such that
x(a1 ) = y(a1 ) = y(0).
Now we just have to keep doing this. Suppose we have found a1 < · · · < an
as required. Consider the interval J = [0, · · · , a1 + a2 + · · · + an ]. Again by
the bounded gaps property, there exists some Mn+1 such that if I ⊆ Z has
|I| ≥ Mn+1 , then y(J) 4 y(I).
We choose an interval I ⊆ Z such that
(i) x(I) = y(I)
45
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
Pn
(iii) min I > i=1 ai .
Then we know that
y([0, · · · , a1 + · · · + an ]) 4 y(I).
{f : X → Y | f (xi ) ∈ Ui for i = 1, · · · , k}
– Post-composition Lg : X X → X X be given by Lg (f ) = g ◦ f .
– Pre-composition Rg : X X → X X be given by Rg (f ) = f ◦ g.
The useful thing to observe is that Rg is always continuous, while Lg is continuous
if g is continuous.
Now if (X, T ) is a dynamical system, we let
ET = cl{T s : s ∈ Z} ⊆ X X .
The idea is that we look at what is going on in this space instead. Let’s note a
few things about this subspace ET .
46
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
F = {f ∈ ET : f (x) ∈ Y }.
47
4 Topological Dynamics in Ramsey Theory III Ramsey Theory
48
5 Sums and products* III Ramsey Theory
But this set obviously has some very nice structure. So we are motivated to
consider the following definition:
A∗ = {x ∈ N : x, x + d, · · · , x + md ∈ A}.
is piecewise syndetic.
49
5 Sums and products* III Ramsey Theory
Proof. Let’s in fact assume that A is syndetic, with gaps bounded by b. The
proof for a piecewise syndetic set is similar, but we have to pass to longer and
longer intervals, and then piece them back together.
We want to apply van der Waerden’s theorem. By definition,
N = A ∪ (A + 1) ∪ · · · ∪ (A + b).
a1 , a1 + d1 , · · · , a1 + md1 ∈ A + i.
So we wlog i = 0.
But van der Waerden doesn’t require the whole set of N. We can split N into
huge blocks of size W (m + 1, b + 1), and then we run the same argument in each
of these blocks. So we find (a1 , d1 ), (a2 , d2 ), · · · such that
ai , ai + di , · · · , ai + mdi ∈ A.
50
5 Sums and products* III Ramsey Theory
51
Index III Ramsey Theory
Index
52