Main
Main
Morten S. Risager
These are notes for the 7 week course “Introduction to Number Theory” at the University of
Copenhagen.
Cover photo kindly provided by Christian Rasmussen.
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Primes 11
1.4.1 The fundamental theorem of arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4.2 On the number of primes less than a given size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.3 Pythagorean triples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2 Modular arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1 Basic structures 16
2.2 Modular arithmetic 17
2.3 Euler’s ϕ-function 18
2.3.1 The Chinese remainder theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3 Cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.1 Primality testing 25
3.1.1 Carmichael numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1.2 Miller-Rabin’s primality test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1.3 Modular exponentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4
4 Quadratic reciprocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.1 Quadratic residues and Legendre symbols 33
4.2 Quadratic reciprocity 34
4.2.1 Algebraic integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2.2 Gauss sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6 Sums of squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.0.1 Waring’s problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8 The ABC-conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Introduction
Number theory has a long history in mathematics. Indeed its problems and concepts have played
a formative role in many branches of mathematics. Even today it is a vibrant and active part of
modern mathematics, and it continues to offer new insights and stimulate the creation of new
mathematical subdisciplines and theories.
Historically there is a clear tendency that number theory has been driven by the desire to tackle
specific problems, e.g.: Does there exist positive integer solutions to the equation
x n + yn = zn
for n ≥ 3 (the answer is no)? Are there infinitely many twin primes (we do not know)? Trying to
answer such questions has often taken precedence, rather than generalizing the theory as much as
possible. This comment should not be taken as an indication that number theory does not use or
develop deep theories, but merely as statement of what motivates and drives many number theorists.
This text gives an introduction to the many facets of number theory, including tastes of its
algebraic, analytic, metric, Diophantine and geometric incarnations. We assume the reader has
taken a first course in algebra and has familiarity with groups as well as modular arithmetic.
The text is somewhat brief at points, and we strongly encourage the reader to dive into other
texts on number theory to get a more multifaceted view on the subject. Several are listed in the
bibliography and they all offer their own unique views on the topic.
1. Divisibility and primes
Theorem 1.2.1 — Euclidean division. Let a, b ∈ Z with b ̸= 0. Then there exist unique integers
q, r satisfying 0 ≤ r < |b| and a = bq + r.
■Example 1.1 The integers q, r from Euclidean division can be found by standard long division:
Given a = 36948, b = 2100 we find
36948 : 2100 = 17
2100
15948
14700
1248
As an example of the power of Theorem 1.2.1 we prove a simple result about division. This
will be somewhat easier to prove after we have proved the fundamental theorem of arithmetic below
(Theorem 1.4.2) but the following simple proof shows that it can be proved without the use of
unique factorization.
Proposition 1.2.2 Assume m2 | b2 . Then m | b.
Proof. Certainly there exist natural numbers l such that m | bl (e.g. l = m). By the well-ordering
principle we can let l0 be the smallest such number. Let n, k be the integers such that b2 = m2 n,
bl0 = mk. Notice that mbk = b2 l0 = m2 nl0 so bk = mnl0 . By Theorem 1.2.1 we can write k = ql0 + r
with 0 ≤ r < l0 . But then
so m divides br. By the minimality of l0 we must have r = 0. But then bl0 = mk = mql0 which
shows that b = mq. ■
Definition 1.2.1 Given a, b ∈ Z not both zero we define their greatest common divisor as the
largest natural number which divides both a and b, i.e.
We define gcd(0, 0) = 0
We note that 1 divides any integer, and that if d | a then |d| ≤ |a|, so the maximum in the above
definition is over a finite non-empty set.
Proposition 1.2.3 Let a, b ∈ Z. Then
Proof. We prove that gcd(a, b) = gcd(a, b + a). The other identities are proved a similar way. Let
g1 = gcd(a, b) and g2 = gcd(a, a + b). Since g1 | a and g1 | b we have g1 | a + b so g1 is a common
divisor of a and a + b. It follows that g1 ≤ g2 . Similarly, g2 is a divisor of a and a + b and therefore
also of a and b = (a + b) − a. Hence, g2 ≤ g1 which completes the proof. ■
The greatest common divisor of two integers a, b can be computed effectively as follows: We
have gcd(a, 0) = gcd(a, ±a) = |a|. If ab ̸= 0, |a| ̸= |b| we may use Proposition 1.2.3 to reduce to
the case a > b > 0. Using Theorem 1.2.1 we write a = bq + r with 0 ≤ r < b. By repeated use of
Proposition 1.2.3 we see that gcd(a, b) = gcd(r, b). If r = 0 we have gcd(a, b) = b. If not we repeat
the above argument with a′ = b, b′ = r. This procedure will eventually terminate since r goes down
by at least one in each step. This procedure is called Euclid’s algorithm.
■ Example 1.2 Consider a = 567 and b = 32. We use Euclid’s algorithm and find
Theorem 1.2.4 — Bezout’s identity. Let a, b be integers not both zero. Then there exist an
integer solution x, y ∈ Z to the equation
gcd(a, b) = ax + by.
Proof. One way of proving this is to take the procedure described immediately before the theorem
and then use backwards substitution. Here is another proof that has the advantage of giving an
alternative characterization of gcd(a, b):
Consider
S = {ax + by > 0|x, y ∈ Z} ⊆ N
Clearly S is non-empty. Let g = ax0 + by0 ∈ S be its smallest element. Then g | a since if
not a = gq + r with 0 < r < g by Theorem 1.2.1. But since r = a − gq = a − qax0 − by0 q =
a(1 − qx0 ) − by0 q ∈ S this contradicts the minimality of g. By the same argument g | b. Hence g is
a common divisor i.e. g ≤ gcd(a, b).
Clearly gcd(a, b) | ax0 + bx0 = g so gcd(a, b) ≤ g, and we may conclude that gcd(a, b) = g. ■
R We notice that the proof of Theorem 1.2.4 actually proves that for any a, b ∈ Z we have
gcd(a, b) = min{ax + by > 0|x, y ∈ Z}.
From this observation we easily deduce the following corollary:
1.3 Diophantine equations 9
■ Example 1.3 In the proof of Theorem 1.2.4 we used the well-ordering principle for N i.e. that
any non-empty subset of the natural numbers contains a smallest element to prove the existence of
solutions to the integer equation gcd(a, b) = ax + by. If we want to actually find such a solution we
may use the procedure described above for finding gcd(a, b) = rn−1 : We see from (1.1) that we can
express rn−1 as an integer linear combination of an−1 and bn−2 . Doing successive substitution of
the previous equation we arrive at the result.
With the integers from Example 1.2 we find that
gcd(567, 32) = 1 = 5 − 1 · 4
= 5 − 1 · (9 − 1 · 5) = −9 + 2 · 5
= −9 + 2 · (23 − 2 · 9) = 2 · 23 − 5 · 9
= 2 · 23 − 5 · (32 − 23) = −5 · 32 + 7 · 23
= −5 · 32 + 7 · (567 − 17 · 32) = 7 · 567 − 124 · 32,
Proposition 1.2.6
(i) The integers a, b are coprime if and only if 1 = ax + by for some x, y ∈ Z.
(ii) Let d = gcd(a, b). Then the integers a/d and b/d are coprime.
(iii) Assume a, b are coprime and a | c and b | c. Then ab | c.
(iv) Assume a, b are coprime and a | bc. Then a | c.
Proof. The claim in (i) follows directly from the remark after Theorem 1.2.4. The remaining 3
claims all use (i).
To prove (ii) we use again Theorem 1.2.4 and divide ax + by = d by d. Then (i) proves the
claim. To see (iii) we note that on the assumptions we have c = ea and c = f b so by (i) we have
c = c(ax + by) = f bax + eaby = ab( f x + ey) which proves the claim. To prove (iv) we note if
a, b are relatively prime and bc = da then by (i) we have c = c(ax + by) = cax + day = a(cx + dy)
which shows that a | c. ■
Theorem 1.3.1 Assume a, b, c ∈ Z with a, b not both zero. Let d = gcd(a, b). Then the equation
ax + by = c
has integer solutions if and only if d | c. If d | c the complete solution consists of all pairs of the
10 Chapter 1. Divisibility and primes
form
c b a
(x, y) = (x0 , y0 ) + n · ,−
d d d
where n ∈ Z. Here (x0 , y0 ) is any particular solution to ax + by = gcd(a, b).
Proof. If an integer solution (x, y) exists then since d divides a and b it also divides ax + by = c.
If on the other hand d | c then we pick a solution (x0 , y0 ) of ax0 + by0 = d which is possible by
Bezout’s identity (Theorem 1.2.4). It is then clear that dc (x0 , y0 ) is a solution to ax + by = c. Let
now (x, y) be any integer solution. Then (u, v) = (x, y) − dc (x0 , y0 ) is an integer solution to
au + bv = 0
and therefore also to
a b
u+ v = 0 (1.2)
d d
It follows from Proposition 1.2.6 (ii) that a/d and b/d are coprime. By Proposition 1.2.6 (iv) we
can conclude that b/d | u, i.e. u = nb/d for some n ∈ Z. Substituting back into 1.2 we find that
v = −(a/d)n. We have now proved that any integer solution is of the form
c b a
(x, y) = (x0 , y0 ) + n ,−
d d d
for some integer n. It is easy to see – by direct verification – that all such integer pairs are indeed
solutions. ■
■ Example 1.4 We find all solution to
1485x + 1745y = 15
We start by finding the greatest common divisor of 1485 and 1745. Using division with remainder
we have
1745 = 1 · 1485 + 260
1485 = 5 · 260 + 185
260 = 1 · 185 + 75
185 = 2 · 75 + 35
75 = 2 · 35 + 5
35 = 7 · 5 + 0
so by the discussion before Theorem 1.2.4 we have gcd(1485, 1745) = 5, and since 5 | 15 there are
infinitely many solutions. Doing backwards substitution we find
5 = 75 − 2 · 35
= 75 − 2(185 − 2 · 75) = 5 · 75 − 2 · 185
= 5 · (260 − 185) − 2 · 185 = 5 · 260 − 7 · 185
= 5 · 260 − 7 · (1485 − 5 · 260) = 40 · 260 − 7 · 1485
= 40 · (1745 − 1485) − 7 · 1485
= 40 · 1745 − 47 · 1485
Using Theorem 1.3.1 we find that the complete solution is
1745 1485
(x, y) = 3 · (−47, 40) + n ,− = (−141, 120) + n · (349, −297)
5 5
■
1.4 Primes 11
1.4 Primes
We recall the definition of primes:
Definition 1.4.1 A natural number p > 1 is called a prime if its only divisors are 1 and p. A
natural number n > 1 which is not a prime is called a composite number.
We note that by this definition 1 is neither composite or prime. Historically 1 has been considered a
prime for a long time, but certain theorems become easier to formulate if this is not so.
■ Example 1.5 This is the first 100 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,-
61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173,-
179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283,-
293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421,-
431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541.
Short lists like this can be easily found by hand. For a given n we note that if n is composite
√
then n = ab where either a or b is at most n. Hence if we have checked that n has no factors less
√
than n it must be prime. (See Exercise 1.8). Using Eratosthenes’ sieve we can find a complete list
of small primes. Much longer lists can be found online e.g. at primes.utm.edu. ■
R There is a very clever algorithm (published in 2004) based on Fermat’s little theorem (See
Corollary 2.3.2) which checks whether a number is prime or not. The algorithm is called
AKS after its inventors Agrawal, Kayal and Saxena. We will encounter several algorithms
later which verifies whether a number is prime or not, but a major interest in AKS is that it is
the first algorithm that provably runs in ‘computational time’ bounded by a polynomial in
log(p) where p is the number we want to investigate if it is a prime or not. Hence the required
time is polynomial in the number of digits of p. For more information see [AKS04; Gra05].
Maybe the most important fact about prime numbers is the fundamental theorem of arithmetic.
The crucial result we need to prove it, uses the divisibility results from Section 1.2:
Lemma 1.4.1 — Euclid. Let a, b ∈ Z and assume p is a prime. If p | ab then p | a or p | b.
Proof. If p | a we are done. Assume not: Then gcd(p, a) = 1, for if gcd(p, a) ̸= 1 then since 1, p
are the only divisors of p then gcd(p, a) = p which in particular means that p | a but we have
assumed this not to be the case, i.e. gcd(p, a) = 1. Using Bezout (Theorem 1.2.4) we find that there
exist integers x0 , y0 such that x0 p + y0 a = 1. Multiplying this by b we find that
x0 pb + y0 ab = b
Theorem 1.4.2 — The fundamental theorem of arithmetic. Every natural number n > 1 can
be written as a product of primes. The product is unique up to permutation of terms.
Proof. Let X ⊆ N be the set of natural numbers which can not be written as a product of primes. To
see that X is the empty set assume that it is not, and let n be its smallest member. Then n cannot be
a prime (then it is clearly a product of primes), so it has a factorization n = ab where a, b < n. But
by minimality a, b are not in X so they can both be written as a product of primes, which implies
12 Chapter 1. Divisibility and primes
that n is a product of primes. But then n is not in X which shows that X is the empty set. Hence
every n > 1 can be written as a product of primes.
To show that such a product is unique we assume that
p1 p2 · · · pd = q1 · · · ql
By successive use of Lemma 1.4.1 we find that pd = q j for some j = 1, . . . l. By possibly renaming
the qi ’s we may assume that pd = ql . Cancelling by ql we find
p1 p2 · · · pd−1 = q1 · · · ql−1
1 = q1 · · · ql−d
But this is only possible if l = d, which means we have removed all primes q j on the right. The
claim follows. ■
R The fundamental theorem of arithmetic says that every natural number can be factored
in primes. Is factoring a given number n ‘easy’or ‘hard’? We discussed earlier that the
AKS-algorithm determines whether a number n is prime or not, and does so in polynomial
time in the digits of n, i.e. determining whether a number is prime is ‘easy’. If n is not a
prime the AKS algorithm does not give a factorization of n. It is a big unsolved problem if
there exist a polynomial time algorithm which factors n. There is an algorithm due to Schor
[Sho94; Sho97] which can do prime factorizations on quantum computers in polynomial time.
However quantum computing is still in its infancy, so this has not had great practical use so
far. As we shall see later some of the most used cryptosystems like RSA relies on the fact
that no one knows how to factor ‘easily’.
Proof. We will construct infinitely many prime numbers recursively. 2 is a prime. Assume now
that we have a list of n primes p1 , p2 , . . . pn . Multiplying them together, adding 1, and then using
the fundamental theorem of arithmetic (Theorem 1.4.2) we have
p1 p2 · · · pn + 1 = q1 q2 . . . ql (1.3)
where q1 , . . . ql are (not necessarily distinct) primes. We now notice that q j ̸= pi for all i, j since
if pi = q j = p then by (1.3) we find p | 1 which is impossible. Letting pn+1 be the smallest of the
q j ’s we have added one extra prime to our list. Doing this recursively we see that we can construct
arbitrarily many primes. ■
The first few terms in the sequence of primes constructed in the proof of Theorem 1.4.3 are
2, 3, 7, 43, 13, 53, 5, 6221671, 38709183810571, 139, 2801, 11, 17, 5471, 52662739, . . .
Even if we know that there are at infinitely many primes we have of course only explicitly
computed finitely many. The largest known prime is 282589933 − 1 which is a monstrous number with
24862048 decimal digits. It was found in December 2018. If you want to join the prime-finding
community go to GIMPS or PrimeGrid. The only thing you have to do is to run a piece of software
on your computer.
It was fairly straightforward to prove that there are infinitely many primes. Consider two
coprime integers a and b. Then it is natural to ask if there are infinitely many primes of the form
an + b i.e. infinitely many primes which has remainder b if we divide by a. This is indeed so, but
except for a few simple choices of a and b this turns out to be significantly harder to prove. It was
first proved by Dirichlet more that 2000 years after Euclid’s result (Theorem 1.4.3). See [Apo76,
Thm 7.9] for an accessible proof.
where x is any real number. Theorem 1.4.3 can be formulated as π(x) → ∞ as x → ∞. The
proof of Theorem 1.4.3 can be used to show a bit more. One can use the proof to see that
π(x) ≥ log2 (log2 (x)) (See Exercise 1.13). Much more is true: By a relatively elementary argument
one can show that for x > 2 we have π(x) ≥ 61 log(x)
x
([Apo76, Thm 4.6]) as well as an analogous
x
upper bound 6 log(x) . Riemann was probably the first to fully understand the link between π(x) and
the Riemann zeta function
∞
1
ζ (s) = ∑ ns when ℜ(s) > 1,
n=1
which we shall study more in Chapter 5. Hadamard and de la Vallée Poussin proved in 1896, using
Riemann’s insights, that in fact
x
π(x) ∼ , (1.4)
log(x)
meaning that the quotient of the two sides converges to 1 as x → ∞. This result is called the prime
number theorem and its validity was conjectured independently by Gauss and Legendre 100 years
earlier. The prime number theorem is very powerful, but we expect a much stronger result to be
true: The Riemann hypothesis, which is usually stated in terms of the zeros of ζ (s), is equivalent to
the following conjecture:
Z x
1
Conjecture 1.4.5 For every ε > 0 we have π(x) − dt = O(x1/2+ε ).
2 log(t)
P(X,Y, Z) = X 2 +Y 2 − Z 2 = 0.
By Pythagoras’ theorem solutions with X,Y, Z > 0 to this equation corresponds to side lengths in
right triangles. Given an integer solution (e.g. (X,Y, Z) = (3, 4, 5)) we can always multiply this by
an integer m and get another integer solution (mX, mY, mZ). Such solutions will always have m as
a common factor. Solutions (X,Y, Z) without common factor are called primitive. Note that if two
of X, Y , Z has a common factor the third has the same factor.
14 Chapter 1. Divisibility and primes
Observe now that no primitive solution has Z = 0, and if (X,Y, Z) is a primitive solution then
(X,Y, −Z) is also a primitive solution. Since X and Y has no common factor they cannot both be
even. We claim that they also cannot both be odd. If they were then Z 2 would be of the form 2 + 4k
for some k. But no perfect square is of this form, so one of X, Y is even and the other is odd. Given
a primitive solution (X,Y, Z) we have that (Y, X, Z) is also a solution and this mapping interchanges
the primitive solutions with X even with the primitive solutions with X odd.
We now parametrize all primitive solutions.
X 2 +Y 2 = Z 2 (1.5)
Proof.
We recall that there is a bijection between primitive solution of (1.5) and the rational points on
the unit circle, i.e. rational solutions of the equation
X 2 +Y 2 = 1.
This is seen as follows:
Given a primitive solution x2 + y2 = z2 with z > 0 then (X,Y ) = (x/z, y/z) is a rational number
on the unit circle. Furthermore given a rational point (X ′ ,Y ′ ) = (x′ /z′ , y′ /w′ ) with z′ , w′ > 0 on the
unit circle then consider (x′ w′ )2 + (y′ z′ )2 = (z′ w′ )2 . It is immediate from Proposition 1.2.2 that if
two of the three numbers x′ w′ , y′ z′ , and z′ w′ has a common factor then so does the third. Hence
we can divide by this common factor m . It follows that we can write (X ′ ,Y ′ ) = (a/c, b/c) where
(a, b, c) is a primitive solution to (1.5) with c > 0. The Pythagorean triples with X odd and Y even
correspond exactly to those rational points on the circle (a/c, b/c) with a odd and b even.
We now parametrize all rational points on the unit circle. Consider P = (1, 0) and another
rational point Q = (q1 , q2 ) ̸= P on the unit circle. Then the line between them y = α(x − 1) has
rational slope α = q2 /(q1 −1). On the other hand, given a rational slope α ∈ Q the line y = α(x−1)
and x2 + y2 = 1 intersects at a point Q which satisfies
x2 + α 2 (x − 1)2 − 1 = (1 + α 2 )x2 − 2α 2 x + α 2 − 1 = 0
which has the two solutions
p
2α 2 ± 4α 4 − 4(1 + α 2 )(α 2 − 1) 2α 2 ± 2 α2 − 1
2
= 2
= 1, 2 .
2(1 + α ) 2(1 + α ) α +1
In particular we note that Q has rational coordinates:
2
α − 1 −2α
Q= , .
α2 + 1 α2 + 1
Q
P
1.5 Exercises for Chapter 1 15
2 2 −2pq
If α = qp is an irreducible fraction with p ≥ 0, we find that Q = pp2 −q ,
+q2 p2 +q2
. It is straight-
forward to see that the second coordinate is of the form b/c with b even precisely if p − q is odd.
By the above discussion the result follows. The Pythagorean triple (1, 0, 1) is covered by p = 1,
q = 0 which corresponds to α = ∞. ■
The above theorems might give the impression that it is relatively straightforward to determine
if integer polynomial equations have integer solutions. This is wrong. Most often it is very difficult
to determine all integer solutions. To illustrate this consider Fermat’s equation
xn + yn = zn , for n ≥ 3.
It took almost 400 years between 1637 when Fermat thought that he had proved that this does not
have any integer solutions satisfying xyz ̸= 0 until A. Wiles (and his collaborators) finally found a
definitive proof in 1995.
It follows from Proposition 1.2.3 that the set on the right is well-defined. If x mod n ∈ Z/nZ
is invertible then there exist y mod n ∈ Z/nZ such that xy = 1 + kn for some k ∈ Z. But
then by Proposition 1.2.6 (i) we have gcd(x, n) = 1. On the other hand if gcd(x, n) = 1
then again by Proposition 1.2.6 (i) there exist y, k such that 1 = yx + kn which shows that
(x mod n) · (y mod n) = 1 mod n.
■
Definition 2.1.3 Let R be a ring. If all non-zero elements are invertible then R is called a division
ring or a skew field. If additionally R is commutative then R is called a field.
■ Example 2.3 By Example 2.2 we see that Q, R, C are fields and Z/nZ is a field if and only if
n = p is a prime. This field is usually denoted F p . ■
ax = b (mod n) (2.2)
Proof. Let g = gcd(a, n). Assume x is a solution. Then n | ax − b so ax − b = nc for some integer
c and hence g | b. Assume on the other hand g | b. Note that by Proposition 1.2.6 (ii) we have
gcd(a/g, n/g) = 1. By (2.1) a/g is a unit in Z/(n/g)Z, so there exists a y such that ya/g = 1+kn/g.
Multiplying by b we find that x = yb/g solves (2.2). ■
■ Example 2.4 • Note that x = 4 (mod 8) and x = 0 (mod 8) are two different solutions to
2x = 0 (mod 8). Hence there is not necessarily uniqueness in Proposition 2.2.1.
• Consider solutions to 3x = 7 (mod 5). Euclidean division gives
5 = 1·3+2
3 = 1·2+1
2 = 2·1+0
It is obvious that ϕ(1) = 1. It follows from Example 2.3 that if p is a prime ϕ(p) = p − 1.
More generally we have
To see this we note that we are counting all the numbers 1, 2, . . . , pn except for those that have
a factor in common with pn . But if gcd(a, pn ) > 1 then using that p is a prime we see that
gcd(a, p) > 1 which means that p divides a. Hence the numbers we are excluding are exactly
p, 2p, 3p, . . . , pn and the claim follows.
Proof. From basic algebra we have Lagrange’s theorem which implies that for a finite group G we
have g#G = 1. Applying this to G = (Z/nZ)× gives the result by (2.1) and (2.3). ■
Corollary 2.3.2 — Fermat’s little theorem. Let p be a prime. Then a p = a (mod p) for every
a ∈ Z.
Proof. For p | a both sides are zero and the claim is trivial. For p ∤ a we have gcd(a, p) = 1 and
Euler’s theorem and (2.4) gives that a p−1 = 1 (mod p). Multiplying by a gives the result. ■
We note that the above proof is somewhat anachronistic. Historically Lagrange’s theorem was
a generalization of Fermat’s little theorem. For a more elementary proof of Euler’s theorem we
can argue as follows: Since gcd(x, n) = 1 multiplication by x gives a permutation of the elements
G = {a mod n|gcd(a, n) = 1}. It follows that
∏ ax = ∏ a (mod n).
a∈G a∈G
Since gcd(∏a∈G a, n) = 1 it follows from Bezout’s theorem that there exists an integer b such that
b ∏a∈G a = 1 (mod n). Multiplying by this b finishes the proof of Theorem 2.3.1 .
x = ai (mod ni ), i = 1, . . . , l
Proof. For l = 1 the claim is clear: Choose x = a1 . Assume l = 2 and consider the equation in t
given by
a1 + n1 t = a2 (mod n2 ).
As gcd(n1 , n2 ) = 1 Proposition 2.2.1 ensures that there is a solution to this. Indeed we may take
t = (a2 − a1 )n−1
1 (mod n2 ). Letting x = a1 + n1t we find that this solves the two equations. Assume
two solutions x, y. Then subtracting them we find that x − y = 0 (mod ni ). Then by Proposition
1.2.6 (iii) we have that n1 n2 | x − y which shows uniqueness modulo n1 n2 .
Assume now that the theorem holds for a given l. Then we want to show that it holds also for
l + 1. So assume we have a solution to
x = ai (mod ni ), i = 1, . . . , l
Then we want to show that we can find a solution to
y = ai (mod ni ), i = 1, . . . , l + 1
We consider now the equation
x + tn1 n2 · · · nl = al+1 (mod nl+1 ) (2.5)
As gcd(n1 · · · nl , nl+1 ) = 1 Proposition 2.2.1 ensures that there is a solution to this. Indeed we may
take t = (al+1 − x)(n1 · · · nl )−1 (mod nl+1 ). Letting y = x + tn1 n2 · · · nl we find that this solves all
l + 1 equations. Uniqueness is proved like the case l = 2. ■
■ Example 2.6 The Chinese mathematician Sun Tzu asked in a late 3rd century book for the
smallest solution to the equations
x = 2 (mod 3)
x = 3 (mod 5)
x = 2 (mod 7).
Using Theorem 2.3.3 we see immediately that a solution exist. Recalling the proof we see that in
order to solve the two first equations we solve, in t the equation
2 + t · 3 = 3 (mod 5).
We easily find that t = 2 solves this, and hence 2 + 2 · 3 = 8 is a solution of the two first equations.
To get a solution of all three equations we need to solve in t the equation
8+t ·3·5 = 2 (mod 7).
Since 15 = 1 (mod 7) we find that t = −6 = 1 (mod 7) solves the equation, and x = 8 + 1 · 15 = 23
solves all three equations. Since all solution are of the form x = 23 + k · 3 · 5 · 7 (and all integers of
this form are indeed solutions), this is also the smallest solution. ■
R Note that the proof given for the Chinese remainder theorem above is constructive. A shorter
non-constructive proof can be given as follows: Consider
ψ : Z/n1 · · · nl Z → Z/n1 Z × · · · Z/nl Z
(2.6)
c mod n1 · · · nl 7→ (c mod n1 , . . . , c mod nl )
which is well-defined. It is also injective, for if c mod ni = 0 for all i = 1, . . . l , then by
Proposition 1.2.6 (iii) we have c mod n1 · · · nl = 0. Since the domain and the codomain have
the same number of elements this map must also be surjective, which is a reformulation of
the Chinese remainder theorem.
20 Chapter 2. Modular arithmetic
is a bijection.
Proof. Notice that the map is independent of choice of representative. Note also that if c mod mn ∈
(Z/mnZ)× then gcd(c, mn) = 1. It follows that gcd(c, n) = 1 since gcd(c, n) | gcd(c, mn) = 1. The
same argument shows that gcd(c, m) = 1. It follows that ψ(c mod mn) ∈ (Z/mZ)× × (Z/nZ)× so
the map is indeed well-defined.
As ψ is the restriction to (Z/mnZ)× of the bijection ψ from (2.6) it follows that ψ is injective.
To see that it is surjective we have to show that if ψ(c mod mn) ∈ (Z/mZ)× × (Z/nZ)× then
c mod mn ∈ (Z/mnZ)× . But assume that a prime p | gcd(c, mn). Then by Euclid’s lemma (Lemma
1.4.1) we have p | m or p | n. But this means that p | gcd(c, m) = 1 or p | gcd(c, n) = 1 which
cannot be the case. So no such prime exists, and we have gcd(c, mn) = 1 which shows that
c mod mn ∈ (Z/mnZ)× . Hence ψ is surjective, which finishes the proof. ■
In fact a bit more is true. One can show that ψ is a group isomorphism. Indeed ψ is a ring
isomorphism (we have not defined what this means) but since we do not need it we settle for the
above statement. We need it to be able to conclude the following:
Proof. This follows immediately from Lemma 2.3.4, and the definition of ϕ. ■
■ Example 2.7 Recall Example 2.5. We have (5 mod 6)2 = 25 mod 6 = 1 mod 6. It follows that
5 mod 6 has order 2 = ϕ(6). I.e. 5 is a primitive root modulo 6. ■
Note that if a is a primitive root mod n then by the definition of ϕ(n) we have that (Z/nZ)× is
cyclic with a mod n being a generator. It is a natural question to ask whether to a given n primitive
roots modulo n exist. We will prove that if n is prime this is indeed the case. In order to prove this
we need to study roots of polynomials:
Proposition 2.4.1 Let k be a field and let 0 ̸= f ∈ k[x] be a polynomial over k. Then f has at most
deg( f ) roots.
We note that f (x) = x2 − 2 ∈ (Z/3Z)[x] has no roots in Z/3Z since modulo 3 we have f (0) =
−2, f (1) = −1, f (2) = 2.
Proof. The proof is induction in deg( f ). For polynomials of degree 0, i.e. constants, the claim is
clear. Assume now that the claim holds for polynomials of degree at most n. Let
Since xi − α i = ∑i−1 j i−1− j (x − α) this implies that f (x) = (x − α)g(x) where deg(g) ≤ n.
j=0 x α
If β ̸= α is a root of f then 0 = (β − α)g(β ) and since β − α ̸= 0 we may multiply by its
inverse (since k is a field) and find g(β ) = 0. By induction hypothesis this can be true for at most n
different β and it follows that f has at most n + 1 roots. ■
Proof. Write p − 1 = dl, and consider e(x) = x p−1 − 1 ∈ F p [x]. By Euler’s theorem (Theorem
2.3.1) all non-zero elements of F p are roots of this polynomial, i.e. it has p − 1 different roots. We
notice also the factorization
where g(x) ∈ F p [x] has degree p − 1 − d. By Proposition 2.4.1 f has at most d roots and g has at
most p − 1 − d roots but since their product has p − 1 different roots f must have at least d roots
and g must have at least p − 1 − d roots which proves the claim. ■
Lemma 2.4.3 Let G be a group. Assume g1 , g2 ∈ G are commuting elements with coprime finite
orders n1 ,n2 . Then their product g1 g2 has order n1 n2 .
Proof. Let n = ord(g1 g2 ). Since g1 and g2 commutes (g1 g2 )n1 n2 = (gn11 )n2 (gn22 )n1 = 1 so n | n1 n2 .
On the other hand we have gn1 gn2 = 1. Raising this to the n2 ’th power and using gn22 = 1 we find
gnn
1 = 1 so n1 | nn2 . Since n1 and n2 are coprime we get from Proposition 1.2.6 (iv) that n1 | n. A
2
similar argument shows that n2 |n, which then by Proposition 1.2.6 (iii) implies that n1 n2 | n. Hence
we have n = n1 n2 . ■
Theorem 2.4.4 Primitive roots modulo p exist for every prime number p.
Proof. By the fundamental theorem of arithmetic we can factor ϕ(p) = p − 1 = pl11 · · · plkk into a
product of powers of distinct primes. If we can show that for every i = 1, . . . k, we can find an
li
element ai of F×
p with order pi , then by successive use of Lemma 2.4.3 we find that a = a1 a2 . . . ak
has order ϕ(p) which proves that there exists a primitive root modulo p.
To show that there exists an elements of order plii , we observe that by Proposition 2.4.2 we see
li
that there are plii − plii −1 elements of order plii since the roots of x pi − 1 = 0 which are not roots of
22 Chapter 2. Modular arithmetic
li −1 l
pi
x pi − 1 = 0 have order plii : For such a root ai satisfies ai i = 1 so ord(ai ) | plii i.e. ord(ai ) = pli for
li −1
some 0 ≤ l ≤ li . But if l < li then ai is a root of x pi − 1 = 0 which we have assumed it is not.
This finishes the proof. ■
Theorem 2.4.4 settles the question of the existence of primitive roots modulo primes. There are
no primitive roots modulo 8 (Exercise 2.2), so some restriction on n needs to be assumed. It turns
out that primitive roots modulo n exists precisely if n = 2, 4, pe , or 2pe where p is any odd prime
and e ∈ N. We shall not prove this but please consult [JJ98, Theorem 6.7] for an accessible proof.
Proposition 2.4.6 Let n be a natural number and assume that there exist primitive roots modulo n.
Then ϕ(ϕ(n)) of the numbers 1, 2, . . . , n − 1 are primitive roots modulo n.
Proof. If a is a primitive root modulo n then a (mod n) is a generator of (Z/nZ)× . We see the
number we are looking for equals the number of 1 ≤ r ≤ ϕ(n) where ar (mod n) is a generator of
(Z/nZ)× . But ar (mod n) is a generator precisely if for every 1 ≤ c ≤ ϕ(n) the equation ary = ac
(mod n) has a solution. This happens precisely if ry = c (mod ϕ(n)) has a solution. But by
Proposition 2.2.1 this happens precisely if gcd(r, ϕ(n)) | c for all c which happens precisely if
gcd(r, ϕ(n)) = 1. Hence we are counting the number of elements in the set
R We remark that the proof of Proposition 2.4.6 shows that if a is a primitive root modulo n then
set of all primitive roots are the integers b such that b = ar mod n for some gcd(r, ϕ(n)) = 1.
■ Example 2.8 We want to find all primitive roots modulo 7. Recall that ϕ(7) = 6, so we are
looking for all natural numbers a such that a mod 7 has order 6, or said differently that a mod 7
is a generator for (Z/7Z)× . Note that 32 = 2 (mod 7), 33 = 6 (mod 7) so 3 mod 7 does not have
order 1,2, or 3. Since the order is a divisor of the order of the group. The group order is 6 so we
may conclude that 3 mod 7 has order 6 and is therefore a generator. By the proof of Proposition
2.4.6 the element 3r mod 7 is a generator precisely if gcd(r, 6) = 1 which happens precisely for
r = 1, 5 (mod 6). Note that 35 = 5 (mod 7).
Therefore 3,5 are the only primitive roots modulo 7 which are less than 6. Note that that this
agrees with ϕ(ϕ(7)) = ϕ(6) = ϕ(2)ϕ(3) = 2.
Hence a is a primitive root modulo 7 if and only if a = 3 (mod 7) or a = 5 (mod 7). ■
three. Heath-brown also showed that it holds whenever a is square-free (i.e. a being a product of
different primes) with at most 3 exceptions.
Hooley showed [Hoo67] that Artin’s conjecture follows from a generalization of the Riemann
hypothesis.
x5 − x2 + x − 3 = 0,
i.e. we are looking for integer roots of the polynomial f (x) = x5 − x2 + x − 3. If such a root exist we
can reduce it modulo n (where n is any integer) and we will have that x satisfies f (x) = 0 (mod n).
In this case it turns out to be convenient to consider n = 4. We can then verify that f (0) = 1,
f (1) = 2, f (2) = 3 and f (3) = 2, in particular these numbers are all non-zero modulo 4, and hence
no such integer solution can exist. We say that there is a local obstruction to f (x) = 0 having
integer solutions.
Note that it is straightforward to verify that f (x) has real roots as f (x) → ∞ as x → ∞, and
f (x) → −∞ as x → −∞, so by the intermediate value theorem f (x0 ) = 0 for some real x0 . ■
3 = x 2 + y2 .
It is easy to verify that any square x2 is either 0 or 1 modulo 4. Hence any expression x2 + y2 must
be either 0, 1, or 2 modulo 4. But since 3 is 3 modulo 4 this Diophantine equation does not have a
solution. Notice that this argument generalizes to proving that the equation
p = x 2 + y2 (2.8)
does not have solutions whenever p is equal to 3 modulo 4. We will later in Theorem 6.1.3 that if p
is a prime which is not equal to 3 modulo 4 then (2.8) does indeed have a solution. ■
Local obstructions are useful for proving that there are no solutions to a Diophantine equation.
In other cases there are no local obstructions (e.g. if the Diophantine equation has solutions there
can be no local obstruction). There are families of Diophantine equations where the existence of
both solutions modulo pn for every prime power and a real solution suffices to conclude that the
Diophantine equation has an integer solution. Such a result is called a local to global principle.
Exercise 2.8 Seven pirates try to share a stolen pile of identical diamonds equally between
themselves. Unfortunately, six diamonds are left over, and in the fight over them, one pirate is
thrown overboard and gets eaten by sharks. The remaining six pirates, still unable to share the
diamonds equally since two are left over, again fight, and another is killed. When the remaining
five share the diamonds, one diamond is left over, and it is only after yet another pirate is thrown
overboard that an equal sharing is possible. What is the minimum number of diamonds that the
pirates have stolen?
Exercise 2.9 For which values n is ϕ(n) odd? Show that there exists integers n with ϕ(n) =
2, 4, 6, 8, 10, 12 but not 14.
Exercise 2.10 Find the smallest integer n such that ϕ(n)/n ≤ 1/4
Exercise 2.11 Show that if n > 4 is composite then (n − 1)! = 0 mod n.
Exercise 2.12 Assume a = −1 or that a is a perfect square. Show that a is not a primitive root
modulo p for any prime p > 3. Determine whether a is a primitive root modulo 3. Compare with
Artin’s Conjecture 2.4.7.
Exercise 2.13 (Hard) Let p be an odd prime. Let g be a primitive root modulo p. Prove that either
g or g + p is a primitive roots modulo p2 . In particular this shows that primitive roots modulo p2
exists One can show that primitive roots modulo pe exist for any e ≥ 1. See e.g. [JJ98, Thm 6.7].
Exercise 2.14 Find all integer solutions to each of the equations
3x = 4 (mod 123)
5x = 23 (mod 123)
5x = 234 (mod 2002).
3. Cryptography
The basic idea of cryptography is that we want to send coded messages between two persons
(call them Alice and Bob) without anyone else being able to read and/or change the message.
A major benefit of modern cryptosystems is that they allow Alice and Bob to exchange secret
messages without having secretly exchanged the keys they are using to encrypt and decrypt their
messages. They can use keys that they construct and then exchange in an open non-encrypted form
of communication (like the internet). In fact we all use such encrypted systems every day in our
daily interactions with modern digital technology. We will now describe the principles in some of
these cryptosystems.
We will always assume that the message is a number or a string of numbers. Clearly any text
can be translated to a string of numbers. For instance one can use the unicode-values of the standard
characters. See also [Ste09, p. 3.3.2] for an even more basic implementation.
In the two cryptographic settings we will describe in some detail (Diffie-Hellman and RSA) a
central role is played by large primes. We therefore start by discussing techniques to verify if a
number is prime or not.
Theorem 3.1.1 — Wilson. A number p > 1 is a prime if and only if (p − 1)! = −1 (mod p).
Proof. The statement is clear for p = 2 so assume p > 2. If p is a prime then every 1, 2, . . . p − 1 has
multiplicative inverses in F p since it is a field. If b is its own inverse then b2 − 1 = (b − 1)(b + 1) is
divisible by p so by Euclid’s lemma 1.4.1 p divides b − 1 or b + 1. But this means that b equals
1 or p − 1. Since all the other integers less that p − 1 has a unique inverse different from itself it
follows that (p − 1)! = p − 1 = −1 (mod p).
26 Chapter 3. Cryptography
Let on the other hand (p − 1)! = −1 (mod p). If we assume that p is not a prime then it has
a non-trivial factor 1 < l < p. It follows that 0 = −1 (mod l) which is a contradiction since an
integer l > 1 cannot divide 1. ■
In terms of efficiency using Wilson’s theorem to check for primality is horribly slow as it is very
time consuming to find (p − 1)! if p is large.
Theorem 3.1.2 Let p > 1. Then p is a prime if and only if a p−1 = 1 (mod p) for every a ̸= 0
(mod p).
Proof. Assume p is prime. Then by Theorem 2.3.1 and (2.4) we have a p−1 = 1 (mod p).
If on the other hand a p−1 = 1 (mod p) for every a ̸= 0 (mod p) then if p is not a prime let
1 < l < p be a non-trivial factor. Then l p−1 = 1 (mod l) which implies that l|1 which is impossible.
By contradiction we conclude that p must be a prime. ■
■ Example 3.1 We want to show that 561 is a Carmichael number. Note that 561 = 3 · 11 · 17.
Assume that a is relatively prime to 561. Then a is also relatively prime to 3, 11, and 17. It then
follows from Euler’s theorem 2.3.1 that
a560 = a2·280 = 1 (mod 3), a560 = a10·56 = 1 (mod 11), a560 = a16·35 = 1 (mod 17)
It follows that a560 − 1 is divisible by 3, 11, and 17, and therefore by Proposition 1.2.6 (iii) also by
561. Hence a560 = 1 (mod 561), so 561 is a Carmichael number. ■
R There are many Carmichael numbers: 561, 1105, 1729, 2465, 2821, 6601, 8911 are all the
Carmichael numbers less than 10000. Ahlford, Granville, and Pomerance [AGP94] proved
in 1994 that there are infinitely many Carmichael numbers. It is a wide open problem to
determine asymptotics for the number of Carmichael numbers less than a given x.
Theorem 3.1.3 Let p > 1 be an integer, and write p − 1 = 2k m where m is odd. Then p is a
prime if and only if for every a ̸= 0 (mod p)
r
either am = 1 (mod p) or a2 m = −1 (mod p) for some 0 ≤ r < k. (3.1)
k
Proof. Assume that p > 1 is prime. By Theorem 3.1.2 1 = a p−1 = am2 (mod p), so
k−1 k−1
(am2 + 1)(am2 − 1) = 0 (mod p).
3.1 Primality testing 27
k−1 k−1
By Euclid’s lemma 1.4.1 this implies that am2 = ±1 (mod p). If am2 = −1 (mod p) we are
k−2
done. If not we can use the same argument to conclude that am2 = ±1 (mod p). Continuing to
r
strip off a power of 2 if am2 = 1 (mod p) until r = 0 we find that am = ±1 (mod p) which shows
that (3.1) holds.
Assume now instead that (3.1) holds. If am = 1 (mod p) then
k k
a p−1 = am2 = (am )2 = 1 (mod p).
r
If a2 m = −1 (mod p) then
r k−r k−r−1
a p−1 = (a2 m )2 = ((−1)2 )2 =1 (mod p).
In any case we have a p−1 = 1 (mod p) so by Theorem 3.1.2 p is a prime. ■
Given an odd number p, the existence of a single a ̸= 0 (mod p) not satisfying (3.1) suffices for
proving that p is not a prime.
If (3.1) holds for a given a ̸= 0 (mod p) we say that p is a pseudo-prime to base a. Hence a
prime is a number which is a pseudo-prime to any base a ̸= 0 (mod p). If we have tested that p
is a pseudo-prime to sufficiently many bases we might start to suspect that maybe it is indeed a
prime. This can be made precise and there is a way to use this to verify that p is a prime with a
sufficiently high probability. We will not go into the details of how this is defined but mention only
that such probabilistic tests are usually much faster than deterministic tests. And for applications in
cryptosystems we might be content with knowing that the communication is secure with probability
≥ 1 − ε with the size of ε depending on the concrete case. If I am encrypting my bank account
information ε = 10−10 might suffice, whereas if the national security is at stake I might want a
smaller ε.
R The Miller-Rabin primality test as described above is usually used as a probabilistic test. It
turns out that if we knew a generalization of the Riemann hypothesis, then we could prove
that if p is a pseudo-prime to base a for the first 2 log p bases then p is indeed a prime. See
[Bac90] for a proof.
This implies that 35 is 100011 in binary. We now compute the 2i powers of 3 (mod 100).
0
32 = 3 (mod 100)
21 20 2 2
3 = (3 ) = 3 = 9 (mod 100)
22 21 2 2
3 = (3 ) = 9 = 81 (mod 100)
3 2
32 = (32 )2 = 812 = 61 (mod 100)
4 3
32 = (32 )2 = 612 = 21 (mod 100)
5 4
32 = (32 )2 = 212 = 41 (mod 100)
5 1 0 5 1 0
It follows that 335 = 32 +2 +2 = 32 32 32 = 41 · 9 · 3 = 7 (mod 100). This method is much more
efficient than the naive multiplication. We have only made 7 multiplications (and 5 divisions by 2).
■
3.2.1 Diffie–Hellman
The Diffie–Hellman key exchange protocol is a way for Alice and Bob to agree on a secret number
that they can use to encrypt and decrypt messages that they send to each other. It allows them to
agree on a secret number that only they know, even if anyone can listen to their communication.
The protocol was proposed by W. Diffie and M.E. Hellman in the 1970s [DH76], and is still being
used in various cryptosystems.
It is a so-called symmetric key exchange meaning that the secret key (a number) is used for
both encryption and decryption.
Agreeing on a key
The Diffie–Hellman key exchange protocol consists of the following steps.
This protocol enables Alice and Bob to find a secret key that only they know, and even if
they have only communicated over insecure channels they have managed to agree on a number s
that only they know how to compute. They can now use this number to encode and decode their
messages (using standard tools like AES, Twofish, Serpent, Blowfish, CAST5, Grasshopper, RC4
or similar).
■ Example 3.3 Here is a small example of what Alice and Bob needs to compute. In more realistic
examples the numbers are of course much larger. Even for these small examples it is convenient to
use a computer to do the computations.
1. p = 101, g = 41.
2. nA = 56, gnA = 95 (mod 101).
3. nB = 22, gnB = 65 (mod 101).
4. Alice sends 95 to Bob and receives 65 from him.
5. Alice computes s = 6556 = 36 (mod 101), and Bob computes s = 9522 = 36 (mod 101), so
36 is the secret key.
■
Problem 3.1 — The discrete log problem. Let G be a finite group and let g ∈ G. Given an
element a in the cyclic group generated by g, find n such that
gn = a
In general solving the discrete log problem is believed to be computationally infeasible. No general
algorithm is known, which is polynomial in the number of digits of the size of the group G. There
are groups (Like (Z/nZ, +)) where solving the discrete logarithm problem is easy, but in general,
and for the group (Z/pZ)× it is considered very hard. Analogous to factoring into primes there is a
quantum computer algorithm developed by Schor [Sho94] which solves this in polynomial time, so
if the technology of quantum computers computers matures sufficiently, Alice and Bob need to be
careful when using the Diffie–Hellman protocol. And at all times they need to choose the size of p
in a way where they are not vulnerable to this type of attack.
What if, when Alice and Bob transmit their respective values gnA (mod p) and gnB (mod p) to
each other, Charlie manages to substitute gnA (mod p) by gnC (mod p) and also gnB (mod p) by
gnC (mod p) without Alice and Bob noticing. Here nC is a value that Charlie chooses.
Then Alice computes sA = (bnC )nA (mod p) believing this is the shared secret, while Bob
computes sB = (bnC )nB (mod p) believing this is the shared secret. But Charlie can compute both
sA , and sB since he knows gnA (mod p), gnB (mod p), and nC . Hence Charlie can completely
control the encrypted correspondence between Alice and Bob.
3.2.2 RSA
A more flexible and much more widely used protocol is the RSA protocol invented by R. Rivest, A.
Shamir, and L. Adleman in the 1970’ies [RSA78]. This is a so-called asymmetric algorithm which
uses different “keys” for encrypting and decrypting.
Constructing private and public keys
In RSA Alice needs to construct 2 keys. One for encrypting and one for decrypting. What she
needs to do is the following:
Constructing keys
1. Choose secretly two large primes p1 , p2 .
2. Compute secretly nA = p1 p2 and ϕ(nA ) = (p1 − 1)(p2 − 1), and
3. Choose a “random” 1 < eA < ϕ(nA ) which is coprime to ϕ(nA ).
4. Find secretly a solution dA to the equation eA x = 1 (mod ϕ(nA )) (using Bezout’s identity).
5. Make public the public key (eA , nA ).
6. Keep secret the private key (dA , nA ).
If Bob now wants to send the message a mod nA to Alice he simply encrypts his message by
computing aeA mod nA and sends it to Alice. Alice then computes (aeA )dA mod nA and uses the
following Proposition:
Proposition 3.2.1 Let n be a product of distinct primes. Assume ed = 1 (mod ϕ(n)). Then for
every a we have
aed = a (mod n).
Proof. Notice that ϕ(n) = ∏ p|n (p − 1). Since distinct primes are relatively prime it follows from
Proposition 1.2.6 (iii) that it suffices to prove that aed = a (mod p) for every prime dividing n. If a
is divisible by p this is clear since both sides equal 0. If a is not divisible by p the numbers a and p
are coprime, and Theorem 2.3.1 then gives that
Note that in the proof of Proposition 3.2.1 we do not use that ed is a product of integers, but only
that it equals 1 mod ϕ(n). We could just as well have formulated it for an integer m = 1 mod ϕ(n),
and then concluded that am = a mod n. The reason it is formulated with a product m = ed is
that this shows directly that Alice can compute (aeA )dA mod nA and will get as result the original
message a mod nA . This is the basic idea of RSA.
Now Bob has a secure way to send messages to Alice. If Bob wants to be able to receive
encrypted messages he needs to go through the same procedure as Alice to construct his own public
key (eB , nB ) and private key (dB , nB ). If he makes the public key public Alice (or anyone else) can
encrypt a message b mod nB by computing beB mod nB and sending this coded messages to Bob
who can decrypt it using his private key.
3.3 Exercises for Chapter 3 31
■ Example 3.4 Let us compute a (unrealistically small) set of RSA-keys for Alice.
1. Choose the primes p1 = 43, p2 = 67.
2. nA = 43 · 67 = 2881, ϕ(nA ) = 42 · 66 = 2772.
3. We choose eA = 1003.
4. 1003x = 1 (mod 2772) has solution 655.
5. The public key is (1003, 2881).
6. The private key is (655, 2881).
Alice now puts the public key on her website. Bob wants to send her the number 1330 which is the
time he wants her to meet at their secret place the next day. He computes 13301003 mod 2881 =
2259 mod 2881 and sends Alice 2259. She computes 2259655 mod 2881 = 1330 mod 2881 and
now knows that Bob wants to meet her at 13.30.
In real life applications the primes are typically several 100 digits long. ■
x2 + (ϕ(n) − (n + 1))x + n
1I
dare you to find the prime factors of
n =159542902770478353477333649649796581389938839451613607718093662563072803655525608627699609028760
74683622288154113013374670636273789063661078805497874071102398808291120799306047641864281161403791
13482778195649341783699438770737541361816167097612982009724078014971271538123606653607727522296003
21587160692562661
which is a product of two primes. The primes were computed and multiplied together on my laptop in about a second.
This is a realistic size n for RSA.
4. Quadratic reciprocity
In the field F p it is easy to solve first degree equations ax + b = 0 (mod p). Since F p is a field
this has a solution whenever a ̸= 0 mod p given by x = a−1 (−b) (mod p) where a−1 mod p is the
multiplicative inverse of a mod p in F p .
Consider now a second degree equation of the form
where we assume a ̸= 0 (mod p). By Proposition 2.4.1 this equation can have at most 2 solutions.
Assume that p ̸= 2 (See Exercise 4.1 for the case p = 2.) To understand when it has solutions we
complete the square, write D = b2 − 4ac and find
1
0 = ax2 + bx + c = 4a2 x2 + 4abx + 4ac
4a
1 1
(2ax + b) − b2 + 4ac =
2
(2ax + b)2 − D
= (mod p)
4a 4a
This small computation reduces the question of existence of solutions to (4.1) to the question of
existence of solutions to
Putting y = 2ax + b we see that (4.1) has solutions if and only if (4.2) has solutions. If D = 0
(mod p) equation√ 4.2 has exactly one solution namely y = 0, and if√D ̸= 0 (mod p) and (4.2) has
a solution (call it D) then it has two distinct solutions namely ± D. In this case it follows that
the solutions to (4.1) are given by the familiar formula
1 √
−b ± D .
2a
We will now develop methods for determining whether the equation y2 = D (mod p) has solutions
or not.
4.1 Quadratic residues and Legendre symbols 33
We note that a ∈ Z not divisible by p is a quadratic residue modulo p if and only a mod p ∈ F× p
is a square in F×
p i.e. a mod p = b2 for some b ∈ F× , and a quadratic non-residue if and only if
p
a mod p is not a square in F×
p .
■ Example 4.1 Let p = 7. We compute all squares modulo p
y mod 7 0 1 2 3 4 5 6
y2 mod 7 0 1 4 2 2 4 1
and we see that 1, 2 and 4 are quadratic residues and 3,5,6 are quadratic non-residues modulo 7 ■
Definition 4.1.2 Let p > 2 be a prime and let a ∈ Z. We define the Legendre symbol by
0
if a = 0 (mod p)
a
= 1 if a is a quadratic residue modulo p
p
−1 if a is a quadratic non-residue modulo p
We note that
a1 a2
if a1 = a2 (mod p) then = . (4.3)
p p
We note also that ap = 0 if and only if a = 0 (mod p). It follows that the map
ψ: F×
p → {±1}
a
(4.4)
a mod p 7→ p
is well-defined. Note that the set {±1} equipped with ordinary multiplication is a group of order 2.
Lemma 4.1.1 The map ψ is a surjective group homomorphism.
g0 g2 g
=
=
so f ∈ F× 2i j j
p is a square precisely if f = g for some i. It follows that ψ(g ) = (−1) . But this is
clearly a group homomorphism since
Proof. If p | ab then by Lemma 1.4.1 we have p | a or p | b, so both sides of the equation are zero.
If p ∤ ab then ab mod p ∈ F×p and by Lemma 4.1.1 we have
ab a b
= ψ(ab mod p) = ψ(a mod p)ψ(b mod p) = .
p p p
Proof. We have Q p = ker ψ, from which the result follows that Q p is a subgroup. It follows
from the isomorphism theorem that Im(ψ) = {±1} is isomorphic to the quotient F× p / ker ψ so
#Q p = (p − 1)/2 from which the claim follows. Note also that it follows from the proof of Lemma
4.1.1 that the squares are exactly g2 , g4 , . . . , g p−1 ■
We note that in case (ii) the right-hand side depends only on p mod 4 since if p = a (mod 4) then
(p − 1)/2 = (a − 1)/2 + 2k for a k ∈ Z. In case (iii) the right-hand side depends only on p mod 8
since if p = a (mod 8) then (p2 − 1)/8 = (a2 − 1)/8 + 2k′ for a k′ ∈ Z.
R The quadratic reciprocity theorem was first conjectured by Legendre and Euler and first
rigorously proved by Gauss, probably in 1796. Gauss gave several proofs and called it the
“aureum theorema” (golden theorem) in his diary. There are now more than 300 proofs (see
e.g. [Bau15, Ch. 15]). We will make an elegant proof which uses Gauss sums. Gauss sums
are important tools in number theory. For a very elementary proof using nothing more than
the Chinese remainder theorem see e.g [Rou91].
■ Example 4.2 Before diving into the proof of quadratic reciprocity let us see how it can be used
to determine if -14 is a quadratic residue modulo 2503 which can easily be verified to be a prime.
4.2 Quadratic reciprocity 35
We compute
−14 −1 2 7
= by Corollary 4.1.2
2503 2503 2503 2503
2503−1 25032 −1 2503−1 7−1 2503
= (−1) (−1)
2 (−1) 8 2 2 by Quadratic reciprocity
7
2503
= (−1)1(−1)
7
4
= by (4.3) since 2503 = 357 · 7 + 4
7
2
2
= =1 by Corollary 4.1.2
7
so −14 is a quadratic residue modulo 2503. Indeed x2 = −14 has solutions ±274 ■
Proposition 4.2.2 — Euler’s criterion. Let p > 2 be a prime and a ∈ Z. Then ap = 1 if and only
Proposition 4.2.3 Let p > 2 be a prime and let a ∈ Z. Then ap = a(p−1)/2 (mod p).
Proof. If p | a both sides are zero so suppose not. We notice that a(p−1)/2 mod p is a solution
to x2 = 1 mod p. By Proposition 2.4.2 there are exactly two solutions to this equation and since
±1 mod p are indeed solutions we must have a(p−1)/2 = ±1 (mod p).
Proposition 4.2.2 says that a is a quadratic residue precisely if a(p−1)/2 = 1 (mod p) so, if a is
a quadratic non-residues we must have a(p−1)/2 = −1 (mod p), since there are no other possible
values for it. ■
We note that ζ is rational if and only if ζ is algebraic and of degree 1. If ζ is algebraic of degree 2
we call ζ a quadratic irrational.
R There are many complex numbers which are not algebraic. In fact the algebraic numbers are
countable (See Exercise 4.4), and since the complex numbers are uncountable most complex
numbers are non-algebraic. We call them transcendental. Proving that specific numbers are
transcendental is usually √
rather tricky. Examples of provably transcendental numbers are
∞
10 −i! , e, π, eπ , 2 2 . On the other hand there are many numbers which are conjectured
∑i=1
but not proved to be transcendental like π ± e, γ, ζ (3) and many others. The study of such
properties is usually called transcendence theory (Consult [Wal79]).
Since Z ⊆ C we can multiply, and subtract two algebraic integers, and – perhaps surprisingly –
the result is still an algebraic integer as we see from the following result:
We will not prove Theorem 4.2.5, but refer to [Jar14, Cor 2.28 and Cor. 2.12]. We shall only use
the first part the theorem.
Theorem 4.2.6 The rational algebraic integers are precisely the integers i.e.
Z∩Q = Z
Proof. We have already seen in the above examples that Z ⊆ Z ∩ Q. Let ζ = mn be a rational
algebraic integer which is a root of the monic polynomial f (x) = xk + ∑k−1 i
i=0 ai x with integer
coefficients. We may assume that gcd(m, n) = 1 and n ≥ 1. Inserting ζ in f , multiplying by nk and
subtracting mk we find
k−1
∑ ai mi nk−i = −mk , (4.5)
i=0
so since n is a clearly a factor in the left-hand side we see that n | mk which implies that n = 1 since
gcd(m, n) = 1. But then ζ = m ∈ Z which completes the proof. ■
˜ (mod c) there exist a d ∈ Z such that a − b = cd. But this implies that d ∈ Q so
Proof. Since a=b
by Theorem 4.2.6 we have d ∈ Z which proves the claim. ■
4.2 Quadratic reciprocity 37
ζ2 ζ
ζ3 ζ6 = 1
ζ4 ζ5
R The theory of algebraic numbers is vast, fascinating, and important in many modern mathemat-
ical theories. Here we are only barely touching the surface. For a more thorough introduction
consult [Hec81; Jar14; Mar77; Neu99].
Definition 4.2.3 An nth root of 1 is a complex number ζ ∈ C such that ζ n = 1. The root is
called primitive if ζ k ̸= 1 for 1 ≤ k < n.
Note that both primitive and non-primitive roots of 1 are algebraic integers as they are roots of
f (x) = xn − 1
2πik
Proposition 4.2.8 The nth roots of 1 are precisely ζnk = e nwhere 0 ≤ k < n. The primitive nth
2πik
roots of 1 are precisely ζnk =e n where 0 ≤ k < n and gcd(n, k) = 1 .
■ Example 4.4 The 6th primitive roots of 1 are ζ6 = e2πi/6 and ζ65 = e2πi5/6 . All 6th roots of 1 are
illustrated in Figure 4.1 . ■
We notice immediately that Z[ζn ] is a subring of Z. We can now prove Theorem 4.2.1 (iii):
Proof. Let p be an odd prime. Consider ζ = ζ8 = e2πi/8 ∈ Z[ζ8 ] which is an 8th primitive root of
1. Note that ζ −1 = ζ 7 ∈ Z[ζ8 ]. We have
0 = ζ 8 − 1 = (ζ 4 + 1)(ζ 4 − 1) = (ζ 2 + ζ −2 )ζ 2 (ζ 4 − 1)
Since the binomial coefficients pi are divisible by p if 1 ≤ i ≤ p − 1 (See Exercise 4.10). It follows
x, y ∈ Z we have (x + y) p =x
˜ p + yp (mod p). (4.6)
Bezout to find such b) we see that 1 = 2p (mod p). We conclude that p | ( 2p − 1) and since
2 2
p − 1 ∈ {0, ±2} we find that in this case p = 1.
If p = −1 (mod 8) then ζ ±p = ζ ∓1 so we find τ =τ ˜ 2p (mod p) as before so we can again
conclude 2p = 1.
p−1
n an
g p (a) = ∑ ζ .
n=0 p p
We note that since ζ pa+p = ζ pa since ζ p is a pth root of 1 we have g p (a + p) = g p (a) so the Gauss sum
depends only on a mod p. We note also that since ζ p lies in the ring Z[ζ p ] we have g p (a) ∈ Z[ζ p ].
■ Example 4.5 Let p = 5. We have ζ5 = e2πi/5 and
n 0 1 2 3 4
n
5 0 1 -1 -1 1
The next lemma shows that this pattern holds much more generally.
Lemma 4.2.9 Let p be a prime. Then for any a ∈ Z we have
a
g p (a) = g p (1)
p
Proof. If a = 0 (mod p) then the legendre symbol is zero so we have to show that in this case we
p−1 n
have g p (a) = 0. But we have g p (a) = g p (0) = ∑n=1 p which equals 0 by Corollary 4.1.3.
4.2 Quadratic reciprocity 39
where in the penultimate equality we have used the following reasoning: The summands only
depend on n mod p. Since gcd(a, p) = 1 we see that multiplication by a gives a bijection from
Z/pZ to itself so an mod pruns through the same set as n mod p and the sum is unchanged.
Multiplying both sides by ap = ±1 gives the result. ■
Proof. If x = y (mod n) then ζnx−y = 1 and the claim is clear. If not then ζnx−y ̸= 1 and we may use
the formula for the geometric series to see that
n−1 (x−y)n
(x−y)a ζn −1
∑ ζn = (x−y)
=0
a=0 ζn −1
since ζnn = e2πin/n = 1. ■
√
Theorem 4.2.11 Let a ̸= 0 (mod p). Then g p (a)2 = (−1)(p−1)/2 p. In particular |g p (a)| = p.
2
Proof. It suffices to prove the claim for a = 1, since by Lemma 4.2.9 g p (a)2 = g p (1)2 ap = g2p .
(Recall that we write g p = g p (1)).
Note that by−1Lemma 4.2.9 and Corollary 4.1.2 we have – if a ̸= 0 (mod p) – that g p (a)g p (−a) =
a −a 2 (p−1)/2 g2 , where we have used Corollary 4.2.4. Also g (0) = 0 by
2
p p g p = p g p = (−1) p p
Lemma 4.2.9. The proof of the claim will follow from two different computations of the expression
p−1
∑a=0 g p (a)g p (−a). By the above observation it equals (p − 1)(−1)(p−1)/2 g2p . Using the definition
of g p (a) we find
p−1 p−1 p−1 p−1
n m an −am
(p − 1)(−1)(p−1)/2 g2p = ∑ g p (a)g p (−a) = ∑ ∑ ∑ ζ ζ
a=0 a=0 n=0 m=0 p p p p
p−1 p−1 p−1
n m a(n−m)
=∑ ∑ ∑ ζp
n=0 m=0 p p a=0
p−1
n n
=p∑ = p(p − 1)
n=0 p p
where in the penultimate equality we have used Lemma 4.2.10. Dividing by p − 1 gives the
result. ■
40 Chapter 4. Quadratic reciprocity
We are now ready to prove the main part of quadratic reciprocity namely Theorem 4.2.1 (i).
Since gcd(p∗ , q) = 1 we can multiply by an (p∗ )−1 satisfying p∗ (p∗ )−1 = 1 (mod q) which gives
that
∗
p q
= (mod q)
q p
∗
from which we may conclude that pq = qp . To finish we note that
∗ (p−1)/2
(−1)(p−1)/2 p
p −1 p (p−1)(q−1)/4 p
= = = (−1)
q q q q q
which finishes the proof. ■
■ Example 4.6 We want to determine if 5x4 + 35x2 + 3 = 0 (mod 541) has any integer solutions.
We first notice that if this has a solution x0 then y0 = x02 will be a solution to 5y2 + 35y + 3 = 0
(mod 541) which has discriminant D = 352 − 4 · 5 · 3 = 1165. We need to verify if the discriminant
is a square modulo 541 to see if this is even possible. So we compute using Theorem 4.2.1
1165 83 (83−1)(541−1)/4 541 43 (83−1)(43−1)/4 83
= = (−1) = = (−1)
541 541 83 83 43
2
40 2 2 5 2 43 3
=− =− = −(−1)(43 −1)/8 (−1)(43−1)(5−1)/4 =
43 43 43 43 5 5
= −1
But this means, that 5y2 +35y +3 = 0 (mod 541) has no solutions and therefore 5x4 +35x2 +3 = 0
(mod 541) also cannot have any solutions. ■
4.3 Exercises for chapter 4 41
Exercise 4.6 Prove, using quadratic reciprocity, that for a prime p ≥ 5 we have
(
3 1 if p = 1, 11 mod 12,
=
p −1 if p = 5, 7 mod 12.
Exercise 4.7 In this exercise you will give a direct proof that −3
p = 1 when p = 1 mod 3.
1. Show that (Z/pZ)∗ has an element c of order 3
2. Show that (2c + 1)2 = −3 mod p.
3. Conclude that −3 p = 1.
Show independently, using quadratic reciprocity, that −3
4. p = 1.
Exercise 4.8 You may use without proof that 213 − 1 = 8191 is prime. How many natural numbers
x ≤ 8191 satisfy the equation
x2 = 5 (mod 8191).
Show that in general an = 1 does not imply that x2 = a (mod n) has a solution.
Showthat
1. ab a b
n = n n for integers a, b.
−1
2. n = n (mod 4).
3. n2 = 1 if n = ±1 (mod 8), and −1 otherwise.
4. Assume further that a is positive and odd. Show that
a n
= (−1)(a−1)(n−1)/4 .
n a
q−1
Exercise 4.12 Let p, q be different odd primes and let q∗ = (−1)q. Show that x2 = p (mod q)
2
f :N→C
Note that the set of arithmetical functions are in bijective correspondence to the set of complex se-
quences under f 7→ { f (n)}∞n=1 . As such, without making further assumptions, arithmetic functions
are not more nor less interesting than complex sequences.
■ Example 5.1 The following are all clearly arithmetical functions and it is very easy to cook up
other examples:
1. u(n) = 1
2. N(n) = n
3. Euler’s ϕ function
4. Fix p. The Legendre symbol np .
(
1 if n = 1
5. I(n) =
0 otherwise.
6. τ(n) = ∑d|n 1, the number of divisors of n.
7. σ (n) = ∑d|n d, the sum of divisors of n.
8. Let k ∈ C. σk (n) = ∑d|n d k ,
In the sums the d | n means that we sum over positive divisors.
■
whenever m, n are coprime natural numbers. If (5.1) holds for all m, n ∈ N the arithmetical
function f is called completely multiplicative.
5.1 Arithmetical functions 43
All the arithmetical functions from Example 5.1 are multiplicative. (See Exercise 5.1). The
arithmetical functions u, N, np , I are completely multiplicative (See Exercise 5.2).
We note that if n has prime factorization n = pl11 · · · plkk with pi ̸= p j for i ̸= j then
l l
Proposition 5.1.1 If n = p11 · · · pkk then
k
τ(n) = ∏(li + 1)
i=1
k
plii +1 − 1
σ (n) = ∏
i=1 pi − 1
Proof. Since
pl+1 − 1
σ (pl ) = 1 + p + p2 + · · · + pl =
p−1
l
τ(p ) = 1 + 1 + 1 + · · · + 1 = l + 1
the proof is immediate.
■
We note that it follows directly from Proposition 5.1.1 that τ(n) is odd if and only if li is even
for each i which happens if and only if n is a square, i.e. n = a2 for some a ∈ N.
We note that formulated in terms of the divisor function we have that a n is perfect if and only if
n = σ (n) − n, i.e. 2n = σ (n)
■ Example 5.2 The number 6 is perfect since σ (6) = 1 + 2 + 3 + 6 = 12 = 2 · 6. In the same way
we can check that 28, 496, 8128, 33550336, 8589869056 are perfect numbers. Using a computer
one can verify that these are the only perfect numbers less that 1010 . ■
Theorem 5.1.2 The number n is perfect and even if and only if n = 2 p−1 (2 p − 1) where p and
2 p − 1 are primes.
The “if” part of this theorem goes back to Euclid, but the “only if” part was a conjecture from
around year 1000 until it was proven by Euler in the 18th century.
Proof. The “if” part follows from observing that if p and 2 p − 1 are prime then 2 p−1 and 2 p − 1
are coprime and by Proposition 5.1.1 we have
2p − 1
σ (n) = σ (2 p−1 )σ (2 p − 1) = (1 + 2 p − 1) = 2 · 2 p−1 (2 p − 1) = 2n
2−1
which shows that n is perfect.
44 Chapter 5. Arithmetic functions and Dirichlet series
To prove the only if part we assume that n is perfect and even and write n = 2 p−1 q with p, q ∈ N,
p ≥ 2, and q odd. We note that q > 1 since otherwise n = 2 p−1 but σ (2 p−1 ) = 2 p − 1 ̸= 2 · 2 p−1 .
Using that n is perfect we see that
2p − 1
2 p q = 2n = σ (n) = σ (2 p−1 )σ (q) = σ (q),
2−1
σ (q) = 2 p r = (2 p − 1)r + r = q + r
This theorem classifies even perfect numbers, but doesn’t say much about their existence. Also
it makes no claim about odd perfect numbers.
Conjecture 5.1.3 There are no odd perfect numbers, but infinitely many even perfect numbers.
It has been verified that there are no odd perfect numbers less than 101500 . Note that by
Theorem 5.1.2 there is a 1-1 correspondence between even perfect numbers and Mersenne primes
(see Exercise 1.18). See mersenne.org/primes for a list of all 49 known Mersenne primes. Hence
we know also 49 even perfect numbers.
( f ∗ g)(n) = ∑ f (d)g(n/d)
d|n
Let
A = { f arithmetical functions}
G = { f ∈ A| f (1) ̸= 0}
M = { f ∈ G| f is multiplicative}
Notice that if f is multiplicative then f (m) = f (m) f (1) so if f ̸= 0 then we have f (1) = 1.
f ∗ g(n) = ∑ f (a)g(b).
ab=n
5.1 Arithmetical functions 45
and similarly
Proof. It is clear that for any f ∈ A we have f ∗ I(n) = I ∗ f (n) = ∑d|n I(d) f (n/d) = f (n). If
f ∈ G we need to show that there exists another arithmetical function f −1 satisfying f −1 ∗ f (n) =
∑d|n f −1 (d) f (n/d) = I(n). But if f (1) ̸= 0 we can construct such a function recursively by
f −1 (1) = 1/ f (1) and
1
f −1 (n) = − ∑ f −1 (d) f (n/d).
f (1) d|n
d̸=n
{d | m} × {e | n} →{ f | mn}
(d, e) 7→ de.
We claim that this is a bijection: It is surjective since if f | mn we may define d = gcd( f , m) and
e = gcd( f , n). It is clear from Proposition 1.2.6 (iii) that de | f . If f ̸= de we let f = d · e · g with
g ̸= 1. Let p | g be a prime. Then since f | mn Euclid’s Lemma 1.4.1 implies that p | m or p | n. If
p | n we see that pe | f . We have also pe | mn. Since pe is coprime with m (A common prime factor
would be a common prime factor of both m and n but they have no common factor other than 1)
Theorem 1.2.6 (iv) implies that pe | n. Hence pe is a common divisor of f and n, which contradicts
e being their greatest common divisor. If instead p | m we are lead to the same contradiction, and
we may conclude that g = 1 which means that the map is surjective.
To see that it is injective assume that de = d ′ e′ with d, d ′ | m and e, e′ | n. Then d,e′ are coprime,
and e, d ′ are coprime. By Proposition 1.2.6 (iv) we may conclude that d | d ′ , d ′ | d, e | e′ , and e′ | e
which implies that (d, e) = (d ′ , e′ ).
46 Chapter 5. Arithmetic functions and Dirichlet series
To show that if F is multiplicative then so is F −1 we assume that it is not and let n be the smallest
natural number such that there exists a factorization n = ml with m, l coprime natural numbers
such that F −1 (ml) ̸= F −1 (m)F −1 (l). Notice that multiplicative functions satisfies F(1) = 1 so
1 = I(1) = F ∗ F −1 (1) = F(1)F −1 (1) = F −1 (1). This implies that n ̸= 1.
Note that for all coprime natural numbers a, b with ab < n we have F −1 (ab) = F −1 (a)F −1 (b).
Hence using the above isomorphism we find that
−1 −1 ml
I(ml) = F ∗ F(ml) = ∑ F ( f ) F
f |ml
f
ml
= ∑ ∑ F −1 (de)F
d|m e|l
de
ml
= ∑ F −1 (de)F + F −1 (ml)F(1)
d|m
de
e|l
de̸=ml
m l
−1 −1
= ∑ F (d)F (e)F F + F −1 (ml)F(1)
d|m
d e
e|l
de̸=ml
m l
−1 −1
= ∑F (d)F (e)F F + F −1 (ml) − F −1 (m)F −1 (l)
d|m
d e
e|l
m
−1 −1 l
= ∑F (d)F ∑F (e)F + F −1 (ml) − F −1 (m)F −1 (l)
d|m
d e|l
e
= I(m)I(l) + F −1 (ml) − F −1 (m)F −1 (l) ̸= I(m)I(l)
which is a contradiction. Hence F −1 is indeed multiplicative.
■
This theorem allows us to prove the following “two out of three” result:
Proof. This follows in a straightforward manner from Theorem 5.1.6: If f , g are multiplicative then
so is f ∗ g = h. If f , h are multiplicative then so is f −1 ∗ h = g, and if g, h are multiplicative then so
is g−1 ∗ h = f . ■
5.2 Dirichlet series 47
A few comments are needed in order to see that this is a reasonable definition. Clearly u ∈ G. By
Theorem 5.1.5 u therefore has an inverse with respect to *. By general group theory such an inverse
is unique in G. It is clearly also unique in A since any µ satisfying µ ∗ u = I must have µ(1) = 1, so
µ ∈ G. This shows that µ is well-defined. It further follows from Theorem 5.1.6 and the fact that u
is multiplicative that µ is multiplicative. We can describe µ(n) in terms of the prime decomposition
of n:
Theorem 5.1.8
(
(−1)k if n is the product of k different primes
µ(n) =
0 otherwise.
Proof. Let µ ′ be the right-hand side. We have µ ′ ∗ u(1) = ∑ab=1 µ ′ (a)u(b) = µ ′ (1)u(1) = 1 · 1 =
I(1). For n = pl11 · · · plkk > 1 we observe that
Many number theory texts uses Theorem 5.1.8 as the definition of the Möbius function. Then
they need to prove that µ ∗ u = I.
We note that there is no requirement in this definition about convergence of the series, but if the
series is divergent everywhere the series is not so useful. Following Riemann it is customary to
write s = σ + it ∈ C with σ ,t ∈ R.
48 Chapter 5. Arithmetic functions and Dirichlet series
Lemma 5.2.2 Assume | f (n)| ≤ Cnc . Then D f is absolutely convergent for σ > 1 + c, i.e. the
abscissa of absolute convergence is less than or equal to 1 + c.
Proof. This follows directly from Example 5.4 and the comparison test. ■
We can now show that there is a precise relation between multiplication of Dirichlet series and
Dirichlet convolution of the corresponding arithmetical functions.
Theorem 5.2.3 Assume f1 , f2 , f3 are arithmetical functions satisfying | fi (n)| ≤ Cnc then we
have D f1 (s)D f2 (s) = D f3 (s) for σ sufficiently large if and only if f1 ∗ f2 = f3 .
Proof. Assume f1 ∗ f2 = f3 . Since for σ sufficiently large we have that D f1 , D f2 are absolutely
convergent we have, using Cauchy products, that
∞
f1 (n) ∞ f2 (m) ∞
f1 (n) f2 (m)
D f1 (s)D f2 (s) = ∑ s ∑ s
= ∑
n=1 n m=1 m n,m=1 (nm)s
∞ ∞
∑nm=k f1 (n) f2 (m) f1 ∗ f2 (k)
= ∑ s
= ∑ = D f3 (s)
k=1 k k=1 ks
But an absolutely convergent Dirichlet series is uniquely determined by its coefficients (See Exercise
5.5). ■
■ Example 5.5 Note that |µ(n)| ≤ 1 so Dµ is absolutely convergent for σ > 1 by Lemma 5.2.2.
Note also that DI (s) = 1. It then follows from Theorem 5.2.3 and the definition of the Möbius
function that Dµ (s)Du (s) = 1. Since Du (s) = ζ (s) we see that ζ (s) has no zeroes for σ > 1. We
note also that for σ > 1 we have
∞
µ(n) 1
∑ ns = ζ (s) .
n=1
In fact Dµ (s) has abcissa of absolute convergence equal to 1 as follows from the above and Exercise
5.6. ■
Euler products
Let pn be the n’th prime.
∞
∑ g(n) − Pl = ∑ g(n) ≤ ∑ |g(n)| ≤ ∑ |g(n)|
n=1 n∈A
/ l n∈A
/ l n>pl
which can be made arbitrarily small by making l large since ∑∞ n=1 g(n) is absolutely convergent.
This proves the first claim.
The second claim follows after we observe that g(pk ) = g(p)k . For this implies, since the
terms in a convergent series goes to zero, that |g(p)| < 1, and hence the sum ∑∞ k
k=0 g(p ) is really a
convergent geometric series which equals (1 − g(p))−1 . This finishes the proof. ■
50 Chapter 5. Arithmetic functions and Dirichlet series
Definition 5.2.2 We say that D f (s) admits an Euler product if D f (s) = ∏ p prime D f ,p (s) where
∞
f (pk )
D f ,p (s) = ∑ .
k=0 pks
Corollary 5.2.5 Assume f is a multiplicative arithmetic function and that D f (s) is absolutely
convergent in some halfplane. Then D f (s) admits an Euler product. If f is completely multi-
plicative then
1
D f ,p (s) = .
(1 − f (p)p−s )
n
■ Example 5.8 Fix a prime q and consider χ(n) = q which is completely multiplicative by
Corollary 4.1.2. The corresponding Dirichlet series is usually denoted L(s, χ) and we find
Here Γ(s) = 0∞ e−t t s−1 dt (when σ > 0) is the Euler Gamma function which admits a con-
R
ξ (s) = ξ (1 − s).
5.3 Exercises for chapter 5 51
For a proof of this and a wealth of other information about the Riemann zeta function consult
[Ivi85; KV92; Tit86]
Conjecture 5.2.7 — The Riemann hypothesis. All zeros of ξ (s) lie on the critical line ℜ(s) =
1/2.
We have already seen one equivalent formulation of this namely Conjecture 1.4.5. Here is
another equivalent formulation:
∑ µ(n)
n≤x
Conjecture 5.2.8 For every ε > 0 is bounded.
x1/2+ε
R The Riemann hypothesis has not been proved despite many attempts. Hardy [Har14] proved
that there are infinitely many zeros on the critical line. Selberg [Sel42] proved that a positive
(but unspecified) proportion of the zeroes lie on the critical line. This has been improved
several times with a 2012 result of Feng [Fen12] as the current record: Feng proved that
at least 41.28% of all zeros of ζ lies on the critical line. See [Bor+08] for a treasure of
information about the Riemann hypothesis.
∞
| f (N)| ≤ N σk (N + 1)−(σk −c) ∑ | f (n)| n−c
n=N+1
(iii) Conclude, by letting k → ∞, that f (N) = 0, which gives a contradiction and therefore
f (n) = 0 for all n ∈ N.
Exercise 5.6 In this exercise we show that the reciprocal of the primes diverges. Let pn denote the
n’th prime. We start by assuming that ∑∞ −1
n=1 pn is convergent with sum l.
(i) Show that there exist N ∈ N such that ∑n>N p−1 n = l − ∑Nn=1 p−1
n ≤ 1/2.
k
(ii) Show that ∑k=1 ∑n>N p−1
∞
n is absolutely convergent.
(iii) Let W = p1 · · · pN . For r ∈ N consider W r + 1. Show that the prime factorization of W r + 1
contains only primes p > pN .
52 Chapter 5. Arithmetic functions and Dirichlet series
!k
1
(iv) Show, using Cauchy multiplication, that ∑
n=q1 ···qk n
= ∑ p−1
n where the left-hand side
n>N
qi prime
qi >pN
denotes the reciprocals of all natural numbers which can be written of precisely k primes all
bigger that pN .
1 −1 k
(v) Show that ∑∞ ∞
r=1 W r+1 ≤ ∑k=1 ∑n>N pn
(vi) Derive a contradiction and conclude that ∑∞ −1
n=1 pn cannot be convergent.
Exercise 5.7 Let g(n) be the sum of primitive nth roots of 1, i.e.
g(n) = ∑ ζ.
ζ n =1
ζ m ̸=1
for 0<m<n
Show that µ(n) = g(n); i.e. µ(n) is the sum of primitive nth roots of 1.
Exercise 5.8 Let k ∈ N. A number n is called k-perfect if σ (n) = kn. It is not known if there are
infinitely many k-perfect numbers for any k. The following are the known 3-perfect numbers: 120,
672, 523776, 459818240, 1476304896, 51001180160. Show that they are indeed 3-perfect.
Exercise 5.9 Prove Gottschalck’s theorem: Let n be a k-perfect number such that 2 divides n
precisely m times. Then 2m (2m+1 − 1) divides kn.
6. Sums of squares
In this chapter we shall investigate questions concerning sums of squares, but before we dive
into this we make a short digression into more general sums of powers. We start by the trivial
observation that the number of l’th powers less than or equal to x is at essentially x1/l namely
exactly the integers ml with m ≤ x1/l . Hence the set of l’th powers is very sparse in the set of
integers. By comparison the prime number theorem (1.4) implies that there are many more primes,
than l’th powers.
Theorem 6.1.1 — Lagrange. Every natural number can be written as a sum of at most 4
squares.
We will prove this theorem in Section 6.1.5 with a much more recent proof using geometric number
theory.
Let k ∈ N. Let
Sk = {n ∈ Z|n = x12 + · · · + xk2 , xi ∈ Z}.
Hence Sk denotes the subset of N ∪ {0} of integers that can be written as a sum of at most k squares.
We clearly have S1 ⊆ S2 ⊆ S3 ⊆ · · · . Proving Lagrange’s theorem 6.1.1 is equivalent to proving that
S4 = N ∪ {0}. We will also see that Si ̸= N ∪ {0} for i = 1, 2, 3. Hence 3 squares does not suffice.
Before giving a proof of Lagrange’s theorem 6.1.1 we describe explicitly S1 , S2 , S3 .
6.1.1 Squares
The set S1 is simply the set of squares
Notice also that the proof gives a way of finding the representation of s1 s2 as the sum of two
squares since
(a21 + b21 )(a22 + b22 ) = |z1 z2 |2 = |z1 z2 |2 = |(a1 a2 + b1 b1 ) + i(−a1 b2 + b1 a2 )|2 (6.2)
2 2
= (a1 a2 + b1 b2 ) + (−a1 b2 + b1 a2 ) (6.3)
2 2
■ Example 6.1 We have 8 = 22 + 22 and 10 = 32 + 12 . But then 80 = 8 · 10 = |2 + i2| |3 + i| =
|4 + 8i|2 = 42 + 82 . ■
The fact that S2 is closed under multiplication makes it natural to consider which primes can be
written as a sum of two squares. Clearly 2=1+1 is a sum of two squares.
Proof. By Theorem 4.2.1 (ii) we have −1 (p−1)/2 = 1 so there exist u, r ∈ Z such that
p = (−1)
−1 = u2 − rp. By division modulo p we may assume 0 < u ≤ p − 1 which forces
u2 + 1 p2 + 1 − 2p + 1
0<r= ≤ < p.
p p
Consider
m = min{n ∈ N|np ∈ S2 }.
By the above considerations the sum of which we are taking the minimal element is non-empty and
the minimum is strictly less than p. We claim that m = 1 from which it follows that p ∈ S2 .
To see that m = 1 we write mp = a21 + b21 . We can choose representatives a2 for a1 mod m and
b2 for b1 mod m satisfying |a2 | , |b2 | ≤ m/2. Then
But
a1 a2 +b1 b2 = a21 +b21 = mp = 0 (mod m), and −a1 b2 +b1 a2 = −a1 b1 +b1 a1 = 0 (mod m)
a1 a2 + b1 b2 2 −a1 b2 + b1 a2 2
+ = sp.
m m
But since s < m this contradicts m being the smallest integer such that mp can be written as a sum
of 2 squares. This finishes the proof. ■
R There are many other proofs of this theorem. Zagier found a delightful one-sentence proof,
see [Zag90].
Theorem 6.1.4 A natural number n is a sum of 2 squares if and only if every p = 3 (mod 4)
has an even (possibly zero) exponent in the prime factorization of n.
Proof. Assume that every p = 3 (mod 4) has an even (possibly zero) exponent in the prime
factorization of n. Then n = 2e pe11 · · · pekk m2 where pi = 1 (mod 4). Since 2 = 12 +12 , m2 = m2 +02 ,
and pi = a2i + b2i by Theorem 6.1.3, we find – since S2 is closed under multiplication – that n is the
sum of two squares.
Assume that n = x2 + y2 . If gcd(x, y) = d > 1 then d 2 | n and we consider n/d 2 = (x/d)2 +
(y/d)2 . We notice that every prime factor of n/d 2 comes with the same parity as in n. In particular
any prime which divides n an odd number of times will still be a divisor of n/d 2 . Hence we can
assume, without loss of generality that gcd(x, y) = 1.
Assume that p | n. Then x2 + y2 = 0 (mod p), i.e. x2 = −y2 (mod p). We notice that
gcd(y, p) = 1 for if p | y then also p | x, but we have assumed that gcd(x, y) = 1. Hence us-
ing Bezout we may find z ∈ Z such that yz = 1 (mod p), and it follows that (zx)2 = z2 (−y2 ) = −1
(mod p). But hence −1 is a quadratic residue modulo p this implies, by Theorem 4.2.1 (ii), that
p = 1 (mod 4), or p = 2. Hence no prime p = 3 (mod 4) can occur an odd number of times in
the prime factorization of n, which finishes the proof. ■
56 Chapter 6. Sums of squares
We now know exactly which natural numbers are in S2 . But we can go even further and count
the number of ways a number can be written as a number of 2 squares, i.e. we want to understand
We see that r2 (1) = #{(±1, 0), (0, ±1)} = 4, r2 (2) = #{(±1, 1), (±1, −1)} = 4. From Theorem
6.1.4 we see that r2 (n) = 0 if any p = 3 (mod 4) appears an odd number of times in the prime
factorization of n.
To find a more explicit expression for r2 (n) it is convenient to look at the Gaussian integers.
d : R\{0} → N ∪ {0}
satisfying
(i) d(ab) ≥ d(b) for a, b ̸= 0.
(ii) for all a, b ∈ R with b ̸= 0 there exist q, r ∈ R such that
The function d is called a Euclidean function on R. Note that the notion is clearly modelled on the
classical case of Theorem 1.2.1, but there is also subtle differences. For instance the remainder
need not be unique. Also the inequality (i) is in a certain sense superfluous. If an integral domain
admits a d-function satisfying only (ii) then it also admits another d-function satisfying both (i) and
(ii).
Proposition 6.1.5 Let R be a Euclidean ring with Euclidean function d. Then for a, b non-zero
elements d(ab) = d(b) if and only if a ∈ R× .
Proof. If a ∈ R× there exist c ∈ R such that ac = 1. But by (i) we have d(b) = d(abc) ≥ d(ab) and
d(ab) ≥ d(b), so d(ab) = d(b).
If d(ab) = d(b) with a, b ̸= 0 we claim that a is a unit. Assume not: We can divide b by ab and
find b = qab + r with r = 0 or d(r) < d(ab) i.e.
Since a is not a unit (1 − qa) is non-zero so r cannot be zero. Hence we have d(b(1 − qa)) < d(b).
But at the same time (i) gives d(b(1 − qa)) ≥ d(b) which is a contradiction. ■
Corollary 6.1.6 Let R be a Euclidean domain with Euclidean function d. Then d(a) = d(1) if
and only if a is a unit, i.e a ∈ R× .
Proof. If u is a unit then so is its inverse u−1 . Using Proposition 6.1.5 we find d(1) = d(u−1 u) =
d(u). If d(u) = d(1) then by (ii) there exist q, r such that 1 = qu + r. We claim that r = 0 which
shows that u is a unit. For if not then d(r) < d(u) = d(1). But at the same time d(r) = d(r ·1) ≥ d(1)
by (i) which is a contradiction. ■
6.1 Sums of squares 57
Proposition 6.1.7 The Gaussian integers Z[i] is a Euclidean ring with Euclidean function d(z) = zz.
Proof. We have already discussed why Z[i] is an integral domain. We verify easily that d(ab) =
|ab|2 = |a|2 |b|2 ≥ d(b) i.e. (i) is satisfied.
Let a, b ∈ Z[i] with b ̸= 0 and consider the complex number z = a/b = x + iy with x, y ∈ R.
Choose n1 , n2 ∈ Z such that |n1 − x| ≤ 1/2 and |n2 − y| ≤ 1/2. Let q = n1 + in2 . If we let r = a − qb
then either r = 0 or d(r) = |a − qb|2 = |b|2 |a/b − q|2 = |b|2 |x + iy − (n1 + in2 )|2 < |b|2 = d(b)
which verifies (ii). ■
In a Euclidean ring R greatest common divisors exist (but are generally not unique) and can be
found in the same way as in the Euclidean algorithm. Also the “obvious” analogue of Bezout’s
identity is true. (See [DF04, Theorem 4 p. 275]) and using these tools one can prove that in a
Euclidean ring we have unique factorization (For a proof see [DF04, Theorem 14 p. 287], or go
through the relevant parts of Chapter 1 making the appropriate changes.)
Theorem 6.1.8 Let R be a Euclidean domain. Then every non-zero element r ∈ R which is
not a unit can be written as a finite product of irreducibles of R. This product is unique up to
permutation and multiplication of units.
We want to use this theorem to describe r2 (n). To this end we need to find all irreducibles of Z[i].
Proposition 6.1.9 The irreducible elements of Z[i] are precisely the elements associated to one of
the following:
(i) π = 1 + i.
(ii) π = x + iy with x2 + y2 = ππ = p = 1 (mod 4), x > |y| > 0.
(iii) q = 3 (mod 4).
Here p, q are primes in N.
Proof. The numbers in (i) and (ii) are irreducible since d(π) is prime: If π = ab is a factorization
then d(π) = d(a)d(b) which is prime, which implies that d(a) or d(b) is 1. But then either a or b
is a unit which means that π is irreducible.
An element q as in (iii) is irreducible since a non-trivial factorization q = ab would imply
q2 = d(q) = d(a)d(b). But since d(a), d(b) ̸= 1 this is only possible if q = d(b) = d(a) = d(x +
iy) = x2 + y2 . But this contradicts Theorem 6.1.4.
58 Chapter 6. Sums of squares
This shows that all the listed elements are irreducible. To see that every irreducible is associated
to one on the list we argue as follows: Let ρ be any irreducible element. We notice that if ρ is
irreducible then ρ is irreducible also, since any non-trivial factorization ρ = ab gives a non-trivial
factorization ρ = ab.
We have d(ρ) = ρρ ∈ N, which we can factor into primes in N. Now we can factor all primes
in N into irreducible in Z[i] of the listed form (Note that 2 = −i(1 + i)(1 + i)). Hence using the
uniqueness of factorization we see that ρ is associated to one of the listed irreducible. ■
Using the above properties of Z[i] we can determine r2 (n) for any n ∈ N:
e e f f
Theorem 6.1.10 Let n ∈ N with prime factorization n = 2e p11 · · · pkk q11 · · · ql l where pi = 1
(mod 4) and q j = 3 (mod 4). Then
0
if f j is odd for some j = 1, . . . , l
r2 (n) = k
e1 ek
4τ(p 1 · · · pk ) = 4 ∏(ei + 1) otherwise.
i=1
k l
f
n = (−i)e (1 + i)2e ∏ πiei πi ei ∏ q j j .
i=1 j=1
To count the factors z of n satisfying zz = n we notice that any factor of n must factor in irreducibles
c
as z = u(1 + i)a ∏ki=1 πiai πi bi ∏lj=1 q j j for some 0 ≤ a ≤ 2e, 0 ≤ ai , bi ≤ ei , 0 ≤ c j ≤ f j . so we have
k l
2c j
n = zz = (−i)a (1 + i)2a ∏ πiai +bi πi ai +bi ∏ q j
i=1 j=1
and that every such expression is indeed a factor. Since there are 4 possible choices of units this
implies that
k
r2 (n) = 4#{(a1 , . . . , ak )|0 ≤ ai ≤ ei } = 4 ∏(1 + ei ).
i=1
■ Example 6.4 Let us find all ways to write 221 as a sum of two squares. Note that 221 = 13 · 17
and 13, 17 = 1 (mod 4). We easily find that 13 = 32 + 22 = (3 + 2i)(3 − 2i), and 17 = 42 + 12 =
(4 + i)(4 − i) By Theorem 6.1.10 we find that r2 (221) = 4(1 + 1)(1 + 1) = 16. Indeed the proof
shows that 221 = x2 + y2 if and only if z = x + iy is of the form z = u(3 + 2i)a1 (3 − 2i)1−a1 (4 +
6.1 Sums of squares 59
i)a2 (4 − i)1−a2 with 0 ≤ ai ≤ 1. This means that z = u(3 ± 2i)(4 ± i) with all possible choices of ±.
This corresponds to z = u(10 ± 11i) or z = u(14 ± 5i) which gives the 16 representations
221 = (±10)2 + (±11)2 = (±11)2 + (±10)2 = (±14)2 + (±5)2 = (±5)2 + (±14)2 ,
with all possible choices of ±. ■
■ Example 6.5 We find all ways to write 4168 = 23 521 as a sum of two squares. Note that 521 = 1
(mod 4) so Theorem 6.1.10 gives r2 (4168) = 4(1 + 1) = 8 corresponding to factors of n of the type
z = u(1 + i)3 (20 ± i11) = u(−62 + 18i) or u(−18 + 62i) corresponding to the 8 representations
4168 = (±62)2 + (±18)2 = (±18)2 + (±62)2
with all possible choices of ±. ■
Note that any square is either 0,1,4 modulo 8 so if n is a sum of 3 square then n ̸= 7 (mod 8).
If n = 0 (mod 4) then since a square is either 0 or 1 modulo 4 we must have that all three
squares are 0 modulo 4. But this implies that if n = x2 + y2 + z2 then x,y,z are divisible by 2,
so n/4 = (x/2)2 + (y/2)2 + (z/2)2 ∈ S3 . We have shown that if n ∈ S3 then n ̸= 7 + 8k and if
n = 4b ∈ S3 then b ∈ S3 . It follows that S3 ⊆ {n ∈ N ∪ {0}|n ̸= 4e (8k + 7)}.
Showing the opposite inclusion is much trickier, and we will not give the proof. See e.g.
[Ank57] for a relatively simple proof, which in principle can be understood using the techniques of
section 6.1.5.
Proof. This follows immediately from following identity which is straightforward to verify:
(a21 + b21 + c21 + d12 )(a22 + b22 + c22 + d22 ) =(a1 a2 − b1 b2 − c1 c2 − d1 d2 )2
+ (a1 b2 + b1 a2 + c1 d2 − d1 c2 )2
+ (a1 c2 − b1 d2 + c1 a2 + d1 b2 )2
+ (a1 d2 + b1 c2 − c1 b2 + d1 a2 )2 .
1 There is some disagreement in the literature as to whether Legendre or Gauss gave the first correct proof of this
result.
60 Chapter 6. Sums of squares
v2
v1
Analogous to the way that Lemma 6.1.2 can be proved using complex numbers, this lemma may be
proved using quaternions. See e.g. [JJ98, Sec 10.5]. ■
Minkowski theory
Definition 6.1.2 A (full) lattice Λ in Rn is a subset Λ ⊆ Rn satisfying
Λ = {α1 v1 + · · · αn vn |αi ∈ Z}
where (v1 , . . . , vn ) is a basis for Rn . The set (v1 , . . . , vn ) is also called a basis for Λ.
Elements v, w ∈ Rn
are called equivalent modulo Λ if v − w ∈ Λ. It is straightforward to verify
that this is indeed an equivalence relation, and we shall write v ∼Λ w.
Definition 6.1.3 If (v1 , . . . , vn ) is a basis for a lattice Λ then
FΛ = {t1 v1 + · · · + tn vn |0 ≤ ti < 1}
We note that an alternative way of formulating Lemma 6.1.13 is that we can write Rn as a
disjoint union
[
Rn = (FΛ + l)
l∈Λ
1. Lebesgue measure is σ -additive, i.e. if {An }∞n=1 are disjoint measurable sets then their union
is measurable and vol(∪n An ) = ∑n vol(An ).
2. Lebesgue measure is translation invariant, i.e if A is measurable and l ∈ Rn then A + l is
measurable and vol(A + l) = vol(A).
3. ( (2π)m
π n/2 if n = 2m
vol({x ∈ Rn | ∥x∥ < r}) = rn = rn · n(n−2)···4·2
2m+1 π m
Γ(n/2 + 1) n·(n−2)···3·1 if n = 2m + 1.
4. The measure of the parallelepiped spanned by v1 , . . . , vn equals the absolute value of de-
terminant of the matrix whose columns are v1 , . . . , vn , i.e. vol({∑ni=1 ti vi |0 ≤ ti < 1}) =
|det(v1 v2 · · · vn )|.
where Xl = X ∩ (FΛ + l) and the last union is disjoint. Note that if x ∈ Xl then φΛ (x) = x − l, so
φΛ (Xl ) = Xl − l. By translation invariance of Lebesgue measure we have vol(φΛ (Xl )) = vol(Xl ). If
φΛ |X were injective then by σ -additivity (and the fact that Λ is countable) we have
[
vol(X) = vol( Xl ) = ∑ vol(Xl ) = ∑ vol(φΛ (Xl )) ≤ vol(FΛ )
l∈Λ l∈Λ l∈Λ
Note that Lemma 6.1.14 has some resemblance to Dirichlet’s pigeon-hole principle, which
can be stated formally as follows: a map φ : A → B between finite sets A, B cannot be injective
if #A > #B . Lemma 6.1.14 shows that a set X cannot be mapped injectively to set with smaller
volume if the map does not decrease volume.
Definition 6.1.4 A set X ⊆ Rn is called
1. centrally symmetric if −x ∈ X when x ∈ X,
2. convex if when v, w ∈ X then tv + (1 − t)w ∈ X for every 0 ≤ t ≤ 1.
■ Example 6.7 A ball centered at the origin is both centrally symmetric and convex. ■
Proof. Consider the lattice 2Λ = {2v|v ∈ Λ}. If Λ has basis v1 , . . . , vn then 2Λ has basis 2v1 , . . . , 2vn
so 2Λ has fundamental domain vol(2FΛ ) = |det(2(v1 v2 · · · vn ))| = 2n |det(v1 v2 · · · vn )| = 2n vol(FΛ ),
so vol(X) > vol(2FΛ ). It follows from Lemma 6.1.14 that φ2Λ |X is not injective so there exist
different v, w ∈ X such that φ2Λ (v) = φ2Λ (w), i.e. v = w0 + l1 , w = w0 + l2 with w0 ∈ F2Λ and
li ∈ 2Λ. Clearly l1 ̸= l2 since otherwise v = w.
Since X is centrally symmetric −w ∈ X. Since X is convex 12 v + (1 − 12 )(−w) ∈ X.
But
1 1 v − w w0 + l1 − (w0 + l2 ) l1 − l2
v + (1 − )(−w) = = = ∈ Λ\{0},
2 2 2 2 2
which finishes the proof. ■
Note that all three requirements of Theorem 6.1.15 are crucial (See Exercise 6.2).
We are now ready to prove the main theorem of this section:
62 Chapter 6. Sums of squares
Proof. Clearly 2 = 12 + 12 + 02 + 02 , so it suffices to prove the claim for odd primes. We note that
by Corollary 4.1.3 there are (p − 1)/2 squares in F× p . Clearly 0 is also a square so we find that
#K = #{z ∈ F p |z = u2 , u ∈ F p } = (p + 1)/2
#L = #{z ∈ F p | − (1 + z) = v2 , v ∈ F p } = (p + 1)/2
Since there are p elements in F p there must be at least one z lying in K ∩ L, i.e. a z such that z = u2
(mod p) and −(1 + z) = v2 (mod p). Adding these equations we find that
−1 = u2 + v2 (mod p)
Consider now
v −u 0 p
corresponding fundamental domain equals
1 0 0 0
0 1 0 0 2
vol(FΛu,v ) = det u v p 0 = p .
v −u 0 p
√
Consider the set X = {x ∈ R4 | ∥x∥ < 2p} which has volume
p (2π)2
vol(X) = ( 2p)4 = 2π 2 p2 > 24 p2 = 24 vol(FΛu,v ).
4·2
By Minkowski’s theorem 6.1.15 there exist a non-zero lattice element 0 ̸= l ∈ Λu,v ∩ X. This l
√
satisfies 0 < ∥l∥2 < ( 2p)2 = 2p, and at the same time
We note that 0, 1 can clearly be written as a sum of at most 4 squares, so Theorem 6.1.16 and
Lemma 6.1.12 implies Lagrange’s theorem 6.1.1.
v −u 0 p
Exercise 6.4 Find the factorization into irreducibles of Z[i] of the numbers 221 = 13 · 17, 234 =
2 · 32 · 13, 702 = 2 · 33 · 13, and 16660 = 22 · 5 · 72 · 17.
Exercise 6.5 Compute r(n) and find all representations of n as a sum of two squares when n = 221,
n = 234, n = 702, and n = 16660.
Exercise 6.6 Show that the average value of r2 (n) equals π i.e. that
1 n
∑ r2 (m) → π as n → ∞.
n m=1
Finding the correct rate of convergence is a very difficult unsolved problem in analytic number
theory (Gauss’ circle problem).
7. Continued fractions and approximations
Continued fractions provide a very useful tool in number theory. It is a tool with many applications,
and we shall here look at a few of them
1
a0 + (7.1)
1
a1 +
1
a2 +
1
···+
an
where a0 ∈ R, and ai > 0 for i ∈ N. We denote this by [a0 , a1 , . . . , an ]. We call n the length of the
continued fraction. The continued fraction is called simple if ai ∈ Z.
We note that for 0 < j < n − 1
1 1 3 51
[7, 2, 3] = 7 + = 7+ = 7+ =
1 7 7 7
2+
3 3
■
7.1 Continued fractions 65
Fix an n’th order finite continued fraction x = [a0 , . . . , an ]. For 0 ≤ m ≤ n the continued
fraction [a0 , . . . , am ] is called the m’th partial convergent of x. For m = −2, . . . , n define numbers
pm = pm (a0 , . . . , am ), qm = qm (a0 , . . . , am ) by
p−2 0 p−1 1 pm pm−1 pm−2 am
= , = , = when m ≥ 0 (7.4)
q−2 1 q−1 0 qm qm−1 qm−2 1
p
Proposition 7.1.1 [a0 , . . . , am ] = qmm .
Hence for simple finite continued fractions x the procedure (7.4) allows us to express x as a
fraction.
p0 p−1 a0 + p−2 a0
= = = [a0 ]
q0 q−1 a0 + q−2 1
Assuming the statement holds for all continued fraction of lengths m − 1 we find
Proposition 7.1.2
Corollary 7.1.3 If [a0 , . . . , an ] is simple then for every n we have that pn , qn are relatively prime
integers.
Proof. It is clear from the definition of pn , qn that they are integers with qn ≥ 1. If they have
a common factor g Proposition 7.1.2 implies that g | (−1)n+1 which implies that g = 1, and we
conclude that pn , qn are relatively prime. ■
It is obvious that a simple finite continued fraction x = [a0 , . . . , an ] is a rational number and according
to Proposition 7.1.1 and Corollary 7.1.3 we have that x = pn /qn where the fraction is written in
lowest terms. In fact we can write any rational number as a continued fraction:
Lemma 7.1.4 Every rational number can be written as a simple continued fraction and every
simple continued fraction is a rational number.
Proof. It is clear from the above that every simple continued fraction is a rational number. Given a
rational number q = a/b where we may assume that a,b are relatively prime and b > 1 (If b = 1 we
have q = [a] and the statement is clear.) we use Euclid’s algorithm and find
a =a0 b + r1
b =a1 r1 + r2
r1 =a2 r2 + r3
..
.
rn−1 =an rn + 0
where ai ∈ Z and ai > 0 for i > 0, and r1 > r2 > . . . > rn = 1. We find that
a r1 1
q= = a0 + = a0 +
b b b
r1
1 1
= a0 + = a0 +
r2 1
a1 + a1 +
r1 r1
r2
1 1
a0 + = · · · = a0 + = [a0 , a1 , . . . , an ]
1 1
a1 + a1 +
r3 1
a2 + a2 +
r2 1
···+
an
which shows that q can be written as a simple continued fraction. ■
We will now show that [a0 , a1 , . . .] is well-defined if a0 ∈ Z and ai ∈ N. We call such infinite
continued fractions simple. For simple continued fractions we have q1 < q2 < · · · . Furthermore
qn = qn−1 an + qn−2 ≥ 2qn−2 . It follows, since q0 = 1 and q1 ≥ 1, that for n ≥ 1
qn ≥ 2(n−1)/2 (7.6)
Proof. We start by showing that c2n and c2n+1 converge as n goes to infinity. Note that qn ≥ 1. By
propositions 7.1.1 and 7.1.2 we have for n ≥ 2
pn pn−2 (−1)n an
cn − cn−2 = − = (7.7)
qn qn−2 qn qn−2
from which it follows that c2n is strictly increasing, i.e. c2(n−1) < c2n , and c2n+1 is strictly decreasing.
From the same propositions we find that
pn pn−1 (−1)n−1
cn − cn−1 = − = (7.8)
qn qn−1 qn qn−1
It follows that c2n < c2n−1 < c1 , and c2n+1 > c2n > c0 . Hence c2n is an increasing sequence which
is bounded from above and hence converges to some α0 . Similarly c2n+1 is a decreasing sequence
bounded from below and hence converges to some α1 . To prove α0 = α1 we can use (7.8) and (7.6)
to conclude that
This associates to every real number a finite or infinite sequence of integers a0 , a1 , . . ., with ai > 0
for i ≥ 1. We note if tn ̸= 0 then a small induction argument (See Exercise 7.4) shows that
We note also that if x is rational then by the proof of Lemma 7.1.4 the continued fraction procedure
gives a finite sequence and the corresponding continued fraction equals x.
68 Chapter 7. Continued fractions and approximations
√ √ √ √
■ Example 7.3 Let x =
√ 2. We have 2√= 1 + ( √ 2 − 1) and 0 < √ 2 − 1 < 1, so a0 = 1 and
t√0 = 2 − 1. Continuing we find that 1/( 2 − 1) = 2 + 1 = 2 + ( 2 − 1) so a1 = 2 and t1 =
2 − 1. Since t1 = t0 we find a2 = a1 and t2 = t1 . So the continued fraction procedure gives in this
case [1, 2, 2, 2, . . .], which we will also denote by [1, 2] where the overline denotes that it continues
periodically with 2’s. We already know that this continued fraction is convergent.
√ Call
√ the limit y.
Using the periodicity we see that y − 1 = 1/(2 + (y − 1)), leading to y = 2. Hence 2 = [1, 2]. ■
Theorem 7.1.6 Let x ∈ R, and let [a0 , a1 , . . .] be the continued fraction coming from the contin-
ued fraction procedure. Let cn = qpnn be the partial convergents and assume that the length of the
continued fraction is at least n + 1. Then
1
|x − cn | ≤ (7.10)
qn qn+1
If the length of the continued fraction is exactly n + 1 we have x = [a0 , a1 , . . . an+1 ]. In particular
we have in both the finite and infinite case that x = [a0 , a1 , . . .].
from which (7.10) follows using qntn+1 ≥ 0 and Proposition 7.1.2. If tn+1 = 0 we see from the
above computations that x = pn+1 /qn+1 = [a0 , . . . , an+1 ]. If the continued fraction is infinite then
(7.6) implies that cn → x as n → ∞. ■
We note that – in the above proof – if tn+1 > 0, then we have that qntn+1 > 0 and we find that, in
this case, Eq. (7.10) is a strict inequality. Together with the following lemma, this shows that for
irrational numbers the inequality in Eq. (7.10) can be strengthened to a strict inequality.
Lemma 7.1.7 The continued fraction procedure terminates after finitely many steps if and only if
x is rational.
Using Theorem 7.1.6 and (7.6) we see that the √continued fraction procedure gives a rational
n
sequence cn which converges as fast as |x − cn | ≤ 2/2 .
Quadratic irrationals
In Section 4.2.1 we studied algebraic numbers. Given an algebraic number ζ ∈ C there exist a
polynomial with integer coefficients of minimal degree having ζ as a root. We denote this number
the degree of ζ . Irrational numbers of degree 2 are called quadratic irrationals. Hence quadratic
irrationals are non-rational numbers which are roots in integer polynomials of degree 2.
√
■ Example 7.4 The number ζ1 = (1 + 5)/2
√is a quadratic irrational since it is not rational, and it
is a root of x2 − x − 1 = 0. The number ζ2 = 2 is a quadratic irrational since it is not rational, and
it is a root of x2 − 2 = 0. ■
Definition 7.1.1 An infinite continued fraction x = [a0 , a1 , . . .] is called periodic if there exist
N, h such that
an = an+h
7.1 Continued fractions 69
for all n > N. The minimal choice of h is called the period of x. We write x = [a0 , a1 , . . . , aN , aN+1 , . . . , aN+h ].
√ √
■ Example 7.5
√ For x = 2 we have (compare Example 7.3) N = 0 and h = 1 i.e. 2 = [1, 2].
1.
2. For x√= (1 + 5)/2 we have (compare Example 7.2) N = −1 (or N = 0) and h = 1. i.e.
(1 + 5)/2 = [1].
3. If x = [1, 2, 3] then x = [1, 2, 3, x], i.e.
1 1 3x + 1
x = 1+ = 1+ = 1+
1 x 7x + 2
2+ 2+
1 3x + 1
3+
x
from which it follows that (7x + 2)x = 10x + 3, i.e. 7x2 − 8x − 3 = 0, which√means that x
is a quadratic irrational. Solving the quadratic equation we find that x = 4+7 37 (the other
solution is negative, and x is clearly positive)
■
The next theorem says that what we saw in Example 7.5 holds in general:
Theorem 7.1.8 An infinite simple continued fraction α = [a0 , a1 , . . .] is periodic if and only if α
is a quadratic irrational.
with β = [aN+1 , . . . , aN+h ]. Note that by periodicity β = [aN+1 , . . . , aN+h , β ] and by Proposition
7.1.1 we have
p′h−1 β + p′h−2
β=
q′h−1 β + q′h−2
where p′i = pi (aN+1 , . . . , aN+1+i ), q′i = qi (aN+1 , . . . , aN+1+i ) ∈ Z from which it follows that β is a
root of the degree 2 equation
Hence the degree of β is at most 2. If it is 1 then β would be rational, but then its continued
fraction would be finite (Lemma 7.1.4) which it is not. It follows that β is a quadratic irrational and
that α = [a0 , a1 , . . . , aN , β ]. Now we observe that β ′ = k + 1/β is a quadratic irrational if β is a
quadratic irrational and k ∈ Z: Clearly β ′ cannot be rational and if β is a root of f (x) = ax2 + bx + c
with a, b, c ∈ Z and a, c ̸= 0 then β ′ is a root of f ′ (x) = c(x − k)2 + b(x − k) + a since
f ′ (β ′ ) = cβ −2 + bβ −1 + a = β −2 (c + bβ + aβ 2 ) = β −2 f (β ) = 0.
0 = aα 2 + bα + c. (7.11)
70 Chapter 7. Continued fractions and approximations
By Proposition 7.1.1
pn−1 rn + pn−2
α= . (7.12)
qn−1 rn + qn−2
By combining (7.11) and (7.12) we see, after a small computation, that
We claim that there are only finitely many possible values of An . If this is so then there are at most
finitely many possible values of Cn since An−1 = Cn . But then then there are also only finitely many
values of Bn since B2n = b2 − 4ac + 4AnCn .
This means that all rn ’s are roots of one of a list of finitely many polynomials, and hence there
can only be finitely many rn ’s. By the previous discussion this would conclude the proof.
To see that there are only finitely may possible values of An we recall from Theorem 7.1.6 that
pn−1 1
α− <
qn−1 qn−1 qn
−1 −1
so |qn−1 α − pn−1 | < q−1
n < qn−1 i.e. pn−1 = qn−1 α + δn qn−1 where |δn | < 1. Inserting this in the
definition of An we see that
An = a(qn−1 α + δn q−1 2 −1 2
n−1 ) + b(qn−1 α + δn qn−1 )qn−1 + cqn−1
= (aα 2 + bα + c)q2n−1 + 2aαδn + aδn2 q−2 2 −2
n−1 + bδn = 2aαδn + aδn qn−1 + bδn
It follows that |An | ≤ 2 |aα| + |a| + |b| which shows that there are at most finitely many possible
values of An which finishes the proof.
■
R Dirichlet’s approximation theorem 7.2.1 is the beginning of an entire research topic called
Diophantine approximation. For a much more thorough introduction check [Cas72; Ste16].
The proof we have presented here is constructive, as it allows us to use the continued fraction
procedure to actually find the number numbers a, b such that a/b approximates α to the given
precision. It is possible to give another simple (non-constructive) proof which uses only
Dirichlet’s pigeon-hole principle.
Corollary 7.2.2 For every α ∈ R there exist infinitely many pairs (a, b) ∈ Z2 such that
a 1
α− < 2. (7.14)
b b
Proof. If α ∈ Q the result is clear, since we are not requiring a, b to be coprime: If α = p/q any
(a, b) = n(p, q) with n ∈ N will do.
/ Q then we can choose any n = N1 and since b1 (N11 +1) < b12 if 0 < b1 ≤ N1 we find
If α ∈
1
the first pair (a1 , b1 ) by using Theorem 7.2.1. Assume now that there are k different pairs
(a1 , b1 ), . . . , (ak , bk ) satisfying (7.14).
Since α is not rational we have α − abii > 0, so we may choose Nk+1 ∈ N such that
1 ai
< min bi α −
Nk+1 i=1,...,k bi
Applying Theorem 7.2.1 with n = Nk+1 we get a pair (ak+1 , bk+1 ) such that
ak+1 1 1
α− < < .
bk+1 (Nk+1 + 1)bk+1 b2k+1
If (ak+1 , bk+1 ) = (a j , b j ) for some j ≤ k then
1 ai aj ak+1 1
< min bi α − ≤ bj α − = bk+1 α − ≤
Nk+1 i=1,...,k bi bj bk+1 Nk+1 + 1
which is impossible, i.e. (ak+1 , bk+1 ) is different from the first k pairs, and the proof is finished. ■
a C
α− ≥ 2
b b
72 Chapter 7. Continued fractions and approximations
for every ab ∈ Q.
• well approximable if there exists ε > 0 such that
a 1
α− ≤ 2+ε
b b
for infinitely many ab ∈ Q.
• normally approximable if it is not well approximable nor badly approximable.
Maybe surprisingly we can determine if α is badly approximable by looking at its continued
fraction expansion.
Theorem 7.2.3 A number α ∈ R\Q is badly approximable if and only if the sequence a0 , a1 , . . . ,
produced by the continued fraction procedure is bounded.
Theorem 7.2.6 The set of normally approximable numbers has full Lebesgue measure.
It follows that the generic irrational number is normally approximable. The next theorem is deep,
and in 1958 it earned Klaus Roth the Fields medal. The proof is well beyond the scope of these
notes:
Theorem 7.2.7 — Roth. No algebraic irrational number is well approximable.
R Note that it follows from Roth’s theorem 7.2.7 that all well-approximable numbers are
transcendental (note that there are much easier ways to prove this). It is not known if
algebraic numbers of degree d > 2 are normally approximable or badly approximable, but it
is conjectured that they are all normally approximable.
In the 1980s Oesterlé and Masser suggested the following important conjecture:
For an integer n we define the radical, rad(n) as the product of all distinct prime factors of n i.e.
rad(n) = ∏ p.
p|n
The ABC-conjecture mixes the additive structure (by relating a, b, c by a + b = c) and the
multiplicative structure (the radical of abc is defined through the factors of abc). In informal terms
the conjecture implies that if a and b are positive numbers which contain large powers of small
primes then c must contain small powers of larger primes.
■ Example 8.1 Here are a couple of randomly chosen a, b which contain large powers of small
primes, illustrating that their sum must have at least one small power of a large prime.
• a = 217 and b = 529 then c = a + b = 186264514923095834197 factors as
c = 72 · 3801316631083588453
c = 2 · 587 · 29406058061
c = 52523171 · 2758479286845241
■
74 Chapter 8. The ABC-conjecture
The ABC-conjecture is surprisingly powerful, and it solves a lot of other conjectures in number
theory, especially concerning diophantine equations. A long and extremely complex proposed
proof by Mochizuki is currently under scrutiny by the mathematical community.
For one application of the ABC-conjecture note the following:
Wiles and Taylor proved, in the mid 1990’ies Fermat’s last theorem, which we state below. The
proof was long using advanced machinery about elliptic curves, modular functions, etc.
xn + yn = zn (8.1)
has no solutions x, y, z ∈ N.
The case n = 3 is due to Euler, n = 4 is due to Fermat, and n = 5 is due to Dirichlet and Legendre.
The n = 6 case follows from the n = 3 case, and the n = 7 case was proved in 1839 by Lamé.
Using the ABC-conjecture we can get pretty close to a short proof of the rest of the theo-
rem. Given a solution (x, y, z) ∈ N3 we would get immediately from the ABC-conjecture since
rad(xn yn zn ) = rad(xyz) ≤ xyz ≤ z3 that
zn ≤ Cε z3(1+ε) . (8.2)
so Fermat’s last theorem is true for n > log(Cε )/ log(2) + 3(1 + ε).
If we know the ABC-conjecture with any exponent ε0 satisfying n − 3(1 + ε0 ) > 0 then by (8.2)
1/(n−3(1+ε0 ))
we have |z| ≤ Cε0 , i.e. given n there can only be finitely many z’s and hence only finitely
many solutions to Fermat’s equation (8.1) for that given n.
If the ABC-conjecture holds with ε = 1 and C = 1 we find zn ≤ z6 . In particular zn−6 ≤ 1
proves Fermat’s last theorem if n > 6. Note that the cases n = 3, 4, 5, 6 have been resolved much
earlier; See [Rib79].
Bibliography
[AKS04] Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. “PRIMES is in P”. In: Ann. of
Math. (2) 160.2 (2004), pp. 781–793 (cit. on p. 11).
[AGP94] W. R. Alford, Andrew Granville, and Carl Pomerance. “There are infinitely many
Carmichael numbers”. In: Ann. of Math. (2) 139.3 (1994), pp. 703–722 (cit. on p. 26).
[Ank57] N. C. Ankeny. “Sums of three squares”. In: Proc. Amer. Math. Soc. 8 (1957), pp. 316–
319 (cit. on p. 59).
[Apo76] Tom M. Apostol. Introduction to analytic number theory. Undergraduate Texts in
Mathematics. New York: Springer-Verlag, 1976, pp. xii+338 (cit. on p. 13).
[Bac90] Eric Bach. “Explicit bounds for primality testing and related problems”. In: Math.
Comp. 55.191 (1990), pp. 355–380 (cit. on p. 27).
[Bau15] Oswald Baumgart. The quadratic reciprocity law. A collection of classical proofs,
Edited, translated from the German, and with contributions by Franz Lemmermeyer.
Birkhäuser/Springer, Cham, 2015, pp. xiv+172 (cit. on p. 34).
[Bor+08] Peter Borwein et al., eds. The Riemann hypothesis. CMS Books in Mathematics/Ouvrages
de Mathématiques de la SMC. A resource for the afficionado and virtuoso alike.
Springer, New York, 2008, pp. xiv+533 (cit. on p. 51).
[Cas72] J. W. S. Cassels. An introduction to Diophantine approximation. Facsimile reprint of
the 1957 edition, Cambridge Tracts in Mathematics and Mathematical Physics, No. 45.
Hafner Publishing Co., New York, 1972, pp. x+169 (cit. on p. 71).
[DH76] Whitfield Diffie and Martin E. Hellman. “New directions in cryptography”. In: IEEE
Trans. Information Theory IT-22.6 (1976), pp. 644–654 (cit. on p. 28).
[DF04] David S. Dummit and Richard M. Foote. Abstract algebra. Third. Hoboken, NJ: John
Wiley & Sons Inc., 2004, pp. xii+932 (cit. on p. 57).
[Fen12] Shaoji Feng. “Zeros of the Riemann zeta function on the critical line”. In: J. Number
Theory 132.4 (2012), pp. 511–542 (cit. on p. 51).
76 Chapter 8. The ABC-conjecture
[Gra05] Andrew Granville. “It is easy to determine whether a given integer is prime”. In: Bull.
Amer. Math. Soc. (N.S.) 42.1 (2005), pp. 3–38 (cit. on p. 11).
[Har14] G. H. Hardy. “Sur les zéros de la fonction zeta(s) de Riemann.” French. In: C. R. Acad.
Sci., Paris 158 (1914), pp. 1012–1014 (cit. on p. 51).
[Hea86] D. R. Heath-Brown. “Artin’s conjecture for primitive roots”. In: Quart. J. Math. Oxford
Ser. (2) 37.145 (1986), pp. 27–38 (cit. on p. 22).
[Hec81] Erich Hecke. Lectures on the theory of algebraic numbers. Vol. 77. Graduate Texts in
Mathematics. Translated from the German by George U. Brauer, Jay R. Goldman and
R. Kotzen. New York: Springer-Verlag, 1981, pp. xii+239 (cit. on p. 37).
[Hil09] David Hilbert. “Beweis für die Darstellbarkeit der ganzen Zahlen durch eine feste
Anzahl nter Potenzen (Waringsches Problem)”. In: Math. Ann. 67.3 (1909), pp. 281–
300 (cit. on p. 53).
[Hoo67] Christopher Hooley. “On Artin’s conjecture”. In: J. Reine Angew. Math. 225 (1967),
pp. 209–220 (cit. on p. 23).
[Ivi85] Aleksandar Ivić. The Riemann zeta-function. A Wiley-Interscience Publication. The
theory of the Riemann zeta-function with applications. New York: John Wiley & Sons
Inc., 1985, pp. xvi+517 (cit. on p. 51).
[Jar14] Frazer Jarvis. Algebraic number theory. Springer Undergraduate Mathematics Series.
Springer, Cham, 2014, pp. xiv+292 (cit. on pp. 36, 37).
[JJ98] Gareth A. Jones and J. Mary Jones. Elementary number theory. Springer Undergraduate
Mathematics Series. London: Springer-Verlag London Ltd., 1998, pp. xiv+301 (cit. on
pp. 22, 24, 59, 60).
[KV92] A. A. Karatsuba and S. M. Voronin. The Riemann zeta-function. Vol. 5. de Gruyter
Expositions in Mathematics. Translated from the Russian by Neal Koblitz. Berlin:
Walter de Gruyter & Co., 1992, pp. xii+396 (cit. on p. 51).
[Mar77] Daniel A. Marcus. Number fields. Universitext. New York: Springer-Verlag, 1977,
pp. viii+279 (cit. on p. 37).
[Neu99] Jürgen Neukirch. Algebraic number theory. Vol. 322. Grundlehren der Mathematischen
Wissenschaften [Fundamental Principles of Mathematical Sciences]. Translated from
the 1992 German original and with a note by Norbert Schappacher, With a foreword by
G. Harder. Berlin: Springer-Verlag, 1999, pp. xviii+571 (cit. on p. 37).
[Rib79] Paulo Ribenboim. 13 lectures on Fermat’s last theorem. Springer-Verlag, New York-
Heidelberg, 1979, xvi+302 pp. (1 plate) (cit. on p. 74).
[Rie59] Bernhard Riemann. “Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse.”
In: Monatsberichte der Berliner Akademie (1859), pp. 671–680 (cit. on p. 50).
[RSA78] R. L. Rivest, A. Shamir, and L. Adleman. “A method for obtaining digital signatures
and public-key cryptosystems”. In: Comm. ACM 21.2 (1978), pp. 120–126 (cit. on
p. 30).
[Rou91] G. Rousseau. “On the quadratic reciprocity law”. In: J. Austral. Math. Soc. Ser. A 51.3
(1991), pp. 423–425 (cit. on p. 34).
[Sel42] Atle Selberg. “On the zeros of Riemann’s zeta-function”. In: Skr. Norske Vid. Akad.
Oslo I. 1942.10 (1942), p. 59 (cit. on p. 51).
[Sho94] Peter W Shor. “Algorithms for quantum computation: Discrete logarithms and factor-
ing”. In: Foundations of Computer Science, 1994 Proceedings., 35th Annual Symposium
on. IEEE. 1994, pp. 124–134 (cit. on pp. 12, 29).
77
[Sho97] Peter W. Shor. “Polynomial-time algorithms for prime factorization and discrete log-
arithms on a quantum computer”. In: SIAM J. Comput. 26.5 (1997), pp. 1484–1509
(cit. on p. 12).
[Ste09] William Stein. Elementary number theory: primes, congruences, and secrets. Under-
graduate Texts in Mathematics. A computational approach. Springer, New York, 2009,
pp. x+166 (cit. on p. 25).
[Ste16] Jörn Steuding, ed. Diophantine analysis. Trends in Mathematics. Course notes from the
summer school held at the University of Würzburg, July 21–26, 2014. Birkhäuser/Springer,
Cham, 2016, pp. xi+232 (cit. on p. 71).
[Tit86] Edward C. Titchmarsh. The theory of the Riemann zeta-function. Second. Edited
and with a preface by D. R. Heath-Brown. New York: The Clarendon Press Oxford
University Press, 1986, pp. x+412 (cit. on p. 51).
[Wal79] Michel Waldschmidt. Transcendence methods. Vol. 52. Queen’s Papers in Pure and
Applied Mathematics. Available at the authors webpage. Kingston, Ont.: Queen’s
University, 1979, 126 pp. (not consecutively paged) (cit. on p. 36).
[Zag90] D. Zagier. “A one-sentence proof that every prime p ≡ 1 (mod 4) is a sum of two
squares”. In: Amer. Math. Monthly 97.2 (1990), p. 144 (cit. on p. 55).
Index
Möbius function, 47
Möbius inversion, 47
man in the middle attack, 29
Mersenne primes, 15
Miller-Rabin primality test, 26
Minkowski’s theorem, 61
modular exponentiation, 27
multiplicative, 42
normally approximable, 72
partial convergent, 65
perfect number, 43
periodic continued fraction, 68
prime, 11
prime number theorem, 13
primitive root mod n, 20
pseudo-prime, 27
Pythagorean triples, 13
radical, 73
relatively prime, 9
Riemann hypothesis, 13, 51
Riemann zeta function, 13, 50
ring, 16
Roth’s theorem, 72
RSA, 30
skew field, 17