0% found this document useful (0 votes)
47 views79 pages

Main

These lecture notes provide an introduction to number theory, covering topics such as divisibility, primes, modular arithmetic, cryptography, quadratic reciprocity, and more. The course is structured over seven weeks and assumes prior knowledge of algebra and modular arithmetic. The notes aim to give a comprehensive overview of number theory while encouraging further exploration of the subject through additional texts.

Uploaded by

myvikass15
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
47 views79 pages

Main

These lecture notes provide an introduction to number theory, covering topics such as divisibility, primes, modular arithmetic, cryptography, quadratic reciprocity, and more. The course is structured over seven weeks and assumes prior knowledge of algebra and modular arithmetic. The notes aim to give a comprehensive overview of number theory while encouraging further exploration of the subject through additional texts.

Uploaded by

myvikass15
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 79

Introduction to number theory

Lecture notes 2025

Morten S. Risager
These are notes for the 7 week course “Introduction to Number Theory” at the University of
Copenhagen.
Cover photo kindly provided by Christian Rasmussen.

Please send comments and corrections to risager@math.ku.dk


Last compiled with LATEX 2ε on June 27, 2025.
Contents

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1 Divisibility and primes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6


1.1 Notation and basic properties 6
1.2 Euclidean division and greatest common divisor 6
1.3 Diophantine equations 9
1.3.1 Linear equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.4 Primes 11
1.4.1 The fundamental theorem of arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4.2 On the number of primes less than a given size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.3 Pythagorean triples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.5 Exercises for Chapter 1 15

2 Modular arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1 Basic structures 16
2.2 Modular arithmetic 17
2.3 Euler’s ϕ-function 18
2.3.1 The Chinese remainder theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.4 Primitive roots modulo n. 20


2.4.1 Roots of polynomials. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.4.2 Artin’s conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2.5 Local obstructions 23


2.6 Exercises for Chapter 2 23

3 Cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.1 Primality testing 25
3.1.1 Carmichael numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1.2 Miller-Rabin’s primality test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1.3 Modular exponentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4

3.2 Concrete cryptosystems 28


3.2.1 Diffie–Hellman . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2.2 RSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.3 Exercises for Chapter 3 31

4 Quadratic reciprocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.1 Quadratic residues and Legendre symbols 33
4.2 Quadratic reciprocity 34
4.2.1 Algebraic integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2.2 Gauss sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

4.3 Exercises for chapter 4 41

5 Arithmetic functions and Dirichlet series . . . . . . . . . . . . . . . . . . . . . . . 42


5.1 Arithmetical functions 42
5.1.1 Perfect numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.1.2 Convolution of arithmetical functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.1.3 The Möbius function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

5.2 Dirichlet series 47


5.3 Exercises for chapter 5 51

6 Sums of squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.0.1 Waring’s problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

6.1 Sums of squares 53


6.1.1 Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.1.2 Sums of two squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.1.3 Gaussian integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
6.1.4 Sums of three squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.1.5 Sums of four squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

6.2 Exercises for chapter 6 62

7 Continued fractions and approximations . . . . . . . . . . . . . . . . . . . . . . . 64


7.1 Continued fractions 64
7.1.1 Infinite continued fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

7.2 Approximating real numbers with rationals 70


7.3 Exercises for chapter 7 72

8 The ABC-conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Introduction

Number theory has a long history in mathematics. Indeed its problems and concepts have played
a formative role in many branches of mathematics. Even today it is a vibrant and active part of
modern mathematics, and it continues to offer new insights and stimulate the creation of new
mathematical subdisciplines and theories.
Historically there is a clear tendency that number theory has been driven by the desire to tackle
specific problems, e.g.: Does there exist positive integer solutions to the equation

x n + yn = zn

for n ≥ 3 (the answer is no)? Are there infinitely many twin primes (we do not know)? Trying to
answer such questions has often taken precedence, rather than generalizing the theory as much as
possible. This comment should not be taken as an indication that number theory does not use or
develop deep theories, but merely as statement of what motivates and drives many number theorists.
This text gives an introduction to the many facets of number theory, including tastes of its
algebraic, analytic, metric, Diophantine and geometric incarnations. We assume the reader has
taken a first course in algebra and has familiarity with groups as well as modular arithmetic.
The text is somewhat brief at points, and we strongly encourage the reader to dive into other
texts on number theory to get a more multifaceted view on the subject. Several are listed in the
bibliography and they all offer their own unique views on the topic.
1. Divisibility and primes

1.1 Notation and basic properties


We denote by
• N the set of natural numbers 1, 2, 3, . . .,
• Z the set of integers,
• Q the set of rationals,
• R the set of real numbers, and
• C the set of complex numbers.
If a, b ∈ Z we say that a divides b and write a | b if there exists c ∈ Z such that b = ac. We call a a
factor of b. If a does not divide b we write a ∤ b.
Proposition 1.1.1 Let a, b, c, d ∈ Z. Then
(i) if a | b and b | c then a | c.
(ii) if a | b and c | d then ac | bd.
(iii) if m ̸= 0 then a | b if and only if ma | mb.
(iv) if d | a and a ̸= 0 then |d| ≤ |a|.

Proof. See Exercise 1.5. ■

1.2 Euclidean division and greatest common divisor


We start by stating and proving a fundamental property of the integers that we learn already in
elementary school.

Theorem 1.2.1 — Euclidean division. Let a, b ∈ Z with b ̸= 0. Then there exist unique integers
q, r satisfying 0 ≤ r < |b| and a = bq + r.

Proof. The case a = 0 is left as an easy exercise. See exercise 1.7.


We first reduce to the case a, b > 0. If b < 0 we let b′ = −b and a unique solution to a = b′ q′ + r′
with 0 ≤ r′ < b′ gives a unique solution q = −q′ , r = r′ to a = bq + r with 0 ≤ r < |b|. Hence, we
may assume b > 0. If a < 0 and b > 0 we let a′ = −a. Then a unique solution to a′ = bq′ + r′ gives
1.2 Euclidean division and greatest common divisor 7

a unique solution q = −q′ − 1, r = b − r′ to a = bq + r with 0 ≤ r < |b|. We may therefore assume


a, b > 0.
If a, b > 0 consider
Q = {n ∈ Z|n ≥ 0, a − bn ≥ 0}.
This set is non-empty (0 ∈ Q) and bounded (if n > a/b then a − bn < 0). Let q ∈ Q be its maximal
element. Then r = a − bq < b since if a − bq ≥ b then a − b(q + 1) ≥ 0 which contradicts the
maximality of q. This proves existence in this case.
To prove uniqueness suppose q′ , r′ is another set of integers satisfying the conclusion. Then
q′ ∈ Q so q′ ≤ q. If q′ < q then

r′ = a − bq′ = a − b(q + (q′ − q)) = r − b(q′ − q) ≥ b

which is a contradiction. Hence, q = q′ and r = r′ , which proves uniqueness in this case. ■

■Example 1.1 The integers q, r from Euclidean division can be found by standard long division:
Given a = 36948, b = 2100 we find

36948 : 2100 = 17
2100
15948
14700
1248

so 36948 = 2100 · 17 + 1248 i.e. q = 17, r = 1248. ■

As an example of the power of Theorem 1.2.1 we prove a simple result about division. This
will be somewhat easier to prove after we have proved the fundamental theorem of arithmetic below
(Theorem 1.4.2) but the following simple proof shows that it can be proved without the use of
unique factorization.
Proposition 1.2.2 Assume m2 | b2 . Then m | b.

Proof. Certainly there exist natural numbers l such that m | bl (e.g. l = m). By the well-ordering
principle we can let l0 be the smallest such number. Let n, k be the integers such that b2 = m2 n,
bl0 = mk. Notice that mbk = b2 l0 = m2 nl0 so bk = mnl0 . By Theorem 1.2.1 we can write k = ql0 + r
with 0 ≤ r < l0 . But then

br = b(k − ql0 ) = mnl0 − bql0 = m(nl0 − qk)

so m divides br. By the minimality of l0 we must have r = 0. But then bl0 = mk = mql0 which
shows that b = mq. ■

Definition 1.2.1 Given a, b ∈ Z not both zero we define their greatest common divisor as the
largest natural number which divides both a and b, i.e.

gcd(a, b) = max{d ∈ N : d | a, d | b}.

We define gcd(0, 0) = 0

We note that 1 divides any integer, and that if d | a then |d| ≤ |a|, so the maximum in the above
definition is over a finite non-empty set.
Proposition 1.2.3 Let a, b ∈ Z. Then

gcd(a, b) = gcd(b, a) = gcd(−a, b) = gcd(a, −b) = gcd(a, b ± a)


8 Chapter 1. Divisibility and primes

Proof. We prove that gcd(a, b) = gcd(a, b + a). The other identities are proved a similar way. Let
g1 = gcd(a, b) and g2 = gcd(a, a + b). Since g1 | a and g1 | b we have g1 | a + b so g1 is a common
divisor of a and a + b. It follows that g1 ≤ g2 . Similarly, g2 is a divisor of a and a + b and therefore
also of a and b = (a + b) − a. Hence, g2 ≤ g1 which completes the proof. ■

The greatest common divisor of two integers a, b can be computed effectively as follows: We
have gcd(a, 0) = gcd(a, ±a) = |a|. If ab ̸= 0, |a| ̸= |b| we may use Proposition 1.2.3 to reduce to
the case a > b > 0. Using Theorem 1.2.1 we write a = bq + r with 0 ≤ r < b. By repeated use of
Proposition 1.2.3 we see that gcd(a, b) = gcd(r, b). If r = 0 we have gcd(a, b) = b. If not we repeat
the above argument with a′ = b, b′ = r. This procedure will eventually terminate since r goes down
by at least one in each step. This procedure is called Euclid’s algorithm.
■ Example 1.2 Consider a = 567 and b = 32. We use Euclid’s algorithm and find

567 = 17 · 32 + 23 so gcd(567, 32) = gcd(23, 32)


32 = 1 · 23 + 9 so gcd(32, 23) = gcd(9, 23)
23 = 2 · 9 + 5 so gcd(23, 9) = gcd(5, 9)
9 = 1·5+4 so gcd(9, 5) = gcd(4, 5)
5 = 1·4+1 so gcd(5, 4) = gcd(1, 4)
4 = 4·1+0 so gcd(4, 1) = gcd(0, 1) = 1,

and we may conclude that gcd(567, 32) = 1. ■

We see that this procedure constructs integers a0 = a, b0 = r0 = b,


ai = qi bi−1 + ri (1.1)
with 0 ≤ ri < ri−1 . This ends when rn = 0 and in this case rn−1 = gcd(a, b).

Theorem 1.2.4 — Bezout’s identity. Let a, b be integers not both zero. Then there exist an
integer solution x, y ∈ Z to the equation

gcd(a, b) = ax + by.

Proof. One way of proving this is to take the procedure described immediately before the theorem
and then use backwards substitution. Here is another proof that has the advantage of giving an
alternative characterization of gcd(a, b):
Consider
S = {ax + by > 0|x, y ∈ Z} ⊆ N
Clearly S is non-empty. Let g = ax0 + by0 ∈ S be its smallest element. Then g | a since if
not a = gq + r with 0 < r < g by Theorem 1.2.1. But since r = a − gq = a − qax0 − by0 q =
a(1 − qx0 ) − by0 q ∈ S this contradicts the minimality of g. By the same argument g | b. Hence g is
a common divisor i.e. g ≤ gcd(a, b).
Clearly gcd(a, b) | ax0 + bx0 = g so gcd(a, b) ≤ g, and we may conclude that gcd(a, b) = g. ■

R We notice that the proof of Theorem 1.2.4 actually proves that for any a, b ∈ Z we have
gcd(a, b) = min{ax + by > 0|x, y ∈ Z}.
From this observation we easily deduce the following corollary:
1.3 Diophantine equations 9

Corollary 1.2.5 Let a, b ∈ Z. Then gcd(na, nb) = |n| gcd(a, b).

■ Example 1.3 In the proof of Theorem 1.2.4 we used the well-ordering principle for N i.e. that
any non-empty subset of the natural numbers contains a smallest element to prove the existence of
solutions to the integer equation gcd(a, b) = ax + by. If we want to actually find such a solution we
may use the procedure described above for finding gcd(a, b) = rn−1 : We see from (1.1) that we can
express rn−1 as an integer linear combination of an−1 and bn−2 . Doing successive substitution of
the previous equation we arrive at the result.
With the integers from Example 1.2 we find that

gcd(567, 32) = 1 = 5 − 1 · 4
= 5 − 1 · (9 − 1 · 5) = −9 + 2 · 5
= −9 + 2 · (23 − 2 · 9) = 2 · 23 − 5 · 9
= 2 · 23 − 5 · (32 − 23) = −5 · 32 + 7 · 23
= −5 · 32 + 7 · (567 − 17 · 32) = 7 · 567 − 124 · 32,

so we find a solution x = 7, y = −124. ■

Definition 1.2.2 Let a, b ∈ Z. We call a, b coprime or relatively prime if gcd(a, b) = 1.

Proposition 1.2.6
(i) The integers a, b are coprime if and only if 1 = ax + by for some x, y ∈ Z.
(ii) Let d = gcd(a, b). Then the integers a/d and b/d are coprime.
(iii) Assume a, b are coprime and a | c and b | c. Then ab | c.
(iv) Assume a, b are coprime and a | bc. Then a | c.

Proof. The claim in (i) follows directly from the remark after Theorem 1.2.4. The remaining 3
claims all use (i).
To prove (ii) we use again Theorem 1.2.4 and divide ax + by = d by d. Then (i) proves the
claim. To see (iii) we note that on the assumptions we have c = ea and c = f b so by (i) we have
c = c(ax + by) = f bax + eaby = ab( f x + ey) which proves the claim. To prove (iv) we note if
a, b are relatively prime and bc = da then by (i) we have c = c(ax + by) = cax + day = a(cx + dy)
which shows that a | c. ■

1.3 Diophantine equations


Let p(X1 , . . . , Xn ) ∈ Z[X1 , . . . , Xn ] be a polynomial in n variables with integer coefficients. We
want to find integer solutions to the equation p(X1 , . . . , Xn ) = 0. Such equations are usually called
Diophantine equations and they are of central importance in number theory.

1.3.1 Linear equations


We are now ready to discuss this in the simplest possible case namely the case p(X,Y ) = aX +bY −c
where a, b, c ∈ Z:

Theorem 1.3.1 Assume a, b, c ∈ Z with a, b not both zero. Let d = gcd(a, b). Then the equation

ax + by = c

has integer solutions if and only if d | c. If d | c the complete solution consists of all pairs of the
10 Chapter 1. Divisibility and primes

form  
c b a
(x, y) = (x0 , y0 ) + n · ,−
d d d
where n ∈ Z. Here (x0 , y0 ) is any particular solution to ax + by = gcd(a, b).

Proof. If an integer solution (x, y) exists then since d divides a and b it also divides ax + by = c.
If on the other hand d | c then we pick a solution (x0 , y0 ) of ax0 + by0 = d which is possible by
Bezout’s identity (Theorem 1.2.4). It is then clear that dc (x0 , y0 ) is a solution to ax + by = c. Let
now (x, y) be any integer solution. Then (u, v) = (x, y) − dc (x0 , y0 ) is an integer solution to
au + bv = 0
and therefore also to
a b
u+ v = 0 (1.2)
d d
It follows from Proposition 1.2.6 (ii) that a/d and b/d are coprime. By Proposition 1.2.6 (iv) we
can conclude that b/d | u, i.e. u = nb/d for some n ∈ Z. Substituting back into 1.2 we find that
v = −(a/d)n. We have now proved that any integer solution is of the form
 
c b a
(x, y) = (x0 , y0 ) + n ,−
d d d
for some integer n. It is easy to see – by direct verification – that all such integer pairs are indeed
solutions. ■
■ Example 1.4 We find all solution to
1485x + 1745y = 15
We start by finding the greatest common divisor of 1485 and 1745. Using division with remainder
we have
1745 = 1 · 1485 + 260
1485 = 5 · 260 + 185
260 = 1 · 185 + 75
185 = 2 · 75 + 35
75 = 2 · 35 + 5
35 = 7 · 5 + 0
so by the discussion before Theorem 1.2.4 we have gcd(1485, 1745) = 5, and since 5 | 15 there are
infinitely many solutions. Doing backwards substitution we find
5 = 75 − 2 · 35
= 75 − 2(185 − 2 · 75) = 5 · 75 − 2 · 185
= 5 · (260 − 185) − 2 · 185 = 5 · 260 − 7 · 185
= 5 · 260 − 7 · (1485 − 5 · 260) = 40 · 260 − 7 · 1485
= 40 · (1745 − 1485) − 7 · 1485
= 40 · 1745 − 47 · 1485
Using Theorem 1.3.1 we find that the complete solution is
 
1745 1485
(x, y) = 3 · (−47, 40) + n ,− = (−141, 120) + n · (349, −297)
5 5

1.4 Primes 11

1.4 Primes
We recall the definition of primes:
Definition 1.4.1 A natural number p > 1 is called a prime if its only divisors are 1 and p. A
natural number n > 1 which is not a prime is called a composite number.

We note that by this definition 1 is neither composite or prime. Historically 1 has been considered a
prime for a long time, but certain theorems become easier to formulate if this is not so.
■ Example 1.5 This is the first 100 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,-
61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173,-
179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283,-
293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421,-
431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541.
Short lists like this can be easily found by hand. For a given n we note that if n is composite

then n = ab where either a or b is at most n. Hence if we have checked that n has no factors less

than n it must be prime. (See Exercise 1.8). Using Eratosthenes’ sieve we can find a complete list
of small primes. Much longer lists can be found online e.g. at primes.utm.edu. ■

R There is a very clever algorithm (published in 2004) based on Fermat’s little theorem (See
Corollary 2.3.2) which checks whether a number is prime or not. The algorithm is called
AKS after its inventors Agrawal, Kayal and Saxena. We will encounter several algorithms
later which verifies whether a number is prime or not, but a major interest in AKS is that it is
the first algorithm that provably runs in ‘computational time’ bounded by a polynomial in
log(p) where p is the number we want to investigate if it is a prime or not. Hence the required
time is polynomial in the number of digits of p. For more information see [AKS04; Gra05].

Maybe the most important fact about prime numbers is the fundamental theorem of arithmetic.
The crucial result we need to prove it, uses the divisibility results from Section 1.2:
Lemma 1.4.1 — Euclid. Let a, b ∈ Z and assume p is a prime. If p | ab then p | a or p | b.

Proof. If p | a we are done. Assume not: Then gcd(p, a) = 1, for if gcd(p, a) ̸= 1 then since 1, p
are the only divisors of p then gcd(p, a) = p which in particular means that p | a but we have
assumed this not to be the case, i.e. gcd(p, a) = 1. Using Bezout (Theorem 1.2.4) we find that there
exist integers x0 , y0 such that x0 p + y0 a = 1. Multiplying this by b we find that

x0 pb + y0 ab = b

Since p | ab we see that p | b which finishes the proof. ■

1.4.1 The fundamental theorem of arithmetic


We are now ready to prove the main result:

Theorem 1.4.2 — The fundamental theorem of arithmetic. Every natural number n > 1 can
be written as a product of primes. The product is unique up to permutation of terms.

We interpret 1 as being written as the empty product over primes.

Proof. Let X ⊆ N be the set of natural numbers which can not be written as a product of primes. To
see that X is the empty set assume that it is not, and let n be its smallest member. Then n cannot be
a prime (then it is clearly a product of primes), so it has a factorization n = ab where a, b < n. But
by minimality a, b are not in X so they can both be written as a product of primes, which implies
12 Chapter 1. Divisibility and primes

that n is a product of primes. But then n is not in X which shows that X is the empty set. Hence
every n > 1 can be written as a product of primes.
To show that such a product is unique we assume that

p1 p2 · · · pd = q1 · · · ql

By successive use of Lemma 1.4.1 we find that pd = q j for some j = 1, . . . l. By possibly renaming
the qi ’s we may assume that pd = ql . Cancelling by ql we find

p1 p2 · · · pd−1 = q1 · · · ql−1

Repeating this argument d times we find

1 = q1 · · · ql−d

But this is only possible if l = d, which means we have removed all primes q j on the right. The
claim follows. ■

R The fundamental theorem of arithmetic says that every natural number can be factored
in primes. Is factoring a given number n ‘easy’or ‘hard’? We discussed earlier that the
AKS-algorithm determines whether a number n is prime or not, and does so in polynomial
time in the digits of n, i.e. determining whether a number is prime is ‘easy’. If n is not a
prime the AKS algorithm does not give a factorization of n. It is a big unsolved problem if
there exist a polynomial time algorithm which factors n. There is an algorithm due to Schor
[Sho94; Sho97] which can do prime factorizations on quantum computers in polynomial time.
However quantum computing is still in its infancy, so this has not had great practical use so
far. As we shall see later some of the most used cryptosystems like RSA relies on the fact
that no one knows how to factor ‘easily’.

Theorem 1.4.3 There are infinitely many prime numbers.

Proof. We will construct infinitely many prime numbers recursively. 2 is a prime. Assume now
that we have a list of n primes p1 , p2 , . . . pn . Multiplying them together, adding 1, and then using
the fundamental theorem of arithmetic (Theorem 1.4.2) we have

p1 p2 · · · pn + 1 = q1 q2 . . . ql (1.3)

where q1 , . . . ql are (not necessarily distinct) primes. We now notice that q j ̸= pi for all i, j since
if pi = q j = p then by (1.3) we find p | 1 which is impossible. Letting pn+1 be the smallest of the
q j ’s we have added one extra prime to our list. Doing this recursively we see that we can construct
arbitrarily many primes. ■

The first few terms in the sequence of primes constructed in the proof of Theorem 1.4.3 are

2, 3, 7, 43, 13, 53, 5, 6221671, 38709183810571, 139, 2801, 11, 17, 5471, 52662739, . . .

This is usually called the Euclid-Mullin sequence.


Conjecture 1.4.4 The Euclid-Mullin sequence contains all primes.
The numerical evidence for this conjecture is very sparse. Only the first 51 terms are known
(the last term found September 2012). For primes less than 100 we do not know if 41, 47, 67, 73,
79, 83 are in the Euclid-Mullin sequence.
1.4 Primes 13

Even if we know that there are at infinitely many primes we have of course only explicitly
computed finitely many. The largest known prime is 282589933 − 1 which is a monstrous number with
24862048 decimal digits. It was found in December 2018. If you want to join the prime-finding
community go to GIMPS or PrimeGrid. The only thing you have to do is to run a piece of software
on your computer.
It was fairly straightforward to prove that there are infinitely many primes. Consider two
coprime integers a and b. Then it is natural to ask if there are infinitely many primes of the form
an + b i.e. infinitely many primes which has remainder b if we divide by a. This is indeed so, but
except for a few simple choices of a and b this turns out to be significantly harder to prove. It was
first proved by Dirichlet more that 2000 years after Euclid’s result (Theorem 1.4.3). See [Apo76,
Thm 7.9] for an accessible proof.

1.4.2 On the number of primes less than a given size


Consider the function π(x) which counts the number of primes less than a given size

π(x) = #{p ≤ x|p prime}.

where x is any real number. Theorem 1.4.3 can be formulated as π(x) → ∞ as x → ∞. The
proof of Theorem 1.4.3 can be used to show a bit more. One can use the proof to see that
π(x) ≥ log2 (log2 (x)) (See Exercise 1.13). Much more is true: By a relatively elementary argument
one can show that for x > 2 we have π(x) ≥ 61 log(x)
x
([Apo76, Thm 4.6]) as well as an analogous
x
upper bound 6 log(x) . Riemann was probably the first to fully understand the link between π(x) and
the Riemann zeta function

1
ζ (s) = ∑ ns when ℜ(s) > 1,
n=1

which we shall study more in Chapter 5. Hadamard and de la Vallée Poussin proved in 1896, using
Riemann’s insights, that in fact
x
π(x) ∼ , (1.4)
log(x)

meaning that the quotient of the two sides converges to 1 as x → ∞. This result is called the prime
number theorem and its validity was conjectured independently by Gauss and Legendre 100 years
earlier. The prime number theorem is very powerful, but we expect a much stronger result to be
true: The Riemann hypothesis, which is usually stated in terms of the zeros of ζ (s), is equivalent to
the following conjecture:
Z x
1
Conjecture 1.4.5 For every ε > 0 we have π(x) − dt = O(x1/2+ε ).
2 log(t)

1.4.3 Pythagorean triples


We now discuss another example of a Diophantine equation: Consider the polynomial equation

P(X,Y, Z) = X 2 +Y 2 − Z 2 = 0.

By Pythagoras’ theorem solutions with X,Y, Z > 0 to this equation corresponds to side lengths in
right triangles. Given an integer solution (e.g. (X,Y, Z) = (3, 4, 5)) we can always multiply this by
an integer m and get another integer solution (mX, mY, mZ). Such solutions will always have m as
a common factor. Solutions (X,Y, Z) without common factor are called primitive. Note that if two
of X, Y , Z has a common factor the third has the same factor.
14 Chapter 1. Divisibility and primes

Observe now that no primitive solution has Z = 0, and if (X,Y, Z) is a primitive solution then
(X,Y, −Z) is also a primitive solution. Since X and Y has no common factor they cannot both be
even. We claim that they also cannot both be odd. If they were then Z 2 would be of the form 2 + 4k
for some k. But no perfect square is of this form, so one of X, Y is even and the other is odd. Given
a primitive solution (X,Y, Z) we have that (Y, X, Z) is also a solution and this mapping interchanges
the primitive solutions with X even with the primitive solutions with X odd.
We now parametrize all primitive solutions.

Theorem 1.4.6 The primitive integer solution to the equation

X 2 +Y 2 = Z 2 (1.5)

with X odd, Y even and Z > 0 are exactly the triples

(X,Y, Z) = (p2 − q2 , −2pq, p2 + q2 )

where p, q are coprime integers with p ≥ 0 satisfying p − q is odd.

Proof.
We recall that there is a bijection between primitive solution of (1.5) and the rational points on
the unit circle, i.e. rational solutions of the equation
X 2 +Y 2 = 1.
This is seen as follows:
Given a primitive solution x2 + y2 = z2 with z > 0 then (X,Y ) = (x/z, y/z) is a rational number
on the unit circle. Furthermore given a rational point (X ′ ,Y ′ ) = (x′ /z′ , y′ /w′ ) with z′ , w′ > 0 on the
unit circle then consider (x′ w′ )2 + (y′ z′ )2 = (z′ w′ )2 . It is immediate from Proposition 1.2.2 that if
two of the three numbers x′ w′ , y′ z′ , and z′ w′ has a common factor then so does the third. Hence
we can divide by this common factor m . It follows that we can write (X ′ ,Y ′ ) = (a/c, b/c) where
(a, b, c) is a primitive solution to (1.5) with c > 0. The Pythagorean triples with X odd and Y even
correspond exactly to those rational points on the circle (a/c, b/c) with a odd and b even.
We now parametrize all rational points on the unit circle. Consider P = (1, 0) and another
rational point Q = (q1 , q2 ) ̸= P on the unit circle. Then the line between them y = α(x − 1) has
rational slope α = q2 /(q1 −1). On the other hand, given a rational slope α ∈ Q the line y = α(x−1)
and x2 + y2 = 1 intersects at a point Q which satisfies
x2 + α 2 (x − 1)2 − 1 = (1 + α 2 )x2 − 2α 2 x + α 2 − 1 = 0
which has the two solutions
p
2α 2 ± 4α 4 − 4(1 + α 2 )(α 2 − 1) 2α 2 ± 2 α2 − 1
2
= 2
= 1, 2 .
2(1 + α ) 2(1 + α ) α +1
In particular we note that Q has rational coordinates:
 2 
α − 1 −2α
Q= , .
α2 + 1 α2 + 1

Q
P
1.5 Exercises for Chapter 1 15
2 2 −2pq


If α = qp is an irreducible fraction with p ≥ 0, we find that Q = pp2 −q ,
+q2 p2 +q2
. It is straight-
forward to see that the second coordinate is of the form b/c with b even precisely if p − q is odd.
By the above discussion the result follows. The Pythagorean triple (1, 0, 1) is covered by p = 1,
q = 0 which corresponds to α = ∞. ■
The above theorems might give the impression that it is relatively straightforward to determine
if integer polynomial equations have integer solutions. This is wrong. Most often it is very difficult
to determine all integer solutions. To illustrate this consider Fermat’s equation
xn + yn = zn , for n ≥ 3.
It took almost 400 years between 1637 when Fermat thought that he had proved that this does not
have any integer solutions satisfying xyz ̸= 0 until A. Wiles (and his collaborators) finally found a
definitive proof in 1995.

1.5 Exercises for Chapter 1


Exercise 1.1 Compute gcd(455, 1235).
Exercise 1.2 Find all integer solutions to 455x + 1235y = 130.
Exercise 1.3 Find all integer solutions to 455x + 1235y = 143.
Exercise 1.4 What are the possible reminders when dividing a square n2 by 3? By 4? By 5? By 6?
Exercise 1.5 Prove Proposition 1.1.1
Exercise 1.6 In each of the following, apply the division algorithm to find q and r such that
a = bq + r with 0 ≤ r < |b|:
(a, b) = (302, 19), (a, b) = (829, 31), (a, b) = (300, −17), , (a, b) = (449, 4),
Exercise 1.7 Prove Theorem 1.2.1 in the case a = 0. K1
Exercise 1.8 Prove that n ∈ N with n > 1 is a prime if and only if n has no non-trivial factors less

than or equal to n.
Exercise 1.9 Find all integer solution to 1485x + 1745y = gcd(1485, 1745).
Exercise 1.10 Show that Proposition 1.2.6 (iii), (iv) fails if we do not assume a, b to be coprime.
Exercise 1.11 — Eratosthenes’ Sieve. Assume that P = {p1 , p2 , . . . pk } are the smallest k primes.
Erase from the list pk +1, pk +2, . . . , p2k all multiples of p1 , p2 , . . . pk and add the remaining numbers
to P. Show that P now contains all prime numbers less than p2k . This may be used to easily find
complete lists of primes of moderate size. K
Exercise 1.12 Find, using nothing put pen and paper, all primes less than 200.
Exercise 1.13 Let pn be the n-th prime, and let π(x) = #{pn ≤ x}. Show that π(x) ≥ ⌊log2 (log2 (x))⌋+
n−1
1. (Hint: Use Theorem 1.4.3 to show that pn ≤ 22 .) K
Exercise 1.14 Show that there are infinitely many primitive Pythagorean triangles, i.e. right
triangles whose side lengths are integers and pairwise coprime.
Exercise 1.15 Prove Proposition 1.2.2 using the theory of primes.
Exercise 1.16 Show that there are infinitely many primes of the form 4n − 1. (Hint: Assume
p1 , . . . pn are of the desired form and consider the prime factorization of 4p1 · · · pn − 1)
Exercise 1.17 Prove that no integer in the sequence 11, 111, 1111, . . . is a perfect square.
Hint: What are the possible remainders when dividing a perfect square by 4?
Exercise 1.18 Assume that ab − 1 is a prime where a, b ∈ N with b ≥ 2. Show that a = 2 and b
is prime. Primes of this form are called Mersenne primes. Most of the largest known primes are
Mersenne primes.
Exercise 1.19 Show that there does not exist a Pythagorean triple (x, y, z) where x, y, z are all
primes.
1 The K means that it might be a good idea to serve yourself a nice beverage.
2. Modular arithmetic

2.1 Basic structures


We recall the definition of a ring
Definition 2.1.1 A ring R is a set with two binary operations +,· (usually called addition and
multiplication), and two elements 0, 1 ∈ R such that
(i) (R, +, 0) is an abelian group, and for every a, b, c ∈ R we have
(ii) 1 · a = a · 1 = a, i.e 1 is a multiplicative identity,
(iii) (a · b) · c = a · (b · c), i.e. multiplication is associative,
(iv) a · (b + c) = a · b + a · c, i.e. multiplication is distributive from the right,
(v) (a + b) · c = a · c + b · c, i.e. multiplication is distributive from the left.

■ Example 2.1 The following are all examples of rings


• Z, Q, R, C (with the usual two binary operations +, ·) are all examples of commutative (or
abelian) rings, i.e. rings where the multiplication is commutative.
• Let n > 1. The set of n by n matrices with coefficients in Z, Q, R, or C denote Mn (Z), Mn (R),
Mn (Q) , and Mn (C) equipped with the usual matrix addition and matrix multiplication are all
examples of non-commutative rings. In this course we will almost exclusively be considering
commutative rings.
• Let n ∈ N. The set of equivalence classes of integers modulo n, Z/nZ with addition and
multiplication defined by
(a + nZ) + (b + nZ) := (a + b) + nZ
(a + nZ) · (b + nZ) := a · b + nZ
It is straightforward to verify that these operations are well-defined and that they make Z/nZ
a commutative ring. You have already seen in your basic algebra course that (Z/nZ, +) is an
abelian group. We shall use both a + nZ, [a]n , and a mod n to denote the equivalence class a
modulo n. If, on the other hand, we write a = b (mod n) we mean that n | (a − b) which is
equivalent to the equivalence classes a mod n and b mod n being equal.

2.2 Modular arithmetic 17
Definition 2.1.2 Let R be a ring and let a ∈ R. We say that a is invertible or that it is a unit if
there exists a b ∈ R such that ab = ba = 1. The set of invertible elements is denoted by R× .

Notice that if a is invertible then a corresponding b is uniquely determined: If both ab = ba = 1 and


ab′ = b′ a = 1 then b = b(ab′ ) = (ba)b′ = b′ . We denote this element by a−1 . We notice also that if
a, b, c ∈ R with c invertible then ac = bc implies a = b by multiplication from the right with c−1 .
It is straightforward to verify the following proposition (See Exercise 2.1) :
Proposition 2.1.1 Let (R, +, ·) be a ring. The set (R× , ·) of invertible elements with · is a group.
■ Example 2.2 We have
• Z× = {±1}, Q× = Q\{0}, R× = R\{0} , C× = C\{0}.
• We have

(Z/nZ)× = {x mod n ∈ Z/nZ| gcd(x, n) = 1}. (2.1)

It follows from Proposition 1.2.3 that the set on the right is well-defined. If x mod n ∈ Z/nZ
is invertible then there exist y mod n ∈ Z/nZ such that xy = 1 + kn for some k ∈ Z. But
then by Proposition 1.2.6 (i) we have gcd(x, n) = 1. On the other hand if gcd(x, n) = 1
then again by Proposition 1.2.6 (i) there exist y, k such that 1 = yx + kn which shows that
(x mod n) · (y mod n) = 1 mod n.

Definition 2.1.3 Let R be a ring. If all non-zero elements are invertible then R is called a division
ring or a skew field. If additionally R is commutative then R is called a field.

■ Example 2.3 By Example 2.2 we see that Q, R, C are fields and Z/nZ is a field if and only if
n = p is a prime. This field is usually denoted F p . ■

2.2 Modular arithmetic


We will investigate further the ring Z/nZ.
Proposition 2.2.1 The equation

ax = b (mod n) (2.2)

has a solution if and only if gcd(a, n) | b.

Proof. Let g = gcd(a, n). Assume x is a solution. Then n | ax − b so ax − b = nc for some integer
c and hence g | b. Assume on the other hand g | b. Note that by Proposition 1.2.6 (ii) we have
gcd(a/g, n/g) = 1. By (2.1) a/g is a unit in Z/(n/g)Z, so there exists a y such that ya/g = 1+kn/g.
Multiplying by b we find that x = yb/g solves (2.2). ■

■ Example 2.4 • Note that x = 4 (mod 8) and x = 0 (mod 8) are two different solutions to
2x = 0 (mod 8). Hence there is not necessarily uniqueness in Proposition 2.2.1.
• Consider solutions to 3x = 7 (mod 5). Euclidean division gives

5 = 1·3+2
3 = 1·2+1
2 = 2·1+0

so gcd(5, 3) = 1 and 1 = 3 − 2 = 3 − (5 − 3) = 2 · 3 − 5. It follows that 3−1 = 2 (mod 5) so


we may multiply by 2 to find the unique solution x = 2 · 7 = 4 (mod 5).

18 Chapter 2. Modular arithmetic

2.3 Euler’s ϕ-function


We now consider the function

ϕ(n) = # (Z/nZ)× = #{1 ≤ a ≤ n| gcd(a, n) = 1} (2.3)

called Euler’s ϕ-function.


■ Example 2.5 If we want to find ϕ(6) we have to count how many of 1, 2, 3, 4, 5 are coprime to 6.
But this is is exactly 1, 5, so ϕ(6) = 2. ■

It is obvious that ϕ(1) = 1. It follows from Example 2.3 that if p is a prime ϕ(p) = p − 1.
More generally we have

ϕ(pn ) = pn − pn−1 = pn (1 − p−1 ). (2.4)

To see this we note that we are counting all the numbers 1, 2, . . . , pn except for those that have
a factor in common with pn . But if gcd(a, pn ) > 1 then using that p is a prime we see that
gcd(a, p) > 1 which means that p divides a. Hence the numbers we are excluding are exactly
p, 2p, 3p, . . . , pn and the claim follows.

Theorem 2.3.1 — Euler. Assume gcd(x, n) = 1. Then xϕ(n) = 1 (mod n).

Proof. From basic algebra we have Lagrange’s theorem which implies that for a finite group G we
have g#G = 1. Applying this to G = (Z/nZ)× gives the result by (2.1) and (2.3). ■

Corollary 2.3.2 — Fermat’s little theorem. Let p be a prime. Then a p = a (mod p) for every
a ∈ Z.

Proof. For p | a both sides are zero and the claim is trivial. For p ∤ a we have gcd(a, p) = 1 and
Euler’s theorem and (2.4) gives that a p−1 = 1 (mod p). Multiplying by a gives the result. ■

We note that the above proof is somewhat anachronistic. Historically Lagrange’s theorem was
a generalization of Fermat’s little theorem. For a more elementary proof of Euler’s theorem we
can argue as follows: Since gcd(x, n) = 1 multiplication by x gives a permutation of the elements
G = {a mod n|gcd(a, n) = 1}. It follows that

∏ ax = ∏ a (mod n).
a∈G a∈G

Since gcd(∏a∈G a, n) = 1 it follows from Bezout’s theorem that there exists an integer b such that
b ∏a∈G a = 1 (mod n). Multiplying by this b finishes the proof of Theorem 2.3.1 .

2.3.1 The Chinese remainder theorem


In order to find the values of Euler’s ϕ-function on general integers we need the Chinese remainder
theorem.
Theorem 2.3.3 — The Chinese remainder theorem. Let a1 , . . . al ∈ Z and let n1 , . . . , nl be
pairwise coprime. Then the system of l equations

x = ai (mod ni ), i = 1, . . . , l

has an integer solution. This solution is unique modulo n1 · · · nl .


2.3 Euler’s ϕ-function 19

Proof. For l = 1 the claim is clear: Choose x = a1 . Assume l = 2 and consider the equation in t
given by
a1 + n1 t = a2 (mod n2 ).
As gcd(n1 , n2 ) = 1 Proposition 2.2.1 ensures that there is a solution to this. Indeed we may take
t = (a2 − a1 )n−1
1 (mod n2 ). Letting x = a1 + n1t we find that this solves the two equations. Assume
two solutions x, y. Then subtracting them we find that x − y = 0 (mod ni ). Then by Proposition
1.2.6 (iii) we have that n1 n2 | x − y which shows uniqueness modulo n1 n2 .
Assume now that the theorem holds for a given l. Then we want to show that it holds also for
l + 1. So assume we have a solution to
x = ai (mod ni ), i = 1, . . . , l
Then we want to show that we can find a solution to
y = ai (mod ni ), i = 1, . . . , l + 1
We consider now the equation
x + tn1 n2 · · · nl = al+1 (mod nl+1 ) (2.5)
As gcd(n1 · · · nl , nl+1 ) = 1 Proposition 2.2.1 ensures that there is a solution to this. Indeed we may
take t = (al+1 − x)(n1 · · · nl )−1 (mod nl+1 ). Letting y = x + tn1 n2 · · · nl we find that this solves all
l + 1 equations. Uniqueness is proved like the case l = 2. ■

■ Example 2.6 The Chinese mathematician Sun Tzu asked in a late 3rd century book for the
smallest solution to the equations
x = 2 (mod 3)
x = 3 (mod 5)
x = 2 (mod 7).
Using Theorem 2.3.3 we see immediately that a solution exist. Recalling the proof we see that in
order to solve the two first equations we solve, in t the equation
2 + t · 3 = 3 (mod 5).
We easily find that t = 2 solves this, and hence 2 + 2 · 3 = 8 is a solution of the two first equations.
To get a solution of all three equations we need to solve in t the equation
8+t ·3·5 = 2 (mod 7).
Since 15 = 1 (mod 7) we find that t = −6 = 1 (mod 7) solves the equation, and x = 8 + 1 · 15 = 23
solves all three equations. Since all solution are of the form x = 23 + k · 3 · 5 · 7 (and all integers of
this form are indeed solutions), this is also the smallest solution. ■

R Note that the proof given for the Chinese remainder theorem above is constructive. A shorter
non-constructive proof can be given as follows: Consider
ψ : Z/n1 · · · nl Z → Z/n1 Z × · · · Z/nl Z
(2.6)
c mod n1 · · · nl 7→ (c mod n1 , . . . , c mod nl )
which is well-defined. It is also injective, for if c mod ni = 0 for all i = 1, . . . l , then by
Proposition 1.2.6 (iii) we have c mod n1 · · · nl = 0. Since the domain and the codomain have
the same number of elements this map must also be surjective, which is a reformulation of
the Chinese remainder theorem.
20 Chapter 2. Modular arithmetic

Lemma 2.3.4 Let m, n be coprime integers. The mapping between groups

ψ : (Z/mnZ)× → (Z/mZ)× × (Z/nZ)×


c mod mn 7→ (c mod m, c mod n)

is a bijection.

Proof. Notice that the map is independent of choice of representative. Note also that if c mod mn ∈
(Z/mnZ)× then gcd(c, mn) = 1. It follows that gcd(c, n) = 1 since gcd(c, n) | gcd(c, mn) = 1. The
same argument shows that gcd(c, m) = 1. It follows that ψ(c mod mn) ∈ (Z/mZ)× × (Z/nZ)× so
the map is indeed well-defined.
As ψ is the restriction to (Z/mnZ)× of the bijection ψ from (2.6) it follows that ψ is injective.
To see that it is surjective we have to show that if ψ(c mod mn) ∈ (Z/mZ)× × (Z/nZ)× then
c mod mn ∈ (Z/mnZ)× . But assume that a prime p | gcd(c, mn). Then by Euclid’s lemma (Lemma
1.4.1) we have p | m or p | n. But this means that p | gcd(c, m) = 1 or p | gcd(c, n) = 1 which
cannot be the case. So no such prime exists, and we have gcd(c, mn) = 1 which shows that
c mod mn ∈ (Z/mnZ)× . Hence ψ is surjective, which finishes the proof. ■

In fact a bit more is true. One can show that ψ is a group isomorphism. Indeed ψ is a ring
isomorphism (we have not defined what this means) but since we do not need it we settle for the
above statement. We need it to be able to conclude the following:

Corollary 2.3.5 Let m, n be coprime integers. Then ϕ(mn) = ϕ(m)ϕ(n).

Proof. This follows immediately from Lemma 2.3.4, and the definition of ϕ. ■

Combining Corollary 2.3.5 with (2.4) we find immediately that

ϕ(n) = n ∏(1 − p−1 ). (2.7)


p|n

2.4 Primitive roots modulo n.


Recall that if g is an element of a group G then the order of g, ord(g), is the smallest natural number
n such that gn is the neutral element of the group. By Lagrange theorem the order of a group
element divides the group order #G. We know also that if gm equals the neutral element in G then
ord(g) | m as can be proved by Euclidean division.
Definition 2.4.1 Let n be a natural number. An integer a ∈ Z is called a primitive root modulo
n if a mod n ∈ (Z/nZ)× has order ϕ(n).

■ Example 2.7 Recall Example 2.5. We have (5 mod 6)2 = 25 mod 6 = 1 mod 6. It follows that
5 mod 6 has order 2 = ϕ(6). I.e. 5 is a primitive root modulo 6. ■

Note that if a is a primitive root mod n then by the definition of ϕ(n) we have that (Z/nZ)× is
cyclic with a mod n being a generator. It is a natural question to ask whether to a given n primitive
roots modulo n exist. We will prove that if n is prime this is indeed the case. In order to prove this
we need to study roots of polynomials:

2.4.1 Roots of polynomials.


Consider the polynomial f (x) = x2 − 1 ∈ (Z/8Z)[x]. It is straightforward to verify that this has
roots 1, 3, 5, 7 mod 8. Hence this is an example of a degree 2 polynomial with 4 roots. If we are
working over a field polynomials cannot have such an abundance of roots.
2.4 Primitive roots modulo n. 21

Proposition 2.4.1 Let k be a field and let 0 ̸= f ∈ k[x] be a polynomial over k. Then f has at most
deg( f ) roots.
We note that f (x) = x2 − 2 ∈ (Z/3Z)[x] has no roots in Z/3Z since modulo 3 we have f (0) =
−2, f (1) = −1, f (2) = 2.

Proof. The proof is induction in deg( f ). For polynomials of degree 0, i.e. constants, the claim is
clear. Assume now that the claim holds for polynomials of degree at most n. Let

f (x) = an+1 xn+1 + · · · a1 x1 + a0 ∈ k[x]

be a polynomial of degree n + 1. If there are no roots in k we are done. So assume α ∈ k is a root.


Then

f (x) = f (x) − f (α) = an+1 (xn+1 − α n+1 ) + · · · a1 (x1 − α)

Since xi − α i = ∑i−1 j i−1− j (x − α) this implies that f (x) = (x − α)g(x) where deg(g) ≤ n.
j=0 x α
If β ̸= α is a root of f then 0 = (β − α)g(β ) and since β − α ̸= 0 we may multiply by its
inverse (since k is a field) and find g(β ) = 0. By induction hypothesis this can be true for at most n
different β and it follows that f has at most n + 1 roots. ■

Recall that if p is a prime then F p = Z/pZ is a field.


Proposition 2.4.2 Let p be a prime and d a divisor of p − 1. Then f (x) = xd − 1 ∈ F p [x] has exactly
d roots.

Proof. Write p − 1 = dl, and consider e(x) = x p−1 − 1 ∈ F p [x]. By Euler’s theorem (Theorem
2.3.1) all non-zero elements of F p are roots of this polynomial, i.e. it has p − 1 different roots. We
notice also the factorization

e(x) = (xd )l − 1 = (xd − 1)((xd )l−1 + (xd )l−2 + · · · + 1) = f (x)g(x),

where g(x) ∈ F p [x] has degree p − 1 − d. By Proposition 2.4.1 f has at most d roots and g has at
most p − 1 − d roots but since their product has p − 1 different roots f must have at least d roots
and g must have at least p − 1 − d roots which proves the claim. ■

Lemma 2.4.3 Let G be a group. Assume g1 , g2 ∈ G are commuting elements with coprime finite
orders n1 ,n2 . Then their product g1 g2 has order n1 n2 .

Proof. Let n = ord(g1 g2 ). Since g1 and g2 commutes (g1 g2 )n1 n2 = (gn11 )n2 (gn22 )n1 = 1 so n | n1 n2 .
On the other hand we have gn1 gn2 = 1. Raising this to the n2 ’th power and using gn22 = 1 we find
gnn
1 = 1 so n1 | nn2 . Since n1 and n2 are coprime we get from Proposition 1.2.6 (iv) that n1 | n. A
2

similar argument shows that n2 |n, which then by Proposition 1.2.6 (iii) implies that n1 n2 | n. Hence
we have n = n1 n2 . ■

We are now ready to prove the main result of this section.

Theorem 2.4.4 Primitive roots modulo p exist for every prime number p.

Proof. By the fundamental theorem of arithmetic we can factor ϕ(p) = p − 1 = pl11 · · · plkk into a
product of powers of distinct primes. If we can show that for every i = 1, . . . k, we can find an
li
element ai of F×
p with order pi , then by successive use of Lemma 2.4.3 we find that a = a1 a2 . . . ak
has order ϕ(p) which proves that there exists a primitive root modulo p.
To show that there exists an elements of order plii , we observe that by Proposition 2.4.2 we see
li
that there are plii − plii −1 elements of order plii since the roots of x pi − 1 = 0 which are not roots of
22 Chapter 2. Modular arithmetic
li −1 l
pi
x pi − 1 = 0 have order plii : For such a root ai satisfies ai i = 1 so ord(ai ) | plii i.e. ord(ai ) = pli for
li −1
some 0 ≤ l ≤ li . But if l < li then ai is a root of x pi − 1 = 0 which we have assumed it is not.
This finishes the proof. ■

Since a primitive root modulo p induces an element in F× ×


p of order ϕ(p) = #F p we conclude the
following corollary:

Corollary 2.4.5 Let p be a prime. Then F×


p is cyclic.

Theorem 2.4.4 settles the question of the existence of primitive roots modulo primes. There are
no primitive roots modulo 8 (Exercise 2.2), so some restriction on n needs to be assumed. It turns
out that primitive roots modulo n exists precisely if n = 2, 4, pe , or 2pe where p is any odd prime
and e ∈ N. We shall not prove this but please consult [JJ98, Theorem 6.7] for an accessible proof.
Proposition 2.4.6 Let n be a natural number and assume that there exist primitive roots modulo n.
Then ϕ(ϕ(n)) of the numbers 1, 2, . . . , n − 1 are primitive roots modulo n.

Proof. If a is a primitive root modulo n then a (mod n) is a generator of (Z/nZ)× . We see the
number we are looking for equals the number of 1 ≤ r ≤ ϕ(n) where ar (mod n) is a generator of
(Z/nZ)× . But ar (mod n) is a generator precisely if for every 1 ≤ c ≤ ϕ(n) the equation ary = ac
(mod n) has a solution. This happens precisely if ry = c (mod ϕ(n)) has a solution. But by
Proposition 2.2.1 this happens precisely if gcd(r, ϕ(n)) | c for all c which happens precisely if
gcd(r, ϕ(n)) = 1. Hence we are counting the number of elements in the set

{1 ≤ r ≤ ϕ(n)| gcd(r, ϕ(n)) = 1}

which is exactly ϕ(ϕ(n)). ■

R We remark that the proof of Proposition 2.4.6 shows that if a is a primitive root modulo n then
set of all primitive roots are the integers b such that b = ar mod n for some gcd(r, ϕ(n)) = 1.

■ Example 2.8 We want to find all primitive roots modulo 7. Recall that ϕ(7) = 6, so we are
looking for all natural numbers a such that a mod 7 has order 6, or said differently that a mod 7
is a generator for (Z/7Z)× . Note that 32 = 2 (mod 7), 33 = 6 (mod 7) so 3 mod 7 does not have
order 1,2, or 3. Since the order is a divisor of the order of the group. The group order is 6 so we
may conclude that 3 mod 7 has order 6 and is therefore a generator. By the proof of Proposition
2.4.6 the element 3r mod 7 is a generator precisely if gcd(r, 6) = 1 which happens precisely for
r = 1, 5 (mod 6). Note that 35 = 5 (mod 7).
Therefore 3,5 are the only primitive roots modulo 7 which are less than 6. Note that that this
agrees with ϕ(ϕ(7)) = ϕ(6) = ϕ(2)ϕ(3) = 2.
Hence a is a primitive root modulo 7 if and only if a = 3 (mod 7) or a = 5 (mod 7). ■

2.4.2 Artin’s conjecture


Conjecture 2.4.7 [Artin’s conjecture] Let a ∈ Z\{−1} with a ̸= b2 . Then there are infinitely
many primes p such that a is a primitive root modulo p
This is a long-standing and important conjecture. On the one hand we do not know a single a
for which there are infinitely many primes p such that a is a primitive root modulo p. On the other
hand we know – due to Heath-brown [Hea86] – that it is true whenever a is a prime with at most
two exceptions. In particular it is true for either a = 2, 3, or 5 but we do not know which of the
2.5 Local obstructions 23

three. Heath-brown also showed that it holds whenever a is square-free (i.e. a being a product of
different primes) with at most 3 exceptions.
Hooley showed [Hoo67] that Artin’s conjecture follows from a generalization of the Riemann
hypothesis.

2.5 Local obstructions


When we want to determine whether a specific diophantine equation has integer solutions this can
sometimes be ruled out by clever congruence considerations.
■ Example 2.9 Consider the Diophantine equation

x5 − x2 + x − 3 = 0,

i.e. we are looking for integer roots of the polynomial f (x) = x5 − x2 + x − 3. If such a root exist we
can reduce it modulo n (where n is any integer) and we will have that x satisfies f (x) = 0 (mod n).
In this case it turns out to be convenient to consider n = 4. We can then verify that f (0) = 1,
f (1) = 2, f (2) = 3 and f (3) = 2, in particular these numbers are all non-zero modulo 4, and hence
no such integer solution can exist. We say that there is a local obstruction to f (x) = 0 having
integer solutions.
Note that it is straightforward to verify that f (x) has real roots as f (x) → ∞ as x → ∞, and
f (x) → −∞ as x → −∞, so by the intermediate value theorem f (x0 ) = 0 for some real x0 . ■

■ Example 2.10 Consider the Diophantine equation

3 = x 2 + y2 .

It is easy to verify that any square x2 is either 0 or 1 modulo 4. Hence any expression x2 + y2 must
be either 0, 1, or 2 modulo 4. But since 3 is 3 modulo 4 this Diophantine equation does not have a
solution. Notice that this argument generalizes to proving that the equation

p = x 2 + y2 (2.8)

does not have solutions whenever p is equal to 3 modulo 4. We will later in Theorem 6.1.3 that if p
is a prime which is not equal to 3 modulo 4 then (2.8) does indeed have a solution. ■

Local obstructions are useful for proving that there are no solutions to a Diophantine equation.
In other cases there are no local obstructions (e.g. if the Diophantine equation has solutions there
can be no local obstruction). There are families of Diophantine equations where the existence of
both solutions modulo pn for every prime power and a real solution suffices to conclude that the
Diophantine equation has an integer solution. Such a result is called a local to global principle.

2.6 Exercises for Chapter 2


Exercise 2.1 Prove Proposition 2.1.1.
Exercise 2.2 Show that there are no primitive roots modulo 8.
Exercise 2.3 Show that there are no primitive roots modulo 2n for any n ≥ 3.
Exercise 2.4 Find all primitive roots modulo 17.
Exercise 2.5 — Freshman’s dream. Let a, b be integers and let p be a prime. Prove that
(a + b) p = a p + b p (mod p).
Exercise 2.6 Determine if x5 = 2 (mod 2003) has a solution. Note that 2003 is a prime.
Exercise 2.7 Prove that the Diophantine equation f (x) = 0 has no solution when f (x) is one of
the following: a) x3 − x + 1, b) x3 + x2 − x + 1, c) x3 + x2 − x + 3.
24 Chapter 2. Modular arithmetic

Exercise 2.8 Seven pirates try to share a stolen pile of identical diamonds equally between
themselves. Unfortunately, six diamonds are left over, and in the fight over them, one pirate is
thrown overboard and gets eaten by sharks. The remaining six pirates, still unable to share the
diamonds equally since two are left over, again fight, and another is killed. When the remaining
five share the diamonds, one diamond is left over, and it is only after yet another pirate is thrown
overboard that an equal sharing is possible. What is the minimum number of diamonds that the
pirates have stolen?
Exercise 2.9 For which values n is ϕ(n) odd? Show that there exists integers n with ϕ(n) =
2, 4, 6, 8, 10, 12 but not 14.
Exercise 2.10 Find the smallest integer n such that ϕ(n)/n ≤ 1/4
Exercise 2.11 Show that if n > 4 is composite then (n − 1)! = 0 mod n.
Exercise 2.12 Assume a = −1 or that a is a perfect square. Show that a is not a primitive root
modulo p for any prime p > 3. Determine whether a is a primitive root modulo 3. Compare with
Artin’s Conjecture 2.4.7.
Exercise 2.13 (Hard) Let p be an odd prime. Let g be a primitive root modulo p. Prove that either
g or g + p is a primitive roots modulo p2 . In particular this shows that primitive roots modulo p2
exists One can show that primitive roots modulo pe exist for any e ≥ 1. See e.g. [JJ98, Thm 6.7].
Exercise 2.14 Find all integer solutions to each of the equations

3x = 4 (mod 123)
5x = 23 (mod 123)
5x = 234 (mod 2002).
3. Cryptography

The basic idea of cryptography is that we want to send coded messages between two persons
(call them Alice and Bob) without anyone else being able to read and/or change the message.
A major benefit of modern cryptosystems is that they allow Alice and Bob to exchange secret
messages without having secretly exchanged the keys they are using to encrypt and decrypt their
messages. They can use keys that they construct and then exchange in an open non-encrypted form
of communication (like the internet). In fact we all use such encrypted systems every day in our
daily interactions with modern digital technology. We will now describe the principles in some of
these cryptosystems.
We will always assume that the message is a number or a string of numbers. Clearly any text
can be translated to a string of numbers. For instance one can use the unicode-values of the standard
characters. See also [Ste09, p. 3.3.2] for an even more basic implementation.
In the two cryptographic settings we will describe in some detail (Diffie-Hellman and RSA) a
central role is played by large primes. We therefore start by discussing techniques to verify if a
number is prime or not.

3.1 Primality testing


We want to find various ways to verify if a number n is a prime. The naive way is of course to see if

any 1 < d ≤ n divides n. If not n is indeed a prime. This method of testing for primality is very
slow
There are several ways of characterizing primes. Here are a few basic ones:

Theorem 3.1.1 — Wilson. A number p > 1 is a prime if and only if (p − 1)! = −1 (mod p).

Proof. The statement is clear for p = 2 so assume p > 2. If p is a prime then every 1, 2, . . . p − 1 has
multiplicative inverses in F p since it is a field. If b is its own inverse then b2 − 1 = (b − 1)(b + 1) is
divisible by p so by Euclid’s lemma 1.4.1 p divides b − 1 or b + 1. But this means that b equals
1 or p − 1. Since all the other integers less that p − 1 has a unique inverse different from itself it
follows that (p − 1)! = p − 1 = −1 (mod p).
26 Chapter 3. Cryptography

Let on the other hand (p − 1)! = −1 (mod p). If we assume that p is not a prime then it has
a non-trivial factor 1 < l < p. It follows that 0 = −1 (mod l) which is a contradiction since an
integer l > 1 cannot divide 1. ■

In terms of efficiency using Wilson’s theorem to check for primality is horribly slow as it is very
time consuming to find (p − 1)! if p is large.

Theorem 3.1.2 Let p > 1. Then p is a prime if and only if a p−1 = 1 (mod p) for every a ̸= 0
(mod p).

Proof. Assume p is prime. Then by Theorem 2.3.1 and (2.4) we have a p−1 = 1 (mod p).
If on the other hand a p−1 = 1 (mod p) for every a ̸= 0 (mod p) then if p is not a prime let
1 < l < p be a non-trivial factor. Then l p−1 = 1 (mod l) which implies that l|1 which is impossible.
By contradiction we conclude that p must be a prime. ■

3.1.1 Carmichael numbers


By Theorem 3.1.2 we can test if n is a prime number by verifying an−1 = 1 for all a not divisible
by n. One could speculate if it was enough to verify this for all a which are relatively prime to n.
This is not the case: There exists “many” composite numbers n such that an−1 = 1 for all a which
are relatively prime to n. Such numbers are called Carmichael numbers:
Definition 3.1.1 A Carmichael number n is a composite number which satisfies that an−1 = 1
(mod n) for every a which is relatively prime to n.

■ Example 3.1 We want to show that 561 is a Carmichael number. Note that 561 = 3 · 11 · 17.

Assume that a is relatively prime to 561. Then a is also relatively prime to 3, 11, and 17. It then
follows from Euler’s theorem 2.3.1 that

a560 = a2·280 = 1 (mod 3), a560 = a10·56 = 1 (mod 11), a560 = a16·35 = 1 (mod 17)

It follows that a560 − 1 is divisible by 3, 11, and 17, and therefore by Proposition 1.2.6 (iii) also by
561. Hence a560 = 1 (mod 561), so 561 is a Carmichael number. ■

R There are many Carmichael numbers: 561, 1105, 1729, 2465, 2821, 6601, 8911 are all the
Carmichael numbers less than 10000. Ahlford, Granville, and Pomerance [AGP94] proved
in 1994 that there are infinitely many Carmichael numbers. It is a wide open problem to
determine asymptotics for the number of Carmichael numbers less than a given x.

3.1.2 Miller-Rabin’s primality test


The Miller-Rabin primality test is based on the following theorem:

Theorem 3.1.3 Let p > 1 be an integer, and write p − 1 = 2k m where m is odd. Then p is a
prime if and only if for every a ̸= 0 (mod p)
r
either am = 1 (mod p) or a2 m = −1 (mod p) for some 0 ≤ r < k. (3.1)

k
Proof. Assume that p > 1 is prime. By Theorem 3.1.2 1 = a p−1 = am2 (mod p), so
k−1 k−1
(am2 + 1)(am2 − 1) = 0 (mod p).
3.1 Primality testing 27
k−1 k−1
By Euclid’s lemma 1.4.1 this implies that am2 = ±1 (mod p). If am2 = −1 (mod p) we are
k−2
done. If not we can use the same argument to conclude that am2 = ±1 (mod p). Continuing to
r
strip off a power of 2 if am2 = 1 (mod p) until r = 0 we find that am = ±1 (mod p) which shows
that (3.1) holds.
Assume now instead that (3.1) holds. If am = 1 (mod p) then
k k
a p−1 = am2 = (am )2 = 1 (mod p).
r
If a2 m = −1 (mod p) then
r k−r k−r−1
a p−1 = (a2 m )2 = ((−1)2 )2 =1 (mod p).
In any case we have a p−1 = 1 (mod p) so by Theorem 3.1.2 p is a prime. ■
Given an odd number p, the existence of a single a ̸= 0 (mod p) not satisfying (3.1) suffices for
proving that p is not a prime.
If (3.1) holds for a given a ̸= 0 (mod p) we say that p is a pseudo-prime to base a. Hence a
prime is a number which is a pseudo-prime to any base a ̸= 0 (mod p). If we have tested that p
is a pseudo-prime to sufficiently many bases we might start to suspect that maybe it is indeed a
prime. This can be made precise and there is a way to use this to verify that p is a prime with a
sufficiently high probability. We will not go into the details of how this is defined but mention only
that such probabilistic tests are usually much faster than deterministic tests. And for applications in
cryptosystems we might be content with knowing that the communication is secure with probability
≥ 1 − ε with the size of ε depending on the concrete case. If I am encrypting my bank account
information ε = 10−10 might suffice, whereas if the national security is at stake I might want a
smaller ε.

R The Miller-Rabin primality test as described above is usually used as a probabilistic test. It
turns out that if we knew a generalization of the Riemann hypothesis, then we could prove
that if p is a pseudo-prime to base a for the first 2 log p bases then p is indeed a prime. See
[Bac90] for a proof.

3.1.3 Modular exponentiation


In order to make good use of Theorem 3.1.2 as a primality testing tool we must have efficient ways
of computing an (mod m). This is called modular exponentiation. If a and m are huge number it is
very inefficient to simply compute an (mod m) by simply computing a mod m, a2 mod m, a3 mod
m, . . . , an mod m by successive multiplication by a and reducing modulo m. It is much more
k k−1
efficient to write n in binary and then use that a2 = (a2 )2 . This dramatically reduces the number
of multiplications needed. We illustrate by an example:
■ Example 3.2 We want to compute 375 (mod 100). The naive approach uses 75 multiplications
in (Z/100Z)× (note that 3 and 100 are relatively prime). Using Eulers’ theorem we see that
3ϕ(100) = 1 (mod 100). Using (2.7) we find that ϕ(100) = 100(1 − 1/2)(1 − 1/5) = 40. It follows
that 375 = 335 (mod 100). We now write 35 in binary, i.e. we find εi such that 35 = ∑ki=0 εi 2i . An
easy way to do this is as follows:
35 is odd so ε0 = 1,
(35 − 1)/2 = 17 is odd so ε1 = 1,
(17 − 1)/2 = 8 is even so ε2 = 0,
(8 − 0)/2 = 4 is even so ε3 = 0,
(4 − 0)/2 = 2 is even so ε4 = 0,
(2 − 0)/2 = 1 is odd so ε5 = 1.
28 Chapter 3. Cryptography

This implies that 35 is 100011 in binary. We now compute the 2i powers of 3 (mod 100).
0
32 = 3 (mod 100)
21 20 2 2
3 = (3 ) = 3 = 9 (mod 100)
22 21 2 2
3 = (3 ) = 9 = 81 (mod 100)
3 2
32 = (32 )2 = 812 = 61 (mod 100)
4 3
32 = (32 )2 = 612 = 21 (mod 100)
5 4
32 = (32 )2 = 212 = 41 (mod 100)

5 1 0 5 1 0
It follows that 335 = 32 +2 +2 = 32 32 32 = 41 · 9 · 3 = 7 (mod 100). This method is much more
efficient than the naive multiplication. We have only made 7 multiplications (and 5 divisions by 2).

3.2 Concrete cryptosystems


We are now equipped with effective tools for
1. finding large primes (Section 3.1.2 )
2. doing modular exponentiation (Section 3.1.3) , and
3. solving the equation ax = 1 (mod n) when gcd(a, n) = 1 using Bezout’s theorem 1.2.4 and
Remark 1.3
These tools will be instrumental in the concrete cryptosystems we are now going to describe.
We should warn that there are many delicate points when one wants to actually implement such
systems in order not to introduce more or less subtle vulnerabilities. We will not discuss such points
here but only focus on the general mechanism.

3.2.1 Diffie–Hellman
The Diffie–Hellman key exchange protocol is a way for Alice and Bob to agree on a secret number
that they can use to encrypt and decrypt messages that they send to each other. It allows them to
agree on a secret number that only they know, even if anyone can listen to their communication.
The protocol was proposed by W. Diffie and M.E. Hellman in the 1970s [DH76], and is still being
used in various cryptosystems.
It is a so-called symmetric key exchange meaning that the secret key (a number) is used for
both encryption and decryption.
Agreeing on a key
The Diffie–Hellman key exchange protocol consists of the following steps.

Agreeing on a secret number


1. Alice and Bob agree in public on a prime p and a random number 1 < g < p.
2. Alice secretly chooses an integer nA and computes gnA (mod p).
3. Bob secretly chooses an integer nB and computes gnB (mod p).
4. Alice and Bob sends, possibly on an insecure line, the results of their computations to the
other.
5. The secret shared key is now

s = (gnA )nB = (gnB )nA (mod p),


3.2 Concrete cryptosystems 29

which both Alice and Bob can compute.

This protocol enables Alice and Bob to find a secret key that only they know, and even if
they have only communicated over insecure channels they have managed to agree on a number s
that only they know how to compute. They can now use this number to encode and decode their
messages (using standard tools like AES, Twofish, Serpent, Blowfish, CAST5, Grasshopper, RC4
or similar).
■ Example 3.3 Here is a small example of what Alice and Bob needs to compute. In more realistic
examples the numbers are of course much larger. Even for these small examples it is convenient to
use a computer to do the computations.
1. p = 101, g = 41.
2. nA = 56, gnA = 95 (mod 101).
3. nB = 22, gnB = 65 (mod 101).
4. Alice sends 95 to Bob and receives 65 from him.
5. Alice computes s = 6556 = 36 (mod 101), and Bob computes s = 9522 = 36 (mod 101), so
36 is the secret key.

The Discrete Log Problem


It is reasonable to consider if someone listening to Alice and Bob’s communication (call him
Charlie) can find s based on the information available to him. He does after all have access to
both g, p, gnA (mod p), and gnB (mod p). One thing he could try to do is to compute gi (mod p)
for every i = 1, 2, .. until he finds an i0 such that gi0 = gnA (mod p) . Then ord(g) | nA − i0 so
i0 = nA + k · ord(g) which implies that

(gnB )i0 = gnB i0 = gnB (nA +kord(g)) = gnA nB = s (mod p),

i.e. Charlie has found the secret key.


This is indeed a way to break this type of code. The upshot is that if p is large and Alice has
not been extremely unlucky in her choice of parameters this will be extremely time-consuming, to
the extent where it is practically impossible using current technology. We have run into a particular
case of the discrete log problem.

Problem 3.1 — The discrete log problem. Let G be a finite group and let g ∈ G. Given an
element a in the cyclic group generated by g, find n such that

gn = a

In general solving the discrete log problem is believed to be computationally infeasible. No general
algorithm is known, which is polynomial in the number of digits of the size of the group G. There
are groups (Like (Z/nZ, +)) where solving the discrete logarithm problem is easy, but in general,
and for the group (Z/pZ)× it is considered very hard. Analogous to factoring into primes there is a
quantum computer algorithm developed by Schor [Sho94] which solves this in polynomial time, so
if the technology of quantum computers computers matures sufficiently, Alice and Bob need to be
careful when using the Diffie–Hellman protocol. And at all times they need to choose the size of p
in a way where they are not vulnerable to this type of attack.

Man in the Middle attack


The Diffie–Hellman key exchange is also vulnerable to another kind of attack called the man-in-
the-middle attack. The problem is the following:
30 Chapter 3. Cryptography

What if, when Alice and Bob transmit their respective values gnA (mod p) and gnB (mod p) to
each other, Charlie manages to substitute gnA (mod p) by gnC (mod p) and also gnB (mod p) by
gnC (mod p) without Alice and Bob noticing. Here nC is a value that Charlie chooses.
Then Alice computes sA = (bnC )nA (mod p) believing this is the shared secret, while Bob
computes sB = (bnC )nB (mod p) believing this is the shared secret. But Charlie can compute both
sA , and sB since he knows gnA (mod p), gnB (mod p), and nC . Hence Charlie can completely
control the encrypted correspondence between Alice and Bob.

3.2.2 RSA
A more flexible and much more widely used protocol is the RSA protocol invented by R. Rivest, A.
Shamir, and L. Adleman in the 1970’ies [RSA78]. This is a so-called asymmetric algorithm which
uses different “keys” for encrypting and decrypting.
Constructing private and public keys
In RSA Alice needs to construct 2 keys. One for encrypting and one for decrypting. What she
needs to do is the following:

Constructing keys
1. Choose secretly two large primes p1 , p2 .
2. Compute secretly nA = p1 p2 and ϕ(nA ) = (p1 − 1)(p2 − 1), and
3. Choose a “random” 1 < eA < ϕ(nA ) which is coprime to ϕ(nA ).
4. Find secretly a solution dA to the equation eA x = 1 (mod ϕ(nA )) (using Bezout’s identity).
5. Make public the public key (eA , nA ).
6. Keep secret the private key (dA , nA ).

If Bob now wants to send the message a mod nA to Alice he simply encrypts his message by
computing aeA mod nA and sends it to Alice. Alice then computes (aeA )dA mod nA and uses the
following Proposition:
Proposition 3.2.1 Let n be a product of distinct primes. Assume ed = 1 (mod ϕ(n)). Then for
every a we have
aed = a (mod n).

Proof. Notice that ϕ(n) = ∏ p|n (p − 1). Since distinct primes are relatively prime it follows from
Proposition 1.2.6 (iii) that it suffices to prove that aed = a (mod p) for every prime dividing n. If a
is divisible by p this is clear since both sides equal 0. If a is not divisible by p the numbers a and p
are coprime, and Theorem 2.3.1 then gives that

aed = aed−1 a = aϕ(n)k a = a(p−1)l p a = a (mod p),

which completes the proof. ■

Note that in the proof of Proposition 3.2.1 we do not use that ed is a product of integers, but only
that it equals 1 mod ϕ(n). We could just as well have formulated it for an integer m = 1 mod ϕ(n),
and then concluded that am = a mod n. The reason it is formulated with a product m = ed is
that this shows directly that Alice can compute (aeA )dA mod nA and will get as result the original
message a mod nA . This is the basic idea of RSA.
Now Bob has a secure way to send messages to Alice. If Bob wants to be able to receive
encrypted messages he needs to go through the same procedure as Alice to construct his own public
key (eB , nB ) and private key (dB , nB ). If he makes the public key public Alice (or anyone else) can
encrypt a message b mod nB by computing beB mod nB and sending this coded messages to Bob
who can decrypt it using his private key.
3.3 Exercises for Chapter 3 31

■ Example 3.4 Let us compute a (unrealistically small) set of RSA-keys for Alice.
1. Choose the primes p1 = 43, p2 = 67.
2. nA = 43 · 67 = 2881, ϕ(nA ) = 42 · 66 = 2772.
3. We choose eA = 1003.
4. 1003x = 1 (mod 2772) has solution 655.
5. The public key is (1003, 2881).
6. The private key is (655, 2881).
Alice now puts the public key on her website. Bob wants to send her the number 1330 which is the
time he wants her to meet at their secret place the next day. He computes 13301003 mod 2881 =
2259 mod 2881 and sends Alice 2259. She computes 2259655 mod 2881 = 1330 mod 2881 and
now knows that Bob wants to meet her at 13.30.
In real life applications the primes are typically several 100 digits long. ■

Keep it secret, keep it safe


The security on RSA rests on a very simple observation: It is very easy to multiply two large primes
n = p1 · p2 , but if all we know is the result n, it seems very hard to find, in a reasonable amount of
time, the two prime factors p1 , p2 .1
If Charlie could somehow compute the prime factorization of nA he would be able to compute
ϕ(nA ) and then he could solve the equation eA x = 1 (mod ϕ(n)) meaning that he could find dA .
Once he knows dA he can decrypt any encrypted message to Alice he can get his hands on. Hence
it is crucial that Alice does not treat the two primes p1 , p2 , or ϕ(nA ) carelessly. In fact once the
keys (eA , nA ), (dA , nA ) have been generated the safest thing she can do is probably to delete them
completely. Many people suspect that without knowing the prime factorization of nA beforehand it
is generally as difficult to find dA as to factor nA .
If Charlie knows p1 , p2 he can compute ϕ(nA ) and break the code. Knowing ϕ(nA ) and nA
is in fact essentially equivalent to knowing the prime factors p1 , p2 as they are the roots of the
polynomial
x2 + (ϕ(nA ) − (nA + 1))x + nA ,
which follows from Exercise 3.3.

3.3 Exercises for Chapter 3


Exercise 3.1 — Korselt’s criterion. Let n be a composite square-free number. Assume that
p − 1 | n − 1 for every p dividing n. Show that n is a Carmichael number.
Exercise 3.2 Show that 1105, 1729, and 2465 are Carmichael numbers.
Exercise 3.3 Let n be a product of two primes p1 , p2 . Show that the polynomial

x2 + (ϕ(n) − (n + 1))x + n

has p1 and p2 as its roots.


Exercise 3.4 Find the last two digits of 575 and 775 .

1I
dare you to find the prime factors of
n =159542902770478353477333649649796581389938839451613607718093662563072803655525608627699609028760
74683622288154113013374670636273789063661078805497874071102398808291120799306047641864281161403791
13482778195649341783699438770737541361816167097612982009724078014971271538123606653607727522296003
21587160692562661
which is a product of two primes. The primes were computed and multiplied together on my laptop in about a second.
This is a realistic size n for RSA.
4. Quadratic reciprocity

In the field F p it is easy to solve first degree equations ax + b = 0 (mod p). Since F p is a field
this has a solution whenever a ̸= 0 mod p given by x = a−1 (−b) (mod p) where a−1 mod p is the
multiplicative inverse of a mod p in F p .
Consider now a second degree equation of the form

ax2 + bx + c = 0 (mod p) (4.1)

where we assume a ̸= 0 (mod p). By Proposition 2.4.1 this equation can have at most 2 solutions.
Assume that p ̸= 2 (See Exercise 4.1 for the case p = 2.) To understand when it has solutions we
complete the square, write D = b2 − 4ac and find

1
0 = ax2 + bx + c = 4a2 x2 + 4abx + 4ac

4a
1 1
(2ax + b) − b2 + 4ac =
2
(2ax + b)2 − D
 
= (mod p)
4a 4a
This small computation reduces the question of existence of solutions to (4.1) to the question of
existence of solutions to

y2 = D (mod p). (4.2)

Putting y = 2ax + b we see that (4.1) has solutions if and only if (4.2) has solutions. If D = 0
(mod p) equation√ 4.2 has exactly one solution namely y = 0, and if√D ̸= 0 (mod p) and (4.2) has
a solution (call it D) then it has two distinct solutions namely ± D. In this case it follows that
the solutions to (4.1) are given by the familiar formula

1  √ 
−b ± D .
2a

We will now develop methods for determining whether the equation y2 = D (mod p) has solutions
or not.
4.1 Quadratic residues and Legendre symbols 33

4.1 Quadratic residues and Legendre symbols


Definition 4.1.1 Let p be a prime. An integer a ̸= 0 (mod p) is called a quadratic residue
modulo p if y2 = a (mod p) has a solution. If not a is called a quadratic non-residue.

We note that a ∈ Z not divisible by p is a quadratic residue modulo p if and only a mod p ∈ F× p
is a square in F×
p i.e. a mod p = b2 for some b ∈ F× , and a quadratic non-residue if and only if
p
a mod p is not a square in F×
p .
■ Example 4.1 Let p = 7. We compute all squares modulo p

y mod 7 0 1 2 3 4 5 6
y2 mod 7 0 1 4 2 2 4 1

and we see that 1, 2 and 4 are quadratic residues and 3,5,6 are quadratic non-residues modulo 7 ■

Definition 4.1.2 Let p > 2 be a prime and let a ∈ Z. We define the Legendre symbol by

0
   if a = 0 (mod p)
a
= 1 if a is a quadratic residue modulo p
p 
−1 if a is a quadratic non-residue modulo p

We note that
   
a1 a2
if a1 = a2 (mod p) then = . (4.3)
p p

We note also that ap = 0 if and only if a = 0 (mod p). It follows that the map


ψ: F×
p → {±1}
a
 (4.4)
a mod p 7→ p

is well-defined. Note that the set {±1} equipped with ordinary multiplication is a group of order 2.
Lemma 4.1.1 The map ψ is a surjective group homomorphism.

Proof. Note that f ∈ F p satisfies ψ( f ) = 1 if and only if f is a square in F×


p . By Corollary 2.4.5

p is cyclic, so let g be a generator. Hence g p−1 = g0 = 1 mod p, and p − 1 is the smallest number

with this property. The elements of F× p are precisely

p−3 p−1 p+1


g1 , g2 , . . . g 2 ,g 2 ,g 2 , . . . , g p−1 .

The squares of these gives

g0 g2 g
=
=

g2 , g4 , . . . g p−3 , g p−1 , g p+1 , . . . , g2(p−1)

so f ∈ F× 2i j j
p is a square precisely if f = g for some i. It follows that ψ(g ) = (−1) . But this is
clearly a group homomorphism since

ψ(gi g j ) = ψ(gi+ j ) = (−1)i+ j = (−1)i (−1) j = ψ(gi )ψ(g j ).

Observing that ψ(g) = −1, and ψ(g2 ) = 1 shows that ψ is surjective. ■


34 Chapter 4. Quadratic reciprocity

Corollary 4.1.2 Let p be a prime and let a, b ∈ Z. Then ab a b


  
p = p p .

Proof. If p | ab then by Lemma 1.4.1 we have p | a or p | b, so both sides of the equation are zero.
If p ∤ ab then ab mod p ∈ F×p and by Lemma 4.1.1 we have

    
ab a b
= ψ(ab mod p) = ψ(a mod p)ψ(b mod p) = .
p p p

Corollary 4.1.3 Let Q p = {a mod p ∈ F×


p |a is a quadratic residue modulo p}. Then Q p is a
subgroup of F×
p of index 2. In particular there are the same number of squares as non-squares in
F×p .

Proof. We have Q p = ker ψ, from which the result follows that Q p is a subgroup. It follows
from the isomorphism theorem that Im(ψ) = {±1} is isomorphic to the quotient F× p / ker ψ so
#Q p = (p − 1)/2 from which the claim follows. Note also that it follows from the proof of Lemma
4.1.1 that the squares are exactly g2 , g4 , . . . , g p−1 ■

4.2 Quadratic reciprocity


Consider the following question: Let p, q be two different primes. What is the relation (if any)
between the solvability of the equation x2 = p (mod q) and the solvability of x2 = q (mod p)? Or
p q
said using Legendre symbols: What is the relation between q and p
This is the content of the main theorem of this chapter:

Theorem 4.2.1— Quadratic reciprocity. Let p, q be different odd primes. Then


 
p q (p−1)(q−1)
(i) = (−1) 4 ,
q  p
−1 p−1
(ii) = (−1) 2 ,
 p
2 p2 −1
(iii) = (−1) 8 .
p

We note that in case (ii) the right-hand side depends only on p mod 4 since if p = a (mod 4) then
(p − 1)/2 = (a − 1)/2 + 2k for a k ∈ Z. In case (iii) the right-hand side depends only on p mod 8
since if p = a (mod 8) then (p2 − 1)/8 = (a2 − 1)/8 + 2k′ for a k′ ∈ Z.

R The quadratic reciprocity theorem was first conjectured by Legendre and Euler and first
rigorously proved by Gauss, probably in 1796. Gauss gave several proofs and called it the
“aureum theorema” (golden theorem) in his diary. There are now more than 300 proofs (see
e.g. [Bau15, Ch. 15]). We will make an elegant proof which uses Gauss sums. Gauss sums
are important tools in number theory. For a very elementary proof using nothing more than
the Chinese remainder theorem see e.g [Rou91].

■ Example 4.2 Before diving into the proof of quadratic reciprocity let us see how it can be used

to determine if -14 is a quadratic residue modulo 2503 which can easily be verified to be a prime.
4.2 Quadratic reciprocity 35

We compute
     
−14 −1 2 7
= by Corollary 4.1.2
2503 2503 2503 2503
 
2503−1 25032 −1 2503−1 7−1 2503
= (−1) (−1)
2 (−1) 8 2 2 by Quadratic reciprocity
7
 
2503
= (−1)1(−1)
7
 
4
= by (4.3) since 2503 = 357 · 7 + 4
7
 2
2
= =1 by Corollary 4.1.2
7

so −14 is a quadratic residue modulo 2503. Indeed x2 = −14 has solutions ±274 ■

Proposition 4.2.2 — Euler’s criterion. Let p > 2 be a prime and a ∈ Z. Then ap = 1 if and only


if a(p−1)/2 = 1 (mod p).

Proof. Recall the homomorphism ψ from (4.4). The map ρ : F× ×


p → F p , defined by ρ(a mod p) =
a(p−1)/2 mod p is a group homomorphism (Exercise 4.2), so ker(ρ) is a subgroup of F× p i.e.
ker(ρ) ≤ F× p . We have also Q p = ker ψ ≤ ker ρ ≤ F × since if g ∈ ker ψ then g = h2 so ρ(g) =
p
g(p−1)/2 = h p−1 = 1, since F× p has order p − 1. Since Q p has index 2, ker ρ will have index 1 or 2
× ×
in F p . But if it has index 1 then every g ∈ F p is a root of x(p−1)/2 − 1 which contradicts Proposition
2.4.1, so ker ρ must have index 2 in F× p . But this implies that #Q p = # ker ρ = (p − 1)/2 which
forces Q p = ker ρ which shows the result. ■

Proposition 4.2.3 Let p > 2 be a prime and let a ∈ Z. Then ap = a(p−1)/2 (mod p).


Proof. If p | a both sides are zero so suppose not. We notice that a(p−1)/2 mod p is a solution
to x2 = 1 mod p. By Proposition 2.4.2 there are exactly two solutions to this equation and since
±1 mod p are indeed solutions we must have a(p−1)/2 = ±1 (mod p).
Proposition 4.2.2 says that a is a quadratic residue precisely if a(p−1)/2 = 1 (mod p) so, if a is
a quadratic non-residues we must have a(p−1)/2 = −1 (mod p), since there are no other possible
values for it. ■

We can now prove Theorem 4.2.1 (ii):

Corollary 4.2.4 Let p be an odd prime. Then −1 (p−1)/2 .



p = (−1)

Proof. By Proposition 4.2.3 with a = −1 we find −1 = (−1)(p−1)/2 (mod p), so −1 −(−1)(p−1)/2


 
p p
is divisible by p. Since −1 (p−1)/2 ∈ {0, ±2} and p > 2 this implies the claim.

p − (−1) ■

4.2.1 Algebraic integers


Before continuing to prove the rest of the quadratic reciprocity theorem, we need a small intermezzo
about algebraic integers. We need to work in congruences in a slightly more general setting than
we have done so far.
Definition 4.2.1 A complex number ζ is called algebraic if it is a root in non-zero polynomial
with integer coefficients, i.e. if f (ζ ) = 0 for some 0 ̸= f ∈ Z[x]. If ζ is root of a monic
polynomial it is called an algebraic integer. The set of algebraic numbers are denoted by Q and
36 Chapter 4. Quadratic reciprocity

the set of algebraic integers are denoted by Z.


■ Example√
4.3 • If n ∈ Z then n ∈ Z since n is a root of x − n.
• (1 + √13)/2 is an algebraic integer since it is a root of x2 − x − 3
• 54 + 17 2 is an algebraic integer since it is a root of (x − 54)17 − 2
2πi
• Let n ∈ N. Then ζn = e n is an algebraic integer since it is a root of xn − 1
• Let n, m ∈ Z. Then n + im is an algebraic integer since it is a root of (x − n)2 + m2

Definition 4.2.2 Let ζ be an algebraic number. Then d = min{deg(p)|p(ζ ) = 0, p ∈ Z[x]} is


called the degree of ζ .

We note that ζ is rational if and only if ζ is algebraic and of degree 1. If ζ is algebraic of degree 2
we call ζ a quadratic irrational.

R There are many complex numbers which are not algebraic. In fact the algebraic numbers are
countable (See Exercise 4.4), and since the complex numbers are uncountable most complex
numbers are non-algebraic. We call them transcendental. Proving that specific numbers are
transcendental is usually √
rather tricky. Examples of provably transcendental numbers are

10 −i! , e, π, eπ , 2 2 . On the other hand there are many numbers which are conjectured
∑i=1
but not proved to be transcendental like π ± e, γ, ζ (3) and many others. The study of such
properties is usually called transcendence theory (Consult [Wal79]).

Since Z ⊆ C we can multiply, and subtract two algebraic integers, and – perhaps surprisingly –
the result is still an algebraic integer as we see from the following result:

Theorem 4.2.5 The set Z is a subring of C, and Q is a subfield of C.

We will not prove Theorem 4.2.5, but refer to [Jar14, Cor 2.28 and Cor. 2.12]. We shall only use
the first part the theorem.

Theorem 4.2.6 The rational algebraic integers are precisely the integers i.e.

Z∩Q = Z

Proof. We have already seen in the above examples that Z ⊆ Z ∩ Q. Let ζ = mn be a rational
algebraic integer which is a root of the monic polynomial f (x) = xk + ∑k−1 i
i=0 ai x with integer
coefficients. We may assume that gcd(m, n) = 1 and n ≥ 1. Inserting ζ in f , multiplying by nk and
subtracting mk we find
k−1
∑ ai mi nk−i = −mk , (4.5)
i=0

so since n is a clearly a factor in the left-hand side we see that n | mk which implies that n = 1 since
gcd(m, n) = 1. But then ζ = m ∈ Z which completes the proof. ■

We define congruences in the algebraic integers as follows: For a, b, c ∈ Z we write a=b


˜ (mod c)
if there exist a d ∈ Z such that a − b = cd.
˜ (mod c) where a, b, c ∈ Z. Then a = b (mod c).
Proposition 4.2.7 Suppose a=b

˜ (mod c) there exist a d ∈ Z such that a − b = cd. But this implies that d ∈ Q so
Proof. Since a=b
by Theorem 4.2.6 we have d ∈ Z which proves the claim. ■
4.2 Quadratic reciprocity 37

ζ2 ζ

ζ3 ζ6 = 1

ζ4 ζ5

Figure 4.1: 6th roots of 1

R The theory of algebraic numbers is vast, fascinating, and important in many modern mathemat-
ical theories. Here we are only barely touching the surface. For a more thorough introduction
consult [Hec81; Jar14; Mar77; Neu99].

Definition 4.2.3 An nth root of 1 is a complex number ζ ∈ C such that ζ n = 1. The root is
called primitive if ζ k ̸= 1 for 1 ≤ k < n.

Note that both primitive and non-primitive roots of 1 are algebraic integers as they are roots of
f (x) = xn − 1
2πik
Proposition 4.2.8 The nth roots of 1 are precisely ζnk = e nwhere 0 ≤ k < n. The primitive nth
2πik
roots of 1 are precisely ζnk =e n where 0 ≤ k < n and gcd(n, k) = 1 .

Proof. See Exercise 4.9 ■

■ Example 4.4 The 6th primitive roots of 1 are ζ6 = e2πi/6 and ζ65 = e2πi5/6 . All 6th roots of 1 are
illustrated in Figure 4.1 . ■

If n ∈ N we will always write ζn = e2πi/n which is an nth primitive root of 1.


We consider the set

Z[ζn ] = {a0 + a1 ζn + a2 ζn2 + . . . an−1 ζnn−1 |ai ∈ Z}.

We notice immediately that Z[ζn ] is a subring of Z. We can now prove Theorem 4.2.1 (iii):

Proof. Let p be an odd prime. Consider ζ = ζ8 = e2πi/8 ∈ Z[ζ8 ] which is an 8th primitive root of
1. Note that ζ −1 = ζ 7 ∈ Z[ζ8 ]. We have

0 = ζ 8 − 1 = (ζ 4 + 1)(ζ 4 − 1) = (ζ 2 + ζ −2 )ζ 2 (ζ 4 − 1)

Since ζ is an 8th primitive root of 1 it follows that ζ 2 + ζ −2 = 0. If we let τ =ζ + ζ −1 , then


τ 2 = ζ 2 + ζ −2 + 2 = 2. It follows from Proposition 4.2.3 that τ p−1 = 2(p−1)/2 = 2p (mod p) and
by multiplying the underlying equality by τ we find that
 
p 2
τ =τ˜ (mod p)
p

Since the binomial coefficients pi are divisible by p if 1 ≤ i ≤ p − 1 (See Exercise 4.10). It follows


from the binomial formula that for

x, y ∈ Z we have (x + y) p =x
˜ p + yp (mod p). (4.6)

Using this for x = ζ , y = ζ −1 we see that


 
p −p 2
ζ + ζ =τ ˜ (mod p) (4.7)
p
38 Chapter 4. Quadratic reciprocity

We now have to go through the 4 cases p = ±1, ±3 (mod 8):


If p = 1 (mod 8) then ζ ±p = ζ ±1 so (4.7) reads τ =τ ˜ 2p (mod p) so multiplying by τ and
using τ 2 = 2 we find 2=2 ˜ 2p (mod p). We can now use Proposition 4.2.7 to conclude that


2 = 2 2p (mod p). Multiplying by an integer b satisfying 2b = 1 (mod p) (Use gcd(2, p) = 1 and




Bezout to find such b) we see that 1 = 2p (mod p). We conclude that p | ( 2p − 1) and since
 
2 2
 
p − 1 ∈ {0, ±2} we find that in this case p = 1.
If p = −1 (mod 8) then ζ ±p = ζ ∓1 so we find τ =τ ˜ 2p (mod p) as before so we can again


conclude 2p = 1.


If p = 3 (mod 8) then ζ p = ζ 3 = ζ 4 ζ −1 = −ζ −1 and ζ −p = −ζ so ζ p + ζ −p = −τ and (4.7)


˜ 2p (mod p). Arguing as before this leads to 2p = −1.
 
reads −τ =τ
If p = −3 (mod 8) then ζ p = ζ −3 = ζ −4 ζ 1 = −ζ 1 and ζ −p = −ζ −1 so again ζ p + ζ −p = −τ
and we find 2p = −1.

4.2.2 Gauss sums


In order to generalize the proof of Theorem 4.2.1 (ii) presented above we introduce some very
important sums:
Definition 4.2.4 Let p be an odd prime. The Gauss sum associated to a ∈ Z is the sum

p−1  
n an
g p (a) = ∑ ζ .
n=0 p p

We note that since ζ pa+p = ζ pa since ζ p is a pth root of 1 we have g p (a + p) = g p (a) so the Gauss sum
depends only on a mod p. We note also that since ζ p lies in the ring Z[ζ p ] we have g p (a) ∈ Z[ζ p ].
■ Example 4.5 Let p = 5. We have ζ5 = e2πi/5 and

n 0 1 2 3 4
n
5 0 1 -1 -1 1

g5 (0) = 0 + 1 + (−1) + (−1) + 1 = 0


√ √
2πi 2πi2 2πi3 2π
2πi4 2π3 −1 + 5 −1 − 5 √
g5 (1) = e − e − e + e
5 5 = 2 cos( ) − 2 cos(
5 5 )=2 −2 = 5
5 5 4 4
2πi2 2πi4 2πi6 2πi8 2πi2 2πi4 2πi 2πi3 √
g5 (2) = e 5 − e 5 − e 5 + e 5 = e 5 − e 5 − e 5 + e 5 = −g5 (1) = − 5
2πi3 2πi6 2πi9 2πi12 2πi3 2πi 2π4 2πi2 √
g5 (3) = e 5 − e 5 − e 5 + e 5 = e 5 − e 5 − e 5 + e 5 = −g5 (1) = − 5
2πi4 2πi8 2πi12 2πi16 2πi4 2πi3 2π2 2πi √
g5 (4) = e 5 − e 5 − e 5 + e 5 = e 5 − e 5 − e 5 + e 5 = g5 (1) = 5

The next lemma shows that this pattern holds much more generally.
Lemma 4.2.9 Let p be a prime. Then for any a ∈ Z we have
 
a
g p (a) = g p (1)
p

Proof. If a = 0 (mod p) then the legendre symbol is zero so we have to show that in this case we
p−1 n 
have g p (a) = 0. But we have g p (a) = g p (0) = ∑n=1 p which equals 0 by Corollary 4.1.3.
4.2 Quadratic reciprocity 39

Assume now that a ̸= 0 (mod p). Then


  p−1   
a a n an
g p (a) = ∑ ζ
p n=0 p p p
p−1  
an an
=∑ ζ by Corollary 4.1.2
n=0 p p
p−1  
n n
=∑ ζ p = g p (1)
n=0 p

where in the penultimate equality we have used the following reasoning: The summands only
depend on n mod p. Since gcd(a, p) = 1 we see that multiplication by a gives a bijection from
Z/pZ to itself so an mod pruns through the same set as n mod p and the sum is unchanged.
Multiplying both sides by ap = ±1 gives the result. ■

We will often write g p = g p (1).


Lemma 4.2.10 Let n ∈ N and x, y ∈ Z. Then
(
1 n−1 (x−y)a 1 if x = y (mod n)
∑ ζn =
n a=0 0 otherwise.

Proof. If x = y (mod n) then ζnx−y = 1 and the claim is clear. If not then ζnx−y ̸= 1 and we may use
the formula for the geometric series to see that
n−1 (x−y)n
(x−y)a ζn −1
∑ ζn = (x−y)
=0
a=0 ζn −1
since ζnn = e2πin/n = 1. ■


Theorem 4.2.11 Let a ̸= 0 (mod p). Then g p (a)2 = (−1)(p−1)/2 p. In particular |g p (a)| = p.
2
Proof. It suffices to prove the claim for a = 1, since by Lemma 4.2.9 g p (a)2 = g p (1)2 ap = g2p .
(Recall that we write g p = g p (1)).
 Note  that by−1Lemma 4.2.9 and Corollary 4.1.2 we have – if a ̸= 0 (mod p) – that g p (a)g p (−a) =
a −a 2 (p−1)/2 g2 , where we have used Corollary 4.2.4. Also g (0) = 0 by
 2
p p g p = p g p = (−1) p p
Lemma 4.2.9. The proof of the claim will follow from two different computations of the expression
p−1
∑a=0 g p (a)g p (−a). By the above observation it equals (p − 1)(−1)(p−1)/2 g2p . Using the definition
of g p (a) we find
p−1 p−1 p−1 p−1   
n m an −am
(p − 1)(−1)(p−1)/2 g2p = ∑ g p (a)g p (−a) = ∑ ∑ ∑ ζ ζ
a=0 a=0 n=0 m=0 p p p p
p−1 p−1    p−1
n m a(n−m)
=∑ ∑ ∑ ζp
n=0 m=0 p p a=0
p−1   
n n
=p∑ = p(p − 1)
n=0 p p

where in the penultimate equality we have used Lemma 4.2.10. Dividing by p − 1 gives the
result. ■
40 Chapter 4. Quadratic reciprocity

We are now ready to prove the main part of quadratic reciprocity namely Theorem 4.2.1 (i).

Proof. Let p∗ = (−1)(p−1)/2 p. Then according to Theorem 4.2.11 we have g2p = p∗ .


∗
By Proposition 4.2.3 we have (p∗ )(q−1)/2 = pq (mod q). Combining these we find that
 ∗
2 (q−1)/2 ∗ (q−1)/2 p
gq−1
p = (g p ) = (p ) = (mod q).
q

Multiplying by g p we no longer have a congruence in Z but only in Z. We get


 ∗
q p
g p =g
˜ p (mod q)
q
Using (4.6) we see that the left-hand side satisfies
!q
p−1   p−1  q
q n n n
gp = ∑ ζp = ˜ ∑ ζ pqn (mod q)
n=0 p n=0 p
q q
Since np = np the right-hand side equals g p (q) which by Lemma 4.2.9 equals

p gp. Putting it
all together we have proved
 ∗  
p q
gp =g
˜ p (mod q). (4.8)
q p
Multiplying by g p and using Theorem 4.2.11 and Proposition 4.2.7 we find that
 ∗  
p q
p∗ = p∗ (mod q)
q p

Since gcd(p∗ , q) = 1 we can multiply by an (p∗ )−1 satisfying p∗ (p∗ )−1 = 1 (mod q) which gives
that
 ∗  
p q
= (mod q)
q p
∗
from which we may conclude that pq = qp . To finish we note that


 ∗    (p−1)/2  
(−1)(p−1)/2 p
 
p −1 p (p−1)(q−1)/4 p
= = = (−1)
q q q q q
which finishes the proof. ■

■ Example 4.6 We want to determine if 5x4 + 35x2 + 3 = 0 (mod 541) has any integer solutions.
We first notice that if this has a solution x0 then y0 = x02 will be a solution to 5y2 + 35y + 3 = 0
(mod 541) which has discriminant D = 352 − 4 · 5 · 3 = 1165. We need to verify if the discriminant
is a square modulo 541 to see if this is even possible. So we compute using Theorem 4.2.1
         
1165 83 (83−1)(541−1)/4 541 43 (83−1)(43−1)/4 83
= = (−1) = = (−1)
541 541 83 83 43
   2       
40 2 2 5 2 43 3
=− =− = −(−1)(43 −1)/8 (−1)(43−1)(5−1)/4 =
43 43 43 43 5 5
= −1

But this means, that 5y2 +35y +3 = 0 (mod 541) has no solutions and therefore 5x4 +35x2 +3 = 0
(mod 541) also cannot have any solutions. ■
4.3 Exercises for chapter 4 41

4.3 Exercises for chapter 4


Exercise 4.1 Find all solutions to all second degree equations in F2 .
Exercise 4.2 Let G be an abelian group. Prove that ρn : G → G defined by ρn (g) = gn is a group
homomorphism.
5 3
  5! 
Exercise 4.3 Compute 107 , 1871 , 41 .
Exercise 4.4 Prove that there are only countably many algebraic numbers (recall that a countable
union of countable sets is again countable).
Exercise 4.5 Determine which of the following degree 2 equations have integer solutions:

x2 + 4x + 2 = 0 mod 11, x2 + 7x + 1 = 0 mod 59, x2 + 7x + 1 = 0 mod 61.

Exercise 4.6 Prove, using quadratic reciprocity, that for a prime p ≥ 5 we have
  (
3 1 if p = 1, 11 mod 12,
=
p −1 if p = 5, 7 mod 12.

Exercise 4.7 In this exercise you will give a direct proof that −3

p = 1 when p = 1 mod 3.
1. Show that (Z/pZ)∗ has an element c of order 3
2. Show that (2c + 1)2 = −3 mod p.
3. Conclude that −3 p = 1.
Show independently, using quadratic reciprocity, that −3

4. p = 1.
Exercise 4.8 You may use without proof that 213 − 1 = 8191 is prime. How many natural numbers
x ≤ 8191 satisfy the equation

x2 = 5 (mod 8191).

Exercise 4.9 Prove Proposition 4.2.8


Exercise 4.10 Let p be a prime. Show that if 1 ≤ i ≤ p − 1 then pi is divisible by p.

n k e
Exercise 4.11 (Jacobi symbols) Let n be an odd positive number. Factor it like n = ∏i=1 pi i . Then
a
we define the Jacobi symbol n by
  kn  ei
a a
=∏
n i=1 pi

Show that in general an = 1 does not imply that x2 = a (mod n) has a solution.


Showthat  
1. ab a b
n = n n for integers a, b.
−1
2. n = n (mod 4).
3. n2 = 1 if n = ±1 (mod 8), and −1 otherwise.
4. Assume further that a is positive and odd. Show that
   
a n
= (−1)(a−1)(n−1)/4 .
n a
q−1
Exercise 4.12 Let p, q be different odd primes and let q∗ = (−1)q. Show that x2 = p (mod q)
2

has an integer solution if and only if =x2 q∗


(mod p) has an integer solution.
2
Exercise 4.13 For which primes does x + 9x + 19 = 0 (mod p) have integer solutions? How
many solutions mod p does it have?
5. Arithmetic functions and Dirichlet series

5.1 Arithmetical functions


Definition 5.1.1 An arithmetic function is a map

f :N→C

Note that the set of arithmetical functions are in bijective correspondence to the set of complex se-
quences under f 7→ { f (n)}∞n=1 . As such, without making further assumptions, arithmetic functions
are not more nor less interesting than complex sequences.
■ Example 5.1 The following are all clearly arithmetical functions and it is very easy to cook up
other examples:
1. u(n) = 1
2. N(n) = n
3. Euler’s ϕ function
4. Fix p. The Legendre symbol np .

(
1 if n = 1
5. I(n) =
0 otherwise.
6. τ(n) = ∑d|n 1, the number of divisors of n.
7. σ (n) = ∑d|n d, the sum of divisors of n.
8. Let k ∈ C. σk (n) = ∑d|n d k ,
In the sums the d | n means that we sum over positive divisors.

Definition 5.1.2 An arithmetical function f is called multiplicative if

f (mn) = f (n) f (m) (5.1)

whenever m, n are coprime natural numbers. If (5.1) holds for all m, n ∈ N the arithmetical
function f is called completely multiplicative.
5.1 Arithmetical functions 43

All the arithmetical functions from Example 5.1 are multiplicative. (See Exercise 5.1). The
arithmetical functions u, N, np , I are completely multiplicative (See Exercise 5.2).
We note that if n has prime factorization n = pl11 · · · plkk with pi ̸= p j for i ̸= j then

f (n) = f (pl11 ) · · · f (plkk ) if f is multiplicative


= f (p1 )l1 · · · f (pk )lk if f is completely multiplicative

l l
Proposition 5.1.1 If n = p11 · · · pkk then
k
τ(n) = ∏(li + 1)
i=1
k
plii +1 − 1
σ (n) = ∏
i=1 pi − 1
Proof. Since
pl+1 − 1
σ (pl ) = 1 + p + p2 + · · · + pl =
p−1
l
τ(p ) = 1 + 1 + 1 + · · · + 1 = l + 1
the proof is immediate.

We note that it follows directly from Proposition 5.1.1 that τ(n) is odd if and only if li is even
for each i which happens if and only if n is a square, i.e. n = a2 for some a ∈ N.

5.1.1 Perfect numbers


We now describe an ancient notion going back at least to Euclid, namely that of perfect num-
bers:
Definition 5.1.3 A natural number n ∈ N is called perfect if n equals the sum of its proper
positive divisors.

We note that formulated in terms of the divisor function we have that a n is perfect if and only if
n = σ (n) − n, i.e. 2n = σ (n)
■ Example 5.2 The number 6 is perfect since σ (6) = 1 + 2 + 3 + 6 = 12 = 2 · 6. In the same way

we can check that 28, 496, 8128, 33550336, 8589869056 are perfect numbers. Using a computer
one can verify that these are the only perfect numbers less that 1010 . ■

Theorem 5.1.2 The number n is perfect and even if and only if n = 2 p−1 (2 p − 1) where p and
2 p − 1 are primes.

The “if” part of this theorem goes back to Euclid, but the “only if” part was a conjecture from
around year 1000 until it was proven by Euler in the 18th century.

Proof. The “if” part follows from observing that if p and 2 p − 1 are prime then 2 p−1 and 2 p − 1
are coprime and by Proposition 5.1.1 we have
2p − 1
σ (n) = σ (2 p−1 )σ (2 p − 1) = (1 + 2 p − 1) = 2 · 2 p−1 (2 p − 1) = 2n
2−1
which shows that n is perfect.
44 Chapter 5. Arithmetic functions and Dirichlet series

To prove the only if part we assume that n is perfect and even and write n = 2 p−1 q with p, q ∈ N,
p ≥ 2, and q odd. We note that q > 1 since otherwise n = 2 p−1 but σ (2 p−1 ) = 2 p − 1 ̸= 2 · 2 p−1 .
Using that n is perfect we see that

2p − 1
2 p q = 2n = σ (n) = σ (2 p−1 )σ (q) = σ (q),
2−1

so by Theorem 1.2.6 (iv) 2 p − 1 | q. Writing q = r(2 p − 1) we find σ (q) = 2 p r. Hence

σ (q) = 2 p r = (2 p − 1)r + r = q + r

If r > 1 then q, r, and 1 are 3 different divisors of q so σ (q) ≥ q + r + 1 = σ (q) + 1 which is


impossible; Hence r = 1. It follows that σ (q) = 1 + q which means that q only has divisors 1 and q
which means that q = 2 p − 1 is prime. But this can only happen if p is also prime (see Exercise
1.18) which finishes the proof. ■

This theorem classifies even perfect numbers, but doesn’t say much about their existence. Also
it makes no claim about odd perfect numbers.
Conjecture 5.1.3 There are no odd perfect numbers, but infinitely many even perfect numbers.
It has been verified that there are no odd perfect numbers less than 101500 . Note that by
Theorem 5.1.2 there is a 1-1 correspondence between even perfect numbers and Mersenne primes
(see Exercise 1.18). See mersenne.org/primes for a list of all 49 known Mersenne primes. Hence
we know also 49 even perfect numbers.

5.1.2 Convolution of arithmetical functions


Definition 5.1.4 Let f , g be arithmetical functions. Then we define their Dirichlet convolution
f ∗ g by

( f ∗ g)(n) = ∑ f (d)g(n/d)
d|n

■ Example 5.3 σ (n) = ∑d|n d = N ∗ u(n), τ = ∑d|n 1 = u ∗ u(n) ■

Let

A = { f arithmetical functions}
G = { f ∈ A| f (1) ̸= 0}
M = { f ∈ G| f is multiplicative}

Notice that if f is multiplicative then f (m) = f (m) f (1) so if f ̸= 0 then we have f (1) = 1.

Theorem 5.1.4 (A, ∗) is an abelian semigroup i.e. for every f , g, h ∈ A we have

f ∗ g ∈ A, and ( f ∗ g) ∗ h = f ∗ (g ∗ h), and f ∗g = g∗ f

Proof. It is clear that f ∗ g ∈ A.


We note that we can write f ∗ g(n) as the sum over all expressions f (a)g(b) where a,b are
natural numbers multiplying to n, i.e.

f ∗ g(n) = ∑ f (a)g(b).
ab=n
5.1 Arithmetical functions 45

It follows immediately that f ∗ g = g ∗ f . To see that Dirichlet convolution is associative we note


that

( f ∗ g) ∗ h(n) = ∑ f ∗ g(a)h(b) = ∑ ∑ f (c)g(d)h(b) = ∑ f (c)g(d)h(b)


ab=n ab=n cd=a cdb=n

and similarly

f ∗ (g ∗ h)(n) = ∑ f (a)g ∗ h(b) = ∑ ∑ f (a)g(c)h(d) = ∑ f (a)g(c)h(d)


ab=n ab=n cd=b acd=n

which proves the claim. ■


(
1 if n = 1
We recall from Example 5.1 that I(n) =
0 otherwise.

Theorem 5.1.5 (G, ∗) is an abelian group with identity I.

Proof. It is clear that for any f ∈ A we have f ∗ I(n) = I ∗ f (n) = ∑d|n I(d) f (n/d) = f (n). If
f ∈ G we need to show that there exists another arithmetical function f −1 satisfying f −1 ∗ f (n) =
∑d|n f −1 (d) f (n/d) = I(n). But if f (1) ̸= 0 we can construct such a function recursively by
f −1 (1) = 1/ f (1) and

1
f −1 (n) = − ∑ f −1 (d) f (n/d).
f (1) d|n
d̸=n

which proves that (G, ∗) is an abelian group. ■

Theorem 5.1.6 (M, ∗) is a subgroup of (G, ∗)

Proof. It is clear that I is multiplicative.


We need to show that if F, G are multiplicative then so is F ∗ G. If m, n are coprime natural
numbers consider the following map between sets of positive divisors

{d | m} × {e | n} →{ f | mn}
(d, e) 7→ de.

We claim that this is a bijection: It is surjective since if f | mn we may define d = gcd( f , m) and
e = gcd( f , n). It is clear from Proposition 1.2.6 (iii) that de | f . If f ̸= de we let f = d · e · g with
g ̸= 1. Let p | g be a prime. Then since f | mn Euclid’s Lemma 1.4.1 implies that p | m or p | n. If
p | n we see that pe | f . We have also pe | mn. Since pe is coprime with m (A common prime factor
would be a common prime factor of both m and n but they have no common factor other than 1)
Theorem 1.2.6 (iv) implies that pe | n. Hence pe is a common divisor of f and n, which contradicts
e being their greatest common divisor. If instead p | m we are lead to the same contradiction, and
we may conclude that g = 1 which means that the map is surjective.
To see that it is injective assume that de = d ′ e′ with d, d ′ | m and e, e′ | n. Then d,e′ are coprime,
and e, d ′ are coprime. By Proposition 1.2.6 (iv) we may conclude that d | d ′ , d ′ | d, e | e′ , and e′ | e
which implies that (d, e) = (d ′ , e′ ).
46 Chapter 5. Arithmetic functions and Dirichlet series

Using this we see that



mn
F ∗ G(mn) = ∑ F ( f ) G
f |nm
f
 mn 
= ∑ ∑ F(de)G
d|m e|n
de
m n
= ∑ ∑ F(d)F(e)G G
d|m e|n
d e
m n
= ∑ F(d)G F(e)G = F ∗ G(m) · F ∗ G(n)
d|m
d ∑ e|n
e

To show that if F is multiplicative then so is F −1 we assume that it is not and let n be the smallest
natural number such that there exists a factorization n = ml with m, l coprime natural numbers
such that F −1 (ml) ̸= F −1 (m)F −1 (l). Notice that multiplicative functions satisfies F(1) = 1 so
1 = I(1) = F ∗ F −1 (1) = F(1)F −1 (1) = F −1 (1). This implies that n ̸= 1.
Note that for all coprime natural numbers a, b with ab < n we have F −1 (ab) = F −1 (a)F −1 (b).
Hence using the above isomorphism we find that
 
−1 −1 ml
I(ml) = F ∗ F(ml) = ∑ F ( f ) F
f |ml
f
 
ml
= ∑ ∑ F −1 (de)F
d|m e|l
de
 
ml
= ∑ F −1 (de)F + F −1 (ml)F(1)
d|m
de
e|l
de̸=ml
m  l 
−1 −1
= ∑ F (d)F (e)F F + F −1 (ml)F(1)
d|m
d e
e|l
de̸=ml
m  l 
−1 −1
= ∑F (d)F (e)F F + F −1 (ml) − F −1 (m)F −1 (l)
d|m
d e
e|l
m  
−1 −1 l
= ∑F (d)F ∑F (e)F + F −1 (ml) − F −1 (m)F −1 (l)
d|m
d e|l
e
= I(m)I(l) + F −1 (ml) − F −1 (m)F −1 (l) ̸= I(m)I(l)
which is a contradiction. Hence F −1 is indeed multiplicative.

This theorem allows us to prove the following “two out of three” result:

Corollary 5.1.7 Let f , g, h be arithmetical functions satisfying f ∗ g = h. If two of f , g, h are


multiplicative then so is the third.

Proof. This follows in a straightforward manner from Theorem 5.1.6: If f , g are multiplicative then
so is f ∗ g = h. If f , h are multiplicative then so is f −1 ∗ h = g, and if g, h are multiplicative then so
is g−1 ∗ h = f . ■
5.2 Dirichlet series 47

5.1.3 The Möbius function


We now define a very important function called the Möbius function.
Definition 5.1.5 The Möbius function is the arithmetical function satisfying µ ∗ u = I.

A few comments are needed in order to see that this is a reasonable definition. Clearly u ∈ G. By
Theorem 5.1.5 u therefore has an inverse with respect to *. By general group theory such an inverse
is unique in G. It is clearly also unique in A since any µ satisfying µ ∗ u = I must have µ(1) = 1, so
µ ∈ G. This shows that µ is well-defined. It further follows from Theorem 5.1.6 and the fact that u
is multiplicative that µ is multiplicative. We can describe µ(n) in terms of the prime decomposition
of n:
Theorem 5.1.8
(
(−1)k if n is the product of k different primes
µ(n) =
0 otherwise.

Proof. Let µ ′ be the right-hand side. We have µ ′ ∗ u(1) = ∑ab=1 µ ′ (a)u(b) = µ ′ (1)u(1) = 1 · 1 =
I(1). For n = pl11 · · · plkk > 1 we observe that

µ ′ ∗ u(n) = ∑ µ ′ (d) = ∑ µ ′ (d)


d|n d|p1 ···pk
k  
k
=∑ (−1) j = (1 − 1)k = 0 = I(n)
j=0 j

It follows that µ ′ ∗ u = I so by uniqueness we have µ = µ ′ . ■

Many number theory texts uses Theorem 5.1.8 as the definition of the Möbius function. Then
they need to prove that µ ∗ u = I.

Theorem 5.1.9 — Möbius inversion. Let f , g ∈ A then f = u ∗ g if and only if g = f ∗ µ

Proof. If f = u ∗ g then f ∗ µ = u ∗ g ∗ µ = u ∗ µ ∗ g = I ∗ g = g. If g = f ∗ µ then u ∗ g = u ∗ f ∗ µ =


u∗µ ∗ f = I ∗ f = f ■

5.2 Dirichlet series


In the last section we discussed arithmetical functions. Very often it is useful to try to make a
generating function from such a series. A Dirichlet series is one particularly useful such generating
series.
Definition 5.2.1 Let f be an arithmetical function. Then the corresponding Dirichlet series is
defined, for s ∈ C by

f (n)
D f (s) = ∑
n=1 ns

We note that there is no requirement in this definition about convergence of the series, but if the
series is divergent everywhere the series is not so useful. Following Riemann it is customary to
write s = σ + it ∈ C with σ ,t ∈ R.
48 Chapter 5. Arithmetic functions and Dirichlet series

We note that if σ ≥ a then since |ns | = nσ ≥ na we have


| f (n)| | f (n)|
≤ . (5.2)
|ns | na
It then follows from the comparison test that if D f (s) is absolutely convergent at s = a + ib then
D f (s) is absolutely convergent in the half-plane σ ≥ a.

Theorem 5.2.1 Assume that ∑∞ −s


n=1 | f (n)n | is not always convergent nor always divergent.
Then there exist σa such that D f (s) converges absolutely if σ > σa , and does not converge
absolutely if σ < σa . This number is called the abscissa of absolute convergence.

Proof. Let σa = inf{a ∈ R|∃b ∈ R : D f (a + ib) converges absolutely}. By assumption we are


taking infimum over an non-empty set. If s = σ + it with σ > σa we have that D f (s) converges
absolutely by the observation following (5.2). If s = σ + it with σ < σa we have that D f (s) is not
absolutely convergent: If it were then σ would be in the set we are taking infimum over which it
cannot be, since σ is strictly less than the infimum. ■

If D f is always absolutely convergent then we define σa = −∞ and if D f is never absolutely


convergent we define σa = ∞.
■ Example 5.4 The Riemann zeta function is defined as the Dirichlet series related to u, i.e.
1
ζ (s) := Du (s). We note that ∑∞
n=1 is divergent so we must have σa ≥ 1. At the same time we have
R n n −σ
1
for n ≥ 1 and σ > 0 that nσ ≤ n−1 t dt and it follows that for σ > 1 we have

1 1
Z ∞
∑ |ns | ≤ 1 + t −σ dt = 1 − (5.3)
n=1 1 1−σ
which implies that σa ≤ 1. Hence σa = 1. ■

Lemma 5.2.2 Assume | f (n)| ≤ Cnc . Then D f is absolutely convergent for σ > 1 + c, i.e. the
abscissa of absolute convergence is less than or equal to 1 + c.

Proof. This follows directly from Example 5.4 and the comparison test. ■

We can now show that there is a precise relation between multiplication of Dirichlet series and
Dirichlet convolution of the corresponding arithmetical functions.

Theorem 5.2.3 Assume f1 , f2 , f3 are arithmetical functions satisfying | fi (n)| ≤ Cnc then we
have D f1 (s)D f2 (s) = D f3 (s) for σ sufficiently large if and only if f1 ∗ f2 = f3 .

Proof. Assume f1 ∗ f2 = f3 . Since for σ sufficiently large we have that D f1 , D f2 are absolutely
convergent we have, using Cauchy products, that

f1 (n) ∞ f2 (m) ∞
f1 (n) f2 (m)
D f1 (s)D f2 (s) = ∑ s ∑ s
= ∑
n=1 n m=1 m n,m=1 (nm)s
∞ ∞
∑nm=k f1 (n) f2 (m) f1 ∗ f2 (k)
= ∑ s
= ∑ = D f3 (s)
k=1 k k=1 ks

for σ sufficiently large.


Assume on the other hand that D f1 (s)D f2 (s) = D f3 (s) for σ large enough. Then by the above
computations we have that

f1 ∗ f2 (k) − f3 (k)
0= ∑ (5.4)
k=1 ks
5.2 Dirichlet series 49

But an absolutely convergent Dirichlet series is uniquely determined by its coefficients (See Exercise
5.5). ■

■ Example 5.5 Note that |µ(n)| ≤ 1 so Dµ is absolutely convergent for σ > 1 by Lemma 5.2.2.
Note also that DI (s) = 1. It then follows from Theorem 5.2.3 and the definition of the Möbius
function that Dµ (s)Du (s) = 1. Since Du (s) = ζ (s) we see that ζ (s) has no zeroes for σ > 1. We
note also that for σ > 1 we have

µ(n) 1
∑ ns = ζ (s) .
n=1

In fact Dµ (s) has abcissa of absolute convergence equal to 1 as follows from the above and Exercise
5.6. ■

Euler products
Let pn be the n’th prime.

Theorem 5.2.4 Let g be a multiplicative arithmetic function and assume that ∑∞


n=1 g(n) converges
absolutely. Then this sum can be expressed as an infinite product
∞ ∞ l ∞
∑ g(n) = ∏ ∑ g(pk ) := lim ∏ ∑ g(pkn ).
l→∞ n=1
n=1 p prime k=0 k=0

If furthermore g is completely multiplicative we have



1
∑ g(n) = ∏ .
n=1 p prime 1 − g(p)

Proof. Consider Pl = ∏ln=1 ∑∞ k ∞ k


k=0 g(pn ). Since ∑k=0 g(p ) is a subsum of an absolutely convergent
sum it is itself absolutely convergent. Hence we find, using successive Cauchy multiplication, that

Pl = ∑ ∑ g(pk11 ) · · · g(pkl l )
h=0 k1 +k2 +···kl =h

= ∑ ∑ g(pk11 · · · pkl l )
h=0 k1 +k2 +···kl =h

which is also absolutely convergent. By absolute convergence we may rearrange as we please,


and we find, using the fundamental theorem of arithmetic, that for every n whose prime factors
are all less than or equal to pl , the term g(n) appears and appears exactly once. Let Al = {n|n =
pk11 · · · pkl l , ki ≥ 0}. Then we have shown that Pl = ∑n∈Al g(n). It follows that


∑ g(n) − Pl = ∑ g(n) ≤ ∑ |g(n)| ≤ ∑ |g(n)|
n=1 n∈A
/ l n∈A
/ l n>pl

which can be made arbitrarily small by making l large since ∑∞ n=1 g(n) is absolutely convergent.
This proves the first claim.
The second claim follows after we observe that g(pk ) = g(p)k . For this implies, since the
terms in a convergent series goes to zero, that |g(p)| < 1, and hence the sum ∑∞ k
k=0 g(p ) is really a
convergent geometric series which equals (1 − g(p))−1 . This finishes the proof. ■
50 Chapter 5. Arithmetic functions and Dirichlet series

Definition 5.2.2 We say that D f (s) admits an Euler product if D f (s) = ∏ p prime D f ,p (s) where


f (pk )
D f ,p (s) = ∑ .
k=0 pks

We notice that n 7→ f (n) is multiplicative (respectively completely multiplicative) if and only if


n 7→ f (n)n−s is multiplicative (respectively completely multiplicative). Combining this observation
with Theorem 5.2.4 we arrive at the following result:

Corollary 5.2.5 Assume f is a multiplicative arithmetic function and that D f (s) is absolutely
convergent in some halfplane. Then D f (s) admits an Euler product. If f is completely multi-
plicative then
1
D f ,p (s) = .
(1 − f (p)p−s )

■ Example 5.6 The arithmetical function u is completely multiplicative so

ζ (s) = Du (s) = ∏(1 − p−s )−1


p

which is the Euler product of the Riemann zeta function. ■

■ Example 5.7 The Möbius function is multiplicative so



1
= ∏ ∑ µ(pk )p−ks = ∏(1 − p−s )
ζ (s) p k=0 p

where we have used Theorem 5.1.8. ■

n

■ Example 5.8 Fix a prime q and consider χ(n) = q which is completely multiplicative by
Corollary 4.1.2. The corresponding Dirichlet series is usually denoted L(s, χ) and we find

L(s, χ) = ∏(1 − χ(p)p−s )−1


p

The Riemann zeta function


The Riemann zeta function ζ (s) is one of the most fundamental functions in all of number theory.
We have only described its most basic properties. We have shown, in Example 5.4 that if σ > 1
then it can be defined as the absolutely convergent Dirichlet series Du (s) which in this region is
free of zeroes. Also we have seen that in this same region it is given by a convergent infinite Euler
product. Consider the function

ξ (s) = π −s/2 Γ(s/2)ζ (s). (5.5)

Here Γ(s) = 0∞ e−t t s−1 dt (when σ > 0) is the Euler Gamma function which admits a con-
R

tinuation as a meromorphic function to s ∈ C. Riemann [Rie59] proved the following theo-


rem:
Theorem 5.2.6 The Riemann zeta function ζ (s) admits a meromorphic continuation to s ∈ C
with a simple pole in s = 1. The function ξ (s) defined by (5.5) satisfies the functional equation

ξ (s) = ξ (1 − s).
5.3 Exercises for chapter 5 51

Moreover all the zeroes of ξ (s) are in the critical strip 0 ≤ σ ≤ 1.

For a proof of this and a wealth of other information about the Riemann zeta function consult
[Ivi85; KV92; Tit86]
Conjecture 5.2.7 — The Riemann hypothesis. All zeros of ξ (s) lie on the critical line ℜ(s) =
1/2.
We have already seen one equivalent formulation of this namely Conjecture 1.4.5. Here is
another equivalent formulation:
∑ µ(n)
n≤x
Conjecture 5.2.8 For every ε > 0 is bounded.
x1/2+ε

R The Riemann hypothesis has not been proved despite many attempts. Hardy [Har14] proved
that there are infinitely many zeros on the critical line. Selberg [Sel42] proved that a positive
(but unspecified) proportion of the zeroes lie on the critical line. This has been improved
several times with a 2012 result of Feng [Fen12] as the current record: Feng proved that
at least 41.28% of all zeros of ζ lies on the critical line. See [Bor+08] for a treasure of
information about the Riemann hypothesis.

5.3 Exercises for chapter 5


Exercise 5.1 Show that all the arithmetical functions of Example 5.1 are multiplicative.
n

Exercise 5.2 Show that u, N, I, and p are completely multiplicative.
Exercise 5.3 Is 210 · (211 − 1) perfect? Hint: 23.
Exercise 5.4 Show that ∑d|n |µ(d)| = ∏ p|n 2.
Exercise 5.5 Let D f be a Dirichlet series with finite abcissa of absolute convergence σa . In this
exercise we will show that D f determines f uniquely.
Assume that there exist a complex sequence sk = σk + itk with σk → ∞ when k → ∞, satisfying
that D f (sk ) = 0 for all k sufficiently large. We will show that this implies that f (n) = 0 for all n.
(i) Assume that f is not identically zero, and choose N ∈ N minimal such that f (N) ̸= 0. Show
that

f (N) = −N sk ∑ f (n)n−sk .
n=N+1

(ii) Let c > σa . Show that for k sufficiently large we have


| f (N)| ≤ N σk (N + 1)−(σk −c) ∑ | f (n)| n−c
n=N+1

(iii) Conclude, by letting k → ∞, that f (N) = 0, which gives a contradiction and therefore
f (n) = 0 for all n ∈ N.
Exercise 5.6 In this exercise we show that the reciprocal of the primes diverges. Let pn denote the
n’th prime. We start by assuming that ∑∞ −1
n=1 pn is convergent with sum l.
(i) Show that there exist N ∈ N such that ∑n>N p−1 n = l − ∑Nn=1 p−1
n ≤ 1/2.
k
(ii) Show that ∑k=1 ∑n>N p−1


n is absolutely convergent.
(iii) Let W = p1 · · · pN . For r ∈ N consider W r + 1. Show that the prime factorization of W r + 1
contains only primes p > pN .
52 Chapter 5. Arithmetic functions and Dirichlet series
!k
1
(iv) Show, using Cauchy multiplication, that ∑
n=q1 ···qk n
= ∑ p−1
n where the left-hand side
n>N
qi prime
qi >pN
denotes the reciprocals of all natural numbers which can be written of precisely k primes all
bigger that pN .
1 −1 k

(v) Show that ∑∞ ∞
r=1 W r+1 ≤ ∑k=1 ∑n>N pn
(vi) Derive a contradiction and conclude that ∑∞ −1
n=1 pn cannot be convergent.
Exercise 5.7 Let g(n) be the sum of primitive nth roots of 1, i.e.

g(n) = ∑ ζ.
ζ n =1
ζ m ̸=1
for 0<m<n

Show that µ(n) = g(n); i.e. µ(n) is the sum of primitive nth roots of 1.
Exercise 5.8 Let k ∈ N. A number n is called k-perfect if σ (n) = kn. It is not known if there are
infinitely many k-perfect numbers for any k. The following are the known 3-perfect numbers: 120,
672, 523776, 459818240, 1476304896, 51001180160. Show that they are indeed 3-perfect.
Exercise 5.9 Prove Gottschalck’s theorem: Let n be a k-perfect number such that 2 divides n
precisely m times. Then 2m (2m+1 − 1) divides kn.
6. Sums of squares

In this chapter we shall investigate questions concerning sums of squares, but before we dive
into this we make a short digression into more general sums of powers. We start by the trivial
observation that the number of l’th powers less than or equal to x is at essentially x1/l namely
exactly the integers ml with m ≤ x1/l . Hence the set of l’th powers is very sparse in the set of
integers. By comparison the prime number theorem (1.4) implies that there are many more primes,
than l’th powers.

6.0.1 Waring’s problem


Let l ∈ N be a natural number. Waring’s problem is the following: Does there exist a number s such
that every natural number n can be written as the sum of at most s l’th powers, or said differently
such that the Diophantine equation

n = x1l + · · · + xsl (6.1)

has at least one solution in non-negative integers.


The answer is yes. Hilbert [Hil09] showed that for every l such an s exist. Obviously we
may ask – and this question also goes back to Waring – what the smallest such number is? We
call this number g(l). Trivially g(1) = 1. One may verify by simple computations (See exercise
6.1) that g(2) ≥ 4, g(3) ≥ 9, g(4) ≥ 19. Waring conjectured that these inequalities attains the
actual values. In fact it was later conjectured that this holds much more generally, namely that
g(l) = 2l + [(3/2)l] − 2 for every l ∈ N. Here square bracket denotes the integer part.

6.1 Sums of squares


We shall now consider in more detail the case l = 2, i.e. we shall consider the case of sums of
squares. Lagrange proved that g(2) = 4.
54 Chapter 6. Sums of squares

Theorem 6.1.1 — Lagrange. Every natural number can be written as a sum of at most 4
squares.

We will prove this theorem in Section 6.1.5 with a much more recent proof using geometric number
theory.
Let k ∈ N. Let
Sk = {n ∈ Z|n = x12 + · · · + xk2 , xi ∈ Z}.
Hence Sk denotes the subset of N ∪ {0} of integers that can be written as a sum of at most k squares.
We clearly have S1 ⊆ S2 ⊆ S3 ⊆ · · · . Proving Lagrange’s theorem 6.1.1 is equivalent to proving that
S4 = N ∪ {0}. We will also see that Si ̸= N ∪ {0} for i = 1, 2, 3. Hence 3 squares does not suffice.
Before giving a proof of Lagrange’s theorem 6.1.1 we describe explicitly S1 , S2 , S3 .

6.1.1 Squares
The set S1 is simply the set of squares

S1 = {0, 1, 4, 9, 16, 25, 36, 49, . . .}.

6.1.2 Sums of two squares


The set S2 is the set of numbers that can be written as the sum of two squares i.e. the sum of two
numbers from the set S1 . Clearly 3 cannot be written as the sum of two numbers from S1 as 0, 1 are
the only two numbers from S1 which are less than 3. Similar simple considerations show that 6,7,
and 11 do not lie in S2 . We have

S2 = {0, 1, 2, 4, 5, 8, 9, 10, 13, . . .}

Lemma 6.1.2 The set S2 is closed under multiplication.


2
Proof. This is most easily proved using complex numbers. If s1 , s2 ∈ S2 then s j = a2j + b2j = z j
where z j = a j + ib j ∈ {a + ib|a, b ∈ Z} = Z[i]. We notice that clearly z1 · z2 ∈ Z[i]. But for every
z = a + ib ∈ Z[i] we have |z|2 = a2 + b2 ∈ S2 . In particular s1 s2 = |z1 z2 |2 ∈ S2 . ■

Notice also that the proof gives a way of finding the representation of s1 s2 as the sum of two
squares since

(a21 + b21 )(a22 + b22 ) = |z1 z2 |2 = |z1 z2 |2 = |(a1 a2 + b1 b1 ) + i(−a1 b2 + b1 a2 )|2 (6.2)
2 2
= (a1 a2 + b1 b2 ) + (−a1 b2 + b1 a2 ) (6.3)
2 2
■ Example 6.1 We have 8 = 22 + 22 and 10 = 32 + 12 . But then 80 = 8 · 10 = |2 + i2| |3 + i| =
|4 + 8i|2 = 42 + 82 . ■

The fact that S2 is closed under multiplication makes it natural to consider which primes can be
written as a sum of two squares. Clearly 2=1+1 is a sum of two squares.

Theorem 6.1.3 — Euler. Every prime p = 1 (mod 4) is a sum of 2 squares.

Proof. By Theorem 4.2.1 (ii) we have −1 (p−1)/2 = 1 so there exist u, r ∈ Z such that

p = (−1)
−1 = u2 − rp. By division modulo p we may assume 0 < u ≤ p − 1 which forces

u2 + 1 p2 + 1 − 2p + 1
0<r= ≤ < p.
p p

We hence have rp = 1 + u2 with 0 < r < p.


6.1 Sums of squares 55

Consider

m = min{n ∈ N|np ∈ S2 }.

By the above considerations the sum of which we are taking the minimal element is non-empty and
the minimum is strictly less than p. We claim that m = 1 from which it follows that p ∈ S2 .
To see that m = 1 we write mp = a21 + b21 . We can choose representatives a2 for a1 mod m and
b2 for b1 mod m satisfying |a2 | , |b2 | ≤ m/2. Then

a22 + b22 = a21 + b21 = mp = 0 (mod m)

which implies that a22 + b22 = sm with 0 ≤ s ≤ m/2 < m.


If s = 0 then a2 = b2 = 0 so a1 = b1 = 0 (mod m). Hence m2 |a21 + b21 = pm which implies that
m|p which implies m ∈ {1, p}. But m < p so m = 1 and we are done.
We will show that s > 0 leads to a contradiction: We use the expression (6.2) to see that

(a1 a2 + b1 b2 )2 + (−a1 b2 + b1 a2 )2 = (a21 + b21 )(a22 + b22 ) = mpsm = m2 sp

But

a1 a2 +b1 b2 = a21 +b21 = mp = 0 (mod m), and −a1 b2 +b1 a2 = −a1 b1 +b1 a1 = 0 (mod m)

so a1 a2 + b1 b2 , −a1 b2 + b1 a2 are divisible by m and we find that

a1 a2 + b1 b2 2 −a1 b2 + b1 a2 2
   
+ = sp.
m m
But since s < m this contradicts m being the smallest integer such that mp can be written as a sum
of 2 squares. This finishes the proof. ■

R There are many other proofs of this theorem. Zagier found a delightful one-sentence proof,
see [Zag90].

We can now describe the set S2 completely:

Theorem 6.1.4 A natural number n is a sum of 2 squares if and only if every p = 3 (mod 4)
has an even (possibly zero) exponent in the prime factorization of n.

Proof. Assume that every p = 3 (mod 4) has an even (possibly zero) exponent in the prime
factorization of n. Then n = 2e pe11 · · · pekk m2 where pi = 1 (mod 4). Since 2 = 12 +12 , m2 = m2 +02 ,
and pi = a2i + b2i by Theorem 6.1.3, we find – since S2 is closed under multiplication – that n is the
sum of two squares.
Assume that n = x2 + y2 . If gcd(x, y) = d > 1 then d 2 | n and we consider n/d 2 = (x/d)2 +
(y/d)2 . We notice that every prime factor of n/d 2 comes with the same parity as in n. In particular
any prime which divides n an odd number of times will still be a divisor of n/d 2 . Hence we can
assume, without loss of generality that gcd(x, y) = 1.
Assume that p | n. Then x2 + y2 = 0 (mod p), i.e. x2 = −y2 (mod p). We notice that
gcd(y, p) = 1 for if p | y then also p | x, but we have assumed that gcd(x, y) = 1. Hence us-
ing Bezout we may find z ∈ Z such that yz = 1 (mod p), and it follows that (zx)2 = z2 (−y2 ) = −1
(mod p). But hence −1 is a quadratic residue modulo p this implies, by Theorem 4.2.1 (ii), that
p = 1 (mod 4), or p = 2. Hence no prime p = 3 (mod 4) can occur an odd number of times in
the prime factorization of n, which finishes the proof. ■
56 Chapter 6. Sums of squares

We now know exactly which natural numbers are in S2 . But we can go even further and count
the number of ways a number can be written as a number of 2 squares, i.e. we want to understand

r2 (n) := #{(x, y) ∈ Z2 |x2 + y2 = n} (6.4)

We see that r2 (1) = #{(±1, 0), (0, ±1)} = 4, r2 (2) = #{(±1, 1), (±1, −1)} = 4. From Theorem
6.1.4 we see that r2 (n) = 0 if any p = 3 (mod 4) appears an odd number of times in the prime
factorization of n.
To find a more explicit expression for r2 (n) it is convenient to look at the Gaussian integers.

6.1.3 Gaussian integers


Consider the set Z[i] = {x + iy|x, y ∈ Z} which is called the ring of Gaussian integers. It is
straightforward to verify that this is indeed a ring. We have already seen (Example 4.3) that Z[i] is
a subset of the set of algebraic integers Z, and since it is a subring of C it is an integral domain (i.e.
a non-zero commutative ring in which the product of non-zero elements is non-zero).
Euclidean rings
Definition 6.1.1 An integral domain R is called a Euclidean ring if there exist a function

d : R\{0} → N ∪ {0}

satisfying
(i) d(ab) ≥ d(b) for a, b ̸= 0.
(ii) for all a, b ∈ R with b ̸= 0 there exist q, r ∈ R such that

a = bq + r, where r = 0 or d(r) < d(b).

The function d is called a Euclidean function on R. Note that the notion is clearly modelled on the
classical case of Theorem 1.2.1, but there is also subtle differences. For instance the remainder
need not be unique. Also the inequality (i) is in a certain sense superfluous. If an integral domain
admits a d-function satisfying only (ii) then it also admits another d-function satisfying both (i) and
(ii).
Proposition 6.1.5 Let R be a Euclidean ring with Euclidean function d. Then for a, b non-zero
elements d(ab) = d(b) if and only if a ∈ R× .

Proof. If a ∈ R× there exist c ∈ R such that ac = 1. But by (i) we have d(b) = d(abc) ≥ d(ab) and
d(ab) ≥ d(b), so d(ab) = d(b).
If d(ab) = d(b) with a, b ̸= 0 we claim that a is a unit. Assume not: We can divide b by ab and
find b = qab + r with r = 0 or d(r) < d(ab) i.e.

b(1 − qa) = r, with r = 0 or d(r) < d(ab) = d(b).

Since a is not a unit (1 − qa) is non-zero so r cannot be zero. Hence we have d(b(1 − qa)) < d(b).
But at the same time (i) gives d(b(1 − qa)) ≥ d(b) which is a contradiction. ■

Corollary 6.1.6 Let R be a Euclidean domain with Euclidean function d. Then d(a) = d(1) if
and only if a is a unit, i.e a ∈ R× .

Proof. If u is a unit then so is its inverse u−1 . Using Proposition 6.1.5 we find d(1) = d(u−1 u) =
d(u). If d(u) = d(1) then by (ii) there exist q, r such that 1 = qu + r. We claim that r = 0 which
shows that u is a unit. For if not then d(r) < d(u) = d(1). But at the same time d(r) = d(r ·1) ≥ d(1)
by (i) which is a contradiction. ■
6.1 Sums of squares 57

Proposition 6.1.7 The Gaussian integers Z[i] is a Euclidean ring with Euclidean function d(z) = zz.

Proof. We have already discussed why Z[i] is an integral domain. We verify easily that d(ab) =
|ab|2 = |a|2 |b|2 ≥ d(b) i.e. (i) is satisfied.
Let a, b ∈ Z[i] with b ̸= 0 and consider the complex number z = a/b = x + iy with x, y ∈ R.
Choose n1 , n2 ∈ Z such that |n1 − x| ≤ 1/2 and |n2 − y| ≤ 1/2. Let q = n1 + in2 . If we let r = a − qb
then either r = 0 or d(r) = |a − qb|2 = |b|2 |a/b − q|2 = |b|2 |x + iy − (n1 + in2 )|2 < |b|2 = d(b)
which verifies (ii). ■

■ Example 6.2 The following can also be verified to be Euclidean rings


• Any field F with Euclidean function d( f ) = 1.
• The ring Z with Euclidean function d(n) = |n|.
2πi
• Let ζ3 = e 3 . We note that ζ32 + ζ3 + 1 = 0. Then Z[ζ3 ] = {a + bζ3 |a, b ∈ Z} (The Eisenstein
integers) is a Euclidean ring with Euclidean function d(a + bζ3 ) = |a + bζ3 |2 = a2 + b2 − ab.
• The ring of polynomials F[X] over a field F with Euclidean function d( f ) = deg( f ).

Factorization in Euclidean rings


Let a, b ∈ R. We say that a, b are associates if a = ub for a unit u ∈ R× . We say that a divides b and
write a | b if b = ac for some c ∈ R. A non-zero, non-unit element a is called irreducible if its only
divisors are units and associates of a. A greatest common divisor of a, b not both zero is an element
d ∈ R such that d | a and d | b and if d ′ is another element satisfying d ′ | a and d ′ | b then d ′ | d.
■ Example 6.3 By Corollary 6.1.6 we see that the units in Z[i] are precisely those satisfying that
d(a + ib) = d(1) = 1, i.e. the units are 1, i, −1, and −i. Geometrically these 4 elements rotate by
0, π/2, π, and 3π/2. ■

In a Euclidean ring R greatest common divisors exist (but are generally not unique) and can be
found in the same way as in the Euclidean algorithm. Also the “obvious” analogue of Bezout’s
identity is true. (See [DF04, Theorem 4 p. 275]) and using these tools one can prove that in a
Euclidean ring we have unique factorization (For a proof see [DF04, Theorem 14 p. 287], or go
through the relevant parts of Chapter 1 making the appropriate changes.)

Theorem 6.1.8 Let R be a Euclidean domain. Then every non-zero element r ∈ R which is
not a unit can be written as a finite product of irreducibles of R. This product is unique up to
permutation and multiplication of units.

We want to use this theorem to describe r2 (n). To this end we need to find all irreducibles of Z[i].
Proposition 6.1.9 The irreducible elements of Z[i] are precisely the elements associated to one of
the following:
(i) π = 1 + i.
(ii) π = x + iy with x2 + y2 = ππ = p = 1 (mod 4), x > |y| > 0.
(iii) q = 3 (mod 4).
Here p, q are primes in N.

Proof. The numbers in (i) and (ii) are irreducible since d(π) is prime: If π = ab is a factorization
then d(π) = d(a)d(b) which is prime, which implies that d(a) or d(b) is 1. But then either a or b
is a unit which means that π is irreducible.
An element q as in (iii) is irreducible since a non-trivial factorization q = ab would imply
q2 = d(q) = d(a)d(b). But since d(a), d(b) ̸= 1 this is only possible if q = d(b) = d(a) = d(x +
iy) = x2 + y2 . But this contradicts Theorem 6.1.4.
58 Chapter 6. Sums of squares

This shows that all the listed elements are irreducible. To see that every irreducible is associated
to one on the list we argue as follows: Let ρ be any irreducible element. We notice that if ρ is
irreducible then ρ is irreducible also, since any non-trivial factorization ρ = ab gives a non-trivial
factorization ρ = ab.
We have d(ρ) = ρρ ∈ N, which we can factor into primes in N. Now we can factor all primes
in N into irreducible in Z[i] of the listed form (Note that 2 = −i(1 + i)(1 + i)). Hence using the
uniqueness of factorization we see that ρ is associated to one of the listed irreducible. ■

Using the above properties of Z[i] we can determine r2 (n) for any n ∈ N:
e e f f
Theorem 6.1.10 Let n ∈ N with prime factorization n = 2e p11 · · · pkk q11 · · · ql l where pi = 1
(mod 4) and q j = 3 (mod 4). Then

0
 if f j is odd for some j = 1, . . . , l
r2 (n) = k
e1 ek

 4τ(p 1 · · · pk ) = 4 ∏(ei + 1) otherwise.
i=1

Proof. Write pi = πi πi as in Proposition 6.1.9. Then we have a factorization into irreducible in


Z[i]:

k l
f
n = (−i)e (1 + i)2e ∏ πiei πi ei ∏ q j j .
i=1 j=1

Note that n = x2 + y2 if and only if n = zz where z = x + iy, so

r2 (n) = #{z ∈ Z[i] : z | n, zz = n}.

To count the factors z of n satisfying zz = n we notice that any factor of n must factor in irreducibles
c
as z = u(1 + i)a ∏ki=1 πiai πi bi ∏lj=1 q j j for some 0 ≤ a ≤ 2e, 0 ≤ ai , bi ≤ ei , 0 ≤ c j ≤ f j . so we have

k l
2c j
n = zz = (−i)a (1 + i)2a ∏ πiai +bi πi ai +bi ∏ q j
i=1 j=1

But by Theorem 6.1.8 this is only possible if a = e, 2c j = f j and ai + bi = ei . It follows that


r2 (n) = 0 if f j is odd for some j. If on the other hand f j is even we see that every factor is of the
form
k l
f /2
z = u(1 + i)e ∏ πiai πi ei −ai ∏ q j j
i=1 j=1

and that every such expression is indeed a factor. Since there are 4 possible choices of units this
implies that
k
r2 (n) = 4#{(a1 , . . . , ak )|0 ≤ ai ≤ ei } = 4 ∏(1 + ei ).
i=1

■ Example 6.4 Let us find all ways to write 221 as a sum of two squares. Note that 221 = 13 · 17
and 13, 17 = 1 (mod 4). We easily find that 13 = 32 + 22 = (3 + 2i)(3 − 2i), and 17 = 42 + 12 =
(4 + i)(4 − i) By Theorem 6.1.10 we find that r2 (221) = 4(1 + 1)(1 + 1) = 16. Indeed the proof
shows that 221 = x2 + y2 if and only if z = x + iy is of the form z = u(3 + 2i)a1 (3 − 2i)1−a1 (4 +
6.1 Sums of squares 59

i)a2 (4 − i)1−a2 with 0 ≤ ai ≤ 1. This means that z = u(3 ± 2i)(4 ± i) with all possible choices of ±.
This corresponds to z = u(10 ± 11i) or z = u(14 ± 5i) which gives the 16 representations
221 = (±10)2 + (±11)2 = (±11)2 + (±10)2 = (±14)2 + (±5)2 = (±5)2 + (±14)2 ,
with all possible choices of ±. ■

■ Example 6.5 We find all ways to write 4168 = 23 521 as a sum of two squares. Note that 521 = 1
(mod 4) so Theorem 6.1.10 gives r2 (4168) = 4(1 + 1) = 8 corresponding to factors of n of the type
z = u(1 + i)3 (20 ± i11) = u(−62 + 18i) or u(−18 + 62i) corresponding to the 8 representations
4168 = (±62)2 + (±18)2 = (±18)2 + (±62)2
with all possible choices of ±. ■

6.1.4 Sums of three squares


It is easy to verify that not all natural numbers can be written as a sum of 3 squares, noticing that
numbers which are sums of 3 squares can be written as n = a + b where a ∈ S1 is a square and
b ∈ S2 is a sum of two squares. We find
S3 = {0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 16, . . .},
where we note that 7 and 15 are missing. We note also that although 3, 5 ∈ S3 their product is not,
so S3 is not closed under multiplication. The following result holds1 :

Theorem 6.1.11 S3 = {n ∈ N ∪ {0}|n ̸= 4e (8k + 7)}.

Note that any square is either 0,1,4 modulo 8 so if n is a sum of 3 square then n ̸= 7 (mod 8).
If n = 0 (mod 4) then since a square is either 0 or 1 modulo 4 we must have that all three
squares are 0 modulo 4. But this implies that if n = x2 + y2 + z2 then x,y,z are divisible by 2,
so n/4 = (x/2)2 + (y/2)2 + (z/2)2 ∈ S3 . We have shown that if n ∈ S3 then n ̸= 7 + 8k and if
n = 4b ∈ S3 then b ∈ S3 . It follows that S3 ⊆ {n ∈ N ∪ {0}|n ̸= 4e (8k + 7)}.
Showing the opposite inclusion is much trickier, and we will not give the proof. See e.g.
[Ank57] for a relatively simple proof, which in principle can be understood using the techniques of
section 6.1.5.

6.1.5 Sums of four squares


We now describe S4 which according to Lagrange’s theorem 6.1.1 equals N ∪ {0}. It is possible to
give an algebraic proof in the same spirit as Theorem 6.1.3 (See e.g. [JJ98, Theorem 10.6]), but
here we give a more geometric proof using Minkowski theory. We start by noticing that it is enough
to show that every prime is a sum of 4 squares. This follows immediately from the fundamental
theorem of arithmetic and the following fact:
Lemma 6.1.12 The set S4 is closed under multiplication.

Proof. This follows immediately from following identity which is straightforward to verify:
(a21 + b21 + c21 + d12 )(a22 + b22 + c22 + d22 ) =(a1 a2 − b1 b2 − c1 c2 − d1 d2 )2
+ (a1 b2 + b1 a2 + c1 d2 − d1 c2 )2
+ (a1 c2 − b1 d2 + c1 a2 + d1 b2 )2
+ (a1 d2 + b1 c2 − c1 b2 + d1 a2 )2 .
1 There is some disagreement in the literature as to whether Legendre or Gauss gave the first correct proof of this

result.
60 Chapter 6. Sums of squares

v2
v1

Figure 6.1: Fundamental domain

Analogous to the way that Lemma 6.1.2 can be proved using complex numbers, this lemma may be
proved using quaternions. See e.g. [JJ98, Sec 10.5]. ■

Minkowski theory
Definition 6.1.2 A (full) lattice Λ in Rn is a subset Λ ⊆ Rn satisfying

Λ = {α1 v1 + · · · αn vn |αi ∈ Z}

where (v1 , . . . , vn ) is a basis for Rn . The set (v1 , . . . , vn ) is also called a basis for Λ.

It is straightforward to verify that (Λ, +) is a subgroup of (Rn , +).


■ Example 6.6 Zn = {α1 e1 + · · · αn en } where (e1 , . . . , en ) is the standard basis of Rn is called the
integer lattice. ■

Elements v, w ∈ Rn
are called equivalent modulo Λ if v − w ∈ Λ. It is straightforward to verify
that this is indeed an equivalence relation, and we shall write v ∼Λ w.
Definition 6.1.3 If (v1 , . . . , vn ) is a basis for a lattice Λ then

FΛ = {t1 v1 + · · · + tn vn |0 ≤ ti < 1}

is called a fundamental domain or a fundamental mesh for Λ.

Lemma 6.1.13 Let v ∈ Rn . Then there exists a unique w ∈ FΛ such that v ∼Λ w.

Proof. Since (v1 , . . . , vn ) is a basis for Rn there exist r1 , . . . , rn ∈ R such that v = r1 v1 + . . . + rn vn .


Let αi = ⌊ri ⌋ where ⌊r⌋ is the largest integer less than or equal to r. Define w = v − ∑ni=1 αi vi which
is then clearly in FΛ and equivalent to v modulo Λ. Assume now that w′ ∈ FΛ with w′ ∼Λ v. Write
w = ∑ni=1 ti vi , and w′ = ∑ni=1 ti′ vi with 0 ≤ ti ,ti′ < 1. Note that −1 < ti − ti′ < 1. Since w − w′ ∈ Λ
we have ti − ti′ ∈ Z which is only possible if ti − ti′ = 0, i.e. w = w′ . ■

We note that an alternative way of formulating Lemma 6.1.13 is that we can write Rn as a
disjoint union
[
Rn = (FΛ + l)
l∈Λ

where for a set A we write A + l = {a + l|a ∈ A}.


Using Lemma 6.1.13 we can define the map φΛ : Rn → FΛ sending v to the unique point w in
FΛ equivalent to v modulo Λ. The map φΛ has the following property:
Lemma 6.1.14 — Blichfeldt’s lemma. Let X ⊆ Rn be measurable with respect to Lebesgue
measure. If vol(X) > vol(FΛ ) then the restriction of φΛ to X, φΛ |X is not injective.
Before proving this we recall a few facts about n-dimensional Lebesgue measure which we
simply denote by vol:
6.1 Sums of squares 61

1. Lebesgue measure is σ -additive, i.e. if {An }∞n=1 are disjoint measurable sets then their union
is measurable and vol(∪n An ) = ∑n vol(An ).
2. Lebesgue measure is translation invariant, i.e if A is measurable and l ∈ Rn then A + l is
measurable and vol(A + l) = vol(A).
3. ( (2π)m
π n/2 if n = 2m
vol({x ∈ Rn | ∥x∥ < r}) = rn = rn · n(n−2)···4·2
2m+1 π m
Γ(n/2 + 1) n·(n−2)···3·1 if n = 2m + 1.
4. The measure of the parallelepiped spanned by v1 , . . . , vn equals the absolute value of de-
terminant of the matrix whose columns are v1 , . . . , vn , i.e. vol({∑ni=1 ti vi |0 ≤ ti < 1}) =
|det(v1 v2 · · · vn )|.

Proof of Lemma 6.1.14. : We have


[ [ [
X = X ∩ Rn = X ∩ (FΛ + l) = X ∩ (FΛ + l) = Xl
l∈Λ l∈Λ l∈Λ

where Xl = X ∩ (FΛ + l) and the last union is disjoint. Note that if x ∈ Xl then φΛ (x) = x − l, so
φΛ (Xl ) = Xl − l. By translation invariance of Lebesgue measure we have vol(φΛ (Xl )) = vol(Xl ). If
φΛ |X were injective then by σ -additivity (and the fact that Λ is countable) we have
[
vol(X) = vol( Xl ) = ∑ vol(Xl ) = ∑ vol(φΛ (Xl )) ≤ vol(FΛ )
l∈Λ l∈Λ l∈Λ

which contradicts the assumption of the lemma. ■

Note that Lemma 6.1.14 has some resemblance to Dirichlet’s pigeon-hole principle, which
can be stated formally as follows: a map φ : A → B between finite sets A, B cannot be injective
if #A > #B . Lemma 6.1.14 shows that a set X cannot be mapped injectively to set with smaller
volume if the map does not decrease volume.
Definition 6.1.4 A set X ⊆ Rn is called
1. centrally symmetric if −x ∈ X when x ∈ X,
2. convex if when v, w ∈ X then tv + (1 − t)w ∈ X for every 0 ≤ t ≤ 1.

■ Example 6.7 A ball centered at the origin is both centrally symmetric and convex. ■

Theorem 6.1.15 — Minkowski. Let Λ be a lattice in Rn with fundamental domain FΛ . Let


X ⊆ Rn be a convex, centrally symmetric measurable set satisfying vol(X) > 2n vol(FΛ ). Then X
contains a non-zero lattice point l ∈ Λ.

Proof. Consider the lattice 2Λ = {2v|v ∈ Λ}. If Λ has basis v1 , . . . , vn then 2Λ has basis 2v1 , . . . , 2vn
so 2Λ has fundamental domain vol(2FΛ ) = |det(2(v1 v2 · · · vn ))| = 2n |det(v1 v2 · · · vn )| = 2n vol(FΛ ),
so vol(X) > vol(2FΛ ). It follows from Lemma 6.1.14 that φ2Λ |X is not injective so there exist
different v, w ∈ X such that φ2Λ (v) = φ2Λ (w), i.e. v = w0 + l1 , w = w0 + l2 with w0 ∈ F2Λ and
li ∈ 2Λ. Clearly l1 ̸= l2 since otherwise v = w.
Since X is centrally symmetric −w ∈ X. Since X is convex 12 v + (1 − 12 )(−w) ∈ X.
But
1 1 v − w w0 + l1 − (w0 + l2 ) l1 − l2
v + (1 − )(−w) = = = ∈ Λ\{0},
2 2 2 2 2
which finishes the proof. ■

Note that all three requirements of Theorem 6.1.15 are crucial (See Exercise 6.2).
We are now ready to prove the main theorem of this section:
62 Chapter 6. Sums of squares

Theorem 6.1.16 Every prime can be written as a sum of 4 squares.

Proof. Clearly 2 = 12 + 12 + 02 + 02 , so it suffices to prove the claim for odd primes. We note that
by Corollary 4.1.3 there are (p − 1)/2 squares in F× p . Clearly 0 is also a square so we find that

#K = #{z ∈ F p |z = u2 , u ∈ F p } = (p + 1)/2

Since −(1 + z) runs through F p when z runs through F p we find that

#L = #{z ∈ F p | − (1 + z) = v2 , v ∈ F p } = (p + 1)/2

Since there are p elements in F p there must be at least one z lying in K ∩ L, i.e. a z such that z = u2
(mod p) and −(1 + z) = v2 (mod p). Adding these equations we find that

−1 = u2 + v2 (mod p)

Consider now

Λu,v = {(x, y, z,t)t ∈ Z4 |z = ux + vy (mod p), t = vx − uy (mod p)}


       
1 0 0 0
0  1   0   0 
Then Λu,v is a lattice with basis u,  v ,  p,  0 . See Exercise 6.3. The volume of the
      

v −u 0 p
corresponding fundamental domain equals
 
1 0 0 0
0 1 0 0  2
vol(FΛu,v ) = det u v p 0  = p .

v −u 0 p

Consider the set X = {x ∈ R4 | ∥x∥ < 2p} which has volume
p (2π)2
vol(X) = ( 2p)4 = 2π 2 p2 > 24 p2 = 24 vol(FΛu,v ).
4·2
By Minkowski’s theorem 6.1.15 there exist a non-zero lattice element 0 ̸= l ∈ Λu,v ∩ X. This l

satisfies 0 < ∥l∥2 < ( 2p)2 = 2p, and at the same time

∥l∥2 = x2 + y2 + z2 + t 2 = x2 + y2 + (u2 x2 + v2 y2 + 2uxvy) + (v2 x2 + u2 y2 − 2vxuy) (mod p)


= (1 + u2 + v2 )(x2 + y2 ) = 0 (mod p),

i.e. x2 + y2 + z2 +t 2 is divisible by p. But since 0 < ∥l∥2 < 2p this implies x2 + y2 + z2 +t 2 = p. ■

We note that 0, 1 can clearly be written as a sum of at most 4 squares, so Theorem 6.1.16 and
Lemma 6.1.12 implies Lagrange’s theorem 6.1.1.

6.2 Exercises for chapter 6


Exercise 6.1 Show that 7 can be written by no less that 4 squares, 23 requires 9 cubes, and 79
requires 19 fourth-powers. Conclude that g(2) ≥ 4, g(3) ≥ 9, g(4) ≥ 19
Exercise 6.2 Show that if any of the requirements – convex, centrally symmetric, or vol(X) >
2n vol(F) – are left out, then Theorem 6.1.15 fails to hold. K
6.2 Exercises for chapter 6 63

Exercise 6.3 Show that

Λu,v = {(x, y, z,t)t ∈ Z4 |z = ux + vy (mod p), t = vx − uy (mod p)}


       
1 0 0 0
0  1   0  0
is a lattice with basis u,  v ,  p, and  0 . K
      

v −u 0 p
Exercise 6.4 Find the factorization into irreducibles of Z[i] of the numbers 221 = 13 · 17, 234 =
2 · 32 · 13, 702 = 2 · 33 · 13, and 16660 = 22 · 5 · 72 · 17.
Exercise 6.5 Compute r(n) and find all representations of n as a sum of two squares when n = 221,
n = 234, n = 702, and n = 16660.
Exercise 6.6 Show that the average value of r2 (n) equals π i.e. that

1 n
∑ r2 (m) → π as n → ∞.
n m=1

Finding the correct rate of convergence is a very difficult unsolved problem in analytic number
theory (Gauss’ circle problem).
7. Continued fractions and approximations

Continued fractions provide a very useful tool in number theory. It is a tool with many applications,
and we shall here look at a few of them

7.1 Continued fractions


A n-th order finite continued fraction is an expression of the form

1
a0 + (7.1)
1
a1 +
1
a2 +
1
···+
an
where a0 ∈ R, and ai > 0 for i ∈ N. We denote this by [a0 , a1 , . . . , an ]. We call n the length of the
continued fraction. The continued fraction is called simple if ai ∈ Z.
We note that for 0 < j < n − 1

[a0 , a1 , . . . , an−1 , an ] = [a0 , a1 , . . . , a j , [a j+1 , . . . , an ]] (7.2)

which we shall use in particular for j = 1 and j = n − 2 which can be written as


1
[a0 , a1 , . . . , an−1 , an ] = a0 + = [a0 , . . . , an−2 , an−1 + 1/an ] (7.3)
[a1 , . . . , an ]
■ Example 7.1

1 1 3 51
[7, 2, 3] = 7 + = 7+ = 7+ =
1 7 7 7
2+
3 3

7.1 Continued fractions 65

Fix an n’th order finite continued fraction x = [a0 , . . . , an ]. For 0 ≤ m ≤ n the continued
fraction [a0 , . . . , am ] is called the m’th partial convergent of x. For m = −2, . . . , n define numbers
pm = pm (a0 , . . . , am ), qm = qm (a0 , . . . , am ) by
            
p−2 0 p−1 1 pm pm−1 pm−2 am
= , = , = when m ≥ 0 (7.4)
q−2 1 q−1 0 qm qm−1 qm−2 1
p
Proposition 7.1.1 [a0 , . . . , am ] = qmm .
Hence for simple finite continued fractions x the procedure (7.4) allows us to express x as a
fraction.

Proof. This is a straightforward induction argument on m:

p0 p−1 a0 + p−2 a0
= = = [a0 ]
q0 q−1 a0 + q−2 1
Assuming the statement holds for all continued fraction of lengths m − 1 we find

pm−2 (am−1 + 1/am ) + pm−3


[a0 , . . . , am ] = [a0 , . . . , am−1 + 1/am ] =
qm−2 (am−1 + 1/am ) + qm−3
pm−2 (am am−1 + 1) + am pm−3 am (am−1 pm−2 + pm−3 ) + pm−2
= =
qm−2 (am am−1 + 1) + am qm−3 am (am−1 qm−2 + qm−3 ) + qm−2
am pm−1 + pm−2 pm
= =
am qm−1 + qm−2 qm
where we have used that pm−2 , pm−3 , qm−2 , qm−3 depend only on a0 , . . . am−2 . ■

Proposition 7.1.2

pn qn−1 − qn pn−1 = (−1)n+1


pn qn−2 − qn pn−2 = (−1)n an

Proof. By (7.4) we have

qn−1 pn =qn−1 (pn−1 an + pn−2 )


pn−1 qn =pn−1 (qn−1 an + qn−2 )

Subtracting these two equations we find

pn qn−1 − qn pn−1 = −(pn−1 qn−2 − qn−1 pn−2 )

Continuing in this way we find

pn qn−1 − qn pn−1 = (−1)n (p0 q−1 − q0 p−1 ) = (−1)n+1 (7.5)

We note also that again using (7.4) we have

pn qn−2 − qn pn−2 = (pn−1 an + pn−2 )qn−2 − (qn−1 an + qn−2 )pn−2


= an (pn−1 qn−2 − qn−1 pn−2 ) = (−1)n an .

where we have used (7.5). ■


66 Chapter 7. Continued fractions and approximations

Corollary 7.1.3 If [a0 , . . . , an ] is simple then for every n we have that pn , qn are relatively prime
integers.

Proof. It is clear from the definition of pn , qn that they are integers with qn ≥ 1. If they have
a common factor g Proposition 7.1.2 implies that g | (−1)n+1 which implies that g = 1, and we
conclude that pn , qn are relatively prime. ■

It is obvious that a simple finite continued fraction x = [a0 , . . . , an ] is a rational number and according
to Proposition 7.1.1 and Corollary 7.1.3 we have that x = pn /qn where the fraction is written in
lowest terms. In fact we can write any rational number as a continued fraction:
Lemma 7.1.4 Every rational number can be written as a simple continued fraction and every
simple continued fraction is a rational number.

Proof. It is clear from the above that every simple continued fraction is a rational number. Given a
rational number q = a/b where we may assume that a,b are relatively prime and b > 1 (If b = 1 we
have q = [a] and the statement is clear.) we use Euclid’s algorithm and find

a =a0 b + r1
b =a1 r1 + r2
r1 =a2 r2 + r3
..
.
rn−1 =an rn + 0

where ai ∈ Z and ai > 0 for i > 0, and r1 > r2 > . . . > rn = 1. We find that

a r1 1
q= = a0 + = a0 +
b b b
r1
1 1
= a0 + = a0 +
r2 1
a1 + a1 +
r1 r1
r2
1 1
a0 + = · · · = a0 + = [a0 , a1 , . . . , an ]
1 1
a1 + a1 +
r3 1
a2 + a2 +
r2 1
···+
an
which shows that q can be written as a simple continued fraction. ■

7.1.1 Infinite continued fractions


We will now consider infinite continued fractions. These are limits of of finite continued fractions.
Let ai ∈ R and ai > 0 for i > 0. We denote

[a0 , a1 , a3 , . . .] = lim [a0 , . . . , an ],


n→∞

if the limit exists.


7.1 Continued fractions 67

■ Example 7.2 If x = [1, 1, 1, . . .] exists then



x > 1 and x = [1, x] i.e. x = 1 + 1/x which implies
1+ 5
that x2 − x − 1 = 0, and we find that x = 2 which is the so-called golden ratio. ■

We will now show that [a0 , a1 , . . .] is well-defined if a0 ∈ Z and ai ∈ N. We call such infinite
continued fractions simple. For simple continued fractions we have q1 < q2 < · · · . Furthermore
qn = qn−1 an + qn−2 ≥ 2qn−2 . It follows, since q0 = 1 and q1 ≥ 1, that for n ≥ 1

qn ≥ 2(n−1)/2 (7.6)

i.e qn grows at least as fast as the terms of a geometric progression.

Theorem 7.1.5 Let a0 ∈ Z and ai ∈ N if i > 0. Then cn = [a0 , a1 , . . . , an ] converges as n goes to


infinity.

Proof. We start by showing that c2n and c2n+1 converge as n goes to infinity. Note that qn ≥ 1. By
propositions 7.1.1 and 7.1.2 we have for n ≥ 2

pn pn−2 (−1)n an
cn − cn−2 = − = (7.7)
qn qn−2 qn qn−2

from which it follows that c2n is strictly increasing, i.e. c2(n−1) < c2n , and c2n+1 is strictly decreasing.
From the same propositions we find that

pn pn−1 (−1)n−1
cn − cn−1 = − = (7.8)
qn qn−1 qn qn−1

It follows that c2n < c2n−1 < c1 , and c2n+1 > c2n > c0 . Hence c2n is an increasing sequence which
is bounded from above and hence converges to some α0 . Similarly c2n+1 is a decreasing sequence
bounded from below and hence converges to some α1 . To prove α0 = α1 we can use (7.8) and (7.6)
to conclude that

|α0 − α1 | ≤ |α0 − c2n | + |c2n − c2n−1 | + |c2n−1 − α1 |


1
= |α0 − c2n | + + |c2n−1 − α1 | → 0
q2n q2n−1

from which it follows that α0 = α1 and cn → α0 . ■

The continued fraction procedure


We will now describe a procedure which, to any given x ∈ R, finds a simple continued fraction
[a0 , a1 , . . .] (finite or infinite) such that x = [a0 , a1 , . . .].

The continued fraction procedure For x ∈ R


1. Write x = a0 + t0 where a0 ∈ Z and 0 ≤ t0 < 1.
2. As long as tn ̸= 0 write 1/tn = an+1 + tn+1 with an+1 ∈ N and 0 ≤ tn+1 < 1

This associates to every real number a finite or infinite sequence of integers a0 , a1 , . . ., with ai > 0
for i ≥ 1. We note if tn ̸= 0 then a small induction argument (See Exercise 7.4) shows that

x = [a0 , a1 , . . . , an+1 + tn+1 ] (7.9)

We note also that if x is rational then by the proof of Lemma 7.1.4 the continued fraction procedure
gives a finite sequence and the corresponding continued fraction equals x.
68 Chapter 7. Continued fractions and approximations
√ √ √ √
■ Example 7.3 Let x =
√ 2. We have 2√= 1 + ( √ 2 − 1) and 0 < √ 2 − 1 < 1, so a0 = 1 and
t√0 = 2 − 1. Continuing we find that 1/( 2 − 1) = 2 + 1 = 2 + ( 2 − 1) so a1 = 2 and t1 =
2 − 1. Since t1 = t0 we find a2 = a1 and t2 = t1 . So the continued fraction procedure gives in this
case [1, 2, 2, 2, . . .], which we will also denote by [1, 2] where the overline denotes that it continues
periodically with 2’s. We already know that this continued fraction is convergent.
√ Call
√ the limit y.
Using the periodicity we see that y − 1 = 1/(2 + (y − 1)), leading to y = 2. Hence 2 = [1, 2]. ■

Theorem 7.1.6 Let x ∈ R, and let [a0 , a1 , . . .] be the continued fraction coming from the contin-
ued fraction procedure. Let cn = qpnn be the partial convergents and assume that the length of the
continued fraction is at least n + 1. Then
1
|x − cn | ≤ (7.10)
qn qn+1

If the length of the continued fraction is exactly n + 1 we have x = [a0 , a1 , . . . an+1 ]. In particular
we have in both the finite and infinite case that x = [a0 , a1 , . . .].

Proof. We note that if tn ̸= 0 we have, by (7.9) and Proposition 7.1.1, that

pn (an+1 + tn+1 ) + pn−1 pn pn+1 + pntn+1 pn pn+1 qn − pn qn+1


x − cn = − = − =
qn (an+1 + tn+1 ) + qn−1 qn qn+1 + qntn+1 qn (qn+1 + qntn+1 )qn

from which (7.10) follows using qntn+1 ≥ 0 and Proposition 7.1.2. If tn+1 = 0 we see from the
above computations that x = pn+1 /qn+1 = [a0 , . . . , an+1 ]. If the continued fraction is infinite then
(7.6) implies that cn → x as n → ∞. ■

We note that – in the above proof – if tn+1 > 0, then we have that qntn+1 > 0 and we find that, in
this case, Eq. (7.10) is a strict inequality. Together with the following lemma, this shows that for
irrational numbers the inequality in Eq. (7.10) can be strengthened to a strict inequality.
Lemma 7.1.7 The continued fraction procedure terminates after finitely many steps if and only if
x is rational.

Proof. Use Theorem 7.1.6 and the proof of Lemma 7.1.4. ■

Using Theorem 7.1.6 and (7.6) we see that the √continued fraction procedure gives a rational
n
sequence cn which converges as fast as |x − cn | ≤ 2/2 .

Quadratic irrationals
In Section 4.2.1 we studied algebraic numbers. Given an algebraic number ζ ∈ C there exist a
polynomial with integer coefficients of minimal degree having ζ as a root. We denote this number
the degree of ζ . Irrational numbers of degree 2 are called quadratic irrationals. Hence quadratic
irrationals are non-rational numbers which are roots in integer polynomials of degree 2.

■ Example 7.4 The number ζ1 = (1 + 5)/2
√is a quadratic irrational since it is not rational, and it
is a root of x2 − x − 1 = 0. The number ζ2 = 2 is a quadratic irrational since it is not rational, and
it is a root of x2 − 2 = 0. ■

Definition 7.1.1 An infinite continued fraction x = [a0 , a1 , . . .] is called periodic if there exist
N, h such that

an = an+h
7.1 Continued fractions 69

for all n > N. The minimal choice of h is called the period of x. We write x = [a0 , a1 , . . . , aN , aN+1 , . . . , aN+h ].
√ √
■ Example 7.5
√ For x = 2 we have (compare Example 7.3) N = 0 and h = 1 i.e. 2 = [1, 2].
1.
2. For x√= (1 + 5)/2 we have (compare Example 7.2) N = −1 (or N = 0) and h = 1. i.e.
(1 + 5)/2 = [1].
3. If x = [1, 2, 3] then x = [1, 2, 3, x], i.e.

1 1 3x + 1
x = 1+ = 1+ = 1+
1 x 7x + 2
2+ 2+
1 3x + 1
3+
x
from which it follows that (7x + 2)x = 10x + 3, i.e. 7x2 − 8x − 3 = 0, which√means that x
is a quadratic irrational. Solving the quadratic equation we find that x = 4+7 37 (the other
solution is negative, and x is clearly positive)

The next theorem says that what we saw in Example 7.5 holds in general:

Theorem 7.1.8 An infinite simple continued fraction α = [a0 , a1 , . . .] is periodic if and only if α
is a quadratic irrational.

Proof. Assume first that the continued fraction is periodic i.e.

α = [a0 , a1 , . . . , aN , aN+1 , . . . , aN+h ] = [a0 , a1 , . . . , aN , β ]

with β = [aN+1 , . . . , aN+h ]. Note that by periodicity β = [aN+1 , . . . , aN+h , β ] and by Proposition
7.1.1 we have
p′h−1 β + p′h−2
β=
q′h−1 β + q′h−2

where p′i = pi (aN+1 , . . . , aN+1+i ), q′i = qi (aN+1 , . . . , aN+1+i ) ∈ Z from which it follows that β is a
root of the degree 2 equation

q′h−1 β 2 + (q′h−2 − p′h−1 )β − p′h−2 = 0.

Hence the degree of β is at most 2. If it is 1 then β would be rational, but then its continued
fraction would be finite (Lemma 7.1.4) which it is not. It follows that β is a quadratic irrational and
that α = [a0 , a1 , . . . , aN , β ]. Now we observe that β ′ = k + 1/β is a quadratic irrational if β is a
quadratic irrational and k ∈ Z: Clearly β ′ cannot be rational and if β is a root of f (x) = ax2 + bx + c
with a, b, c ∈ Z and a, c ̸= 0 then β ′ is a root of f ′ (x) = c(x − k)2 + b(x − k) + a since

f ′ (β ′ ) = cβ −2 + bβ −1 + a = β −2 (c + bβ + aβ 2 ) = β −2 f (β ) = 0.

Using (7.3) repeatedly shows that α is a quadratic irrational.


Assume on the other hand that α = [a0 , a1 , . . .] is a quadratic irrational. Since α is not rational
it is an infinite continued fraction. For every n we have by (7.2), α = [a0 , a1 , . . . , an−1 , rn ] where
rn = [an , an+1 , . . .]. We will show that the set of possible rn ’s is finite. It follows that for some
n, h we have rn = rn+h which implies that α = [a0 , a1 , . . . , an−1 , an , an+1 , an+h−1 ], i.e. the continued
fraction expansion is periodic.
To see that the set of rn ’s is finite note that, since α is a quadratic irrational, there exist a, b, c ∈ Z
with ac ̸= 0, such that

0 = aα 2 + bα + c. (7.11)
70 Chapter 7. Continued fractions and approximations

By Proposition 7.1.1
pn−1 rn + pn−2
α= . (7.12)
qn−1 rn + qn−2
By combining (7.11) and (7.12) we see, after a small computation, that

0 = (ap2n−1 + bpn−1 qn−1 + cq2n−1 )rn2


+ (2apn−1 pn−2 + b(pn−1 qn−2 + pn−2 qn−1 ) + 2cqn−1 qn−2 )rn
+ ap2n−2 + bpn−2 qn−2 + cq2n−2
= An rn2 + Bn rn +Cn

Note that An , Bn ,Cn ∈ Z and An−1 = Cn , and

B2n − 4AnCn = (b2 − 4ac)(pn−1 qn−2 − qn−1 pn−2 )2 = b2 − 4ac (7.13)

We claim that there are only finitely many possible values of An . If this is so then there are at most
finitely many possible values of Cn since An−1 = Cn . But then then there are also only finitely many
values of Bn since B2n = b2 − 4ac + 4AnCn .
This means that all rn ’s are roots of one of a list of finitely many polynomials, and hence there
can only be finitely many rn ’s. By the previous discussion this would conclude the proof.
To see that there are only finitely may possible values of An we recall from Theorem 7.1.6 that

pn−1 1
α− <
qn−1 qn−1 qn
−1 −1
so |qn−1 α − pn−1 | < q−1
n < qn−1 i.e. pn−1 = qn−1 α + δn qn−1 where |δn | < 1. Inserting this in the
definition of An we see that

An = a(qn−1 α + δn q−1 2 −1 2
n−1 ) + b(qn−1 α + δn qn−1 )qn−1 + cqn−1
= (aα 2 + bα + c)q2n−1 + 2aαδn + aδn2 q−2 2 −2
n−1 + bδn = 2aαδn + aδn qn−1 + bδn

It follows that |An | ≤ 2 |aα| + |a| + |b| which shows that there are at most finitely many possible
values of An which finishes the proof.

R We have now seen that


• an algebraic number is of degree 1 if and only if the continued fraction is finite (Lemma
7.1.4).
• an algebraic number is of degree 2 if and only if the continued fraction is periodic
(Theorem 7.1.8).
so it is natural to ask if there is some property of the continued fraction expansion which
characterizes the algebraic numbers of degree 3. This question is completely open.

7.2 Approximating real numbers with rationals


Since Q is dense in R we can approximate a real number α by a rational one to any precision. But
what if we want to approximate α by a rational one with small denominator? Is this possible? Here
is the first important result in this direction:
7.2 Approximating real numbers with rationals 71

Theorem 7.2.1 — Dirichlet’s approximation theorem. Let α ∈ R, and n ∈ N. Then there


exist co-prime integers a, b ∈ Z, with 0 < b ≤ n such that
a 1
α− ≤ .
b b(n + 1)

Proof. Consider first the case where α = qp ∈ Q. If 0 < q ≤ n we may choose a = p, b = q.


Assume q > n. From the continued fraction algorithm we find α = [a0 , a1 . . . , ak ] and we have
1 = q0 ≤ q1 < q2 · · · < qk = q. We choose m < k such that qm ≤ n < qm+1 . Then qm+1 ≥ n + 1 and
Theorem 7.1.6 gives the result with a = pm , b = qm .
If α ∈
/ Q we have α = [a0 , a1 , . . .] and qm → ∞. We may choose m such that qm ≤ n < qm+1 ,
which again gives qm+1 ≥ n + 1 and Theorem 7.1.6 gives the result with a = pm , b = qm . ■

R Dirichlet’s approximation theorem 7.2.1 is the beginning of an entire research topic called
Diophantine approximation. For a much more thorough introduction check [Cas72; Ste16].
The proof we have presented here is constructive, as it allows us to use the continued fraction
procedure to actually find the number numbers a, b such that a/b approximates α to the given
precision. It is possible to give another simple (non-constructive) proof which uses only
Dirichlet’s pigeon-hole principle.

Corollary 7.2.2 For every α ∈ R there exist infinitely many pairs (a, b) ∈ Z2 such that

a 1
α− < 2. (7.14)
b b

Proof. If α ∈ Q the result is clear, since we are not requiring a, b to be coprime: If α = p/q any
(a, b) = n(p, q) with n ∈ N will do.
/ Q then we can choose any n = N1 and since b1 (N11 +1) < b12 if 0 < b1 ≤ N1 we find
If α ∈
1
the first pair (a1 , b1 ) by using Theorem 7.2.1. Assume now that there are k different pairs
(a1 , b1 ), . . . , (ak , bk ) satisfying (7.14).
Since α is not rational we have α − abii > 0, so we may choose Nk+1 ∈ N such that

1 ai
< min bi α −
Nk+1 i=1,...,k bi
Applying Theorem 7.2.1 with n = Nk+1 we get a pair (ak+1 , bk+1 ) such that
ak+1 1 1
α− < < .
bk+1 (Nk+1 + 1)bk+1 b2k+1
If (ak+1 , bk+1 ) = (a j , b j ) for some j ≤ k then
1 ai aj ak+1 1
< min bi α − ≤ bj α − = bk+1 α − ≤
Nk+1 i=1,...,k bi bj bk+1 Nk+1 + 1
which is impossible, i.e. (ak+1 , bk+1 ) is different from the first k pairs, and the proof is finished. ■

Definition 7.2.1 An irrational number α ∈ R\Q is called


• badly approximable if for some C = C(α) > 0

a C
α− ≥ 2
b b
72 Chapter 7. Continued fractions and approximations

for every ab ∈ Q.
• well approximable if there exists ε > 0 such that
a 1
α− ≤ 2+ε
b b
for infinitely many ab ∈ Q.
• normally approximable if it is not well approximable nor badly approximable.
Maybe surprisingly we can determine if α is badly approximable by looking at its continued
fraction expansion.

Theorem 7.2.3 A number α ∈ R\Q is badly approximable if and only if the sequence a0 , a1 , . . . ,
produced by the continued fraction procedure is bounded.

Corollary 7.2.4 The set of badly approximable numbers is uncountable.

Proof. See Exercise 7.5 ■

Corollary 7.2.5 Every quadratic irrational is badly approximable.

Proof. See Exercise 7.6 ■


A measurable set A is said to have full Lebesgue measure if its complement has measure 0. Hence
“almost every” number is in the set A.

Theorem 7.2.6 The set of normally approximable numbers has full Lebesgue measure.

It follows that the generic irrational number is normally approximable. The next theorem is deep,
and in 1958 it earned Klaus Roth the Fields medal. The proof is well beyond the scope of these
notes:
Theorem 7.2.7 — Roth. No algebraic irrational number is well approximable.

R Note that it follows from Roth’s theorem 7.2.7 that all well-approximable numbers are
transcendental (note that there are much easier ways to prove this). It is not known if
algebraic numbers of degree d > 2 are normally approximable or badly approximable, but it
is conjectured that they are all normally approximable.

7.3 Exercises for chapter 7


Exercise 7.1 Find the continued fraction of 13/9.
Exercise 7.2 Evaluate the infinite continued fraction√[1, 2, 1, 2].
Exercise 7.3 Determine the continued fraction (1 + 7)/3.
Exercise 7.4 Prove (7.9). (Hint: (7.3)).
Exercise 7.5 Prove Corollary 7.2.4.
Exercise 7.6 Prove Corollary 7.2.5.
Exercise 7.7 Prove Theorem 7.1.5 by showing – using propositions 7.1.1 and 7.1.2, combined
with (7.6) – that
√ n 1
|cn − cm | ≤ 2 ∑ k ,
k=m 2
and conclude that cn is a Cauchy sequence.
8. The ABC-conjecture

In the 1980s Oesterlé and Masser suggested the following important conjecture:
For an integer n we define the radical, rad(n) as the product of all distinct prime factors of n i.e.

rad(n) = ∏ p.
p|n

It is the largest square-free factor of n.


Conjecture 8.0.1 — ABC. For every ε > 0 there exists a constant Cε > 0 such that for all mutually
co-prime integers a,b,c satisfying a + b = c we have

max(|a| , |b| , |c|) ≤ Cε (rad(abc))1+ε

The ABC-conjecture mixes the additive structure (by relating a, b, c by a + b = c) and the
multiplicative structure (the radical of abc is defined through the factors of abc). In informal terms
the conjecture implies that if a and b are positive numbers which contain large powers of small
primes then c must contain small powers of larger primes.
■ Example 8.1 Here are a couple of randomly chosen a, b which contain large powers of small
primes, illustrating that their sum must have at least one small power of a large prime.
• a = 217 and b = 529 then c = a + b = 186264514923095834197 factors as

c = 72 · 3801316631083588453

• a = 39 and b = 1110 then c = a + b = 34522712163614 factors as

c = 2 · 587 · 29406058061

• a = (2 · 3)12 and b = (5 · 7)15 then c = a + b = 144884079282930643579211 factors as

c = 52523171 · 2758479286845241


74 Chapter 8. The ABC-conjecture

The ABC-conjecture is surprisingly powerful, and it solves a lot of other conjectures in number
theory, especially concerning diophantine equations. A long and extremely complex proposed
proof by Mochizuki is currently under scrutiny by the mathematical community.
For one application of the ABC-conjecture note the following:
Wiles and Taylor proved, in the mid 1990’ies Fermat’s last theorem, which we state below. The
proof was long using advanced machinery about elliptic curves, modular functions, etc.

Theorem 8.0.2 — Fermat’s last theorem. Let n ≥ 3. The diophantine equation

xn + yn = zn (8.1)

has no solutions x, y, z ∈ N.

The case n = 3 is due to Euler, n = 4 is due to Fermat, and n = 5 is due to Dirichlet and Legendre.
The n = 6 case follows from the n = 3 case, and the n = 7 case was proved in 1839 by Lamé.
Using the ABC-conjecture we can get pretty close to a short proof of the rest of the theo-
rem. Given a solution (x, y, z) ∈ N3 we would get immediately from the ABC-conjecture since
rad(xn yn zn ) = rad(xyz) ≤ xyz ≤ z3 that

zn ≤ Cε z3(1+ε) . (8.2)

Clearly z > 1 so taking logs we find that


logCε logCε
n − 3(1 + ε) ≤ ≤
log z log 2

so Fermat’s last theorem is true for n > log(Cε )/ log(2) + 3(1 + ε).
If we know the ABC-conjecture with any exponent ε0 satisfying n − 3(1 + ε0 ) > 0 then by (8.2)
1/(n−3(1+ε0 ))
we have |z| ≤ Cε0 , i.e. given n there can only be finitely many z’s and hence only finitely
many solutions to Fermat’s equation (8.1) for that given n.
If the ABC-conjecture holds with ε = 1 and C = 1 we find zn ≤ z6 . In particular zn−6 ≤ 1
proves Fermat’s last theorem if n > 6. Note that the cases n = 3, 4, 5, 6 have been resolved much
earlier; See [Rib79].
Bibliography

[AKS04] Manindra Agrawal, Neeraj Kayal, and Nitin Saxena. “PRIMES is in P”. In: Ann. of
Math. (2) 160.2 (2004), pp. 781–793 (cit. on p. 11).
[AGP94] W. R. Alford, Andrew Granville, and Carl Pomerance. “There are infinitely many
Carmichael numbers”. In: Ann. of Math. (2) 139.3 (1994), pp. 703–722 (cit. on p. 26).
[Ank57] N. C. Ankeny. “Sums of three squares”. In: Proc. Amer. Math. Soc. 8 (1957), pp. 316–
319 (cit. on p. 59).
[Apo76] Tom M. Apostol. Introduction to analytic number theory. Undergraduate Texts in
Mathematics. New York: Springer-Verlag, 1976, pp. xii+338 (cit. on p. 13).
[Bac90] Eric Bach. “Explicit bounds for primality testing and related problems”. In: Math.
Comp. 55.191 (1990), pp. 355–380 (cit. on p. 27).
[Bau15] Oswald Baumgart. The quadratic reciprocity law. A collection of classical proofs,
Edited, translated from the German, and with contributions by Franz Lemmermeyer.
Birkhäuser/Springer, Cham, 2015, pp. xiv+172 (cit. on p. 34).
[Bor+08] Peter Borwein et al., eds. The Riemann hypothesis. CMS Books in Mathematics/Ouvrages
de Mathématiques de la SMC. A resource for the afficionado and virtuoso alike.
Springer, New York, 2008, pp. xiv+533 (cit. on p. 51).
[Cas72] J. W. S. Cassels. An introduction to Diophantine approximation. Facsimile reprint of
the 1957 edition, Cambridge Tracts in Mathematics and Mathematical Physics, No. 45.
Hafner Publishing Co., New York, 1972, pp. x+169 (cit. on p. 71).
[DH76] Whitfield Diffie and Martin E. Hellman. “New directions in cryptography”. In: IEEE
Trans. Information Theory IT-22.6 (1976), pp. 644–654 (cit. on p. 28).
[DF04] David S. Dummit and Richard M. Foote. Abstract algebra. Third. Hoboken, NJ: John
Wiley & Sons Inc., 2004, pp. xii+932 (cit. on p. 57).
[Fen12] Shaoji Feng. “Zeros of the Riemann zeta function on the critical line”. In: J. Number
Theory 132.4 (2012), pp. 511–542 (cit. on p. 51).
76 Chapter 8. The ABC-conjecture

[Gra05] Andrew Granville. “It is easy to determine whether a given integer is prime”. In: Bull.
Amer. Math. Soc. (N.S.) 42.1 (2005), pp. 3–38 (cit. on p. 11).
[Har14] G. H. Hardy. “Sur les zéros de la fonction zeta(s) de Riemann.” French. In: C. R. Acad.
Sci., Paris 158 (1914), pp. 1012–1014 (cit. on p. 51).
[Hea86] D. R. Heath-Brown. “Artin’s conjecture for primitive roots”. In: Quart. J. Math. Oxford
Ser. (2) 37.145 (1986), pp. 27–38 (cit. on p. 22).
[Hec81] Erich Hecke. Lectures on the theory of algebraic numbers. Vol. 77. Graduate Texts in
Mathematics. Translated from the German by George U. Brauer, Jay R. Goldman and
R. Kotzen. New York: Springer-Verlag, 1981, pp. xii+239 (cit. on p. 37).
[Hil09] David Hilbert. “Beweis für die Darstellbarkeit der ganzen Zahlen durch eine feste
Anzahl nter Potenzen (Waringsches Problem)”. In: Math. Ann. 67.3 (1909), pp. 281–
300 (cit. on p. 53).
[Hoo67] Christopher Hooley. “On Artin’s conjecture”. In: J. Reine Angew. Math. 225 (1967),
pp. 209–220 (cit. on p. 23).
[Ivi85] Aleksandar Ivić. The Riemann zeta-function. A Wiley-Interscience Publication. The
theory of the Riemann zeta-function with applications. New York: John Wiley & Sons
Inc., 1985, pp. xvi+517 (cit. on p. 51).
[Jar14] Frazer Jarvis. Algebraic number theory. Springer Undergraduate Mathematics Series.
Springer, Cham, 2014, pp. xiv+292 (cit. on pp. 36, 37).
[JJ98] Gareth A. Jones and J. Mary Jones. Elementary number theory. Springer Undergraduate
Mathematics Series. London: Springer-Verlag London Ltd., 1998, pp. xiv+301 (cit. on
pp. 22, 24, 59, 60).
[KV92] A. A. Karatsuba and S. M. Voronin. The Riemann zeta-function. Vol. 5. de Gruyter
Expositions in Mathematics. Translated from the Russian by Neal Koblitz. Berlin:
Walter de Gruyter & Co., 1992, pp. xii+396 (cit. on p. 51).
[Mar77] Daniel A. Marcus. Number fields. Universitext. New York: Springer-Verlag, 1977,
pp. viii+279 (cit. on p. 37).
[Neu99] Jürgen Neukirch. Algebraic number theory. Vol. 322. Grundlehren der Mathematischen
Wissenschaften [Fundamental Principles of Mathematical Sciences]. Translated from
the 1992 German original and with a note by Norbert Schappacher, With a foreword by
G. Harder. Berlin: Springer-Verlag, 1999, pp. xviii+571 (cit. on p. 37).
[Rib79] Paulo Ribenboim. 13 lectures on Fermat’s last theorem. Springer-Verlag, New York-
Heidelberg, 1979, xvi+302 pp. (1 plate) (cit. on p. 74).
[Rie59] Bernhard Riemann. “Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse.”
In: Monatsberichte der Berliner Akademie (1859), pp. 671–680 (cit. on p. 50).
[RSA78] R. L. Rivest, A. Shamir, and L. Adleman. “A method for obtaining digital signatures
and public-key cryptosystems”. In: Comm. ACM 21.2 (1978), pp. 120–126 (cit. on
p. 30).
[Rou91] G. Rousseau. “On the quadratic reciprocity law”. In: J. Austral. Math. Soc. Ser. A 51.3
(1991), pp. 423–425 (cit. on p. 34).
[Sel42] Atle Selberg. “On the zeros of Riemann’s zeta-function”. In: Skr. Norske Vid. Akad.
Oslo I. 1942.10 (1942), p. 59 (cit. on p. 51).
[Sho94] Peter W Shor. “Algorithms for quantum computation: Discrete logarithms and factor-
ing”. In: Foundations of Computer Science, 1994 Proceedings., 35th Annual Symposium
on. IEEE. 1994, pp. 124–134 (cit. on pp. 12, 29).
77

[Sho97] Peter W. Shor. “Polynomial-time algorithms for prime factorization and discrete log-
arithms on a quantum computer”. In: SIAM J. Comput. 26.5 (1997), pp. 1484–1509
(cit. on p. 12).
[Ste09] William Stein. Elementary number theory: primes, congruences, and secrets. Under-
graduate Texts in Mathematics. A computational approach. Springer, New York, 2009,
pp. x+166 (cit. on p. 25).
[Ste16] Jörn Steuding, ed. Diophantine analysis. Trends in Mathematics. Course notes from the
summer school held at the University of Würzburg, July 21–26, 2014. Birkhäuser/Springer,
Cham, 2016, pp. xi+232 (cit. on p. 71).
[Tit86] Edward C. Titchmarsh. The theory of the Riemann zeta-function. Second. Edited
and with a preface by D. R. Heath-Brown. New York: The Clarendon Press Oxford
University Press, 1986, pp. x+412 (cit. on p. 51).
[Wal79] Michel Waldschmidt. Transcendence methods. Vol. 52. Queen’s Papers in Pure and
Applied Mathematics. Available at the authors webpage. Kingston, Ont.: Queen’s
University, 1979, 126 pp. (not consecutively paged) (cit. on p. 36).
[Zag90] D. Zagier. “A one-sentence proof that every prime p ≡ 1 (mod 4) is a sum of two
squares”. In: Amer. Math. Monthly 97.2 (1990), p. 144 (cit. on p. 55).
Index

nth root of 1, 37 Diophantine equation, 9


primitive, 37 Diophantine equations, 9
Dirichlet convolution, 44
ABC conjecture, 73 Dirichlet’s approximation theorem, 71
abscissa of absolute convergence, 48 discrete log problem, 29
AKS algorithm, 11 divides, 6
algebraic number, 35 division ring, 17
algebraic integer, 35
arithmetical function, 42 equivalent modulo lattice, 60
Artin’s conjecture, 22 Euclid’s algorithm, 8
associates, 57 Euclid’s lemma, 11
aureum theorema, 34 Euclid-Mullin sequence, 12
Euclidean division, 6
badly approximable, 71 Euclidean function, 56
Bezout’s identity, 8 Euclidean ring, 56
Euler products, 49
Carmichael numbers, 26
Euler’s criterion, 35
centrally symmetric set, 61
Euler’s ϕ-function, 18
Chinese remainder theorem, 18
Euler’s theorem, 18, 54
completely multiplicative, 42
existence of primitive roots modulo p, 21
composite number, 11
continued fraction procedure, 67 factor, 6
continued fractions, 64 Fermat’s last theorem, 74
length, 64 Fermat’s little theorem, 18
simple, 64 field, 17
convex set, 61 Freshman’s dream, 23
coprime, 9 fundamental domain, 60
cryptography, 25 fundamental mesh, 60
fundamental theorem of arithmetic, 11
degree of an algebraic number, 68
Diffie–Hellman key exchange, 28 Gauss sum, 38
INDEX 79

Gaussian integers, 56 sums of four squares, 59


GIMP, 13 sums of three squares, 59
greatest common divisor, 7 sums of two squares, 54

infinite continued fractions, 66 transcendental number, 36


invertible, 17
unit, 17
Korselt’s criterion, 31
Waring’s problem, 53
Lagrange’s theorem, 54 well approximable, 72
lattice, 60 Wilson’s theorem, 25
Legendre symbol, 33
linear equations modulo n, 9
local obstructions, 23

Möbius function, 47
Möbius inversion, 47
man in the middle attack, 29
Mersenne primes, 15
Miller-Rabin primality test, 26
Minkowski’s theorem, 61
modular exponentiation, 27
multiplicative, 42

normally approximable, 72

order of group element, 20

partial convergent, 65
perfect number, 43
periodic continued fraction, 68
prime, 11
prime number theorem, 13
primitive root mod n, 20
pseudo-prime, 27
Pythagorean triples, 13

quadratic irrational, 36, 68


quadratic irrationals, 68
quadratic non-residue modulo p, 33
quadratic reciprocity, 34
quadratic residue modulo p, 33

radical, 73
relatively prime, 9
Riemann hypothesis, 13, 51
Riemann zeta function, 13, 50
ring, 16
Roth’s theorem, 72
RSA, 30

skew field, 17

You might also like