Prime Twin
Prime Twin
JAMES MAYNARD
Abstract. We introduce a refinement of the GPY sieve method for studying prime k-tuples
and small gaps between primes. This refinement avoids previous limitations of the method,
and allows us to show that for each k, the prime k-tuples conjecture holds for a positive pro-
portion of admissible k-tuples. In particular, lim inf n (pn+m − pn ) < ∞ for every integer m. We
arXiv:1311.4600v3 [math.NT] 28 Oct 2019
also show that lim inf(pn+1 − pn ) ≤ 600, and, if we assume the Elliott-Halberstam conjecture,
that lim inf n (pn+1 − pn ) ≤ 12 and lim inf n (pn+2 − pn ) ≤ 600.
1. Introduction
We say that a set H = {h1 , . . . , hk } of distinct non-negative integers is ‘admissible’ if, for
every prime p, there is an integer a p such that a p . h (mod p) for all h ∈ H. We are interested
in the following conjecture.
Conjecture (Prime k-tuples conjecture). Let H = {h1 , . . . , hk } be admissible. Then there are
infinitely many integers n such that all of n + h1 , . . . , n + hk are prime.
When k > 1 no case of the prime k-tuples conjecture is currently known. Work on approx-
imations to the prime k-tuples conjecture has been very successful in showing the existence
of small gaps between primes, however. In their celebrated paper [5], Goldston, Pintz and
Yıldırım introduced a new method for counting tuples of primes, and this allowed them to
show that
pn+1 − pn
(1.1) lim inf = 0.
n log pn
The recent breakthrough of Zhang [9] managed to extend this work to prove
(1.2) lim inf(pn+1 − pn ) ≤ 70 000 000,
n
thereby establishing for the first time the existence of infinitely many bounded gaps between
primes. Moreover, it follows from Zhang’s theorem the that number of admissible sets of
size 2 contained in [1, x]2 which satisfy the prime 2-tuples conjecture is ≫ x2 for large x.
Thus, in this sense, a positive proportion of admissible sets of size 2 satisfy the prime 2-tuples
conjecture. The recent polymath project [7] has succeeded in reducing the bound (1.2) to
4680, by optimizing Zhang’s arguments and introducing several new refinements.
The above results have used the ‘GPY method’ to study prime tuples and small gaps be-
tween primes, and this method relies heavily on the distribution of primes in arithmetic pro-
gressions. Given θ > 0, we say the primes have ‘level of distribution θ’1 if, for every A > 0,
we have
X π(x) x
(1.3) max π(x; q, a) − ≪A .
θ
(a,q)=1 ϕ(q) (log x)A
q≤x
1
We note that different authors have given slightly different names or definitions to this concept. For the
purposes of this paper, (1.3) will be our definition of the primes having level of distribution θ.
1
2 JAMES MAYNARD
The Bombieri-Vinogradov theorem establishes that the primes have level of distribution θ for
every θ < 1/2, and Elliott and Halberstam [1] conjectured that this could be extended to every
θ < 1. Friedlander and Granville [2] have shown that (1.3) cannot hold with xθ replaced with
x/(log x)B for any fixed B, and so the Elliott-Halberstam conjecture is essentially the strongest
possible result of this type.
The original work of Goldston, Pintz and Yıldırım showed the existence of bounded gaps
between primes if (1.3) holds for some θ > 1/2. Moreover, under the Elliott-Halberstam
conjecture one had lim infn (pn+1 − pn ) ≤ 16. The key breakthrough of Zhang’s work was in
establishing that a slightly weakened form of (1.3) holds for some θ > 1/2.
If one looks for bounded length intervals containing two or more primes, then the GPY
method fails to prove such strong results. Unconditionally we are only able to improve upon
the trivial bound from the prime number theorem by a constant factor [4], and even assuming
the Elliott-Halberstam conjecture, the best available result [5] is
pn+2 − pn
(1.4) lim inf = 0.
n log pn
The aim of this paper is to introduce a refinement of the GPY method which removes the
barrier of θ = 1/2 to establishing bounded gaps between primes, and allows us to show the
existence of arbitrarily many primes in bounded length intervals. This answers the second and
third questions posed in [5] on extensions of the GPY method (the first having been answered
by Zhang’s result). Our new method also has the benefit that it produces numerically superior
results to previous approaches.
Theorem 1.1. Let m ∈ N. We have
lim inf(pn+m − pn ) ≪ m3 e4m .
n
Terence Tao (private communication) has independently proven Theorem 1.1 (with a slightly
weaker bound) at much the same time. He uses a similar method; the steps are more-or-less
the same but the calculations are done differently. We will indicate some of the differences in
our proofs as we go along.
We see that the bound in Theorem 1.1 is quite far from the conjectural bound of approxi-
mately m log m predicted by the prime m-tuples conjecture.
Our proof naturally generalizes (but with a weaker upper bound) to many subsequences of
the primes which have a level of distribution θ > 0. For example, we can show corresponding
results where the primes are contained in short intervals [N, N + N 7/12+ǫ ] for any ǫ > 0 or in
an arithmetic progression modulo q ≪ (log N)A . In particular, our method gives results for
simultaneously prime values of linear functions, which might have specific interest. Given k
distinct linear functions Li (n)Q= ai n + bi (1 ≤ i ≤ k) with positive integer coefficients such that
the product function Π(n) = ki=1 Li (n) has no fixed prime divisor, the method presented here
shows that there are infinitely many integers n such that at least (1/4 + ok→∞ (1)) log k of the
Li (n) are prime.
Theorem 1.2. Let m ∈ N. Let r ∈ N be sufficiently large depending on m, and let A =
{a1 , a2, . . . , ar } be a set of r distinct integers. Then we have
#{{h1, . . . , hm } ⊆ A : for infinitely many n all of n + h1 , . . . , n + hm are prime}
≫m 1.
#{{h1 , . . . , hm } ⊆ A}
Thus a positive proportion of admissible m-tuples satisfy the prime m-tuples conjecture for
every m, in an appropriate sense.
SMALL GAPS BETWEEN PRIMES 3
We emphasize that the above result does not incorporate any of the technology used by
Zhang to establish the existence of bounded gaps between primes. The proof is essentially
elementary, relying only on the Bombieri-Vinogradov theorem. Naturally, if we assume that
the primes have a higher level of distribution, then we can obtain stronger results.
Theorem 1.4. Assume that the primes have level of distribution θ for every θ < 1. Then
lim inf(pn+1 − pn ) ≤ 12,
n
lim inf(pn+2 − pn ) ≤ 600.
n
Although the constant 12 of Theorem 1.4 appears to be optimal with our method in its
current form, the constant 600 appearing in Theorem 1.3 and Theorem 1.4 is certainly not
optimal. By performing further numerical calculations our method could produce a better
bound, and also most of the ideas of Zhang’s work (and the refinements produced by the
polymath project) should be able to be combined with this method to reduce the constant
further. We comment that the assumption of the Elliott-Halberstam conjecture allows us to
improve the bound on Theorem 1.1 to O(m3 e2m ).
2. An improved GPY sieve method
We first give an explanation of the key idea behind our new approach. The basic idea of
the GPY method is, for a fixed admissible set H = {h1 , . . . , hk }, to consider the sum
X X k
(2.1) S (N, ρ) = χP (n + hi ) − ρ wn .
N≤n<2N i=1
Here χP is the characteristic function of the primes, ρ > 0 and wn are non-negative weights.
If we can show that S (N, ρ) > 0 then at least one term in the sum over n must have a positive
contribution. By the non-negativity of wn , this means that there must be some integer n ∈
[N, 2N] such that at least ⌊ρ + 1⌋ of the n + hi are prime. (Here ⌊x⌋ denotes the largest integer
less than or equal to x.) Thus if S (N, ρ) > 0 for all large N, there are infinitely many integers
n for which at least ⌊ρ + 1⌋ of the n + hi are prime (and so there are infinitely many bounded
length intervals containing ⌊ρ + 1⌋ primes).
The weights wn are typically chosen to mimic Selberg sieve weights. Estimating (2.1)
can be interpreted as a ‘k-dimensional’ sieve problem. The standard Selberg k-dimensional
weights (which can be shown to be essentially optimal in other contexts) are
X 2
(2.2) wn = λd , λd = µ(d)(log R/d)k .
Qk
d| i=1 (n+hi )
d<R
With this choice we find that we just fail to prove the existence of bounded gaps between
primes if we assume the Elliott-Halberstam conjecture. The key new idea in the paper of
Goldston, Pintz and Yıldırım [5] was to consider more general sieve weights of the form
(2.3) λd = µ(d)F(log R/d),
for a suitable smooth function F. Goldston, Pintz and Yıldırım chose F(x) = xk+l for suitable
l ∈ N, which has been shown to be essentially optimal when k is large. This allows us to gain a
factor of approximately 2 for large k over the previous choice of sieve weights. As a result we
4 JAMES MAYNARD
just fail to prove bounded gaps using the fact that the primes have exponent of distribution θ
for any θ < 1/2, but succeed in doing so if we assume they have level of distribution θ > 1/2.
The new ingredient in our method is to consider a more general form of the sieve weights
X 2
(2.4) wn = λd1 ,...,dk .
di |n+hi ∀i
Using such weights with λd1 ,...,dk is the key feature of our method. It allows us to improve
on the previous choice of sieve weights by an arbitrarily large factor, provided that k is suf-
ficiently large. It is the extra flexibility gained by allowing the weights to depend on the
divisors of each factor individually which gives this improvement.
The idea to use such weights is not entirely new. Selberg [8, Page 245] suggested the
possible use of similar weights in his work on approximations to the twin prime problem, and
Goldston and Yıldırım [6] considered similar weights in earlier work on the GPY method, but
with the support restricted to di < R1/k for all i.
We comment that our choice of λd1 ,...,dk will look like
k
Y
(2.5) λd1 ,...,dk ≈ µ(di ) f (d1, . . . , dk ),
i=1
for a suitable smooth function f . For our precise choice of λd1 ,...,dk (given in Proposition 4.1)
we find it convenient to give a slightly different form of λd1 ,...,dk , but weights of the form (2.5)
should produce essentially the same results.
3. Notation
We shall view k as a fixed integer, and H = {h1 , . . . , hk } as a fixed admissible set. In
particular, any constants implied by the asymptotic notation o, O or ≪ may depend on k and
H. We will let N denote a large integer, and all asymptotic notation should be interpreted as
referring to the limit N → ∞.
All sums, products and suprema will be assumed to be taken over variables lying in the
natural numbers N = {1, 2, . . . } unless specified otherwise. The exception to this is when
sums or products are over a variable p, which instead will be assumed to lie in the prime
numbers P = {2, 3, . . . , }.
Throughout the paper, ϕ will denote the Euler totient function, τr (n) the number of ways
of writing n as a product of r natural numbers and µ the Moebius function. We will let ǫ be a
fixed positive real number, and we may assume without further comment that ǫ is sufficiently
small at various stages of our argument. We let pn denote the nth prime, and #A denote the
number of elements of a finite set A. We use ⌊x⌋ to denote the largest integer n ≤ x, and ⌈x⌉
the smallest integer n ≥ x. We let (a, b) be the greatest common divisor of integers a and b.
Finally, [a, b] will denote the closed interval on the real line with endpoints a and b, except
for in Section 5 where it will denote the least common multiple of integers a and b instead.
so certainly W ≪ (log log N)2 by the prime number theorem. By the Chinese remainder
theorem, we can choose v0 such that v0 + hi is coprime to W for each i since H is admissible.
When n ≡ v0 (mod W), we choose our weights wn of the form (2.4). We now wish to estimate
the sums
2
X X
(4.2) S1 =
λ d1 ,...,dk
,
N≤n<2N di |n+hi ∀i
n≡v0 (mod W)
k
2
X X X
(4.3) S2 = χP (n + hi ) λd1 ,...,dk .
N≤n<2N i=1 di |n+hi ∀i
n≡v0 (mod W)
whenever ( ki=1 di , W) = 1, and let λd1 ,...,dk = 0 otherwise. Moreover, let F be supported on
Q
Rk = {(x1 , . . . , xk ) ∈ [0, 1]k : ki=1 xi ≤ 1}. Then we have
P
We recall that if S 2 is large compared to S 1 , then using the GPY method we can show that
there are infinitely many integers n such that several of the n + hi are prime. The following
proposition makes this precise.
Proposition 4.2. Let the primes have level of distribution θ > 0. Let δ > 0 and H =
{h1 , . . . , hk } be an admissible set. Let Ik (F) and Jk(m) (F) be given as in Proposition 4.1,
and let Sk denote the set of Riemann-integrable functions F : [0, 1]k → R supported on
Rk = {(x1 , . . . , xk ) ∈ [0, 1]k : ki=1 xi ≤ 1} with Ik (F) , 0 and Jk(m) (F) , 0 for each m. Let
P
Pk (m)
m=1 Jk (F)
l θMk m
Mk = sup , rk = .
F∈Sk Ik (F) 2
Then there are infinitely many integers n such that at least rk of the n + hi (1 ≤ i ≤ k) are
prime. In particular, lim infn (pn+rk −1 − pn ) ≤ max1≤i, j≤k (hi − h j ).
6 JAMES MAYNARD
Proof of Proposition 4.2. We let S = S 2 − ρS 1 , and recall that from Section 2 that if we can
show S > 0 for all large N, then there are infinitely many integers n such that at least ⌊ρ + 1⌋
of the n + hi are prime.
We put R = N θ/2−δ for a small δ > 0. By the definition of Mk , we can choose F 0 ∈ Sk such
that km=1 Jk(m) (F 0 ) > (Mk − δ)Ik (F 0 ) > 0. Since F 0 is Riemann-integrable, there is a smooth
P
function F 1 such that km=1 Jk(m) (F 1 ) > (Mk − 2δ)Ik (F 1 ) > 0. Using Proposition 4.1, we can
P
then choose λd1 ,...,dk such that
k
ϕ(W)k N(log R)k log R X (m)
S = J (F 1 ) − ρIk (F 1 ) + o(1)
W k+1 log N j=1 k
ϕ(W)k N(log R)k Ik (F 1 ) θ
(4.4) ≥ − δ M k − 2δ − ρ + o(1) .
W k+1 2
If ρ = θMk /2 − ǫ then, by choosing δ suitably small (depending on ǫ), we see that S > 0 for
all large N. Thus there are infinitely many integers n for which at least ⌊ρ + 1⌋ of the n + hi
are prime. Since ⌊ρ + 1⌋ = ⌈θMk /2⌉ if ǫ is suitably small, we obtain Proposition 4.2.
Thus, if the primes have a fixed level of distribution θ, to show the existence of many of the
n + hi being prime for infinitely many n ∈ N we only require a suitable lower bound for Mk .
The following proposition establishes such a bound for different values of k.
Proposition 4.3. Let k ∈ N, and Mk be as given by Proposition 4.2. Then
(1) We have M5 > 2.
(2) We have M105 > 4.
(3) If k is sufficiently large, we have Mk > log k − 2 log log k − 2.
We now prove Theorems 1.1, 1.2, 1.3 and 1.4 from Propositions 4.2 and 4.3.
First we consider Theorem 1.3. We take k = 105. By Proposition 4.3, we have M105 > 4.
By the Bombieri-Vinogradov theorem, the primes have level of distribution θ = 1/2 − ǫ for
every ǫ > 0. Thus, if we take ǫ sufficiently small, we have θM105 /2 > 1. Therefore, by
Proposition 4.2, we have lim inf(pn+1 − pn ) ≤ max1≤i, j≤105 (hi − h j ) for any admissible set
H = {h1 , . . . , h105 }. By computations performed by Thomas Engelsma (unpublished), we can
choose2 H such that 0 ≤ h1 < . . . < h105 and h105 − h1 = 600. This gives Theorem 1.3.
If we assume the Elliott-Halberstam conjecture then the primes have level of distribution
θ = 1 − ǫ. First we take k = 105, and see that θM105 /2 > 2 for ǫ sufficiently small (since
M105 > 4). Therefore, by Proposition 4.2, lim infn (pn+2 − pn ) ≤ max1≤i, j≤105 (hi − h j ). Thus,
choosing the same admissible set H as above, we see lim infn (pn+2 − pn ) ≤ 600 under the
Elliott-Halberstam conjecture.
Next we take k = 5 and H = {0, 2, 6, 8, 12}, with θ = 1 − ǫ again. By Proposition 4.3
we have M5 > 2, and so θM5 /2 > 1 for ǫ sufficiently small. Thus, by Proposition 4.2,
lim infn (pn+1 − pn ) ≤ 12 under the Elliott-Halberstam conjecture. This completes the proof of
Theorem 1.4.
2
Explicitly, we can take H = {0, 10, 12, 24, 28, 30, 34, 42, 48, 52, 54, 64, 70, 72, 78, 82, 90, 94, 100, 112,
114, 118, 120, 124, 132, 138, 148, 154, 168, 174, 178, 180, 184, 190, 192, 202, 204, 208, 220, 222, 232, 234,
250, 252, 258, 262, 264, 268, 280, 288, 294, 300, 310, 322, 324, 328, 330, 334, 342, 352, 358, 360, 364, 372,
378, 384, 390, 394, 400, 402, 408, 412, 418, 420, 430, 432, 442, 444, 450, 454, 462, 468, 472, 478, 484, 490,
492, 498, 504, 510, 528, 532, 534, 538, 544, 558, 562, 570, 574, 580, 582, 588, 594, 598, 600}. This set was
obtained from the website http://math.mit.edu/˜primegaps/ maintained by Andrew Sutherland.
SMALL GAPS BETWEEN PRIMES 7
Finally, we consider the case when k is large. For the rest of this section, any constants im-
plied by asymptotic notation will be independent of k. By the Bombieri-Vinogradov theorem,
we can take θ = 1/2 − ǫ. Thus, by Proposition 4.3, we have for k sufficiently large
θMk 1 ǫ
(4.5) ≥ − log k − 2 log log k − 2 .
2 4 2
We choose ǫ = 1/k, and see that θMk /2 > m if k ≥ Cm2 e4m for some absolute constant C
(independent of m and k). Thus, for any admissible set H = {h1 , . . . , hk } with k ≥ Cm2 e4m ,
at least m + 1 of the n + hi must be prime for infinitely many integers n. We can choose
our set H to be the set {pπ(k)+1 , . . . , pπ(k)+k } of the first k primes which are greater than k.
This is admissible, since no element is a multiple of a prime less than k (and there are k
elements, so it cannot cover all residue classes modulo any prime greater than k.) This set has
diameter pπ(k)+k − pπ(k)+1 ≪ k log k. Thus lim infn (pn+m − pn ) ≪ k log k ≪ m3 e4m if we take
k = ⌈Cm2 e4m ⌉. This gives Theorem 1.1.
We can now establish Theorem 1.2 by a simple counting argument. Given m, we let
k = ⌈Cm2 e4m ⌉ as above. Therefore if {h1 , . . . , hk } is admissible, then there exists a subset
{h′1 , . . . , h′m } ⊆ {h1, . . . , hk } with the property that there are infinitely many integers n for which
all of the n + h′i are prime (1 ≤ i ≤ m).
We let A2 denote the set formed by starting with the given set A = {a1 , . . . , ar }, and for
each prime p ≤ k in turn removing all elements Q of the residue class modulo p which contains
the fewest integers. We see that #A2 ≥ r p≤k (1 − 1/p) ≫m r. Moreover, any subset of A2
of size k must be admissible, since it cannot cover all residue classes modulo p for any prime
p ≤ k. We let s = #A2 , and since r is taken sufficiently large in terms of m, we may assume
that s > k.
We see there are ks sets H ⊆ A2 of size k. Each of these is admissible, and so contains
at least one subset {h′1, . . . , h′m } ⊆ A2 which satisfies s−m the prime m-tuples conjecture. Any
admissible set B ⊆ A2 of size m is contained in k−m sets H ⊆ A2 of size k. Thus there are
s−m−1
at least ks k−m ≫m sm ≫m rm admissible sets B ⊆ A2 of size m which satisfy the prime
m-tuples conjecture. Since there are mr ≤ rm sets {h1 , . . . , hm } ⊆ A, Theorem 1.2 holds.
We are left to establish Propositions 4.1 and 4.3.
Proof. We expand out the square, and swap the order of summation to give
X X 2 X X
(5.1) S1 = λd1 ,...,dk = λd1 ,...,dk λe1 ,...,ek 1.
N≤n<2N di |n+hi ∀i d1 ,...,dk N≤n<2N
n≡v0 (mod W) e1 ,...,ek n≡v0 (mod W)
[di ,ei ]|n+hi ∀i
We recall that here, and throughout this section, we are using [a, b] to denote the least common
multiple of a and b.
By the Chinese remainderQtheorem, the inner sum can be written as a sum over a single
residue class modulo q = W ki=1 [di , ei ], provided that the integers W, [d1, e1 ], . . . , [dk , ek ] are
pairwise coprime. In this case the inner sum is N/q + O(1). If the integers are not pairwise
coprime then the inner sum is empty. This gives
N X′ λd1 ,...,dk λe1 ,...,ek X′
(5.2) S1 = Qk + O |λ |
d1 ,...,dk e1 ,...,ek ,
λ
W d ,...,d i=1 [di , ei ] d ,...,d
1 k 1 k
e1 ,...,ek e1 ,...,ek
where ′ is used to denote the restriction that we require W, [d1, e1 ], . . . , [dk , ek ] to be pairwise
P
coprime. To ease notation we will put λmax = supd1 ,...,dk |λd1 ,...,dk |. We now see that since λd1 ,...,dk
is non-zero only when ki=1 di < R, the error term contributes
Q
X 2
(5.3) ≪ λ2max τk (d) ≪ λ2max R2 (log R)2k ,
d<R
We recall that λd1 ,...,dk is supported on integers d1 , . . . , dk with (di , W) = 1 for each i and
(di , d j ) = 1 for all i , j. Thus we may drop the requirement that W is coprime to each of
the [di , ei ] from the summation, since these terms have no contribution. Similarly, we may
drop the requirement that the di variables are all pairwise coprime, and the requirement that
the ei variables are all pairwise coprime. Thus the only remaining restriction coming from the
pairwise coprimality of W, [d1 , e1 ], . . . , [dk , ek ] is that (di , e j ) = 1 for all i , j.
SMALL GAPS BETWEEN PRIMES 9
P
We can remove the requirement that (di , e j ) = 1 by multiplying our expression by si, j |di ,e j µ(si, j ).
We do this for all i, j with i , j. This transforms the main term to
k
N X Y X Y X λd1 ,...,dk λe1 ,...,ek
(5.6) ϕ(ui ) µ(si, j ) .
( i=1 di )( ki=1 ei )
Qk
W u ,...,u i=1
Q
1 k s ,...,s 1≤i, j≤k
1,2 k,k−1 d1 ,...,dk
i, j e1 ,...,ek
ui |di ,ei ∀i
si, j |di ,e j ∀i, j
We can restrict the si, j to be coprime to ui and u j , because terms with si, j not coprime to ui or
u j make no contribution to our sum. This is because λd1 ,...,dk = 0 unless (di , d j) = 1. Similarly
we can further restrict our sum so that si, j is coprime to si,a and sb, j for
P∗all a , j and b , i. We
denote the summation over s1,2 , . . . , sk,k−1 with these restrictions by .
We now introduce a change of variables to make the estimation of the sum more straight-
forward. We let
k
Y X λd1 ,...,dk
(5.7) yr1 ,...,rk = µ(ri )ϕ(ri ) Qk .
i=1 d1 ,...,dk i=1 d i
ri |di ∀i
Qk
This change is invertible. For d1 , . . . , dk with i=1 di square-free we find that
k X λe1 ,...,ek
X yr1 ,...,rk X Y
Qk = µ(ri ) Qk
r1 ,...,rk i=1 ϕ(ri ) r1 ,...,rk i=1 e1 ,...,ek i=1 ei
di |ri ∀i di |ri ∀i ri |ei ∀i
X λe ,...,e X Y k
1
λd ,...,d
(5.8) = Qk
k
µ(ri ) = Qk 1 k .
e1 ,...,ek i=1 ei r1 ,...,rk i=1 i=1 µi (di )di
di |ri ∀i
ri |ei ∀i
Thus any choice of yr1 ,...,rk supported on r1 , . . . , rk , with the product r = ki=1 ri square-free
Q
and satisfying r < R and (r, W) = 1, will P give a suitable choice of λd1 ,...,dk . We let ymax =
supr1 ,...,rk |yr1 ,...,rk |. Now, since d/ϕ(d) = e|d 1/ϕ(e) for square-free d, we find by taking r′ =
Qk
i=1 ri /di that
k k
Y X Y µ(ri )2
λmax ≤ sup ymax di
d1 ,...,dk i=1 r1 ,...,rk i=1
ϕ(ri )
Qk
i=1 di square-free d |r ∀i
Qk i i
ri <R
Qk i=1
i=1 ri square-free
k
Y di X µ(r′ )2 τk (r′ )
≤ ymax sup
d1 ,...,dk i=1
ϕ(di ) ϕ(r′ )
r ′ <R/ ki=1 di
Q
Qk
i=1 di square-free ′ Qk
(r , i=1 di )=1
X µ(d)2 X µ(r′ )2 τk (r′ )
≤ ymax sup
d1 ,...,dk Qk ϕ(d) ϕ(r′ )
r ′ <R/ ki=1 di
Q
d| i=1 di
′ Qk
(r , i=1 di )=1
X µ(u)2 τk (u)
(5.9) ≤ ymax ≪ ymax (log R)k .
u<R
ϕ(u)
10 JAMES MAYNARD
In the last line we have taken u = dr′ , and used the fact τk (dr′ ) ≥ τk (r′ ). Hence the error term
O(λ2max R2 (log N)2k ) is of size O(y2max R2 (log N)4k ).
Substituting our change of variables (5.7) into the main term (5.6), and using the above
estimate for the error term, we obtain
k k
N X Y X∗ Y Y µ(ai )µ(bi )
S1 = ϕ(ui ) µ(si, j ) ya ,...,a yb ,...,b
W u ,...,u i=1 s ,...,s 1≤i, j≤k i=1
ϕ(ai )ϕ(bi ) 1 k 1 k
1 k 1,2 k,k−1
i, j
(5.10) + O y2max R2 (log R)4k ,
Q Q
where a j = u j i, j s j,i and b j = u j i, j si, j . In these expressions we have used the fact that
we have restricted si, j to be coprime to the other Q terms in the expression for ai and b j . For
the same reason we may rewrite µ(a j ) as µ(u j ) i, j µ(si, j ), and similarly for ϕ(a j ), µ(b j ) and
ϕ(b j ). This gives us
k
N X Y µ(ui )2 X∗ Y µ(si, j )
2 2 4k
(5.11) S 1 = 2
ya1 ,...,ak
yb1 ,...,bk
+ O ymax R (log R) .
W u ,...,u i=1 ϕ(ui ) s ,...,s 1≤i, j≤k
ϕ(s i, j )
1 k 1,2 k,k−1
i, j
We see that there is no contribution from si, j with (si, j , W) , 1 because of the restricted
support of y. Thus we only need to consider si, j = 1 or si, j > D0 . The contribution when
si, j > D0 is
y2 N X µ(u)2 k X µ(si, j )2 X µ(s)2 k2 −k−1 y2max ϕ(W)k N(log R)k
(5.12) ≪ max 2 2
≪ k+1 D
.
W u<R
ϕ(u) s >D
ϕ(s i, j ) s≥1
ϕ(s) W 0
i, j 0
(u,W)=1
Thus we may restrict our attention to the case when si, j = 1 ∀i , j. This gives
N X y2u1 ,...,uk y2max ϕ(W)k N(log R)k
!
2 2 4k
(5.13) S1 = +O + ymax R (log R) .
W u ,...,u ki=1 ϕ(ui ) W k+1 D0
Q
1 k
θ−2δ
We recall that R = N 2
≤ N 1−2δ and W ≪ N δ , and so the first error term dominates. This
gives the result.
We now consider S 2 . We write S 2 = km=1 S 2(m) , where
P
X X 2
(5.14) S 2(m) = χP (n + hm ) λd1 ,...,dk .
N≤n<2N d1 ,...,dk
n≡v0 (mod W) di |n+hi ∀i
where g is the totally multiplicative function defined on primes by g(p) = p − 2. Let y(m)
max =
(m)
supr1 ,...,rk |yr1 ,...,rk |. Then for any fixed A > 0 we have
X (y(m) r1 ,...,rk )
2 (y(m) 2 k−2
N(log N)k−2 y2max N
N max ) ϕ(W)
S 2(m) = Qk +O + O .
ϕ(W) log N r ,...,r i=1 g(ri ) W k−1 D0 (log N)A
1 k
SMALL GAPS BETWEEN PRIMES 11
Proof. We first expand out the square and swap the order of summation to give
X X
(5.15) S 2(m) = λd1 ,...,dk λe1 ,...,ek χP (n + hm ).
d1 ,...,dk N≤n<2N
e1 ,...,ek n≡v0 (mod W)
[di ,ei ]|n+hi ∀i
AsQ with S 1 , the inner sum can be written as a sum over a single residue class modulo q =
W ki=1 [di , ei ], provided that W, [d1, e1 ], . . . , [dk , ek ] are pairwise coprime. The integer n + hm
will lie in a residue class coprime to the modulus if and only if dm = em = 1. In this case the
inner sum will contribute XN /ϕ(q) + O(E(N, q)), where
X 1 X
(5.16) E(N, q) = 1 + sup χP (n) − χP (n) ,
(a,q)=1 N≤n<2N
ϕ(q) N≤n<2N
n≡a (mod q)
X
(5.17) XN = χP (n).
N≤n<2N
If either one pair of W, [d1 , e1 ], . . . , [dk , ek ] share a common factor, or if either dm or em are not
1, then the contribution of the inner sum is zero. Thus we obtain
By Cauchy-Schwarz, the trivial bound E(N, q) ≪ N/ϕ(q), and our hypothesis that the primes
have level of distribution θ, this contributes for any fixed A > 0
X N 1/2 X 1/2 y2 N
(5.20) ≪ y2max (log R)2k µ(r)2 τ23k (r) µ(r)2 E(N, r) ≪ max A .
ϕ(r) (log N)
r<R2 W 2 r<R W
We now concentrate on the main sum. As in the treatment of S 1 in the P proof of Lemma 5.1,
we rewrite the conditions (di , e j ) = 1 by multiplying our expression by si, j |di ,e j µ(si, j ). Again
we may restrict si, j to be coprime to ui , u j , P
si,a and sb, j for all a , j and b , i. We denote the
summation subject to these restrictions by ∗ . We also split the ϕ([di , ei ]) terms by using the
equation (valid for square-free di , ei )
1 1 X
(5.21) = g(ui ),
ϕ([di , ei ]) ϕ(di )ϕ(ei ) u |d ,e
i i i
12 JAMES MAYNARD
where g is the totally multiplicative function defined on primes by g(p) = p − 2. This gives
us a main term of
k
XN X Y X∗ Y X λd1 ,...,dk λe1 ,...,ek
(5.22) g(ui ) µ(si, j ) Qk .
ϕ(W) u ,...,u i=1 s1,2 ,...,sk,k−1 1≤i, j≤k d1 ,...,dk i=1 ϕ(di )ϕ(ei )
1 k
i, j e1 ,...,ek
ui |di ,ei ∀i
si, j |di ,e j ∀i, j
dm =em =1
We have now separated the dependencies between the e and d variables, so again we make a
substitution. We let
Yk X λd1 ,...,dk
(m)
(5.23) yr1 ,...,rk = µ(ri )g(ri ) Qk .
i=1 d1 ,...,dk i=1 ϕ(di )
ri |di ∀i
dm =1
We note y(m)
r1 ,...,rk = 0 unless rm = 1. Substituting this into (5.22), we obtain a main term of
k
XN X Y µ(ui )2 X∗ Y µ(si, j ) (m)
(5.24) y y(m) ,
2 a1 ,...,ak b1 ,...,bk
ϕ(W) u ,...,u i=1 g(ui ) s ,...,s 1≤i, j≤k
g(s i, j )
1 k 1,2 k,k−1
i, j
Q Q
where a j = u j Q i, j s j,i and b j = u j i, j si, j for each 1 ≤ j ≤ k. As before, we have replaced
µ(a j ) with µ(u j ) i, j µ(s j,i ) (and similarly for g(a j ), µ(b j ) and g(b j )). This is valid since terms
with a j or b j not square-free make no contribution.
We see the contribution from si, j , 1 is of size
(y(m) 2
max ) N
X µ(u)2 k−1 X µ(s)2 k(k−1)−1 X µ(si, j )2
≪
ϕ(W) log N u<R g(u) s
g(s)2 s >D
g(si, j )2
i, j 0
(u,W)=1
(y(m) 2
max ) ϕ(W)
k−2
N(log R)k−1
(5.25) ≪ .
W k−1 D0 log N
Thus we find that
(m) (y(m)
XN X (yu1 ,...,uk )2 2
max ) ϕ(W)
k−2
N(log R)k−2 y2max N
(5.26) S 2(m) = +O +O .
ϕ(W) u ,...,u ki=1 g(ui ) D0 W k−1 (log N)A
Q
1 k
Finally, by the prime number theorem, XN = N/ log N + O(N/(log N)2 ). This error term
contributes
(y(m) 2
max ) N
X µ(u)2 k−1 (y(m) 2
max ) ϕ(W)
k−2
N(log R)k−3
(5.27) ≪ ≪ ,
ϕ(W)(log N)2 u<R g(u) W k−1
(u,W)=1
which can be absorbed into the first error term of (5.26). This completes the proof.
Remark. In our proof of Lemma 5.2 we only really require λd1 ,...,dk to be supported on d1 , . . . , dk
satisfying i, j di < R for all j instead of ki=1 di < R. For k ≥ 3, the numerical benefit of this
Q Q
extension is small and so we do not consider it further.
Remark. As our result relies on the Bombieri-Vinogradov theorem, the implied constant in
the error term is not effectively computable. However, if we restrict the λd1 ,...,dk to be supported
on di which are coprime to the largest prime factor of a possible exceptional modulus of a
SMALL GAPS BETWEEN PRIMES 13
primitive character then we can make this error term (and all others in this paper) effective
at the cost of a negligible error.
Proof. We assume throughout the proof that rm = 1. We first substitute our expression (5.8)
into the definition (5.23). This gives
k k
Y X Y µ(di )di X ya1 ,...,ak
(5.28) y(m)
r1 ,...,rk = µ(ri )g(ri ) .
ϕ(di ) a1 ,...,ak ki=1 ϕ(ai )
Q
i=1 d ,...,d i=1 1 k
ri |di ∀i di |ai ∀i
dm =1
We see that from the support of ya1 ,...,ak that we may restrict the summation over a j to (a j , W) =
1. Thus either a j = r j or a j > D0 r j . For j , m, the total contribution from a j , r j is
k
Y X µ(a j )2 X µ(am )2 Y X µ(ai )2
≪ ymax g(ri )ri
i=1 a >D r
ϕ(a j )2 a <R ϕ(am ) 1≤i≤k r |a ϕ(ai )2
j 0 j m i i
r j |a j (am ,W)=1 i, j,m
k
Y g(ri )ri ymax ϕ(W) log R ymax ϕ(W) log R
(5.31) ≪ ≪ .
i=1
ϕ(ri )2 WD0 WD0
Thus we find that the main contribution is when a j = r j for all j , m. We have
k
Y g(ri )ri X yr1 ,...,rm−1 ,am ,rm+1 ,...,rk ymax ϕ(W) log R
(5.32) y(m)
r1 ,...,rk = +O .
i=1
ϕ(ri )2 am
ϕ(am ) WD0
We note that g(p)p/ϕ(p)2 = 1 + O(p−2 ). Thus, since the contribution is zero unless ki=1 ri is
Q
coprime to W, we see that the product in the above expression may be replaced by 1 +O(D−10 ).
This gives the result.
14 JAMES MAYNARD
6. Smooth choice of y
We now choose suitable values for our y variables, and complete the proof of Proposition
4.1.
We first give some comments to motivate our choice of the y variables, which we believe
should be close to optimal. We wish to choose y so as to maximize the ratio of the main terms
of S 2 and S 1 . If we use Lagrangian multipliers to maximize this ratio (treating all error terms
as zero) we arrive at the condition that
k k
Y ϕ(ri ) X g(rm )
(6.1) λyr1 ,...,rk = y(m)
r1 ,...,rm−1 ,1,rm+1 ,...,rk
i=1
g(ri ) m=1
ϕ(rm )
for some fixed constant λ. The y terms are supported on integers free of small prime factors,
and for most integers r free of small prime factors we have g(r) ≈ ϕ(r) ≈ r, and so the above
condition reduces to
k
X
(6.2) λyr1 ,...,rk ≈ y(m)
r1 ,...,rm−1 ,1,rm+1 ,...,rk .
m=1
This condition looks smooth (it has no dependence on the prime factorization of the ri ), and
should be able to be satisfied if yr1 ,...,rk is a smooth function of the ri variables. Motivated by
the above, when the product r = ki=1 ri satisfies (r, W) = 1 and µ(r)2 = 1 we choose
Q
log r1 log rk
(6.3) yr1 ,...,rk = F ,..., ,
log R log R
for some smooth function F : Rk → R, supported on Rk = {(x1 , . . . , xk ) ∈ [0, 1]k : ki=1 xi ≤
P
1}. As previously required, we set yr1 ,...,rk = 0 if the product r is either not coprime to W or is
not square-free. With this choice of y, we can obtain suitable asymptotic estimates for S 1 and
S 2.
We will use the following Lemma to estimate our sums S 1 and S 2 with this choice of y.
Lemma 6.1. Let A1 , A2 , L > 0. Let γ be a multiplicative function satisfying
γ(p)
0≤ ≤ 1 − A1 ,
p
and
X γ(p) log p
−L ≤ − log z/w ≤ A2
w≤p≤z
p
for any 2 ≤ w ≤ z. Let g be the totally multiplicative function defined on primes by g(p) =
γ(p)/(p−γ(p)). Finally, let G : [0, 1] → R be smooth, and let Gmax = supt∈[0,1] (|G(t)|+|G′ (t)|).
Then Z 1
X log d
2
µ(d) g(d)G = S log z G(x)dx + OA1 ,A2 (SLGmax ),
d<z
log z 0
where
Y γ(p) −1 1
S= 1− 1− .
p
p p
Here the constant implied by the ‘O’ term is independent of G and L.
Proof. This is [3, Lemma 4], with κ = 1 and slight changes to the notation.
SMALL GAPS BETWEEN PRIMES 15
We now finish our estimations of S 1 and S 2(m) , completing the proof of Proposition 4.1. We
first estimate S 1 .
Lemma 6.2. Let yr1 ,...,rk be given in terms of a smooth function F by (6.3), with F supported
on Rk = {(x1 , . . . , xk ) ∈ [0, 1]k : ki=1 xi ≤ 1}. Let
P
k
X ∂F
F max = sup |F(t1 , . . . , tk )| + | (t1 , . . . , tk )|.
(t1 ,...,tk )∈[0,1]k i=1
∂ti
Then we have
2
ϕ(W)k N(log R)k F max ϕ(W)k N(log R)k
S1 = Ik (F) + O ,
W k+1 W k+1 D0
where
Z 1 Z 1
Ik (F) = ··· F(t1 , . . . , tk )2 dt1 . . . dtk .
0 0
Proof. We substitute our choice (6.3) of y into our expression of S 1 in terms of yr1 ,...,rk given
by Lemma 5.1. This gives
k F 2 ϕ(W)k N(log R)k
N X Y µ(ui )2 log u1 log uk 2
(6.4) S1 = F ,..., + O max .
W u1 ,...,uk i=1
ϕ(ui ) log R log R W k+1 D0
(ui ,u j )=1∀i, j
(ui ,W)=1∀i
We note that two integers a and b with (a, W) = (b, W) = 1 but (a, b) , 1 must have a common
prime factor which is greater than D0 . Thus we can drop the requirement that (ui , u j ) = 1, at
the cost of an error of size
2 k
F max N X X Y µ(ui )2
≪
W p>D u1 ,...,uk <R i=1
ϕ(ui )
0
p|ui ,u j
(ui ,W)=1∀i
2 2
F max N X 1 X µ(u)2 k F max ϕ(W)k N(log R)k
(6.5) ≪ ≪ .
W p>D (p − 1)2 u<R ϕ(u) W k+1 D0
0
(u,W)=1
We can now estimate this sum by k applications of Lemma 6.1, dealing with the sum over
each ui in turn. For each application we take
1, p ∤ W,
(6.7) γ(p) =
0, otherwise,
X log p
(6.8) L≪1+ ≪ log D0 ,
p|W
p
16 JAMES MAYNARD
Proof. The estimation of S 2(m) is similar to the estimation of S 1 . We first estimate y(m) r1 ,...,rk . We
(m) Qk
recall that yr1 ,...,rk = 0 unless rm = 1 and r = i=1 ri satisfies (r, W) = 1 and µ(r)2 = 1, in
which case y(m) r1 ,...,rk is given in terms of yr1 ,...,rk by Lemma 5.3. We first concentrate on this case
(m)
when yr1 ,...,rk , 0. We substitute our choice (6.3) of y into our expression from Lemma 5.3.
This gives
X µ(u)2 log r1 log rm−1 log u log rm+1 log rk
y(m)
r1 ,...,rk = F ,..., , , ,...,
Qk ϕ(u) log R log R log R log R log R
(u,W i=1 ri )=1
F max ϕ(W) log R
(6.10) +O .
WD0
We can see from this that y(m)
max ≪ ϕ(W)F max (log R)/W. We now estimate the sum over u in
(6.10). We apply Lemma 6.1 with
Qk
1, p ∤ W i=1 ri ,
(6.11) γ(p) =
0, otherwise,
X log p X log p X log log R
(6.12) L≪1+ ≪ + ≪ log log N,
Qk p p<log R
p Qk log R
p|W i=1 ri p|W i=1 ri
p>log R
We obtain
(6.16)
k 2
ϕ(W)N(log R)2 X Y µ(ri )2 ϕ(ri )2 (m) 2 F max ϕ(W)k N(log R)k
S 2(m) = (F r1 ,...,rk ) + O .
W 2 log N r1 ,...,rk i=1
g(ri )ri2 W k+1 D0
(ri ,W)=1∀i
(ri ,r j )=1∀i, j
rm =1
We remove the condition that (ri , r j ) = 1 in the same way we did when considering S 1 . Instead
of (6.5), this introduces an error which is of size
(6.17)
ϕ(W)N(log R)2 F max
2 X
ϕ(p)4 X µ(r)2 ϕ(r)2 k−1 2
F max ϕ(W)k N(log N)k
≪ ≪ .
W 2 log N p>D
g(p) 2 p4
r<R
g(r)r 2 W k+1 D
0
0
(r,W)=1
We estimate this by applying Lemma 6.1 to each summation variable in turn. In each case we
take
p2 −3p+1
1 − p3 −p2 −2p+1 , p ∤ W
(6.19) γ(p) =
0,
otherwise,
X log p
(6.20) L≪1+ ≪ log D0 ,
p|W
p
and A1 , A2 suitable fixed constants. This gives
2
ϕ(W)k N(log R)k+1 (m) F max ϕ(W)k N(log N)k
(6.21) S 2(m) = Jk + O ,
W k+1 log N W k+1 D0
where
Z 1 Z 1 Z 1 2
(6.22) Jk(m) = ··· F(t1 , . . . , tk )dtm dt1 . . . dtm−1 dtm+1 . . . dtk ,
0 0 0
as required.
Remark. If F(t1 , . . . , tk ) = G( i=1 ti ) for some function G, then Ik (F) and Jk(m) (F) simplify to
Pk
R1 R1 R1
Ik (F) = 0 G(t)2 tk−1 dt/(k − 1)! and Jk(m) (F) = 0 ( t G(v)dv)2 tk−2 dt/(k − 2)! for each m, which
is equivalent to the results obtained using the original GPY method using weights given by
(2.3).
Remark. Tao gives an alternative approach to arrive at his equivalent of Proposition 4.1. His
approach is to define λd1 ,...,dk in terms of a suitable smooth function f (t1, . . . , tk ) as in (2.5).
He then estimates the corresponding sums directly using Fourier integrals. This is somewhat
18 JAMES MAYNARD
similar to the original paper of Goldston, Pintz and Yıldırım [5]. Our function F corresponds
to f (t1 , . . . , tk ) differentiated with respect to each coordinate.
whenever (u1 , . . . , uk ) ∈ Rk , and zero otherwise. We expect that if F maximizes the ratio
Pk (m)
m=1 Jk (F)/Ik (F), then F is an eigenfunction for Lk , and the corresponding eigenvalue is
the value of ratio at F. Unfortunately the author has not been able to solve the eigenvalue
equation for Lk when k > 2.
We obtain a lower bound for Mk by constructing a function F = F k which makes the ratio
Pk (m)
m=1 Jk (F)/Ik (F) large provided k is large. We choose F to be of the form
Q
k Pk
i=1 g(kti ), if i=1 ti ≤ 1,
(7.3) F(t1 , . . . , tk ) =
0,
otherwise,
for some smooth function g : [0, ∞] → R, supported on [0, T ]. We see that with this choice F
is symmetric, and so Jk(m) (F) is independent of m. Thus we only need to consider Jk = Jk(1) (F).
Similarly we write Ik = Ik (F). R∞ R∞
The key observation is that if the center of mass 0 ug(u)2 du/ 0 g(u)2 du of g2 is strictly
less than 1, then for large k we expect that the constraints ki=1 ti ≤ 1 to be able to be
P
dropped at the cost of only a small error. This is because R ∞ (byR ∞concentration of measure)
Qk
the main contribution to the unrestricted integrals Ik′ = 0 · · · 0 i=1 g(kt i )2
dt 1 . . . dtk and
R ∞ R ∞ R ∞ Qk Pk
′ 2
Jk = 0 · · · 0 ( 0 i=1 g(kti )dt1 ) dt2 . . . dtk should come primarily from when ti is close
Pk i=1
to the center of mass. Therefore we would expect the contribution when i=1 ti > 1 to be
small if the center of mass is less than 1, and so Ik and Jk are well approximated by Ik′ and Jk′
in this case. R
To ease notation we let γ = u≥0 g(u)2 du, and restrict our attention to g such that γ > 0. We
have
( Z ∞ k
(7.4) Ik = 2
F(t1 , . . . , tk ) dt1 . . . dtk ≤ g(kt)2dt = k−k γk .
0
Rk
We now consider Jk . Since P squares are non-negative, we obtain a lower bound for Jk if we
restrict the outer integral to ki=2 ti < 1 − T/k. This has the advantage that, by the support of
SMALL GAPS BETWEEN PRIMES 19
Z ∞ 2 ( Y
k
−k−1
(7.7) =k g(u)du g(ui )2 du2 . . . duk .
0 i=2
u ,...,uk ≥0
Pk2
i=2 ui >k−T
First we wish to show the error integral E k is small. We do this by comparison with a second
moment. We expect the bound (7.13) for E k to be small if the center of mass of g2 is strictly
less than (k − T )/(k − 1). Therefore we introduce the restriction on g that
R∞
0
ug(u)2 du T
(7.8) µ= R∞ <1− .
g(u)2 du k
0
Pk Pk
To simplify notation, we put η = (k − T )/(k − 1) − µ > 0. If i=2 ui > k − T then i=2 ui >
(k − 1)(µ + η), and so we have
1 X k 2
−2
(7.9) 1≤η ui − µ .
k − 1 i=2
Z ∞ 2 Z ∞ Z ∞ Pk k
−2 −k−1 ui
i=2
2 Y
(7.10) Ek ≤ η k g(u)du ··· −µ g(ui )2 du2 . . . duk .
0 0 0 k−1 i=2
We expand out the inner square. All the terms which are not of the form u2j we can calculate
explicitly as an expression in µ and γ. We find
k
2 2≤i< j≤k ui u j 2µ ki=2 ui
Z ∞ Z ∞ P
−µ2 γk−1
P Y
2 2
(7.11) ··· − + µ g(u i ) du 2 . . . du k = .
0 0 (k − 1)2 k−1 i=2
k − 1
20 JAMES MAYNARD
For the u2j terms we see that u2j g(u j )2 ≤ T u j g(u j )2 from the support of g. Thus
Z ∞ Z ∞ Y k Z ∞
2 2
(7.12) ··· uj g(ui ) du2 . . . duk ≤ T γ k−2
u j g(u j )2 du j = µT γk−1 .
0 0 i=2 0
This gives
∞ ∞
Z µ2 γk−1 η−2 µT k−k−1 γk−1
2 µT γk−1 Z 2
−2 −k−1
(7.13) Ek ≤ η k g(u)du − ≤ g(u)du .
0 k−1 k−1 k−1 0
2 2
Since (k − 1)η ≥ k(1 − T/k − µ) and µ ≤ 1, we find that putting together (7.4), (7.5), (7.6)
and (7.13), we obtain
R∞
2
kJk ( 0 g(u)du) T
(7.14) ≥ R∞ 1− .
Ik g(u)2 du k(1 − T/k − µ)2
0
RT
To maximize our lower bound (7.14), we wish to maximize 0 g(u)du subject to the con-
RT RT
straints that 0 g(u)2 du = γ and 0 ug(u)2 du = µγ. Thus we wish to maximize the expression
Z T Z T Z T
2
(7.15) g(u)du − α g(u) du − γ − β ug(u)2 du − µγ
0 0 0
with respect to α, β and the function g. By the Euler-Lagrange equation, this occurs when
∂
∂g
(g(t) − αg(t)2 − βtg(t)2) = 0 for all t ∈ [0, T ]. Thus we see that
1
(7.16) g(t) = for 0 ≤ t ≤ T .
2α + 2βt
Since the ratio we wish to maximize is unaffected if we multiply g by a positive constant,
we restrict our attention to functions g is of the form 1/(1 + At) for t ∈ [0, T ] and for some
constant A > 0. With this choice of g we find that
Z T Z T
log(1 + AT ) 1 1
(7.17) g(u)du = , g(u)2 du = 1 − ,
0 A 0 A 1 + AT
Z T
1 1
(7.18) ug(u)2 du = 2 log(1 + AT ) − 1 + .
0 A 1 + AT
We choose T such that 1 + AT = eA (which is close to optimal). With this choice we find
that µ = 1/(1 − e−A ) − A−1 and T ≤ eA /A. Thus 1 − T/k − µ ≥ A−1 (1 − A/(eA − 1) − eA /k).
Substituting (7.17) into (7.14), and then using these expressions, we find that
kJk A T AeA
(7.19) ≥ 1 − ≥ A 1 − ,
Ik 1 − e−A k(1 − T/k − µ)2 k(1 − A/(eA − 1) − eA /k)2
provided the right hand side is positive. Finally, we choose A = log k − 2 log log k > 0. For k
sufficiently large we have
T (log k)3 1
(7.20) 1 − − µ ≥ A−1 1 − − > 0,
k k (log k)2
and so µ < 1 − T/k, as required by our constraint (7.8). This choice of A gives
kJk log k
(7.21) Mk ≥ ≥ (log k − 2 log log k) 1 − ≥ log k − 2 log log k − 2
Ik (log k)2 + O(1)
when k is sufficiently large.
SMALL GAPS BETWEEN PRIMES 21
for polynomials P. By the symmetry of km=1 Jk(m) (F) and Ik (F), we restrict our attention
P
to polynomials which are symmetric functions of t1 , . . . , tk . (If F satisfies Lk F = λF then
F σ = F(σ(t1 ), . . . , σ(tk )) also satisfies this for every permutation σ of t1 , . . . , tk . Thus the
symmetric function which is the average of F σ over all such permutations would satisfy this
eigenfunction equation, and so we expect there to be an optimal function which is symmetric.)
Any such polynomial can be written as a polynomial expression in the power sum polynomials
P j = ki=1 tij .
P
Lemma 8.1. Let P j = ki=1 tij denote the jth symmetric power sum polynomial. Then we have
P
(
a!
(1 − P1 )a Pbj dt1 . . . dtk = Gb, j (k),
(k + jb + a)!
Rk
where
b ! r
X x X Y ( jbi )!
Gb, j (x) = b!
r=1
r b ,...,b ≥1 i=1 bi !
P1 r r
i=1 bi =b
Z 1−Pki=2 ti Xk k
a Y Y k X k a+a1 +1 Z 1
ai ai
1− ti ti dt1 = ti 1 − ti (1 − v)a va1 dv
0 i=1 i=1 i=2 i=2 0
k k
a!a1 ! Y X a+a1 +1
(8.3) = tiai 1 − ti .
(a + a1 + 1)! i=2 i=2
R1
Here we used the beta function identity 0
ta(1 − t)b dt = a!b!/(a + b + 1)! in the last line. We
now see (8.2) follows by induction.
By the binomial theorem,
k
X b! Y
(8.4) Pbj = Qk tijbi .
b ,...,bk i=1 bi ! i=1
Pk1
i=1 bi =b
22 JAMES MAYNARD
Lemma 8.2. Let F be given in terms of a polynomial P by (8.1). Let Pk P be given in terms
Pk 2of a
polynomial expression in the symmetric power polynomials P1 = i=1 ti and P2 = i=1 ti by
P = di=1 ai (1 − P1 )bi Pc2i for constants ai ∈ R and non-negative integers bi , ci . Then for each
P
1 ≤ m ≤ k we have
X (bi + b j )!Gci +c j ,2 (k)
Ik (F) = ai a j ,
1≤i, j≤d
(k + bi + b j + 2ci + 2c j )!
ci Xcj
ci c j γbi ,b j ,ci ,c j ,c′1 ,c′2 Gc′1 +c′2 ,2 (k − 1)
X X ! !
Jk(m) (F) = ai a j ,
1≤i, j≤d c′1 =0 c′2 =0
c′1 c′2 (k + bi + b j + 2ci + 2c j + 1)!
where
bi !b j !(2ci − 2c′1 )!(2c j − 2c′2 )!(bi + b j + 2ci + 2c j − 2c′1 − 2c′2 + 2)!
γ bi ,b j ,ci ,c j ,c′1 ,c′2 = ,
(bi + 2ci − 2c′1 + 1)!(b j + 2c j − 2c′2 + 1)!
We now consider Jk(m) (F). Since F is symmetric in t1 , . . . , tk we see that Jk(m) (F) is independent
of m, and so it suffices to only consider Jk(1) (F). We have
Z 1−Pki=2 ti c ! k Z Pk k
b c
X c X 2 c′ 1− i=2 ti X b
2c−2c′
(1 − P1 ) P2 dt1 = ′
t i 1 − t i t1 dt1
0 ′
c =0
c i=2 0 i=1
c ! Z 1
X c ′ c′ ′ ′
= ′
(P2 ) (1 − P′1 )b+2c−2c +1 (1 − u)b u2c−2c du
c′ =0
c 0
c
b!(2c − 2c′ )!
!
X c ′ c′ ′ b+2c−2c′ +1
(8.8) = (P ) (1 − P1 ) ,
c′ =0
c′ 2 (b + 2c − 2c′ + 1)!
where P′1 = ki=2 ti and P′2 = ki=2 ti2 . Thus
P P
Z 1 2 X d Z 1−Pkj=2 t j 2
Fdt1 = ai (1 − P1 )bi Pc2i dt1
0 i=1 0
ci X cj ! !
X X ci c j ′ c′ +c′ ′ ′
2 (1 − P′ )bi +b j +2ci +2c j −2c1 −2c2 +2
= ai a j ′ (P2 )
1
′ 1
1≤i, j≤d c′ =0 c′ =0
c1 c2
1 2
9. Acknowledgements
The author would like to thank Andrew Granville, Roger Heath-Brown, Dimitris Kouk-
oulopoulos and Terence Tao for many useful conversations and suggestions.
The work leading to this paper was started whilst the author was a D.Phil student at Oxford
and funded by the EPSRC (Doctoral Training Grant EP/P505216/1), and was finished when
the author was a CRM-ISM postdoctoral fellow at the Université de Montréal.
References
[1] P. D. T. A. Elliott and H. Halberstam. A conjecture in prime number theory. In Symposia Mathematica, Vol.
IV (INDAM, Rome, 1968/69), pages 59–72. Academic Press, London, 1970.
[2] J. Friedlander and A. Granville. Limitations to the equi-distribution of primes. I. Ann. of Math. (2),
129(2):363–382, 1989.
[3] D. A. Goldston, S. W. Graham, J. Pintz, and C. Y. Yıldırım. Small gaps between products of two primes.
Proc. Lond. Math. Soc. (3), 98(3):741–774, 2009.
[4] D. A. Goldston, J. Pintz, and C. Y. Yıldırım. Primes in tuples. III. On the difference pn+ν − pn . Funct. Approx.
Comment. Math., 35:79–89, 2006.
[5] D. A. Goldston, J. Pintz, and C. Y. Yıldırım. Primes in tuples. I. Ann. of Math. (2), 170(2):819–862, 2009.
3
An ancillary Mathematica R file detailing these computations is available alongside this paper at
www.arxiv.org.
SMALL GAPS BETWEEN PRIMES 25
[6] D. A. Goldston and C. Y. Yıldırım. Higher correlations of divisor sums related to primes. III. Small gaps
between primes. Proc. Lond. Math. Soc. (3), 95(3):653–686, 2007.
[7] D. H. J. Polymath. A new bound for gaps between primes. Preprint.
[8] A. Selberg. Collected papers. Vol. II. Springer-Verlag, Berlin, 1991. With a foreword by K. Chandrasekha-
ran.
[9] Y. Zhang. Bounded gaps between primes. Ann. of Math.(2), to appear.