Regularization from L1 by convolution
1    Introduction
The point of this note is to provide background for one of my favorite problems.
I offer a prize of 1000 USD for the solution of this problem. I do not think that
the problem by itself has any importance. It is just pretty.
    It is well known that “convolution spreads regularity”. For example, if one
want to approximate a continuous function on R by a C ∞ function, one takes
convolution with a C ∞ function with (small) compact support. However, some
regularization is possible even when one takes convolution with a singular mea-
sure, and even when convolution is applied to L1 functions.
    For simplicity we consider only convolution on the group G = {−1, 1}N
provided with its Haar measure λ.
    Given a positive, finite measure µ on G we consider the operator Tµ on
L1 = L1 (T, dx) given by
                                      Z
                          Tµ (f)(x) = f(x + y)dµ(y) .
2    Simple facts
It is quite a requirement that Tµ has some regularization properties on L1 .
Proposition 1. Assume that for some Orlicz function ϕ such that limu→∞ ϕ(u) =
∞, for all f in L1 we have
                              kTµ (f)kϕ ≤ Ckfk1 .                              (1)
Then for some constant A we have µ ≤ Aλ.
    Suppose for contradiction that this is not the case, so that there is compact
set K with µ(K) > 0 and λ(K) = 0. Consider an open set U ⊂ G and a function
f ∈ L1 , such that x 6∈ U ⇒ f = 0, with f ≥ 0 and kfk1 = 1. Let us denote by
µK the restriction of µ to K. Then, using Fubini theorem in the first equality,
                        Z               Z
               µ(K) =      TµK (f)dλ = f(x + y)dλ(x)dµK (y) .
                        G
                                       1
Now, if f(x + y) 6= 0 and y ∈ K we have x ∈ U − K. Therefore we can restrict
the last integral to x ∈ U − K. That is, we have shown that
                          Z                   Z
                  µ(K) ≤       TµK (f)dλ(x) ≤       Tµ (f)dλ ,           (2)
                          U −K                       U −K
                                                                  R
   Meanwhile, consider a function g ≥ 0 with kgkϕ ≤ C, so that        ϕ(g/C)dλ ≤
1. For each measurable set V from Jensen’s inequality we have
                             Z
                           ϕ( gdλ/Cλ(V ) ≤ 1 .
                                  V
Since we assume that limu→∞ ϕ(u)du
                               R   = ∞, this implies that for some constant
C 0 depending only on C we have V fdλ ≤ C 0 λ(V ).. Using this for V = U − K
and g = Tµ (f) gives
                           µ(K) ≤ C 0 λ(U − K) .
Since U is arbitary, this gives µ(K) ≤ C 0 λ(K), and since K is arbitrary this
proves the result.                                                             
   The moral is that (1) is a far too stringent requirement, so we shall consider
weaker conditions. We define the function
                           n                                    o
              ψµ (u) = sup uλ({Tµ (f) ≥ u}) ; f ≥ 0, kfk1 = 1 .
Since
                             kTµ (f)k1 = µ(G)kfk1 ,
from Markov inequality we see that
                                  ψµ (u) ≤ µ(G) .
   Here is another simple fact.
Proposition 2 If µ is absolutely continuous with respect to λ then
                                  lim ψµ (u) = 0 .
                                  u→∞
   ThisPis also almost obvious. When µ = µ1 + µ2 , using that λ({f1 + f2 ≥
u}) ≤ j=1,2 λ({fj ≥ u/2}) we obtain that ψµ (u) ≤ 2ψµ1 (u/2) + 2ψµ2 (u/2).
Moreover the result is trivial if µ ≤ Cλ because then ψµ (u) = 0 for u ≥ C. 
    The interesting phenomenon is that as we shall see it can happen that µ is
singular with respect to λ but that limu→∞ ψµ (0) = 0. As the following simple
fact shows, this does not happen when µ has a finite support.
Theorem 3 If µ has a finite support then lim infu→∞ ψµ ≥ µ(G).
                                        2
    We may assume that µ is a probability. The support of µ is finite, so it
generates a finite subgroup H of G. Consider then a subgroup H 0 of G, so that
H + H 0 is a subgroup of G, which is invariant under translations by elements
of the support of µ. Thus if f is the indicator of H 0 + H it is invariant under
translations by elements of the support of µ, and thus Tµ (f) = f. Since the
measure of H + H 0 can be as small as we wish, the result should be obvious. 
   Here is another simple fact showing that when µ is singular the function ψµ
cannot decrease too fast.
Porposition 4 If µ is singular then
                            Z ∞
                                 ψµ (u)du = ∞ .                             (3)
                                   1
    For a function g ≥ 0 and a subset V of G we have, for any number A ≥ 0,
       Z          Z ∞                              Z ∞
           gdλ =      ¶({g ≥ u} ∩ V )du ≤ A¶(V ) +      ¶({g ≥ u})du .
         V         0                                             A
    Consequently, if f ≥ 0 and kfk1 = 1,
                     Z                      Z            ∞
                        Tµ (f)dλ ≤ A¶(V ) +                  ψµ (u)du .
                       V                                A
Using (2) and taking V = T − U , we get
                                                    Z   ∞
                       µ(K) ≤ Aλ(U − K) +                   ψµ (u)du .
                                                     A
Since we assume that µ is singular, we can find a > 0 such that we can find a
with µ(K) ≥Ra and λ(U − K) as small as we wish. We then see that for each A
             ∞
we have a ≤ A ψµ (u)du.                                                    
3     The problem
Given 0 ≤ a ≤ 1 we consider the “biased coin” probabilty µa on G. It is the
product measure on G that on each factor gives weight (1 + a)/2 to the point 1
(and weight (1 − a)/2 to the point -1) so that
                                  1 + a          1−a     ⊗N
                           µa =            δ1 +       δ−1     .
                                       2           2
    Here is a simple fact
Proposition 5 Given
                √    0 < a ≤ 1 there exists C(a) > 0 such that for u ≥ 2 we
have ψµa ≥ C(a)/ log u.
                                             3
   We simply look at the density of µa when we replace G by {−1, 1}n. On a
sequence with k terms equal to 1 this density g is (1 + a)k (1 − a)n−k , so that
                                                       
                                                       n
                   λ(g ≥ (1 + a)k (1 − a)n−k ) ≥ 2−n       .
                                                       k
We then choose k as close as possible to (1 + a)n/2. Letting c2 = (1 + a)1+a (1 −
a)1−a then (1 +√a)k (1 − a)n−k is about cn while, using Stirling’s formula, 2−n nk
is about 1/(cn n).                                                               
Conjecture 6 Given a > 0 there exist C(a) > 0 such that for u ≥ 2 we have
                                         C(a)
                               ψµa (u) ≤ √      .
                                          log u
    I do not know if it is true that limu→∞ ψµa (u) = 0. However the following
is proved in my paper “A conjecture on convolution operators, and operators
from L1 to a Banach lattice”, Israel J. of Math. 68, 1989, 82-88. In view of (3)
this is about as fast a decrease as can be expected.
                                               R1
Theorem 6 The probability measure µ =           0
                                                    µexp(−t)dt satisfies (for u ≥ 10)
                                         C log log u
                              ψµ (u) ≤               .
                                            log u
    Another noteworthy result is that in 2015 Ronen Eldan and James Lee have
solved the “Gaussian limiting case” of Conjecture 6 (with a slightly weaker
estimate).