0% found this document useful (0 votes)
15 views105 pages

Metric Norm

This document contains miscellaneous problems and results related to metrics, norms, inner products, and operator theory, which were not included in the main text. It includes definitions of various metrics on higher-dimensional Euclidean spaces, exercises to prove properties of these metrics, and discussions on completeness in metric spaces. Additionally, it provides extra problems for further exploration of concepts such as neighborhoods and closed sets in metric spaces.

Uploaded by

katerine.emv
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views105 pages

Metric Norm

This document contains miscellaneous problems and results related to metrics, norms, inner products, and operator theory, which were not included in the main text. It includes definitions of various metrics on higher-dimensional Euclidean spaces, exercises to prove properties of these metrics, and discussions on completeness in metric spaces. Additionally, it provides extra problems for further exploration of concepts such as neighborhoods and closed sets in metric spaces.

Uploaded by

katerine.emv
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 105

Christopher Heil

Metrics, Norms, Inner Products


and Operator Theory

Miscellaneous Extra Material and


Problems

January 25, 2018

c 2018 by Christopher Heil


Miscellaneous Extra Material and
Problems

We collect here miscellaneous problems and results that didn’t make it into
the text, but have some relevance and may be useful. Several notes and
caveats are in order.

• This is a somewhat random collection of results and problems that I hap-


pen to have in typed form that have some relation to the main text. This
extra material has not been organized to be in “book form,” nor has it
been thoughtfully developed to supplement the main text. It is simply a
collection of notes and problems that happen to be related to the text.
• There may be some duplication (sometimes considerable duplication) of
facts stated in the main text.
• Some of the ordering of the material may be off, e.g., Chapter 2 extra prob-
lems might contain material about normed spaces, which are not defined
until Chapter 3, and so forth.
• This extra material has not been proofread as well as the material in the
main text, and hence the probability of errors is higher.
• The numbering of the extra material begins with 50 in each section, so
any reference to a theorem, lemma, definition, or problem whose number
is 50 or greater must be from the extra material.

1
2 Extra Material and Problems c 2018 Christopher Heil

CHAPTER 2: METRIC SPACES

2.1 Metrics

On the higher dimensional Euclidean spaces Fd , the metric that we usually


think of first is the Euclidean metric (which in R, R2 , or R3 is simply the
ordinary physical distance between points). Here is the precise definition.
Definition 2.1.50 (The Euclidean Metric). The Euclidean metric on Fd
is defined by
1/2
d2 (x, y) = |x1 − y1 |2 + · · · + |xd − yd |2 , (2.14)

where x = (x1 , . . . , xd ) and y = (y1 , . . . , yd ) are vectors in Fd . ♦


Of course, so far we have only called the Euclidean metric a metric—we
have not yet proved that it actually is a metric on Fd ! It is easy to see that
d2 satisfies the nonnegativity, uniqueness, and symmetry requirements, and
the following exercise outlines one method for showing that d2 also satisfies
the Triangle Inequality. This exercise uses the dot product of vectors in Fd ,
which is defined by

x · y = x1 y1 + · · · + xd yd , x, y ∈ Fd . (2.15)

Note that if F = R, then the complex conjugate that appears in equation


(2.15) is superfluous. That is, for vectors in Rd we can write the dot product
simply as
x · y = x1 y1 + · · · + xd yd , x, y ∈ Rd . (2.16)
The following exercise is slightly easier to state for real scalars, so we outline
the argument for F = R first, then ask for an extension to F = C.
Exercise 2.1.51. Given vectors x, y, z ∈ Rd , prove the following statements.
(a) d2 (x, z)2 = d2 (x, y)2 + 2 (x − y) · (y − z) + d2 (y, z)2 .
1/2 1/2
(b) |x · y| ≤ |x1 |2 + · · · + |xd |2 |x1 |2 + · · · + |xd |2 .
(c) (x − y) · (y − z) ≤ d2 (x − y) d2 (y − z).
(d) d2 (x − z) ≤ d2 (x − y) + d2 (y − z).
Use this to prove that d2 is a metric on Rd . Then determine how to modify
the argument to prove that d2 is also a metric on Cd . ♦
The Euclidean metric is the metric on Fd that we encounter most often,
and therefore we sometimes refer to it as the “standard” or “default” metric
for Fd . However, it is not the only metric on Fd . The following exercise, which
is much easier to solve than Exercise 2.1.51, defines two additional metrics
on Fd .
2.1 Metrics 3

Exercise 2.1.52. Given x = (x1 , . . . , xd ) and y = (y1 , . . . , yd ) in Fd , set

d1 (x, y) = |x1 − y1 | + · · · + |xd − yd |

and 
d∞ (x, y) = max |x1 − y1 |, . . . , |xd − yd | .
Prove that d1 and d∞ are each metrics on Fd . ♦

More generally, we will see in Section 3.2 that if p is any number in the
range 1 ≤ p < ∞, then
1/p
dp (x, y) = |x1 − y1 |p + · · · + |xd − yd |p (2.17)

defines a metric on Fd . The usual Euclidean metric corresponds to the choice


p = 2, and the cases p = 1 and p = ∞ give the metrics defined in Exercise
2.1.52. If the dimension is d = 1 then d1 = d2 = d∞ , but when d ≥ 2 these
three metrics do not coincide (even so, we will see in Section 3.6 that these
metrics are equivalent in the sense that they all generate the same topologies
on Fd ).
What happens if we take 0 < p < 1 in equation (2.17)? In this case we
do not obtain a metric on Fd (except for the trivial case d = 1). To see why,
consider p = 21 , which corresponds to
2
d1/2 (x, y) = |x1 − y1 |1/2 + · · · + |xd − yd |1/2 . (2.18)

This function d1/2 does not satisfy the Triangle Inequality, because the choice
x = (1, 0, 0, . . . , 0), y = (1, 1, 0, . . . , 0), and z = (0, 1, 0, . . . , 0) yields

d1/2 (x, z) = 4 > 2 = d1/2 (x, y) + d1/2 (y, z).

Consequently, equation (2.18) does not define a metric. On the other hand,
the following exercise shows how to create a related but slightly different
function d1/2 that is a metric on Fd .

Exercise 2.1.53. Given x = (x1 , . . . , xd ) and y = (y1 , . . . , yd ) in Fd , set

d1/2 (x, y) = |x1 − y1 |1/2 + · · · + |xd − yd |1/2 .

Prove the following statements.


(a) (a + b)1/2 ≤ a1/2 + b1/2 for all a, b ≥ 0.
(b) If x, y, z ∈ Fd , then d1/2 (x, z) ≤ d1/2 (x, y) + d1/2 (y, z).
(c) d1/2 is a metric on Fd . ♦

In summary, we have constructed four metrics on Fd . These are the fa-


miliar Euclidean metric, which we have denoted by d2 , and the less-familiar
metrics d1 , d∞ , and d1/2 . We will see some additional metrics on Fd in Section
4 Extra Material and Problems c 2018 Christopher Heil

3.2. Problem 2.1.13 shows that is possible to define a metric on any set X.
However, in practice we usually want to do more than just define an arbitrary
metric on X. Typically, our set X has some kind of interesting properties,
and we seek a metric that somehow takes those properties into account. We
will see many examples of such metrics later. In particular, we will define
metrics on some spaces whose elements are infinite sequences in Section 3.2,
and metrics on some spaces whose elements are functions in Section 3.5.

Extra Problems

2.1.54. (a) Can you define a space of sequences ℓ2 and a corresponding metric
d2 that is analogous to the Euclidean metric on Fd ?
(b) Can you define a metric on C[0, 1] that is analogous in some way to
the Euclidean metric on Fd ?

2.1.55. Determine which of the following is a metric on R.


(a) d(x, y) = |x − 2y|.
|x − y|
(b) d(x, y) = .
1 + |x − y|

2.2 Convergence and Completeness

Definition 2.2.50 (Complete Sets and Complete Metric Spaces). Let


X be a metric space.
(a) If S is a subset of X and every Cauchy sequence in X converges to an
element of S, then we say that S is a complete subset of X.
(b) If X itself has the property that every Cauchy sequence in X converges
to an element of X, then we say that X is a complete metric space, or
simply that X is complete. ♦

Extra Problems

2.2.51. Let {xn }n∈N be a sequence of vectors in Fd . For each n, write



xn = xn (1), . . . , xn (d) .
d
That is, xn (k) denotes the  kth component of xn . Let y be a vector in F , and
write y = y(1), . . . , y(d) . Prove that
2.3 Topology in Metric Spaces 5

lim d2 (xn , y) = 0 ⇐⇒ lim xn (k) = y(k) for each k = 1, . . . , d.


n→∞ n→∞

Also prove that the same result holds if we replace the Euclidean metric d2
by d1 , d∞ , or d1/2 .

2.2.52. Prove that Fd is complete with respect to each of the metrics d1 , d2 ,


d∞ , and d1/2 .

2.2.53. Do you think that the space C[0, 1] is complete with respect to the
uniform metric du or the L1 -metric d1 ? Try to give a variety of different ex-
amples of Cauchy sequences and determine whether they converge in C[0, 1].
We will consider the completeness of C[0, 1] in more detail later, but try to
prove now that it is complete or incomplete with respect to each of these two
metrics.

2.3 Topology in Metric Spaces

Neighborhoods

A topology consists of open sets, but sometimes it is useful to consider sets


that are not open but which contain an open set. For example, the interval
[0, 1) is not open, but it contains the open interval (0, 1). We give the following
special name to a set that contains an open set that contains a point x.

Definition 2.3.50 (Neighborhood). Let X be a metric space.


(a) A neighborhood of a point x ∈ X is any set A such that there exists an
open set U that satisfies
x ∈ U ⊆ A.
(b) An open neighborhood of x is a neighborhood U of x that is an open
set. Equivalently, an open neighborhood of x is simply an open set that
contains x. ♦

A neighborhood need not be an open set. For example, if we consider


X = R, the interval [0, 3) is a neighborhood of the point x = 1, but it is not
an open set. On the other hand, (0, 3) is an open set that is a neighborhood
of x = 1, so (0, 3) is an open neighborhood of x = 1. In an arbitrary metric
space, the open ball Br (x) is an open neighborhood of x, but not every open
neighborhood need be an open ball.
6 Extra Material and Problems c 2018 Christopher Heil

Extra Problems

2.3.51. Let X be a metric space and fix x ∈ X. Prove that if y and z are any
two points in Br (x), then d(x, y) < 2r.
2.3.52. Let X be a metric space, and let x ∈ X and E ⊆ X be fixed. Prove
the following statements.
(a) Every open ball Br (x) is an open subset of X.
(b) A ⊆ X is a neighborhood of x if and only if there exists some r > 0
such that Br (x) ⊆ A.
(c) E is open if and only if E = E ◦ .
(d) E ◦ is the union of all open balls that are contained in E.
2.3.53. Let X be a metric space, and let xn , x ∈ X be given. Prove that the
following three statements are equivalent.
(a) xn → x.
(b) For every open neighborhood U of x, there is an N > 0 such that
xn ∈ U for all n > N.
(c) For every neighborhood A of x, there is an N > 0 such that xn ∈ A
for all n > N.

2.4 Closed Sets

Extra Problems

2.4.50. Exhibit a metric space X that contains a subset E such that E 6= ∅,


E 6= X, and E is both open and closed.
2.4.51. Consider C[0, 1] with respect to the uniform metric. Prove that

E = f ∈ C[0, 1] : |f (x)| < 1 for all x ∈ [0, 1]

is open but not closed, and therefore its complement is closed but not open.
Given an explicit direct description of E C , i.e., f ∈ E C if and only if what?
2.4.52. Let X be a metric set. Given A, B ⊆ X, define
 
D(A, B) = sup dist(x, B) : x ∈ A + sup dist(y, A) : y ∈ B .

Let B be the set of all closed, bounded, nonempty subsets of X. Determine,


with proof or a counterexample, which of the metric space requirements that
D satisfies on the set B. Why do we restrict B to closed, bounded, and
nonempty sets?
2.5 Accumulation Points and Boundary Points 7

2.5 Accumulation Points and Boundary Points

Accumulation Points

It is often helpful to reformulate a definition in equivalent ways, or to see


its implications. For example, suppose that x is an accumulation point of
a set E, and let us see what this tells us about the neighborhoods of x. If
A is a neighborhood of x, then, by the definition of a neighborhood, there
is an open set U such that x ∈ U ⊆ A. Consequently, by the definition of
an open set, there is some r > 0 such that Br (x) ⊆ U. Now, since x is an
accumulation point, we know that there exist points xn ∈ E, with every xn
different from x, such that xn → x. By definition of convergence, there exists
some integer N such that

n>N =⇒ d(xn , x) < r.

This tells us that for every n > N we have

xn ∈ Br (x) ⊆ U ⊆ A.

Thus the neighborhood A contains at least one point (and maybe more, if
the xn are not all equal) that belongs to E but is not equal to x. Another
way to say this is that if x is an accumulation point of E, then

A is a neighborhood of x =⇒ (E ∩ A) \ {x} 6= ∅. (2.19)

According to the next exercise, it is also possible to argue in the converse


direction. That is, we can show that if equation (2.19) holds, then x must
be an accumulation point of X. Thus, equation (2.19) gives us an equivalent
way to view accumulation points. Several other equivalent reformulations are
also given in the following exercise (which is exactly the same as Problem
2.5.8 except that it includes wording about neighborhoods instead of just
open sets).

Exercise 2.5.50. Let E be a subset of a metric space X. Given x ∈ X, prove


that the following five statements are equivalent.
(a) x is an accumulation point of E.
(b) If A is any neighborhood of x, then (E ∩ A) \ {x} 6= ∅.
(c) If U is an open set that contains x, then there exists a point y ∈ E ∩ U
such that y 6= x.
(d) If r > 0, then exists a point y ∈ E such that 0 < d(x, y) < r.
(e) Every neighborhood A of x contains infinitely many points of E. ♦

We give the set of accumulation points of E the following name.


8 Extra Material and Problems c 2018 Christopher Heil

Definition 2.5.51 (Derived Set). Let E be a subset of a metric space X.


The derived set of E is the set E ′ that consists of all of the accumulation
points of E, i.e.,

E ′ = x ∈ X : x is an accumulation point of E . ♦

Using this terminology, Theorem 2.4.2 tells us that

E is closed ⇐⇒ E ′ ⊆ E.

Here are some examples of derived sets for subsets of X = R.


Example 2.5.52. (a) The set

A = [0, 1] ∪ {2}

is a closed subset of R. We observed in Section 2.5 that x is an accumulation


point of A if and only if x ∈ [0, 1]. Hence the derived set of A is A′ = [0, 1].
(b) The set 1  1 1 1
S = n n∈N = 1, 2 , 3 , 4 , . . . .
is not a closed subset of the real line. In particular, 0 is an accumulation point
of S even though 0 does not belong to S. Since 0 is the only accumulation
point of S, its derived set is S ′ = {0}.
(c) The set 
T = S ∪ {0} = 0, 1, 12 , 31 , 14 , . . .
contains all of its accumulation points, and therefore is closed. However, 0 is
the only accumulation point of T, so its derived set is T ′ = {0}.
(d) It is possible for a nonempty closed set to have no accumulation points.
For example, E = {0} is a closed subset of R because its complement R \ {0}
is open. However, there is no way to choose points xn ∈ E = {0} with xn 6= 0
such that xn → 0, so 0 is not an accumulation point of E. In fact, E ′ = ∅.
Likewise, even though the set of integers is an infinite set, its derived set is
Z′ = ∅.
(e) The interval I = [0, 1] is a closed subset of R, and every point x ∈ [0, 1]
is an accumulation point of I. Hence I is a closed set that has the property
that all of its elements are accumulation points of I. Moreover, no point
outside of I is an accumulation point of I (why?), so I ′ = I. ♦
We saw in Theorem 2.5.3 that a closed set must contain all of its accumula-
tion points. However, not every point in a closed set need be an accumulation
point of the set. We give the following special name to closed sets that have
the property that every point in the set is an accumulation point.
Definition 2.5.53 (Perfect Set). Let E be a subset of a metric space X.
We say that E is perfect if E is nonempty and E = E ′ . ♦
2.5 Accumulation Points and Boundary Points 9

The interval [0, 1] is a perfect set. However, perfect sets can be quite un-
usual. For example, we will see in Problem 2.7.3 that the Cantor set C is
perfect and has no interior. The Cantor set is a closed, uncountable subset
of [0, 1] that contains no intervals, yet every point x ∈ C is an accumulation
point of C!
According to Problem 2.8.59, every perfect subset of Rd is uncountable.
Although we will not prove it, the Cantor–Bendixson Theorem implies that
every nonempty closed set F ⊆ Rd can be uniquely written as F = E ∪ Z
where E is perfect and Z is countable.

Boundary Points

Our next goal is to characterize closed sets in terms of their boundary points.
Here is the definition of an interior point, exterior point, and boundary point
of a set E.
Definition 2.5.54 (Interior, Exterior, and Boundary Points). Let E
be a subset of a metric space X, and let x be any point in X.
(a) We say that x is an interior point of E if there exists a neighborhood A
of x such that A ⊆ E.
(b) We say that x is an exterior point of E if there exists a neighborhood A
of x such that A ⊆ E C .
(c) We say that x is a boundary point of E if for every neighborhood A of x
we have both A ∩ E 6= ∅ and A ∩ E C 6= ∅. The set of all boundary points
of E is called the boundary of E, and it is denoted by

∂E = x ∈ X : x is a boundary point of E . ♦

Boundary points and accumulation points are similar in some respects, but
they are not identical. For example, 0 is a boundary point of the singleton {0},
but it is not an accumulation point. This shows that boundary points need
not be accumulation points. On the other hand, only 0 and 1 are boundary
points of the interval [0, 1], yet every point in [0, 1] is an accumulation point.
Therefore accumulation points need not be a boundary points.
According to Theorem 2.5.5, a subset of a metric space is closed if and
only if it contains all of its boundary points, i.e., ∂E ⊆ E.
Can we ever have ∂E = E? If L is the line segment

L = {(x, x) : 0 ≤ x ≤ 1}

in R2 then ∂L = L. However, if L = [a, b] is a line segment in R then ∂L only


contains the two endpoints a and b:

∂L = {a, b} =
6 [a, b] = L.
10 Extra Material and Problems c 2018 Christopher Heil

It is hard to imagine an infinite subset of the real line that equals its own
boundary! Yet Problem 2.7.3 shows that the Cantor set is an example of an
uncountable subset of the real line that has this unusual property.
Thus C is a rather strange set. It is uncountable, yet its interior is empty.
It equals its own boundary, and every point in C is both an accumulation
point of C and an accumulation point of its complement R \ C. Following
Definition 2.5.53, since every point in C is an accumulation point of C, we
say that C is a perfect set. It is also totally disconnected, which means that
it contains no nontrivial connected subsets (in one dimension, a nontrivial
connected set is simply an interval).

Closed Sets versus Complete Sets

Suppose that E is a subset of a metric space X. Theorem 2.4.2 showed that E


is closed if and only if E contains every limit of points from E. On the other
hand, Definition 2.2.50 states that E is a complete set if and only if every
Cauchy sequence in E converges to an element of E. These concepts are quite
similar, but the following example shows that they are not identical.

Example 2.5.55. Let X = Q, where the metric on X is the restriction of the


absolute value metric to Q. That is,

d(x, y) = |x − y|, x, y ∈ Q.

Let E be the set of all rational numbers that are less than π, i.e.,

E = (−∞, π) ∩ Q = {r ∈ Q : r < π}.

Let x1 = 3.1, x2 = 3.14, x3 = 3.141, x4 = 3.1415, and so forth. Then {xn }n∈N
is a Cauchy sequence in Q (why?), but it does not converge to an element
of Q. Therefore E is not a complete subset of Q.
On the other hand, we will show that E is a closed subset of Q. Suppose
that {yn }n∈N is any sequence of points from E that converges to some point
y ∈ Q. By definition of E, we have yn < π for every n. Since yn converges
to y, it follows that
y = lim yn ≤ π.
n→∞

But y is rational by hypothesis, so it follows that y < π and therefore y ∈ E.


We have shown that any limit of points from E belongs to E, so Theorem
2.4.2 tells us that E is a closed subset of Q. ♦

Thus, a closed set need not be a complete set in general. Problem 2.4.8
shows that if X itself is complete then the closed subsets of X coincide with
the complete subsets of X.
2.5 Accumulation Points and Boundary Points 11

Extra Problems

2.5.56. For this problem, use any of the metrics d1 , d2 , d∞ , or d1/2 on R2 .


The distance dist(x, A) from a point x to a set A is defined in Problem 2.4.7.
(a) Give an example of a point x ∈ R2 and a set A ⊆ R2 such that x ∈
/A
but dist(x, A) = 0.
(b) Exhibit disjoint sets A, B ⊆ R2 such that dist(A, B) = 0.
(c) Give an example of a set A ⊆ R2 such that d(x, y) < diam(A) for every
choice of x, y ∈ A.

2.5.57. Modify the Cantor middle-thirds set construction as follows. Fix a


parameter 0 < θ < 1, and at stage n remove intervals of relative length θ
from Fn to form Fn+1 . Show that the generalized Cantor set Cθ = Fn is
T

perfect, has no interior, and equals its own boundary.

2.5.58. Let X be a metric space, and suppose that x is an accumulation


point of a set E ⊆ X. Show that x is an accumulation point of every set F
that contains E.

2.5.59. Given a subset E of a metric space X, prove the following statements.


(a) E ◦ is the set of all interior points of E, and (E C )◦ is the set of all
exterior points of E.
(b) E ◦ , (E C )◦ , and ∂E are disjoint, and their union is X. Consequently,
every point x ∈ X is precisely one of an interior point, an exterior point, or
a boundary point of E.
(c) If x ∈ ∂E but x ∈/ E, then x is an accumulation point of E.
(d) ∂E is a closed set.

2.5.60. Let A, B be subsets of a metric space X.


(a) Prove that ∂(A ∪ B) ⊆ ∂A ∪ ∂B.
(b) Show by example that ∂(A ∪ B) need not equal ∂A ∪ ∂B.

2.5.61. This problem continues Problem 2.3.18. Let Y be a subset of a metric


space (X, d). We call the restriction of d to Y ithe inherited metric on Y.
(a) Given a set F ⊆ Y, prove that F is a closed subset of Y (with respect
to the inherited metric) if and only if there exists a closed set E ⊆ X such
that F = E ∩ Y.
(b) Let X = R with the usual metric, and set S = {r ∈ Q : 0 ≤ r ≤ π}.
Prove that S is a closed set in Q, but it is not a complete set.
12 Extra Material and Problems c 2018 Christopher Heil

2.6 Closure, Density, and Separability

Density

Observe that Theorem 2.6.5 implies that the Banach space ℓ1 contains proper,
dense subspaces. This is quite different from our experience with the Eu-
clidean space Rd , whose proper subspaces are closed sets and therefore are
not dense in Rd . Indeed, Problem 3.7.5 shows that no proper subspace of any
finite-dimensional normed space can be dense.

Separability

In Rd , we can find up to d + 1 vectors that are each a distance δ from each


other (for example, consider the four vertices of a regular tetrahedron in R3 ).
The following lemma implies that in a separable space there can be at most
countably many vectors that are each a distance δ from each other.
Lemma 2.6.50. Let X be a metric space. If there exists an uncountable set
S ⊆ X and a constant δ > 0 such that d(x, y) ≥ δ for every x 6= y ∈ S, then
X is not separable.
Proof. We will prove the contrapositive statement. Suppose that X is sepa-
rable, and suppose that S is a subset of X that satisfies d(x, y) ≥ δ for every
x 6= y ∈ S, where δ is a fixed positive constant. We must prove that S is
countable.
Since X is separable, there exists a countable set A that is dense in X.
Applying the equivalent formulation of density given in Corollary 2.6.4, for
each x ∈ S there must exist a point ax ∈ A such that
δ
d(x, ax ) < .
2
If x, y are two distinct elements of S, then
δ δ
δ ≤ d(x, y) ≤ d(x, ax ) + d(ax , ay ) + d(ay , y) < + d(ax , ay ) + .
2 2
Consequently d(ax , ay ) > 0, which implies that ax 6= ay . This shows that
each point x ∈ S determines a unique point ax ∈ A. In other words, the
function f (x) = ax is an injective mapping of S into the countable set A.
Consequently S must be countable. ⊓ ⊔
Since ℓp is separable when p is finite, a consequence of Lemma 2.6.50 is that
if p is finite, then there can exist at most countably many vectors in ℓp each
a distance δ apart. In contrast, Theorem 2.6.8 shows that ℓ∞ is nonseparable
2.6 Closure, Density, and Separability 13

and we can find uncountably many vectors in ℓ∞ that are each a distance 1
from each other!
Problem 2.6.20 shows that ℓ1 and c0 are separable, and Theorem 2.6.8
shows that ℓ∞ is not separable. his may seem to be related to the fact that

ℓ 1 ⊆ c0 ⊆ ℓ ∞ ,

but this is misleading, because these spaces do not all share the same norms.
Indeed, ℓ1 is separable with respect to the ℓ1 -norm, c0 is separable with
respect to the ℓ∞ -norm, and ℓ∞ is not separable with respect to the ℓ∞ -norm.
Problem 2.6.22 does show that if Y is a subspace of a separable space X and
we use the norm on Y inherited from X, then Y must be separable. However,
if Y is given a different norm, then it is entirely possible that Y could be
nonseparable even though X is separable. Although we will not prove it here,
an example from measure theory is given by the Lebesgue spaces L1 [0, 1] and
L∞ [0, 1]. The space L∞ [0, 1] is nonseparable, even though it is a proper subset
of the separable space L1 [0, 1] (for precise definitions and proof, see [Heil18]).
Again, the norms on these two spaces are different. If we replace the standard
norm on L∞ [0, 1] with the norm of L1 [0, 1], then L∞ [0, 1] is a separable space.

Extra Problems

2.6.51. Let E be a subset of a metric space X. Must E and E always have


the same interior?

2.6.52. Let 
A = x = (xk )k∈N ∈ ℓ1 : x1 = 0 .
(a) Prove that A is not a dense subset of ℓ1 by showing that there exists
some point z ∈ ℓ1 such that there do not exist any points xn ∈ A that
converge to z in ℓ1 -norm.
(b) Prove that A is a proper, closed subspace of ℓ1 , and use this to give
another proof that A is not dense in ℓ1 .

2.6.53. Let S be the set defined in equation (2.20).


(a) Is S a dense subset of ℓ∞ (with respect to the ℓ∞ -metric)? Is this by
itself enough to prove that ℓ∞ is not separable?
(b) Let c0 be the subspace of ℓ∞ defined in equation (2.21). Prove that
when we use the ℓ∞ -norm, S is a countable dense subset of c0 and therefore
c0 is separable with respect to the sup-norm.
14 Extra Material and Problems c 2018 Christopher Heil

2.7 The Cantor Set

Extra Problems

2.7.50. The sum of two sets A, B ⊆ R is A + B = {x + y : x ∈ A, y ∈ B}.


(a) Prove that [a, b] + [c, d] = [a + c, b + d].
(b) Let C be the Cantor set, and let
X
∞ 
cn
D = : cn = 0, 1 .
n=1
3n

Show that D + D = [0, 1], and use this to show that C + C = [0, 2].

2.8 Compact Sets in Metric Spaces

We know that a closed set contains all of its accumulation points. In fact,
Theorem 2.4.2 tells us that if {xn }n∈N is any convergent sequence that is
contained in a closed set F, then the limit of this sequence must belong
to F. Even so, it is not true that every infinite sequence {xn }n∈N in F must
converge, or even have an accumulation point or a convergent subsequence.
For example, the set of integers Z is a closed subset of the real line R, but if we
set xn = n then {xn }n∈N does not converge, and no subsequence of {xn }n∈N
converges. One of the equivalences proved in Theorem 2.8.9 shows that any
infinite sequence in a compact set must contain a convergent subsequence.
We will use Theorem 2.8.9 to give an example of a closed and bounded set
that is compact.

Lemma 2.8.50. The interval [0, 1] is a compact subset of R.

Proof. Implicitly, we are assuming that the metric on R is its standard metric
(the absolute value metric).
The interval [0, 1] is a closed subset of R. Since R is complete, Problem
2.4.8 implies that [0, 1] is complete.
Fix any r > 0, and let n be an integer larger than 1r . For each k = 0, . . . , n,
let 
Bk = nk − 2r , nk + 2r .
1
Then Bk is an open interval of radius r. Since r > n, it follows that

S
n
[0, 1] ⊆ Uk .
k=0
2.8 Compact Sets in Metric Spaces 15

Thus [0, 1] is covered by finitely many open balls of radius r. Since we can do
this for every r > 0, we conclude that [0, 1] is totally bounded.
Thus [0, 1] is both complete and totally bounded, so Theorem 2.8.9 implies
that [0, 1] is a compact subset of R. ⊓ ⊔

The Heine–Borel Theorem for Cd

The proof of Theorem 2.8.4 shows that a subset of Rd is compact if and only
if it is closed and bounded. We could use essentially the same proof to show
that a subset of Cd is compact if and only if it is closed and bounded. Instead,
we will indicate a different (and possibly more difficult) approach based on
the relationship between the topologies of R2d and Cd . Our intuition suggests
that C “looks just like” R2 , and indeed the following exercise will show that
there is a natural identification between the open subsets of Cd and the open
subsets of R2d . We assume in this exercise that the norms on Cd and R2d are
the Euclidean norms.

Exercise 2.8.51. Let U be an open subset of R2 , and let V be an open


subset of C. Show that

e = a + ib ∈ C : (a, b) ∈ U is an open subset of C,
U

and 
Ve = (a, b) ∈ R2 : a + ib ∈ V is an open subset of R2 .
Formulate and prove a corresponding result for Cd and R2d . ♦

Using the language that we will introduce in Definition 3.52.51, Exercise


2.8.51 says that Cd and R2d are homeomorphic spaces—in essence, they have
the “same” topologies. As a consequence, the Heine–Borel Theorem also holds
for As a consequence, the Heine–Borel Theorem also holds for subsets of Cd .
We assign the details of the argument as the following exercise.

Exercise 2.8.52. Let K be a compact subset of Cd . Prove that



e = (x1 , y1 , . . . , xd , yd ) : (x1 + iy1 , . . . , xd + iyd ) ∈ K
K

is a compact subset of R2d and is therefore closed and bounded in R2d . Use
this to prove that K is a closed and bounded set in Cd . ♦

Extra Problems

2.8.53. Prove directly that Rd is not compact.


16 Extra Material and Problems c 2018 Christopher Heil

2.8.54. Assume that {[an , bn ]}n∈N is a nested decreasing sequence of closed


finite intervals. Prove that
T

[an , bn ] = [a, b]
n=1

where a = supn an and b = inf n bn .

2.8.55. For each of the following statements, exhibit a metric space X that
has that property.
(a) X is compact.
(b) X is not compact.
(c) X is infinite, but no infinite subset of X is compact.
(d) X is infinite but not compact, yet there exists an infinite subset of X
that is compact.

2.8.56. Let X be a metric space.


(a) Prove that an arbitrary intersection of compact subsets of X is com-
pact, and a finite union of compact subsets of X is compact.
(b) Is it possible for an intersection of finitely many open subsets of X to
be a nonempty compact set?
(c) Is it possible for an intersection of infinitely many open subsets of Xto
be a nonempty compact set? What if X is a normed space?

2.8.57. Let X be a metric space. Show that if xn , x ∈ X and xn → x, then


K = {xn }n∈N ∪ {x} is compact.

2.8.58. Let K be a compact subset of a metric space X. Show that every


infinite set S ⊆ K has an accumulation point, and that accumulation point
belongs to K.

2.8.59. A set S ⊆ R is totally disconnected if it contains no intervals. It is


perfect if every point x ∈ S is an accumulation point of S.
(a) Show that the interval I = [0, 1] is perfect but not totally disconnected.
(b) Show that the Cantor set C is both perfect and totally disconnected.
(c) This part will show that any perfect subset of R must be uncountable.
Suppose that S = {x1 , x2 , . . . } is perfect. Let n1 = 1 and r1 = 1, and let
U1 = Br1 (xn1 ). Let n2 be the first integer greater than n1 such that xn2 ∈ U1 ,
and let U2 = Br2 (xn2 ) be such that U2 ⊆ U2 ⊆ U1 but xn1 ∈ / U2 . Continue
in this way, and then define K = (Un ∩ S). Show that the sets Un ∩ S are
T

compact and nested decreasing. The Cantor Intersection Theorem therefore


implies that K is nonempty. Show that no element of S can belong to K.
2.9 Continuity for Functions on Metric Spaces 17

2.8.60. Let S 1 = {z ∈ C : |z| = 1} denote the unit circle in the complex


plane. Fix z ∈ S 1 , and define T : S 1 → S 1 by T (x) = zx. Given x ∈ S 1 , the
set O(x) = {z n x}n≥0 is called the forward orbit of x under T, and the cluster
set of x under T is
T T n
A(x) = O(z k x) = {z x}n≥k .
k≥0 k≥0

Prove the following statements.


(a) O(x) = O(x) ∪ A(x).
(b) T maps A(x) into itself. That is, if y ∈ A(x), then T (y) ∈ A(x).
For the remainder of this problem, assume that O(x) is compact. Then from
part (a) we obtain A(x) ⊆ O(x), so there is a smallest nonnegative integer
n0 such that z n0 x ∈ A(x). Prove the following statements.
(c) O(z n0 x) = A(x) = A(z n0 x).
(d) If A(x) is infinite then it is perfect.
(e) O(x) is finite.

2.9 Continuity for Functions on Metric Spaces

Continuity at a Point

Definition 2.9.50 (Continuity at a Point). Let (X, dX ) and (Y, dY ) be


metric spaces, and let f : X → Y be a function that maps X into Y. We say
that f is continuous at a point x ∈ X if for every ε > 0 there exists a δ > 0
such that

∀ y ∈ X, dX (x, y) < δ =⇒ dY f (x), f (y) < ε. ♦ (2.20)

We give some equivalent formulations of continuity at a point in terms of


how f acts on sequences, and in terms of inverse images of open sets. Recall
that the inverse image of a set V ⊆ Y is

f −1 (V ) = x ∈ X : f (x) ∈ V .

Theorem 2.9.51. Let X, Y be metric spaces. Given a function f : X → Y


and a point x ∈ X, the following three statements are equivalent.
(a) f is continuous at the point x.
(b) For any sequence {xn }n∈N in X,

xn → x in X =⇒ f (xn ) → f (x) in Y. (2.21)


18 Extra Material and Problems c 2018 Christopher Heil

(c) For each open neighborhood V of f (x) in Y, there exists an open neigh-
borhood U of x in X such that U ⊆ f −1 (V ). ♦
Proof. (a) ⇒ (b). Assume that f is continuous at x, and suppose that xn → x.
Given ε > 0, there is a δ > 0 such that equation (2.20) holds. Also, by
definition of convergent sequences, since xn → x there is an N > 0 such that
dX (xn , x) < δ for all n > N. Therefore dY (f (xn ), f (x)) < ε for all n > N, so
we have f (xn ) → f (x) in Y.
(b) ⇒ (c). Suppose that statement (c) fails, i.e., there exists some open
neighborhood V of f (x) such that no open neighborhood U of x is contained
in f −1 (V ). Since U = B1/n (x) is an open neighborhood of x, there must be
a point xn ∈ B1/n (x) that is not in f −1 (V ). Hence xn → x, but f (xn ) ∈
/V
for any n so f (xn ) cannot converge to f (x). Therefore statement (b) fails.
(c) ⇒ (a). Suppose that statement (c) holds, and fix any ε > 0. Then
V = Bε (f (x)) is an open neighborhood of f (x) in Y, so statement (c) implies
that there exists an open neighborhood U of x in X such that U ⊆ f −1 (V ).
Since x ∈ U and U is open, there is a δ > 0 such that Bδ (x) ⊆ U. If y is any
point in X that satisfies dX (x, y) < δ, then we have

y ∈ Bδ (x) ⊆ U ⊆ f −1 (V ).

Consequently f (y) ∈ V = Bε (f (x), so dY (f (x), f (y)) < ε. Therefore f is


continuous at the point x. ⊓

Here is an analogous exercise that gives equivalent conditions for continuity
of a function on all of X.
Exercise 2.9.52. Let X, Y be metric spaces. Given a function f : X → Y,
prove that the following three statements are equivalent.
(a) f is continuous, i.e., if V is any open subset of Y, then its inverse image
f −1 (V ) is an open subset of X.
(b) f is continuous at each point x ∈ X.
(c) Given any point x ∈ X and any sequence {xn }n∈N in X,

xn → x in X =⇒ f (xn ) → f (x) in Y. ♦

The equivalent condition given in statement (b) of Exercise 2.9.52 tells us


that a continuous function preserves convergent sequences. This is often the
most convenient way to think of continuity of functions on metric spaces, but
the definition, given in statement (a) is just as important, and even more
fundamental in many respects. That statement tells us that a continuous
function respects topological structure in the sense that the inverse image of
every open set is open.
Although the inverse image of an open set under a continuous function is
open, the following example shows that it is not true that the direct image
of an open set under a continuous function must be open.
2.9 Continuity for Functions on Metric Spaces 19

Example 2.9.53. We know from undergraduate calculus class that the func-
tion f : R → R defined by f (x) = sin x is continuous. If V is any open subset
of the real line, then its inverse image f −1 (V ) is open. For example, the
inverse image of the interval V = (0, 2) is

f −1 (V ) = · · · ∪ (−2π, π) ∪ (0, π) ∪ (π, 2π) ∪ · · · ,

which is open. However, the direct image of the open interval U = (0, 2π) is

f (U ) = [−1, 1],

which is not open. ♦

Examples

These examples use some function spaces that are introduced in Chapter 3.
We will illustrate continuity by considering a function that maps the
normed space Cb1 (R) into the space Cb (R). The space Cb (R) consists of all
bounded, continuous functions, while Cb1 (R) is the space of all bounded, dif-
ferentiable functions f such that f ′ is bounded and continuous. As discussed
in Section 3.5, the standard norm for Cb (R) is the uniform norm, while the
standard norm for Cb1 (R) is kf kCb1 = kf ku + kf ′ ku . Unless we specify other-
wise, we always assume that these are the norms on these two spaces.
Example 2.9.54. Given g ∈ Cb1 (R), let Dg = g ′ be the derivative of g. By
definition of Cb1 (R), the derivative g ′ belongs to Cb (R). Therefore we can think
of D as a function whose domain is Cb1 (R) and whose codomain is Cb (R). We
input a function g from Cb1 (R), and we output the function Dg = g ′ ∈ Cb (R).
Normally we would write D(g) for the output of the function D, but in this
context it is traditional, and simpler, to just write Dg.
We claim that D : Cb1 (R) → Cb (R) is continuous. Note that the elements of
Cb (R) are functions. Hence a “point” in Cb1 (R) is a function g. The function
1

that we are trying to show is continuous is D, not g. In order to show that D


is continuous at a point g, we must interpret equation (2.21) correctly. In our
situation, “g” in that equation is our function D, while “x” is our point g in
the domain Cb1 (R). To prove that D is continuous at g, we must show that
if {gn }n∈N is a sequence of points in Cb1 (R) that converge to g, then Dgn
converges to Dg. That is, we must prove that

gn → g in Cb1 (R) =⇒ Dgn → Dg in Cb (R).

This is easy! To see why, suppose that gn → g in Cb1 (R). This means that gn
converges to g with respect to the norm of Cb1 (R), or in other words

lim kg − gn kCb1 = 0.
n→∞
20 Extra Material and Problems c 2018 Christopher Heil

But then we can compute that

kDg − Dgn ku = kg ′ − gn′ ku


≤ kg − gn ku + kg ′ − gn′ ku
= kg − gn kCb1
→ 0 as n → ∞.

Thus Dgn converges to Dg with respect to the uniform norm, which is the
norm of Cb (R). In summary, we have shown that if gn → g in Cb1 (R), then
Dgn → Dg in Cb (R). Therefore D is continuous at the point g. Since g is an
(
arbitrary function in Cb R), we conclude that D is continuous on the domain
Cb1 (R). ♦

We emphasize that continuity depends on the choice of metric or norm for


both the domain and codomain. For example, if in Example 2.9.54 we replace
the standard norm for Cb1 (R) with the uniform norm, then D is no longer
continuous (see Problem 3.52.59).

Example 2.9.55. Let us point out a few other properties of the derivative
function D considered in Example 2.9.54.
First, D is not injective, because we can find two different functions f, g in
the domain Cb1 (R) such that Df = Dg. In fact, if f is any function in Cb1 (R)
and c is a constant, then g = f + c has the same derivative as f.
Second, D is not surjective, because it is not true that the range of D
equals the codomain Cb (R). In particular, if g is the constant function 1,
then there is no function f ∈ Cb1 (R) that satisfies Df = 1. In fact, the only
functions whose derivative is the constant 1 are f (x) = x + c where c is
a constant, but none of these functions belong to Cb1 (R) because they are
unbounded.
Third, D is a linear function, because if f, g belong to Cb1 (R) and a, b are
scalars, then

D(af + bg) = (af + bg)′ = af ′ + bg ′ = aDf + b Dg.

A linear function “respects” the operations of addition and scalar multipli-


cation. ♦

Uniform Continuity

Example 2.9.56. Let D : Cb1 (R) → Cb (R) be the derivative operator discussed
in Example 2.9.55. We will prove that D is uniformly continuous. Given ε > 0,
let δ = ε. Our “points” x, y from Definition 2.9.5 are now elements of Cb1 (R),
2.9 Continuity for Functions on Metric Spaces 21

i.e., they are functions in Cb1 (R). So, suppose that f, g are any two elements
of Cb1 (R) such that kf − gku < δ. Then

kDf − Dgku = kf ′ − g ′ ku
≤ kf − gku + kf ′ − g ′ ku
= kf − gkCb1

< δ = ε.

Thus kDf − Dgku < ε simultaneously for all f and g in Cb1 (R). This shows
that D is uniformly continuous on Cb1 (R). ♦

In Example 2.9.56 it was easy to find δ given ε, because D is linear and


we have the relationship
kDgku ≤ kgkCb1 . (2.22)
Using the language that we will introduce in Section 6.2, equation (2.22) says
that D is a bounded operator. We will prove in Section 6.2 that all linear,
bounded operators are continuous.
The following exercise characterizes uniform continuity for scalar-valued
functions whose domain is the Euclidean space Fd .

Exercise 2.9.57. Prove that a function f : Fd → F is uniformly continuous


if and only if
lim kTa f − f ku = 0,
a→0

where Ta f (x) = f (x − a) denotes the translation of f by a ∈ Rd . ♦

According to Problem 2.9.17, a uniformly continuous function must map


Cauchy sequences to Cauchy sequences. This fact can be used to prove the
following theorem, which states that it is possible to extend a uniformly
continuous function from a dense subset of the domain to the entire domain
if the codomain is a complete metric space. We assign the proof of this result
as Problem 2.9.65.

Theorem 2.9.58. Let X and Z be metric spaces such that Z is complete.


Suppose that Y is a proper dense subset of X and f : Y → Z is uniformly
continuous. Then there exists a uniformly continuous function g : X → Z
such that g(x) = f (x) for every x ∈ Y.

Extra Problems

2.9.59. Let X, Y, and Z be metric spaces, and let f : X → Y and g : Y → Z


be functions.
22 Extra Material and Problems c 2018 Christopher Heil

(a) Show that if f is continuous at a point x ∈ X and g is continuous at


the point f (x), then their composition g ◦ f : X → Z is continuous at the
point x.
(b) Show that if f and g are each continuous, then their composition
g ◦ f : X → Z is continuous.
2.9.60. (a) Suppose that X is a normed space, and define g : X → R by
g(x) = kxk. Prove that g is continuous.
(b) Let X be a metric space, and suppose that f : X → F is continuous.
Let h = |f |, i.e., h(x) = |f (x)| for x ∈ X, and prove that h : X → R is
continuous.
2.9.61. Let f : X → Y be a function that maps one metric space X into an-
other metric space Y. Prove that the following two statements are equivalent.
(a) f is continuous.
(b) For each compact set K ⊆ X, the restriction of f to K is a continuous
mapping of K into Y.
2.9.62. Suppose that f : Rd → Rd satisfies the following two conditions.
(a) If K ⊆ Rd is compact, then f (K) is compact.
(b) If {Kn }n∈N is a decreasing sequence of compact subsets of Rd , then
∞ 
T T

f Kn = f (Kn ).
n=1 n=1

Prove that f is continuous.


2.9.63. Suppose that {fn }n∈N is a sequence of nonnegative continuous func-
tions on a compact metric space X. Show that if the sequence monotonically
decreases to the zero function, i.e., f1 (x) ≥ f2 (x) ≥ · · · and fn (x) → 0 for
every x, then fn → 0 uniformly.
2.9.64. Let X be a metric space with metric dX and let Y be a metric space
with metric dY .
(a) Prove that

d (x1 , y1 ), (x2 , y2 ) = dX (x1 , x2 ) + dY (y1 , y2 )

defines a metric on X × Y.
(b) Suppose that f : X → Y is continuous, and prove that

graph(f ) = (x, f (x)) : x ∈ X

is a closed subset of X × Y.
(c) Suppose that X is compact. Prove that f : X → Y is continuous if and
only if
2.9 Continuity for Functions on Metric Spaces 23

graph(f ) = (x, f (x)) : x ∈ X
is a compact subset of X × Y.

2.9.65. (a) Prove Theorem 2.9.58.


(b) Let S be the set of rational numbers in (0, 1), and note that S is dense
in the closed interval [0, 1]. Exhibit a function f : S → R that is continuous
on S, but which cannot be extended to a continuous function on [0, 1].
24 Extra Material and Problems c 2018 Christopher Heil

CHAPTER 3: NORMS AND BANACH SPACES

3.1 The Definition of a Norm

The simplest vector space, aside from the trivial space X = {0}, is the field
of scalars F. The “standard” norm on F is absolute value. However, this is
not the only norm, for if λ > 0 is a fixed positive real number then

kxk = λ |x|, x ∈ F,

defines another norm on F. Are there any other norms on F? Yes, but we
can identify all of them—according to Problem 3.1.8, a function k · k is a
seminorm on F if and only if there exists a scalar λ ≥ 0 such that kxk = λ |x|
for every x ∈ F; furthermore, this seminorm is a norm if and only if λ > 0.
We often summarize this by saying that up to multiplication by a positive
scalar, absolute value is the only norm on F, and the only seminorm on F
that is not a norm is the zero function.
The situation in higher dimensions is more complicated. One seminorm on
Fd is the zero function, defined by kxk = 0 for all x ∈ Fd . However, when
d ≥ 2 there are many nonzero functions on Fd that are seminorms, such as

kxk = |x2 | + · · · + |xd |, x = (x1 , x2 , . . . , xd ) ∈ Fd .

Recall that if d is a metric on a set X, then we can create a metric on any


subset S of X simply by restricting d to S. In contrast, if k · k is a norm on
a vector space X and S is a subset of X, then the restriction of k · k to S
need not define a norm on S. This is because Definition 3.1.1 requires that
the domain of a norm be a vector space. Therefore, only when S is a subspace
of X will it be true that the restriction of k · k to S is a norm on S.

Definition 3.1.50. Let (X, k·k) be a normed vector space. If S is a subspace


of X, then the norm on S obtained by restricting k · k to S is called the norm
on S inherited from X, or simply the inherited norm on S. We usually use
the same symbol k · k to denote the inherited norm on S. ♦

For example, if X = F3 with respect to the Euclidean norm k · k2 , then the


inherited norm on the subspace S = {x = (x1 , x2 , 0) : x1 , x2 ∈ F} is given by
1/2
kxk2 = |x1 |2 + |x2 |2 , x = (x1 , x2 , 0) ∈ S.
3.2 Examples: The ℓp Spaces 25

Extra Problems

3.1.51. (a) Fix a dimension d ≥ 1. Show that there exist positive constants
Ad , Bd such that

Ad kxk1 ≤ kxk∞ ≤ Bd kxk1 for all x ∈ Fd . (3.23)

The constants Ad , Bd can depend on the dimension d, but not on x. Find the
optimal constants Ad , Bd , i.e., find the largest constant Ad and the smallest
constant Bd such that equation (3.23) holds for every x ∈ Fd .
(b) Repeat part (a), but with k · k2 in place of k · k∞ .
(c) Repeat part (a), but with k · k2 in place of k · k1 .
Remark: Using the terminology of Definition 3.6.1, this problem says that
k · k1 , k · k2 , and k · k∞ are all equivalent norms on Fd .

3.2 Examples: The ℓp Spaces

We write the components of an infinite sequence in two ways. If x is a sequence


of scalars, then we usually write x as

x = (xk )k∈N = (x1 , x2 , . . . ).

Using this notation, the symbol xk denotes the kth component of x. However,
it is sometimes preferable to write
 
x = x(k) k∈N = x(1), x(2), . . . .

That is, it is sometimes more clear to let x(k) denote the kth component
of x. In any particular situation we will use whichever of these two notations
is more convenient.
Remark 3.2.50. By making appropriate changes in the definitions above, we
can consider spaces of sequences that are indexed by sets other than the
natural numbers N. For example, if I is a countable index set, then we say
that a sequence x = (xk )k∈I is p-summable if and only if
X
|xk |p < ∞.
k∈I

We let ℓp (I) be the space of all p-summable sequences indexed by I, and let
ℓ∞ (I) be the space of all bounded sequences indexed by I. If I = N, then
this reduces to the definition of ℓp that we gave before.
A common choice of index set is I = Z. A sequence indexed by Z is a
bi-infinite sequence of the form
26 Extra Material and Problems c 2018 Christopher Heil

x = (xk )k∈Z = (. . . , x−2 , x−1 , x0 , x1 , x2 , . . . ).

Hence ℓp (Z) denotes the set of all bi-infinite sequences that are p-summable
(if p is finite) or bounded (if p = ∞). For example, the bi-infinite sequence
 
x = 2−|k| k∈Z = . . . , 41 , 12 , 1, 21 , 14 , . . .

belongs to ℓp (Z) for every index 0 < p ≤ ∞.


Going in a different direction, we can let the index set be finite. For ex-
ample, if I = {1, . . . , d} then a sequence indexed by I is simply a vector
x = (x1 , . . . , xd ) ∈ Fd . Every such sequence is p-summable and bounded, so
for I = {1, . . . , d} we have ℓp (I) = Fd , no matter what index p that we choose.
In contrast to equation (3.5), when I is a finite set we have ℓp (I) = ℓq (I) for
every p and q.
It is possible to consider uncountable index sets I, but this requires more
care, see [Heil11, Exercise 1.15] for details. ♦

We will see that if 1 ≤ p ≤ ∞, then k · kp is an norm on ℓp (see Theo-


rem 3.4.3). Therefore we refer to k · kp as the “ℓp -norm” when p ≥ 1. For two
specific values of p we have additional names: We call k · k2 the Euclidean
norm, and we refer to k · k∞ as the sup-norm.
Before we can address the question of whether k · kp is a norm on ℓp ,
we need to prove that ℓp is a vector space. We know how to add sequences
(just add corresponding components), and how to multiply a sequence by a
scalar. It is clear that ℓp is closed under multiplication by scalars, and the
following lemma shows that ℓp is closed under addition when p is finite (even
for 0 < p < 1).

Lemma 3.2.51. Fix 0 < p < ∞. If x, y ∈ ℓp , then the sequence x + y =


(xk + yk )n∈N satisfies

kx + ykpp ≤ 2p kxkpp + kykpp , (3.24)

and therefore x + y ∈ ℓp .

Proof. Given any scalars a, b ∈ F, we have


p
|a + b|p ≤ |a| + |b|
 p
≤ max{|a|, |b|} + max{|a|, |b|}

= 2p max{|a|p , |b|p }

≤ 2p |a|p + |b|p .

Therefore, if x = (xk )k∈N and y = (yk )k∈N are sequences in ℓp , then


3.2 Examples: The ℓp Spaces 27


X ∞
X  
kx + ykpp = |xk + yk |p ≤ 2p |xk |p + |yk |p = 2p kxkpp + kykpp .
k=1 k=1

Since the right-hand side is finite, we conclude that x + y ∈ ℓp . ⊓



Equation (3.24) is not necessarily the most useful inequality that relates
kx + ykp to kxkp and kykp , but it is sufficient for the purpose of proving that
ℓp is closed under addition for every finite index p. The reader should verify
that ℓ∞ is also closed under addition and scalar multiplication and therefore
is a vector space.
The zero vector in ℓp is the zero sequence whose components are all zero.
We use the same symbol 0 to denote both the zero sequence and the number
zero. That is, we write the zero sequence as

0 = (0, 0, 0, . . . ).

As illustrated in the following example, it should usually be clear from context


whether the symbol 0 is meant to represent the number zero or the zero
sequence.
Example 3.2.52. Let x = (xk )k∈N be a sequence of scalars. If we write
“x = 0,” then, since x is a sequence, we infer that 0 in this context must
also represent a sequence and therefore stands for the zero sequence. In other
words, “x = 0” is equivalent “x is the zero sequence,” which is itself equivalent
to “xk = 0 for every k.”
Similarly “x 6= 0” means that x is not the zero sequence, and consequently
there exists at least one index k such that xk 6= 0. We say that x is a
nonzero sequence, or simply that x is nonzero, if x 6= 0. That is, a nonzero
sequence is a sequence that has at least one nonzero component. For example,
y = (1, 0, 0, . . . ) is a nonzero sequence. ♦

The Standard Basis Vectors

Let
δn = (0, . . . , 0, 1, 0, 0, . . . )
denote the sequence that has a 1 in the nth component and zeros elsewhere.
We call δn the nth standard basis vector, and refer to E = {δn }n∈N as the
sequence of standard basis vectors, or simply the standard basis.
Note that E = {δn }n∈N is a sequence, and each element δn of this sequence
is itself a sequence of scalars. Thus E is a “sequence of sequences,” but that
is not necessarily the best way to think about E. Instead, it is better to think
of E as being a sequence of vectors, where it just so happens that “vector”
here means a sequence of scalars. Each δn is a vector in ℓp , and E is just the
sequence of vectors E = {δ1 , δ2 , . . . } in ℓp .
28 Extra Material and Problems c 2018 Christopher Heil

Since E is a set of vectors, we would like to know its properties. Is it a


linearly independent set? What is its span? Before we address these ques-
tions, we consider an even more fundamental question: What does a linear
combination of elements of E look like? By definition, a linear combination is
a sum of finitely many scalar multiples of elements of E. If we choose finitely
many δn , say δn1 , . . . , δnk , then a linear combination of these vectors is

x = cn1 δn1 + · · · + cnk δnk ,

where cn1 , . . . , cnk ∈ F. If we let N be the largest of n1 , . . . , nk , then by


inserting terms with ck = 0, we can write
N
X
x = ck δk = c1 δ1 + · · · + cN δN = (c1 , . . . , cN , 0, 0, . . . ). (3.25)
k=1

Every linear combination of elements of E has the form given in equation


(3.25) for some integer N > 0 and scalars c1 , . . . , cN ∈ F. For example, if
n1 = 2, n2 = 3, n3 = 5 and cn1 = 6, cn2 = −π, cn3 = 4, then N = 5 and

x = cn1 δn1 + cn2 δn2 + cn3 δn3


= 6δ2 − πδ3 + 4δ5
= 0δ1 + 6δ2 − πδ3 + 0δ4 + 4δ5
= (0, 6, −π, 0, 5, 0, 0, . . . ).

Now we determine whether E is linearly independent. Suppose that some


finite linear combination of elements of E equals the zero vector in ℓp . That
is, suppose that
XN
ck δ k = 0 (3.26)
k=1

for some N and some scalars c1 , . . . , cN . Rewriting equation (3.26) in terms


of components, we obtain

(c1 , . . . , cN , 0, 0, . . . ) = (0, 0, 0, . . . ).

Hence c1 = · · · = cN = 0. Therefore the only linear combination that equals


the zero vector is the trivial combination, so E is a linearly independent set
in ℓp .
However, the span of E is not ℓp , and therefore E is not a vector space
basis for ℓp . To see why, recall that span(E) is the set of all finite linear
combinations of elements of E. Every linear combination of standard basis
vectors has the form given in equation (3.25). Consequently, if x is a vector
in span(E) then x has only finitely many nonzero components. Any vector
that has infinitely many nonzero components cannot belong to the span of
3.2 Examples: The ℓp Spaces 29

E. For example,  
z = 2−n n∈N
= 1 1 1
2, 4, 8, . . .

belongs to ℓp for every p, yet z ∈ / span(E), because it is not a finite linear


combination of δ1 , δ2 , . . . . Therefore

span(E) ( ℓp , 0 < p ≤ ∞.

Indeed, span(E) is precisely the set of all sequences that have finitely many
nonzero components. We will study this space, which we often denote by
c00 = span(E), in more detail in Example 2.2.13.
The fact that ℓp contains an infinite linearly independent set implies that
it must be infinite dimensional. This is because a basic result from linear
algebra tells us that if a vector space X has finite dimension d, then any set
of more than d vectors must be dependent. A finite-dimensional vector space
cannot contain an infinite linearly independent set.

Hölder’s Inequality

Fig. 3.50 The curved line is the graph of y = xp−1 . The area of the vertically hatched
1
region is 0a xp−1 dx, the area of the horizontally hatched region is 0b y p−1 dy, and the
R R

area of the rectangle [0, a] × [0, b] is ab.

The method of Problem 3.2.11 is not the only way to prove equation (3.8),
which stated that if 1 < p < ∞ then

ap bp
ab ≤ + ′, a, b ≥ 0.
p p
30 Extra Material and Problems c 2018 Christopher Heil

For another approach, note that xp−1 is continuous and strictly increasing on
1
the interval [0, a], and its inverse function is y p−1 . Figure 3.50 gives a “proof
by picture” that
Z a Z b
1
ab ≤ xp−1 dx + y p−1 dy. (3.27)
0 0

Evaluating the right-hand side of equation (3.27), we obtain equation (3.8).


Yet another proof of equation (3.8) can be given based on a theorem known
as Jensen’s inequality. 
Here is one application of Hölder’s Inequality. Let x = y = k1 k∈N . We
have x ∈ ℓp for 1 < p ≤ ∞. Taking p = 3/2, Hölder’s Inequality therefore
implies the interesting inequality

X∞ X∞ 2/3 X∞ 1/3


1 1 1
= kxyk1 ≤ kxk3/2 kyk3 = . ♦
k2 n=1
n3/2 n=1
n3
k=1

Euler’s formula (see Problem 5.10.11) states that



X 1 π2
= .
k2 6
k=1

Extra Problems

3.2.53. Given 0 < p < 1, prove that ℓp is complete with respect to the metric
defined in Problem 3.3.14.

3.3 The Induced Metric

Convexity

If a metric space contains an open ball that is not convex, then there is an
important further conclusion that we can draw. This is spelled out in the
following lemma.

Lemma 3.3.50. Let X be a vector space that has a metric d. If there exists an
open ball Br (x) in X that is not convex, then the metric d is not induced from
any norm on X. That is, there is no norm k·k on X such that d(y, z) = ky−zk
for all y, z ∈ X.

Proof. Suppose that the open ball Br (x) is convex. Suppose also that there
is a k · k on X that satisfies ky − zk = d(y, z) for all y, z ∈ X. By Problem
3.3 The Induced Metric 31

3.3.13, open balls defined with respect to a norm are convex. Yet the open
ball centered at x with radius r that is defined with respect to k ·k is precisely
Br (x):
 
Br (x) = y ∈ X : d(y, x) < 1 = y ∈ X : ky − xk < 1 .

Since Br (x) is not convex, this is a contradiction. Hence no such norm k · k


can exist. ⊓⊔

Bounded Sets

Before giving the next property of convergent and Cauchy sequences, we


recall from Definition 2.3.2 that a subset of a metric space is bounded if
it is contained in some open ball. The following (easy) exercise gives some
equivalent formulations for boundedness of subsets of a normed space. In
particular, statement (b) of this exercise says that a subset of a normed space
is bounded if it is contained in some open ball that is centered at the origin
(note that a generic metric space need not be a vector space, and hence there
need not be an “origin” in a metric space, as there is in a normed space).

Exercise 3.3.51. Let E be a subset of a normed space X. Prove that the


following three statements are equivalent.
(a) E is bounded, i.e., E ⊆ Br (x) for some x ∈ X and r > 0.
(b) E ⊆ Br (0) for some r > 0.
(c) supx∈E kxk < ∞. ♦

Separability

There is an interesting difference between ℓ∞ and the other ℓp spaces. Prob-


lem 2.6.20 showed that ℓ1 is separable, i.e., it contains a countable dense
subset. A very similar argument shows that ℓp is separable for any finite p,
but we saw in Theorem 2.6.8 that ℓ∞ is not separable. In this sense ℓ∞ is
“much larger” than the other ℓp spaces.

The Standard Topology for Fd

Many sets have a “standard” or “default” metric or norm associated with


them. The topology induced from this standard metric or norm is often re-
ferred to as the standard topology for that space. For example, the standard
32 Extra Material and Problems c 2018 Christopher Heil

norm for the space ℓp when 1 ≤ p ≤ ∞ is k · kp , so the topology induced from


this norm is the standard topology for ℓp . The standard norm for Cb (R) is
the uniform norm k · ku .
The standard topology for Fd (also called the usual topology for Fd ) is the
topology induced from the Euclidean norm on Fd . However, we will show
that exactly the same topology is induced from the norm k · k1 .
Lemma 3.3.52. If U ⊆ Fd , then U is open with respect to the Euclidean
norm if and only if U is open with respect to the norm k · k1 .
Proof. For this proof, let Br1 (x) denote an open ball with respect to k · k1 ,
i.e., 
Br1 (x) = y ∈ Fd : kx − yk1 < r .
Similarly, let Br2 (x) denote an open ball with respect to the Euclidean norm
k · k2 . These balls are illustrated in Figure 2.5 for dimension d = 2 and F = R.
As we see in that figure, Br2 (x) is a ball or disk in the usual sense of the word,
while Br1 (x) is a diamond. Figure 2.5 suggests that the diamond fits entirely
within the disk. Indeed, by direct calculation we can see that

Br1 (x) ⊆ Br2 (x). (3.28)

Suppose now that U is any set that is open with respect to the Euclidean
norm. By definition, this means that for each x ∈ U there exists some rx > 0
such that Br2x(x) ⊆ U. Hence,
S S
U = {x} ⊆ Br1x(x) (since x ∈ Br1x(x))
x∈U x∈U
S
⊆ Br2x(x) (by equation (3.28))
x∈U

⊆ U (since Br2 (x) ⊆ U ).

This implies that U is the union of the open balls Br1x(x) over x ∈ U, i.e.,
S
U = Br1x(x).
x∈U

Since each such ball Br1x(x) is open with respect to k · k1 , it follows that U is
open with respect to k · k1 .
If we attempt to use the same argument to prove the converse implication,
then we encounter a problem because Br2 (x) is not contained in Br1 (x). On
the other hand, by applying Hölder’s Inequality with the index p = 2 we see
that
d
X X
d 1/2 X
d 1/2
kxk1 = |1 · xk | ≤ 12 |xk |2 = d1/2 kxk2 ,
k=1 k=1 k=1

and it follows from this that


3.3 The Induced Metric 33

Br2 (x) ⊆ Bd11/2 r (x).

That is, every open disk is contained in a diamond whose radius has been
enlarged by a factor of d1/2 . We can then argue just as before to show that
if U is open with respect to the norm k · k1 , then it is also open with respect
to k · k2 . ⊓

In summary, the topology on Fd induced from the norm k · k1 is the same


as the topology induced by k·k2 . An entirely similar argument shows that the
norm k · k∞ also induces this same topology. We will prove in Theorem 3.7.2
that every norm on Fd induces the same topology. Thus, there is only one
topology on Fd that is induced by a norm, and that is the standard topology.
There are other topologies on Fd , such as the discrete topology, but these are
not induced from any norm on Fd .

Extra Problems

3.3.53. Given a set E in a normed space X and given t ≥ 0, define tE =


{tx : x ∈ E}.
(a) Prove that tBr (0) = Brt (0).
(b) Given an arbitrary point x ∈ X, must tBr (x) = Brt (x)?

3.3.54. Show that if X is a nontrivial normed space, then every open ball
Br (x) is an infinite, proper subset of X, and r>0 Br (x) = X.
S

In contrast, construct examples of metric spaces X1 , X2 , X3 that have the


following properties.
(a) Br (x) = X1 for some x ∈ X1 and some r > 0.
(b) X2 is infinite, yet Br (x) is a finite set for every x ∈ X2 and r > 0.
(c) For every integer n ∈ N there exists some x ∈ X3 and some r > 0 such
that Br (x) contains exactly n elements.

3.3.55. (a) Show that a set B ⊆ R is an open ball in R if and only if B is a


bounded open interval.
(b) Identify all of the convex subsets of the real line.

3.3.56. Suppose that X is a normed vector space. Show that if U is an open


subset of X, then the translated set U + x = {u + x : u ∈ U } is also open for
every x ∈ X.

3.3.57. Let X be a nontrivial normed space. Prove that every subspace of X


is convex, but no nontrivial subspace of X is a bounded set.
34 Extra Material and Problems c 2018 Christopher Heil

3.3.58. (a) Show that if E is a subset of a normed space X and



δ = inf kx − yk : x, y ∈ E, x 6= y > 0,

then ∂E = E. In particular, show that ∂E = E for every finite set E ⊆ Rd .


(b) Show by example that the conclusion of part (a) can fail if δ = 0.
(c) Show by example that  the conclusion of part (a) can fail if X is a
metric space, even if δ = inf d(x, y) : x, y ∈ E, x 6= y > 0.

3.4 Banach Spaces

Complete Subsets

Sometimes we need to refer to a subset of a normed space that has a com-


pleteness property.

Definition 3.4.50 (Complete Sets in Normed Spaces). If S is a subset


of a normed vector space X and every Cauchy sequence in X converges to
an element of S, then we say that S is a complete subset of X. ♦

If S is a complete subspace of a normed space X, then S is itself a normed


space, and so in this case we would call S a Banach space. However, if S
is a subset that is complete but is not a subspace, then we refer to S as a
complete subset of X. In Lemma 3.4.4 we will see that a subspace S of a
Banach space X is complete if and only if S is closed.

Remark 3.4.51. As we have noted before, the term “complete” has a num-
ber of distinct mathematical meanings. In this volume, the two main uses
are complete sets in the sense of Definition 2.2.9, and complete sequences in
the sense of Definition 4.4.4. The reader should pause to consider context
whenever the term “complete” is encountered. ♦

Here is an example of an incomplete subset of R.

Example 3.4.52. Let S be the open interval S = (0, 1) in the real line. Then
{ n1 }n∈N is a Cauchy sequence of elements of S, but this sequence does not
converge to an element of S. The sequence does converge, but it does not
converge to a point in S. Therefore S is not a complete subset of R. ♦

Another example is the set of rationals Q, which Example 2.2.7 shows is


an incomplete subset of R. A modification of the argument given in Example
2.2.7 can be used to show that

Qd = r = (r1 , . . . , rd ) : r1 , . . . , rd ∈ Q
3.4 Banach Spaces 35

is an incomplete subset of Rd with respect to each of the norms k · k1 , k · k2 ,


and k · k∞ , and that

Qd + iQd = z = r + is : r, s ∈ Qd

is an incomplete subset of Cd with respect to k · k1 , k · k2 , and k · k∞ .


Here is a direct proof that c00 is an incomplete subset of ℓ∞ .

Theorem 3.4.53. The sup-norm is a norm on c00 , but c00 is not complete
with respect to k · k∞ .

Proof. Since c00 is a subspace of ℓ∞ , it is a normed space with respect to the


sup-norm k · k∞ .
For each n ∈ N, let xn be the sequence

xn = 1, 12 , 13 , . . . , n1 , 0, 0, . . . ,

and consider the sequence of vectors {xn }n∈N in c00 . If m < n, then
1

xn − xm = 0, . . . , 0, m+1 , . . . , n1 , 0, 0, . . . ,

so
1
kxn − xm k∞ = .
m+1
Therefore, if we fix ε > 0 then we have kxn − xm k∞ < ε for all m, n > 1ε .
Hence {xn }n∈N is a Cauchy sequence in c00 .
Suppose that there was a sequence x ∈ c00 such that xn → x with respect
to the sup-norm. We know that ℓ∞ -norm convergence implies componentwise
convergence, so xn must converge componentwise to x as n → ∞. If we fix
any particular k, then
1
xn (k) = for all n > k.
k
Therefore we must have x(k) = k1 . That is, x is the sequence

x = 1, 21 , 13 , . . . . (3.29)

However, this x does not belong to c00 , so we have reached a contradiction.


Therefore there is no vector x ∈ c00 such that xn → x with respect to the
sup-norm.
In summary, {xn }n∈N is a Cauchy sequence in c00 , but this sequence does
not converge to an element of c00 . Therefore c00 is incomplete with respect
to the sup-norm. ⊓ ⊔

Theorem 3.4.6 shows that an infinite-dimensional Banach space (such as


c0 ) can contain an incomplete subspace (such as c00 ). This is quite dif-
ferent from what happens in finite-dimensions! Every subspace of a finite-
36 Extra Material and Problems c 2018 Christopher Heil

dimensional normed space is complete. In particular, every subspace of Fd is


complete.

Closed Subspaces of Banach Spaces

It is important to note that Lemma 3.4.4 is only applicable if the subspace Y


has the same norm as X. For example, although Cb1 (R) is a subspace of
Cb (R), the standard norms for these two spaces are different. For Cb (R) the
standard norm is the uniform norm, while for Cb1 (R) the standard norm
is kf kCb1 = kf ku + kf ′ ku . Therefore, even though we know that Cb (R) is
complete with respect to the uniform norm, we cannot use Lemma 3.4.4 to
prove that Cb1 (R) is complete with respect to its standard norm, because that
norm is not the uniform norm.

Extra Problems

3.4.54. Given an arbitrary (possibly uncountable) index set I, let ℓ∞ (I) be


the space of all bounded sequences x = (xi )i∈I , and set kxk∞ = supi∈I |xi |.
For 1 ≤ p < ∞ let ℓp (I) consist of all sequences x = (x Pi )i∈I p with at most
countably many nonzero components such that kxkpp = |xi | < ∞. Prove
that each of these spaces ℓp (I) is a Banach space with respect to k · kp .

3.5 Uniform Convergence of Functions

Remark 3.5.50. The uniform norm is the analogue for functions of the sup-
norm
kxk∞ = k(xk )k∈N k∞ = sup |xk | < ∞
k∈N

that is defined for sequences. It would therefore be natural to use the notation
kf k∞ instead of kf ku to denote the uniform norm of f. However, the symbols
kf k∞ traditionally denote the L∞ -norm of a function f. If f is a continuous
function on an interval I, then there is no difference between its L∞ -norm
and its uniform norm. However, for discontinuous functions there can be a
difference, because the L∞ -norm “ignores sets of measure zero” (in a sense
that is made precise by using measure theory; see [Heil18]). For this reason,
we will always write kf ku to denote the uniform norm of a function. ♦
3.5 Uniform Convergence of Functions 37

C0 (R) and Cc(R)

Now we look at continuous functions whose domain is the entire real line R.
We will define two important subspaces of Cb (R). To define the first of these,
we declare that a function f : R → F vanishes at infinity if we have both

lim f (x) = 0 and lim f (x) = 0. (3.30)


x→∞ x→−∞

For simplicity, we often write limx→±∞ f (x) = 0 or lim|x|→∞ f (x) = 0 to


mean that the two conditions in equation (3.30) both hold.
The space of all continuous functions that vanish at infinity is
n o
C0 (R) = f ∈ C(R) : lim f (x) = 0 .
x→±∞

Every function f ∈ C0 (R) is bounded (why?), so C0 (R) is a subset of Cb (R).


Further, C0 (R) is closed under addition of functions and multiplication of a
function by a scalar, so it is a subspace of Cb (R). Since the uniform norm is
a norm on Cb (R), it is therefore a norm on C0 (R) as well.
Here are some examples.
Example 3.5.51. (a) The Gaussian function
2
φ(x) = e−x , ∈ R,

belongs to C0 (R). Since φ(x) 6= 0 for any x, this shows that an element
of C0 (R) need not be zero at any point.
(b) The function f (x) = e−x does not belong to C0 (R) because f (x) does not
converge to zero as x → −∞.
(c) The two-sided exponential g(x) = e−|x| is an element of C0 (R), and it
satisfies g(x) 6= 0 for every x.
(d) The sinc function is

sin πx
sinc(x) = , x 6= 0.
πx
The sinc function has a removable discontinuity at the origin, so if we
define sinc(0) = 1 then it belongs to C0 (R). The sinc function has the
property that sinc(x) = 0 if and only if x is a nonzero integer (this plays
an important role in the Classical (or Shannon) Sampling Theorem; see
[Heil11]). ♦

Another example of a function in C0 (R) is any continuous function that


is identically zero outside of some finite interval. We have a special name for
functions that have this property.
38 Extra Material and Problems c 2018 Christopher Heil

Definition 3.5.52 (Compactly Supported Function). We say that a


continuous function f : R → F is compactly supported if there exists some
finite interval [a, b] such that f (x) = 0 for all x ∈
/ [a, b]. ♦
For example, f (x) = sin x is not compactly supported, but the continuous
function (
sin x, x ∈ [0, 2π],
g(x) = (sin x) χ[0,2π] (x) =
0, x∈/ [0, 2π],
is compactly supported because it is identically zero outside of [0, 2π].
We denote the space of compactly supported continuous functions by
n o
Cc (R) = f ∈ C(R) : f is compactly supported .

Each function in Cc (R) belongs to C0 (R), and Cc (R) is closed under addi-
tion and multiplication by scalars, so it is a subspace of C0 (R). Each of the
functions in parts (a), (c), and (d) of Example 3.5.51 belong to C0 (R) but
not Cc (R). Hence Cc (R) is a proper subspace of C0 (R). More generally, we
have the inclusions

Cc (R) ( C0 (R) ( Cb (R) ( C(R).

Since Cb (R) is a normed space and C0 (R) is a subspace of Cb (R), the


uniform norm defines a norm on C0 (R). We take the uniform norm to be the
standard, or “default,” norm for C0 (R), so unless we specify otherwise, we
always assume that the norm on C0 (R) is the uniform norm.
According to Problem 3.5.19, C0 (R) is complete with respect to the uni-
form norm. Problem 3.5.20 shows that Cc (R) is dense in C0 (R).
In contrast, although Cc (R) is a subspace of Cb (R), it is not dense in
Cb (R). To see why, let f be the constant function f = 1. If fn ∈ Cc (R), then
fn (x) = 0 for some x. Hence |f (x) − fn (x)| = 1 for that x, and therefore

kf − fn ku = sup |f (x) − fn (x)| ≥ 1.


x∈R

No matter what functions fn ∈ Cc (R) that we choose, kf − fn ku does not


converge to zero. Therefore there is no sequence of functions {fn }n∈N that
converges uniformly to the constant function f = 1. This implies that Cc (R)
is not dense in Cb (R).
Example 3.5.53. Let X = C0 (R) and let E = Cc (R). The “standard” or
“default” norm on C0 (R) is the uniform norm, and that is the norm that we
will use in this example. Let f be any function in C0 (R) that does not belong
to Cc (R). For example, we could take f (x) = e−|x| . By Problem 3.5.20, there
exist functions fn ∈ Cc (R) that converge uniformly to f. We have fn 6= f
since fn belongs to Cc (R) while f does not. Thus:
• fn ∈ Cc (R) for every n,
3.5 Uniform Convergence of Functions 39

• fn 6= f for every n, and


• fn → f (with respect to the norm on C0 (R), which is the uniform norm).
According to Definition 2.5.1, this says that f is an accumulation point of
Cc (R). Since f does not belong to Cc (R), this implies that Cc (R) is not a
closed subset of C0 (R). In fact, a modification of the argument given above
shows that Cc (R) is a dense but proper subset of C0 (R). ♦

Example 3.5.54. Suppose that instead of Cc (R), we consider the set S that
consists of those continuous, compactly supported functions that are identi-
cally zero outside of the interval [0, 1], i.e.,

S = g ∈ Cc (R) : supp(g) ⊆ [0, 1] .

We will show that this set S is a closed subset of C0 (R).


To do this, suppose that f is any function in C0 (R) that is a limit of
functions from S. By limit, we mean here a limit with respect to the norm of
C0 (R). In other words, suppose that functions fn ∈ S are such that fn → f
uniformly as n → ∞. Since fn belongs to S, it is supported within the interval
[0, 1]. Now, uniform convergence implies pointwise convergence, so for every
x ∈ R we have
f (x) = lim fn (x).
n→∞

In particular, if we choose x outside of [0, 1], then fn (x) = 0 for every n and
therefore f (x) = 0 as well. Hence f is compactly supported and is identically
zero outside of [0, 1], so f belongs to S. Applying Theorem 2.4.2, we conclude
that S is a closed subset of C0 (R). ♦

Expanded Discussion of the Cantor–Lebesgue Function

We expand a little on Problem 3.5.26, which constructs the Cantor–Lebesgue


Function.
To illustrate uniform convergence, we will construct a striking function
called the Cantor–Lebesgue Function. The construction of this function is
related to the middle-thirds Cantor set, which was constructed in Section
2.7.
The next exercise constructs the Cantor–Lebesgue function ϕ. As in the
construction of the Cantor set, we do not define ϕ directly, but instead obtain
it as a limit, in this case the limit of a uniformly convergent sequence of
continuous functions.

Exercise 3.5.55 (Cantor–Lebesgue Function). Consider the two func-


tions ϕ1 , ϕ2 pictured in Figure 3.51. The function ϕ1 takes the constant
value 12 on the interval ( 13 , 23 ) that is removed from [0, 1] in the first stage of
40 Extra Material and Problems c 2018 Christopher Heil

1 1

0.75 0.75

j1 j2
0.5 0.5

0.25 0.25

0 0
0 0.25 0.5 0.75 1 0 0.25 0.5 0.75 1

0.75

j3
0.5

0.25

0
0 0.25 0.5 0.75 1

Fig. 3.51 First stages in the construction of the Cantor–Lebesgue function.

the construction of the Cantor set, and it is linear on the remaining subin-
tervals of [0, 1]. The function ϕ2 takes the same constant 21 on the interval
( 13 , 32 ) but additionally is constant with values 41 and 43 on the two intervals
that are removed during the second stage of the construction of the Cantor
set. We continue this process and define ϕ3 , ϕ4 , . . . in a similar fashion. Each
function ϕn is continuous, and ϕn is constant on each of the open intervals
that are removed during the nth stage of the construction of the Cantor set.
 1 Looking
2
 at the graphs of ϕ1 and  1ϕ2 , observe
 2  that ϕ1 = ϕ2 on the interval
,
3 3 , and if x belongs to either 0, 3 or 3 , 1 then |ϕ1 (x)−ϕ2 (x)| is at most
1 1
2 . Hence kϕ 1 − ϕ k
2 u ≤ 2 . Continue this argument, and prove the following
statements.
(a) For each n ∈ N, ϕn is continuous and monotone increasing on [0, 1], i.e.,
if 0 ≤ x ≤ y ≤ 1, then ϕn (x) ≤ ϕn (y).
(b) For each n ∈ N,

kϕn − ϕn+1 ku = sup |ϕn (x) − ϕn+1 (x)| ≤ 2−n .


x∈[0,1]

(c) {ϕn }n∈N is a Cauchy sequence in C[0, 1] with respect to the uniform
norm.
3.5 Uniform Convergence of Functions 41

We know that C[0, 1] is a Banach space with respect to the uniform norm.
Therefore, since {ϕn }n∈N is a Cauchy sequence in C[0, 1], there must exist a
function ϕ ∈ C[0, 1] such that ϕn converge uniformly to ϕ as n → ∞! Since
uniform convergence implies pointwise convergence, we have

ϕ(x) = lim ϕn (x), x ∈ [0, 1].


n→∞

This continuous function ϕ is called the Cantor–Lebesgue function or, more


picturesquely, the Devil’s staircase on [0, 1].
If we like, we can extend ϕ to a continuous function that is defined on the
entire real line R by reflecting its graph about the point x = 1 and declaring ϕ
to be zero outside of [0, 2]. This reflected Devil’s staircase function is pictured
in Figure 3.52. ♦

1 2
Fig. 3.52 The reflected Devil’s staircase (Cantor–Lebesgue function).

The Cantor–Lebesgue function is differentiable at some points, but not


all. Suppose that x ∈ [0, 1] \ C, i.e., x is in the complement of the Cantor set
in [0, 1]. In this case, x belongs to one of the open intervals that is removed
during some stage of the construction of the Cantor set. Let us say it is
stage n, and let I denote that interval that is removed. Then ϕn is constant
on I, and ϕn+1 = ϕn on I. Taking the limit, the Cantor–Lebesgue function ϕ
must be constant on the open interval I, which contains the point x. Hence
ϕ is differentiable at x, and ϕ′ (x) = 0.
Thus ϕ is differentiable at every point x ∈ [0, 1] \ C, and ϕ′ (x) = 0 at each
such x. If it was the case that ϕ′ (x) = 0 for every x ∈ [0, 1], then ϕ would
be a constant function. However, while ϕ is constant on many subintervals of
[0, 1], those constants are different for each subinterval, and ϕ is not constant
on [0, 1].
In summary, ϕ is differentiable at all points in the interval [0, 1] except
for points x ∈ C. As discussed above, the measure of the Cantor set is zero.
42 Extra Material and Problems c 2018 Christopher Heil

Hence ϕ is differentiable except on a set whose measure is zero. A function


that has this property is called a singular function; see [Heil18, Chap. 5].

Expanded discussion of Lipschitz Continuity

Recall that given an interval I in the real line, we let Cb (I) be the set of all
functions f : I → F that are both bounded and continuous. The space Cb1 (I)
consists of all differentiable functions f : I → F such that both f and f ′ are
bounded (we focused on Cb1 (R) in Section 3.50, but the same ideas apply to
functions on an interval I). By definition, we have

Cb1 (I) ( Cb (I).

However, we often encounter functions that are “better than continuous”


but are “not quite differentiable.” The idea of Lipschitz continuity is one
way of quantifying a measure of smoothness that lies between continuity and
differentiability. We will define a study this notion in this section, and will
define a space of of functions called Lip(I) that satisfies

Cb1 (I) ( Lip(I) ( Cb (I).

To motivate the definition of Lipschitz continuity, let f be a real-valued


function in Cb1 (I). Then f is differentiable at every point of I and both f and
f ′ are bounded, so the quantity

K = kf ′ ku = sup |f ′ (t)|
t∈I

is finite. Choose any two points x < y in I. Because f is real-valued, the Mean
Value Theorem (Theorem 1.9.1) implies that there exists a point c between
x and y such that
f (y) − f (x)
f ′ (c) = .
y−x
Rearranging, we see that

|f (y) − f (x)| = |f ′ (c)| |y − x| ≤ K |y − x|. (3.31)

Hence the distance between f (x) and f (y) is never more than K times the
distance between x and y. Such a function does not expand the distance be-
tween points by more than the factor K (technically the word “compression”
may be more appropriate if K < 1, but the words “expand” or “dilate” are
often used in this context regardless of whether K is larger or smaller than 1).
Every real-valued differentiable function whose derivative is bounded has this
3.5 Uniform Convergence of Functions 43

property of limited expansion. On the other hand, the following exercise shows
that there exist non-differentiable functions that have the same property.

Exercise 3.5.56. The hat function W (x) = max 1 − |x|, 0 is piecewise
linear. Prove that W satisfies

|W (y) − W (x)| ≤ |y − x|, all x, y ∈ R,

even though W is not differentiable at the points −1, 0, and 1. ♦


We introduce the following terminology for functions that have this kind
of behavior.
Definition 3.5.57 (Lipschitz Continuous Function). Let I be an inter-
val in the real line. We say that a function f : I → F is Lipschitz continuous
on I if there exists a constant K ≥ 0 such that

∀ x, y ∈ I, |f (x) − f (y)| ≤ K |x − y|. (3.32)

Any number K such that equation (3.32) holds is called a Lipschitz constant
for f. ♦
The Lipschitz constant is not unique, for if K is a Lipschitz constant for f
then so is any number larger than K. The smallest, or “optimal,” Lipschitz
constant for f is
|f (y) − f (x)|
K = sup .
x6=y |x − y|
If f is Lipschitz and we choose any ε > 0, then |f (x) − f (y)| < ε whenever
|x − y| < δ = ε/K. Hence every Lipschitz constant is uniformly continuous.
We collect the bounded Lipschitz functions to form the following space.
Definition 3.5.58. Given an interval I, we set

Lip(I) = f ∈ Cb (I) : f is Lipschitz continuous on I . ♦

We try to avoid extra parentheses where possible, so if I is given explicitly


as I = [a, b] then we write Lip[a, b] instead of Lip([a, b]), and similarly for
other types of intervals.
We have already seen that if f is a real-valued function in Cb1 (I), then f
is Lipschitz. The proof required the Mean-Value Theorem, which does not
hold for complex-valued functions. However, by applying the Mean-Value
Theorem to the real and imaginary parts of a complex-valued function, we
can derive an analogous result for complex-valued functions in Cb1 (I).
Lemma 3.5.59. If f is differentiable everywhere on an interval I and both f
and f ′ are bounded on I, then f is Lipschitz on I. If f is real-valued, then
K = kf ′ ku is a Lipschitz constant, while if f is complex-valued, then K =
21/2 kf ′ ku is a Lipschitz constant. ♦
44 Extra Material and Problems c 2018 Christopher Heil

As a corollary, we see that Cb1 (I) ⊆ Lip(I). The hat function W is Lipschitz
but not differentiable on the interval [−1, 1]. If I is an arbitrary interval, then
by appropriately dilating and translating the function so that it is supported
within I, we obtain a function that is Lipschitz on I but does not belong to
Cb1 (I). Hence Cb1 (I) is a proper subspace of Lip(I). Another very interesting
example is constructed in part (c) of Problem 3.5.74, which shows that there
exists a function that is Lipschitz and differentiable everywhere on I but does
not belong to Cb1 (I) because its derivative is not continuous.
By definition, Lip(I) ⊆ Cb (I). Are there bounded continuous functions
that are not Lipschitz? Yes; here is an example.

Example 3.5.60. The Cantor–Lebesgue function ϕ takes the value 2−k at the
point x = 3−k . Therefore,

|ϕ(3−k ) − ϕ(0)| = |2−k − 0| = 2−k = |3−k − 0|log3 2 .

Set α = log3 2 ≈ 0.63093. Since 1 − α > 0,

|ϕ(3−k ) − ϕ(0)| |ϕ(3−k ) − ϕ(0)| −k


lim −k
= lim |3 − 0|1−α
k→∞ |3 − 0| k→∞ |3−k − 0|α
= lim |3−k − 0|1−α
k→∞

= ∞.

It follows that there is no finite constant K such that

|ϕ(3−k ) − ϕ(0)| ≤ K |3−k − 0|

for every k, so ϕ is not Lipschitz on the interval [0, 1]. ♦

If I is an arbitrary interval, then by appropriately dilating and translating


the Cantor–Lebesgue function ϕ so that it is supported within I, we obtain
a function that is continuous but not Lipschitz. Thus, Lip(I) is a proper sub-
space of Cb (I). Combining this with the fact that Cb1 (I) is a proper subspace
of Lip(I), we see that

Cb1 (I) ( Lip(I) ( Cb (I).

In this sense, the space of Lipschitz functions is intermediate between Cb1 (I)
and Cb (I).
Problem 3.5.76 shows that although the Cantor–Lebesgue function is not
Lipschitz, it does satisfy a weaker Hölder continuity condition. Problem 4.2.15
shows that the space of Lipschitz functions is a Banach space with respect
to an appropriate norm, and Problem 3.5.77 does the same for the space of
Hölder continuous functions.
3.5 Uniform Convergence of Functions 45

Extra Problems

3.5.61. Show that if I is an interval in the real line other than [a, b], then
Cb (I) ( C(I).
3.5.62. Give an example of a Banach space X that contains a dense sub-
space S and a closed subspace M such that S ∩ M = {0}.
3.5.63. (a) Let X, Y be metric spaces. Given a function f : X → Y, prove
that f is continuous if and only if the following implication holds:

F is closed subset of Y =⇒ f −1 (F ) is a closed subset of X.

(b) Show by example that the direct image of a closed set under a contin-
uous function need not be closed.
(c) Show by example that the inverse image of a compact set under a
continuous function need not be compact.
3.5.64. We say that a function f : R → F is periodic if there exists some
a ∈ R such that f (x − a) = f (x) for every x ∈ R. Prove that every periodic
function f ∈ C(R) is bounded and uniformly continuous on R.
3.5.65. Let X be a metric space, and suppose fn , f ∈ Cb (X). Show that if
fn → f uniformly and xn → x in X, then fn (xn ) → f (x).
3.5.66. (a) Suppose that fn , f ∈ Cb (R), and fn converges pointwise to f,
i.e., fn (x) → f (x) for each x ∈ R. Show that
def
kf ku ≤ lim inf kfn ku = sup inf kfk ku .
n→∞ n∈N k≥n

Give an example that shows that strict inequality is possible.


(b) Prove that if fn → f uniformly in Cb (R), then kf ku = limn→∞ kfn ku .
3.5.67. Let C(Rd ) be the set of all scalar-valued functions f : Rd → F. How
should the spaces Cb (Rd ), C0 (Rd ), and Cc (Rd ) be defined? How should the
uniform norm for functions on Rd be defined? Prove that Cb (Rd ) and C0 (Rd )
are complete with respect to the uniform norm, but Cc (Rd ) is not.
3.5.68. For this problem we take X = C0 (R), under the uniform norm. Let
f (x) = e−|x| and set
E = {cf : c ∈ R}.
Prove the following statements.
(a) If g = cf ∈ E and r > 0, then there exists a function h ∈ Cc (R) such
that kg − hku < r.
(b) E contains no open subsets of C0 (R), and therefore E ◦ = ∅.
(c) E is a closed subset of C0 (R).
(d) E ′ = E, and therefore E is a perfect subset of C0 (R).
46 Extra Material and Problems c 2018 Christopher Heil

3.5.69. Suppose that there was a norm k ·k on C[0, 1] such that kf − fn k → 0


if and only fn → f pointwise. Let {fn }n∈N be the sequence of Shrinking
Triangles from Example 3.5.2.
(a) Explain why kfn k 6= 0.
(b) Let gn = fn /kfn k. Show that there exists a function g such that gn → g
pointwise.
(c) What is kgn k? What is kgk? Why is this a contradiction?
Conclude that no such norm exists. (For this reason, we say that pointwise
convergence of functions on [0, 1] is not a normable convergence criterion.)

3.5.70. Recall that C(R) is the space of all continuous scalar-valued functions
on R. This space includes unbounded functions (such as f (x) = x2 ), so the
uniform norm k · ku is not a norm on C(R) (why not?).
For each N ∈ N, define

|||f |||N = sup |f (x)|, f ∈ C(R).


x∈[−N,N ]

Prove the following statements.


a b c
(a) If a, b, c ≥ 0 and a ≤ b + c, then 1+a ≤ 1+b + 1+c .
(b) The following is a metric on C(R):

X |||f − g|||N
d(f, g) = 2−N , f, g ∈ C(R).
1 + |||f − g|||N
N =1

3.5.71. Given a metric space X, let C(X) be the space of all continuous,
scalar-valued functions f : X → F, and let Cb (X) be the subspace of C(X)
that contains all bounded, continuous functions f : X → F.
(a) Prove that Cb (X) is a vector space, the uniform norm

kf ku = sup |f (x)|, f ∈ Cb (X),


x∈X

is a norm on Cb (X), and Cb (X) is a Banach space with respect to the uniform
norm.
(b) Prove that Cb (X) is closed with respect to products of functions, i.e.,
if f, g ∈ Cb (X) then f g ∈ Cb (X). Further, show that the following submulti-
plicative norm inequality is satisfied:

kf gku ≤ kf ku kgku, f, g ∈ Cb (X).

(c) Suppose that g ∈ Cb (X) and there exists an ε > 0 such that |g(x)| ≥ ε
for every x ∈ X. Prove that 1/g ∈ Cb (X).
(d) Show that if X is compact, then
3.5 Uniform Convergence of Functions 47

C(X) = Cb (X)

and C(X) is a Banach space with respect to the uniform norm.


(e) Let X = Cb (R), and define δ : X → F by

δ(f ) = f (0), f ∈ X = Cb (R).

For example, if f (x) = cos x, then δ(f ) = cos 0 = 1. Prove that δ is continu-
ous. Does δ belong to Cb (X)?

3.5.72. Let k · k1 be the norm on C[0, 1] defined in Theorem 3.2.6.


(a) Let {fn }n∈N be the sequence of “Shrinking Triangles” constructed in
Example 3.5.2. For each n ∈ N define gn (x) = nfn (x), and set g(x) = 0 (see
Figure 3.53). Show that gn (x) → g(x) for every x ∈ [0, 1], but kg − gn k1 does
not converge to zero. Thus, pointwise convergence does not imply convergence
with respect to the norm k · k1 in general.

n
gn

1
0 n
1

Fig. 3.53 Graph of the function gn from Problem 3.5.72(a).

(b) Given an interval [c, d] ⊆ [0, 1], let W[c,d] be the hat function of height 1
on the interval [c, d]. Explicitly,


 0, a ≤ x ≤ c,



 c+d
linear, c < x < 2 ,
W[c,d] = 1, c+d
x= 2 ,


 linear, c+d


 2 < x < d,
0, d ≤ x ≤ b.

Define

h1 = W[0,1] ,
h2 = W[0, 12 ] , h3 = W[ 12 ,1] ,
h4 = W[0, 31 ] , h5 = W[ 13 , 32 ] , h6 = W[ 32 ,1] ,
48 Extra Material and Problems c 2018 Christopher Heil

h7 = W[0, 14 ] , h8 = W[ 14 , 21 ] , h9 = W[ 21 , 34 ] , h10 = W[ 43 ,1] ,


h11 = W[0, 15 ] , h12 = W[ 51 , 52 ] , h13 = W[ 25 , 53 ] , h14 = W[ 53 , 45 ] , h15 = W[ 54 ,1] ,

and so forth. Picturing the graphs of these functions as triangles, the triangles
march from left to right across the interval [0, 1], then shrink in size and march
across the interval again, and do this over and over (see Figure 3.54).

1
h7 h8 h9 h10

0.5

0
0 0.25 0.5 0.75 1

Fig. 3.54 Graphs of the functions h7 , h8 , h9 , and h10 from Problem 3.5.72(b).

Let h = 0 (the zero function). Prove that limn→∞ kh − hn k1 = 0, but it


is not true that limn→∞ hn (x) = h(x) for every x ∈ [0, 1]. Thus, convergence
with respect to the norm k · k1 does not imply pointwise convergence in
general.
3.5.73. Define T : C[0, 1] → C[0, 1] by
Z x
T f (x) = x + tf (t) dt, f ∈ C[0, 1].
0

Prove that T satisfies the hypotheses of the Banach Fixed Point Theorem (see
Problem 2.9.20), and the fixed point of T satisfies the differential equation
f ′ (x) = xf (x) + 1.

Problems on Lipschitz Functions

3.5.74. Given a, b > 0, define


(
|x|a sin |x|−b , x 6= 0,
f (x) =
0, x = 0.

Prove the following statements.


(a) If a ≥ 1 + b, then f ∈ Lip[−1, 1].
(b) If a > 1 + b, then f ∈ C 1 [−1, 1], i.e., f is differentiable everywhere on
the interval [−1, 1] and both f and f ′ are bounded on [−1, 1].
3.5 Uniform Convergence of Functions 49

/ C 1 [−1, 1].
(c) If a = 1 + b, then f ∈
(d) Show that the function f defined in Example 3.50.50 is differentiable at
every point, f ′ is bounded but not continuous on [−1, 1], and f ∈ Lip[−1, 1].

3.5.75. Suppose that f : I → F is differentiable everywhere on an interval I.


Prove that f is Lipschitz if and only if f ′ is bounded.
3.5.76. Let I be an interval. We say that a function f : I → F is Hölder
continuous on I with exponent α > 0 if there exists a constant K > 0 such
that
∀ x, y ∈ I, |f (x) − f (y)| ≤ K |x − y|α .
Note that a Lipschitz function is a function that is Hölder continuous with
exponent α = 1.
(a) Show that if f is Hölder continuous for some exponent α > 0, then f
is uniformly continuous on I. Further, if α > 1, then f is constant.
(b) Show that the function f (x) = |x|1/2 is Hölder continuous on R for all
exponents 0 < α ≤ 12 but not for any exponent α > 21 , while


 0, x ≤ 0,

g(x) = − ln x , 0 < x < 12 ,
1


 1
ln 2 , x ≥ 21 ,

is uniformly continuous on R but is not Hölder continuous for any exponent


α > 0.
(c) Prove that if 0 < α ≤ log3 2 ≈ 0.6309 . . . , then the Cantor–Lebesgue
function ϕ is Hölder continuous on [0, 1] with exponent α, but ϕ is not Hölder
continuous for any exponent α > log3 2.

3.5.77. Given 0 < α < 1, let C α (I) be the space of all bounded functions
that are Hölder continuous with exponent α on I, i.e.,

C α (I) = f ∈ Cb (I) : f is Hölder continuous with exponent α .

Show that C α (I) is a Banach space with respect to the norm

|f (x) − f (y)|
kf kC α = kf ku + sup , f ∈ C α (I).
x6=y |x − y|α

Problems on Convex Functions

3.5.78. This problem introduces convex functions, establishes the Discrete


Jensen Inequality, and uses these to give another proof of equation (3.8).
50 Extra Material and Problems c 2018 Christopher Heil

ΦHyL

tΦHxL+H1-tLΦHyL

ΦHxL

a x tx+H1-tLy y b

Fig. 3.55 Graph of a convex function φ.

Fix −∞ ≤ a < b ≤ ∞. We say that a function φ : (a, b) → R is convex on


the open interval (a, b) if

∀ a < x < y < b, ∀ 0 < t < 1, φ tx + (1 − t)y ≤ t φ(x) + (1 − t) φ(y).

In other words, as illustrated in Figure 3.55, on any subinterval [x, y] of (a, b),
the graph of φ lies on or below the line segment that joints the points (x, φ(x))
and (y, φ(y)). An analogous definition is made for concave functions.
(a) Assume φ : (a, b) → R is a convex function. Prove the Discrete Jensen
Inequality: If N ≥ 2, then for any points x1 , . . . , xN ∈ (a, b) and positive
weights t1 , . . . , tN that satisfy t1 + · · · + tN = 1, we have
X
N  N
X
φ tj xj ≤ tj φ(xj ).
j=1 j=1

(b) Given a convex function φ on (a, b) and given x ∈ (a, b), prove that
the function
φ(y) − φ(x)
m(y) = , y ∈ (a, b), y 6= x,
y−x

is monotone increasing on (a, x) and monotone increasing on (x, b). Also show
that if z < x < y, then m(z) ≤ m(y).
(c) Let φ : (a, b) → R be given. Prove that φ is convex if and only if for all
a < x < y < z < b we have

φ(y) − φ(x) φ(z) − φ(x)


≤ .
y−x z−x

(d) Show that if b1 b2 > 0, then for any a1 , a2 ∈ R we have


3.5 Uniform Convergence of Functions 51
na a o a1 + a2 na a o
1 2 1 2
min , ≤ ≤ max , .
b1 b2 b1 + b2 b1 b2
(e) Suppose that a function φ : (a, b) → R is differentiable at every point
of (a, b) and φ′ is monotone increasing on (a, b). Prove that φ is convex.
(f) Prove that the following functions are convex on the given intervals:

xp on (0, ∞), where 1 ≤ p < ∞,


eax on (−∞, ∞), where a ∈ R,
− ln x on (0, ∞).

(g) Use the Discrete Jensen Inequality and the fact that ex is convex to
give another proof of the fact that if a, b ≥ 0 and 1 < p < ∞ then

ap bp
ab ≤ + ′.
p p
3.5.79. Fix −∞ ≤ a < b ≤ ∞, and assume that φ is a convex function on
(a, b). Prove that φ has the following properties.
(a) φ is right-differentiable on (a, b), i.e., for each point x ∈ (a, b) the limit

φ(y) − φ(x)
φ′+ (x) = lim
y→x+ y−x

exists and is finite. Similarly, φ is left-differentiable at each point of (a, b).


(b) If a < x < y < b, then

φ(y) − φ(x)
φ′+ (x) ≤ ≤ φ′− (y),
y−x

and φ′− (x) ≤ φ′+ (x).


(c) φ′+ and φ′− are monotone increasing on (a, b).
(d) φ is continuous on (a, b). (In fact, φ is Lipschitz on any closed interval
[c, d] ⊆ (a, b).)
(e) φ is differentiable at all but at most countably many points in (a, b).
That is, φ′+ (x) = φ′− (x) except for countably many values of x.
Hints: (a) Apply the Mean-Value Theorem to obtain points ξ1 < ξ2 such
that φ(y)−φ(x)
y−x = φ′ (ξ1 ) and φ(z)−φ(y)
z−y = φ′ (ξ2 ). Then apply the inequalities
from part (b) to estimate min{φ′ (ξ1 ), φ′ (ξ2 )}.
(b) Part (b) of Exercise 3.5.78 shows that if z < x < y, then m(z) ≤ m(y).
(d)
 Use part (b) to show that |φ(y) − φ(x)| ≤ K |y − x| where K =
max |φ′+ (c)|, |φ′− (d)| .
52 Extra Material and Problems c 2018 Christopher Heil

(e) Suppose that y is a point of continuity for φ′+ . Use part (b) to show
that
φ(y) − φ(x)
φ′− (y) = lim− ≥ lim− φ′+ (x) = φ′+ (y).
x→y y−x x→y

3.6 Equivalent Norms

As far as convergence in a normed space X is concerned, we can replace


the norm on X with any equivalent norm without changing the meaning of
convergence in the space, or the topology for the space. To illustrate this,
let X be a finite-dimensional vector space. The next exercise constructs a
family of norms on X, analogous to the ℓp norms on Fd defined in Theorem
3.2.5, and shows that all of the norms in this family are equivalent. Hence
each of these norms determines the same convergence criterion on X, i.e.,
a sequence converges with respect to one of these norms if and only if it
converges with respect to all of the others.

Extra Problems

3.6.50. Suppose that k · ka and k · kb are norms on a vector space X, and


there exists a constant C > 0 such that kxka ≤ C kxkb for every x ∈ X.
Prove the following statements.
(a) If xn , x ∈ X and xn → x with respect to k · kb , then xn → x with
respect to k · ka .
(b) If U ⊆ X and U is open with respect to k · ka , then U is open with
respect to k · kb .

3.6.51. Let X be a vector space. Prove that equivalence of norms is an equiv-


alence relation on the class of all norms on X. In other words, prove that the
following three statements hold for all norms k · ka , k · kb , and k · kc on X
(where we write k · k1 ≍ k · k2 to mean that the two norms are equivalent).
(a) Reflexivity: k · ka ≍ k · ka .
(b) Symmetry: If k · ka ≍ k · kb , then k · kb ≍ k · ka .
(c) Transitivity: If k · ka ≍ k · kb and k · kb ≍ k · kc , then k · ka ≍ k · kc .

3.6.52. (Expansion of Problem 3.6.9).


Suppose that k · ka and k · kb are two norms on a vector space X. Let
Bra (x) and Brb (x) denote the open balls of radius r centered at x ∈ X with
respect to the norms k · ka and k · kb , respectively. Prove that the following
three statements are equivalent.
3.7 Norms on Finite-Dimensional Spaces 53

(a) k · ka ≍ k · kb .
(b) There exists some C > 0 such that
1
kxka ≤ kxkb ≤ C kxka , x ∈ X.
C
(c) There exists some r > 0 such that
a
B1/r (0) ⊆ B1b (0) ⊆ Bra (0).

3.6.53. Given 1 ≤ p < q ≤ ∞, prove that k · kp and k · kq are not equivalent


norms on ℓp .
Remark: ℓp ⊆ ℓq , so k · kq is a norm on ℓp (when restricted from ℓq to ℓp ).

3.6.54. Given 1 ≤ p < q ≤ ∞, prove that k · kp and k · kq are not equivalent


norms on C[0, 1].

3.7 Norms on Finite-Dimensional Spaces

In summary, if X is a finite-dimensional vector space, then every norm on X


is equivalent and therefore induces the same topology on X (which we call
the norm topology on X). In particular, every norm on X is equivalent to
the norm k · k2 (see Problem 3.6.5). Consequently the norm topology on X
can be identified with the usual topology on Fd , in much the same manner
as Exercise 2.8.51 showed that the topology of Cd can be identified with that
of R2d . This idea can be used to solve the following exercise, which extends
the Heine–Borel Theorem to X.

Exercise 3.7.50. Let X be a finite-dimensional normed space X. Using the


notation of Problem 3.6.5, if B = {e1 , . . . , ed } is a basis for X, then
1/2
kxk2 = |c1 (x)|2 + · · · + |cd (x)|2 , x ∈ X,

defines a norm on X. Let K be a closed and bounded subset of X, and prove


that n  o
Ke = c1 (x), . . . , cd (x) : x ∈ K

e is compact, and use this


is a closed and bounded set in Fd . Conclude that K
to prove that K is compact in X. ♦

In contrast, he following two exercises will show that if X is an infinite-


dimensional normed space then there exists a subset of X that is closed and
bounded but not sequentially compact (and hence not compact by Theo-
rem 2.8.9).
54 Extra Material and Problems c 2018 Christopher Heil

Exercise 3.7.51. Let X be a normed space, let M be a proper, closed sub-


space of X, and fix ε > 0. This exercise will prove F. Riesz’s Lemma: There
is a unit vector y ∈ X whose distance from M is at least 1 − ε.
Choose u ∈ X \M and set a = dist(u, M ) > 0. Let δ > 0 be small enough
that a/(a + δ) > 1 − ε, and let v ∈ M satisfy a ≤ ku − vk < a + δ. Set

(u − v)
y = .
ku − vk

Show that kyk = 1 and

dist(y, M ) = inf ky − xk > 1 − ε. ♦


x∈M

The next exercise can be solved by applying Exercise 3.7.51 repeatedly.


Note that by Problem 3.7.4, if x1 , . . . , xn are finitely many elements of X,
then span{x1 , . . . , xn } is closed because it is a finite-dimensional subspace
of X.

Exercise 3.7.52. Let X be any infinite-dimensional normed space. Prove


that the closed unit disk D = {x ∈ X : kxk ≤ 1} in X is not sequentially
compact, and therefore is not compact. ♦

We will study inner product spaces and Hilbert spaces in Chapter 5. If X


happens to be a Hilbert space, then the machinery of orthogonal comple-
ments can be used to give a much more direct (and elegant) solution to
Exercises 3.7.51 and 3.7.52 (see Problem 5.9.7).
3.50 Extra Section: Spaces of Differentiable Functions 55

3.50 Extra Section: Spaces of Differentiable Functions

This extra section expands on the material in Section 3.5 where the space
Cb1 (I) is introduced and discussed.
Most of Section 3.5 deals with spaces whose elements are continuous func-
tions. We will generalize some of the results obtained there to spaces of m-
times differentiable functions. For simplicity of presentation, in this section
we we will take the domain of our functions to be I = R. However, only
small modifications need to be made if we want to consider functions whose
domain is a different type interval (essentially, one-sided instead of two-sided
limits need to be used at any boundary point of the interval).
We begin with a technical issue that complicates the issue of how best to
define spaces of differentiable functions. The problem is that it is not true
that a differentiable function must have a continuous derivative. Here is an
example of a function that is differentiable at every point, yet its derivative
is not continuous.

0.2

0.1

0.1 0.2 0.3 0.4


-0.1

-0.2

1.0
0.5

0.1 0.2 0.3 0.4


-0.5
-1.0

Fig. 3.56 The function f from Example 3.50.50 (top), and its derivative f ′ (bottom), for
0 < x < 0.4. Note the differing vertical scales on the two graphs.

Example 3.50.50. Define


(
x2 sin(1/x), x 6= 0,
f (x) =
0, x = 0.

This function is continuous at every point (why?). We will prove that f ′ (x)
exists at every point x, yet f ′ is discontinuous at x = 0.
To see that f is differentiable at points other than x = 0, note that the
function 1/x is differentiable on (0, ∞) and sin x is differentiable on all of R.
56 Extra Material and Problems c 2018 Christopher Heil

Therefore their composition sin(1/x) is differentiable on (0, ∞). Since x2 is


also differentiable everywhere, it follows that the product f (x) = x2 sin(1/x)
is differentiable on (0, ∞). Applying the product rule and the chain rule, the
derivative of f on this interval is
1  1  1  1 1
f ′ (x) = 2x sin + x2 cos − 2 = 2x sin − cos . (3.33)
x x x x x
A similar calculation shows that f is differentiable on (−∞, 0), and f ′ (x) is
given by equation (3.33) for x ∈ (−∞, 0).
What about the point x = 0? Here we use the definition of the derivative
to show that f ′ (0) exists:

f (0 + t) − f (0) t2 sin(1/t)
f ′ (0) = lim = lim = lim t sin(1/t) = 0.
t→0 t−0 t→0 t t→0

Thus f is differentiable at every point x ∈ R.


The graphs of f and f ′ are pictured in Figure 3.56. Considering the figure,
it appears that f ′ (x) cannot converge to any value as x → 0, because f ′ (x)
oscillates more and more as x approaches zero. Indeed, applying equation
(3.33), we see that
 1 1
lim f ′ (x) = lim 2x sin − cos does not exist.
x→0 x→0 x x
Therefore, even though f ′ (0) exists, the derivative function f ′ is not contin-
uous at the point x = 0. ♦

For this and other reasons, the space of all differentiable functions can be
difficult to work with. Usually we impose some restrictions to create a “nicer”
space. For example, we can consider the space of all differentiable functions
which have a continuous derivative. We call this space C 1 (R). Extending to
higher orders, we let C m (R) denote the set of all of all m-times differentiable
functions on R such that f and each derivative f ′ , . . . , f (m) is continuous,
i.e., 
C m (R) = f ∈ C(R) : f, f ′ , . . . , f (m) ∈ C(R) .
On the preceding line, when we write f, f ′ , . . . , f (m) ∈ C(R) we implicitly
mean that f, f ′ , . . . , f (m) all exist (i.e., f is m-times differentiable) and each
of these functions belongs to C(R).
The space C m (R) is still too “large” to define a convenient norm, so we
look at the following subspaces of C m (R):

Cbm (R) = f ∈ Cb (R) : f, f ′ , . . . , f (m) ∈ Cb (R) ,

C0m (R) = f ∈ C0 (R) : f, f ′ , . . . , f (m) ∈ C0 (R) ,

Ccm (R) = f ∈ Cc (R) : f, f ′ , . . . , f (m) ∈ Cc (R) .
3.50 Extra Section: Spaces of Differentiable Functions 57

That is, Cbm (R) is the space of m-times differentiable functions such that
f and each derivative f ′ , . . . , f (m) is continuous and bounded, and so forth.
According to Problem 3.50.53, these spaces are nested in the following way:

Ccm (R) ( C0m (R) ( Cbm (R) ( C m (R). (3.34)

The inclusions on the preceding line are proper. For example, the function
f (x) = sin x belongs to Cbm (R) because it and every derivative is bounded,
but f ∈/ C0m (R). Problem 3.50.53 asks for examples that show that the other
inclusions in equation (3.34) are also proper.
After giving some examples, we will define the “standard norm” for the
subspaces Cbm (R), C0m (R), and Ccm (R), and determine which of these are
complete with respect to that norm. The hardest case is actually m = 1;
once we understand that case we can extend to higher orders by induction.
Sometimes it is useful to consider analogous spaces of infinitely differen-
tiable functions. Although these are not normed spaces, we make the following
definitions:

C ∞ (R) = f ∈ C(R) : f, f ′ , . . . ∈ C(R) ,

Cb∞ (R) = f ∈ Cb (R) : f, f ′ , . . . ∈ Cb (R) ,

C0∞ (R) = f ∈ C0 (R) : f, f ′ , . . . ∈ C0 (R) ,

Cc∞ (R) = f ∈ Cc (R) : f, f ′ , . . . ∈ Cc (R) .

It is not obvious that Cc∞ (R) 6= {0}. Some examples of nontrivial elements
of Cc∞ (R) are constructed in Problem 2.10.6.

The Norm for Cb1 (R)

There are many ways to define a norm on Cb1 (R). For example, since Cb1 (R)
is a subspace of Cb (R), the uniform norm is a norm on Cb1 (R). However,
the following exercise shows that Cb1 (R) is not complete with respect to the
uniform norm.

Exercise 3.50.51. The hat function or tent function on the interval [−1, 1]
is the piecewise linear function W pictured in Figure 3.57. Explicitly,


1 − x, 0 ≤ x ≤ 1,

W (x) = max 1 − |x|, 0 = 1 + x, −1 ≤ x ≤ 0, (3.35)


0, |x| ≥ 1.

The hat function is continuous, but it is not differentiable at every point. We


will construct a differentiable “approximation” to W.
58 Extra Material and Problems c 2018 Christopher Heil

W
0.5

-1.5 -1.0 -0.5 0.5 1.0 1.5

Fig. 3.57 The hat function W from Example 3.50.51.

Omitting the points −1, 0, or 1 where W is not differentiable, the derivative


of W is 

1, −1 < x < 0,

W (x) = −1, 0 < x < 1,


0, |x| > 1.
Create a continuous approximation to W ′ by setting


0, x < 1 − k1 ,



linear, −1 − 1
≤ x < −1,

 k


1,
 −1 ≤ x < − k1 ,

fk (x) = linear, − k1 ≤ x < k1 ,



−1, 1
≤ x < 1,

 k




linear, 1 ≤ x < 1 + k1 ,


0, x ≥ 1 + k1 .

The function f5 is pictured in Figure 3.58.

1
f5
0.5

-1.5 -1.0 -0.5 0.5 1.0 1.5

-0.5

-1

Fig. 3.58 The function f5 from Example 3.50.51.

Now let gk be the following indefinite integral of fk :


3.50 Extra Section: Spaces of Differentiable Functions 59
Z x
gk (x) = fk (t) dt, x ∈ R.
−2

The function g5 is pictured in Figure 3.59. Looking at its graph, it appears


that g5 is a good approximation to W. Prove the following statements.
(a) gk ∈ Cc1 (R).
(b) gk converges uniformly to W as k → ∞.
(c) {gk }k∈N is a sequence in Cb1 (R) that is Cauchy with respect to the uniform
norm, but this sequence does not converge uniformly to an element of
Cb1 (R).
(d) Cb1 (R) is not complete with respect to the uniform norm. ♦

g5
0.5

-1.5 -1.0 -0.5 0.5 1.0 1.5

Fig. 3.59 The function g5 from Example 3.50.51, with the hat function W in dashed lines
for comparison.

In summary, k · ku is a norm on Cb1 (R), but Cb1 (R) is not complete with
respect to this norm. We need to construct a different norm that incorporates
information about the derivative. As a first attempt, let us define

|||f ||| = kf ′ ku , f ∈ Cb1 (R). (3.36)

Unfortunately, this does not even define a norm on Cb1 (R). For example, the
constant function f = 1 satisfies

|||f ||| = kf ′ ku = 0,

even though f is not the zero function. Hence ||| · ||| does not satisfy the
uniqueness requirement of a norm.
The “correct” definition of a norm on Cb1 (R) should incorporate informa-
tion about both f and f ′ . We can do this by setting

kf kCb1 = kf ku + kf ′ ku , f ∈ Cb1 (R). (3.37)

According to Problem 3.5.25, this is a norm on Cb1 (R), and Cb1 (R) is com-
plete with respect to this norm.
60 Extra Material and Problems c 2018 Christopher Heil

The case m = 1 (which is Problem 3.5.25) is actually the hard part. As-
suming that problem has been done, we can extend to higher-order derivatives
by induction. Given m ∈ N, the norm on Cbm (R) is defined by

kf kCbm = kf ku + kf ′ ku + · · · + kf (m) ku , f ∈ Cbm (R).

Exercise 3.50.52. Use induction to prove that Cbm (R) is a Banach space
with respect to the norm k · kCbm . ♦

Problems 3.50.56 and 3.50.57 consider whether the spaces C0m (R) and
Ccm (R) are complete with respect to the norm k · kCbm (yes for C0m (R), but
no for Ccm (R)).

Extra Problems

3.50.53. Let m ∈ N be fixed.


(a) Exhibit a function in C m (R) that does not belong to Cbm (R).
(b) Exhibit a function in Cbm (R) that does not belong to C0m (R).
2
(c) Let φ(x) = e−x be the Gaussian function. Show that for each n ∈ N
2
there exists a polynomial of degree n such that φ(n) (x) = pn (x) e−x . Prove
m m
/ Cc (R).
that φ ∈ C0 (R), but φ ∈

3.50.54. A function f ∈ C ∞ (R) is said to be real analytic at the origin if its


Taylor series converges to f (x) for all x in some neighborhood of 0. In other
words, there must exist a δ > 0 such that
X∞
f (n) (0) n
f (x) = x , |x| < δ.
n=0
n!

Prove that the function γ constructed in Problem 2.10.6 is not real analytic
at the origin.

3.50.55. Define
( 2
−1/x2 e−1/x , x > 0,
γ(x) = e χ(0,∞) (x) =
0, x ≤ 0.

Prove the following statements.


(a) For each n ∈ N, there exists a polynomial pn of degree 3n such that
−2
γ (n) (x) = pn (x−1 ) e−x χ(0,∞) (x).

(b) γ ∈ C ∞ (R) and γ (n) (0) = 0 for every n ≥ 0.


3.50 Extra Section: Spaces of Differentiable Functions 61

(c) If a < b then the function f (x) = γ(x − a) γ(b − x) belongs to Cc∞ (R)
and satisfies f (x) > 0 for x ∈ (a, b) and f (x) = 0 for x ∈
/ (a, b).

3.50.56. Given m ∈ N, prove that C0m (R) is a Banach space with respect to
the norm k · kCbm .

3.50.57. Let g be any function in C0m (R) such that g ∈ / Ccm (R). Let θ be
the function constructed in Problem 2.10.7 (so θ is infinitely differentiable,
compactly supported, and identically 1 on [−1, 1]). For each k ∈ N, set

gk (x) = g(x) θ(x/k).

Prove the following statements.


(a) gk ∈ Ccm (R).
(b) limk→∞ kg − gk kCbm = 0.
(c) {gk }k∈N is a sequence in Ccm (R) that is Cauchy with respect to the
norm k · kCbm , but this sequence does not converge to an element of Ccm (R).
(d) Ccm (R) is not complete with respect to k · kCbm .
62 Extra Material and Problems c 2018 Christopher Heil

3.51 Extra Section: A Metric for C(R)

This extra section expands on Problem 3.5.70.


Since functions in C(R) can be unbounded, the uniform norm k · ku is not
a norm on C(R). On the other hand, each function f ∈ C(R) is bounded on
each bounded closed interval [a, b]. Consequently, if we fix an integer N ∈ N,
then the quantity
|||f |||N = sup |f (x)|
|x|≤N

is finite for every f ∈ C(R). In fact, considering a single, fixed N, the function
||| · |||N is nonnegative, homogeneous, and satisfies the Triangle Inequality, so
it is a seminorm on C(R) in the sense of Definition 3.1.1. However, it is not
a norm, because there exist functions in C(R) other than the zero function
that satisfy |||f |||N = 0.

Remark 3.51.50. Let χ[−N,N ] denote the characteristic function of the inter-
val [−N, N ], i.e., (
1, |x| ≤ N,
χ[−N,N ] (x) =
0, |x| > N.
Then f · χ[−N,N ] is the function that equals f on [−N, N ] and is zero outside
of [−N, N ]. Therefore, an equivalent way to write ||| · |||N is

|||f |||N = kf · χ[−N,N ]ku . (3.38)

That is, the seminorm |||f |||N is simply the uniform norm of the restriction
of f to the interval [−N, N ]. If f is zero on [−N, N ] but nonzero somewhere
outside of that interval, then we will have |||f |||N = 0 even though f is not
the zero function. ♦

We would like to collect all of these seminorms ||| · |||N together to create a
norm or metric on C(R). We could try summing over N, but unfortunately
the series ∞ ∞
X X
|||f |||N or even 2−N |||f |||N
N =1 N =1

need not converge for every f ∈ C(R). On the other hand, no matter what
f ∈ C(R) we choose we always have

|||f |||N
≤ 1, all N ∈ N,
1 + |||f |||N

so we can define

X |||f |||N
kf k = 2−N , f ∈ C(R).
1 + |||f |||N
N =1
3.51 Extra Section: A Metric for C(R) 63

This series converges for every f ∈ C(R), but does it define a norm? Unfortu-
nately, the answer is no, because kcf k 6= |c| kf k in general. Even so, Problem
3.5.70 shows that

X |||f − g|||N
d(f, g) = kf − gk = 2−N , f, g ∈ C(R), (3.39)
1 + |||f − g|||N
N =1

is a metric on C(R).
What kind of convergence criterion does the metric d correspond to? That
is, what does it mean for fk to converge to f with respect to the metric d?
Instead of uniform convergence, we will see that convergence with respect to
d is the type of convergence that is given in the following definition.

Definition 3.51.51 (Uniform Convergence on Compact Sets). Let fk


and f be functions in C(R). If fk converges uniformly to f on the interval
[−N, N ] for each N ∈ N, then we say that fk converges uniformly to f on
compact sets. ♦

Example 3.51.52. Let f = 1, i.e., f is the constant function that is identi-


cally 1. Let fk be a continuous “cutoff” of f, similar to the cutoff illustrated
in Figure 8.41 for a generic continuous function g. e.g.,

1,
 |x| ≤ k,
fk (x) = linear, on [−k − 1, −k] and on [k, k + 1],


0, |x| ≥ k + 1.

If we consider a particular fixed N, then for all k ≥ N we have fk = f on the


interval [−N, N ]. Hence fk converges uniformly to f on the (fixed) interval
[−N, N ]. This is true for each individual N, so Definition 3.51.51 says that
fk converges uniformly to f on compact sets.

gn

-n n n+1

Fig. 3.60 A cutoff gn for a function g.

On the other hand, fk does not converge uniformly to f, because for every
k we have
64 Extra Material and Problems c 2018 Christopher Heil

kf − fk ku = sup |f (x) − fk (x)| ≥ |f (k + 2) − fk (k + 2)| = 1 − 0 = 1.


x∈R

Thus kf − fk ku does not converge to zero, so fk does not converge uniformly


to f, even though fk does converge uniformly on compact sets to f. ♦

The following lemma shows that convergence with respect to the metric d
is precisely the same as uniform convergence on compact sets.

Lemma 3.51.53. Given functions fk , f ∈ C(R), the following two state-


ments are equivalent.
(a) fk converges to f with respect to the metric d, i.e.,

lim d(fk , f ) = 0.
k→∞

(b) fk converges uniformly to f on compact sets, i.e.,

∀ N ∈ N, lim k(f − fk ) · χ[−N,N ]ku = 0.


k→∞

Proof. (a) ⇒ (b). Suppose that d(fk , f ) → 0 as k → ∞. Given N ∈ N, we


must show that |||f − fk |||N → 0 as k → ∞.
Choose ε > 0, and set
ε
η = N .
2 (1 + ε)
Since η > 0 and d(fk , f ) → 0, there must exist some K > 0 such that

k > K =⇒ d(fk , f ) < η.

Therefore, for k > K we have

|||f − fk |||N |||f − fk |||N


= 2N 2−N
1 + |||f − fk |||N 1 + |||f − fk |||N

X |||f − fk |||M
≤ 2N 2−M
1 + |||f − fk |||M
M=1

= 2N d(f − fk )
< 2N η
ε
= .
1+ε
x
Since the function 1+x is monotone increasing on (0, ∞), it follows that for
all k > K we have
|||f − fk |||N < ε.
This proves that |||f − fk |||N → 0 as k → ∞.
3.51 Extra Section: A Metric for C(R) 65

(b) ⇒ (a). Suppose that |||f − fk |||N → 0 as k → ∞ for every N ∈ N. Fix


ε > 0, and let M be large enough that

X
2−N < ε.
N =M+1

Then let K be large enough that

k>K =⇒ |||f − fk |||N < ε, N = 1, . . . , M.

Then for all k > K we have


M
X ∞
X
−N |||f − fk |||N |||f − fk |||N
d(fk , f ) = 2 + 2−N
1 + |||f − fk |||N 1 + |||f − fk |||N
N =1 N =M+1

M
X ∞
X
ε
≤ 2−N + 2−N · 1
1+0
N =1 N =M+1

X
≤ ε 2−N + ε
N =1

= 2ε.

Hence d(fk , f ) → 0 as k → ∞. ⊓

Example 3.51.52 shows that a sequence that converges uniformly on com-
pact sets need not converge uniformly. On the other hand, if fk converges
uniformly to f then we have kf − fk ku → 0 as n → ∞. Therefore, for any
particular N,

k(f − fk ) · χ[−N,N ]ku = sup |f (x) − fk (x)|


|x|≤N

≤ sup |f (x) − fk (x)|


x∈R

= kf − fk ku
→ 0 as k → ∞.

Consequently fk converges uniformly on compact sets. Thus uniform con-


vergence implies uniform convergence on compact sets, but Example 3.51.52
shows that the converse implication can fail. Here is another example that
illustrates the difference between these two types of convergence.
Example 3.51.54 (Triangles Marching to Infinity). Let W be the hat function
on the interval [−1, 1] that was introduced in equation (3.35). The hat func-
tion belongs to C(R); in fact, since it is compact supported, it is an element
of Cc (R).
66 Extra Material and Problems c 2018 Christopher Heil

1.0
W5 W6
0.8

0.6

0.4

0.2

0.0
0 1 2 3 4 5 6 7

Fig. 3.61 The functions Wk , k ≥ 5, from Example 3.51.54, vanish on the interval [−4, 4].

The function
Wk (x) = W (x − k)
is simply the hat function translated to the right by k units. If we fix an
integer N > 0, then for every k > N we have Wk = 0 on the interval
[−N, N ]. Consequently, as shown in Figure 3.61, kWk · χ[−N,N ] ku = 0 for
k > N. Therefore

∀ N ∈ N, lim kWk · χ[−N,N ]ku = 0.


k→∞

So, we conclude that Wk converges uniformly on compact sets to the zero


function. However,

kWk − 0ku = kWk ku = 1 for every k,

so Wk does not converge uniformly to the zero function. ♦

In summary, we have created a metric on C(R), and the convergence notion


associated with that metric is convergence on compact sets. According to
Problem 3.51.56, C(R) is complete with respect to this metric. However, we
do not call C(R) a Banach space, because it is not complete with respect to
a norm. Instead, using the terminology of topological vector spaces, C(R) is
a Fréchet space.
We saw in Problem 3.3.14 that if p < 1 then the metric on ℓp is not
induced from any norm on ℓp . It can be shown that the metric d on C(R) is
not induced from any norm on C(R). However, in contrast to ℓp when p < 1,
the issue for C(R) is not a lack of convexity.

Extra Problems

3.51.55. Given a function f ∈ C(R), prove that there exist functions fk ∈


Cc (R) such that fk converges uniformly to f on compact sets.
3.51 Extra Section: A Metric for C(R) 67

Remark: Using the terminology of Definition 2.6.3, this says that Cc (R) is
dense in C(R) with respect to the metric d.

3.51.56. Prove that C(R) is complete with respect to the metric d defined
in equation (3.39).

3.51.57. Given integers n ≥ 0 and N > 0, define

|||f |||N,n = kf (n) · χ[−N,N ]ku , f ∈ C ∞ (R).

Prove that
∞ X
X ∞
|||f − g|||N,n
d(f, g) = 2−n−N , f, g ∈ C ∞ (R),
n=0 N =1
1 + |||f − g|||N,n

defines a metric on C ∞ (R).


68 Extra Material and Problems c 2018 Christopher Heil

3.52 Extra Section: Homeomorphisms

We saw in Example 2.9.53 that a continuous function need not map an open
set to an open set. We are only guaranteed that the inverse image of an open
set under a continuous function is open. Now, if f : X → Y is a bijection, then
it has an inverse function f −1 : Y → X. If the inverse function is continuous,
then the inverse image of an open set under f −1 will be open. As we observe
in the following lemma, the inverse image of a set under f −1 is simply the
direct image of the set under f. So if f and f −1 are both continuous, then
both direct and inverse images of open sets will be open. Here is the precise
statement and proof of this fact.
Lemma 3.52.50. Let X and Y be metric spaces, and suppose that f : X → Y
is a bijection. If f and its inverse function f −1 are both continuous, then

U is open in X =⇒ f (U ) is open in Y (3.40)

and
V is open in Y =⇒ f −1 (V ) is open in X. (3.41)
Proof. Equation (3.41) is an immediate consequence of the fact that f is
continuous. To prove equation (3.40), let U be any open subset of X. Then,
since f −1 : Y → X is continuous, the inverse image of U under the function
f −1 must be open. Because f is a bijection, this inverse image is simply

(f −1 )−1 (U ) = U.

Therefore U is open in X. ⊓

Thus, if f and f −1 are both continuous, then f and f −1 both preserve
topological structure, as they each map open sets to open sets. We have a
special name for such functions.
Definition 3.52.51 (Homeomorphism). Let X, Y be metric spaces.
(a) We say that f : X → Y is a homeomorphism if f is a bijection and both
f and its inverse function f −1 are continuous.
(b) We say that X and Y are homeomorphic metric spaces if there exists a
homeomorphism f : X → Y. ♦
If f : X → Y is a homeomorphism, then f maps the open sets in X to the
open sets in Y, and f −1 does the same in the reverse direction. In essence,
homeomorphic spaces have the “same topological structure.” The following
exercise gives an example.
Exercise 3.52.52. Define f : R2d → Cd by

f (x1 , y1 , . . . , xd , yd ) = (x1 + iy1 , . . . , xd + iyd )

for x = (x1 , y1 , . . . , xd , yd ) ∈ R2d . Prove that f is a homeomorphism. ♦


3.52 Extra Section: Homeomorphisms 69

Here is a rather different example of a homeomorphism. The domain and


codomain for the function Ta in this example are Cb (R), which is the set of all
bounded, continuous functions f : R → F. The standard norm for this space
is the uniform norm k · ku .

Example 3.52.53. For each fixed real number a ∈ R we will define a function
Ta : Cb (R) → Cb (R). That is, Ta will maps elements of Cb (R) to elements
of Cb (R). In other words, Ta is a kind of “meta-function” that inputs a
function g ∈ Cb (R) and outputs a new function in Cb (R). Using normal
function notation we would denote this output by Ta (g), but in order to
avoid multiplicities of parentheses we will simply write Ta g instead of Ta (g).
We need to declare the rule for Ta , i.e., we must define how Ta will take
the input g and determine what the corresponding output Ta g is. What we
have in mind for Ta is that it will simply translate the graph of the input g
by a units to give us Ta g. That is, given g ∈ Cb (R), we declare that Ta g is
the function whose graph is the identical to the graph of g except that it has
been shifted to the right by a units. Explicitly,

Ta g(x) = g(x − a), x ∈ R.

For example, if g is the function g(x) = sin x, then Ta g is the function

Ta g(x) = sin(x − a) = sin x cos a − cos x sin a,


2
while if g(x) = e−x then
2 2
(x−a)2 x2 −2ax+a2 ea ex
Ta g(x) = e = e = .
e2ax
We refer to Ta as a translation operator (see the illustration in Figure 3.62).

1.0
0.8
g 0.6 T5 g
0.4
0.2

-4 -2 2 4 6 8 10

Fig. 3.62 A function g (solid) and its translation T5 g (dashed).

We will prove that Ta : Cb (R) → Cb (R) is a homeomorphism. To do this,


we must show that Ta is a bijection, and both Ta and its inverse (Ta )−1 are
continuous.
70 Extra Material and Problems c 2018 Christopher Heil

Proof that Ta is injective. To prove that Ta is injective, we must show


that if Ta f = Ta g, then f = g. So, suppose that functions f, g ∈ Cb (R) are
such that Ta f = Ta g. This means that Ta f and Ta g are the same function in
Cb (R). In other words, Ta f (t) = Ta g(t) for every t ∈ R. Choose any particular
point x ∈ R, and let t = x + a. Then

Ta f (t) = f (t − a) = f (x)

and similarly
Ta g(t) = g(t − a) = g(x).
But Ta f (t) = Ta g(t), so this implies that f (x) = g(x). Since x was an ar-
bitrary real number, we have shown that f and g take the same values at
every x, and therefore they are they same function. That is, f = g. Therefore
we have proved that Ta is injective.
Proof that Ta is surjective. We must prove that the range of Ta is all
of Cb (R). So, choose any function g ∈ Cb (R). We must show that f is in
the range of Ta . In other words, we must prove that g = Ta f for some
function f, i.e., we want g to be the function f shifted a units to the right.
The appropriate function f is therefore g shifted a units to the left. That is,
f is the function f (x) = g(x + a). This function f belongs to Cb (R), and for
every x we have
Ta f (x) = f (x − a) = g(x).
Note that this discussion also reveals what (Ta )−1 is. The inverse of shifting a
units to the right is simply shifting a units to the left, so

(Ta )−1 = T−a .

The inverse of the translation operator Ta is the translation operator T−a .


Proof that Ta is continuous. Before proving continuity, we observe that the
uniform norm of a function is not changed by translation, because

kTa gku = sup |Ta g(x)| = sup |g(x − a)| = sup |g(t)| = kgku. (3.42)
x∈R x∈R t∈R

We will use this translation-invariance property of the uniform norm to prove


that Ta is continuous.
Exercise 2.9.52 tells us that one way to prove that Ta is continuous is to
show that if gn → g in Cb (R), then Ta gn → Ta g in Cb (R). So, suppose that
gn → g in Cb (R). This means that gn converges to g uniformly, i.e.,

lim kg − gn ku = 0.
n→∞

But then, by the translation-invariance of the uniform norm and the fact that
Ta (g − gn ) = Ta g − Ta gn , we see that
3.52 Extra Section: Homeomorphisms 71

lim kTa g − Ta gn ku = lim kTa (g − gn )ku (3.43)


n→∞ n→∞

= lim kg − gn ku
n→∞

= 0.

Thus Ta gn → Ta g in Cb (R), and therefore Ta is continuous.


Proof that (Ta )−1 is continuous. Since (Ta )−1 = T−a , the same argument
as given for the continuity of Ta shows that (Ta )−1 is continuous. ♦

Thus, Ta is a homeomorphism that maps Cb (R) onto itself, i.e., it is a


continuous bijection whose inverse is also continuous. An important property
of Ta , which we made use of in equation (3.43), is that Ta is linear, which
means that

Ta (cf + dg) = c Ta f + d Ta g, all c, d ∈ F, f, g ∈ Cb (R).

In Section 6.1 we will see much simpler ways to prove that a linear function
is continuous.

Remark 3.52.54. We will show in Theorem 6.3.1 that if X and Y are normed
spaces and X is finite-dimensional, then every linear function F : X → Y is
continuous. However, if X is infinite-dimensional, then there exist linear func-
tions F : X → Y that are not continuous! An example of such a discontinuous
linear function appears in Example 6.3.4. ♦

Extra Problems

3.52.55. Prove that the property “X is homeomorphic to Y ” is an equiva-


lence relation on the class of all metric spaces. In other words, prove that the
following three statements hold for all metric spaces X, Y, and Z.
(a) Reflexivity: (X, dX ) is homeomorphic to (X, dX ).
(b) Symmetry: If (X, dX ) is homeomorphic to (Y, dY ), then (Y, dY ) is
homeomorphic to (X, dX ).
(c) Transitivity: If (X, dX ) is homeomorphic to (Y, dY ) and (Y, dY ) is
homeomorphic to (Z, dZ ), then (X, dX ) is homeomorphic to (Z, dZ ).

3.52.56. (a) Given f ∈ Cb (R), define Lf to be the function

(Lf )(x) = f (x) sin x, x ∈ R.

Prove that L : Cb (R) → Cb (R) is continuous and injective, but is not a home-
omorphism.
(b) Now define M : Cb (R) → Cb (R) by
72 Extra Material and Problems c 2018 Christopher Heil

(M f )(x) = f (x) (2 + sin x), x ∈ R.

Is M a homeomorphism? Is kM f ku = kf ku for every function f ? Compare


this to equation (3.42).

3.52.57. Let X be a normed space, and let z be a point in X.


(a) Define Fz : X → X by

Fz (x) = x + z, x ∈ X.

Prove that F is a homeomorphism.


(b) Suppose that X = Cb (R). In this case, what is z? Is Fz the same
homeomorphism as the function Ta defined in Example 3.52.53, i.e., is Fz =
Ta for some a? If not, exhibit a function f ∈ Cb (R) such that Fz (f ) 6= Ta (f ).

3.52.58. Given a > 0 and f ∈ Cb (R), define Da f to be the function

Da f (x) = f (ax), x ∈ R.

We call Da f the dilation of f by a (although if a > 1 then it is really a


“compression” rather than a “dilation”). Prove that Da : Cb (R) → Cb (R) is
a homeomorphism. Is Da uniformly continuous?

3.52.59. Given f ∈ Cb1 (R), let Df = f ′ be the derivative of f. By definition of


the space Cb1 (R), the derivative f ′ is continuous and bounded, and therefore
belongs to Cb (R). The standard norm for Cb1 (R) is kf kCb1 = kf ku + kf ′ ku ;
see Section 3.50.
(a) Prove that D : Cb1 (R) → Cb (R) is continuous, but is neither injective
nor surjective, and therefore is not a homeomorphism.
(b) Now we restrict the domain of D to the space C01 (R). Prove that
D : C01 (R) → C0 (R) is continuous and injective, but is not surjective and
therefore is not a homeomorphism.
(c) Finally, replace the standard norm for Cb1 (R) with the uniform norm.
Prove that D : (Cb1 (R), k · ku ) → Cb (R) is not continuous.
4.1 Infinite Series in Normed Spaces 73

CHAPTER 4: FURTHER RESULTS ON BANACH SPACES

4.1 Infinite Series in Normed Spaces

Example 4.1.50. Let x be set xn = 2−n x.


Pany vector in a normed space X, andP
We will show that x = xn . The partial sums of the series xn are
N
X N
X 2N − 1
sN = xn = 2−n x = x.
n=1 n=1
2N

Since
2N − 1
x − sN = x − x = 2−N x,
2N
we have

lim kx − sN k = lim k2−N xk = lim 2−N kxk = 0.


N →∞ N →∞ N →∞

Thus sN converges to x with respect to the norm of X, so Definition 4.1.1


tells us that
X∞ ∞
X
x = xn = 2−n x. ♦
n=1 n=1
P∞ n
Example 4.1.51. Consider the series n=1 (−1) 1
n δ1 in X = ℓ (note that we
use δn in each term of this series, not δn ). Is this a convergent series in
the space ℓ1 ? Before answering this, recall the alternating harmonic series
P∞ (−1)n
n=1 n . We know that this infinite series of scalars converges and equals
− ln 2 = ln(1/2). This means that its N th partial sum

XN
(−1)n
tN =
n=1
n

converges to the value − ln 2 as N increases.


P∞ n
The N th partial sum of the series n=1 (−1) n δ1 is

XN XN 
(−1) (−1)n
sN = δ1 = , 0, 0, . . . = (tN , 0, 0, . . . ).
n=1
n n=1
n

Therefore, if we let x = (− ln 2, 0, 0, . . . ), then

kx − sN k1 = (− ln 2 − tN , 0, 0, . . . ) 1
= | − ln 2 − tN | → 0 as N → ∞.
74 Extra Material and Problems c 2018 Christopher Heil
P∞ n
Hence sN does converge to x in ℓ1 -norm, so n=1 (−1)n δ1 converges in ℓ
1
p
and equals x. The reader should check that this series converges in ℓ for
every index 1 ≤ p ≤ ∞. ♦

Extra Problems

4.1.52. Let X be a nontrivial normed vector space (possibly incomplete).


(a)PShow that there exist vectors xn ∈ X, with xn 6= 0 for every n, such
that xn converges.
P
(b) Show that there exist vectors yn ∈ X such that the series yn does
not converge.

4.2 Absolute Convergence of Series

Let {xn }n∈N be a sequence of vectors in a normed space X. Then {kxn k}n∈N
is a sequence of real scalars, and we canPwonder whether there is a connection
between the convergencePof the series xn in X and the convergence of the
series of real numbers kxn k. To address this, let us give names to the
partial sums of these two series, say
N
X N
X
sN = xn and tN = kxn k.
n=1 n=1

It follows from the Triangle Inequality that ksN k ≤ |tN |, but this inequality
does not give us any useful information about convergence. On the other
hand, if M < N then we can use the Triangle Inequality to compute that
N
X N
X
ksN − sM k = xn ≤ kxn k = |tN − tM |.
n=M+1 n=M+1

This tells us that if {tN }N ∈N is a Cauchy sequence of real scalars, then


{sN }N ∈N is a Cauchy sequence of vectors in X. Now, all Cauchy sequences
in a complete space converge, so if X is complete then we have the following
implications:
4.2 Absolute Convergence of Series 75


X
kxn k converges =⇒ {tN }N ∈N is Cauchy in R (4.44)
n=1
=⇒ {sN }N ∈N is Cauchy in X (4.45)

X
=⇒ xn converges. (4.46)
if X is complete
n=1

However, if X is not complete then this chain of implication breaks down


at the last step. In an incomplete space, even if we P know that {sN }N ∈N is
a Cauchy sequence, we cannot conclude that P that xn converges! Even so,
we see that the issue
P of whether the series kxn k converges is important.
Suppose that xn is an absolutely
P convergent series in X. Definition 4.2.2
does not say that the series xn converges in X! However, the results that
we will obtain in Section 3.7 imply that counterexamples only exist in infinite-
dimensional normed spaces, so we cannot yet give a good example. On the
other hand, we can illustrate the idea by considering X = Q. Technically,
this will not give us a true counterexample because Q is not a vector space
over R or C (so Q is not a normed space in the sense of Definition 3.1.1).
Still, we will use Q to illustrate the difference between absolute convergence
and convergence of a series.

Example 4.2.50. Define a sequence of rational numbers (cn )n∈N by expanding


the decimal representation of the irrational number π = 3.14159265 . . . in the
following way:

c1 = 3,
c2 = 0.1,
c3 = 0.04,
c4 = 0.001,
c5 = 0.0005,
P P
and so forth. Since |cn | < ∞, the series cn converges absolutely. How-
P
ever, the series cn does not converge in the set Q. The partial sums
PN
sN = n=1 c n are rational numbers that converge to the real number π,
but the limit π does not belong to Q. ♦

Extra Problems

4.2.51. Give an explicit example of functions fn ∈ Cc (R) such that the series
P ∞
n=1 fn converges absolutely in Cc (R) with respect to the uniform norm,
but the series does not converge to an element of Cc (R).
76 Extra Material and Problems c 2018 Christopher Heil

4.2.52. Let M be a subspace of a normed space X. Prove that the closure


of M equals the set of all absolutely convergent series of elements of M :
X
∞ ∞
X 
M = xn : xn ∈ M, kxn k < ∞ .
n=1 n=1

Hint: If x ∈ M , then there exist vectors xn ∈ M such that xn → x. By


Problem 2.2.24, there is a subsequence {xnk }k∈N such that kxnk+1 − xnk k <
2−k for each k ∈ N.
Question: Why does this problem not imply that equality holds in equation
(4.16)?

4.2.53. Let I be an interval in R, and let k · kLip be the norm on Lip(I)


defined in Problem 4.2.15. Fix a point x0 ∈ I, and prove that

|f (x) − f (y)|
|||f ||| = |f (x0 )| + sup
x6=y |x − y|

is an equivalent norm for Lip(I).

4.3 Unconditional Convergence of Series

Unconditionally Convergent Series of Scalars

To illustrate conditional and unconditional convergence, we will first consider


series of scalars, i.e., infinite series in the Banach space X = F. Here is
an example of a convergent series of real numbers that does not converge
unconditionally—if we change the ordering of the series, then the new series
may converge to a different value, or may not converge at all!
P∞
Example 4.3.50. The harmonic series n=1 n1 does not converge, as its par-
tial sums are unbounded. On the other hand, the alternating harmonic series
X∞
(−1)n
n=1
n

does converge (in fact, its partial sums converge to ln 12 , the natural loga-
rithm of 12 ). Although the alternating harmonic series converges, it does not
converge absolutely in the sense of Definition 4.2.2.
We will show that the alternating harmonic series does not converge un-
1 1
conditionally either. To do this, set pn = 2n−1 and qn = 2n , i.e., the pn are
the positive terms from the alternating series and the qn are the absolute
P P
values of the negative terms. Each series pn and qn diverges to infinity.
4.3 Unconditional Convergence of Series 77

Hence there must exist an integer m1 > 0 such that

p1 + · · · + pm1 > 1.

Also, there must exist an integer m2 > m1 such that

p1 + · · · + pm1 − q1 + pm1 +1 + · · · + pm2 > 2.

Continuing in this way, we can create a rearrangement

p1 + · · · + pm1 − q1 + pm1 +1 + · · · + pm2 − q2 + · · ·


P (−1)n
of n that diverges to +∞. Likewise, we can construct a rearrange-
ment that diverges to −∞, converges to any given real number r, or simply
oscillates without ever converging (see Problem 4.3.6). Hence the series

X∞
(−1)σ(n)
n=1
σ(n)

converges for some bijections σ : N → N but not for others, which says that
the alternating harmonic series converges conditionally. ♦

Unconditional convergence of an infinite series is highly desirable because


it says that the particular choice of ordering of the index set is irrelevant—if
a series converges unconditionally then it will converge no matter how we
choose to order the terms, and each reordering of the series will converge to
the same limit. Series that converge conditionally (i.e., they converge, but
not unconditionally) are surprisingly difficult to deal with, as the following
example indicates.
P∞ n
Example 4.3.51. We know that the alternating harmonic series n=1 (−1) n
converges and equals − ln 2. For each n ∈ N let
1 1
pn = and qn = ,
2n 2n − 1
so the alternating series is
X∞
(−1)n
= −q1 + p1 − q2 + p2 − · · · = − ln 2 = −0.693147 . . . .
n=1
n

We will show that there is a rearrangement of this series that converges to


the real number
P∞ π = 3.14159P
....

The series n=1Ppn and n=1 qn both diverge to infinity. In particular,

the partial sums of n=1 pn increase without bound, so there is a first integer
M1 such that
p1 = p1 + · · · + pM1 ≥ π.
78 Extra Material and Problems c 2018 Christopher Heil
P∞
As the partial sums of n=1 qn also increase without bound, there is a first
integer N1 such that

p1 + · · · + pM1 − q1 − · · · − qN1 < π.


P
But n>M1 pn also diverges, so there is some first integer M2 such that

p1 + · · · + pM1 − q1 − · · · − qN1 + pM1 +1 + · · · + pM2 ≥ π.

Continuing in this way we obtain a rearrangement of the alternating harmonic


series that converges to π:

p1 + · · · + pM1 − q1 − · · · − qN1 + pM1 +1 + · · · + pM2


− qN1 +1 − · · · − qN2 + pM2 +1 + · · · + pM3 − · · · = π.

There is nothing special about π in this argument—if we replace π by e then


we can find a rearrangement of the alternating harmonic series that converges
to e, or to any real number that we like! The same is true of any infinite series
of real scalars that converges but does not converge unconditionally—given
any real number x we can find a rearrangement of the series that converges
to x. ♦
P∞ (−1)n+1
Example 4.3.52. Recall that the alternating harmonic series n=1 n
converges to ln 2. We will show that there is a rearrangement of the alternat-
ing harmonic series that converges to a sum different from ln 2.
We do not even need to know exactly what the alternating harmonic series
converges to, just let
X∞
(−1)n+1
L = .
n=1
n
Since this is an alternating series, we can use the first terms to give bounds
for L. We see that 12 ≤ L ≤ 1, so we know that L 6= 0.
Multiplying the series by 2, we obtain
X∞  
(−1)n+1 1 1 1 1 1 1 1
2L = 2 = 2 1 − + − + − + − + ···
n=1
n 2 3 4 5 6 7 8
2 1 2 1 2 1
= 2−1+ − + − + + − ···+
3 2 5 3 7 4
   
1 2 1 1 2 1 1
= (2 − 1) − + − − + − − + ···
2 3 3 4 5 5 6
1 1 1 1 1
= 1 − + − + − + ···
2 3 4 5 6
= L.

This is a contradiction since L 6= 0. ♦


4.3 Unconditional Convergence of Series 79

Problem 4.3.58 asks for a proof that if X is a finite-dimensional normed


space, then a series in X converges absolutely if and only if it converges
unconditionally.

Unconditional Convergence in Banach Spaces

Unfortunately, it is not true that unconditional convergence always implies


absolute convergence. For example, we will see later that if {en }n∈N is an
P1
infinite orthonormal sequence in a Hilbert space H, then the series n en
converges unconditionally but not converge absolutely (see Problem 5.7.3).
Although we will not prove it, the following interesting, nontrivial result
implies that if X is any infinite-dimensional Banach space, then there exists
a series in X that converges unconditionally but not absolutely. For a proof
of Theorem 4.3.53, see [Heil11, Thm. 3.33].

Theorem 4.3.53 (Dvoretzky–Rogers Theorem). Let X be an infinite-


2
dimensional Banach space. Given any sequence P (cn )n∈N ∈ ℓ , there exist unit
vectors xn ∈ X such that the infinite series cn xn converges unconditionally
in X. ♦

Corollary 4.3.54. If X is an infinite-dimensional Banach space, then there


exists an infinite series that converges unconditionally but not absolutely in X.

Proof. If we set cn = n1 , then the sequence (cn )n∈N is square-summable. The-


orem 4.3.53Ptherefore implies that there exist unit vectors xn ∈ X such that
the series cn xn converges unconditionally. This series does not converge
absolutely, because

X X∞
1
kcn xn k = = ∞. ♦
n=1 n=1
n

Equivalent Characterizations of Unconditional


Convergence

We will state a theorem that gives several reformulations of unconditional


convergence (see [Heil11, Thm. 3.10] for a proof).

Theorem 4.3.55. If {xn }n∈N is a sequence in a Banach space X, then the


following four statements are equivalent.
P∞
(a) n=1 xn converges unconditionally.
80 Extra Material and Problems c 2018 Christopher Heil
P
(b) The net n∈F xn : F ⊆ N, F finite converges in X. Explicitly, this
means that there exists a vector x ∈ X such that for every ε > 0, there is
a finite set F0 ⊆ N such that
X
F0 ⊆ F ⊆ N, F finite =⇒ x− xn < ε.
n∈F
P∞
(c) If (cn )n∈N is any bounded sequence of scalars, then the series n=1 cn xn
converges in X.
(d) If (εn )n∈N P
is any sequence of scalars such that εn = ±1 for every n, then

the series n=1 εn xn converges in X. ♦

The equivalence of statements


P (a) and (c) in Theorem 4.3.55 is quite in-
teresting. It implies that if xn is a series in X that does not converge
unconditionally,
P then we can find scalars cn with |cn | < 1 for every n such
that cn xn does not converge. That is, even though
P the norm of cn xn is
strictly smaller than the norm of xn , the series cn xn fails to converge
compare Problem 4.3.56).
Statement (b) of Theorem 4.3.55 mentions nets. Nets are generalizations
of ordinary sequences, and are used to characterize convergence in abstract
topological spaces.

Extra Problems
P
4.3.56. Let xn = (−1)n/n, so xn is the alternating harmonic series. Al-
though
P this series converges, show that there exist scalars cn such that cn → 0
yet cn xn does not converge.
P
4.3.57. Assume that cn is a conditionally convergent series of real scalars,
i.e., the series converges, but it does not converge unconditionally.
(a) Let (pn ) be the sequence of nonnegative terms of (cn ) in order,P and let
(q
P n ) be the sequence of negative terms of
P (c n ) in order. Show that pn and
qn must both diverge. Conclude that cn is not absolutely convergent.
P
(b) Given x ∈ R, show there exists a permutation σ of N such that cσ(n)
converges and equals x.
P
(c) Show that there exists a permutation σ of N such that cσ(n) diverges
PN
to ∞ (that is, limN →∞ n=1 cσ(n) = ∞), and another permutation τ such
P
that cτ (n) diverges to −∞.
P
(d) Show that there exists a permutation σ of N such that cσ(n) does
not converge and does not diverge to ∞ or −∞.
4.5 Independence, Hamel Bases, and Schauder Bases 81
P
4.3.58. Let X be a finite-dimensional normed space. Show that a series xn
in X is unconditionally convergent if and only if it is absolutely convergent.
P
4.3.59. Use Theorem 4.3.55 to show that if a series ∞ n=1 xn converges un-
conditionally in a Banach space X, then there P exists a single vector x ∈ X

such that for each bijection σ : N → N we have n=1 xσ(n) = x.

4.3.60. Let XPbe P a Banach space, and assume that vectors xmn ∈ X, m,
n ∈ N, satisfy m n kxmn k < ∞. Show that if σ : N → N×N is a bijection,
then the following series all converge and are equal as indicated:
∞ X
X ∞  ∞ X
X ∞  ∞
X
xmn = xmn = xσ(k) .
m=1 n=1 n=1 m=1 k=1

4.3.61. Prove that the conclusion of Problem 4.3.60 remains valid if we re-
place the hypothesis of absolute
Pconvergence with unconditional convergence,
i.e., we assume that the series (m,n)∈N2 cmn converges unconditionally in X.

4.5 Independence, Hamel Bases, and Schauder Bases

When we are given a finite-dimensional vector space, one of the first things
that we are taught to do in undergraduate linear algebra class is to look for
a basis for that space, i.e., a set that both spans and is linearly independent.
Unfortunately, bases in this ordinary vector space sense are usually not very
useful in the infinite-dimensional setting. For one thing, in most infinite-
dimensional vector spaces the only way that we can know that a basis exists
is by appealing to the Axiom of Choice. For another, any basis for a complete
infinite-dimensional normed space must be an uncountable set (see Theorem
4.5.3). This makes them too unwieldy for practical use.
When we work with generic vector spaces, we have no choice but to restrict
ourselves to finite linear combinations of elements. In particular, spanning
and linear independence are each defined in terms of finite linear combinations
of vectors. We cannot form an infinite series in a vector space without having
a notion of what it means to converge in that space, because an infinite series
is, by definition, the limit of the partial sums of the series. If our vector space
is lucky enough to have a norm, then we do have a convergence criterion and
therefore can form infinite series. Consequently, when we work with a normed
vector space we can seek “bases” that are defined in terms of “infinite linear
combinations” instead of just finite linear combinations of vectors. This is not
an issue in finite-dimensional normed spaces, but it can be very important
when dealing with infinite-dimensional spaces. As it turns out, defining the
most useful “basis” concept in infinite dimensions is not as straightfoward as
we might hope, as there are more subtleties in the ways that we can generalize
82 Extra Material and Problems c 2018 Christopher Heil

the notions of spanning and independence than might be apparent at first


glance.

The Standard Basis Vectors

The standard basis vectors, which were introduced in Notation 2.1.8, will
be used to illustrate many of the concepts discussed in this chapter, so we
recall their definition here. Given n ∈ N, the nth standard basis vector is the
sequence δn whose nth component is 1, and all other components are zero.
That is,
δn = (0, . . . , 0, 1, 0, 0, . . . ),
where n − 1 zeros precede the 1. The sequence of standard basis vectors, or
simply the standard basis, is the collection E of all of the standard basis
vectors. In other words,
E = {δn }n∈N .
Each standard basis vector δn has only finite many nonzero components, so
δn belongs to the space c00 . Hence the standard basis E is a subset of c00 ,
and since c00 is contained in the spaces ℓp and c0 , the standard basis E is also
a subset of ℓp and c0 .
Although we have just called E = {δn }n∈N a “basis,” this should only
be viewed as a name for now. As we progress through the chapter, we will
discuss in what sense E is a basis for spaces such as ℓp , c0 , or c00 .
According to the next exercise, if p is finite then the standard basis is
a Schauder basis for ℓp . Note that in order to prove that a sequence is a
Schauder basis for a given space, we must show that the series in equation
(4.17) converge in the norm of that space (not just in some other sense like
componentwise convergence!).

Exercise 4.5.50. Let E = {δn }n∈N be the standard basis.


(a) Prove that E is a Schauder basis for ℓp for each index 1 ≤ p < ∞ (be sure
to show that the partial sums converge in norm, not just componentwise).
(b) Prove that E is not a Schauder basis for ℓ∞ . In particular, exhibit
P an
sequence x ∈ ℓ∞ such that there are no scalars cn that satisfy x = cn δn
with convergence of the series in ℓ∞ -norm. ♦

Closed Span

We emphasize that the definition of the closed span does not say that
4.5 Independence, Hamel Bases, and Schauder Bases 83
X
∞ 
span(A) = cn xn : xn ∈ A, cn ∈ F ← This need not hold!
n=1

not true that an arbitrary element of span(A) can always be written as


It is P

x = n=1 cn xn for some xn ∈ A, cn ∈ F. Instead, to illustrate the meaning
of the closed span, consider a countable set A = {xn }n∈N . For such a set we
can write the closed span explicitly as

n N
X o
span{xn }n∈N = x ∈ X : ∃ cn,N ∈ F such that cn,N xn → x as N → ∞ .
n=1

That is, x belongs to the closed span of {xn }n∈N if and only if there exist
scalars cn,N ∈ F such that
N
X
cn,N xn → x as N → ∞. (4.47)
n=1

The scalars cn,N in equation (4.47) can depend on N. In contrast, to say that
P∞
x = n=1 cn xn means that
N
X
cn xn → x as N → ∞. (4.48)
n=1

In order for equation (4.48) to hold, the scalars cn must be independent of N.

Schauder Bases

We will prove that every Schauder basis is a complete sequence and is finitely
linearly independent. However, Example 4.5.52 will show that the converse
statement need not hold.

Lemma 4.5.51. If B = {xn }n∈N is a Schauder basis for a Banach space X,


then B is both complete and finitely linearly independent in X.

Proof. By the definition of a Schauder basis, every vector x ∈ X can be writ-


P∞ PN
ten uniquely as x = n=1 αn (x) xn . The partial sums sN = n=1 αn (x) xn
of this infinite series belong to the finite linear span of B. Further, sN → x,
so we conclude that span(B) is dense in X. Applying equation (4.15), this
tells us that the sequence B is complete.
To show that B is independent, suppose that some finite linear combination
of vectors from B equals the zero vector. That is, suppose that
84 Extra Material and Problems c 2018 Christopher Heil

N
X
cn xn = 0
n=1

for some integerP∞ N > 0 and scalars c1 , . . . , cN ∈ F. If we define cn = 0 for


n > N, then n=1 cn xn = 0, where the series converges with respect to
P
the norm of X. However, we also have ∞ n=1 0xn = 0, so the uniqueness
requirement of a Schauder basis implies that cn = 0 for every n. Hence B is
finitely linearly independent. ⊓ ⊔

Unfortunately, the converse of Lemma 4.5.51 fails in general, i.e., there


are sequences that are both complete and finitely independent but are not
Schauder bases. Here is an example in the space C[a, b]; also see Problem
5.8.14, which presents some further examples and extensions.

Example 4.5.52. The set of monomials M = {xk }∞ k=0 is both complete and
finitely independent in C[a, b]. However, as shown in Problem 4.6.6,Pthere ex-k
ist functions f ∈ C[a, b] that cannot be written in the form f (x) = ∞k=0 ck x
with convergence of the series in the uniform norm (or even just pointwise).
Consequently the set of monomials M is not a Schauder basis for C[a, b]. ♦

A complete examination of Schauder bases requires the use of the Hahn–


Banach and Uniform Boundedness Theorems, which are part of the field of
Functional Analysis. For more information on Schauder bases and related
topics, we refer to the following texts:
C. Heil, A Basis Theory Primer, Expanded Edition, Birkhäuser, Boston,
2011.
I. Singer, Bases in Banach Spaces I, Springer-Verlag, New York, 1970.
J. Lindenstrauss and L. Tzafriri, Classical Banach Spaces, I, Springer-
Verlag, New York, 1977.

Separable Banach Spaces

According to Definition 2.6.7, a Banach space X is separable if it contains a


countable dense subset. Now, if A = {xn }n∈N is a complete sequence in X then
span(A) is dense in X, but span(A) is not countable since it contains every
scalar multiple of each vector xn (as well as every finite linear combination of
these vectors). On the other hand, if we restrict to finite linear combinations
that only use rational scalars then we obtain a countable subset of span(A),
and the next exercise shows that this subset is still dense in X. Recall that a
rational scalar is a complex number whose real and imaginary parts are both
rational.
4.5 Independence, Hamel Bases, and Schauder Bases 85

Exercise 4.5.53. Let X be a Banach space. Show that if there exists a count-
able sequence {xn }n∈N that is complete in X, then
X
N 
S = rn xn : N > 0, rn is rational
n=1

is a countable, dense subset of X, and therefore X is separable. ♦

For example, we can use this exercise to prove that C[a, b] is separable.

Example 4.5.54. The Weierstrauss Approximation Theorem implies that the


set of monomials {1, x, x2 , . . . } is a countable complete sequence in C[a, b].
Applying Exercise 4.5.53, it follows that C[a, b] must be separable. ♦

The following corollary is a special case of Exercise 4.5.53.

Corollary 4.5.55. If a Banach space X has a Schauder basis B = {xn }n∈N ,


then X is separable.

Proof. By Lemma 4.5.51, if B = {xn }n∈N is a Schauder basis for a Banach


space X, then B is both countable and complete. It therefore follows from
Exercise 4.5.53 that X must be separable. ⊓

Thus, a Banach space that has a Schauder basis must be separable. The
question of whether every separable Banach space has a Schauder basis was
a longstanding open problem known as the Basis Problem. It was finally
shown by Enflo [Enf73] that there exist separable Banach spaces that have no
Schauder bases! For constructions of Schauder bases in some specific Banach
spaces such as C[0, 1], we refer to the text [Heil11].

The HRT Conjecture

The problem given in Problem 4.5.12 is one particular open problem that is
part of the Linear Independence of Time-Frequency Translates Conjecture,
also known as the HRT Conjecture. Many special cases are known. For ex-
ample, if g ∈ Cc (R) then it is easy to prove that the set of four functions
given in equation (4.21) is independent. More special cases can be found in
the survey paper [Heil06]. The original conjecture is in the paper:
C. Heil, J. Ramanathan, and P. Topiwala, Linear independence of
time-frequency translates, Proc. Amer. Math. Soc., 124 (1996), pp. 2787–
2795.
The actual statement of the conjecture is as follows.
86 Extra Material and Problems c 2018 Christopher Heil

2
Conjecture 4.5.56 (HRT Conjecture). If g in L (R) is not the zero function
and Λ = (ak , bk ) : k = 1, . . . , N is a set of finitely many distinct points in
R2 , then 
G(g, Λ) = e2πibk x g(x − ak ) : k = 1, . . . , N
is linearly independent. ♦

Despite the simplicity of the statement of the conjecture, it appears to be


a very difficult problem, still open as of 2018. The conjecture is open if L2 (R)
is replaced by C0 (R), and the real-valued version obtained by replacing e2πibx
with cos bx is likewise open. If we replace L2 (R) or C0 (R) with Cb (R) then it
is easy to find counterexamples (e.g., any periodic function).
Please contact me with any news, progress, or questions regarding this
conjecture.

Extra Problems

4.5.57. Let c00 and c0 be the spaces of sequences introduced in Chapter 2,


and let c denote the set of all convergent sequences of scalars, i.e.,
n o
c = x = (xk )k∈N : lim xk exists .
k→∞

Since
c00 ( c0 ( c ( ℓ∞ ,
the sup-norm is a norm of each of these four spaces. Further, by Problem
2.2.19, ℓ∞ is complete with respect to this norm. Prove the following state-
ments.
(a) Prove that c is a closed subspace of ℓ∞ , and therefore is a Banach
space with respect to the sup-norm.
(b) c0 is a closed subspace of c, and therefore is a Banach space with
respect to the sup-norm.
(c) c00 is not a closed subspace of c0 , and therefore is not complete with
respect to the sup-norm.

4.5.58. This problem adds to Problem 4.6.6.


Let (ck )k≥0 be a fixed sequence of scalars, and suppose that the series
P∞ k
k=0 ck x converges for some number x ∈ R. Set r = |x|, and prove the
following statements.
P
(a) The series f (t) = ∞ k
k=0 ck t converges absolutely for all t ∈ (−r, r),
i.e.,
X∞
|ck ||t|k < ∞, all |t| < r.
k=0
4.6 The Stone–Weierstrass Theorem 87

(b) f is infinitely differentiable on the interval (−r, r).


(c) f is real analytic at the origin in the sense defined in Problem 3.50.54.
Specifically,
X∞
f (k) (0) k
f (t) = t , |t| < r.
k!
k=0

4.5.59. Prove the following statements.


(a) c0 is separable.
(b) ℓp is separable for 1 ≤ p < ∞.
(c) C0 (R) is separable.
Hint: Let θM be 1 on [−M, M ], zero outside [−M −1, M +1], and linear on
each of the two intervals [−M − 1, −M ] and [M, M + 1]. Use the Weierstrass
Approximation Theorem to show that the countable set
X
N 
S = rk xk θM (x) : M > 0, N ≥ 0, rk rational
k=0

is dense in C0 (R).

4.6 The Stone–Weierstrass Theorem

Extra Problems

4.6.50. For this problem we take F = R. Assume that f ∈ C[a, b] is a strictly


positive function that is also injective. Prove that A = span{f (x)n }n≥0 is
dense in C[0, 1].
For example, we can take f to be xα (if we choose a and b correctly) or
ex .
88 Extra Material and Problems c 2018 Christopher Heil

4.50 Extra Section: Hamel Bases and the Axiom of


Choice

If X is a finite-dimensional vector space, then it is usually not too difficult


to construct some Hamel basis for X, and we can use basic linear algebra
techniques to prove that any two Hamel bases for X contain exactly the
same number of vectors. That is, any two Hamel bases for X have the same
cardinality, and we call this number the dimension of X.
Bases do exist for infinite-dimensional spaces, but we cannot always ex-
plicitly display them. The Axiom of Choice is one of the axioms of the stan-
dard form of set theory most commonly accepted in mathematics (Zermelo–
Fraenkel set theory with the Axiom of Choice, or ZFC). One consequence of
the Axiom of Choice is that
every vector space has a Hamel basis.
We will show belows how to use one of the equivalent forms of the Axiom of
Choice known as Zorn’s Lemma to prove the existence of Hamel bases. Even
so, while the Axiom of Choice implies that X has a Hamel basis, if X is infinite
dimensional then there often is no way to construct such a basis. Although we
know Hamel bases exist, we may not be able to exhibit any specific examples
(just try to construct a Hamel basis for ℓ1 ; you won’t succeed!). This is one
reason why Hamel bases are not very useful in infinite-dimensional spaces.
The Axiom of Choice is one of the axioms of the standard form of set theory
most commonly accepted in mathematics (Zermelo–Fraenkel set theory with
the Axiom of Choice, or ZFC). Here is the formal statement of this axiom.

Axiom 4.50.50 (Axiom of Choice). Let S be a nonempty set S, and let


P be the family of all nonempty subsets of S. Then there exists a function
f : P → S such that f (A) ∈ A for each set A ∈ P. ♦

There are many statements that are equivalent to the Axiom of Choice.
In the remainder of this appendix, we will discuss another equivalent form
of the Axiom of Choice, known as Zorn’s Lemma or the Kuratowski–Zorn
Lemma. First we need to introduce the concept of a partial order on a set.

Definition 4.50.51 (Partial Order). Let S be a set, and let ≤ be a relation


on S × S.
(a) The relation ≤ is a partial order on S if the following conditions are
satisfied for all elements A, B, C ∈ S:
i. Reflexivity: A ≤ A,
ii. Symmetry: A ≤ B and B ≤ A implies A = B, and
iii. Transitivity: A ≤ B and B ≤ C implies A ≤ C.

(b) Two elements A, B ∈ S are comparable if either A ≤ B or B ≤ A.


4.50 Extra Section: Hamel Bases and the Axiom of Choice 89

(c) A partial ordering ≤ on S such that every pair A, B of elements of S are


comparable is called a linear ordering or a total ordering of S.
(d) A nonempty subset of S that is linearly ordered by ≤ is called a chain
in S.
(e) An element A ∈ S is maximal in S if for every B ∈ S we have that

B is comparable to A =⇒ B ≤ A.

(f) An element A ∈ S is an upper bound for U ⊆ S if B ≤ A for every


B ∈ U. ♦

Note that a maximal element need not be comparable to all elements of S,


and it need not be unique.
Now we state Zorn’s Lemma (although it is more precise to refer to this re-
sult as the Kuratowski–Zorn Lemma, as it is due independently to Kazimierz
Kuratowski and Max Zorn). A proof of that Zorn’s Lemma is equivalent to
the Axiom of Choice can be found in Theorem A.4 of:
J. J. Rotman, Advanced Modern Algebra, Prentice Hall, Upper Saddle
River, NJ, 2002.
An exposition of the history of the Kuratowski–Zorn Lemma appears in:
P. J. Campbell, The origin of “Zorn’s Lemma,” Historia Math., 5 (1978),
pp. 77–89.

Axiom 4.50.52 (Kuratowski–Zorn Lemma). Let ≤ be a partial order


on a set S. If every chain in S has an upper bound in S, then S contains a
maximal element. ♦

To illustrate the use of the Kuratowski–Zorn Lemma, we will prove that


every vector space X (other than the trivial space {0}) possesses a Hamel
basis, i.e., a subset that is both finitely linearly independent and whose finite
linear span is X (see Definition 4.5.2).

Theorem 4.50.53. If X is a nontrivial vector space, then there exists a sub-


set B that is a Hamel basis for X.

Proof. Let

S = A ⊆ X : A is finitely linearly independent .

Note that S is nonempty, since if x 6= 0 then the singleton {x} is a linearly


independent set. The inclusion relation ⊆ is a partial order on S.
Suppose that C is a chain in S, say C = {Ai }i∈I where I is some index set.
By definition, each set Ai is finitely independent, and we claim that A = Ai
S

is also finitely independent. To see this, choose finitely many distinct vectors
90 Extra Material and Problems c 2018 Christopher Heil

x1 , . . . , xn ∈ A. Then for each k = 1, . . . , n we have xk ∈ Aik for some index


ik ∈ I. Since C is a chain, it is linearly ordered by inclusion. Therefore there
is a largest set Aik , i.e., there is a j such that Aik ⊆ Aij for k = 1, . . . , n.
Hence x1 , . . . , xn all belong to Aij and therefore {x1 , . . . , xn } is a linearly
independent set. Thus every finite subset of S is independent, so A ∈ S.
Since we have Ai ⊆ A for each i ∈ I, it follows that A is an upper bound for
the chain C.
Applying the Kuratowski–Zorn Lemma, we conclude that S contains a
maximal element B. By definition, B is finitely independent, so if its finite
span is X then it is a Hamel basis for X. Suppose that there exists a vector
x ∈ X that does not belong to span(B). Then B ′ = B ∪ {x} is finitely
independent and hence belongs to S. However, B ( B ′ , which implies that B
is not a maximal element in S. This is a contradiction, so we must have
X = span(B). ⊓ ⊔

In summary, we have shown that Theorem 4.50.53 is a consequence of the


Axiom of Choice. The converse implication is also true, i.e., the statement
“every vector space has a Hamel basis” is another equivalent form of the Ax-
iom of Choice and the Kuratowski–Zorn Lemma; see the following reference:
A. Blass, Existence of bases implies the axiom of choice, in: Axiomatic
Set Theory (Boulder, Colo., 1983), J. E. Baumgartner, D. A. Martin, and
S. Shelah, Eds., Contemp. Math., Vol. 31, Amer. Math. Soc., Providence, RI,
1984, pp. 31–33.
5.8 Orthonormal Bases 91

CHAPTER 5: INNER PRODUCTS AND HILBERT SPACES

5.7 Orthogonal and Orthonormal Sequences

Extra Problems

5.7.50. This problem will give a constructive solution to Exercises 3.7.51 and
3.7.52 in the setting of Hilbert spaces.
(a) Let M be a proper, closed subspace of a Hilbert space H. Given a vector
f∈/ M, let p be the orthogonal projection of f onto M and set e = f − p.
Show that the vector g = e/kek satisfies

kgk = 1, g ∈ M ⊥, dist(g, M ) = inf kg − mk = 1.


m∈M

(b) Let H be an infinite-dimensional Hilbert space. Prove directly that H


must contain an infinite orthonormal sequence {en }n∈N . Show that no sub-
sequence of {en }n∈N is Cauchy, and conclude that the closed unit disk

D = f ∈ H : kf k ≤ 1 is not a compact subset of H.

5.8 Orthonormal Bases

Remark 5.8.50. If {fn }n∈N is an orthonormal sequence, then the scalars


hf, fn i are sometimes called the generalized Fourier coefficients of f with
respect to {fn }n∈N . ♦

Remark 5.8.51. In summary, if {en } is a complete orthonormal sequence in a


Hilbert space H, then every vector f ∈ H has an explicit, unique represen-
tation as X
f = hf, en i en , f ∈ H, (5.49)
n

and therefore {en } is a Schauder basis for H. This fact is very useful. Not
every complete sequence {fn } is a Schauder basis, and even if it is, it can
be a highly nontrivial
P task to find an explicit formula for the scalars cn (f )
that satisfy f = cn fn . Remarkably, if {en } is both complete and ortho-
normal, then it is a automatically a Schauder basis with explicitly known
representations. Furthermore, such an orthonormal basis has a number of
important “stability”
P properties. First, it follows from Theorem 5.7.1 that
the series f = hf, en i en converges unconditionally. That is, the series con-
verges no matter how we order the index set (and other important “stability”
features of unconditional convergence appear in Theorem 4.3.55). Second, the
92 Extra Material and Problems c 2018 Christopher Heil

Plancherel Formula tells us that the norm of a vector f is precisely equal to


the ℓ2 -norm of the sequence of coefficients (hf, en i). Hence a small change
in f yields a small change in these coefficients, and conversely a small change
in the coefficients results in a small change to f. ♦
The results of Theorems 5.7.1 and 5.8.1 can be generalized to uncount-
able orthonormal sets (see Problems 5.9.10 and 5.9.11). Specifically, Problem
5.9.11 shows that if a Hilbert space H contains a complete, uncountable
orthonormal set, then we obtain basis-like representations analogous to those
given in equation (5.11). Sometimes such a set is called an “orthonormal ba-
sis,” but in keeping with Definition 4.5.4 we will restrict the use of the word
“basis” to countable sequences.
Still, this raises a question: Does every Hilbert space contain a complete
orthonormal set? We could try to create such a set by an iterative process. For
example, suppose that S is any orthonormal set in H, and let M = span(S)
be the closed span of S. By definition, S is complete if and only if M = H. If S
is not complete then M is a proper closed subspace of H, so its orthogonal
complement M ⊥ is not trivial. If we choose any nonzero vector g ∈ M ⊥ and
divide g by its length, we obtain a unit vector that is orthogonal to S. Hence
S ∪ {g} is a larger orthonormal set, and if this new orthonormal set is not
complete then we can keep going and add another orthonormal vector to
it. Unfortunately, if H is infinite-dimensional then this process might never
end. However, the same type of Zorn’s Lemma argument that is used in
Theorem 4.50.53 to show that every vector space has a Hamel basis can be
adapted to show that H must contain a complete orthonormal set. Of course,
since this proof depends on the Axiom of Choice it is nonconstructive. On
the other hand, Problem 5.8.54 shows how to give a constructive proof for
separable Hilbert spaces.
Exercise 5.8.52. Use the Axiom of Choice in the form of Zorn’s Lemma to
show that every Hilbert space contains a complete, orthonormal set. ♦
Thus every Hilbert space contains subsets that are both complete and
orthonormal. Given this, the following lemma shows that H contains a count-
able complete orthonormal set (i.e., an orthonormal basis) if and only if H
is separable.
Lemma 5.8.53. A Hilbert space H is separable if and only if there exists a
countable sequence {en } that is a orthonormal basis for H.
Proof. ⇐. Exercise 4.5.53 tells us that if a Banach space X contains a count-
able complete sequence, then X is separable.
⇒. Let H be a separable Hilbert space. By Exercise 5.8.52, there is a sub-
set S that is both complete and orthonormal.√Because of this orthonormality,
any two distinct vectors in S are a distance 2 apart:

f 6= g ∈ S =⇒ kf − gk = 2. (5.50)
5.11 The Complex Trigonometric System 93

If S was uncountable, then by applying Problem 5.4.6 we would find that X


is not separable. Since we know that H is separable, S must be countable,
and therefore S is an orthonormal basis for H.
For an alternative (and constructive) proof that every separable Hilbert
space contains an orthonormal basis, see Problem 5.8.54. ⊓ ⊔

In summary, if H is a finite-dimensional Hilbert space, then it contains a


finite orthonormal basis {e1 , . . . , ed } where d is the dimension of H, while if H
is separable, infinite-dimensional Hilbert space, then it contains a countably
infinite orthonormal basis {en }n∈N . If H is nonseparable, then it does contain
a complete orthonormal set S, but any such set must be uncountable.

Extra Problems

5.8.54. (a) Let M be a closed subspace of a Hilbert space H, and suppose


that f ∈
/ M. Let p be the orthogonal projection of f onto M and set e = f −p.
Prove that
M ⊕ span{e} = {m + ce : m ∈ M, c ∈ F}
is a closed subspace of H, and

M ⊕ span{e} = M + span{f } = {m + cf : c ∈ F}.

(b) Let {fn }n∈N be a linearly independent sequence in a Hilbert space H.


Show that there exists an orthogonal sequence {en }n∈N in H that satisfies

span{e1 , . . . , eN } = span{f1 , . . . , fN } for each N ∈ N.

This is called the Gram–Schmidt orthogonalization procedure.


(c) How should the Gram–Schmidt orthogonalization procedure be modi-
fied if {fn }n∈N is not linearly independent?
(d) If H is a separable Hilbert space then, by definition, H contains a
countable dense subset {fn }n∈N . By applying the Gram–Schmidt orthogo-
nalization procedure to this set, prove that H contains a countable complete
orthonormal subset {en }n∈N (which, by Theorem 5.8.1, is therefore an ortho-
normal basis for H).

5.11 The Complex Trigonometric System

The trigonometric system is a countable sequence indexed by the set of inte-


gers Z. In contrast, the majority of the sequences we have encountered before
have been indexed by the natural numbers N. Even so, all of the results of
94 Extra Material and Problems c 2018 Christopher Heil

this chapter on orthonormal sequences carry over with only minor notational
changes to sequences indexed by Z or any other countably infinite index set.
In particular, Theorems 5.7.1 and 5.8.1 both carry over to orthonormal se-
quences that are indexed by any countable index set. Essentially, this is a
consequence of the unconditional convergence of the series that appear in
those theorems. An unconditionally convergent series converges regardless
of what ordering we impose on the index set, and hence we can use any
countably infinite index set that we like without changing the meaning of the
series.
For the index set Z, we typically use the “standard ordering”

0, 1, −1, 2, −2, 3 − 3 . . . .

Unless we specify otherwise, this will be the ordering


P that we have in mind.
This means that we consider a bi-infinite series n∈Z xn to be ordered as
X
xn = x0 + x1 + x−1 + x2 + x−2 + · · · .
n∈Z

The partial sums of this series are

s1 = x0 ,
s2 = x0 + x1 ,
s3 = x0 + x1 + x−1 ,
s4 = x0 + x1 + x−1 + x−2 ,
P
and so forth. Therefore, using this ordering, an infinite series n∈Z xn con-
verges if and only if these partial sums sN converge as N → ∞. If the series
converges unconditionally (the typicaly situation when working with ortho-
normal vector), then it will converge no matter what ordering of the series
that we choose. When dealing with bi-infinite series it is often convenient
to consider the partial sum s2N −1 = x−N + · · · + xN . We call this the N th
symmetric partial sum and often denote it by
N
X
SN = s2N −1 = xn = x0 + x1 + x−1 + · · · + xN + x−N .
n=−N

Extra Problems

5.11.50. Let f (x) = x for x ∈ [0, 1]. Compute the Fourier coefficients of f,
and use this to give another proof of Euler’s formula.
6.4 Equivalence of Bounded and Continuous Linear Operators 95

CHAPTER 6: OPERATOR THEORY

6.1 Linear Operators on Normed Spaces

Functionals

If the codomain of an operator is the field of scalars F, then we say that the
operator is a functional. In other words, a functional is simply a function
of the form f : X → F. We often denote functionals by Greek letters such
as λ, µ, or ρ, or sometimes by lowercase Roman letters such as d or q. The
following exercise gives two examples of nonlinear functionals. We will see
some linear functionals in Example 6.3.2.
Exercise 6.1.50. (a) Let X be a nontrivial normed space. Define a func-
tional ρ : X → F by

ρ(x) = kxk, x ∈ X.

Prove that ρ is not linear, not surjective, and not injective, even though
ker(ρ) = {0}.
(b) Fix n ≥ 2 and let X be the set of all n × n matrices with scalar entries.
Let det(A) denote the determinant of a matrix A, and define d : X → F
by d(A) = det(A). Show that the functional d is surjective, but it is not
linear and not injective. ♦

By definition, an n × n matrix A is singular if its determinant is zero.


Hence the kernel of the functional d given in part (b) of Exercise 6.1.50 is
precisely the set of all n × n singular matrices.

6.4 Equivalence of Bounded and Continuous Linear


Operators

Recall from Exercise 2.9.52 that if X and Y are normed spaces, then a func-
tion A : X → Y is continuous at a point x ∈ X if xn → x in X implies
Axn → Ax in Y, and A is continuous if it is continuous at every point.
Here is an expanded version of Theorem 6.4.1.

Theorem 6.4.50 (Boundedness Equals Continuity). Let X and Y be


normed vector spaces. If A : X → Y is a linear operator, then the following
four statements are equivalent.
(a) A is continuous at some point x ∈ X.
96 Extra Material and Problems c 2018 Christopher Heil

(b) A is continuous at x = 0.
(c) A is continuous.
(d) A is bounded.

Proof. (a) ⇒ (b), (c). Suppose that A is continuous at x, and fix any vector
y ∈ X. If yn → y then y − yn + x → x, so by linearity and the definition of
continuity at f we have

Ay − Ayn + Ax = A(y − yn + x) → Ax as n → ∞.

Rearranging, we see that Ayn → Ay, so A is continuous at the point y. This


is true for every vector y (including y = 0 in particular), so A is continuous.
(c) ⇒ (d). Suppose that A is continuous but unbounded. Then kAk = ∞,
so for each n ∈ N there must exist a vector xn ∈ X such that kxn k = 1 but
kAxn k ≥ n. Setting yn = xn /n, we compute that

kxn k
kyn − 0k = kyn k = → 0 as n → ∞.
n

Thus yn → 0. Since A is continuous and linear, it follows that Ayn → A0 = 0.


By the continuity of the norm (Lemma 3.3.3), this implies that

lim kAyn k = k0k = 0.


n→∞

However, for each n ∈ N we have


1 1
kAyn k = kAxn k ≥ · n = 1,
n n
which is a contradiction. Therefore A must be bounded.
(d) ⇒ (a). Suppose that A is bounded, and choose any vector x ∈ X.
Suppose that xn → x in X. Applying linearity and Lemma 6.2.4, we see that

kAxn − Axk = kA(xn − x)k ≤ kAk kxn − xk → 0.

Thus Axn → Ax, so A is continuous at f. ⊓


6.5 The Space B(X, Y )

Using abstract language, Lemma 6.5.5 shows that B(X) is a Banach algebra.
This Banach algebra is noncommutative since AB 6= BA in general, but on
the other hand it does have an identity element (the identity operator I).
6.5 The Space B(X, Y ) 97

Extra Problems

6.5.50. Let X, Y be normed spaces. Fix A ∈ B(X, Y ) and xn ∈ X. Prove


the following statements.
P∞ P∞
(a) If a series n=1 xn converges in X, then n=1 Axn converges in Y.
P P∞
(b) If ∞ n=1 xn converges unconditionally in X, then n=1 Axn converges
unconditionally in Y.
P P∞
(c) If ∞ n=1 xn converges absolutely in X, then n=1 Axn converges ab-
solutely in Y.

6.5.51. (a) What is the difference between Cb (R) and B(R, F)? Is one a subset
of the other?
(b) Give a complete characterization of the elements of B(R, F).

6.5.52. This is an extension of Problem 6.3.6.


Let A = [aij ] be an m × n matrix, which we identify with the linear
operator A : Fn → Fm that maps x to Ax. Since Fn is finite dimensional,
Theorem 6.3.1 implies that A is bounded with respect to any norm on Fn .
However, the exact value of the operator norm of A does depend on which
norms we choose to place on Fn and Fm .
(a) Prove that if the norm on both Fn and Fm is k · k1 then the operator
norm of A is X 
m
kAkℓ1 →ℓ1 = max |aij | .
j=1,...,n
i=1

Show further that if m = n then this operator norm is submultiplicative, i.e,.


for all n × n matrices A and B we have

kABkℓ1 →ℓ1 ≤ kAkℓ1 →ℓ1 kBkℓ1 →ℓ1 .

(b) Prove that if the norm on Fn and Fm is k · k∞ then the operator norm
of A is X 
n
kAkℓ →ℓ = max
∞ ∞ |aij | .
i=1,...,m
j=1

Show that this operator norm is also submultiplicative when m = n.


(c) Let kAk∞ be the sup-norm of the entries of A, i.e.,

kAk∞ = max |aij |.


i,j

Prove that this is a norm on the set of all m × n matrices, but if m = n ≥ 2


then the sup-norm is not submultiplicative.
98 Extra Material and Problems c 2018 Christopher Heil

6.6 Isometries and Isometric Isomorphisms

The translation operator Ta : Cb (R) → Cb (R) defined in Example 3.52.53 is


an isometric isomorphism because it is a bijection and equation (3.42) tells
us that kTa gku = kgku for every vector f in the domain Cb (R).

Extra Problems

6.6.50. (a) Prove that the dilation operator Da : Cb (R) → Cb (R) defined
in Problem 3.52.58 is a bounded linear bijection, and its operator norm is
kDa k = 1. Is it an isometry?
(b) Prove that the operators L and M defined in Problem 3.52.56 are
bounded and linear, and find their operator norms. Is either one an isometry?
6.6.51. (a) We say that a function f : R → F is 1-periodic if f (x + 1) = f (x)
for all x ∈ R. Prove that

Cbper (R) = f ∈ Cb (R) : f is 1-periodic

is a closed subspace of Cb (R) with respect to the uniform norm on Cb (R).


(b) Prove that

C per [0, 1] = f ∈ C[0, 1] : f (0) = f (1)

is a closed subspace of C[0, 1] with respect to the uniform norm on C[0, 1].
(c) Prove that
Cbper (R) ∼
= C per [0, 1],
i.e., there exists an isometric isomorphism A : Cbper (R) → C per [0, 1].
6.6.52. Suppose that X and Y are normed spaces, and {xn }n∈N is a Schauder
basis for X. Show that if A : X → Y is a topological isomorphism, then
{Axn }n∈N is a Schauder basis for Y.

6.7 Infinite Matrices

Expanded Discussion of Convolution

In this section we will consider bi-infinite sequences of scalars, i.e., sequences


of the form

x = (xk )k∈Z = . . . , x−2 , x−1 , x0 , x1 , x2 , . . . ,
6.7 Infinite Matrices 99

where each component xk is a scalar. As discussed P in Remark 3.2.50, if p is


finite then such a sequence x is p-summable if k∈Z |xk |p < ∞. We let ℓp (Z)
be the space of all p-summable bi-infinite sequences, and ℓ∞ (Z) be the space
of all bounded bi-infinite sequences.
All of the results of Section 3.2 carry over from ℓp to ℓp (Z). Thus, if
1 ≤ p ≤ ∞ then ℓp (Z) is a Banach space with respect to the norm
 !1/p

 X∞

 p
 |xk | , 1 ≤ p < ∞,
kxkp = k=−∞




sup |xk |, p = ∞.
k∈Z

In particular, ℓp (Z) is closed under the operations of addition of sequences and


multiplication of a sequence by a scalar. We will introduce another operation
on sequences, called convolution, and we will prove that the particular space
ℓ1 (Z) is closed with respect to this new operation. This gives us an example of
a Banach space with an extra operation that makes it into a Banach algebra.
We motivate convolution by looking at multiplication of polynomials. If
p(x) = a0 +a1 x+· · · am xm and q(x) = b0 +b1 x+· · · bn xn are two polynomials,
then their product is the polynomial

p(x)q(x) = c0 + c1 x + · · · + cm+n xm+n

whose coefficients ck are given by the formula


k
X
ck = aj bk−j , k = 0, . . . , m + n, (6.51)
j=0

where we take ak = 0 for k > m and bk = 0 for k > n. We can think of


equation (6.51) as a rule that tells us how to combine two finite sequences a =
(a0 , . . . , am ) and b = (b0 , . . . , bn ) to obtain a new sequence c = (c0 , . . . , cm+n ).
Extending this idea to infinite sequences gives us the definition of convolution.

Definition 6.7.50 (Convolution of Sequences). Let x = (xk )k∈Z and


y = (yk )k∈Z be bi-infinite sequences of scalars. If the series

X
ck = xj yk−j , (6.52)
j=−∞

converges for each k ∈ Z, then the convolution of x with y is the bi-infinite


sequence x ∗ y whose components are ck , i.e.,
X 
x ∗ y = (ck )k∈Z = xj yk−j .
j∈Z k∈Z
100 Extra Material and Problems c 2018 Christopher Heil

If the series in equation (6.52) does not converge for some k, then the convo-
lution of x with y is undefined. ♦

The convolution of two arbitrary sequences need not be defined. For ex-
ample, if  
x = 2k k∈Z and y = 2−k k∈Z ,
then the series given in equation (6.52) does not converge for any k, so the
convolution of these two particular sequences does not exist. If the convolu-
tion x ∗ y does exist, then we usually denote the kth component of x ∗ y by
(x ∗ y)k instead of ck . That is, if x ∗ y exists then its components are

X
(x ∗ y)k = xj yk−j , k ∈ Z.
j=−∞

The following theorem shows that the convolution of two summable se-
quences always exists, and it also gives us a bound for the ℓ1 -norm of a
convolution. In the proof, we use the fact that we can reindex convergent
series. In particular, if k is fixed then by setting i = k − j we see that

X ∞
X
yk−j = yi .
j=−∞ i=−∞

Note that it is important here that the summation is over all of Z, otherwise
the range of summation would change when we reindex. This is one reason
why we use bi-infinite sequences when we define convolution. The proof of
the following result is Problem 6.7.9.

Theorem 6.7.51. If x, y ∈ ℓ1 (Z), then x ∗ y ∈ ℓ1 (Z). Further, the following


submultiplicative norm inequality holds:

kx ∗ yk1 ≤ kxk1 kyk1 , x, y ∈ ℓ1 (Z). ♦ (6.53)

Thus ℓ1 (Z) is closed under convolution. Problem 6.7.10 gives some further
properties of convolution.

Extra Material on Banach Algebras

In summary, not only is ℓ1 (Z) a Banach space, but it is also closed under
the operation of convolution. Indeed, convolution is a “multiplication-like”
operation on ℓ1 (Z) in the sense that it has properties similar to ordinary
multiplication of scalars. We have a special name for normed spaces that
have such a “multiplicative” operation.
6.7 Infinite Matrices 101

Definition 6.7.52 (Normed Algebras and Banach Algebras). Let A


be a normed vector space. We say that A is a normed algebra if for each
choice of x, y ∈ A there exists a unique product xy ∈ A that satisfies the
following conditions for all x, y, z ∈ A and all scalars c ∈ F.
(a) Associativity: (xy)z = x(yz).
(b) Distributive Law: x(y + z) = xy + xz and (x + y)z = xz + yz.
(c) Interaction with scalar multiplication: c(xy) = (cx)y = x(cy).
(d) Submulticativity: kxyk ≤ kxk kyk for all x, y ∈ A.
If xg = gx for all x, g ∈ A then we say that A is commutative.
If there exists an element e ∈ A such that ex = xe = x for every x ∈ A
then we say that A is a normed algebra with identity.
If A is compete, then we say that A is a Banach algebra. In other words,
a Banach algebra is a Banach space that is also a normed algebra. ♦

Thus ℓ1 (Z) is a commutative Banach algebra with identity with respect


to the operation of convolution.

Example 6.7.53. (a) Cb (R) is a commutative Banach algebra with identity


under the operation of pointwise products of functions, (f g)(x) = f (x)g(x).
(b) C0 (R) is a commutative Banach algebra without identity with respect
to pointwise products.
(c) Since the operator norm is submultiplicative, if X is a Banach space
then the space B(X) of all bounded linear operators mapping X into itself is
a noncommutative Banach algebra with identity with respect to composition
of operators. ♦

Some normed algebras also have an additional operation that has proper-
ties similar to that of conjugation of complex numbers.

Definition 6.7.54 (Involution). Let A be a normed algebra. We say that


∼ is an involution on A (with respect to the product on A) if for each x ∈ A
there is an element xe ∈ A such that the following conditions are satisfied for
all x, y ∈ A and all scalars c ∈ F:
e = x,
(a) x
e
(b) x
fy = ye x
e,

(c) (x + y) = xe + ye, and
cx = c x
(d) f e. ♦

If F = R, then the complex conjugate that appears on the scalar c in


part (d) of the preceding definition is superfluous.
The following exercise shows that ℓ1 (Z) has an involution.
102 Extra Material and Problems c 2018 Christopher Heil

Exercise 6.7.55. Given x = (xk )k∈Z in ℓ1 (Z), define


 
e = x−k k∈Z = . . . , x2 , x1 , x0 , x−1 , x−2 , . . . .
x

Prove that ∼ is an involution on ℓ1 (Z) with respect to convolution. ♦

Extra Problems

6.7.56. (a) Let c00 (Z) be the set of all bi-infinite sequences x = (xk )k∈Z such
that xk 6= 0 for at most finitely many k. Prove that c00 (Z) is closed under
convolution, i.e., if x, y ∈ c00 (Z) then x ∗ y exists and belongs to c00 .
(b) Let c0 (Z) be the set of all bi-infinite sequences x = (xk )k∈Z such that
xk → 0 as k → ±∞. Exhibit sequences x, y ∈ c0 (Z) such that the convolution
x ∗ y does not exist.

6.7.57. Prove the following statements.



(a) If 1 ≤ p ≤ ∞, x ∈ ℓp (Z), and y ∈ ℓp (Z), then x ∗ y ∈ ℓ∞ (Z) and

kx ∗ yk∞ ≤ kxkp kykp′ .



(b) If 1 < p < ∞, x ∈ ℓp (Z), and y ∈ ℓp (Z), then x ∗ y ∈ c0 (Z).
(c) There exist sequences x ∈ ℓ1 (Z) and y ∈ ℓ∞ (Z) such that x ∗ y ∈
/ c0 (Z).
1
(d) If x ∈ ℓ (Z), and y ∈ c0 (Z), then x ∗ y ∈ c0 (Z).

6.7.58. Fix 1 < p < ∞, and suppose that x ∈ ℓp (Z) and y ∈ ℓ1 (Z).
(a) Show that
∞ 
X 
1/p 1/p′
|(x ∗ y)k | ≤ |xj | yk−j yk−j dy. (6.54)
j=−∞

Apply Hölder’s Inequality with exponents p and p′ to the two factors that
appear on the right-hand side of equation (6.54) to show that
 X
∞ 1/p
1/p′ p
|(x ∗ y)k | ≤ kyk1 |xj | |yk−j | .
j=−∞

(b) Use part (a) and Tonelli’s Theorem to directly prove Young’s Inequal-
ity:
kx ∗ ykp ≤ kxkp kgk1 . (6.55)
Remark: Equation (6.55) also holds for p = 1 (see Problem 6.7.9) and for
p = ∞ (see Problem 6.7.11).
7.8 The Spectral Theorem for Compact Self-Adjoint Operators 103

CHAPTER 7: OPERATORS ON HILBERT SPACES

7.8 The Spectral Theorem for Compact Self-Adjoint


Operators

Extra Problems

7.8.50. Let F = C for this problem. Let λ1 , . . . , λn be the eigenvalues of an


n × n matrix A, ordered so that |λ1 | ≥ |λ2 | ≥ · · · ≥ |λn |. Let kxk be the
Euclidean norm on Cn , and let kAk be the operator norm of A, defined by
kAk = max{kAxk : kxk = 1}.
(a) Show that kAk k1/k ≥ |λ1 | for any positive integer k.
(b) Suppose that A is diagonalizable, i.e., there exists an invertible matrix
S and a diagonal matrix Λ such that A = SΛS −1 Show that kΛk = |λ1 |. Use
this to show that
lim kAk k1/k ≤ |λ1 |.
k→∞

(c) Combine parts (a) and (b) to deduce that

lim kAk k1/k = |λ1 |.


k→∞

This limit is called the spectral radius of A.

You might also like