Linear Spaces
Linear Spaces
Linear Spaces
Contents
Field of Scalars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2
Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3
Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5
Sum of subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5
Linear combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.6
Linear independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.7
Basis and Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.7
Convexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.8
Normed linear spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.9
The `p and Lp spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.10
Topological concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.12
Open sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.13
Closed sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.14
Bounded sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.15
Convergence of sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.16
Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.17
Cauchy sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.18
Banach spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.19
Complete subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.19
Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.21
Linear transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.21
Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.22
Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.23
Upper semicontinuous functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.26
Quotient Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.27
Denseness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.27
Separability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.27
Schauder basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.27
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.28
2.1
2.2 c J. Fessler, October 4, 2004, 12:44 (student version)
Formal definitions of a vector space use the concept of a field of scalars, so we first review that.
Field of Scalars (from Applied Linear Algebra, Noble and Daniel, 2nd ed.)
A field of scalars F is a collection of elements α, β, γ, . . . along with an “addition” and a “multiplication” operator.
To every pair of scalars α, β in F, there must correspond a scalar α + β in F, called the sum of α and β, such that
• Addition is commutative: α + β = β + α
• Addition is associative: α + (β + γ) = (α + β) + γ
• There exists a unique element 0 ∈ F, called zero, for which α + 0 = α, ∀α ∈ F
• For every α ∈ F, there corresponds a unique scalar (−α) ∈ F for which α + (−α) = 0.
To every pair of scalars α, β in F, there must correspond a scalar αβ in F, called the product of α and β, such that
• Multiplication is commutative: αβ = βα
• Multiplication is associative: α(βγ) = (αβ)γ
• Multiplication distributes over addition: α(β + γ) = αβ + αγ
• There exists a unique element 1 ∈ F, called one, or unity, or the identity element, for which 1α = α, ∀α ∈ F
• For every nonzero α ∈ F, there corresponds a unique scalar α−1 ∈ F, called the inverse of α for which αα−1 = 1.
Simple facts for fields:
• 0 + 0 = 0 (use α = 0 in the definition of 0)
• −0 = 0 Proof. For any α, by the associative property (α + 0) + (−0) = α + (0 + (−0)) hence α + (−0) = α. Hence, since
the zero element is unique, −0 = 0.
Example. The set of rational numbers Q (with the usual definition of addition and multiplication) is a field.
The only fields that we will need are the field of real numbers R and the field of complex numbers C.
Therefore, hereafter we will use F when describing results that hold for either R or C.
c J. Fessler, October 4, 2004, 12:44 (student version) 2.3
Vector Spaces
In simple words, a vector space is a space that is closed under vector addition and under scalar multiplication.
Definition. A vector space or linear space consists of the following four entities.
1. A field F of scalars.
3. An operation called vector addition that associates a sum x + y ∈ X with each pair of vectors x, y ∈ X such that
• Addition is commutative: x + y = y + x
• Addition is associative: x + (y + z) = (x + y) + z
• There exists an element 0 ∈ X , called the zero vector, for which x + 0 = x, ∀x ∈ X
• For every x ∈ X , there corresponds a unique vector (−x) ∈ X for which x + (−x) = 0.
4. An operation called multiplication by a scalar that associates with each scalar α ∈ F and vector x ∈ X a vector αx ∈ X ,
called the product of α and x, such that:
• Associative: α(βx) = (αβ)x
• Distributive α(x + y) = αx + αy
• Distributive (α + β)x = αx + βx
• If 1 is the identify element of F, then 1x = x. ∀x ∈ X .
• 0x = 0 for any x ∈ X .
What are some examples? (Linear algebra classes focus on finite-dimensional examples.)
Important Vector Spaces
• Euclidean space or n-tuple space: X = Rn . If x ∈ X , then x = (a1 , a2 , . . . , an ) where ai ∈ R and we use ordinary addition
and multiplication: x + y = (a1 + b1 , a2 + b2 , . . . , an + bn ) and αx = (αa1 , . . . , αan ).
case, the set of real numbers R (with ordinary addition and multiplication) is a trivial
(As a special R ∞ vector
R ∞ space.)
• X = L1 R2 . The set of functions f : R2 → R that are absolutely (Lebesgue) integrable: −∞ −∞ |f (x, y)| dx dy < ∞,
with the usual pointwise definition
of addition and scalar multiplication.
To show show that f, g ∈ L1 R2 implies f + g ∈ L1 R2 , one can apply the triangle inequality: |f + g| ≤ |f | + |g|.
• The set of functions on the plane R2 that are zero outside of the unit square.
• The set of solutions to a homogeneous linear system of equations Ax = 0.
• C[a, b]: the space of real-valued, continuous functions defined on the interval [a, b].
• The space of band-limited signals.
• Many more in Luenberger...
Example. For 1 ≤ p < ∞, define the following infinite-dimensional1 space:
Z 1
p
X = Rp [a, b] = f : [a, b] → R : f is Riemann integrable and |f (t)| dt < ∞ ,
0
(with the usual pointwise definitions of addition of functions and multiplication of functions by a scalar). To show that this space
is a vector space, the only nontrivial work is verifying closure.
R1 R1
Clearly if f ∈ X then αf ∈ X since 0 |αf (t)|p dt = |α|p 0 |f (t)|p dt < ∞, so X is closed under scalar multiplication.
To show that f +g ∈ X if f, g ∈ X , i.e., closure under addition, requires a bit more work. Note that since |a+b| ≤ 2 max{|a|, |b|},
it follows for p ≥ 1 that
|a + b|p ≤ 2p max{|a|p , |b|p } ≤ 2p [|a|p + |b|p ].
Hence if f, g ∈ X :
Z 1 Z 1 Z 1 Z 1
|f (t) + g(t)|p dt ≤ [2p |f (t)|p + 2p |f (t)|p ] dt ≤ 2p |f (t)|p dt +2p |f (t)|p dt < ∞ + ∞ = ∞.
0 0 0 0
X × Y = {(x, y) : x ∈ X and y ∈ Y} .
To be a vector space we must define vector addition and scalar multiplication operations, which we define component-wise:
• (x1 , y1 ) + (x2 , y2 ) = (x1 + x2 , y1 + y2 )
• α(x, y) = (αx, αy), ∀α ∈ F.
Fact. With these definitions the Cartesian product of two vector spaces is indeed a vector space.
The above definition generalizes easily to higher-order combinations.
Example. R3 = R × R × R.
2.3
Subspaces
A (nonempty) subset S of a vector space X is called a subspace of X if S, when endowed with the addition and scalar multiplication
operations defined for X , is a vector space, i.e., αx + βy ∈ S whenever x, y ∈ S and α, β ∈ F.
Example. The subset of X = R2 [−1, 1] consisting of symmetric functions (f (−t) = f (t)) is a subspace of X . It is clearly closed
under addition and scalar multiplication.
Sum of subsets
Definition. If S and T are two subsets of a vector space, then the sum of those subsets, denoted S + T is defined by
S + T = {x = s + t : s ∈ S, t ∈ T } .
Example. What is the sum of a plane and a line (both through origin) in R3 ? ??
Example. Consider X = R2 , with S = {(x, 0) : x ∈ [0, 1]} and T = {(0, y) : y ∈ [0, 1]}. Then S + T is the unit square.
Proposition. If M and N are subspaces of a vector space X , then the sum M + N is a subspace of X .
What is M + N ? ??
2 It is time to stop saying this. From now on we leave it implicit whenever this is clear, which it usually is.
2.6 c J. Fessler, October 4, 2004, 12:44 (student version)
Linear combinations
Pn
Definition. A finite sum i=1 αi xi for xi ∈ X and αi ∈ F is called a linear combination.
(The associative property of vector addition allows us to write such a sum without parentheses.)
Depending on where the xi ’s originated Pnwe get various properties of linear combinations.
• X : If xi ∈ X , i = 1, . . . , n, then i=1 αi xi ∈ X . This is shown
Pn easily by induction from the definition of a vector space.
• M : If xi ∈ M, i = 1, . . . , n, where M is a subspace, then i=1 αi xi ∈ M by induction from the definition of a subspace.
Any linear combination of vectors from a subspace is also in the subspace.
• S: What if we take linear combinations from a subset rather than a subspace?
Definition. If S is a subset of a vector space X , then the subspace generated by S is the subspace of linear combinations drawn
from S, defined by ( )
X n
[S] = x ∈ X : x = αi xi , for xi ∈ S, αi ∈ F, and n ∈ N .
i=1
2.5
Linear independence
Often we need to quantify how “big” a subspace is.
Definition. A vector x is called linearly dependent on a set S of vectors iff x ∈ [S], i.e., x is in the span of S.
Otherwise, if x ∈
/ [S], then x is called linearly independent of S.
Definition. A set S of vectors is called a linearly independent set if each vector in the set is linearly independent of the remaining
vectors in the set, i.e.,
∀x ∈ S, x ∈ / [S − {x}] .
Remark. S may be uncountable, but testing whether x ∈ [S − {x}] requires consideration only of finite sums, by the definition of
linear combinations.
Example. This illustrates that S can be uncountable!
1, t = s
Let X = {f : [0, 1] → R} and define gs (t) =
0, otherwise.
Then S = {gs : s ∈ [0, 1]} is a linearly independent subset of X .
Pn
Theorem. A finite set of vectors {x1 , . . . , xn } is linearly independent iff i=1 αi xi = 0 implies that αi = 0, ∀i.
(We are skipping proofs that are found in basic linear algebra books.)
Pn
Corollary. If a finite set S = {x1 , . . . , xn } is linearly independent and y ∈ [S], then y has a unique expansion y = i=1 αi xi
for some α1 , . . . , αn ∈ F.
Theorem. Any two bases for a finite-dimensional vector space contain the same number of elements.
Most, but not all, properties of (more easily understood) finite-dimensional spaces generalize to infinite-dimensional spaces.
Example. Pn = {polynomials of degree ≤ n}
A basis is 1, t, t2 , . . . , tn , which has dimension n. Why linearly independent? ??
Another basis is the Legendre polynomials: dk /dtk (t2 − 1)k , k = 1, . . . , n.
Exercise. C[0, 1] is infinite dimensional. Hint: prove by counter example considering Pn+1 .
A basis is a generalization of the usual concept of a coordinate system.
Fact. [3, p 184] If S is linearly independent set in a vector space X , then there exists a basis B for X such that S ⊆ B.
Thus, every vector space has a Hamel basis, (since the empty set is linearly independent).
However, “Hamel basis is not the only concept of basis that arises in analysis. There are concepts of basis that involve topological
as well as linear structure. ... In applications involving infinite-dimensional spaces, a useful basis, if one even exists, is usually
2.8 c J. Fessler, October 4, 2004, 12:44 (student version)
something other than a Hamel basis. For example, a complete orthonormal set is far more useful in an infinite-dimensional Hilbert
space than a Hamel basis” [3, p. 183]. (More on this later!)
We often need sets with less rigid structure than subspaces but that nevertheless still have some structure, so we digress a bit here.
2.4
Convexity
Luenberger describes convexity as “the fundamental algebraic concept of vector space.” (p. 25)
According to Oxford English dictionary, an algebra is “a calculus of symbols combining according to certain defined laws.”
Definition. A set K in a vector space is called convex iff for any x, y ∈ K, αx + (1 − α)y ∈ K for all α ∈ [0, 1].
Geometrically: for any two points in a convex set, the “line segment” between them is also in the set.
Properties of convex sets
• Subspaces and linear varieties are convex.
• {0} is “vacuously” convex.
• For α ∈ F, αK , {x = αk : k ∈ K} is convex if K is convex (magnification or minification of a set)
• K1 + K2 is convex if K1 and K2 are convex sets in a common vector Tspace.
• If C is a collection of convex sets (in a common vector space), then K∈C K is a convex set. (Important for POCS methods.)
Convex hull
Definition. The convex cover or convex hull of a set S in a vector space is the smallest convex set containing S, denoted co(S).
Equivalently, the convex hull of S is the intersection of all convex sets containing S:
\
co(S) = K. (Picture in 2D of blob and its convex hull)
{cK convex : S⊆K}
Problem 2.4 gives a more constructive form for co(S).
Cone
Definition. A set C in a vector space is called a cone with vertex at the origin if x ∈ C implies that αx ∈ C for all α ∈ [0, ∞).
Example. The space of nonnegative continuous (real) functions is a convex cone in the vector space of continuous functions.
Relationships between cones, convex sets, and subspaces
Subsets
Convex Cones
Cones
Subspaces Convex Sets
{X }
2.6
Normed linear spaces
We’ve gone about as far as we can with just the basic axioms of a vector space. Fortunately, most of the vector spaces of interest
have additional structure: a norm.
Definition. A norm on a vector space X is a function k·k : X → R that satisfies the following for every x ∈ X .
• kxk ≥ 0 (nonnegativity)
• kxk = 0 iff x = 0 (positive definiteness)
• kαxk = |α| kxk , ∀α ∈ F (scaling property) (homogeneity)
• kx + yk ≤ kxk + kyk , ∀y ∈ X (triangle inequality)
Notice that the absolute value function |α| appears above. It is here where our “restriction” to the fields R and C enters3 .
Definition. (X , k·k) is a normed vector space or normed linear space or normed linear vector space or just normed space4 .
Clearly a norm generalizes the usual notion of length.
The following lemma arises remarkably frequently in proofs.
Lemma. In a normed space, kxk − kyk ≤ kx − yk for all x, y ∈ X , i.e., | kxk − kyk | ≤ kx − yk.
Proof. kxk = kx − y + yk ≤ kx − yk + kyk .
pPn
Example. Euclidean n-space: En , Rn , kxk = i=1 ix2
pPn
Example. Euclidean n-space with a weighted norm: Rn , kxk = w
i=1 i i x2 for w > 0.
i
3 Probably it is possible to define normed spaces over other fields for which one can define a suitable |·| function that satisfies:
(i) |α| = 0 ⇐⇒ α = 0, (ii) |αβ| = |α| |β|, (iii) |α + β| ≤ |α| + |β|. But would such spaces be useful?
4 The term normed space should suffice because the presence of a vector space is implied in the axioms of a norm, since vector addition, scalar multiplication,
2.10
The `p and Lp spaces
These normed spaces are ubiquitous in the engineering literature, and represent perhaps the most important examples of (infinite
dimensional) normed vector spaces.
P∞
Definition. Let p ∈ [1, ∞). The space `p consists of all sequences of scalars a1 , a2 , . . . for which i=1 |ai |p < ∞. The norm of
a vector x ∈ `p is defined by
∞
!1/p
X
p
kxkp = |ai | < ∞.
i=1
Definition. The space `∞ consists of bounded sequences. The norm of a vector x = {ai } in `∞ is defined by
Before referring to these spaces as normed spaces, we must confirm that each of the functionals defined above is indeed a norm.
• kαxkp = |α| kxkp is trivial to verify
• kxkp > 0 unless x = 0.
• What about the triangle inequality? The holds due to the Minkowski inequality, which in turn follows from the Hölder
inequality.
Moreover, equality holds iff either of the following two conditions hold:
• either x or y equal 0, or
!1/q !1/p
|ai | |bi |
• both x and y are nonzero and = , ∀i.
kxkp kykq
See errata: Luenberger’s equality condition omits the cases where x or y equal 0.
X sX sX X sX sX
|ai bi | ≤ |ai |2 |bi |2 and hence ai bi ≤ |ai |2 |bi |2 .
i i i i i i
Z !1/p
b
p
kxkp = |x(t)| dt .
a
However, there is a subtle caveat with this space. There are functions that are nonzero on a set of measure zero for which kxkp = 0.
So to consider this space to be a normed space, we must treat functions that are equal almost everywhere (a.e.) as being equivalent.
In other words, a vector in Lp is really an equivalence class of measurable functions that are all equal almost everywhere.
This makes the definition of L∞ [a, b] a bit more subtle than `∞ . In particular, we cannot define kxk∞ to be simply the obvious
choice “supa≤t≤b |x(t)| ,” because that value will be different for different functions in the equivalence class. Instead, we define
kxk∞ = essential supremum of |x(t)| = infimum sup |y(t)| = ess sup |x(t)| .
a.e.
y(t)=x(t)
Topological concepts
According to the Oxford English dictionary, topology is the “branch of mathematics concerned with those properties of figures and
surfaces which are independent of size and shape and are unchanged by any deformation that is continuous, neither creating new
points nor fusing existing ones; hence, with those of abstract spaces that are invariant under homeomorphic transformations.”
Anyway, we have been systematically generalizing notions from geometry to more general settings, but one important concept we
have yet to generalize is that of distance.
The concept of distance is central to any discussion of optimization, since many optimization problems involve finding the element
(within a set) that is the closest to some given point outside that set.
Definition. If x and y are two points in a normed space (X , k·k), then the distance5 between x and y is defined by
d(x, y) = kx − yk .
More generally, if S is a nonempty subset of X , then the distance between a point x ∈ X and the set S is defined by
d(x, S) = inf kx − yk .
y∈S
Example. In R, d(1, (3, 4]) = 2, for the usual norm kxk = |x|. (Picture)
Note that there is no “closest point,” a complication that will require careful attention later.
Example. In R2 , what is the distance between (0, 0) and {(a, b) : 2 ≤ a ≤ 3, 1 ≤ b ≤ 2}? (Trick question) ??
How would you define the distance between two sets S and T ? ??
Properties of distance d. (We will use these later.)
Lemma 2.1 |d(x, S) − d(y, S)| ≤ kx − yk . (Picture)
Proof. d(x, S) = inf z∈S kx − zk = inf z∈S k(x − y) − (z − y)k ≥ inf z∈S kz − yk − kx − yk = d(y, S) − kx − yk . Now
rearrange. 2
Lemma 2.2 For any two subsets U and V of a normed space X , d(U, V ) ≤ d(x, U ) + d(x, V ), ∀x ∈ X . (Picture)
Proof. d(U, V ) = inf u∈U d(u, V ) ≤ inf u∈U d(x, V ) + kx − uk = d(x, V ) + d(x, U ). 2
Lemma. Let (X , k·k) be a normed space, M ⊆ X a subspace, and x ∈ X a point. Then ∀α ∈ F: d(αx, M ) = |α| d(x, M ) .
Proof. Trivial for α = 0.
For α 6= 0: d(αx, M ) = inf y∈M kαx − yk = inf z∈M kαx − αzk = |α| inf z∈M kx − zk = |α| d(x, M ) (Picture) . 2
Preview of optimization
Many problems in optimization can be expressed as follows.
Given x in a normed space (X , k·k), and a subset S in X , find “the” vector s ∈ S that minimizes kx − sk.
What questions should we ask about such problems?
• Is there any best s? I.e., does there exist s? ∈ S s.t. kx − s? k = d(x, S)?
• If so, is s? unique?
• How is s? characterized? (Better yet would be an explicit formula for s? .)
One purpose of some of the material that follows is to answer these questions. (But be prepared for a challenge since some aspects
of these questions remain open problems!) We will return to these questions after introducing Hilbert spaces in Ch. 3.
5 This type of distance is a special case of the more general concept of a distance function that is called a metric if it satisfies:
• d(x, x) = 0
• d(x, y) > 0 if x 6= y
• d(x, y) = d(y, x)
• d(x, y) ≤ d(x, z) + d(z, y) (triangle inequality)
Exercise. Verify that a metric is a more general concept than a norm, i.e., find a metric d(x, y) that is not of the form d(x, y) = kx − yk for any norm k·k.
Hint: a metric need not satisfy the scaling property. ??
c J. Fessler, October 4, 2004, 12:44 (student version) 2.13
2.7
Open sets
Looking ahead: in optimization we often use iterative algorithms that generate sequences, and we need to know when a sequence
converges to a limit that belongs to the same set as the sequence itself. Such questions are related to whether a set is closed or not,
so we need concepts of open and closed sets.
Proof. x ∈ Int(P ) =⇒ ∃ε > 0 s.t. S(x, ε) ⊆ P, so d(x, Pe) = inf y∈Pe kx − yk = inf y∈P
/ kx − yk ≥ ε,
since y ∈
/ P =⇒ y ∈ / S(x, ε) =⇒ ky − xk ≥ ε.
We can prove the converse by proving its contrapositive6 .
/ Int(P ). Then ∀ε > 0, S(x, ε) ∩ Pe 6= ∅, so ∃y ∈ Pe s.t. kx − yk < ε. Thus d(x, Pe) = inf y∈Pe kx − yk = 0.
Suppose x ∈
Alternative direct proof. δ = d(x, Pe) > 0 =⇒ kz − xk ≥ δ > 0, ∀z ∈ Pe. But y ∈ S(x, δ) =⇒ ky − xk < δ =⇒ y ∈
/ Pe =⇒
y ∈ P. So S(x, δ) ⊆ P and hence x ∈ Int(P ) . 2
e
Fact. Similarly: x ∈ Int Pe ⇐⇒ d(x, P ) > 0, since P e = P.
6 We have shown that A =⇒ B and we want to show the converse B =⇒ A, the contrapositive of which is: not A =⇒ not B.
2.14 c J. Fessler, October 4, 2004, 12:44 (student version)
Closed sets
Definition. A point x ∈ X is called a closure point (or a cluster point or an adherent point) of a set P iff
Proposition.
• P is open =⇒ Pe is closed
• P is closed =⇒ Pe is open
Proposition.
• The intersection of a finite number of open sets is open.
• The union of a finite number of closed sets is closed.
• The intersection of an arbitrary number of closed sets is closed.
• The union of an arbitrary number of open sets is open.
Proof. Exercise.
Example. Consider the open intervals Sn = (−1/n, 1/n) ⊂ R, n = 1, 2, . . ..
Then ∪∞ N ∞
n=1 Sn = (−1, 1), which is open. And ∩n=1 Sn = SN , which is open. What is ∩n=1 Sn ? ??
c J. Fessler, October 4, 2004, 12:44 (student version) 2.15
Now we consider open and closedness in the context of the special types of sets we considered previously: convex sets and
subspaces.
Proposition. If C is a convex set in a normed space, then C and Int(C) are convex.
??
Bounded sets
Definition. A set S in a normed space (X , k·k) is called bounded iff ∃M < ∞ such that kxk ≤ M, ∀x ∈ S.
Sequences
Definition. A sequence is a set of vectors indexed by the natural numbers N, e.g., {xn } = {xn : n ∈ N}.
Formally, a sequence is a mapping from N to some vector space X .
Definition. If {xn } is a sequence, and n1 < n2 < · · · , then {xni } is called a subsequence of {xn }.
Notation. {xn } ∈ X iff xn ∈ X , ∀n
2.8
Convergence of sequences
Definition. In a normed space, we say a sequence of vectors {xn } converges to a vector x iff the sequence of real numbers
kxn − xk converges to zero, in which case we write xn → x or limn→∞ xn = x.
In other words, considering the definition of convergence of real numbers:
xn → x ⇐⇒ kxn − xk → 0
⇐⇒ ∀ε > 0, ∃Nε < ∞ s.t. n ≥ Nε =⇒ kxn − xk < ε
⇐⇒ ∀ε > 0, ∃Nε < ∞ s.t. n ≥ Nε =⇒ xn ∈ S(x, ε).
Example. (Infinite dimensional, of course.) Consider L2 [0, 1] with x(t) = 1 and xn (t) = 1 + t/n.
qR qR
1 1
Then xn → x because kx − xn k2 = 0
|x(t) − xn (t)|2 dt = 0
|t/n|2 dt = √13 n → 0 as n → ∞.
Example. (Infinite dimensional, of course.) Consider `p with x = (1, 1/2, 1/3, . . .) and xn = (1, 1/2, . . . , 1/n, 0, 0, . . .).
P∞
p 1/p
Then kxn − xkp = k=n+1 1/k . That power series is convergent (and hence goes to zero as n → ∞) for p > 1, but
diverges for p = 1. So we can say “xn →x” in `2 , for example, but we cannot say that in `1 !
So all norms are not equivalent in general, unlike in finite-dimensional vector spaces.
kx − yk = kx − xn + xn − yk ≤ kx − xn k + kxn − yk
which → 0 as n → ∞. Thus kx − yk = 0 so x − y = 0 so x = y. 2
Proof. As shown earlier, | kxk − kyk | ≤ kx − yk . Thus, |kxn k − kxk| ≤ kxn − xk → 0 if xn → x, so kxn k → kxk. 2
Proposition. If xn → x, then supn kxn k < ∞, i.e., convergent sequences are bounded.
Proof. ∃N1 s.t. kxn − xk < 1, ∀n > N1 . Thus kxn k = kxn − x + xk ≤ kxn − xk + kxk < 1 + kxk ,
so supn kxn k ≤ max {1 + kxk , kx1 k , . . . , kxN1 k} < ∞. (Picture) 2
c J. Fessler, October 4, 2004, 12:44 (student version) 2.17
Proposition. x is a limit point of S iff x ∈ S. Thus limit points and cluster points are equivalent!
Summary
• distance
• open sphere
• interior point & interior
• open set = its interior
• closure point: d(x, P ) = 0, & closure
• closed set = its closure
• bounded set
• convergence sequences
• limit point of a set (exists convergent sequence to it) = cluster point
• closed sets contain their limit points
2.11
Cauchy sequences
Often in convergence analysis it is easier to examine kxn − xm k than kxn − x? k, especially if we have not yet shown that a limit
x? even exists.
Definition. A sequence {xn } in a normed space is called a Cauchy sequence iff
kxn − xm k → 0 as n, m → ∞.
In other words,
∀ε > 0, ∃N > 0 s.t. n, m > N =⇒ kxn − xm k < ε.
Fact. In a normed space, every convergent sequence is a Cauchy sequence, since if xn → x, then
Note the repeated use of the triangle inequality in proofs. In a normed space, it is about all we have to work with!
The converse of the above fact is not true in general: Cauchy sequences need not converge in general normed spaces.
This is a bit unfortunate since the converse is what we would really like to use usually!
Example. Consider the sequence of functions fn (t) = 1 − e−nt in the normed space
Z 1
X = {f : [0, 1] → R : f continuous} , kf k = kf k1 = |f (t)| dt , (Picture) .
0
R1
kfn − fm k = 0 |e−nt − e−mt | dt = |(1 − e−n )/n − (1 − e−m )/m| → 0 as n, m → ∞. So {fn } is Cauchy.
But the “apparent limit” of fn is a step function, which is not continuous, and hence not an element of X .
How can we “fix” this problem?
• Broaden the vector space X to L1 [0, 1] (i.e., drop the continuity requirement).
• Replace the norm kf k1 with kf k∞ .
One can show for this example that kfn − fm k∞ does not approach zero as “n, m → ∞.”
(For any fixed m, kfn − fm k∞ → 1 as n → ∞.)
This “incompleteness” can also arise even in subspaces of `2 .
Example. X = set of infinite sequence of reals with only finitely many nonzero terms:
X = {(a1 , . . . , ak , 0, 0, . . .) : k ≥ 1, ai ∈ R}
with the `p norm kxkp . Now consider the sequence with elements xn = (1, 1/2, 1/4, . . . , 1/2n , 0, 0, . . .).
P 1/p 1/p
max{n,m} 1 p 1 1
Since kxn − xm kp = k=min{n+1,m+1} 2 k ≤ 2 p min{m,n} 1−1/2 p → 0 as n, m → ∞, {xn } is Cauchy. But there
is no x ∈ X to which {xn } converges.
In some sense, the problem is that X has “holes” in it. Broadening the space is the natural solution.
Is there an alternate norm that would make this vector space X complete? (I doubt it, can you show otherwise?)
c J. Fessler, October 4, 2004, 12:44 (student version) 2.19
2.11
Banach spaces
Often we prefer to use normed spaces that are free of the pathologies described in the preceding examples, so we name them.
Definition. A normed space (X , k·k) is called complete iff every Cauchy sequence in X has a limit in X (and hence converges).
Definition. A complete normed space is called a Banach space.
Examples of Banach spaces
• Rn , k·kp , for p ∈ [1, ∞]
• C[a, b] with kf k∞ = supt∈[a,b] |f (t)|
• `p for p ∈ [1, ∞] with usual kxkp
• Lp [a, b] for p ∈ [1, ∞] with usual kf kp
Is finding a suitable Banach space usually difficult? Fortunately not, since every normed space has a completion.
Theorem. If (X , k·kX ) is a normed space, then ∃ Y, k·kY that is a Banach space (called a completion of X ) with
• X is a subspace of Y
•X =Y
• x ∈ X =⇒ kxkX = kxkY .
Moreover, Y is essentially unique (i.e., all the completions of X are isometric with one another [3, p. 121]).
R1
Example. L1 [0, 1] is the completion of X = {f : [0, 1] → R : f continuous} , kf k = kf k1 = 0 |f (t)| dt .
There exist bounded functions that are Lebesgue integrable but not Riemann integrable, e.g. the indicator function on the rationals
[3, p. 561]. And in fact R1 [a, b] is not complete [3, p. 564].
2.12
Complete subsets
Definition. A subset S of a normed space is complete iff every Cauchy sequence from the subset converges to a limit within S.
Example. Any finite set is complete. (The next theorems give more interesting cases.)
Proof. The “only if” direction follows from the preceding theorem.
Suppose P is closed and {xn } is a Cauchy sequence in P (and hence in X ). Since X is Banach, ∃x ∈ X such that xn → x.
Hence x is a limit point of P . Since P is closed, x ∈ P . Thus P is complete. 2
Exercise. In a normed space, intersections of arbitrarily many complete subsets are complete.
There is an asymmetry between the two preceding theorems. We assumed a normed space to show complete set =⇒ closed set,
but we assumed a Banach space to show closed set =⇒ complete set.
Is the stronger assumption (of a Banach space) truly necessary to show closed set =⇒ complete set? ??
Ok, but in some (incomplete) X is there a closed proper subset that is incomplete? ??
In the preceding theorem we assumed we are already working in a Banach space. What if we only have an ordinary normed space?
Are there complete subsets of it? The next theorem shows that the answer can be yes, at least for finite-dimensional subspaces.
Recall that if M is a subspace of a normed space X , then d(αx, M ) = |α| d(x, M ) for all x ∈ X and α ∈ F.
Theorem. Any finite-dimensional subspace of a normed space is complete (and hence closed).
Since kxn − xm k → 0 for a Cauchy sequence, we see that |λnk − λm | → 0 since δk > 0. Thus {λnk } is Cauchy and hence (by the
PN k
completeness of R) converges to some limit λk . Defining x = k=1 λk ek ∈ M we find that xn → x since
N
X N
X
kxn − xk = (λnk − λk )ek ≤ |λnk − λk | kek k → 0.
k=1 k=1
2.9
Transformations
In systems theory, we analyze many systems and mathematical operations that transform one signal into another.
Definition. Let X and Y be two vector spaces (over a common field F), and let D be a subset of X .
A rule “T ” that assigns a single element y ∈ Y to each x ∈ D is called a transformation from X to Y with domain D.
We write y = T (x) and T : X → Y. or T : D → Y.
See p. 27 or c1 review notes for one-to-one and onto.
Definition. A transformation from a vector space X (over a scalar field F) into F is called a functional on X .
Example. A norm is a functional.
Linear transformations
Definition. A transformation T : X → Y (where X and Y are vector spaces over a common field) is called linear iff
Definition. If T is a linear transformation from X into X itself, then we say T is a linear operator.
However, the terminology distinguishing linear transformations from linear operators is not universal, and the two terms are often
used interchangeably.
Example. Let F = R and let X = Y be the space of continuous functions on [0, 1].
Rt
Define the linear transformation T by: F = T (f ) iff F (t) = 0 f (τ ) dτ .
Integration (with suitable limits) is a linear transformation.
Pn Pn
Caution! (From Linear Systems by T. Kailath.) By induction it follows that T ( i=1 αi xi ) = i=1 αi T (xi ) for any finite n, but
the above does not imply in general that linearity holds for infinite summations or integrals.
Further assumptions about “smoothness” or “regularity” or “continuity” of T are needed for that.
This point is always glossed over in introductory signals and systems courses, where infinite sums (and integrals) are routinely
“passed through” linear systems according to the superposition property, with no attempt to verify the validity of such exchanges.
2.22 c J. Fessler, October 4, 2004, 12:44 (student version)
Continuity
To define such continuity, we restrict attention to normed spaces.
Note that a transformation can be continuous for one pair of normed spaces but discontinuous for another pair, e.g. [3, p. 63,65].
Definition. A transformation T from a normed space (X , k·kX ) into a normed space Y, k·kY is called continuous at x0 ∈ X
iff
∀ε > 0, ∃δ = δ(x0 , ε) > 0 s.t. kx − x0 kX < δ =⇒ kT (x) − T (x0 )kY < ε.
Proposition. A transformation T from a normed space (X , k·kX ) into a normed space Y, k·kY is continuous at x ∈ X iff
xn → x =⇒ T (xn ) → T (x) (for any such sequence {xn }), i.e., lim T (xn ) = T lim xn .
n→∞ n→∞
2.13
Compactness
Optimization is about maximizing a functional over some set, or more precisely, usually about finding the maximizer (within some
set) of a functional. When does a functional achieve its maximum? We wish that the answer were “if the set is closed and bounded,”
but unfortunately that is incorrect in general. The concept of a compact set helps answer this question in general.
Definition. A subset K of a normed space (X , k·k) is called compact or sequentially compact iff every sequence {xn } in K has
some subsequence {xni } that converges to a limit x ∈ K.
(This is not quite the definition of compactness typically used first in a real analysis course, but it is more convenient for our
purposes. One can first define a compact set in terms of set coverings, and then prove that a metric space is compact if and only if
it is sequentially compact, e.g., [2, p. 63].)
As usual, having defined a new concept, we now attempt to relate it to previously defined concepts.
Proposition. A compact subset K of a normed space (X , k·k) is complete, closed, (and bounded).
Proof.
• Claim: K is complete, because Cauchy sequences will have limits in a compact set K.
Let {xn } be any Cauchy in K, then ∀ε > 0, ∃M s.t. n, m > M =⇒ kxn − xm k < ε/2.
Since K is compact, ∃ {ni } s.t. xni → x ∈ K, i.e., ∀ε > 0, ∃I s.t. i > I =⇒ kxni − xk < ε/2.
For any ε > 0, let N = max{M, nI }.
For n > N and i > I we have kxn − xk ≤ kxn − xni k + kxni − xk < ε/2 + ε/2 = ε. Thus xn → x.
• To show K is closed, we show that xn → x =⇒ x ∈ K for {xn } ∈ K.
{xn } ∈ K =⇒ ∃xni → y ∈ K. But xn → x =⇒ xni → x, so x = y ∈ K. Thus K is closed.
• Before we show boundedness, we need to prove the Weierstrass theorem! 2
So is the reverse true? Are closed and bounded sets compact?
The following theorem shows that in general the answer is “yes” in finite-dimensional normed spaces. (But not in general.)
Proof.
(a) =⇒ (b) is the Heine-Borel theorem (see Math 451...)
(b) =⇒ (c) is obvious
(c) =⇒ (a) will be a homework problem
This theorem illustrates why infinite-dimensional spaces are more challenging than finite-dimensional spaces. The fact that every
closed and bounded subset is compact can be very useful in finite dimensional problems, and we do not have this tool in general
infinite-dimensional cases.
By having a notion of compactness, we can make statements (such as the Weierstrass theorem below) that apply to both infinite and
finite-dimensional spaces, whereas those statements would not hold if we merely assumed that the sets were closed and bounded.
Analogy: we write sup when we think max. Here, we write compact when we think “closed and bounded.”
Summary
2.24 c J. Fessler, October 4, 2004, 12:44 (student version)
Examples
As has been / will be shown, any compact set is closed, bounded, and complete, and that the reverse is true in finite-dimensional
spaces. But what about infinite-dimensional spaces?
Example.
Consider any of the Banach spaces `p , for p ∈ [1, ∞] and the subset B = {x ∈ X : kxk ≤ 1} . This is a closed set, and hence it
is complete (since X is complete). Furthermore, B is clearly bounded.
We now proceed to exhibit a sequence {xn } in B that has no convergent subsequences.
Let xn denote the sequence in `p whose nth element is unity and all other elements are zero. Clearly {xn } ∈ B.
Yet kxn − xm kp = 1, so it is impossible for this {xn } to have any convergent subsequences since any such convergent subse-
quence would need to be Cauchy which cannot occur when kxn − xm kp = 1, ∀n, m ∈ N.
So the set B in `p is closed, bounded, and complete, but not compact.
One might begin to wonder then: are there any compact sets in infinite-dimensional spaces?
Example. Any finite set in an infinite-dimensional space is compact.
But that is not very interesting since the set {x1 , . . . , xn } is just a closed and bounded subset of the finite-dimensional subspace
[x1 , . . . , xn ], so of course it is compact.
Example. In any normed space X , suppose {xn } converges to x ∈ X . Then the set {x} ∪ ∪n {xn } is compact.
This still seems like a rather contrived and limited compact set.
Are there any interesting (e.g., nonempty) convex compact sets in infinite-dimensional spaces?
Conjecture: the following (convex) set is (?) compact:
( ∞
)
X
i
x ∈ `1 : 2 |xi | ≤ 1 . (2-1)
i=1
as shown above, so the question is whether adding the “2i ” part is enough of a change to make a compact set.
Extra credit (=30 homework points) to anyone who can show that (2-1) is or is not compact, or who can give a different example
of a nontrivial convex, compact set in an infinite dimensional normed space.
c J. Fessler, October 4, 2004, 12:44 (student version) 2.25
??
Lemma 2.5 In a normed space, any compact set contains the closures of all of its subsets.
??
Lemma 2.6 If U and V are compact, disjoint subsets of a normed space, then d(U, V ) > 0. (Picture)
Proof. Suppose d(U, V ) = 0. Then there exists {xn } ∈ U and {yn } ∈ V such that kxn − yn k → 0.
Since U is compact, there is a subsequence {xni } that converges to some x ∈ U .
Now kyni − xk ≤ kyni − xni k + kxni − xk → 0 as i → ∞, so yni → x ∈ U .
But since V is compact, it is also closed, so the limit of {yni } must lie in V , contradicting the disjointedness of U and V . 2
......................................................................................................................
Would it suffice for U and V to be closed?
No. Consider the sets U = (x, y) ∈ R2 : y > 0, x ≥ 1/y and V = (x, y) ∈ R2 : y < 0, x ≤ −1/y . (Picture)
These sets are closed and disjoint, yet d(U, V ) = 0.
10
2 4 6 x
What is lim supy→2 f (y)? We get the biggest limit approaching from the left, so lim supy→2 f (y) = 20. So f is u.s.c. at x = 2.
Similarly, lim supy→4 f (y) = 20, so f is not u.s.c. at x = 4. But lim supy→6 f (y) = 20, so f is u.s.c. at x = 6.
This function is u.s.c. everywhere except ?? and lower semicontinuous everywhere except ??
Does this function achieve a maximum on [3,5]? ?? What about on [1,3]? ?? How about on [0,1)? ??
Theorem. (Weierstrass)
An upper semicontinuous (real) functional f on a compact subset K of a normed space (X , k·k)
(i) is bounded on K, and (ii) achieves a maximum on K.
inf {g ∈ R : g ≥ f (x), ∀x ∈ K} , if f is bounded above on K
Proof. Let M = supx∈K f (x) =
∞, otherwise.
(At this point we do not know if M is finite or not.)
By definition of supremum, ∃ a sequence {xn } ∈ K such f (xn ) → M . (This is true even if M = ∞.)
However, the definition of supremum alone does not ensure that xn converges.
Since K is compact, ∃ a convergent subsequence xni → x? ∈ K, for some x? .
Since subsequences of convergent sequences have the same limit, limi→∞ f (xni ) = M , considering {f (xn )} as a sequence in R.
Since f is upper semicontinuous, f (x? ) ≥ lim supi→∞ f (xni ) = M .
Since f (x) is real and hence finite, f (x? ) ≥ M implies M must be finite. ... f is bounded on K.
On the other hand, by definition of supremum, since x? ∈ K, M ≥ f (x? ).
So we conclude f (x? ) = M , meaning that x? achieves the maximum of f on K. 2
Corollary. A real-valued, continuous functional f on a compact subset K of a normed space (X , k·k) achieves its maximum
and minimum on K.
Example. On a normed space (X , k·k), the function f : X → R defined by f (x) = kxk is continuous.
Proof. Recall | kxk − kyk | ≤ kx − yk, so x → y =⇒ kx − yk → 0 =⇒ |f (x) − f (y)| → 0.
An inf. dim. example like splines would be nice here...
Corollary (to Weierstrass theorem)
A compact subset K of a normed space is (complete, closed, and) bounded.
Proof. Since k·k is continuous, by the Weierstrass theorem ∃y ∈ K s.t. M , kyk = supx∈K kxk.
Hence kxk ≤ M < ∞, ∀x ∈ K, i.e., K is bounded. 2
c J. Fessler, October 4, 2004, 12:44 (student version) 2.27
2.14
Quotient Spaces
skip
2.15
Denseness
One last topological concept.
Definition. A subset D of a normed space (X , k·k) is called dense iff any of the following equivalent conditions hold.
• ∀x ∈ X and ∀ε > 0, ∃y ∈ D s.t. kx − yk < ε, i.e., d(x, D) = 0.
• D = X.
• ∀x ∈ X , ∃ {yn } ∈ D s.t. yn → x
Example. The set of rational numbers is dense in the real line.
Separability
S∞
Definition. A normed space X is called separable iff it contains a countable dense set, i.e., D = n=1 {yn }.
Example. Euclidean space En is separable. The collection of vectors x = (a1 , . . . , an ) with rational components is countable and
dense in En .
Example. `p is separable for p ∈ [1, ∞)
If Dn = {(r1 , . . . , rn , 0, 0, . . .) : rk ∈ Q}, then D = ∪∞
n=1 Dn is dense in `p .
See text for proof.
Example. Lp is separable for p ∈ [1, ∞)
Pn
If Dn = { k=1 rk 1Ik (t) : rk ∈ Q, Ik = (ak , bk ), ak , bk ∈ Q} for n ∈ N, then D = ∪∞ n=1 Dn is dense in Lp .
Alternatively, if Dn = {f : [a, b] → R : f is piecewise linear with n ∈ N breakpoints at rk ∈ Q and f (rk ) ∈ Q},
then D = ∪∞ n=1 Dn is dense in Lp .
Example. C[a, b] is separable. The set of all polynomials with rational coefficients is countable and dense in C[a, b].
If Dn = {r0 + r1 t + · · · + rn tn : rk ∈ Q}, then D = ∪∞n=1 Dn is dense in C[a, b].
Schauder basis
Definition. InPa normed space, {xn } ∈ X is a Schauder basis for X iff for each x ∈ X , there exists a unique sequence {λn }
∞
such that x = n=1 λn xn [5] [2, p. 98].
Theorem. If a normed space has a Schauder basis, then it is separable [2, p. 100].
Summary
Geometrical concepts and their generalizations.
point x∈X
line P : α ∈ R}
{αx
plane { i αi xi : αi ∈ R}
cone x ∈ C =⇒ αx ∈ C for α ≥ 0
length kxk
sphere {x ∈ X : kxk < ε}
distance kx − yk
coordinate system basis
Hierarchy of spaces: vector space ⊃ normed space ⊃ Banach space
Some principal results
• A subset of a Banach space is complete iff it is closed.
• Any finite-dimensional subspace of a normed space is complete (and hence closed).
• A real-valued, continuous functional on a compact subset of a normed space achieves its maximum and minimum on that subset.
• Any closed and bounded subset of a finite dimensional normed space is compact.
c J. Fessler, October 4, 2004, 12:44 (student version)
Bibliography
[1] P. Enflo. A counterexample to the approximation problem in Banach spaces. Acta Math, 130:309–17, 1973.
[3] A. W. Naylor and G. R. Sell. Linear operator theory in engineering and science. Springer-Verlag, New York, 2 edition, 1982.
[4] D. G. Luenberger. Optimization by vector space methods. Wiley, New York, 1969.
[5] J. Schauder. Zur theorie stetiger abbildungen in funktionenrumen. Math. Zeitsch., 26:47–65, 1927.
[6] P. P. Vaidyanathan. Generalizations of the sampling theorem: Seven decades after Nyquist. IEEE Tr. Circ. Sys. I, Fundamental
theory and applications, 48(9):1094–109, September 2001.
[7] F. Deutsch. The convexity of Chebyshev sets in Hilbert space. In A. Yanushauskas Th. M. Rassias, H. M. Srivastava, editor,
Topics in polynomials of one and several variables and their applications, pages 143–50. World Sci. Publishing, River Edge,
NJ, 1993.
[8] M. Jiang. On Johnson’s example of a nonconvex Chebyshev set. J. Approx. Theory, 74(2):152–8, August 1993.
[9] P. L. Combettes and H. J. Trussell. Method of successive projections for finding a common point of sets in metric spaces. J.
Optim. Theory Appl., 67(3):487–507, December 1990.