Limits Notes
Limits Notes
• supt∈[5,∞) cos(t) = 1
EE599 STOCHASTIC NETWORK OPTIMIZATION, MICHAEL J. NEELY, FALL 2008 2
• supt∈[7,∞) t cos(t) = ∞
lim f (tk ) = f ∗
k→∞
EE599 STOCHASTIC NETWORK OPTIMIZATION, MICHAEL J. NEELY, FALL 2008 3
(ii) For any sequence of times {τk }∞ k=1 such that limk→∞ τk = ∞ and such that the regular limit limk→∞ f (τk )
exists, then:
lim f (τk ) ≤ f ∗ .
k→∞
This lemma is the reason that the lim sup of a function f (t) can be viewed as the largest limiting value of the
function. Similarly, the lim inf of a function f (t) can be viewed as its smallest limiting value.
The next lemma demonstrates that the lim sup, lim inf , and the regular limit are all equivalent whenever the
regular limit exists.
Lemma 3: Consider any function f (t). Then limt→∞ f (t) = f ∗ (where f ∗ is possibly infinite) if and only if:
lim sup f (t) = lim inf f (t) = f ∗
t→∞ t→∞
That is, if the regular limit exists and is equal to a value f ∗ , then
both the lim sup and lim inf of the function are
∗
equal to f . Conversely, if lim inf t→∞ f (t) = lim supt→∞ f (t), then the regular limit also exists and is equal to
the same value as the lim inf and the lim sup.
The above lemma is very important for practical understanding of the lim inf and lim sup. It says that these
limits are identical to the regular limit whenever the regular limit exists. Thus, whenever a lim sup or lim inf
appears in an equation, the reader can view it exactly as a regular limit under the additional assumption that the
regular limit exists. However, using lim sup and lim inf notation often makes things much easier, because there is
no need to prove these limits exist (since they always exist).
• lim supt→∞ [f (t) + g(t)] ≤ lim supt→∞ f (t) + lim supt→∞ g(t) (whenever the right hand side does not yield
“∞ + −∞” or “−∞ + ∞”).
The final property above is the only one that is different from regular limits. A simple example of this is as
M M
follows; Define f (t)= cos(t) and g(t)= − cos(t). Then:
lim sup[f (t) + g(t)] = 0
t→∞
lim sup f (t) + lim sup g(t) = 2
t→∞ t→∞
Two other useful properties are given below:
Lemma 5: Let f (t) and g(t) be any functions. Then:
lim sup[f (t) + g(t)] ≥ lim sup f (t) + lim inf g(t) ≥ lim inf [f (t) + g(t)]
t→∞ t→∞ t→∞ t→∞
whenever “lim supt→∞ f (t) + lim inf t→∞ g(t)” does not yield “∞ + −∞” or “−∞ + ∞.”
Lemma 6: Let f (t) be any function, and let h(t) be a function with a well defined and limit, so that limt→∞ h(t) =
h∗ for some (possibly infinite) value h∗ . Then:
lim sup[h(t) + f (t)] = h∗ + lim sup f (t)
t→∞ t→∞
whenever the right hand side does not yield “∞ + −∞” or “−∞ + ∞.”
EE599 STOCHASTIC NETWORK OPTIMIZATION, MICHAEL J. NEELY, FALL 2008 4
E. Ungraded Exercises
Use only the basic properties described in Section I-D to prove the results in the following lemmas. (Hints:
For Lemma 7, use the fact that lim supt→∞ f (t) = − lim inf t→∞ [−f (t)]. For Lemma 8 use the fact that g(t) =
(g(t) − f (t)) + f (t)):
Lemma 7: For any functions f (t), g(t) and any constant C > 0, we have:
• lim inf t→∞ Cf (t) = C lim inf t→∞ f (t)
• lim inf t→∞ [f (t) + g(t)] ≥ lim inf t→∞ f (t) + lim inf t→∞ g(t) (whenever the right hand side does not yield
“∞ + −∞” or “−∞ + ∞”).
Lemma 8: Consider any two functions f (t), g(t), and suppose that:
lim inf [g(t) − f (t)] ≥ 0
t→∞
A. Closed Sets
Let A represent a subset of N -dimensional Euclidean space (that is, A ⊂ RN ). Thus, A contains N -dimensional
vectors (possibly infinitely many), where each vector has the form x = (x1 , . . . , xN ). Such vectors are also called
points, as they represent a single point in Euclidean space.
Definition 5: A limit point of a set A is a point x∗ ∈ RN such that there exists an infinite sequence of points
{x(1), x(2), x(3), . . .} where x(k) ∈ A for all k ∈ {1, 2, . . .} (and where the x(k) values can possibly repeat the
same points in A), and such that:
lim x(k) = x∗
k→∞
The limit in the above equation represents a component-wise limit, so that each of the N components of the
{x(k)} sequence converges to the corresponding component of x∗ . The fact that we allow points of A to be
repeated when constructing the infinite sequence ensures that every point x that is already contained in A is also
a limit point of A. This can be seen by forming the trivial sequence {x(k)}∞ k=1 = {x, x, x, . . .}.
Definition 6: A set A is closed if it contains all its limit points.
Any set A that is not closed can be transformed into a closed set simply by adding all of its limits points:
Definition 7: For any set A, define the closure of A (denoted Cl{A}), to be the set of all limit points of A
(which also includes all original points of A).
It is easy to see that for any set A, the set Cl{A} is closed. For simple examples in one dimension, define:
M
• A1 =(0, 1] = {x ∈ R | 0 < x ≤ 1}
M
• A2 = [0, 1] = {x ∈ R | 0 ≤ x ≤ 1}
M
• A3 = [0, 5.5) ∪ (6.7, 9]
M
• A4 = {1, 9.7, 10}
EE599 STOCHASTIC NETWORK OPTIMIZATION, MICHAEL J. NEELY, FALL 2008 5
M
• A5 = {1, 1/2, 1/3, 1/4, . . .} = {1/n | n ∈ {1, 2, . . .}}
Then the only limit point of A1 that is not itself contained in A1 is the point 0. It follows that Cl{A1 } = [0, 1] = A2 .
Thus, A2 is closed. The only limit points of A3 that are not contained in A3 are 5.5 and 6.7. Thus, the set A3 is
not closed, but its closure is equal to [0, 5.5] ∪ [6.7, 9]. The set A4 is a finite set and hence is always closed (because
the only possible limit points are members of the finite set). The set A5 is an infinite set of discrete points. Note
that the sequence {1, 1/2, 1/3, . . .} converges to the point 0, but 0 ∈ / A5 , and hence A5 is not closed. This same
sequence {1, 1/2, 1/3, . . .} is contained in A1 , which formally shows that 0 is a limit point of A1 .
B. Bounded Sets
Consider a subset A of RN . Recall that each point of A is a vector with the form x = (x1 , . . . , xN ). Let |x − y|
denote the traditional Euclidean distance between two vectors x and y . Specifically:
qP
M N 2
|x − y|= i=1 (xi − yi )
Definition 8: A subset A is bounded if there exists a finite constant M such that |x − y| ≤ M for all vectors
x and y contained in A.
Definition 9: An N -dimensional closed hypercube centered at the origin is a subset of RN given by [−Z/2, Z/2]N
for some positive constant Z , where:
M
[−Z/2, Z/2]N = {(x1 , . . . , xN ) | − Z/2 ≤ xi ≤ Z/2 for all i ∈ {1, . . . , N }}
Such a hypercube has edge size Z and volume Z N . It is not difficult to show from the above definitions that if
a set A is bounded, then it can be contained in some hypercube.
Consider a bounded set A ⊂ RN and suppose we have an infinite sequence of points {x(k)}∞ k=1 , where x(k) ∈ A
for all k . This sequence possibly repeats some points of A many times, and might bounce around the set A, so that
limk→∞ x(k) may not exist. However, The next theorem proves that x(k) must have a convergent subsequence.
Definition 10: Let {x(k)}∞ ∞
k=1 be an infinite sequence of points. A subsequence of {x(k)}k=1 is an infinite
sequence {yn }∞ n=1 , where this sequence selects points from the original sequence (possibly skipping some points
of the original sequence, but preserving the same order). Formally, the subsequence {yn }∞ n=1 is defined by:
M
yn = x(kn )
where kn is a strictly increasing function that maps the positive numbers {1, 2, 3, . . .} into the positive numbers
{1, 2, 3, . . .} (so that kn < kn+1 for all n ∈ {1, 2, . . .}).
As an example of a subsequence, let the original sequence consist of all positive integers {1, 2, 3, 4, . . .}, so
that x(k) = k for k ∈ {1, 2, 3, . . .}. An example subsequence of this is the sequence of all even positive integers
{2, 4, 6, 8, . . .}, so that kn = 2n and yn = x(kn ) = 2n for n ∈ {1, 2, 3, . . .}.
Theorem 1: (Multi-Dimensional Bolzano-Weirstrass) Let A be a bounded subset of RN , and let {x(k)}∞ k=1
represent an infinite sequence of points in A (so that x(k) ∈ A for all k ). Then {x(k)} has a convergent subsequence,
i.e., a subsequence {x(kn )}∞ n=1 such that:
lim x(kn ) = x∗
n→∞
[α1 , β1 ]). Note that this new interval [α1 , β1 ] has size (b−a)/2. Again divide this new interval into two sub-intervals
[α1 , (α1 + β1 )/2] and [(α1 + β1 )/2, β1 ]. Because there are an infinite number of points x(k) in [α1 , β1 ], again we
see there must be an infinite number of points in one of the two new sub-intervals. Label this sub-interval [α2 , β2 ].
Proceeding this way, we see that we can find successive points of the {x(k)} sequence in smaller and smaller
sub-intervals, where the sub-interval size is halving every step. This is a nested set of closed intervals that shrink to
size 0 (where each interval has an infinite number of the points in the {x(k)} sequence) and hence these intervals
must converge about a single limit point x∗ .
The multi-dimensional case can be proven by iteratively applying the single dimensional result to each dimension,
or by repeating the above argument but using “hypercubes” and “sub-hypercubes” in replacement of “intervals” and
“sub-intervals.” Note that when a N -dimensional hypercube has each of its dimensions divided into two halves,
we are left with 2N hypercubes, each with edge size that is halved. For example, dividing a 2-dimensional square
by cutting each edge in half creates 22 = 4 sub-squares.
III. C ONVEXITY
Definition 11: A set A ⊂ RN is said to be convex if for any two points x and y contained in A, we have:
px + (1 − p)y ∈ A
for all values p such that 0 ≤ p ≤ 1.
Note that if p = 0, then px + (1 − p)y = y , while if p = 1 then px + (1 − p)y = x. Choosing intermediate
values of p yields points that are on the line segment with endpoints x and y . Thus, a set is convex if it contains
all line segments formed by any two points of the set. An example of a convex set is a sphere (with all the inside
points of the sphere included), because the line segment formed by any two points of the sphere is also inside the
sphere. An example of a set that is not convex is a donut (more mathematically called a torus), where the donut
hole is not included.
Exercise 1: (ungraded) Show that the set of all points x = (x1 , . . . , xN ) ∈ RN that satisfy the following
collection of K linear inequalities is a convex set:
PN (1)
i=1 αi xi ≤ b1
PN (2)
i=1 αi xi ≤ b2
...
PN (K)
i=1 αi xi ≤ bK
(j)
where {b1 , . . . , bK } and {αi } are arbitary constants in R.
The linear inequalities of the above exercise define a set that is called a polyhedral convex set.
Note that the convex hull of a set A includes all of the original points of the set, because one can always form
the trivial convex combination that weights a given point with weight p = 1. Thus, A ⊂ Conv{A}. Further, the
convex hull itself is always convex. The convex hull of a set that is already convex is the same as the original set.
Fact 4: The convex hull of a closed set is closed.
An example of the last fact is as follows: Let A = {x1 , x2 , x3 } be a finite set in 2-dimensional space containing
three elements, where x1 = (0, 0), x2 = (2, 0), x3 = (1, 1). Then A is closed (because it is finite), and so Conv{A}
is closed. The set Conv{A} is given by the triangle (including the interior) with vertices at points x1 , x2 , and x3 .
Any point on the triangle can be written as a convex combination of the 3 points x1 , x2 , and x3 .
Theorem 2: (Caratheodory’s Theorem) Let A be any subset of RN . Let x be any convex combination of points
in A, so that x can be written as a convex combination using some finite number k of points in A (where k can
be arbitrarily large):
x = p1 x1 + p2 x2 + . . . + pk xk
where xk ∈ A for all i ∈ {1, . . . , k}. Then x can also be written as a convex combination that uses only N + 1
points of A:
x = p̃1 x̃1 + p̃2 x̃2 + . . . + p̃N +1 x̃N +1
for some probabilities {p̃1 , . . . , p̃N +1 } and some elements {x̃1 , . . . , x̃N +1 } of A.
Thus, if we have a set A ∈ R2 that contains 10 elements: A = {x1 , . . . , x10 }, then any element of the convex
hull of A can be written as a convex combination that uses at most 3 of these elements.
The following is a useful result that relates to random vectors X .
Fact 5: (from [4]) Let X be a random vector that takes values in some set A ⊂ RN . Suppose that E {X} is
defined. Then:
E {X} ∈ Conv{A}
Further, if the set A is itself convex, then Conv{A} = A, and so E {X} ∈ A.
If A is a finite set {x1 , . . . , xk }, then the above fact is easy to establish, as the expectation can then be written:
E {X} = p1 x1 + p2 x2 + . . . + pk xk
where pi = P r[X = xi ]. As this is a convex combination, it follows that X is contained in Conv{A}. It is tempting
to think that the general result of Fact 5 for arbitrary infinite sets A can be derived by writing the integral associated
with the expectation as a limit of a finite sum. However, this approach does not work because the set A is not neces-
sarily closed (and so the set Conv{A} is not necessarily closed). Hence, the resulting limit is not a-priori guaranteed
to be contained in Conv{A}. However, a simple proof of Fact 5 can be obtained using the hyperplane separation
theorem (see, for example, [1] for a description of hyperplane separation, and the notes “Multi-Dimensional
Integration Theorem” available on the link: http://www-rcf.usc.edu/∼mjneely/pdf papers/convex integration.pdf for
a proof of a statement similar to Fact 5).
Definition 15: A function f (x) that maps x ∈ RN to R is concave if −f (x) is convex. A function f (x) is
concave over X (where X is a convex subset of RN ) if −f (x) is convex over X .
Theorem 3: (Jensen’s Inequality) Let X be a random vector that takes values in some convex set X ⊂ RN .
(a) If f (x) is a convex function over x ∈ X , then:
f (E {X}) ≤ E {f (X)}
(b) If f (x) is a concave function over x ∈ X , then:
f (E {X}) ≥ E {f (X)}
Note from Fact 5 that E {X} ∈ Conv(X ) = X , and so the value f (E {X}) is well defined. Note also that part
(b) follows immediately from part (a) using Definition 15. Part (a) is proven as a simple consequence of Fact 5 in
the next subsection.
Corollary 1: Let X(t) be a random vector process indexed by time t ∈ {0, 1, 2, . . .} such that X(t) ∈ X for
all t (where X is a convex set). Let f (x) be a convex function over X . Then for all t we have:
t−1 t−1
!
1X 1X
f E {X(τ )} ≤ E {f (X(τ ))}
t t
τ =0 τ =0
Proof: Fix a time t, and let T be an integer random variable that is uniform over {0, 1, 2, . . . , t − 1} and that is
independent of the process X(τ ) (for τ ∈ {0, . . . , t − 1}). Define the random variable Y = X(T ). Apply Jensen’s
inequality to f (Y ).
We note that a linear function satisfies θ1 f (x1 ) + θ2 f (x2 ) = f (θ1 x1 + θ2 x2 ) for all x1 , x2 , θ1 , θ2 . Thus, a linear
function is both convex and concave. It follows by Jensen’s inequality that if f (x) is linear, then f (E {X}) =
E {f (X)}, and so the expectation passes through the linear function.
R EFERENCES
[1] D. P. Bertsekas, A. Nedic, and A. E. Ozdaglar. Convex Analysis and Optimization. Boston: Athena Scientific, 2003.
[2] J. R. Munkres. Topology. NJ: Prentice Hall, Inc., 2000.
[3] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, 1995.
[4] L. Georgiadis, M. J. Neely, and L. Tassiulas. Resource allocation and cross-layer control in wireless networks. Foundations and Trends
in Networking, vol. 1, no. 1, pp. 1-149, 2006.
1
Strictly speaking, the function on the right hand side of the inequality in Lemma 9 is called an affine function of x (not a linear function
of x), as it is linear plus a constant.