
EE364a Homework 5

Summer 2022

Exercise A7.3
(a) When c(x) = x, the associated family of densities is the family of exponential
distributions. The density is valid iff θ < 0, since

    ∫₀^∞ e^{θx} dx = −1/θ if θ < 0, and the integral diverges if θ ≥ 0.
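
As a quick numeric sanity check of this normalization constant (a minimal sketch using
scipy's quadrature):

import numpy as np
from scipy.integrate import quad

# For theta < 0, the unnormalized density e^{theta x} on [0, inf)
# should integrate to -1/theta.
theta = -2.0
integral, _ = quad(lambda x: np.exp(theta * x), 0, np.inf)
print(integral, -1 / theta)  # both approximately 0.5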

(b) For θ ∈ R, the associated distribution is

    pθ(0) = 1/(1 + e^θ),   pθ(1) = e^θ/(1 + e^θ).

As θ runs over R, e^θ/(1 + e^θ) runs over (0, 1), and so the distribution runs over
the family of Bernoulli(p) distributions with p ∈ (0, 1). All θ ∈ R are valid.

(c) Let D = R^n, c₁(x) = x, and c₂(x) = −(1/2) x xᵀ. For the parameter
θ = (Σ⁻¹μ, Σ⁻¹), we have

    exp(θᵀc(x)) = exp( −(1/2) tr(Σ⁻¹ · x xᵀ) + (Σ⁻¹μ)ᵀx )
                ∝ exp( −(1/2) xᵀΣ⁻¹x + xᵀΣ⁻¹μ − (1/2) μᵀΣ⁻¹μ )
                ∝ exp( −(1/2) (x − μ)ᵀΣ⁻¹(x − μ) ),

which is proportional to the N(μ, Σ) density.
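
A small numeric check of this proportionality (a sketch; scipy's multivariate_normal
supplies the reference density, and μ, Σ below are made-up data):

import numpy as np
from scipy.stats import multivariate_normal

# Check that exp(theta^T c(x)) and the N(mu, Sigma) density agree up to a
# constant factor, by comparing their ratios at two random points.
rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

def exp_family(x):
    # theta^T c(x) with c1(x) = x, c2(x) = -(1/2) x x^T and
    # theta = (Sigma^{-1} mu, Sigma^{-1}); note tr(Sigma^{-1} x x^T) = x^T Sigma^{-1} x.
    return np.exp((Sigma_inv @ mu) @ x - 0.5 * x @ Sigma_inv @ x)

x1, x2 = rng.standard_normal(2), rng.standard_normal(2)
print(exp_family(x1) / exp_family(x2),
      multivariate_normal.pdf(x1, mean=mu, cov=Sigma)
      / multivariate_normal.pdf(x2, mean=mu, cov=Sigma))  # equal ratios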

(d) In the finite case, we have

    log pθ(x) = θᵀc(x) − log( Σ_{x′∈D} exp(θᵀc(x′)) ).

The first term is affine in θ, and the second is a log-sum-exp function composed with
the affine map θ ↦ (θᵀc(x′))_{x′∈D}, hence convex in θ. Subtracting a convex function
from an affine function yields a concave function. The continuous case can be handled
by taking limits of finite sums.
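
cvxpy's DCP analyzer confirms this composition is concave (a minimal sketch with a
made-up finite D, whose feature vectors c(x′) are the rows of C):

import numpy as np
import cvxpy as cp

# DCP check of concavity in the finite case. Row i of C is the feature
# vector c(x') for the i-th point x' of a made-up finite D.
rng = np.random.default_rng(0)
m, d = 5, 3
C = rng.standard_normal((m, d))

theta = cp.Variable(d)
log_p = C[0] @ theta - cp.log_sum_exp(C @ theta)  # log p_theta of the first point
print(log_p.is_concave())  # True: affine minus log-sum-exp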

(e) Using the formula from the previous part and assuming interchangeability of
integration and differentiation,

    (1/K) ∇θ lθ(x₁, …, x_K) = (1/K) Σ_{k=1}^K ∇θ log pθ(x_k)
        = (1/K) Σ_{k=1}^K c(x_k) − ( ∫_D exp(θᵀc(x)) dx )⁻¹ ∫_D c(x) exp(θᵀc(x)) dx
        = (1/K) Σ_{k=1}^K c(x_k) − E_{x∼pθ} c(x).

Setting this gradient to zero shows that the ML estimate matches the model mean of
c(x) to its empirical mean over the samples.
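
A minimal numeric check of this moment-matching condition, using the Bernoulli family
of part (b), where c(x) = x and E_{x∼pθ} c(x) = e^θ/(1 + e^θ):

import numpy as np

# Moment matching for the Bernoulli family: the gradient vanishes when
# e^theta/(1+e^theta) equals the empirical mean of the samples.
rng = np.random.default_rng(0)
samples = rng.binomial(1, 0.3, size=10000)          # x_1, ..., x_K
empirical_mean = samples.mean()                     # (1/K) sum_k c(x_k)

theta_hat = np.log(empirical_mean / (1 - empirical_mean))  # logit of the mean
model_mean = np.exp(theta_hat) / (1 + np.exp(theta_hat))
print(abs(empirical_mean - model_mean))             # ~0: gradient is zero at theta_hat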

Exercise 6.2
(a) For the l₁-norm, order the entries of b as b(1) ≤ b(2) ≤ ⋯ ≤ b(n). For any x ∈ R,

    ∥x1 − b∥₁ = Σ_{k=1}^n |b(k) − x|
              = (1/2) Σ_{k=1}^n ( |b(k) − x| + |b(n+1−k) − x| )
              ≥ (1/2) Σ_{k=1}^n |b(k) − b(n+1−k)|,

by the triangle inequality. On the other hand, if we choose x⋆ = median(b), then
b(k) ≤ x⋆ ≤ b(n+1−k) holds for any k ≤ (n+1)/2, and hence
(b(k) − x⋆)(b(n+1−k) − x⋆) ≤ 0 for any k ∈ [n]. Therefore

    ∥x⋆1 − b∥₁ = Σ_{k=1}^n |b(k) − x⋆|
               = (1/2) Σ_{k=1}^n ( |b(k) − x⋆| + |b(n+1−k) − x⋆| )
               = (1/2) Σ_{k=1}^n |b(k) − b(n+1−k)|.

In conclusion, the problem attains its minimum at x⋆ = median(b), with an optimal
value of

    (1/2) Σ_{k=1}^n |b(k) − b(n+1−k)|.

(b) For the l₂-norm, the problem is in fact a special case of least-squares
approximation. Since rank(1) = 1, the optimal point is
x⋆ = (1ᵀ1)⁻¹1ᵀb = (1/n)(b₁ + ⋯ + b_n) =: b̄, under which the squared objective
attains its minimum ∥x⋆1 − b∥₂² = Σ_{k=1}^n (b_k − b̄)².

(c) For the l∞-norm, we have simply

    ∥x1 − b∥∞ = max_{1≤k≤n} |x − b_k|
              = max( max_{1≤k≤n} (x − b_k), max_{1≤k≤n} (b_k − x) )
              = max( x − b(1), b(n) − x ),

which is clearly minimized at x⋆ = (1/2)(b(1) + b(n)), with minimal value
(1/2)(b(n) − b(1)).
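
A brute-force numeric check of the three closed-form minimizers (a sketch; the grid
search is only for illustration):

import numpy as np

# Compare the closed-form minimizers (median, mean, midrange) against a
# grid search over x, for each of the three norms.
rng = np.random.default_rng(0)
b = rng.standard_normal(11)
xs = np.linspace(b.min(), b.max(), 100001)
residuals = np.subtract.outer(xs, b)        # shape (grid, n): entries x - b_k

for ord_, x_star in [(1, np.median(b)),
                     (2, b.mean()),
                     (np.inf, (b.min() + b.max()) / 2)]:
    vals = np.linalg.norm(residuals, ord=ord_, axis=1)
    print(ord_, x_star, xs[vals.argmin()])  # closed form vs. grid minimizer agree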

Exercise 6.5
Clearly ∥Ax − b∥ − ϵ is a convex function, so it suffices to prove that length(·) is
quasiconvex. Fix any α. If x, y ∈ R^n satisfy length(x) ≤ α and length(y) ≤ α, then
x_k = y_k = 0 holds for all k > α, and the same holds for any convex combination
θx + (1 − θ)y of x and y; hence

    length(θx + (1 − θ)y) ≤ α.

Every sublevel set of length(·) is therefore convex, i.e. length(·) is quasiconvex.
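
Since length takes integer values, the quasiconvex problem of minimizing length(x)
subject to ∥Ax − b∥ ≤ ϵ can be solved by a feasibility search over the candidate
lengths (a minimal cvxpy sketch; A, b, and ϵ below are made-up placeholder data):

import numpy as np
import cvxpy as cp

# Sketch: minimize length(x) s.t. ||Ax - b||_2 <= eps, by checking feasibility
# for each candidate length k. The data here are placeholders.
rng = np.random.default_rng(0)
m, n = 20, 10
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
# Choose eps slightly above the full least-squares residual, so k = n is feasible.
eps = 1.1 * np.linalg.norm(A @ np.linalg.lstsq(A, b, rcond=None)[0] - b)

for k in range(n + 1):
    x = cp.Variable(n)
    constraints = [cp.norm(A @ x - b) <= eps]
    if k < n:
        constraints.append(x[k:] == 0)   # force length(x) <= k
    prob = cp.Problem(cp.Minimize(0), constraints)
    prob.solve()
    if prob.status == cp.OPTIMAL:
        print("minimum length:", k)
        break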

Exercise A17.16
Python code:
import numpy as np
import cvxpy as cp

# Discretized terminal prices of the underlying asset.
S = np.linspace(0.5, 2, 200)

y = cp.Variable(200)   # state prices, one per discretized outcome; y >= 0
pn = cp.Variable()     # price of the collar

# Prices of the six traded instruments.
p = np.array([1, 1, 0.06, 0.03, 0.02, 0.01])

# Payoffs of the traded instruments at each outcome:
# riskless bond, underlying, two calls, two puts.
V = np.array([[1.05] * 200,
              S,
              np.maximum(S - 1.1, 0),
              np.maximum(S - 1.2, 0),
              np.maximum(0.8 - S, 0),
              np.maximum(0.7 - S, 0)])

# Payoff of the collar: S - 1 with floor -0.1 and cap 0.15.
Vn = np.minimum(np.maximum(S - 1, -0.1), 0.15)

constraints = [V @ y == p,
               Vn @ y == pn,
               y >= 0]

prob = cp.Problem(cp.Minimize(pn), constraints)
prob.solve()
print(pn.value)   # lower endpoint; replacing Minimize with Maximize gives the upper one

Minimizing and then maximizing pn, the above program gives the possible range of the collar price: [0.03262, 0.06495].

Exercise 6.10
We first deal with the constraint that f is nondecreasing on B. Note that f is nondecreasing
on B if and only if
for all u ⪰ 0 and x ∈ B, ⟨∇f (x), u⟩ ≥ 0.
Since having nonnegative inner product with every u ∈ R^n_+ is equivalent to lying in
R^n_+ itself (the dual cone of R^n_+ is R^n_+), we can further simplify the above
constraint to

    ∇f(x) ⪰ 0 for all x ∈ B.

Now, plugging in the formula for f, we have ∇f(x) = Px + q. For each k ∈ [n], the kth
entry satisfies (Px + q)_k ≥ 0 for all x ∈ B if and only if it holds under the
worst-case choice of each x_j:

    Σ_{j=1}^n P_kj · x̃_j + q_k ≥ 0,   where x̃_j = l if P_kj ≥ 0 and x̃_j = u if P_kj < 0.

Combining the above observations, we can finally state the constraint that f is
nondecreasing on B as

    min(l·P, u·P) · 1 + q ⪰ 0,

where min denotes the entrywise minimum of two matrices. Since min(lz, uz) is a concave
function of z ∈ R, this is a convex constraint in (P, q), which takes values in S^n × R^n.

Having formulated the nondecreasing constraint, note that a nondecreasing f attains its
minimum over B at the lowest corner x = l·1, so the nonnegativity constraint reduces to
nonnegativity at that point:

    (l²/2) 1ᵀP1 + l·1ᵀq + r ≥ 0.

Finally, f being concave is the same as its Hessian satisfying P ⪯ 0.
In conclusion, we can write the optimization problem as

    minimize    Σ_{i=1}^N ( (1/2) x_iᵀP x_i + qᵀx_i + r − y_i )²
    subject to  min(l·P, u·P) · 1 + q ⪰ 0,
                (l²/2) 1ᵀP1 + l·1ᵀq + r ≥ 0,
                P ⪯ 0.

The objective is a sum of squares of affine functions in the variables (P, q, r), and is hence
convex.
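
A direct cvxpy transcription of this problem might look as follows (a sketch; the data
x_i, y_i and the box limits l, u are made-up placeholders):

import numpy as np
import cvxpy as cp

# Sketch of the convex fitting problem above; the data are placeholders.
rng = np.random.default_rng(0)
n, N = 3, 50
l, u = 0.0, 1.0
X = rng.uniform(l, u, (N, n))                   # sample points x_i in B = [l, u]^n
y_data = 1.0 - np.sum((X - 0.5) ** 2, axis=1)   # made-up targets y_i

P = cp.Variable((n, n), symmetric=True)
q = cp.Variable(n)
r = cp.Variable()

fitted = cp.hstack([0.5 * (X[i] @ P @ X[i]) + q @ X[i] + r for i in range(N)])
objective = cp.Minimize(cp.sum_squares(fitted - y_data))

# Note 1^T P 1 = sum of all entries of P, so it is written as cp.sum(P).
constraints = [cp.minimum(l * P, u * P) @ np.ones(n) + q >= 0,    # nondecreasing on B
               (l**2 / 2) * cp.sum(P) + l * cp.sum(q) + r >= 0,   # nonnegative on B
               P << 0]                                            # concave

prob = cp.Problem(objective, constraints)
prob.solve()
print(prob.value)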

Exercise A5.21
(a) f is convex because it’s the supremum of a family of convex (linear) functions.

(b) The Lagrangian of the maximization problem defining f(x) is

    L(c, λ) = cᵀx + λᵀ(g − Fc) = λᵀg + (x − Fᵀλ)ᵀc,

which is unbounded above in c unless Fᵀλ = x. The dual problem is therefore

    minimize    λᵀg
    subject to  Fᵀλ = x,
                λ ⪰ 0.

The dual optimal value is equal to the primal optimal value f(x) because strong
duality holds for feasible linear programs.

(c) Plugging the dual problem into the original robust LP yields

    minimize    λᵀg
    subject to  Ax ⪰ b,
                Fᵀλ = x,
                λ ⪰ 0,

with variables (x, λ). Eliminating x via x = Fᵀλ gives, more succinctly,

    minimize    λᵀg
    subject to  AFᵀλ ⪰ b,
                λ ⪰ 0.

Python code:
import numpy as np
import cvxpy as cp

np.random.seed(10)
(m, n) = (30, 10)
A = np.random.rand(m, n)
b = np.random.rand(m, 1)
c_nom = np.ones((n, 1)) + np.random.rand(n, 1)
average_nom = np.sum(c_nom) / n

lambd = cp.Variable((2 * n + 2, 1))

# F and g describe the uncertainty set {c : F c <= g}:
# 0.75*c_nom <= c <= 1.25*c_nom, and the average of c within
# [0.9, 1.1] times the nominal average.
F = np.block([[np.eye(n)],
              [-np.eye(n)],
              [np.ones(n) / n],
              [-np.ones(n) / n]])

g = np.block([[1.25 * c_nom],
              [-0.75 * c_nom],
              [1.1 * average_nom],
              [-0.9 * average_nom]])

# Robust LP in the dual variable lambda (x = F^T lambda is eliminated).
constraints = [A @ F.T @ lambd >= b,
               lambd >= 0]
prob = cp.Problem(cp.Minimize(lambd.T @ g), constraints)
prob.solve()
print(prob.value)

# Nominal LP for comparison.
x = cp.Variable((n, 1))
prob_nom = cp.Problem(cp.Minimize(c_nom.T @ x), [A @ x >= b])
prob_nom.solve()
print(prob_nom.value)

As output by the above program, the optimal value of f(x) is 3.166, while the
optimal value of cᵀ_nom x is 2.109. The rather large gap between the two values
might indicate that the nominal LP is not very robust.
