Solutions Chapter 2
1.   SECTION 2.1
     2.1.9     www
From Prop. 2.1.2(a), if $x^*$ is a local minimum, then
$$\nabla f(x^*)'(x - x^*) \ge 0, \qquad \forall\, x \in X,$$
or
$$\sum_{i=1}^n \frac{\partial f(x^*)}{\partial x_i}\,(x_i - x^*_i) \ge 0.$$
If $x^*_i = \alpha_i$, then $x_i \ge x^*_i$ for all feasible $x_i$. Letting $x_j = x^*_j$ for $j \ne i$, we have
$$\frac{\partial f(x^*)}{\partial x_i} \ge 0.$$
Similarly, if $x^*_i = \beta_i$, then $x_i \le x^*_i$ for all feasible $x_i$. Letting $x_j = x^*_j$ for $j \ne i$, we have
$$\frac{\partial f(x^*)}{\partial x_i} \le 0.$$
If $\alpha_i < x^*_i < \beta_i$, let $x_j = x^*_j$ for $j \ne i$. Letting $x_i = \alpha_i$, we obtain
$$\frac{\partial f(x^*)}{\partial x_i} \le 0,$$
and letting $x_i = \beta_i$, we obtain
$$\frac{\partial f(x^*)}{\partial x_i} \ge 0.$$
Combining these inequalities, we see that we must have
$$\frac{\partial f(x^*)}{\partial x_i} = 0.$$
Assume that $f$ is convex. To show that Eqs. (1.6)-(1.8) are sufficient for $x^*$ to be a global minimum, let $I_1 = \{i \mid x^*_i = \alpha_i\}$, $I_2 = \{i \mid x^*_i = \beta_i\}$, $I_3 = \{i \mid \alpha_i < x^*_i < \beta_i\}$. Then
$$\nabla f(x^*)'(x - x^*) = \sum_{i=1}^n \frac{\partial f(x^*)}{\partial x_i}(x_i - x^*_i) = \sum_{i\in I_1} \frac{\partial f(x^*)}{\partial x_i}(x_i - \alpha_i) + \sum_{i\in I_2} \frac{\partial f(x^*)}{\partial x_i}(x_i - \beta_i) + \sum_{i\in I_3} \frac{\partial f(x^*)}{\partial x_i}(x_i - x^*_i).$$
Since $\partial f(x^*)/\partial x_i \ge 0$ and $x_i - \alpha_i \ge 0$ for $i \in I_1$, $\partial f(x^*)/\partial x_i \le 0$ and $x_i - \beta_i \le 0$ for $i \in I_2$, and $\partial f(x^*)/\partial x_i = 0$ for $i \in I_3$, each term in the above sum is greater than or equal to zero. Therefore
$$\nabla f(x^*)'(x - x^*) \ge 0, \qquad \forall\, x \in X.$$
From Prop. 2.1.2(b), it follows that $x^*$ is a global minimum.
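As a numerical illustration (added here, not part of the original solution), the sketch below checks the componentwise conditions for a convex quadratic over a box; the data $Q$, $b$, the bounds, and the projected-gradient solver are all hypothetical choices.

```python
import numpy as np

# Hypothetical example: minimize f(x) = 0.5*x'Qx - b'x over the box
# alpha <= x <= beta, then check the conditions of Exercise 2.1.9.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])   # positive definite Hessian
b = np.array([4.0, -3.0])
alpha = np.array([0.0, 0.0])
beta = np.array([1.0, 1.0])

def grad(x):
    return Q @ x - b

# Projected gradient iteration (stepsize below 2/L, L = max eigenvalue
# of Q, so the iteration converges); this is just an illustrative solver.
x = np.zeros(2)
for _ in range(5000):
    x = np.clip(x - 0.1 * grad(x), alpha, beta)

g = grad(x)
for i in range(len(x)):
    if np.isclose(x[i], alpha[i]):
        assert g[i] >= -1e-8        # df/dx_i >= 0 when x*_i = alpha_i
    elif np.isclose(x[i], beta[i]):
        assert g[i] <= 1e-8         # df/dx_i <= 0 when x*_i = beta_i
    else:
        assert abs(g[i]) <= 1e-6    # df/dx_i = 0 at an interior component
print("optimality conditions verified at x* =", x)
```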
2.1.10       www
For any $x \in X$ such that $\nabla f(x^*)'(x - x^*) = 0$, we have by the second order expansion of Prop. A.23, for all $\alpha \in [0, 1]$ and some $\tilde\alpha \in [0, \alpha]$,
$$f\big(x^* + \alpha(x - x^*)\big) - f(x^*) = \tfrac{1}{2}\alpha^2 (x - x^*)'\nabla^2 f\big(x^* + \tilde\alpha(x - x^*)\big)(x - x^*),$$
where the first order term vanishes because $\nabla f(x^*)'(x - x^*) = 0$. For all sufficiently small $\alpha$, the left-hand side is nonnegative, since $x^*$ is a local minimum. Hence the same is true for the right-hand side, and by taking the limit as $\alpha \to 0$ (and hence also $\tilde\alpha \to 0$), we obtain
$$(x - x^*)'\nabla^2 f(x^*)(x - x^*) \ge 0.$$
2.1.11       www
Proof under condition (1): Assume, to arrive at a contradiction, that $x^*$ is not a local minimum. Then there exists a sequence $\{x^k\} \subseteq X$ converging to $x^*$ such that $f(x^k) < f(x^*)$ for all $k$. We have
$$f(x^k) = f(x^*) + \nabla f(x^*)'(x^k - x^*) + \tfrac{1}{2}(x^k - x^*)'\nabla^2 f(x^*)(x^k - x^*) + o(\|x^k - x^*\|^2).$$
Introducing the vector
$$p^k = \frac{x^k - x^*}{\|x^k - x^*\|},$$
and using the relation $f(x^k) < f(x^*)$, we obtain
$$\nabla f(x^*)'p^k + \tfrac{1}{2}\,p^{k\prime}\nabla^2 f(x^*)p^k\,\|x^k - x^*\| + \frac{o(\|x^k - x^*\|^2)}{\|x^k - x^*\|} < 0. \tag{1}$$
This together with the hypothesis $\nabla f(x^*)'p^k \ge 0$ implies
$$\tfrac{1}{2}\,p^{k\prime}\nabla^2 f(x^*)p^k\,\|x^k - x^*\| + \frac{o(\|x^k - x^*\|^2)}{\|x^k - x^*\|} < 0. \tag{2}$$
Let us call feasible direction at $x^*$ any vector $p$ of the form $\alpha(x - x^*)$, where $\alpha > 0$ and $x \in X$, $x \ne x^*$ (see also Section 2.2). The sequence $\{p^k\}$ is a sequence of feasible directions at
$x^*$ that lie on the surface of the unit sphere. Therefore, a subsequence $\{p^k\}_K$ converges to a vector $p$, which, because $X$ is polyhedral, must be a feasible direction at $x^*$ (this is easily seen by expressing the polyhedral set $X$ in terms of linear equalities and inequalities). Therefore, by the hypothesis of the exercise, we have $\nabla f(x^*)'p \ge 0$. On the other hand, by letting $k \to \infty$, $k \in K$ in Eq. (1), we obtain $\nabla f(x^*)'p \le 0$, so that
$$\nabla f(x^*)'p = 0.$$
The hypothesis of the exercise then implies that
$$p'\nabla^2 f(x^*)p > 0. \tag{3}$$
Dividing by $\|x^k - x^*\|$ and taking the limit in Eq. (2) as $k \to \infty$, $k \in K$, we obtain
$$\tfrac{1}{2}\,p'\nabla^2 f(x^*)p + \lim_{k\to\infty,\,k\in K} \frac{o(\|x^k - x^*\|^2)}{\|x^k - x^*\|^2} \le 0,$$
and since the limit of the second term is zero, this contradicts Eq. (3).
Proof under condition (2): Here we argue in a similar way as under condition (1). Suppose that all the given assumptions hold and $x^*$ is not a local minimum. Then there is a sequence $\{x^k\} \subseteq X$ converging to $x^*$ such that $f(x^k) < f(x^*)$ for all $k$. By using the second order expansion of $f$ at $x^*$ and introducing the vector $p^k = (x^k - x^*)/\|x^k - x^*\|$, we have that both Eqs. (1) and (2) hold for all $k$. Since $\{p^k\}$ consists of feasible directions at $x^*$ that lie on the surface of the unit sphere, there is a subsequence $\{p^k\}_K$ converging to a vector $p$ with $\|p\| = 1$. By the assumption given in the exercise, we have that
$$\nabla f(x^*)'p^k \ge 0, \qquad \forall\, k.$$
Hence $\nabla f(x^*)'p \ge 0$. By letting $k \to \infty$, $k \in K$ in Eq. (1), we obtain $\nabla f(x^*)'p \le 0$. Consequently $\nabla f(x^*)'p = 0$. Since the vector $p$ is in the closure of the set of feasible directions at $x^*$, the condition given in part (2) implies that $p'\nabla^2 f(x^*)p > 0$. Dividing by $\|x^k - x^*\|$ and taking the limit in Eq. (2) as $k \to \infty$, $k \in K$, we obtain $p'\nabla^2 f(x^*)p \le 0$, which is a contradiction. Therefore, $x^*$ must be a local minimum.
Proof under condition (3): We have
$$f(x) = f(x^*) + \nabla f(x^*)'(x - x^*) + \tfrac{1}{2}(x - x^*)'\nabla^2 f(x^*)(x - x^*) + o(\|x - x^*\|^2),$$
so that by using the hypotheses $\nabla f(x^*)'(x - x^*) \ge 0$ and $(x - x^*)'\nabla^2 f(x^*)(x - x^*) \ge \gamma\|x - x^*\|^2$,
$$f(x) - f(x^*) \ge \frac{\gamma}{2}\|x - x^*\|^2 + o(\|x - x^*\|^2).$$
The expression on the right-hand side is nonnegative for $x \in X$ close enough to $x^*$, and it is strictly positive if in addition $x \ne x^*$. Hence $x^*$ is a strict local minimum.
Example: [Why the assumption that $X$ is a polyhedral set was important under condition (1)] A polyhedral set $X$ has the property that for any point $x \in X$, the set $V(x)$ of feasible directions at $x$ is closed. This was crucial for proving that the conditions
$$\nabla f(x^*)'(x - x^*) \ge 0, \qquad \forall\, x \in X, \tag{1}$$
$$(x - x^*)'\nabla^2 f(x^*)(x - x^*) > 0, \qquad \forall\, x \in X,\ x \ne x^*, \text{ for which } \nabla f(x^*)'(x - x^*) = 0, \tag{2}$$
are sufficient for local optimality of $x^*$.
Consider the set $X = \{(x_1, x_2) \mid (x_1)^2 \le x_2\}$ and the point $(0, 0) \in X$. Let the cost function be $f(x_1, x_2) = -2(x_1)^2 + x_2$. Note that the gradient of $f$ at $0$ is $(0, 1)'$. It is easy to see that
$$\nabla f(0)'(x - 0) = x_2 > 0, \qquad \forall\, x \in X,\ x \ne 0.$$
Thus the point $x^* = 0$ satisfies conditions (1) and (2) (condition (2) is trivially satisfied, since in our example $\nabla f(0)'(x - 0) = 0$ simply never occurs for $x \in X$, $x \ne 0$). On the other hand, $x^* = 0$ is not a local minimum of $f$ over $X$. Consider the points $x^n = (1/n, 1/n^2) \in X$ for $n \ge 1$. Since $x^n \to x^*$ as $n \to \infty$, for any $\delta > 0$ there is an index $n_\delta$ such that $\|x^n - x^*\| < \delta$ for all $n \ge n_\delta$. By evaluating the cost function, we have $f(x^n) = -1/n^2 < 0 = f(x^*)$. Hence, in any $\delta$-neighborhood of $x^* = 0$ there are points $x^n \in X$ with smaller cost, i.e., $x^*$ is not a local minimum.
This happens because the set $V(x^*)$ of feasible directions at $x^*$ is not closed in this case. Normalized to unit length, the set $V(x^*)$ is given by
$$V(x^*) = \{d = (d_1, d_2) \mid d_2 > 0,\ \|d\| = 1\},$$
which is not closed. The vectors $(1, 0)'$ and $(-1, 0)'$ belong to the closure of $V(x^*)$, but they are not in the set $V(x^*)$.
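As an added numerical check (using the example's own data), the sketch below confirms that feasible points with smaller cost exist arbitrarily close to $x^* = 0$.

```python
import numpy as np

# X = {(x1, x2) : x1**2 <= x2},  f(x1, x2) = -2*x1**2 + x2,  x* = (0, 0).
f = lambda x1, x2: -2.0 * x1**2 + x2

# grad f(0) = (0, 1)', so grad f(0)'(x - 0) = x2 > 0 for all feasible
# x != 0, yet x* is not a local minimum: f(x^n) = -1/n**2 < 0 = f(x*).
for n in [10, 100, 1000, 10000]:
    x = (1.0 / n, 1.0 / n**2)           # lies on the boundary x1**2 = x2
    assert x[0]**2 <= x[1] + 1e-15      # feasibility
    assert f(*x) < f(0.0, 0.0)          # strictly better than x*
    print(f"n={n:6d}  ||x - x*||={np.hypot(*x):.1e}  f(x)={f(*x):.1e}")
```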
2.1.18    www
The assumption on $\nabla^2 f(x)$ guarantees that $f$ is strictly convex and coercive, so it has a unique global minimum over any closed convex set (using Weierstrass' theorem, Prop. A.8). By the second order expansion of Prop. A.23, we have for all $x$ and $y$ in $\Re^n$
$$f(y) = f(x) + \nabla f(x)'(y - x) + \tfrac{1}{2}(y - x)'\nabla^2 f(\tilde y)(y - x)$$
for some $\tilde y$ on the line segment connecting $x$ and $y$. It follows, using the hypothesis, that
$$\nabla f(x)'(y - x) + \frac{M}{2}\|y - x\|^2 \ge f(y) - f(x) \ge \nabla f(x)'(y - x) + \frac{m}{2}\|y - x\|^2.$$
Taking the minimum over $y \in X$ in this inequality, and changing sign, we obtain
$$-\min_{y\in X}\Big\{\nabla f(x)'(y - x) + \frac{M}{2}\|y - x\|^2\Big\} \le f(x) - f(x^*) \le -\min_{y\in X}\Big\{\nabla f(x)'(y - x) + \frac{m}{2}\|y - x\|^2\Big\},$$
which is the desired relation.
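As an added illustration, the sketch below verifies the two-sided bound for a hypothetical quadratic over a box. The inner minimization is available in closed form: the minimizing $y$ is the projection of $x - \nabla f(x)/c$ onto $X$, with $c = m$ or $c = M$.

```python
import numpy as np

# Hypothetical instance: f(x) = 0.5*x'Qx over the box X = [-1, 1]^2,
# with m, M the smallest and largest eigenvalues of the Hessian Q.
Q = np.diag([1.0, 4.0])
m, M = 1.0, 4.0
f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x
proj = lambda x: np.clip(x, -1.0, 1.0)

def bound(x, c):
    # -min_{y in X} { grad f(x)'(y - x) + (c/2)*||y - x||^2 }; the
    # minimizing y is the projection of x - grad f(x)/c onto X.
    y = proj(x - grad(x) / c)
    return -(grad(x) @ (y - x) + 0.5 * c * np.dot(y - x, y - x))

x_star = np.zeros(2)                    # minimizer of f over X
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, size=2)  # arbitrary point of X
    gap = f(x) - f(x_star)
    assert bound(x, M) - 1e-12 <= gap <= bound(x, m) + 1e-12
print("error bound verified at 1000 random points of X")
```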
2.1.19 of 2nd Printing (Existence of Solutions of Nonconvex Quadratic Programming Problems)   www
Let $\{\gamma^k\}$ be a decreasing sequence with $\gamma^k \downarrow f^*$, and denote
$$S^k = \{x \in X \mid x'Qx + c'x \le \gamma^k\}.$$
Then the set of optimal solutions of the problem is $\cap_{k=0}^\infty S^k$, so by Prop. 2.1.4, it will suffice to show that for each asymptotic direction of $\{S^k\}$, all corresponding asymptotic sequences are retractive. Let $d$ be an asymptotic direction and let $\{x^k\}$ be a corresponding asymptotic sequence. Similar to the proof of Prop. 2.1.5, we have $d'Qd \le 0$. Also, in case (i), similar to the proof of Prop. 2.1.5, we have $a_j'd \le 0$ for all $j$, while in case (ii) it is seen that $d \in N$, where $X = B + N$ and $B$ is compact and $N$ is a polyhedral cone. For any $x \in X$, consider the vectors $\tilde x^k = x + kd$. Then, in both cases (i) and (ii), it can be seen that $\tilde x^k \in X$ [in case (i) by using the argument in the proof of Prop. 2.1.5, and in case (ii) by using the definition $X = B + N$]. Thus, the cost function value corresponding to $\tilde x^k$ satisfies
$$f^* \le (x + kd)'Q(x + kd) + c'(x + kd) = x'Qx + c'x + k^2 d'Qd + k(c + 2Qx)'d \le x'Qx + c'x + k(c + 2Qx)'d,$$
where the last inequality follows from the fact $d'Qd \le 0$. From the finiteness of $f^*$, it follows that
$$(c + 2Qx)'d \ge 0, \qquad \forall\, x \in X.$$
We now show that $\{x^k\}$ is retractive, so that we can use Prop. 2.1.4. Indeed, for any $\alpha > 0$, since $\|x^k\| \to \infty$, it follows that for $k$ sufficiently large, we have $x^k - \alpha d \in X$ [this follows similar to the proof of Prop. 2.1.5 in case (i), and because $d \in N$ in case (ii)]. Furthermore, we have
$$f(x^k - \alpha d) = (x^k - \alpha d)'Q(x^k - \alpha d) + c'(x^k - \alpha d) = x^{k\prime}Qx^k + c'x^k - \alpha(c + 2Qx^k)'d + \alpha^2 d'Qd \le x^{k\prime}Qx^k + c'x^k \le \gamma^k,$$
where the first inequality follows from the facts $d'Qd \le 0$ and $(c + 2Qx^k)'d \ge 0$ shown earlier. Thus, for sufficiently large $k$, we have $x^k - \alpha d \in S^k$, so that $\{x^k\}$ is retractive. The existence of an optimal solution now follows from Prop. 2.1.4.
2.1.20 of 2nd Printing         www
We proceed as in the proof of Prop. 2.1.5. By decomposing $d^k$ as the sum of a vector in the nullspace of $A$ and a vector in its orthogonal complement, and using an argument like the one in the proof of Prop. 2.1.5, we can show that
$$Ad = 0, \qquad c'd \le 0.$$
Similarly, we can show that
$$a_j'd \le 0, \qquad j = 1, \ldots, r.$$
Using the finiteness of $f^*$, we can also show that $c'd = 0$, and we can conclude the proof similarly to the proof of Prop. 2.1.5.
2.1.21 of 2nd Printing         www
Note that the cone $N$ in this exercise must be assumed polyhedral (see the errata sheet). Let $S^k = \{x \in X \mid f(x) \le \gamma^k\}$, let $d$ be an asymptotic direction of $\{S^k\}$, and let $\{x^k\}$ be a corresponding asymptotic sequence. We will show that $\{x^k\}$ is retractive, so by applying Prop. 2.1.4, it follows that the intersection of $\{S^k\}$, the set of minima of $f$ over $X$, is nonempty.

Since $d$ is an asymptotic direction of $\{S^k\}$, $d$ is also an asymptotic direction of $\{x \mid f(x) \le \gamma^k\}$, and by hypothesis, for some bounded positive sequence $\{\alpha^k\}$ and some positive integer $\bar k$, we have $f(x^k - \alpha^k d) \le \gamma^k$ for all $k \ge \bar k$.

Let $X = \bar X + N$, where $\bar X$ is compact and $N$ is the polyhedral cone
$$N = \{y \mid a_j'y \le 0,\ j = 1, \ldots, r\},$$
where $a_1, \ldots, a_r$ are some vectors. We can represent $x^k$ as
$$x^k = \bar x^k + y^k, \qquad \forall\, k = 0, 1, \ldots,$$
where $\bar x^k \in \bar X$ and $y^k \in N$, so that
$$a_j'x^k = a_j'(\bar x^k + y^k), \qquad \forall\, k = 0, 1, \ldots,\ j = 1, \ldots, r.$$
Dividing both sides by $\|x^k\|$ and taking the limit as $k \to \infty$ (using the boundedness of $\{\bar x^k\}$), we obtain
$$a_j'd = \lim_{k\to\infty} \frac{a_j'y^k}{\|x^k\|}.$$
Since $a_j'y^k \le 0$ for all $k$ and $j$, we obtain that $a_j'd \le 0$ for all $j$, so that $d \in N$.

For each $j$, fix any $\bar\alpha > 0$ and consider two cases:

(1) $a_j'd = 0$. In this case, $a_j'(y^k - \alpha d) = a_j'y^k \le 0$ for all $k$ and all $\alpha$, since $y^k \in N$.

(2) $a_j'd < 0$. In this case, we have
$$\frac{1}{\|x^k\|}\,a_j'(y^k - \bar\alpha d) = \frac{1}{\|x^k\|}\,a_j'(x^k - \bar x^k - \bar\alpha d),$$
so that, since $x^k/\|x^k\| \to d$, $\{x^k\}$ is unbounded, and $\{\bar x^k\}$ is bounded, we obtain
$$\lim_{k\to\infty} \frac{1}{\|x^k\|}\,a_j'(y^k - \bar\alpha d) = a_j'd < 0.$$
Hence $a_j'(y^k - \bar\alpha d) < 0$ for $k$ greater than some $\bar k$.

Thus, for $k \ge \bar k$ and $\alpha \in (0, \bar\alpha]$, we have $a_j'(y^k - \alpha d) \le a_j'(y^k - \bar\alpha d) \le 0$ for all $j$, so that $y^k - \alpha d \in N$ and $x^k - \alpha d = \bar x^k + (y^k - \alpha d) \in X$.

Thus $\{x^k\}$ is retractive, and by applying Prop. 2.1.4, we conclude that $\{S^k\}$ has nonempty intersection.
     2.1.22 of 2nd Printing           www
We follow the hint. Let $\{y_k\}$ be a sequence of points in $A\,S$ converging to some $y \in \Re^n$. We will prove that $A\,S$ is closed by showing that $y \in A\,S$.

We introduce the sets
$$W_k = \{z \mid \|z - y\| \le \|y_k - y\|\},$$
and
$$S_k = \{x \in S \mid Ax \in W_k\}.$$
To show that $y \in A\,S$, it is sufficient to prove that the intersection $\cap_{k=0}^\infty S_k$ is nonempty, since every $x \in \cap_{k=0}^\infty S_k$ satisfies $x \in S$ and $Ax = y$ (because $y_k \to y$). The asymptotic directions of $\{S_k\}$ are asymptotic directions of $S$ that are also in the nullspace of $A$, and it can be seen that every corresponding asymptotic sequence is retractive for $\{S_k\}$. Hence, by Prop. 2.1.4, $\cap_{k=0}^\infty S_k$ is nonempty.
2.   SECTION 2.2
      2.2.7   www
Since the number of extreme points of $X$ is finite, some extreme point must be repeated within a finite number of iterations, i.e., for some $k$ and some $i \in \{0, 1, \ldots, k-1\}$, we have
$$x^i = \arg\min_{x\in X} \nabla f(x^k)'(x - x^k).$$
Since $x^k$ minimizes $f(x)$ over $X^{k-1}$, which contains $x^i$, we must have
$$\nabla f(x^k)'(x^i - x^k) \ge 0, \qquad \forall\, i = 0, 1, \ldots, k-1.$$
Combining the above two relations, we see that
$$\nabla f(x^k)'(x - x^k) \ge 0, \qquad \forall\, x \in X,$$
which implies that $x^k$ is a stationary point of $f$ over $X$.
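To make the mechanism concrete, here is a minimal sketch of the scheme for a hypothetical quadratic over a box; solving the master problem over the convex hull of the generated extreme points with scipy's SLSQP is an implementation choice, not part of the exercise.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical test problem: minimize f(x) = 0.5*||x - z||^2 over the
# box X = [-1, 1]^3; x^k minimizes f over the convex hull X^{k-1} of
# the extreme points generated so far.
z = np.array([0.3, -2.0, 0.7])
f = lambda x: 0.5 * np.sum((x - z) ** 2)
grad = lambda x: x - z

def extreme_point(g):
    # arg min of g'x over the box [-1, 1]^3: the corner opposing g.
    return -np.sign(g)

def minimize_over_hull(points):
    # Minimize f over conv(points) via barycentric weights w >= 0,
    # sum(w) = 1; SLSQP is just a convenient solver for this sketch.
    P = np.array(points)
    cons = {"type": "eq", "fun": lambda w: np.sum(w) - 1.0}
    w0 = np.full(len(points), 1.0 / len(points))
    res = minimize(lambda w: f(w @ P), w0, method="SLSQP",
                   bounds=[(0.0, 1.0)] * len(points), constraints=cons)
    return res.x @ P

x = np.zeros(3)
points = [x.copy()]
for k in range(10):
    x_ext = extreme_point(grad(x))          # extreme-point subproblem
    if grad(x) @ (x_ext - x) >= -1e-6:      # stationary over X: stop
        break
    points.append(x_ext)
    x = minimize_over_hull(points)          # master problem over hull
print("stationary point:", np.round(x, 4))  # expect clip(z, -1, 1)
```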
3.   SECTION 2.3
     2.3.4    www
We assume here that the unscaled version of the method ($H^k = I$) is used and that the stepsize $s^k$ is a constant $s > 0$.

(a) If $x^k$ is nonstationary, there exists a feasible descent direction $\hat x^k - x^k$ for the original problem, where $\hat x^k \in X$. Since $\hat x^k \in X^k$, we have
$$\nabla f(x^k)'(\tilde x^k - x^k) + \frac{1}{2s}\|\tilde x^k - x^k\|^2 \le \nabla f(x^k)'(\hat x^k - x^k) + \frac{1}{2s}\|\hat x^k - x^k\|^2 < 0,$$
where $\tilde x^k$ is defined by the algorithm. Thus,
$$\nabla f(x^k)'(\tilde x^k - x^k) \le -\frac{1}{2s}\|\tilde x^k - x^k\|^2 < 0,$$
so that $\tilde x^k - x^k$ is a descent direction at $x^k$. It is also a feasible direction, since $a_j'(\tilde x^k - x^k) \le 0$ for all $j$ such that $a_j'x^k = b_j$.
(b) As in the proof of Prop. 2.3.1, we will show that the direction sequence $\{\bar x^k - x^k\}$ is gradient-related, where
$$\bar x^k = \gamma^k \tilde x^k + (1 - \gamma^k)x^k$$
and
$$\gamma^k = \max\big\{\gamma \in [0, 1] \mid \gamma \tilde x^k + (1 - \gamma)x^k \in X\big\}.$$
Indeed, suppose that $\{x^k\}_{k\in K}$ converges to a nonstationary point $\tilde x$. We must prove that
$$\limsup_{k\to\infty,\,k\in K} \|\bar x^k - x^k\| < \infty, \tag{*}$$
$$\limsup_{k\to\infty,\,k\in K} \nabla f(x^k)'(\bar x^k - x^k) < 0. \tag{**}$$
Since $\|\bar x^k - x^k\| \le \|\tilde x^k - x^k\| \le s\|\nabla f(x^k)\|$, Eq. (*) clearly holds, so we concentrate on proving Eq. (**). The key to this is showing that $\gamma^k$ is bounded away from 0, so that the inner product $\nabla f(x^k)'(\bar x^k - x^k)$ is bounded away from 0 when $\nabla f(x^k)'(\tilde x^k - x^k)$ is.

For each $k$, we either have $\gamma^k = 1$, or else we must have, for some $j$ with $a_j'x^k < b_j - \epsilon$,
$$a_j'\big(\gamma^k \tilde x^k + (1 - \gamma^k)x^k\big) = b_j,$$
so that
$$\gamma^k a_j'(\tilde x^k - x^k) = b_j - a_j'x^k > \epsilon,$$
from which
$$\gamma^k > \frac{\epsilon}{\|a_j\| \cdot \|\tilde x^k - x^k\|}.$$
It follows that for all $k$, we have
$$\min\Big\{1,\ \min_j \frac{\epsilon}{\|a_j\| \cdot \|\tilde x^k - x^k\|}\Big\} \le \gamma^k \le 1.$$
Since the subsequence $\{x^k\}_K$ converges, the subsequence $\{\tilde x^k - x^k\}_K$ is bounded, implying also that the subsequence $\{\gamma^k\}_K$ is bounded away from 0.
For sufficiently large $k$, the set
$$X^k = \big\{x \mid a_j'x \le b_j, \text{ for all } j \text{ with } b_j - \epsilon \le a_j'x^k \le b_j\big\}$$
is equal to the set
$$\tilde X = \big\{x \mid a_j'x \le b_j, \text{ for all } j \text{ with } b_j - \epsilon \le a_j'\tilde x \le b_j\big\},$$
so proceeding as in the proof of Prop. 2.3.1, we obtain
$$\limsup_{k\to\infty,\,k\in K} \nabla f(x^k)'(\tilde x^k - x^k) \le -\frac{1}{s}\big\|\tilde x - [\tilde x - s\nabla f(\tilde x)]^+\big\|^2,$$
where $[\cdot]^+$ denotes projection on the set $\tilde X$. Since $\tilde x$ is nonstationary, the right-hand side of the above inequality is negative, so that
$$\limsup_{k\to\infty,\,k\in K} \nabla f(x^k)'(\tilde x^k - x^k) < 0.$$
We have $\bar x^k - x^k = \gamma^k(\tilde x^k - x^k)$, and since $\gamma^k$ is bounded away from 0, it follows that
$$\limsup_{k\to\infty,\,k\in K} \nabla f(x^k)'(\bar x^k - x^k) < 0,$$
proving Eq. (**).
(c) Here we consider the variant of the method that uses a constant stepsize, which, however, is reduced if necessary to ensure that $\bar x^k$ is feasible. If the stepsize is sufficiently small to ensure convergence to the unique local minimum $x^*$ of the positive definite quadratic cost function, then $x^k$ will be arbitrarily close to $x^*$ for sufficiently large $k$, so that $\bar x^k = \tilde x^k$. Thus the convergence rate estimate of the text applies.
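As an added illustration of parts (a) and (b), the sketch below implements the $\epsilon$-relaxed subproblem and the $\gamma^k$ ratio test for the simple case of upper bound constraints only; the data and the parameters $s$ and $\epsilon$ are hypothetical.

```python
import numpy as np

# X = {x : x <= b} (upper bounds only), f(x) = 0.5*||x - z||^2.
b = np.array([1.0, 1.0])
z = np.array([2.0, 0.2])
grad = lambda x: x - z
s, eps = 0.5, 0.1

def step(x):
    active = (b - x) <= eps             # epsilon-active constraints
    # x_tilde solves the subproblem over X^k, i.e., it is the projection
    # of x - s*grad(x) on X^k (only epsilon-active bounds are kept).
    y = x - s * grad(x)
    x_tilde = np.where(active, np.minimum(y, b), y)
    d = x_tilde - x
    # gamma^k: largest gamma in [0, 1] with x + gamma*d feasible for X.
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = np.where(d > 0, (b - x) / d, np.inf)
    gamma = min(1.0, ratios.min())
    return x + gamma * d                # x_bar of part (b)

x = np.zeros(2)
for _ in range(100):
    x = step(x)
print(x)    # expect the minimizer clip(z) = (1.0, 0.2)
```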
2.3.7   www
The key idea is to show that $x^k$ stays in the bounded set
$$A = \{x \in X \mid f(x) \le f(x^0)\}$$
and to use a constant stepsize $s^k = s$ that depends on the constant $L$ corresponding to this bounded set. Let
$$R = \max\{\|x\| \mid x \in A\}, \qquad G = \max\{\|\nabla f(x)\| \mid x \in A\},$$
and
$$B = \{x \mid \|x\| \le R + 2G\}.$$
Using condition (i) in the exercise, there exists some constant $L$ such that $\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|$ for all $x, y \in B$. Suppose the stepsize $s$ satisfies $0 < s < 2\min\{1, 1/L\}$. We will show by induction on $k$ that with this stepsize, we have $x^k \in A$ and
$$f(x^{k+1}) \le f(x^k) - \Big(\frac{1}{s} - \frac{L}{2}\Big)\|x^{k+1} - x^k\|^2 \le f(x^k), \tag{*}$$
for all $k \ge 0$.
To start the induction, we note that $x^0 \in A$, by the definition of $A$. Suppose that $x^k \in A$. We have $x^{k+1} = \big[x^k - s\nabla f(x^k)\big]^+$, so by using the nonexpansiveness of the projection mapping,
$$\|x^{k+1} - x^k\| \le \big\|\big(x^k - s\nabla f(x^k)\big) - x^k\big\| \le s\|\nabla f(x^k)\| \le 2G.$$
Thus,
$$\|x^{k+1}\| \le \|x^k\| + 2G \le R + 2G,$$
implying that $x^{k+1} \in B$. Since $B$ is convex, we conclude that the entire line segment connecting $x^k$ and $x^{k+1}$ belongs to $B$. In order to prove Eq. (*), we now proceed as in the proof of Prop. 2.3.2. A difficulty arises because Prop. A.24 assumes that the inequality $\|\nabla f(x) - \nabla f(y)\| \le L\|x - y\|$ holds for all $x, y$, whereas in this exercise this inequality holds only for $x, y \in B$. However, using the fact that the Lipschitz condition holds along the line segment connecting $x^k$ and $x^{k+1}$ (which belongs to $B$, as argued earlier), the proof of Prop. A.24 can be repeated to obtain
$$f(x^{k+1}) - f(x^k) \le \nabla f(x^k)'(x^{k+1} - x^k) + \frac{L}{2}\|x^{k+1} - x^k\|^2.$$
Using this relation, and the relation
$$\nabla f(x^k)'(x^{k+1} - x^k) \le -\frac{1}{s}\|x^{k+1} - x^k\|^2$$
[which is Eq. (3.27) of the text], we obtain Eq. (*) [as in the text, cf. Eq. (3.29)]. It follows that $x^{k+1} \in A$, completing the induction. The remainder of the proof is the same as in Prop. 2.3.2.
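Here is a minimal sketch of the argument in action (all problem data hypothetical): $f(x) = \tfrac{1}{4}\|x\|^4$ has a gradient that is Lipschitz continuous only on bounded sets, the constants $R$, $G$, $L$ and the stepsize $s$ are chosen as in the proof, and the descent inequality (*) is checked at every iteration.

```python
import numpy as np

# f(x) = 0.25*||x||^4 over X = [-2, 2]^2; grad f(x) = ||x||^2 * x is
# Lipschitz only on bounded sets (Hessian norm = 3*||x||^2).
f = lambda x: 0.25 * np.dot(x, x) ** 2
grad = lambda x: np.dot(x, x) * x
proj = lambda x: np.clip(x, -2.0, 2.0)

x = np.array([2.0, 2.0])                 # x0
R = (4.0 * f(x)) ** 0.25                 # f(x) <= f(x0)  iff  ||x|| <= R
G = R ** 3                               # max ||grad f|| over the level set A
L = 3.0 * (R + 2.0 * G) ** 2             # Lipschitz constant of grad f over B
s = min(1.0, 1.0 / L)                    # constant stepsize < 2*min{1, 1/L}

for k in range(200):
    x_new = proj(x - s * grad(x))
    d2 = np.dot(x_new - x, x_new - x)
    # Descent inequality (*): f(x^{k+1}) <= f(x^k) - (1/s - L/2)*||diff||^2.
    assert f(x_new) <= f(x) - (1.0 / s - L / 2.0) * d2 + 1e-12
    x = x_new
print("monotone descent verified; f(x) =", f(x))
```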
2.3.8       www
(a) The expression for $f$ given in the hint is verified by straightforward calculation. Based on this expression, the method takes the form
$$x^{k+1} = \arg\min_{x\in X}\Big\{\nabla f(x^k)'(x - x^k) + \tfrac{1}{2}(x - x^k)'Q(x - x^k) + \frac{1}{2c^k}\|x - x^k\|^2\Big\},$$
or
$$x^{k+1} = \arg\min_{x\in X}\Big\{\nabla f(x^k)'(x - x^k) + \tfrac{1}{2}(x - x^k)'\Big(Q + \frac{1}{c^k}I\Big)(x - x^k)\Big\}.$$
This is recognized as the scaled gradient projection method with scaling matrix $H^k = Q + (1/c^k)I$ and stepsizes $s^k = 1$, $\alpha^k = 1$.
(b) Similar to part (a), we have
$$\bar x^k = \arg\min_{x\in X}\Big\{\nabla f(x^k)'(x - x^k) + \tfrac{1}{2}(x - x^k)'(Q + M^k)(x - x^k)\Big\},$$
and $\bar x^k - x^k$ is recognized as the direction of the scaled gradient projection method with scaling matrix $H^k = Q + M^k$ and stepsize $s^k = 1$.
(c) If $X = \Re^n$ and $M^k = Q$, we have
$$\bar x^k = x^k - (Q + M^k)^{-1}\nabla f(x^k) = x^k - \tfrac{1}{2}Q^{-1}\nabla f(x^k),$$
so for a stepsize $\alpha^k = 2$, we have
$$x^{k+1} = x^k + \alpha^k(\bar x^k - x^k) = x^k - Q^{-1}\nabla f(x^k).$$
Thus the method reduces to the pure form of Newton's method for unconstrained minimization of $f$, which for a quadratic function converges in a single step to the optimal solution.
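A quick numerical confirmation of part (c), with made-up data $Q$, $b$, and starting point: the direction step with $M^k = Q$ followed by the stepsize $\alpha^k = 2$ lands exactly on the minimizer of the quadratic.

```python
import numpy as np

# f(x) = 0.5*x'Qx - b'x with X = R^n; part (c) says one iteration with
# M^k = Q and alpha^k = 2 reaches the minimizer x* = Q^{-1} b.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
Q = A @ A.T + 3.0 * np.eye(3)            # positive definite
b = rng.standard_normal(3)
grad = lambda x: Q @ x - b
x_star = np.linalg.solve(Q, b)           # unique minimizer

x = rng.standard_normal(3)               # arbitrary starting point
x_bar = x - np.linalg.solve(Q + Q, grad(x))   # direction step, H = Q + M^k
x_next = x + 2.0 * (x_bar - x)                # stepsize alpha^k = 2
assert np.allclose(x_next, x_star)            # one-step convergence
print("converged in one step:", x_next)
```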