Lecture 5
Necessary Conditions for
     Finite‐variable
Constrained Minimization
ME260 Indian Institute of Science
S t r u c t u r a l O p t i m i z a t i o n : S i z e , S h a p e , a n d To p o l o g y
G. K. Ananthasuresh
P r o f e s s o r, M e c h a n i c a l E n g i n e e r i n g , I n d i a n I n s t i t u t e o f S c i e n c e , B e n g a l u r u
suresh@iisc.ac.in
                                                                                                                                     1
Outline of the lecture
How do constraints influence the ability to minimize the objective
function?
The concept of Lagrange multipliers
Feasible space
Active and inactive constraints
Necessary conditions
What we will learn:
Constrained optimum lies on the boundary of the feasible space.
Conditions for constrained local minimum; constraint qualification
Sensitivity of the constrained optimum to small changes in constraints.
Physical meaning of Lagrange multipliers.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   2
Two variables and an equality constraint
 Min f  x1 x2                       (The intent here is to maximize the
  x1 ,x2                              product of two numbers such that their
 Subject to                           sum is equal to 1)
             x1  x2  1                                Min f   x1 (1  x1 )
                                                            x1
One easy way to solve this                              or
problem is to eliminate either
x1 or x2 by expressing in terms                         Min f  (1  x2 ) x2
                                                           x2
of the other using the equality
constraint and make this an                               But this kind of explicit elimination
unconstrained problem in one                              of a variable may not always be
variable.                                                 possible. What do we do then?
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   3
Eliminate a variable in the first order
approximation
 Min f (x1 , x2 )                                           Let the optimum be:                         (x1* , x2* )
   x1 ,x2                                                                                                                               x1  x1  x1*
                                                            Let the first order perturbations be:
 Subject to                                                                                                                             x2  x2  x2*
                                                                Now, consider first‐order approximations of
                h(x1, x2 )  0                                  the objective function and the constraint.
                                        f                          f
            f (x1 , x2 )  f (x , x ) 
                                *
                                1
                                      *
                                      2                       x1                             x2
                                        x1           x* ,x *
                                                                    x2             x1* ,x2*
                                                       1    2
                                            h                          h                             h                       h
            h(x1 , x2 )  h(x1* , x2* )                        x1                           x2                     x1                     x2  0
                                            x1      x1* ,x2*
                                                                        x2        x1* ,x2*
                                                                                                       x1   x1* ,x2*
                                                                                                                                x2   x1* ,x2*
                       h                                 h                
             x2                         x1                             
                       x1         x1* ,x2*               x2    x1* ,x2*   
ME 260 / G. K. Ananthasuresh, IISc             Structural Optimization: Size, Shape, and Topology                                          4
After eliminating one variable’s first‐order
perturbation in terms of the other…
                h                                      h                
  With x2                              x1                             
                x1                                     x2               
                                x1* , x2*                     x1* , x2*   
                                                                                    h              
                                                                                                x1 
                                      f                        f                  x              
  f ( x1 , x2 )  f ( x1* , x2* )                      x1                        1 x1* , x2*       f ( x* , x* )  f                x1  
                                                                                                                                                     h
                                                                                                                                                                       x1
                                                                                                              1    2
                                      x1   x1* , x2*
                                                                x2    x1* , x2*
                                                                                      h                               x1   x1* , x2*
                                                                                                                                                     x1   x1* , x2*
                                                                                                  
                                                                                      x2 x* , x* 
                                                                                             1 2 
              f                                                     f                        h                                           Because the
                                       
              x2                                                                                              0                         first order
 where    
                              x1* , x2* 
                                                                       x1          x1* ,x2*     x1   x1* ,x2*                               derivative
              h                                                                                                                                should be
                                                                   f                         h                
              x2                                                                                                                               zero for
                             x1* , x2* 
                                                                                                                 0                         any x
                                                                     x2           x1* ,x2*     x2    x1* ,x2*                                               1
ME 260 / G. K. Ananthasuresh, IISc                        Structural Optimization: Size, Shape, and Topology                                  5
Some generality and a new concept
    Two things are remarkable about the two equations we got: (i) they are
    similar and (ii) they both give a similar expression for the constant,  .
 f                 h               
                                    0                         f            f                   
 x1   x1* ,x2*     x1                                         
                                                                      x
                                                                                                         
                                                                                     x2
                             x1* ,x2*
                                                                     1 x1 ,x2 
                                                                          *  *                        * *
                                                                                                    x1 ,x2
 f                 h                                                    
                                                                     h            h                   
                                    0                                                            
 x2   x1* ,x2*     x2   x1* ,x2*                                x
                                                                     1 x1 ,x2 
                                                                         *   *       x2            * * 
                                                                                                    x1 ,x2
       This is called the Lagrange multiplier corresponding to the equality constraint.
 Think about what the Lagrange multiplier physically means…
 It is the negative of the ratio of the rate of change of objective
 function to the rate of change of the constraint with respect to either
 variable.
ME 260 / G. K. Ananthasuresh, IISc             Structural Optimization: Size, Shape, and Topology              6
The concept of the Lagrangian
        Min f (x1 , x2 )
         x1 ,x2                                               An alternative formulation…
        Subject to                                           Min L  f (x1 , x2 )   h(x1, x2 )
                    h(x1 , x2 )  0                           x1 ,x2
                                                Both have the
                                                same
                                                necessary                                L is called
                                                conditions!
                                      f                 h               
                                                                                         the
                                                                         0        Lagrangian.
                                      x1   x1* ,x2*     x1   x1* ,x2*   
                                      f                 h               
                                                                         0
                                      x2   x1* ,x2*     x2   x1* ,x2*   
ME 260 / G. K. Ananthasuresh, IISc      Structural Optimization: Size, Shape, and Topology   7
General problem in two variables and
one equality constraint
  Min f (x1 , x2 )
   x1 ,x2
                                                Min L  f (x1 , x2 )   h(x1, x2 )
                                                   x1 ,x2
  Subject to
              h(x1 , x2 )  0
                                          f     h
                                                   0                  Three variables (x1 , x2 ,  )
                             conditions
                             Necessary
                                          x1    x1                     And three
                                          f          h                 equations!
                                                         0
                                          x2         x2     We are fine.
                                          h( x1 , x2 )  0
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   8
n variables and m equality constraints
      Min          f (x1 , x2 ,, xn )                               Min f (x)
                                                        Short form
    x1 ,x2 ,,xn                                                       x
    Subject to                                                       Subject to
              h1 (x1 , x2 ,, xn )  0
                                                                            h(x)  0
             h2 (x1 , x2 ,, xn )  0
                                                                   x1           h1 (x1 , x2 ,, xn )   
                                                                                                        
             hm (x1 , x2 ,, xn )  0                    
                                                       x
                                                                     x2           h2 (x1 , x2 ,, xn )   
                                                                         and h                          
                                                                                                      
Can there be more constraints than variables?                       xn           hm (x1 , x2 ,, xn )   
                                                                                                        
  mn         No; feasible values may not exist. It is over‐constrained.
  mn         Some discrete feasible values may exist; but cannot do minimization.
  mn         This must be true in order to do minimization of the objective function.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology    9
Feasible space; partitioning variables
                                                                     Let us partition n variables in
 If there are m constraints (note: m <                               to s (solution or dependent)
 n), we can choose only (n-m)                                        variables and d (decision or
 variables freely because m variables                                independent variables).
 can be found using the m equality
 constraints.                              x1                                           s1       
                                                                                                 
 So, we search in the (n-m)‐               x2                                           s2       
 dimensional feasible space.                                                                   
                                                                                                 
 Feasible space is the reduced space       xm                                           sm         s 
 that satisfies the constraints.       x                                                          
                                           xm1                                         d1         d 
 When we say it is (n-m)‐dimensional,      xm2                                         d2       
 we mean that m variables are                                                                    
 somehow eliminated using m                                                                    
                                           xn                                          dnm      
 equality constraints.                                                                           
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology        10
Taylor series again
                          f (x *
                                n
                                  )
    f (x)  f (x )  
                *
                                    (xi  xi* )  O(2)
                     i1   xi
                 f (x * )  f T (x * ) x *             Approximated to first order.
                                                                                          As per
                 f (x * )   s f T (x * ) s*   d f T (x * ) d*                      partitioned
                                                                                          variables.
 For the necessary condition, we want the first order terms go
 to zero; then, the function value does not change in the
 vicinity of the minimum up to first order.
 But we know that perturbation in s variables cannot be
 independent of those in d variables. So… (next slide)
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology     11
Taylor series for equality constraints
                             n     h j (x * )
h j (x)  h j (x * )  
                             i1      xi
                                                 (xi  xi* ) O(2)
                                                                                       j  1,2,,m
                                                                        As per
             h j (x * )  h j T (x * ) x *                           partitioned
                                                                        variables.
                         0
             h j (x * )   s h j T (x * ) s*   d h j T (x * ) d*  0
Because x* is                                                                                Because x
feasible; i.e.,                                                                              should remain
it satisfies             s h j T (x * ) s*  d h j T (x * ) d*  0                      feasible after
the                                                                                          perturbation
constraints.                                                                                 from x*.
ME 260 / G. K. Ananthasuresh, IISc      Structural Optimization: Size, Shape, and Topology      12
Eliminating s‐perturbations
s h j (x ) s  d h j (x ) d  0                                  j  1,2,,m
        T       *      *             T      *        *
  s h T ( x* )  s *   d h T ( x * )  d *  0                  Compact notation for all constraints.
                                                                   Note the sizes of the quantities.
   m m                 m  (n  m)       m 1
                    m 1        (n  m)  1
                                          1
  s   s h (x )   d hT (x * ) d*
            *                T       *
Remember that we can take the inverse of the m x m matrix here only
if it is not singular. This implies that the gradients of the constraints
should be linearly independent. This is known as constraint
qualification.
ME 260 / G. K. Ananthasuresh, IISc       Structural Optimization: Size, Shape, and Topology   13
Reduced gradient                                                                          Should be zero for
                                                                                          the necessary
 f (x)  f (x * )  s f T (x * ) s*   d f T (x * ) d*                                condition for a
                                                                                          minimum.
                                                       1
With           s   s h (x )   d hT (x * ) d*
                   *                    T       *
We get  s f T (x * ) s*  d f T (x * )d*  0
                                                               1
               s f (x )  s h (x )   d hT (x * ) d*   d f T (x * )d* = 0
                        T       *              T       *
                           T       *              T       *
                                                                1
                  s f (x ) s h (x )  d hT (x * )   d f T (x * ) d* = 0                
                                                                             Think of f as z that depends
           Reduced gradient in the                              z(d)
           p=n‐m space.                                                      only the d variables
ME 260 / G. K. Ananthasuresh, IISc          Structural Optimization: Size, Shape, and Topology   14
                                                                                Reduced gradient is
The multipliers, again                                                          zero is the necessary
                                                                                condition.
Where we used the new symbol for
Lagrange multipliers
appear again.
Compare with the two‐                                               Notice that both equations
variable case on slide 5 of                                         have the same form; one is
this lecture.                                                       gradient w.r.t. to d and the
Same story here!                                                    other w.r.t. s.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   15
The Lagrangian
 With
                                                  f   m    h j
                                                      j      0                       i  1,2,,n
                                                  xi j 1 xi
is a row vector.                              We have n scalar equations here.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   16
Necessary conditions for equality‐
constrained minimization problem
   Min f (x)
      x
   Subject to
        h(x)  0
                                                                        Variables: n+m
                                                                        Equations: n+m
                                                                        So, we are fine.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   17
Geometric interpretation
                                                                This means that the gradient
                                                                of the objective function is a
     f   m    h j                                             linear combination of the
         j      0                    i  1,2,,n           gradients of the constraints.
     xi j 1 xi
         x2                               “Contours” or “level‐set curves” of f(x1,x2)
                                                               f     increases
                                                                            Here, at the constrained
                                                                            minimum, the gradients
                                  h                                       of f an h are in the same
                                                                            direction.
                             f                                    h( x1 , x2 )  0
ME 260 / G. K. Ananthasuresh, IISc      Structural Optimization: Size, Shape, and Topology   18
With inequality constraints
   Min f (x)
      x
   Subject to
        h(x)  0                          p inequality constraints.
        g(x)  0                          gk (x)  0 k  1,2,, p
 Inequality constraints may be active or inactive at the minimum point.
                                                g0               g0
                          Active constraints should                Inequality constraints may
                          be treated just like                     simply be ignored because
                          equality constraints.                    they don not play any role.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   19
Complementarity conditions
How do we know if an inequality constraint is active or not?
We don’t.
So, we express it in the form of equations!
                                                                    This is an interesting set
  k gk  0 k  1,2,, p                                            of equations:
                                                                    Either multiplier is zero
                                                                    or the constraint is zero.
                                                                    They are called
  Lagrange multiplier                                               complementarity
  corresponding to kth                                              conditions.
  inequality constraint
                                                                    Both may also be zero under
                                                                    special situations.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   20
Necessary conditions for constrained
minimization with equalities and inequalities.
Min f (x)
   x
Subject to
     h(x)  0
     g(x)  0
                                                                                 Variables: n+m+p
                                                                                Equations: n+m+p
                                                                                   So, we are fine.
                                                                                                .
But we are not done yet.
The Lagrange multipliers of inequality constraints are
restricted in sign. Let us discuss why.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   21
Inequality constraint; simplest problem
             f ( x)
         Constrained
         local
         minimum
                                                   x
  Feasible region
                                                            For x* to be a local minimum.
                                                           For x* to continue to satisfy
                                                           the inequality even after
                                                           perturbation.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   22
Minimizability vs. feasibility
But necessary condition requires:
Multiply both sides by
  It is a simple but a good                                                                0
  explanation for the non‐
  negativity of the Lagrange                                    So, this has to be non‐negative.
  multipliers of inequality                                     That is,
  constraints.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology    23
Necessary condition for general
constrained minimization problem
Min f (x)
   x
Subject to
     h(x)  0
     g(x)  0
                                            Variables: n+m+p
                                           Equations: n+m+p
                                   Number of inequalities: 2p
   The first condition follows from the Lagrangian.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   24
Necessary conditions: KKT conditions
Min f (x)
   x
Subject to
     h(x)  0
     g(x)  0
    Karush‐Kuhn‐Tucker conditions.
                    Independently done at Princeton University; later Kuhn dug up Karush’s
                    work and gave credit that is due to him. A rare and admirable gesture.
        Had done this in his master’s thesis at University of Chicago before Kuhn and Tucker.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   25
A caveat: constraint qualification
KKT conditions are applicable as “necessary
conditions” only if the constraints qualification is
satisfied.
Constraint qualification requires that the gradients of
the equality constraints and active inequality
constraints be linearly independent at the optimum.
◦ See slide 13.
◦ One can construct special example where a point is a
  minimum but KKT conditions are not satisfied.
  ◦ How can “necessary” conditions be not satisfied?
  ◦ It is because at such special points “constraint qualification” is not
    satisfied. So, KKT conditions are not applicable.
ME 260 / G. K. Ananthasuresh, IISc   Structural Optimization: Size, Shape, and Topology   26
The end note
                                               Two variables and one equality constraint
for finite‐variable constrained optimization
                                               The concept of Lagrange multiplier and the Lagrangian
                                               Feasible space
                                               Reduced gradient with equality constraints
            Necessary conditions
                                               Lagrange multipliers
                                               Constraint qualification
                                               Inequality constraint and the implication of
                                               the sign of the Lagrange multiplier
                                               Complementarity conditions
                                               Karush‐Kuhn‐Tucker necessary conditions
                                                                                                                   Thanks
  ME 260 / G. K. Ananthasuresh, IISc                         Structural Optimization: Size, Shape, and Topology   27