16.323 Principles of Optimal Control: Mit Opencourseware
16.323 Principles of Optimal Control: Mit Opencourseware
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
               16.323 Lecture 6
          ẋ = a(x, u, t)
          ṗ = −HxT
         Hu = 0
Spr 2008	                                                                                      16.323 6–1
                           Optimal Control Problems
•	 Then
                            �                 �
                                     T
            δJa = hxδxf + htf + g + p (a − ẋ) (tf )δtf
                    � tf
                                                T         T
                         �	                                      �
                  +       Hxδx + Huδu + (a − ẋ) δp(t) − p (t)δẋ dt
                                       t0
                                                                                   11
• To proceed, note that by integrating by parts                                         we get:
     �	 tf                 � tf
− pT (t)δẋdt = − pT (t)dδx
                t0                                     t0
                                                                 �   tf   �             �T
                                                    T
                                                       �tf                    dp(t)
                                                = −p δx�t +                                  δxdt
                                                             0
                                                                  t0           dt
                                                                          �   tf
                                                       T
                                                = −p (tf )δx(tf ) +       ṗT (t)δxdt
                                                                      t0
                                                                                  � tf
                                                = −pT (tf ) (δxf − ẋ(tf )δtf ) +      ṗT (t)δxdt
                                                                                              t0
 11
      �                �
          udv ≡ uv −       vdu
∂h
m(x(tf ), tf ) = 0
• Follow the same procedure on 6–1 using the insights provided on 5–21
  (using the ga form on 5–20) to form
• Work through the math, and get the necessary conditions are
                     ẋ = a(x, u, t)         (dim n)        (6.22)
                     ṗ = −HxT               (dim n)        (6.23)
                    Hu = 0                   (dim m)        (6.24)
     – With the boundary condition (lost if tf fixed)
                                 H(tf ) + wtf (tf ) = 0
     – And m(x(tf ), tf ) = 0, with x(t0) and t0 given.
     – With (since x(tf ) is not directly given)
                                         �          �T
                                           ∂w
                               p(tf ) =       (tf )
                                           ∂x
                                          c2    c1
             ẋ1(t) = x2(t) → x1(t) = c4 − t2 + t3
                                          2b    6b
  and since x1(0) = 10, c4 = 10
                       c2     c1
           x2(tf ) = − tf + t2f = 0
                        b     2b
                          c2 2 c1 3
           x1(tf ) = 10 − tf + tf = 0
                          2b     6b
                          c2            60b                         120b
                   = 10 − t2f = 0 → c2 = 2 ,                 c1 =
                          6b             tf                          t3f
     – But that gives us:
                               �                  �2
                          1       60b 120b               (60b)2
                    tf =         − 2 + 3 tf            =
                         2bα       tf  tf                2bαt4f
        so that t5f
 = 1800b/α or tf ≈ 4.48(b/α)1/5, which makes sense
        because tf goes down as α goes up.
Example 6–1
1 %
4 % Jonathan How
5 % opt1.m
6 %
10 %
12 G=ss(A,B,C,D);
13 X0=[10 0]’;
14 b=0.1;
   15
   16   alp=1;
17 tf=(1800*b/alp)^0.2;
18 c1=120*b/tf^3;
19 c2=60*b/tf^2;
20 time=[0:1e-2:tf];
21 u=(-c2+c1*time)/b;
22 [y1,t1]=lsim(G,u,time,X0);
   23
   24   figure(1);clg
25 plot(time,u,’k-’,’LineWidth’,2);hold on
26 alp=10;
27 tf=(1800*b/alp)^0.2;
28 c1=120*b/tf^3;
29 c2=60*b/tf^2;
30 time=[0:1e-2:tf];
31 u=(-c2+c1*time)/b;
32 [y2,t2]=lsim(G,u,time,X0);
33 plot(time,u,’b--’,’LineWidth’,2);
   34
   35   alp=0.10;
36 tf=(1800*b/alp)^0.2;
37 c1=120*b/tf^3;
38 c2=60*b/tf^2;
39 time=[0:1e-2:tf];
40 u=(-c2+c1*time)/b;
41 [y3,t3]=lsim(G,u,time,X0);
42 plot(time,u,’g-.’,’LineWidth’,2);hold off
   43
   44   legend(’\alpha=1’,’\alpha=10’,’\alpha=0.1’)
45 xlabel(’Time (sec)’)
46 ylabel(’u(t)’)
47 title([’b= ’,num2str(b)])
   48
   49   figure(2);clg
50 plot(t1,y1(:,1),’k-’,’LineWidth’,2);
51 hold on
52 plot(t2,y2(:,1),’b--’,’LineWidth’,2);
53 plot(t3,y3(:,1),’g-.’,’LineWidth’,2);
54 hold off
55 legend(’\alpha=1’,’\alpha=10’,’\alpha=0.1’)
56 xlabel(’Time (sec)’)
57 ylabel(’y(t)’)
58 title([’b= ’,num2str(b)])
   59
   60   print -dpng -r300 -f1 opt11.png
z(t) = Cz (t)x(t)
     Cost:
         �       tf   � T
                       z (t)Rzz(t)z(t) + u (t)Ruu(t)u(t) dt + x(tf )T Ptf x(tf )
                                          T
                                                        �
 2JLQR =
                t0
                                               −1 T
                                                  Bu p(t)
       ∂u
                                                            2
    4. As before, we can check for a minimum by looking at H2 ≥ 0
                                                          ∂
                                                          ∂u
       (need to check that Ruu ≥ 0)
• Note that p(t) plays the same role as Jx�(x(t), t)T in previous solutions
  to the continuous LQR problem (see 4–8).
     – Main difference is there is no need to guess a solution for J �(x(t), t)
• Now have:
                                                        −1 T
                    ẋ(t) = Ax(t) + Bu�(t) = Ax(t) − BuRuu Bu p(t)
    which can be combined with equation for the adjoint variable
            ṗ(t) = −Rxxx(t) − AT p(t) = −CzT RzzCz x(t) − AT p(t)
                                                                −1 T
                                          �                             ��
                                                  A         −BuRuu Bu
                      �           �                                                 �
                          ẋ(t)                                              x(t)
                ⇒                     =
                          ṗ(t)               −CzT RzzCz        −AT          p(t)
                                          �                ��           �
                                                           H
    where H is called the Hamiltonian Matrix.
     – Matrix describes coupled closed loop dynamics for both x and p.
     – Dynamics of x(t) and p(t) are coupled, but x(t) known initially
       and p(t) known at terminal time, since p(tf ) = Ptf x(tf )
     – Two point boundary value problem ⇒ typically hard to solve.
•	 How proceed?
    1. For the 2n system
                        � �	                  −1 T
                                                   ��
                                 A	     −B  R   B
               �                                           �
                 ẋ(t)	                   u   uu u    x(t)
                         =
                 ṗ(t)       −CzT RzzCz   −AT         p(t)
        define a transition matrix
                                       �	                             �
                                            F11(t1, t0) F12(t1, t0)
                        F (t1, t0) =
                                            F21(t1, t0) F22(t1, t0)
        and use this to relate x(t) to x(tf ) and p(tf )
                         � �	                           ��
                                F11(t, tf ) F12(t, tf )
                 �                                                �
                    x(t)                                   x(tf )
                            =
                    p(t)        F21(t, tf ) F22(t, tf )    p(tf )
        so
                      x(t) = F11(t, tf )x(tf ) + F12(t, tf )p(tf )
                             �	                            �
                           = F11(t, tf ) + F12(t, tf )Ptf x(tf )
    •	 Now have p(t) = P (t)x(t), must find the equation for P (t)
                                         ṗ(t) = Ṗ (t)x(t) + P (t)ẋ(t)
                ⇒	 − CzT RzzCz x(t) − AT p(t) =
                                                              −1 T
               = CzT Rzz	Cz x(t) + AT p(t) + P (t)(Ax(t) − BuRuu Bu p(t))
                                                          −1 T
               = (CzT RzzCz + P (t)A)x(t) + (AT − P (t)BuRuu Bu )p(t)
                                                           −1 T
               = (CzT RzzCz + P (t)A)x(t) + (AT	 − P (t)BuRuu Bu )P (t)x(t)
                                                       −1 T
                 �	 T                T	
                                                                  �
               = A P (t) + P (t)A + Cz RzzCz − P (t)BuRuu Bu P (t) x(t)
•	 Key point: This controller works equally well for MISO and MIMO
   regulator designs.
                                                                  12
•	 Since F is the transition matrix                                    for the system (see 6–10), then
                          d
                             F (t, tf ) =	 HF (t, tf )
                          dt
           F˙	11                                         −1 T
      �                      �             �	                 �          �         �
                        Ḟ12                   A −BuRuu    Bu              F11 F12
                               (t, tf ) =                       (t, tf )             (t, tf )
           Ḟ21         Ḟ22                  −Rxx     −AT                 F21 F22
  12 Consider homogeneous system ẋ(t) = A(t)x(t) with initial condition x(t ) = x . The general solution to this differential
                                                                                    0      0
equation is given by x(t) = Φ(t, t0 )x(t0 ) where Φ(t1 , t1 ) = I. Can show the following properties of the state transition matrix Φ:
  1. Φ(t2 , t0 ) = Φ(t2 , t1 )Φ(t1 , t0 ), regardless of the order of the ti
F˙ 12 = −1 T
       ��                                          �
                      T                  T
  Ṗ =   −RxxF11 − A F21 + (−RxxF12 − A F22)Ptf
   �                                              ��
             −1 T                   −1 T
−P AF11 − BuRuu Bu F21 + (AF12 − BuRuu Bu F22)Ptf    [F11 + F12Ptf ]−1
              −1 T
         P BuRuu Bu (F21 + F22Ptf )[F11 + F12Ptf ]−1 = P BuRuu
                                                            −1 T
                                                               Bu P
Psi11=V(1:2,1:2);
Psi21=V(3:4,1:2);
Ptest=Psi21*inv(Psi11);
                                            �   tf   �              �
                  1	                    1                d T
                =	 x(tf )T Ptf x(tf ) −                     (p x)       dt
                  2                     2    t0          dt
                    1                     1 �	
                      x(tf )T Ptf x(tf ) − pT (tf )x(tf ) − pT (t0)x(t0)
                                                                        �
                =
                    2	                    2
                  1	    T            1 �	 T                T
                                                                            �
                = x(tf ) Ptf x(tf ) − x (tf )Ptf x(tf ) − x (t0)P (t0)x(t0)
                  2	                 2
                  1
                =	 xT (t0)P (t0)x(t0)
                  2
                    � �                     −1 T
                                                 ��
                            A	      −B    R   B
           �                                              �
             ẋ(t)	                     u   uu u     x(t)
                     =
             ṗ(t)      −CzT RzzCz      −AT          p(t)
    with the appropriate boundary conditions.
•	 Key point: For a SISO system, we can relate the closed-loop poles
   to a Symmetric Root Locus (SRL) for the transfer function
                                                     N (s)
                     Gzu(s) = Cz (sI − A)−1Bu =
                                                     D(s)
     – Poles and zeros of Gzu(s) play an integral role in determining SRL
     – Note Gzu(s) is the transfer function from control inputs to perfor
       mance variable.
  •	 Note: if A is invertible:
                     �         �
                       A B
                 det             = det(A) det(D − CA−1B)
                       C D
                                     −1 T
det(sI−H) = (−1)n D(s)D(−s) det I + Ruu Bu (−sI − AT )−1 CzT Rzz Cz (sI − A)−1 Bu
                               �	                                                 �
                                                  Example 6–2
•	 Simple example from 4–12: A scalar
                                  � ∞system2 with ẋ = ax + bu with
   cost (Rxx > 0 and Ruu > 0) J = 0 (Rzzx (t) + Ruuu2(t)) dt
                                          2 2
•	 The steady-state
               √ 2 P2 solves 2aP + Rzz − P b /Ruu = 0 which gives
             a+ a +b Rzz /Ruu
   that P =         −1 b2
                   Ruu
                              >0
                                                   √2 2
                                         −1      a+ a +b Rzz /Ruu
    – So that u(t) = −Kx(t) where K = Ruu	  bP =       b
     – and the closed-loop dynamics are
                               �	                    �
                                     b  �
           ẋ	 = (a − bK)x = a − (a + a2 + b2Rzz/Ruu) x
                                     b
                    �
               = − a2 + b2Rzz/Ruu x = Acl x(t)
•	 In this case, Gzu(s) = b/(s−a) so that N (s) = b and D(s) = (s−a),
   and the SRL is of the form:
                                          Rzz 2
                       (s − a)(−s − a) +      b =0
                                          Ruu
                                                               Symmetric root locus
0.8
0.6
0.4
                                     0.2
                   Imaginary Axis
−0.2
−0.4
−0.6
−0.8
                                     −1
                                      −2   −1.5    −1   −0.5             0            0.5   1   1.5   2
                                                                    Real Axis
•	 SRL is the same whether a < 0 (OL stable) or a > 0 (OL unstable)
     – But the CLP is always the one in the LHP
     – Explains result on 4–12 about why gain K �= 0 for OL unstable
       systems, even for expensive control problem (Ruu → ∞)
            (l + 1/2)π                     n−m
        ±              , l = 0, 1, . . . ,     −1 ,                                 (n-m) even
              n−m                           2
                                              n−m    Phase
                                               1        0
                                               2     ±π/4
                                               3    0, ±π/3
                                               4  ±π/8, ±3π/8
•	 Note: Plot the SRL using the 180o rules (normal) if n − m is even
   and the 0o rules if n − m is odd.
                                                                   (s−2)(s−4)
                                    Figure 6.2: G(s) =      (s−1)(s−3)(s2 +0.8s+4)s2
                                                     Symmetric root locus
                               8
                               2
                  Imag Axis
−2
−4
−6
                              −8
                               −6        −4     −2           0              2   4      6
                                                          Real Axis
                                         �     �	                                                                   �   �
                                z=p= 1 0 x            and                                                       z=v= 0 1 x
                                           SRL with Position Penalty
                                                                                                                          SRL with Velocity Penalty
                  4
                                                                                                         1.5
                                                                                                         0.5
                  1
Imaginary Axis
Imaginary Axis
0 0
                 −1
                                                                                                        −0.5
−2
                                                                                                         −1
                 −3
                 −4                                                                                     −1.5
                  −4    −3       −2   −1               0               1   2   3   4                       −3   −2   −1               0                1    2       3
                                                  Real Axis                                                                      Real Axis
Figure 6.3: SRL with position (left) and velocity penalties (right)
•	 Turns out that the news is even better than that, because LQR exhibits
   very good stability margins
     – Consider the LQR stability robustness.
                             � ∞
                      J =          zT z + ρuT u dt
                                    0
                              ẋ = Ax + Bu
                              z = Cz x,    Rxx = CzT Cz
                                                                       z
                                                              Cz           �
                        u
                    �        B          (sI − A)−1        �
                                                              K            �
                –
                                                     x
•	 Can develop a relationship between the open-loop cost C(s) and the
   closed-loop return difference I +L(s) called the Kalman Frequency
   Domain Equality
                                              1
              [I + L(−s)]T [I + L(s)] = 1 + C T (−s)C(s)
                                              ρ
• Sketch of Proof
     – Start with u = −Kx,	K = ρ1 B T P , where
                                             1
                      0 = −AT P − P A − Rxx + P BB T P
                                             ρ
     – Introduce Laplace variable s using ±sP
                                                     1
                0 = (−sI − AT )P + P (sI − A) − Rxx + P BB T P
                                                     ρ
     – Pre-multiply by B T (−sI − AT )−1, post-multiply by (sI − A)−1B
     – Complete the square . . .
                                               1
                [I + L(−s)]T [I + L(s)] = 1 + C T (−s)C(s)
                                               ρ
•	 Can handle the MIMO case, but look at the SISO case to develop
   further insights (s = jω)
    [I + L(−s)]T [I + L(s)]	 = (I + Lr (ω) − jLi(ω))(I + Lr (ω) + jLi(ω))
                             ≡ |1 + L(jω)|2
    and
                   C T (−jω)C(jω) = Cr2 + Ci2 = |C(jω)|2 ≥ 0
3 |LN(jω)|
2 |1+LN(jω)|
                             1
                Imag Part
                             0
                                                             (−1,0)
                            −1
−2
−3
                            −4
                             −7   −6   −5   −4    −3    −2   −1    0      1
                                              Real Part
•	 Great, but why is this so significant? Recall the SISO form of the
   Nyquist Stability Theorem:
    If the loop transfer function L(s) has P poles in the RHP s-plane (and
    lims→∞ L(s) is a constant), then for closed-loop stability, the locus
    of L(jω) for ω : (−∞, ∞) must encircle the critical point (-1,0) P
    times in the counterclockwise direction (Ogata528)
•	 So we can directly prove stability from the Nyquist plot of L(s).
   But what if the model is wrong and it turns out that the actual loop
   transfer function LA(s) is given by:
                            LA(s) = LN (s)[1 + Δ(s)],   |Δ(jω)| ≤ 1,     ∀ω
                                                                       |L|
                                    1
                                                     |1+L|
                       Imag Part
0 ω=0
−1
                                                         ω
                                   −2
                                   −3
                                    −2      −1   0       1         2       3   4
                                                     Real Part
•	 Claim is that “since the LTF L(jω) is guaranteed to be far from the
   critical point for all frequencies, then LQR is VERY robust.”
     – Can study this by introducing a modification to the system, where
       nominally β = 1, but we would like to consider:
      �	The gain β ∈ R
      �	The phase β ∈ ejφ
                                            K(sI − A)−1B               �   β       �
                              –         �
Figure 6.5: Example loop transfer functions for open-loop stable system.
Figure 6.6: Example loop transfer functions for open-loop unstable system.
•	 Similar statements hold for the MIMO case, but it requires singular
   value analysis tools.
June 18, 2008
Spr 2008                                                                    16.323 6–30
13 0.6602
14 0.3420
15 0.2897];
17 r=1e-2;
18 eig(a)
19 k=lqr(a,b,cz’*cz,r)
20 w=logspace(-2,2,200)’;w2=-w(length(w):-1:1);
21 ww=[w2;0;w];
22 G=freqresp(a,b,k,0,1,sqrt(-1)*ww);
   23
   24   p=plot(G);
25 tt=[0:.1:2*pi]’;Z=cos(tt)+sqrt(-1)*sin(tt);
26 hold on;plot(-1+Z,’r--’);plot(Z,’r:’,’LineWidth’,2);
27 plot(-1+1e-9*sqrt(-1),’x’)
32 hold off
33 set(p,’LineWidth’,2);
34 axis(’square’)
35 axis([-2 4 -3 3])
   36
   37   ylabel(’Imag Part’);xlabel(’Real Part’);title(’Stable OL’)
38 text(.25,-.5,’\infty’)
   40
   41   %%%%%%%%%%%%%%%%%%%%%%
   42
   43   a=diag([-.75 -.75 1 1])+diag([-2 0 -4],1)+diag([2 0 4],-1);
44 r=1e-1;
45 eig(a)
46 k=lqr(a,b,cz’*cz,r)
47 G=freqresp(a,b,k,0,1,sqrt(-1)*ww);
   48
   49   p=plot(G);
50 hold on;plot(-1+Z,’r--’);plot(Z,’r:’,’LineWidth’,2);
51 plot(-1+1e-9*sqrt(-1),’x’)
56 hold off
57 set(p,’LineWidth’,2)
58 axis(’square’)
59 axis([-3 3 -3 3])
   60
   61   ylabel(’Imag Part’);xlabel(’Real Part’);title(’Unstable OL’)