g Limitations of LDA
g Variants of LDA
     n   Of all the possible lines, we would like to select the one that maximizes the
         separability of the projected scalars
            g    This is illustrated for the two-dimensional case in the following figures
                 [Figure: two candidate projection directions for 2D data (axes x1, x2);
                  one direction separates the projected classes much better than the other]
         Introduction to Pattern Analysis
         Ricardo Gutierrez-Osuna
         Texas A&M University
Linear Discriminant Analysis, two-classes (2)
g   In order to find a good projection vector, we need to define a measure
    of separation between the projections
         g   The mean vector of each class in x-space and y-space is

                 \mu_i = \frac{1}{N_i} \sum_{x \in \omega_i} x
                 \quad\text{and}\quad
                 \tilde{\mu}_i = \frac{1}{N_i} \sum_{y \in \omega_i} y
                               = \frac{1}{N_i} \sum_{x \in \omega_i} w^T x
                               = w^T \mu_i
         g   We could then choose the distance between the projected means as our objective
             function
                 J(w) = |\tilde{\mu}_1 - \tilde{\mu}_2| = |w^T (\mu_1 - \mu_2)|
         g   However, the distance between the projected means is not a very good measure since it
             does not take into account the standard deviation within the classes
                 [Figure: two classes in 2D (axes x1, x2) with means \mu_1 and \mu_2;
                  the axis joining the means is compared with an axis that yields
                  better class separability despite a smaller distance between the
                  projected means]
         g   Fisher's solution is to normalize the difference between the projected
             means by a measure of the within-class scatter; for each class we define
             the scatter matrix

                 S_i = \sum_{x \in \omega_i} (x - \mu_i)(x - \mu_i)^T
                 \qquad
                 S_1 + S_2 = S_W

         g   where S_W is called the within-class scatter matrix
  n   The scatter of the projection y can then be expressed as a function of the scatter matrix in
      feature space x
                 \tilde{s}_i^2 = \sum_{y \in \omega_i} (y - \tilde{\mu}_i)^2
                               = \sum_{x \in \omega_i} (w^T x - w^T \mu_i)^2
                               = \sum_{x \in \omega_i} w^T (x - \mu_i)(x - \mu_i)^T w
                               = w^T S_i w

                 \tilde{s}_1^2 + \tilde{s}_2^2 = w^T S_W w
  n   Similarly, the difference between the projected means can be expressed in terms of the means
      in the original feature space
                 (\tilde{\mu}_1 - \tilde{\mu}_2)^2 = (w^T \mu_1 - w^T \mu_2)^2
                     = w^T (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T w = w^T S_B w
         g   The matrix SB is called the between-class scatter. Note that, since SB is the outer product of two vectors,
             its rank is at most one
  n   We can finally express the Fisher criterion in terms of SW and SB as
                 J(w) = \frac{w^T S_B w}{w^T S_W w}
Linear Discriminant Analysis, two-classes (5)
  n   To find the maximum of J(w) we differentiate and equate to zero
                 \frac{d}{dw}\left[J(w)\right]
                     = \frac{d}{dw}\left[\frac{w^T S_B w}{w^T S_W w}\right] = 0 \Rightarrow

                 \Rightarrow [w^T S_W w]\,\frac{d[w^T S_B w]}{dw}
                           - [w^T S_B w]\,\frac{d[w^T S_W w]}{dw} = 0 \Rightarrow

                 \Rightarrow [w^T S_W w]\, 2 S_B w - [w^T S_B w]\, 2 S_W w = 0
  n   Dividing by wTSWw
                 \frac{[w^T S_W w]}{[w^T S_W w]}\, S_B w
                     - \frac{[w^T S_B w]}{[w^T S_W w]}\, S_W w = 0 \Rightarrow

                 \Rightarrow S_B w - J\, S_W w = 0 \Rightarrow

                 \Rightarrow S_W^{-1} S_B w - J\, w = 0
                 w^* = \arg\max_w \left[\frac{w^T S_B w}{w^T S_W w}\right]
                     = S_W^{-1} (\mu_1 - \mu_2)
           g    This is known as Fisher's Linear Discriminant (1936), although it is not a discriminant but rather a
                specific choice of direction for the projection of the data down to one dimension
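The closed-form solution w* = S_W^{-1}(\mu_1 - \mu_2) can be sketched in a few lines of NumPy. The two-class data below are made up for illustration (they are not the numerical example from these slides):

```python
import numpy as np

# Hypothetical two-class data; each row is a sample x = (x1, x2).
X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]])
X2 = np.array([[7.0, 8.0], [8.0, 7.0], [9.0, 9.0], [8.0, 9.0]])

mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)

# Within-class scatter S_W = S_1 + S_2, with S_i = sum (x - mu_i)(x - mu_i)^T
S1 = (X1 - mu1).T @ (X1 - mu1)
S2 = (X2 - mu2).T @ (X2 - mu2)
SW = S1 + S2

# Fisher's solution: w* = S_W^{-1} (mu_1 - mu_2)
w = np.linalg.solve(SW, mu1 - mu2)

# Projected means and scatters (all scalars)
m1t, m2t = w @ mu1, w @ mu2
s1t = ((X1 @ w - m1t) ** 2).sum()
s2t = ((X2 @ w - m2t) ** 2).sum()
J = (m1t - m2t) ** 2 / (s1t + s2t)   # Fisher criterion J(w*)
print(w, J)
```

By the identities above, (m1t - m2t)^2 equals w^T S_B w and s1t + s2t equals w^T S_W w, so J can equivalently be evaluated from the scatter matrices.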
       n   X2 = (x1, x2) = {(9,10), (6,8), (9,5), (8,7), (10,8)}
g   SOLUTION (by hand)
                                 where   \mu = \frac{1}{N} \sum_{\forall x} x
                                             = \frac{1}{N} \sum_{i=1}^{C} N_i \mu_i
               g    where S_T = S_B + S_W is called the total scatter matrix
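The decomposition S_T = S_B + S_W can be checked numerically; the three-class data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 3-class data in 2D; each row is a sample.
classes = [rng.normal(loc=m, scale=1.0, size=(20, 2))
           for m in ([0, 0], [4, 1], [2, 5])]

X = np.vstack(classes)
mu = X.mean(axis=0)                      # overall mean

SW = np.zeros((2, 2))
SB = np.zeros((2, 2))
for Xi in classes:
    mui = Xi.mean(axis=0)
    SW += (Xi - mui).T @ (Xi - mui)                 # within-class scatter
    SB += len(Xi) * np.outer(mui - mu, mui - mu)    # between-class scatter

ST = (X - mu).T @ (X - mu)               # total scatter
print(np.allclose(ST, SB + SW))          # True: S_T = S_B + S_W
```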
                 \tilde{\mu} = \frac{1}{N} \sum_{\forall y} y
                 \qquad
                 \tilde{S}_B = \sum_{i=1}^{C} N_i (\tilde{\mu}_i - \tilde{\mu})(\tilde{\mu}_i - \tilde{\mu})^T

                 \tilde{S}_W = W^T S_W W
                 \qquad
                 \tilde{S}_B = W^T S_B W
  n   Recall that we are looking for a projection that maximizes the ratio of between-class to
      within-class scatter. Since the projection is no longer a scalar (it has C-1 dimensions), we
      then use the determinant of the scatter matrices to obtain a scalar objective function:
                 J(W) = \frac{|\tilde{S}_B|}{|\tilde{S}_W|}
                      = \frac{|W^T S_B W|}{|W^T S_W W|}
n And we will seek the projection matrix W* that maximizes this ratio
                 W^* = [\, w_1^* \mid w_2^* \mid \cdots \mid w_{C-1}^* \,]
                     = \arg\max_W \left[\frac{|W^T S_B W|}{|W^T S_W W|}\right]
                 \;\Rightarrow\; (S_B - \lambda_i S_W)\, w_i^* = 0
g   NOTES
     n   SB is the sum of C matrices of rank one or less and the mean vectors are constrained by
                 \frac{1}{C} \sum_{i=1}^{C} \mu_i = \mu
             g    Therefore, SB will be of rank (C-1) or less
             g    This means that only (C-1) of the eigenvalues λi will be non-zero
     n   The projections with maximum class separability information are the eigenvectors
         corresponding to the largest eigenvalues of SW-1SB
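A minimal NumPy sketch of the C-class case on hypothetical data: it forms S_W^{-1} S_B and keeps the eigenvectors of its C - 1 non-zero eigenvalues. (In practice a generalized symmetric eigensolver such as scipy.linalg.eigh(SB, SW) is numerically preferable.)

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical C = 3 classes in a 4-dimensional feature space.
classes = [rng.normal(loc=m, size=(30, 4))
           for m in ([0, 0, 0, 0], [3, 1, 0, 0], [0, 3, 1, 0])]
X = np.vstack(classes)
mu = X.mean(axis=0)

d = X.shape[1]
SW = np.zeros((d, d))
SB = np.zeros((d, d))
for Xi in classes:
    mui = Xi.mean(axis=0)
    SW += (Xi - mui).T @ (Xi - mui)                 # within-class scatter
    SB += len(Xi) * np.outer(mui - mu, mui - mu)    # between-class scatter

# Eigenvectors of S_W^{-1} S_B; at most C - 1 = 2 non-zero eigenvalues,
# since S_B is a sum of C rank-one matrices with constrained means.
evals, evecs = np.linalg.eig(np.linalg.solve(SW, SB))
order = np.argsort(evals.real)[::-1]
W = evecs.real[:, order[:2]]             # projection matrix, d x (C-1)

Y = X @ W                                # projected data, N x (C-1)
print(np.round(np.sort(evals.real)[::-1], 3))
```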
     n   LDA can be derived as the Maximum Likelihood method for the case of normal class-
         conditional densities with equal covariance matrices
               n      the response of the gas sensor array was processed in
                      order to obtain a 60-dimensional feature vector
g   Results
               n      From the 3D scatter plots it is clear that LDA
                      outperforms PCA in terms of class discrimination
               n      This is one example where the discriminatory
                      information is not aligned with the direction of
                      maximum variance
                      [Figure: sensor response vs. time for the five coffee
                       classes: Sulawesi, Kenya, Arabian, Sumatra, Colombia]
                 [Figure: 3D scatter plots of the data projected onto the first
                  three PCA axes (left) and the first three LDA axes (right);
                  the five classes are labeled 1-5. The LDA projection shows
                  well-separated class clusters, whereas the PCA projection
                  shows heavy class overlap]
                 [Figure: class configurations of \omega_1 and \omega_2 where
                  \mu_1 = \mu_2 = \mu, so the class means carry no
                  discriminatory information]
g   LDA will fail when the discriminatory information is not in the mean but rather
    in the variance of the data
                 [Figure: 2D data (axes x1, x2) with the PCA and LDA projection
                  directions overlaid]
                 [Figure: an "uninteresting" and an "interesting" one-dimensional
                  projection of the same 2D data (axes x1, x2)]
                   n   Sammon's mapping seeks a low-dimensional representation that
                       preserves the pairwise distances of the original space by
                       minimizing the stress measure

                           E(d, d') = \sum_{i \neq j}
                               \frac{[d(P_i, P_j) - d(P_i', P_j')]^2}{d(P_i, P_j)}
                  n   The original method did not obtain an explicit mapping but only a lookup table for the elements in
                      the training set
                  n   Recent implementations using artificial neural networks (MLPs and RBFs) do provide an explicit
                      mapping for test data and also consider cost functions (Neuroscale)
                  n   Sammon’s mapping is closely related to Multi-Dimensional Scaling (MDS), a family of multivariate
                      statistical methods commonly used in the social sciences
                 [Figure: points P_j in the original space (axes x1, x2, x3) and
                  their images P'_j in the projected space]
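The stress measure can be sketched directly in NumPy. This is a toy check on synthetic points (not Sammon's original iterative minimization): a rigid rotation preserves all pairwise distances and gives zero stress, while collapsing the data onto one coordinate does not.

```python
import numpy as np

def pairwise_dists(Z):
    """All pairwise Euclidean distances d(P_i, P_j) for i < j."""
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    iu = np.triu_indices(len(Z), k=1)
    return D[iu]

def sammon_stress(X, Y):
    """E(d, d') = sum [d_ij - d'_ij]^2 / d_ij over i < j.
    (The slides sum over i != j, which only doubles the value,
    since d is symmetric.)"""
    d, dp = pairwise_dists(X), pairwise_dists(Y)
    return float(np.sum((d - dp) ** 2 / d))

rng = np.random.default_rng(2)
X = rng.normal(size=(15, 3))             # hypothetical points in 3D

# A rigid rotation preserves all distances, so the stress is ~0 ...
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
print(sammon_stress(X, X @ Q))

# ... while collapsing onto the first coordinate distorts them.
Y = np.zeros_like(X)
Y[:, 0] = X[:, 0]
print(sammon_stress(X, Y))
```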