Lec 06
References and Slide Credits
• Slides from Deep Learning for Computer Vision, Prof. Yu-Chiang Frank Wang,
  National Taiwan University
• Slides from Machine Learning, Prof. Hung-Yi Lee, EE,
  National Taiwan University
• Slides from CE 5554 / ECE 4554: Computer Vision, Prof. J.-B.
  Huang, Virginia Tech
• http://cs231n.stanford.edu/syllabus.html
• Marc'Aurelio Ranzato, Tutorial in CVPR2014
• Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep
  Learning
   • https://www.deeplearningbook.org/
• Bishop, Pattern Recognition and Machine Learning
• Reference papers
Outline
• Introduction to neural networks
• Go deeper
• Introduction to convolutional neural networks (CNN)
• Modern CNN models
History of Neural Networks and Deep Learning [Prof. Hung-Yi Lee]
LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey, “Deep learning,” Nature, 2015.
How Powerful?
Object Recognition
Not deep-learning
Deep-learning based
Source:
https://devblogs.nvidia.com/parallelforall/mocha-jl-deep-learning-julia/
https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/
Biological neuron and Perceptrons
    Recap: Linear Classification
• Linear Classifier
  • Take the input image as a vector x and the linear classifier as a weight matrix W.
    With 10 classes, y = Wx + b is a 10-dimensional output vector holding one score per class.
  • For example, for an image with 2 x 2 pixels and 3 classes of interest,
    x has 4 entries, so we need to learn a 3 x 4 matrix W (plus a 3-dimensional bias b)
    such that y = Wx + b gives the desired class scores.
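The 2 x 2 pixel, 3-class example can be sketched in a few lines of plain Python. The pixel values, W, and b below are made-up numbers for illustration only, and `linear_scores` is a hypothetical helper name that does not come from the slides.

```python
def linear_scores(W, x, b):
    """Return y = Wx + b, where each row of W scores one class."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

# Hypothetical 2 x 2 image, flattened row by row into a 4-vector x.
image = [[56, 231],
         [24, 2]]
x = [pixel for row in image for pixel in row]

# Made-up parameters: 3 classes x 4 pixels, plus one bias per class.
W = [[0.2, -0.5, 0.1, 2.0],
     [1.5, 1.3, 2.1, 0.0],
     [0.0, 0.25, 0.2, -0.3]]
b = [1.1, 3.2, -1.2]

y = linear_scores(W, x, b)                          # one score per class
predicted = max(range(len(y)), key=lambda k: y[k])  # index of the top score
print("scores:", y)
print("predicted class index:", predicted)
```

The predicted class is simply the index of the largest score. Learning W and b so that the correct class wins is the training problem.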
Multi-Layer Perceptron: A Nonlinear Classifier (cont’d)
Layer 1 in MLP
Layer 2 in MLP
  Let’s Get a Closer Look…
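• A single neuron
  [Figure: a single neuron, with labels for the inputs to the neuron, the activity of the neuron, and the output of the neuron; the output is plotted from 0 to 1 against activity from −5 to 5.]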
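Input-Output Function of a Single Neuron
w = [0,1]
[Figure: surface plot of the neuron output x over (z1, z2), with the matching contour plot in the (z1, z2) plane; both input axes run from −5 to 5.]
$$x(z_1, z_2) = \frac{1}{1 + \exp(-w_1 z_1 - w_2 z_2)}$$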
[The same pair of plots is repeated on the following slides for w = [0.2,1], [0.3,0.9], [0.5,0.9], [0.6,0.8], [0.8,0.6], [0.9,0.5], [0.9,0.3], [1,0.2], and [1,0].]
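A minimal sketch of the input-output function plotted on these slides, evaluated for a few of the weight vectors shown. The probe inputs are chosen here for illustration only, and `neuron_output` is a hypothetical name.

```python
import math

def neuron_output(w, z):
    """Sigmoid neuron: x = 1 / (1 + exp(-(w1*z1 + w2*z2)))."""
    activity = sum(w_i * z_i for w_i, z_i in zip(w, z))
    return 1.0 / (1.0 + math.exp(-activity))

# Weight vectors taken from the slide sequence.
weights = [[0, 1], [0.5, 0.9], [0.8, 0.6], [1, 0]]

# Probe inputs (an assumption, not from the slides).
probes = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]

for w in weights:
    outputs = [round(neuron_output(w, z), 3) for z in probes]
    print("w =", w, "outputs at", probes, "->", outputs)
```

With w = [0,1] the output ignores z1 entirely, and with w = [1,0] it ignores z2. The intermediate weight vectors respond to both inputs.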
Input-Output Function of a Single Neuron (cont’d)
[Figure: the same surface and contour plots of x(z1, z2), repeated for w = [0,1], [0,2], [0,3], [0,4], and [0,5].]
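Input-Output Function of a Single Neuron (cont’d)
w = [0,1]
[Figure: surface plot of x over (z1, z2) and its contours in the (z1, z2) plane; both input axes run from −5 to 5.]
• The direction of w sets the direction of the boundary.
• The magnitude of w sets the steepness of the boundary.
$$x(z_1, z_2) = \frac{1}{1 + \exp(-w_1 z_1 - w_2 z_2)}$$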
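The link between w and the boundary can be checked numerically. The sketch below is an illustration, not part of the original slides: it measures the slope of the output across the boundary at the origin and confirms that the output stays at 0.5 when moving perpendicular to w. The helper names and the step size `h` are assumptions.

```python
import math

def neuron_output(w, z):
    """Sigmoid neuron: x = 1 / (1 + exp(-(w1*z1 + w2*z2)))."""
    return 1.0 / (1.0 + math.exp(-(w[0] * z[0] + w[1] * z[1])))

def slope_across_boundary(w, h=1e-5):
    """Central-difference slope of x at the origin along the unit vector w/|w|."""
    norm = math.hypot(w[0], w[1])
    u = (w[0] / norm, w[1] / norm)
    ahead = neuron_output(w, (h * u[0], h * u[1]))
    behind = neuron_output(w, (-h * u[0], -h * u[1]))
    return (ahead - behind) / (2 * h)

# Scaling w keeps the boundary in place but makes it steeper.
for scale in [1, 2, 3, 4, 5]:
    w = [0, scale]
    print("w =", w, "slope across boundary ~", round(slope_across_boundary(w), 4))

# Moving perpendicular to w stays on the boundary, where the output is 0.5.
w = [0.6, 0.8]
on_boundary = neuron_output(w, (-0.8 * 3.0, 0.6 * 3.0))
print("output along the boundary of w = [0.6, 0.8]:", round(on_boundary, 6))
```

For the sigmoid, the slope across the boundary works out to |w| / 4, so doubling the length of w doubles the steepness while the boundary line itself does not move.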
Weight Space of a Single Neuron
[Figure: input-output surfaces of the neuron over (z1, z2), arranged by their location in weight space, with w1 on the horizontal axis (ticks at −2, 0, 2, 4) and w2 on the vertical axis; labeled example: w = [2,2].]
Training a Single Neuron
[Figure: training data from two classes, shown together with the objective function.]
[Figure: training frames for w = [0,−1], [0.4,−0.7], [0.9,−0.2], [1.1,0.1], [1.4,0.4], [5.2,12.6], and [9.7,25.3]. Each frame shows the surface x over (z1, z2), a plot over the (z1, z2) plane with axes from −5 to 5, and the objective against iteration (0 to 20, and 0 to 50 in the last frame).]
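A minimal sketch of training a single sigmoid neuron by batch gradient descent. The objective function did not survive in the slide text, so the cross-entropy error used here is an assumption. The toy data, learning rate `eta`, and step counts are also made up for illustration. Only the starting point w = [0, −1] is taken from the first training frame.

```python
import math

def sigmoid(a):
    """Logistic function, written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def softplus(a):
    """log(1 + exp(a)), computed stably."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))

def activation(w, z):
    return w[0] * z[0] + w[1] * z[1]

def objective(w, data):
    """Assumed objective: cross-entropy summed over (z, t) pairs.

    With x = sigmoid(a), -[t ln x + (1 - t) ln(1 - x)] = softplus(a) - t * a.
    """
    return sum(softplus(activation(w, z)) - t * activation(w, z) for z, t in data)

def train(data, w, eta=0.02, steps=50):
    """Batch gradient descent; returns final weights and the objective history."""
    history = []
    for _ in range(steps):
        history.append(objective(w, data))
        g0 = g1 = 0.0
        for z, t in data:
            err = sigmoid(activation(w, z)) - t  # derivative w.r.t. the activity
            g0 += err * z[0]
            g1 += err * z[1]
        w = [w[0] - eta * g0, w[1] - eta * g1]
    return w, history

# Hypothetical, linearly separable toy data: ((z1, z2), target t in {0, 1}).
data = [((2, 3), 1), ((3, 2), 1), ((1, 4), 1), ((4, 4), 1),
        ((-2, -1), 0), ((-3, -2), 0), ((-1, -3), 0), ((-2, -4), 0)]

w_short, hist_short = train(data, [0.0, -1.0], steps=50)
w_long, hist_long = train(data, [0.0, -1.0], steps=500)
print("after  50 steps:", [round(v, 3) for v in w_short])
print("after 500 steps:", [round(v, 3) for v in w_long])
print("objective, first and last:", hist_long[0], hist_long[-1])
```

On separable data like this, the objective keeps falling while the weights keep growing. This is the same pattern as the jump from w = [1.4,0.4] to w = [9.7,25.3] in the frames above.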
Overfitting and Weight Decay
[Figure: training data for the two classes; two plots over the (z1, z2) plane, axes from −5 to 5; objective against iteration (0 to 50) for the original and the regularised objective function.]
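The effect of weight decay can be sketched by adding a penalty on the size of w to the training objective. The exact regulariser on the slide is not recoverable from the text, so the form G(w) + (alpha / 2) |w|^2 used below is an assumption. The cross-entropy error G, the toy data, and the values of `alpha` and `eta` are likewise made up for illustration.

```python
import math

def sigmoid(a):
    """Logistic function, written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def train(data, alpha, eta=0.02, steps=500):
    """Gradient descent on G(w) + (alpha / 2) * |w|^2; alpha = 0 means no decay."""
    w = [0.0, -1.0]
    for _ in range(steps):
        # Gradient of the penalty term (alpha / 2) * |w|^2 is alpha * w.
        g = [alpha * w[0], alpha * w[1]]
        for z, t in data:
            err = sigmoid(w[0] * z[0] + w[1] * z[1]) - t
            g[0] += err * z[0]
            g[1] += err * z[1]
        w = [w[0] - eta * g[0], w[1] - eta * g[1]]
    return w

# Hypothetical, linearly separable toy data: ((z1, z2), target t in {0, 1}).
data = [((2, 3), 1), ((3, 2), 1), ((1, 4), 1), ((4, 4), 1),
        ((-2, -1), 0), ((-3, -2), 0), ((-1, -3), 0), ((-2, -4), 0)]

w_original = train(data, alpha=0.0)
w_regularised = train(data, alpha=1.0)
print("original:    w =", [round(v, 3) for v in w_original],
      "|w| =", round(math.hypot(*w_original), 3))
print("regularised: w =", [round(v, 3) for v in w_regularised],
      "|w| =", round(math.hypot(*w_regularised), 3))
```

Both runs separate the toy data, but the regularised weights settle at a finite length while the original ones keep growing. The regularised neuron therefore has a softer, less steep boundary.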
Training a Single Neuron (cont’d)
[Figure: two panels with axes z1 and z2 (range −5 to 5); plot of objective vs. iteration (0 to 50) with curves labelled original and regularised]
                                                                                                             47
Training a Single Neuron (cont’d)
[Figure: two panels with axes z1 and z2 (range −5 to 5); plot of objective vs. iteration (0 to 50) with curves labelled original and regularised]
                                                                                                             48
Training a Single Neuron (cont’d)
[Figure: two panels with axes z1 and z2 (range −5 to 5); plot of objective vs. iteration (0 to 50) with curves labelled original and regularised]
                                                                                                            49
Training a Single Neuron (cont’d)
[Figure: two panels with axes z1 and z2 (range −5 to 5); plot of objective vs. iteration (0 to 50) with curves labelled original and regularised]
                                                                                                             50
Training a Single Neuron (cont’d)
[Figure: two panels with axes z1 and z2 (range −5 to 5); plot of objective vs. iteration (0 to 50) with curves labelled original and regularised]
                                                                                                            51
Single Hidden Layer Neural Networks
[Figure: network diagram with an inputs layer, a hidden layer, and an output]
                                              52
Sampling Random Neural Network Classifiers
[Figure: surface plot of x (0 to 0.8) over z1 and z2, and a second panel with axes z1 and z2 (range −5 to 5)]
                                                                          53
Training a Neural Network with a Single Hidden Layer
  objective function:
                                             likelihood same as before
                                                                  54
Training a Neural Network with a Single Hidden Layer
Networks with hidden layers can be fit by gradient descent, with the gradients
 computed by an algorithm called back-propagation.
    objective function:
                                                           likelihood same as before
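As a minimal sketch of this idea (not the lecture's own code), the NumPy example below fits a single-hidden-layer network by gradient descent, with the gradients obtained by back-propagation. The tanh hidden units, sigmoid output, negative log-likelihood objective, XOR-style toy data, layer width, and learning rate are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_one_hidden_layer(Z, y, hidden=8, lr=0.5, steps=2000, seed=0):
    """Fit p = sigmoid(w2 . tanh(Z W1 + b1) + b2) by plain gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    W1 = rng.normal(size=(d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(size=hidden)
    b2 = 0.0
    losses = []
    eps = 1e-12  # avoids log(0)
    for _ in range(steps):
        # Forward pass: inputs -> hidden layer -> output probability.
        H = np.tanh(Z @ W1 + b1)                  # (n, hidden)
        p = sigmoid(H @ w2 + b2)                  # (n,)
        nll = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        losses.append(nll)
        # Backward pass: apply the chain rule from the output back to the inputs.
        d_out = (p - y) / n                       # gradient w.r.t. output pre-activation
        g_w2 = H.T @ d_out
        g_b2 = d_out.sum()
        d_hid = np.outer(d_out, w2) * (1 - H ** 2)  # gradient w.r.t. hidden pre-activation
        g_W1 = Z.T @ d_hid
        g_b1 = d_hid.sum(axis=0)
        # Gradient descent update.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return (W1, b1, w2, b2), losses

# Toy data (assumed): XOR, which a single neuron cannot fit.
Z = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0, 1.0, 0.0])
params, losses = train_one_hidden_layer(Z, y)
print(f"objective: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The forward pass stores the hidden activations H so that the backward pass can reuse them, which is the main practical point of back-propagation.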
                                                                                55
Training a Neural Network with a Single Hidden Layer
[Figure: surface plot of x (0 to 0.8) over z1 and z2, and a second panel with axes z1 and z2 (range −5 to 5)]
                                                                      56
Training a Neural Network with a Single Hidden Layer
[Figure: surface plot of x (0 to 0.8) over z1 and z2, and a second panel with axes z1 and z2 (range −5 to 5)]
                                                                          57
Training a Neural Network with a Single Hidden Layer
[Figure: surface plot of x (0 to 0.8) over z1 and z2, and a second panel with axes z1 and z2 (range −5 to 5)]
                                                                          58
Training a Neural Network with a Single Hidden Layer
[Figure: surface plot of x (0 to 0.8) over z1 and z2, and a second panel with axes z1 and z2 (range −5 to 5)]
                                                                          59
Training a Neural Network with a Single Hidden Layer
[Figure: surface plot of x (0 to 0.8) over z1 and z2, and a second panel with axes z1 and z2 (range −5 to 5)]
                                                                          60
Training a Neural Network with a Single Hidden Layer
[Figure: surface plot of x (0 to 0.8) over z1 and z2, and a second panel with axes z1 and z2 (range −5 to 5)]
                                                                          61
Training a Neural Network with a Single Hidden Layer
[Figure: surface plot of x (0 to 0.8) over z1 and z2, and a second panel with axes z1 and z2 (range −5 to 5)]
                                                                          62
Hierarchical Models with Many Layers
[Figure: network diagram with an inputs layer, hidden layers, and an output]
                                       63
 Convolutional Neural Networks (CNN):
 Local Connectivity
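[Figure: input layer connected to a hidden layer. Left: every connection has its own weight (w1 to w9). Right: the weights w1, w2, w3 are reused across positions. Further panels: an input layer with a second channel (Channel 2), and an input layer with two filters (Filter 1, Filter 2).]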
                                                                71
              Ref: Marc'Aurelio Ranzato, Tutorial in CVPR2014
Generalized to 2D Cases:
                                                                72
              Ref: Marc'Aurelio Ranzato, Tutorial in CVPR2014
Generalized to 2D Cases:
                                                                73
              Ref: Marc'Aurelio Ranzato, Tutorial in CVPR2014
Convolutional Layer
Input Output
                               74
Convolutional Layer
Input Output
                               75
Convolutional Layer
Input Output
                               76
Convolutional Layer
Input Output
                               77
Convolutional Layer
Input Output
                               78
Convolutional Layer
Input Output
                               79
Convolutional Layer
Input Output
                               80
                                                  81
Ref: Marc'Aurelio Ranzato, Tutorial in CVPR2014
     Putting them together → CNN
     • Local connectivity
     • Weight sharing
     • Handling multiple input channels
     • Handling multiple output maps
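The four ingredients listed above can be sketched as a naive convolutional layer in NumPy. This is an illustrative sketch rather than an efficient or official implementation; the function name conv2d_forward, the array layouts, and the sizes (3 input channels, 6 filters of 5 x 5) are assumptions. Like most deep learning libraries, it computes a cross-correlation (no filter flip).

```python
import numpy as np

def conv2d_forward(x, filters, bias):
    """Naive convolutional layer, no padding, stride 1.

    x:       (C_in, H, W)         input volume (multiple input channels)
    filters: (C_out, C_in, k, k)  one k x k x C_in filter per output map
    bias:    (C_out,)
    """
    c_in, h, w = x.shape
    c_out, c_in_f, k, _ = filters.shape
    assert c_in == c_in_f, "filter depth must match the input channels"
    out = np.zeros((c_out, h - k + 1, w - k + 1))
    for m in range(c_out):                      # multiple output maps
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                patch = x[:, i:i + k, j:j + k]  # local connectivity: a k x k window
                # Weight sharing: the same filters[m] is used at every (i, j).
                out[m, i, j] = np.sum(patch * filters[m]) + bias[m]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 32, 32))
filters = rng.normal(size=(6, 3, 5, 5))
y = conv2d_forward(x, filters, np.zeros(6))
print(y.shape)  # (6, 28, 28)
```

Each of the 6 filters holds only 5 x 5 x 3 = 75 weights, regardless of the image size.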
[Figure: diagrams labelled "Local connectivity" and "Weight sharing"]
                           86
Putting them together (cont’d)
• The brain/neuron view of CONV layer
                                        90
Putting them together (cont’d)
• The brain/neuron view of CONV layer
                                        91
Putting them together (cont’d)
• The brain/neuron view of CONV layer
                                        92
Putting them together (cont’d)
• A 32 x 32 input image convolved repeatedly with 5 x 5 filters (5 x 5 x 3 at the first
  layer; no padding, stride 1) shrinks the volume spatially at every layer (32 -> 28 -> 24 -> …).
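The shrinking sequence follows from the usual output-size relation, output = (N + 2P − F) / S + 1, for input size N, filter size F, padding P, and stride S. A small sketch, with a hypothetical helper name:

```python
def conv_output_size(n, f, padding=0, stride=1):
    """Spatial output size of a convolution: (n + 2*padding - f) // stride + 1."""
    return (n + 2 * padding - f) // stride + 1

sizes = [32]
for _ in range(3):
    sizes.append(conv_output_size(sizes[-1], 5))  # 5 x 5 filter, no padding, stride 1
print(sizes)  # [32, 28, 24, 20]
```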
                                                                        93
    Variations of Convolution
•   Zero Padding
    •   With a border of (F − 1)/2 zeros around the input (F x F filter, stride 1), the output is the same size as the input, so it does not shrink as the network gets deeper.
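A short sketch of zero padding, assuming a single-channel image and a 5 x 5 averaging kernel chosen only for illustration. With a pad of 2 the output stays 32 x 32; without padding it shrinks to 28 x 28.

```python
import numpy as np

def conv2d_single(x, kernel, pad=0):
    """Single-channel convolution (stride 1) with optional zero padding."""
    x = np.pad(x, pad, mode="constant", constant_values=0)  # zeros on every border
    k = kernel.shape[0]
    h, w = x.shape
    out = np.empty((h - k + 1, w - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * kernel)
    return out

image = np.ones((32, 32))
kernel = np.ones((5, 5)) / 25.0
valid = conv2d_single(image, kernel)         # no padding
same = conv2d_single(image, kernel, pad=2)   # pad = (5 - 1) / 2
print(valid.shape, same.shape)  # (28, 28) (32, 32)
```

Near the border the padded zeros enter the window, so the corner value of `same` is 9/25 rather than 1.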
                                                                             94
    Variations of Convolution
•   Stride
    •   Step size across signals
                                   95
    Variations of Convolution
•   Stride
    •   Step size across signals
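As a small illustration of stride, the sketch below slides a 3-tap kernel over a 1D signal with step sizes 1 and 2. The signal, the kernel, and the helper name are assumptions for the example.

```python
import numpy as np

def conv1d_strided(signal, kernel, stride=1):
    """1D convolution without padding; the window moves by `stride` samples."""
    k = len(kernel)
    starts = range(0, len(signal) - k + 1, stride)
    return np.array([np.dot(signal[s:s + k], kernel) for s in starts])

signal = np.arange(10, dtype=float)      # 0, 1, ..., 9
kernel = np.array([1.0, 0.0, -1.0])
out_s1 = conv1d_strided(signal, kernel, stride=1)
out_s2 = conv1d_strided(signal, kernel, stride=2)
print(len(out_s1), len(out_s2))  # 8 4
```

Doubling the stride roughly halves the number of outputs, so stride also acts as a form of downsampling.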
                                   96
Nonlinearity Layer in CNN
                            99
    Nonlinearity Layer
•   E.g., ReLU (Rectified Linear Unit)
    •   Pixel by pixel computation of max(0, x)
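In NumPy this element-wise operation is a single call. The feature map values below are made up for the example.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x); the output has the same shape as the input."""
    return np.maximum(0, x)

feature_map = np.array([[-1.5, 2.0],
                        [0.0, -0.3]])
activated = relu(feature_map)
print(activated)  # negatives become 0, positives pass through unchanged
```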
                                                  100
    Receptive Field
• For convolution with kernel size n x n,
  each entry in the output layer depends on an n x n receptive field in the input layer.
• Thus, for large images, we need many layers for each entry in output to “see” the entire input image.
  Possible solution → downsample the image/feature map (see pooling layer next)
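The growth of the receptive field can be sketched with a standard recurrence: each layer with kernel size k adds (k − 1) times the current step between neighbouring outputs (measured in input pixels), and each stride multiplies that step. The layer configurations below are assumptions chosen for illustration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output entry.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf, jump = 1, 1          # jump = distance in input pixels between adjacent outputs
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

conv_only = [(3, 1)] * 3                 # three 3 x 3 convolutions
with_pooling = [(3, 1), (2, 2)] * 3      # each followed by 2 x 2 pooling, stride 2
print(receptive_field(conv_only))      # 7
print(receptive_field(with_pooling))   # 22
```

With stride-1 convolutions alone the receptive field grows linearly with depth, while interleaved downsampling makes it grow much faster.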
                       104
    Pooling Layer
•   Makes the representations smaller and more manageable
•   Operates over each activation map independently
•   E.g., Max Pooling
                                                            105
    Pooling Layer for 2D Cases
•   Reduces the spatial size and provides spatial invariance
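A minimal max pooling sketch, assuming 2 x 2 windows with stride 2 and input sizes divisible by the window size. It pools each activation map independently, and the input values are made up for the example.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Max pooling over non-overlapping size x size windows.

    x: (C, H, W) with H and W divisible by `size`; each map is pooled on its own.
    """
    c, h, w = x.shape
    blocks = x.reshape(c, h // size, size, w // size, size)
    return blocks.max(axis=(2, 4))

x = np.array([[[1, 1, 2, 4],
               [5, 6, 7, 8],
               [3, 2, 1, 0],
               [1, 2, 3, 4]]], dtype=float)
pooled = max_pool2d(x)
print(pooled.shape)  # (1, 2, 2)
print(pooled[0])     # rows [6, 8] and [3, 4]
```

Moving the value 6 to any other position inside its 2 x 2 window leaves the output unchanged, which is the small-shift invariance mentioned above.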
                                                               106
Fully Connected (FC) Layer in CNN
                                    109
    FC Layer
•   Contains neurons that connect to the entire input volume,
    as in ordinary neural networks
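A fully connected layer can be sketched as flattening the input volume and applying one weight per input entry per output neuron. The volume size (16 maps of 5 x 5) and the 10 outputs are hypothetical values for illustration.

```python
import numpy as np

def fc_forward(volume, weights, bias):
    """Fully connected layer: every output neuron sees the whole input volume.

    volume:  (C, H, W) activation volume
    weights: (n_out, C * H * W)
    bias:    (n_out,)
    """
    return weights @ volume.reshape(-1) + bias

rng = np.random.default_rng(0)
volume = rng.normal(size=(16, 5, 5))
weights = rng.normal(size=(10, 16 * 5 * 5)) * 0.01
bias = np.zeros(10)
scores = fc_forward(volume, weights, bias)
print(scores.shape, weights.size)  # (10,) 4000
```

Unlike a convolutional layer, the number of weights here scales with the full size of the input volume.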
                                                                110
    FC Layer
•   Contains neurons that connect to the entire input volume,
    as in ordinary neural networks
                                                                111
CNN
      112
    LeNet
•   Presented by Yann LeCun during the 1990s for reading digits
•   Has the elements of modern architectures
                                                                  113
LeNet [LeCun et al. 1998]
                                                115
    AlexNet [Krizhevsky et al., 2012]
•    Repopularized CNN
     by winning the ImageNet Challenge 2012
•    7 hidden layers, 650,000 neurons,
     60M parameters
•    Top-5 error rate of 16% vs. 26% for 2nd place.
                                                                                                                 116
              Krizhevsky et al. “ImageNet classification with deep convolutional neural networks,” NIPS, 2012.
 AlexNet
• Parameters
   • Convolution: 1.89M parameters = 7.56MB
   • Fully connected: 58.62M parameters = 234.49MB
• Computation
   • Convolution: 591M floating-point MACs (multiply-accumulates)
   • Fully connected: 58.62M floating-point MACs
   • Full-HD 30fps: 805 GFLOPS (no overlap)
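The figures on this slide can be reproduced with simple arithmetic under a few assumptions that the slide does not state: the fully connected layers are taken as 6*6*256 -> 4096 -> 4096 -> 1000 with weights only (no biases), each parameter is a 4-byte float, and the Full-HD number counts one operation per MAC while tiling a 1920 x 1080 frame with non-overlapping 224 x 224 crops. This is one reading that matches the numbers, not necessarily how they were derived.

```python
# Fully connected layer shapes of AlexNet (assumed): (inputs, outputs) per layer.
fc_shapes = [(6 * 6 * 256, 4096), (4096, 4096), (4096, 1000)]
fc_weights = sum(n_in * n_out for n_in, n_out in fc_shapes)
fc_megabytes = fc_weights * 4 / 1e6       # 4 bytes per 32-bit float
print(fc_weights)      # 58621952, about 58.62M
print(fc_megabytes)    # about 234.49 MB

# One MAC per weight for fully connected layers; conv MACs taken from the slide.
conv_macs = 591e6
fc_macs = 58.62e6
tiles_per_frame = (1920 * 1080) / (224 * 224)   # non-overlapping crops (assumed)
ops_per_second = (conv_macs + fc_macs) * tiles_per_frame * 30
print(ops_per_second / 1e9)   # roughly 805
```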
                                                                                                            117
         Krizhevsky et al. “ImageNet classification with deep convolutional neural networks,” NIPS, 2012.
Deep or Not?
• Depth of the network is critical for performance.
                                                      118
 Ultra Deep Network
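[Figure: depth and ImageNet error of successive models, with Taipei 101 shown next to them]
• AlexNet (2012): 8 layers, 16.4% error
• VGG (2014): 19 layers, 7.3% error
• GoogLeNet (2014): 22 layers, 6.7% error
• Residual Net (2015)
Source: http://cs231n.stanford.edu/slides/winter1516_lecture8.pdf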
     VGG (2014)
   • Parameters:
          • Convolution: ~14M, 56MB
          • Fully connected: ~124M, 496MB
   • Computation:
          • Convolution: 15.52G floating-point MACs
          • Fully connected: 123.63M floating-point MACs
          • Full-HD 30fps: 19.3 TFLOPS (no overlap)
                                                                                                                          125
Simonyan and Zisserman, “Very Deep Convolutional Networks for Large-scale Image Recognition,” arXiv:1409.1556v6, Sept. 2014
ResNet (2016)
• Can we simply increase the number of layers?
                                           128
ResNeXt (2017)
• Deeper and wider → better… what else?
    • Increase cardinality
                                                                                                    132
 Xie, Saining, et al. "Aggregated residual transformations for deep neural networks." CVPR, 2017.
Squeeze-and-Excitation Net (SENet)
• How to improve accuracy without much overhead?
   • Feature recalibration (channel attention)
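Feature recalibration can be sketched as follows, based on the squeeze-and-excitation design in the cited paper: global average pooling per channel (squeeze), a small two-layer bottleneck with ReLU and sigmoid (excitation), then a per-channel rescaling. The channel count, reduction ratio r, and random weights are assumptions for illustration.

```python
import numpy as np

def se_block(x, w_reduce, w_expand):
    """Squeeze-and-excitation style channel recalibration.

    x:        (C, H, W) feature maps
    w_reduce: (C // r, C) bottleneck weights
    w_expand: (C, C // r) weights mapping back to C channels
    """
    squeeze = x.mean(axis=(1, 2))                       # (C,) one summary per channel
    hidden = np.maximum(0, w_reduce @ squeeze)          # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))  # sigmoid gate in (0, 1)
    return x * scale[:, None, None]                     # rescale each channel

rng = np.random.default_rng(0)
channels, r = 8, 4
x = rng.normal(size=(channels, 6, 6))
w_reduce = rng.normal(size=(channels // r, channels))
w_expand = rng.normal(size=(channels, channels // r))
y = se_block(x, w_reduce, w_expand)
print(y.shape)  # (8, 6, 6)
```

The added parameters amount to 2 * C * C / r per block, which is small compared with the convolutions it is attached to.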
                                                                                           133
          Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." CVPR, 2018.
Various Deep Learning Models…
                                                                                                                  131
Ref: Bianco et al., "Benchmark Analysis of Representative Deep Neural Network Architectures," arXiv:1810.00736.