ECE5602: Neural Networks
(Deep Learning)
               Taesup Moon
                     Lecture 2
M.IN.D Lab     ECE 5602: Neural Networks
                                Outline
• Review
  – Probability
    (http://web.stanford.edu/class/cs224n/readings/cs229-prob.pdf)
  – Linear algebra
    (http://web.stanford.edu/class/cs224n/readings/cs229-linalg.pdf)
  – Convex optimization
    (http://web.stanford.edu/class/cs224n/readings/cs229-cvxopt.pdf)
  – Information theory
                      Probability
• Axioms of probability
• Joint probability
   – Sum rule
   – Product rule
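A minimal sketch of the two rules, assuming a made-up joint distribution over two binary variables (the table values and function names are illustrative only). The sum rule gives p(x) = sum over y of p(x, y); the product rule gives p(x, y) = p(y | x) p(x).

```python
# Hypothetical joint distribution p(x, y) over two binary variables.
joint = {
    (0, 0): 0.1, (0, 1): 0.3,
    (1, 0): 0.2, (1, 1): 0.4,
}

def marginal_x(joint, x):
    """Sum rule: p(x) = sum over y of p(x, y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(joint, y, x):
    """Conditional: p(y | x) = p(x, y) / p(x)."""
    return joint[(x, y)] / marginal_x(joint, x)

p_x0 = marginal_x(joint, 0)
p_y1_given_x0 = conditional_y_given_x(joint, 1, 0)

# Product rule: p(x, y) = p(y | x) * p(x) recovers the joint entry.
reconstructed = p_y1_given_x0 * p_x0
```

Multiplying the conditional by the marginal should give back the original table entry p(x=0, y=1).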
                      Probability
• Conditional probability
• Bayes’ rule
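Bayes' rule, p(h | d) = p(d | h) p(h) / p(d), can be illustrated with a small sketch. The diagnostic-test numbers below (prior, sensitivity, false positive rate) are assumptions chosen for illustration, not values from the lecture.

```python
def bayes_posterior(prior, likelihood, likelihood_given_not):
    """Return p(h | d) for a binary hypothesis h and observed data d.

    prior                : p(h)
    likelihood           : p(d | h)
    likelihood_given_not : p(d | not h)
    """
    # Evidence p(d) via the sum rule over h and not-h.
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: 1% prior, 95% sensitivity, 5% false positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, likelihood_given_not=0.05)
```

With a small prior, the posterior stays well below the test's sensitivity, because most positive results come from the much larger group without the condition.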
                     Probability
• Random variables
  – Discrete
  – Continuous
                    Probability
• Independence
• Conditional independence
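Two discrete variables are independent when p(x, y) = p(x) p(y) for every pair of values. The sketch below checks this on two made-up joint tables (names and numbers are illustrative). Conditional independence could be checked the same way on each slice p(x, y | z).

```python
from itertools import product

def marginals(joint):
    """Compute p(x) and p(y) from a joint table {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py

def is_independent(joint, tol=1e-9):
    """True if p(x, y) == p(x) * p(y) for all (x, y), up to tol."""
    px, py = marginals(joint)
    return all(
        abs(joint.get((x, y), 0.0) - px[x] * py[y]) <= tol
        for x, y in product(px, py)
    )

# Hypothetical tables: the first factorizes, the second does not.
independent_joint = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}
dependent_joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

independent_result = is_independent(independent_joint)
dependent_result = is_independent(dependent_joint)
```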
                    Probability
• Mean & variance
• Covariance
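As a small worked example, the mean E[X], the variance Var[X] = E[(X - E[X])^2], and the covariance Cov[X, Y] = E[(X - E[X])(Y - E[Y])] can be computed from a joint table. The table below is a made-up example and the helper name `expectation` is an assumption.

```python
# Hypothetical joint distribution p(x, y) over two binary variables.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def expectation(joint, g):
    """E[g(X, Y)] = sum over (x, y) of p(x, y) * g(x, y)."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

mean_x = expectation(joint, lambda x, y: x)
mean_y = expectation(joint, lambda x, y: y)

# Variance: expected squared deviation from the mean.
var_x = expectation(joint, lambda x, y: (x - mean_x) ** 2)

# Covariance: expected product of the two deviations.
cov_xy = expectation(joint, lambda x, y: (x - mean_x) * (y - mean_y))
```

The covariance here is slightly negative, so X and Y are (weakly) negatively correlated in this table.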
                 Information theory
• Entropy
  – Measure of uncertainty
  – Lower limit of data compression
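A minimal sketch of the entropy H(p) = -sum over x of p(x) log2 p(x), measured in bits. The example distributions are assumptions chosen for illustration; terms with p(x) = 0 are skipped, following the usual convention 0 log 0 = 0.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

fair_coin = entropy([0.5, 0.5])      # maximal uncertainty for two outcomes
biased_coin = entropy([0.9, 0.1])    # less uncertain, so lower entropy
certain = entropy([1.0, 0.0])        # no uncertainty at all
```

The fair coin needs one bit per outcome on average, while the biased coin can in principle be compressed below one bit per outcome.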
                  Information theory
• Relative entropy
   – Also known as Kullback-Leibler (KL) divergence
   – Often used as a distance between two distributions
     → Strictly speaking, it is not a metric: it is not symmetric and does not satisfy the triangle inequality.
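A short sketch of the relative entropy D(p || q) = sum over x of p(x) log2(p(x) / q(x)), using two made-up distributions to show that swapping the arguments changes the value. The function name and the handling of q(x) = 0 (returning infinity) are assumptions of this sketch.

```python
import math

def kl_divergence(p, q):
    """D(p || q) in bits for two discrete distributions on the same support."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue                 # convention: 0 * log(0 / q) = 0
        if qi == 0.0:
            return math.inf          # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total

p = [0.5, 0.5]
q = [0.9, 0.1]

d_pq = kl_divergence(p, q)
d_qp = kl_divergence(q, p)
d_pp = kl_divergence(p, p)
```

Here `d_pq` and `d_qp` differ, while `d_pp` is zero, which is why the KL divergence behaves like a distance only in a loose sense.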
                    Linear algebra
• Matrix, vector
• Norms
• Eigenvalues / eigenvectors
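A minimal standard-library sketch of common vector norms, plus power iteration as one simple way to approximate the largest eigenvalue and its eigenvector. Power iteration is a technique chosen here for illustration, not necessarily the one used in the lecture; the matrix and the starting vector are assumptions.

```python
import math

def l1_norm(v):
    return sum(abs(x) for x in v)

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def linf_norm(v):
    return max(abs(x) for x in v)

def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def power_iteration(A, num_iters=100):
    """Approximate the dominant eigenpair (lambda, v) with A v = lambda v."""
    v = [1.0] + [0.0] * (len(A) - 1)     # assumed starting vector
    for _ in range(num_iters):
        w = matvec(A, v)
        norm = l2_norm(w)
        v = [x / norm for x in w]        # renormalize to unit length
    # Rayleigh quotient v^T A v (v has unit length).
    lam = sum(x * y for x, y in zip(v, matvec(A, v)))
    return lam, v

A = [[2.0, 1.0], [1.0, 2.0]]             # symmetric, eigenvalues 3 and 1
lam, v = power_iteration(A)
norms = (l1_norm([3.0, -4.0]), l2_norm([3.0, -4.0]), linf_norm([3.0, -4.0]))
```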
                      Linear algebra
• Matrix calculus
   – Gradient, Jacobian matrix
   – Hessian
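The gradient collects the first partial derivatives of a scalar function, and the Hessian (the Jacobian of the gradient) collects the second partial derivatives. The sketch below approximates both by central finite differences for a hypothetical quadratic f(x) = x0^2 + 3 x0 x1 + 2 x1^2, whose exact gradient is [2 x0 + 3 x1, 3 x0 + 4 x1] and whose exact Hessian is [[2, 3], [3, 4]]. The step size h is an assumption.

```python
def f(x):
    """Hypothetical test function: x0^2 + 3*x0*x1 + 2*x1^2."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1] ** 2

def shifted(x, i, delta):
    """Copy of x with coordinate i shifted by delta."""
    y = list(x)
    y[i] += delta
    return y

def numerical_gradient(func, x, h=1e-4):
    """Central-difference approximation of the gradient of func at x."""
    return [
        (func(shifted(x, i, h)) - func(shifted(x, i, -h))) / (2.0 * h)
        for i in range(len(x))
    ]

def numerical_hessian(func, x, h=1e-4):
    """Central differences of the gradient: row i holds d(grad)/dx_i."""
    hess = []
    for i in range(len(x)):
        g_plus = numerical_gradient(func, shifted(x, i, h), h)
        g_minus = numerical_gradient(func, shifted(x, i, -h), h)
        hess.append([(gp - gm) / (2.0 * h) for gp, gm in zip(g_plus, g_minus)])
    return hess

point = [1.0, 2.0]
grad = numerical_gradient(f, point)
hess = numerical_hessian(f, point)
```

Finite differences are mainly useful for checking analytic derivatives; the results agree with the exact values only up to rounding error.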
       Convex functions / Optimization
• Convex set
       Convex functions / Optimization
• Convex function
   – Jensen’s inequality
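Jensen's inequality states that f(E[X]) <= E[f(X)] for a convex function f. The sketch below checks it numerically for f(x) = x^2 on a made-up discrete distribution (the values and probabilities are assumptions).

```python
def f(x):
    """A convex function used for the check."""
    return x * x

# Hypothetical discrete random variable X.
values = [-1.0, 0.0, 2.0]
probs = [0.2, 0.5, 0.3]

mean_x = sum(p * x for p, x in zip(probs, values))

lhs = f(mean_x)                                      # f(E[X])
rhs = sum(p * f(x) for p, x in zip(probs, values))   # E[f(X)]

# For f(x) = x^2 the gap E[f(X)] - f(E[X]) is exactly the variance of X.
gap = rhs - lhs
```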
       Convex functions / Optimization
• Convex optimization: minimize f(x) subject to x ∈ C, where
  – f(x) is a convex function
  – C is a convex set
• Good thing about convex optimization
  – All locally optimal points are globally optimal
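To illustrate why this matters in practice, here is a sketch of projected gradient descent on a hypothetical one-dimensional problem: minimize f(x) = (x - 5)^2 over the convex set C = [0, 3]. Projected gradient descent, the step size, and the iteration count are choices made for this sketch, not something specified in the lecture. Because the problem is convex, every starting point should reach the same solution, x = 3.

```python
def project_onto_interval(x, lo, hi):
    """Euclidean projection of x onto the convex set [lo, hi]."""
    return min(max(x, lo), hi)

def projected_gradient_descent(grad, x0, lo, hi, step=0.1, num_iters=200):
    """Take a gradient step, then project back onto [lo, hi]."""
    x = x0
    for _ in range(num_iters):
        x = project_onto_interval(x - step * grad(x), lo, hi)
    return x

def grad_f(x):
    """Gradient of f(x) = (x - 5)^2."""
    return 2.0 * (x - 5.0)

# Different starting points inside C all reach the same constrained minimizer.
starts = (0.0, 1.5, 3.0)
solutions = [projected_gradient_descent(grad_f, x0, 0.0, 3.0) for x0 in starts]
```

The unconstrained minimizer x = 5 lies outside C, so the constrained solution sits on the boundary of the set.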