H. Madsen, Time Series Analysis, Chapman & Hall
                            Time Series Analysis
                                       Henrik Madsen
                                        hm@imm.dtu.dk
                         Informatics and Mathematical Modelling
                             Technical University of Denmark
                                  DK-2800 Kgs. Lyngby
  Outline of the lecture
           Practical information
           Introductory examples (See also Chapter 1)
           A brief outline of the course
           Chapter 2:
               Multivariate random variables
               The multivariate normal distribution
               Linear projections
           Example
  Introductory example – shares (COLO B 18m)
    From www.cse.dk
  Consumption of District Heating (VEKS) – data
           [Figure: two time series, Nov 1995 to Apr 1996. Top panel: Heat Consumption (GJ/h). Bottom panel: Air Temperature (°C).]
  Consumption of DH – simple model
           [Figure: Heat Consumption (GJ/h) plotted against Air Temperature (°C).]
  Consumption of DH – model error
           [Figure: two panels of Model Error (GJ/h), Nov 1995 to Apr 1996. Top panel: "Model Error". Bottom panel: "Model Error as it should be if the model were OK".]
  A brief outline of the course
           General aspects of multivariate random variables
           Prediction using the general linear model
           Time series models
           Some theory on linear systems
           Time series models with external input
    Some goals:
           Characterization of time series / signals; correlation functions,
           covariance functions, spectral distributions, stationarity,
           ergodicity, linearity, . . .
           Signal processing; filtering, sampling, smoothing
           Modelling; with or without external input
           Prediction / Control
  Multivariate random variables
           Joint and marginal densities
           Conditional distributions
           Expectations and moments
           Moments of multivariate random variables
           Conditional expectation
           The multivariate normal distribution
           Distributions derived from the normal distribution
           Linear projections
  Multivariate random variables
           Definition (n-dimensional random variable; random vector)
                $$X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix}$$
           Joint distribution function:
                $$F(x_1, \dots, x_n) = P\{X_1 \le x_1, \dots, X_n \le x_n\}$$
  Multivariate random variables
           Probability density function (continuous case):
                $$f(x_1, \dots, x_n) = \frac{\partial^n F(x_1, \dots, x_n)}{\partial x_1 \cdots \partial x_n}$$
                $$F(x_1, \dots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f(t_1, \dots, t_n) \, dt_1 \cdots dt_n$$
           Probability density function (discrete case):
                $$f(x_1, \dots, x_n) = P\{X_1 = x_1, \dots, X_n = x_n\}$$
  The Multivariate Normal Distribution
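           The joint p.d.f.
                $$f_X(x) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma}} \exp\left[ -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right]$$
           $\Sigma$ must be positive semidefinite (the density above requires $\Sigma$ to be positive definite, so that $\Sigma^{-1}$ exists)
           Notation: $X \sim N(\mu, \Sigma)$
           Standardized multivariate normal: $X \sim N(0, I)$
           $N(\mu, \Sigma) = \mu + T \, N(0, I)$, where $\Sigma = T T^T$
           If $X \sim N(\mu, \Sigma)$ and $Y = a + BX$ then
           $Y \sim N(a + B\mu, B \Sigma B^T)$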
           More relations between distributions in Sec. 2.7
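
The construction of N(µ, Σ) from N(0, I) and the rule for Y = a + BX can be tried out numerically. The sketch below is only an illustration: the values of `mu`, `Sigma`, `a` and `B` are hypothetical, and the Cholesky factor is just one possible choice of T with Σ = T T^T (it assumes Σ is positive definite).

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
mu = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

# One valid choice of T with Sigma = T T^T is the Cholesky factor
# (this requires Sigma to be positive definite).
T = np.linalg.cholesky(Sigma)

rng = np.random.default_rng(1)               # fixed seed for reproducibility
Z = rng.standard_normal(size=(2, 50_000))    # columns are draws from N(0, I)
X = mu[:, None] + T @ Z                      # columns are draws from N(mu, Sigma)

# Affine transformation Y = a + B X and its theoretical distribution.
a = np.array([0.5])
B = np.array([[1.0, -1.0]])
mean_Y = a + B @ mu                          # a + B mu
var_Y = B @ Sigma @ B.T                      # B Sigma B^T

Y = a[:, None] + B @ X
print(mean_Y, Y.mean(axis=1))                # theoretical vs. sample mean
print(var_Y, np.cov(Y))                      # theoretical vs. sample variance
```

The sample mean and variance of `Y` should lie close to `mean_Y` and `var_Y`, with the gap shrinking as the number of draws grows.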
  Marginal density function
           Sub-vector: $(X_1, \dots, X_k)^T$ $(k < n)$
           Marginal density function:
                $$f_S(x_1, \dots, x_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \dots, x_n) \, dx_{k+1} \cdots dx_n$$
           [Figure: left panel, joint density over (x1, x2); right panel, "Marginal histogram of 100000 samples" of x1 (Percent of Total).]
  Conditional distributions
           The conditional density of Y given X = x is defined as (for $f_X(x) > 0$):
                $$f_{Y|X=x}(y) = \frac{f_{X,Y}(x, y)}{f_X(x)}$$
           (joint density of (X, Y) divided by the marginal density of X evaluated at x)
           [Figure: density over the (x, y) plane.]
  Independence
           If knowledge of X does not give information about Y we get
           $f_{Y|X=x}(y) = f_Y(y)$
           This leads to the following definition of independence:
                $$f_{X,Y}(x, y) = f_X(x) f_Y(y)$$
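
For discrete variables the same definitions apply with probabilities in place of densities, which makes them easy to check by hand or in code. The sketch below uses two hypothetical joint tables (rows index values of X, columns index values of Y); the helper names `conditional_of_y` and `is_independent` are made up for this illustration.

```python
import numpy as np

# Hypothetical marginal pmfs for two discrete variables X and Y.
p_x = np.array([0.4, 0.6])
p_y = np.array([0.2, 0.5, 0.3])

# Under independence the joint pmf is the product of the marginals.
joint_indep = np.outer(p_x, p_y)

# A hypothetical dependent joint pmf with the same marginals.
joint_dep = np.array([[0.15, 0.15, 0.10],
                      [0.05, 0.35, 0.20]])

def conditional_of_y(joint):
    """Row i holds the pmf of Y given X = x_i: joint divided by the marginal of X."""
    return joint / joint.sum(axis=1, keepdims=True)

def is_independent(joint):
    """Check that the joint equals the product of its marginals in every cell."""
    fx = joint.sum(axis=1)
    fy = joint.sum(axis=0)
    return bool(np.allclose(joint, np.outer(fx, fy)))

print(conditional_of_y(joint_indep))   # every row equals p_y
print(conditional_of_y(joint_dep))     # rows differ: X carries information about Y
print(is_independent(joint_indep), is_independent(joint_dep))
```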
  Expectation
           Let X be a univariate random variable with density $f_X(x)$. The
           expectation of X is then defined as:
                $$E[X] = \int_{-\infty}^{\infty} x f_X(x) \, dx \quad \text{(continuous case)}$$
                $$E[X] = \sum_{\text{all } x} x P(X = x) \quad \text{(discrete case)}$$
           Calculation rule:
                $$E[a + bX_1 + cX_2] = a + b E[X_1] + c E[X_2]$$
  Moments and variance
           n-th moment:
                $$E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x) \, dx$$
           n-th central moment:
                $$E[(X - E[X])^n] = \int_{-\infty}^{\infty} (x - E[X])^n f_X(x) \, dx$$
           The second central moment is called the variance:
                $$V[X] = E[(X - E[X])^2] = E[X^2] - (E[X])^2$$
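
The second equality in the variance formula follows from the linearity of the expectation, treating E[X] as a constant. A short derivation, written out for reference:

```latex
\begin{align*}
V[X] &= E\big[(X - E[X])^2\big] \\
     &= E\big[X^2 - 2 X E[X] + (E[X])^2\big] \\
     &= E[X^2] - 2 E[X] E[X] + (E[X])^2 \\
     &= E[X^2] - (E[X])^2
\end{align*}
```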
  Covariance
           Covariance:
                $$\mathrm{Cov}[X_1, X_2] = E[(X_1 - E[X_1])(X_2 - E[X_2])] = E[X_1 X_2] - E[X_1] E[X_2]$$
           Variance and covariance:
                $$V[X] = \mathrm{Cov}[X, X]$$
           Calculation rule:
                $$\mathrm{Cov}[aX_1 + bX_2, cX_3 + dX_4] = ac \, \mathrm{Cov}[X_1, X_3] + ad \, \mathrm{Cov}[X_1, X_4] + bc \, \mathrm{Cov}[X_2, X_3] + bd \, \mathrm{Cov}[X_2, X_4]$$
           The calculation rule can be used for the variance also
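
As an illustration of the last remark, the variance of a linear combination is obtained by setting X3 = X1, X4 = X2, c = a and d = b in the calculation rule and using Cov[X1, X2] = Cov[X2, X1]:

```latex
\begin{align*}
V[aX_1 + bX_2] &= \mathrm{Cov}[aX_1 + bX_2,\; aX_1 + bX_2] \\
               &= a^2 \, \mathrm{Cov}[X_1, X_1] + ab \, \mathrm{Cov}[X_1, X_2]
                  + ba \, \mathrm{Cov}[X_2, X_1] + b^2 \, \mathrm{Cov}[X_2, X_2] \\
               &= a^2 V[X_1] + b^2 V[X_2] + 2ab \, \mathrm{Cov}[X_1, X_2]
\end{align*}
```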
  Expectation and Variance for Random Vectors
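           Expectation: $E[X] = [E[X_1] \; E[X_2] \; \dots \; E[X_n]]^T$
           Variance-covariance (matrix):
                $$\Sigma_X = V[X] = E[(X - \mu)(X - \mu)^T] = \begin{pmatrix} V[X_1] & \mathrm{Cov}[X_1, X_2] & \cdots & \mathrm{Cov}[X_1, X_n] \\ \mathrm{Cov}[X_2, X_1] & V[X_2] & \cdots & \mathrm{Cov}[X_2, X_n] \\ \vdots & & & \vdots \\ \mathrm{Cov}[X_n, X_1] & \mathrm{Cov}[X_n, X_2] & \cdots & V[X_n] \end{pmatrix}$$
           Correlation:
                $$\rho_{ij} = \frac{\mathrm{Cov}[X_i, X_j]}{\sqrt{V[X_i] V[X_j]}} = \frac{\sigma_{ij}}{\sigma_i \sigma_j}$$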
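
The correlation formula can be applied to a whole variance-covariance matrix at once. A minimal sketch, assuming NumPy; the function name `cov_to_corr` and the example matrix are hypothetical and serve only as an illustration.

```python
import numpy as np

def cov_to_corr(sigma):
    """Convert a variance-covariance matrix into a correlation matrix."""
    sigma = np.asarray(sigma, dtype=float)
    std = np.sqrt(np.diag(sigma))        # sigma_i = sqrt(V[X_i])
    return sigma / np.outer(std, std)    # rho_ij = sigma_ij / (sigma_i * sigma_j)

# Hypothetical 2x2 variance-covariance matrix, chosen only for illustration.
sigma_example = np.array([[4.0, 3.0],
                          [3.0, 9.0]])
rho_example = cov_to_corr(sigma_example)
print(rho_example)
```

Here the standard deviations are 2 and 3, so the off-diagonal correlation is 3 / (2 · 3) = 0.5 and the diagonal is 1.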
  Expectation and Variance for Random Vectors
           The correlation matrix $R = \rho$ is an arrangement of $\rho_{ij}$ in a
           matrix
           Covariance matrix between X (dim. p) and Y (dim. q):
                $$\Sigma_{XY} = C[X, Y] = E[(X - \mu)(Y - \nu)^T] = \begin{pmatrix} \mathrm{Cov}[X_1, Y_1] & \cdots & \mathrm{Cov}[X_1, Y_q] \\ \vdots & & \vdots \\ \mathrm{Cov}[X_p, Y_1] & \cdots & \mathrm{Cov}[X_p, Y_q] \end{pmatrix}$$
           Calculation rules – see the book.
           The special case of the variance $C[X, X] = V[X]$ results in
           $V[AX] = A V[X] A^T$
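
The rule V[AX] = A V[X] A^T also holds for sample variance-covariance matrices, so it can be checked on simulated data. The sketch below is one way to do this; the matrix `A` and the simulated draws are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, purely for reproducibility

# 1000 draws of a 3-dimensional vector X (rows = components, columns = draws).
X = rng.normal(size=(3, 1000))
X[1] += 0.5 * X[0]                       # introduce some correlation between components

A = np.array([[1.0, 2.0, 0.0],           # hypothetical 2x3 transformation matrix
              [0.0, -1.0, 3.0]])

V_X = np.cov(X)                          # sample estimate of V[X], shape (3, 3)
V_AX_direct = np.cov(A @ X)              # sample variance of the transformed draws
V_AX_rule = A @ V_X @ A.T                # the rule V[AX] = A V[X] A^T

print(np.allclose(V_AX_direct, V_AX_rule))
```

The two results agree up to floating-point rounding, because the sample covariance is itself a bilinear function of the data.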
  Conditional expectation
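                $$E[Y \mid X = x] = \int_{-\infty}^{\infty} y \, f_{Y|X=x}(y) \, dy$$
                $$E[Y \mid X] = E[Y] \quad \text{if } X \text{ and } Y \text{ are independent}$$
                $$E[Y] = E\big[E[Y \mid X]\big]$$
                $$E[g(X) Y \mid X] = g(X) E[Y \mid X]$$
                $$E[g(X) Y] = E\big[g(X) E[Y \mid X]\big]$$
                $$E[a \mid X] = a$$
                $$E[g(X) \mid X] = g(X)$$
                $$E[cX + dZ \mid Y] = c E[X \mid Y] + d E[Z \mid Y]$$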
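
The rule E[Y] = E[E[Y|X]] can be seen directly from the definitions of the conditional and marginal densities. A sketch of the argument for the continuous case, assuming the integrals exist and the order of integration may be exchanged:

```latex
\begin{align*}
E\big[E[Y \mid X]\big]
  &= \int_{-\infty}^{\infty} E[Y \mid X = x] \, f_X(x) \, dx \\
  &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, f_{Y|X=x}(y) \, f_X(x) \, dy \, dx \\
  &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, f_{X,Y}(x, y) \, dx \, dy \\
  &= \int_{-\infty}^{\infty} y \, f_Y(y) \, dy = E[Y]
\end{align*}
```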
  Variance separation
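           Definition of conditional variance and covariance:
                $$V[Y \mid X] = E\big[(Y - E[Y \mid X])(Y - E[Y \mid X])^T \mid X\big]$$
                $$C[Y, Z \mid X] = E\big[(Y - E[Y \mid X])(Z - E[Z \mid X])^T \mid X\big]$$
           The variance separation theorem:
                $$V[Y] = E\big[V[Y \mid X]\big] + V\big[E[Y \mid X]\big]$$
                $$C[Y, Z] = E\big[C[Y, Z \mid X]\big] + C\big[E[Y \mid X], E[Z \mid X]\big]$$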
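
For scalar discrete variables, the variance separation theorem can be verified exactly from a joint probability table. The sketch below uses a hypothetical joint pmf (rows index the values of X, columns the values of Y); all numbers are invented for illustration.

```python
import numpy as np

# Hypothetical joint pmf P(X = x_i, Y = y_j); rows index x, columns index y.
y_vals = np.array([-1.0, 0.0, 2.0])
joint = np.array([[0.10, 0.20, 0.10],
                  [0.30, 0.10, 0.20]])

p_x = joint.sum(axis=1)                       # marginal pmf of X
p_y = joint.sum(axis=0)                       # marginal pmf of Y
cond_y_given_x = joint / p_x[:, None]         # P(Y = y_j | X = x_i)

# Conditional mean and variance of Y for each value of X.
e_y_given_x = cond_y_given_x @ y_vals
v_y_given_x = cond_y_given_x @ y_vals**2 - e_y_given_x**2

# Unconditional mean and variance of Y from its marginal pmf.
e_y = p_y @ y_vals
v_y = p_y @ y_vals**2 - e_y**2

# E[V[Y|X]] and V[E[Y|X]], both taken over the distribution of X.
mean_of_cond_var = p_x @ v_y_given_x
var_of_cond_mean = p_x @ e_y_given_x**2 - (p_x @ e_y_given_x)**2

print(e_y, p_x @ e_y_given_x)                       # E[Y] and E[E[Y|X]]
print(v_y, mean_of_cond_var + var_of_cond_mean)     # V[Y] and E[V[Y|X]] + V[E[Y|X]]
```

Each printed pair should agree up to rounding: the first illustrates E[Y] = E[E[Y|X]], the second the variance separation.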
  Linear Projections
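           Consider two random vectors Y and X, then
                $$E\begin{pmatrix} Y \\ X \end{pmatrix} = \begin{pmatrix} \mu_Y \\ \mu_X \end{pmatrix} \quad \text{and} \quad V\begin{pmatrix} Y \\ X \end{pmatrix} = \begin{pmatrix} \Sigma_{YY} & \Sigma_{YX} \\ \Sigma_{XY} & \Sigma_{XX} \end{pmatrix}$$
           Consider the linear projection: $E[Y \mid X] = a + BX$
           Then:
                $$E[Y \mid X] = \mu_Y + \Sigma_{YX} \Sigma_{XX}^{-1} (X - \mu_X)$$
                $$V\big[Y - E[Y \mid X]\big] = \Sigma_{YY} - \Sigma_{YX} \Sigma_{XX}^{-1} \Sigma_{YX}^T$$
                $$C\big[Y - E[Y \mid X], X\big] = 0$$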
           The linear projection above has minimal variance among all
           linear projections.
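
The projection formulas translate almost line by line into code. The following is a minimal sketch, assuming NumPy and an invertible Σ_XX; the function name `linear_projection` and all numerical values are hypothetical.

```python
import numpy as np

def linear_projection(mu_y, mu_x, S_yy, S_yx, S_xx, x):
    """Return E[Y|X=x] and V[Y - E[Y|X]] for the linear projection."""
    # Solve a linear system instead of forming the inverse of S_xx explicitly.
    gain = np.linalg.solve(S_xx.T, S_yx.T).T    # S_yx S_xx^{-1}
    y_hat = mu_y + gain @ (x - mu_x)            # mu_Y + S_yx S_xx^{-1} (x - mu_X)
    err_var = S_yy - gain @ S_yx.T              # S_yy - S_yx S_xx^{-1} S_yx^T
    return y_hat, err_var

# Hypothetical moments: Y is 1-dimensional, X is 2-dimensional.
mu_y = np.array([1.0])
mu_x = np.array([0.0, 0.0])
S_yy = np.array([[2.0]])
S_yx = np.array([[1.0, 0.5]])
S_xx = np.array([[2.0, 0.0],
                 [0.0, 1.0]])
x_obs = np.array([2.0, 1.0])

y_hat, err_var = linear_projection(mu_y, mu_x, S_yy, S_yx, S_xx, x_obs)
print(y_hat, err_var)

# Covariance between the projection error and X: S_yx - (S_yx S_xx^{-1}) S_xx.
residual_cov = S_yx - np.linalg.solve(S_xx.T, S_yx.T).T @ S_xx
print(residual_cov)
```

The last quantity is the covariance between the projection error and X, which the slide states is zero.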
  Air pollution in cities
           Carstensen (1990) has used time series analysis to set up
           models for NO and NO2 at Jagtvej in Copenhagen
           Measurements of NO and NO2 are available every third hour (00,
           03, 06, 09, 12, ...)
           We have $\mu_{NO_2} = 48\,\mu g/m^3$ and $\mu_{NO} = 79\,\mu g/m^3$
           The model uses $X_{1,t} = NO_{2,t} - \mu_{NO_2}$ and $X_{2,t} = NO_t - \mu_{NO}$
  Air pollution in cities – model and forecast
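                $$\begin{pmatrix} X_{1,t} \\ X_{2,t} \end{pmatrix} = \begin{pmatrix} 0.9 & -0.1 \\ 0.4 & 0.8 \end{pmatrix} \begin{pmatrix} X_{1,t-1} \\ X_{2,t-1} \end{pmatrix} + \begin{pmatrix} \xi_{1,t} \\ \xi_{2,t} \end{pmatrix}$$
                $$X_t = \Phi X_{t-1} + \xi_t$$
                $$V[\xi_t] = \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix} = \begin{pmatrix} 30 & 21 \\ 21 & 23 \end{pmatrix} \; (\mu g/m^3)^2$$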
           Assume that t corresponds to 09:00 today and we have
           measurements 64 µg/m3 NO2 and 93 µg/m3 NO
           Forecast the concentrations at 12:00 (t + 1)
           What is the variance-covariance of this forecast?
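
One way to approach the two questions is sketched below. It uses the means µNO2 = 48 and µNO = 79 µg/m3 and the model matrices given on the slides, and it assumes (not stated explicitly on the slides) that the noise term at t + 1 has mean zero and is uncorrelated with X_t. Under that assumption the forecast is Φ X_t and the forecast error is the noise term itself.

```python
import numpy as np

mu = np.array([48.0, 79.0])               # mean levels of NO2 and NO (µg/m3)
Phi = np.array([[0.9, -0.1],
                [0.4,  0.8]])
Sigma = np.array([[30.0, 21.0],
                  [21.0, 23.0]])          # V[xi_t], in (µg/m3)^2

obs_t = np.array([64.0, 93.0])            # NO2 and NO measured at 09:00
x_t = obs_t - mu                          # deviations from the means

# Assumption: xi_{t+1} has mean zero and is uncorrelated with X_t,
# so the one-step forecast is Phi X_t and the forecast error is xi_{t+1}.
x_forecast = Phi @ x_t
conc_forecast = mu + x_forecast           # back on the concentration scale
forecast_error_cov = Sigma                # variance-covariance of the forecast error

print(conc_forecast)
print(forecast_error_cov)
```

With these numbers the deviations at 09:00 are (16, 14), the forecast deviations are (13.0, 17.6), and the forecast concentrations at 12:00 come out at about 61.0 µg/m3 NO2 and 96.6 µg/m3 NO.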
  Air pollution in cities – linear projection
           At 12:00 (t + 1) we now assume that NO2 is measured with
           67 µg/m3 as the result, but NO cannot be measured due to
           some trouble with the equipment.
           Estimate the missing NO measurement.
           What is the variance of the error of the estimation?
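
A possible route to the answer is the linear projection of X2 on X1 at time t + 1, conditional on the 09:00 measurements (64 µg/m3 NO2 and 93 µg/m3 NO). The sketch below repeats the model constants from the slides so that it runs on its own; it assumes the noise has mean zero and is uncorrelated with X_t, so that the one-step forecast errors have variance-covariance Σ.

```python
import numpy as np

mu = np.array([48.0, 79.0])               # mean levels of NO2 and NO (µg/m3)
Phi = np.array([[0.9, -0.1],
                [0.4,  0.8]])
Sigma = np.array([[30.0, 21.0],
                  [21.0, 23.0]])          # V[xi_t], in (µg/m3)^2

x_t = np.array([64.0, 93.0]) - mu         # deviations observed at 09:00
pred = Phi @ x_t                          # forecast of (X1, X2) at 12:00

x1_obs = 67.0 - mu[0]                     # NO2 deviation actually measured at 12:00

# Linear projection of X2 on X1, using the forecast as the mean and
# Sigma as the variance-covariance of (X1, X2) given the 09:00 data.
gain = Sigma[1, 0] / Sigma[0, 0]          # Sigma_YX Sigma_XX^{-1} in the scalar case
x2_hat = pred[1] + gain * (x1_obs - pred[0])
no_estimate = mu[1] + x2_hat              # estimate of the missing NO measurement
error_var = Sigma[1, 1] - Sigma[1, 0] ** 2 / Sigma[0, 0]

print(no_estimate, error_var)
```

Under these assumptions the estimate is about 100.8 µg/m3 NO with an error variance of about 8.3 (µg/m3)^2, compared with 23 (µg/m3)^2 when the NO2 measurement is not used.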