Price Forecasting Using Statistical Models
Data
   1. Time series – data gathered sequentially in time;
      correlation among observations usually exists.
   2. Cross section – data for a specific period of time, consisting
      of uncorrelated observations.
Model - a set of assumptions that summarizes the structure of
        a system.
Purpose of Modeling
   1. To understand the mechanism that generates the data
   2. To forecast
   3. To optimize the response of a variable of interest
Model Building Process
   1. Planning
      a. Define the problem
      b. Determine an appropriate model
      c. Define the variables
   2. Development of the model
      a. Collect data
      b. Estimate the model
      c. Statistical hypothesis
      d. Diagnostic tests
   3. Verification of the model
      a. Forecasting ability
      b. Usefulness of the model
AECO 122 Lecture Note: Nora DM. Carambas                      Page 1
Models Used in Forecasting
   1. Non-structured or time series models
   2. Structured or econometric models
Structured or Causal or Fundamental or Econometric Models
   - incorporates variables or factors that might influence
     price, e.g. seasonal use patterns, seasonal supply
     patterns, prices of related goods, market structure,
     income.
   - technique for investigating relationships among
     variables (hence "causal models")
   - tool in research and planning
   - applicable in social sciences, business, and economics
   a. Variables
           Y = dependent variable
           X’s = independent variables
           p = number of independent variables
           n = number of observations
     Example:
      Pt = deflated season average cranberry price
      Pt-1 = lagged deflated season average cranberry price
      PCS = per capita supply of cranberry
      PCI = per capita deflated disposable income
      D = slope dummy, equal to 0 for 1954-76 observations and
            to PCS for 1991-2002 observations
    Observations must be matched across all variables.
   b. Linear Model
             $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$$
      Example:
             $$P_t = \beta_0 + \beta_1 P_{t-1} + \beta_2 PCS + \beta_3 PCI + \beta_4 D + \varepsilon$$
   c. Reasons for inclusion of the error term ε
      i. to measure the error
      ii. to account for other independent variables not in the
          model
   d. Regression assumptions
      i. The mean of the error term is 0, μ = 0.
      ii. The variance of the error term is the same for all
           observations, σ² = c (a constant).
      iii. The error terms among observations are
           uncorrelated.
      iv. The error term follows a normal distribution.
      v. The independent variables are not strongly
           correlated.
   e. Problems
      i. To estimate the regression coefficients
      ii. To verify the validity of assumptions
      iii. To check the usefulness of the model
   f. Estimation of regression coefficients
      - least-squares method
         - minimizes the sum of squared deviations of the
           observations from the fitted line.
   g. Interpretation of coefficients
      - in relation to theory
      - effect on Y
      - coefficients are partial slopes (partial elasticities when
        the variables are in logarithms)
   h. Significance of the effects of variables
      - tests whether the independent variables affect the
        dependent variable.
         - t-test: large absolute t-values support significance
         - p-value: small values imply a significant effect
   i. Regression tests
      i. ANOVA – F-test
      ii. Coefficient of determination – R²
      iii. Diagnostic checking of residuals
      iv. Heteroscedasticity – Goldfeld-Quandt test
      v. Autocorrelation – Durbin-Watson test
      vi. Multicollinearity – ridge regression, deletion
      vii. Use of dummy variables
   j. Forecasting (prediction)
   - based on the given levels of Xs, forecast the value of Y.
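
The estimation, checking, and forecasting steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up (they are not the cranberry series), the function names are hypothetical, and the normal equations are solved with a small hand-written Gauss-Jordan routine so that the block needs no external packages. In practice a statistical package would also report the t-tests and F-test.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; rows are copied so the inputs stay unchanged.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_fit(x_rows, y):
    """Least-squares coefficients [b0, b1, ..., bp] of y on the X's."""
    design = [[1.0] + list(row) for row in x_rows]  # intercept column first
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, x_row):
    """Fitted or forecast value of Y for given levels of the X's."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], x_row))


def r_squared(y, residuals):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_y = sum(y) / len(y)
    sse = sum(e ** 2 for e in residuals)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    return num / sum(e ** 2 for e in residuals)


# Hypothetical data: six observations on two independent variables.
x_rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0), (6.0, 5.0)]
y = [3.1, 7.2, 6.7, 11.1, 10.8, 15.1]

coefs = ols_fit(x_rows, y)
residuals = [yi - predict(coefs, row) for row, yi in zip(x_rows, y)]
r2 = r_squared(y, residuals)
dw = durbin_watson(residuals)
forecast = predict(coefs, (7.0, 8.0))  # forecast Y at chosen levels of the X's

print("coefficients:", coefs)
print("R-squared:", r2, "Durbin-Watson:", dw)
print("forecast:", forecast)
```

Because the model includes an intercept, the least-squares residuals sum to zero and are uncorrelated with each X, which is a quick check that the estimation step worked.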
Non-Structured or Time-Series Models
   - attempt to predict the future using historical data.
   - assume that what will happen in the future is a function
     of what happened in the past.
   1. Naïve Model
      - assumes the forecast price for period t+1 equals the
        actual price in period t.
             $$P_{t+1} = P_t$$
      where
        Pt+1 = forecast price in period t+1
        Pt = actual price in period t
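
As a minimal sketch of the naïve model, the forecast is simply the most recent actual price. The price list and the function name are hypothetical.

```python
def naive_forecast(actual_prices):
    """Forecast for period t+1 equals the actual price in period t."""
    return actual_prices[-1]


# Hypothetical actual prices for periods 1 to 4.
prices = [42.0, 45.5, 44.0, 47.2]
next_price = naive_forecast(prices)
print(next_price)
```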
   2. Moving Average
      - assumes that price will stay fairly steady over time.
      - smooths out sudden fluctuations in order to provide
        stable estimates.
         $$\text{Moving average} = \frac{\sum \text{price in previous } n \text{ periods}}{n}$$
      Limitations
      1. Increasing n does smooth out fluctuations better, but
         it makes the method less sensitive to real changes in
         the data.
      2. It cannot pick up trends very well.
      3. It requires extensive record keeping of past data when
         n is large.
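
The moving average forecast can be sketched as follows, assuming a made-up price series and a hypothetical function name. Comparing n = 3 with n = 5 shows the first limitation: the longer average reacts less to the recent rise in price.

```python
def moving_average_forecast(prices, n):
    """Forecast the next period as the mean of the previous n prices."""
    if n <= 0 or n > len(prices):
        raise ValueError("n must be between 1 and the number of prices")
    return sum(prices[-n:]) / n


# Hypothetical actual prices for five periods.
prices = [40.0, 44.0, 42.0, 46.0, 48.0]
ma3 = moving_average_forecast(prices, 3)  # uses the last three prices
ma5 = moving_average_forecast(prices, 5)  # uses all five prices
print(ma3, ma5)
```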
   3. Exponential Smoothing
      New forecast = last period’s forecast + α (last period’s
                    actual price – last period’s forecast)
             $$P_{t+1} = P_t + \alpha (AP_t - P_t)$$
             $$P_{t+1} = \alpha AP_t + (1 - \alpha) P_t$$
      where
        Pt+1 = price forecast for period t+1
        Pt = price forecast for period t
        APt = actual price in period t
        α = smoothing constant (0 ≤ α ≤ 1)
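
The updating rule can be applied period by period as in the sketch below. The prices, the value of α, and the function name are hypothetical. The notes do not say how the first forecast is obtained, so it is passed in as an assumption (here it is set equal to the first actual price).

```python
def exponential_smoothing(actual_prices, alpha, initial_forecast):
    """Return forecasts for periods 1 to n+1.

    Each new forecast = last forecast + alpha * (last actual - last forecast).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    forecasts = [initial_forecast]  # forecast for period 1 (assumed)
    for actual in actual_prices:
        previous = forecasts[-1]
        forecasts.append(previous + alpha * (actual - previous))
    return forecasts


# Hypothetical actual prices for periods 1 to 3.
actual = [100.0, 110.0, 105.0]
forecasts = exponential_smoothing(actual, alpha=0.5, initial_forecast=100.0)
print(forecasts)  # last element is the forecast for period 4
```

A larger α puts more weight on the most recent actual price; with α = 1 the method reduces to the naïve model.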
   4. Trend Projection
      - fits a trend line to a series of historical data and then
        projects the trend line into the future for medium- to
        long-range forecasts
      - mathematical trend equations:
          Linear: $P_t = \alpha + \beta t + \varepsilon$
          Exponential: $P_t = \alpha t^{\beta}$
          Quadratic: $P_t = \alpha + \beta t + \delta t^2$
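
A minimal sketch of the linear trend case: α and β are estimated by least squares with the period number t = 1, 2, ..., n as the only independent variable, and the fitted line is then extended to a future period. The price series and function names are hypothetical.

```python
def fit_linear_trend(prices):
    """Least-squares intercept and slope of price on period t = 1..n."""
    n = len(prices)
    periods = range(1, n + 1)
    mean_t = sum(periods) / n
    mean_p = sum(prices) / n
    slope = (sum((t - mean_t) * (p - mean_p) for t, p in zip(periods, prices))
             / sum((t - mean_t) ** 2 for t in periods))
    intercept = mean_p - slope * mean_t
    return intercept, slope


def project_trend(intercept, slope, period):
    """Value of the fitted trend line at a given (possibly future) period."""
    return intercept + slope * period


# Hypothetical actual prices for periods 1 to 4.
prices = [12.0, 15.0, 15.0, 18.0]
intercept, slope = fit_linear_trend(prices)
forecast_5 = project_trend(intercept, slope, 5)
print(intercept, slope, forecast_5)
```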
   5. Decomposition Method
          Multiply the trend line forecast by the appropriate
          seasonal index.
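
The multiplication step can be illustrated as below. The notes do not state how the seasonal index is computed; this sketch assumes a simple version, the average price of each season divided by the overall average price. The quarterly prices, the trend forecast, and the function names are all hypothetical.

```python
def seasonal_indices(prices_by_year):
    """Average price of each season divided by the overall average price."""
    n_seasons = len(prices_by_year[0])
    all_prices = [p for year in prices_by_year for p in year]
    overall_mean = sum(all_prices) / len(all_prices)
    indices = []
    for s in range(n_seasons):
        season_mean = sum(year[s] for year in prices_by_year) / len(prices_by_year)
        indices.append(season_mean / overall_mean)
    return indices


def seasonal_forecast(trend_forecast, seasonal_index):
    """Trend line forecast adjusted by the appropriate seasonal index."""
    return trend_forecast * seasonal_index


# Hypothetical quarterly prices for two years (Q1 to Q4).
prices_by_year = [
    [80.0, 100.0, 120.0, 100.0],
    [90.0, 110.0, 130.0, 110.0],
]
indices = seasonal_indices(prices_by_year)

# Hypothetical trend line forecast for the third quarter of the next year.
trend_q3 = 120.0
forecast_q3 = seasonal_forecast(trend_q3, indices[2])
print(indices, forecast_q3)
```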
   6. Random Walk Model
      - states that price series do not exhibit predictive
        patterns over time but can best be described with a
        random walk.
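
A random walk without drift is usually written as P(t) = P(t-1) + e(t), where e(t) is a random shock with mean zero. The sketch below simulates such a series; the starting price, shock size, seed, and function names are hypothetical. Since the shocks cannot be predicted, the best forecast of the next price is the current price, the same as in the naïve model.

```python
import random


def simulate_random_walk(start_price, n_periods, shock_sd, seed=0):
    """Simulate P(t) = P(t-1) + e(t) with normally distributed shocks."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    prices = [start_price]
    for _ in range(n_periods):
        prices.append(prices[-1] + rng.gauss(0.0, shock_sd))
    return prices


def random_walk_forecast(prices):
    """Under a random walk without drift, forecast the last observed price."""
    return prices[-1]


walk = simulate_random_walk(start_price=50.0, n_periods=12, shock_sd=2.0, seed=1)
print(walk)
print(random_walk_forecast(walk))
```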
   7. Box-Jenkins ARIMA Model
   8. ARCH (Autoregressive Conditional Heteroskedastic) Model
   9. GARCH (Generalized Autoregressive Conditional
      Heteroskedastic) Model
Measures of Forecast Accuracy
   - to see how well one model works, or to compare
     models, the forecasted values are compared with the
     actual or observed values.
   Forecast error = actual value – forecast value
1. Mean Absolute Deviation (MAD)
      $$MAD = \frac{\sum |\text{forecast error}|}{n}$$
2. Mean Squared Error (MSE)
      $$MSE = \frac{\sum (\text{forecast error})^2}{n}$$
3. Mean Absolute Percent Error (MAPE)
   - The average of the absolute values of the errors
     expressed as percentage of the actual values
      $$MAPE = \frac{\sum \left( \frac{|\text{forecast error}|}{\text{actual value}} \right) \times 100}{n}$$
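
The three measures can be computed from paired actual and forecast values as in this sketch. The numbers and function names are hypothetical. MAPE assumes that no actual value is zero.

```python
def forecast_errors(actual, forecast):
    """Forecast error = actual value - forecast value."""
    return [a - f for a, f in zip(actual, forecast)]


def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    errors = forecast_errors(actual, forecast)
    return sum(abs(e) for e in errors) / len(errors)


def mse(actual, forecast):
    """Mean of the squared forecast errors."""
    errors = forecast_errors(actual, forecast)
    return sum(e ** 2 for e in errors) / len(errors)


def mape(actual, forecast):
    """Mean absolute error as a percentage of the actual values."""
    errors = forecast_errors(actual, forecast)
    return sum(abs(e) / a for e, a in zip(errors, actual)) * 100 / len(errors)


# Hypothetical actual and forecast prices for three periods.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 380.0]

print("MAD:", mad(actual, forecast))
print("MSE:", mse(actual, forecast))
print("MAPE:", mape(actual, forecast))
```

When comparing models on the same data, the model with the smaller value of the chosen measure has the better forecasting record.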