Demand Forecasting for Businesses
8 February 2012 
 
 
J. Scott Armstrong 
The Wharton School, University of Pennsylvania 
747 Huntsman, Philadelphia, PA 19104, U.S.A. 
T: +1 610 622 6480 F: +1 215 898 2534 armstrong@wharton.upenn.edu 
 
Kesten C. Green 
International Graduate School of Business, University of South Australia 
City West Campus, North Terrace, Adelaide, SA 5000, Australia 
T: +61 8 8302 9097 F: +61 8 8302 0709 kesten.green@unisa.edu.au 
 
 
 
ABSTRACT 
 
This  chapter  provides  principles  for  forecasting  demand  that  are  based  on  evidence  of  the  relative 
accuracy of forecasts from alternative procedures and methods. When quantitative data are scarce, as is 
often the case in forecasting, one must rely on judgment. To do so, impose structure on judgment by 
using  prediction  markets,  expert  surveys,  intentions  surveys,  judgmental  bootstrapping,  structured 
analogies,  and  simulated  interaction.  Avoid  intuition,  unstructured  meetings,  game  theory,  and  focus 
groups.  Where  quantitative  data  are  abundant,  use  extrapolation,  quantitative  analogies,  rule-based 
forecasting, and causal methods. Among causal methods, use econometrics when theory is sound and 
there is much data, some prior knowledge, and few important variables. Use index models for choosing 
the best or most likely option when there are many important variables and much knowledge about the 
situation. Use structured procedures to incorporate managers' domain knowledge into forecasts from
quantitative  methods.  Combine  forecasts  from  different  forecasters  and  different  evidence-based 
methods. Avoid complex methods. Avoid quantitative methods that have not been validated and those 
that ignore domain knowledge, such as neural networks, stepwise regression, and data mining. Given 
that invalid methods are widely used and valid ones often overlooked, there are many opportunities for 
companies to improve forecasting and decision-making. 
 
Keywords: competitor behaviour, forecast accuracy, market share, market size, sales forecasting. 
 
  
 
Demand forecasting asks, "How much can be sold, given the situation?" The situation includes the
broader economy, social and legal environment, and the nature of the market. It also includes actions 
by the firm, by its competitors, and by interest groups. Thanks to empirical research, there have been 
substantial improvements in forecasting methods. As an indication of the increasing rate of progress, 
about 75% of the papers we cite were published since 2000.  
In this chapter we describe evidence from comparative studies of forecasting methods. We then 
make evidence-based recommendations for improving the accuracy of your organization's forecasts. We answer the question, "What are the best methods for my forecasting problem?" The chapter
provides information on new and useful methods that have not been included in previous reviews of 
demand  forecasting.  We  also  tell  you  what  commonly  used  methods  are  not  effective.  Sources  are 
provided so that you can access the evidence for yourself.  
 
Overview of possible methods 
 
Forecasting methods that might be used to forecast demand are shown in Figure 4.1, the Methodology 
Tree for Forecasting. The primary distinction is between methods that rely on judgement and those that 
estimate relationships from quantitative data.  
 
Methods that rely mainly on judgment 
 
Unaided judgment 
 
Important forecasts are usually made using unaided judgment. By unaided we mean judgment that 
does not use evidence-based procedures. Forecasts that are typically made in this way include those for 
sales of a new product; effects of changes in design, pricing, or advertising; and competitor behaviour. 
Forecasts by experts using their unaided judgment are most likely to be accurate when the situation is 
well  understood  and  simple,  there  is  little  uncertainty,  and  the  experts  receive  accurate,  timely,  and 
well-summarized feedback about their forecasts. 
  Beware! Unaided judgement is often used when the above conditions are not met. Research on 
forecasting for highly uncertain and complex situations has found that experts' unaided judgements are
of little value. For example, a study of more than 82,000 judgmental forecasts made over 20 years by 
284 experts in politics and economics found that their unaided forecasts were little more accurate than 
those  made  by  non-experts,  and  they  were  less  accurate  than  forecasts  from  simple  models  (Tetlock 
2005).   
 
Prediction markets 
 
Despite numerous attempts since the 1930s to devise a better model for forecasting prices, no method 
has  been  found  to  be  superior  to  free  markets.  However,  few  people  believe  this  as  they  pay 
handsomely for investment recommendations in an effort to do better than the market.  
Prediction markets, which are also known as betting markets, information markets, and futures 
markets, have been used to make forecasts since the 1800s. Prediction markets can be created in order 
to predict such things as the proportion of U.S. households with three or more vehicles by the end of 
2015. Confidential markets can be established within firms so that employees can reveal their forecasts of such things as first year sales of a new product by buying and selling contracts that reward accurate forecasts.

Figure 4.1: Methodology Tree for Forecasting
[Figure 4.1 shows the methods as branches of a tree, divided by knowledge source. Judgment-based branches include unaided judgment, expert forecasting, intentions/expectations/experimentation, conjoint analysis, role playing (simulated interaction), structured analogies, decomposition, judgmental bootstrapping, and expert systems. Branches requiring quantitative data include extrapolation models, quantitative analogies, neural nets, rule-based forecasting, data mining, and causal methods (regression analysis, the index method, and segmentation).]
Some unpublished studies suggest that prediction markets can produce accurate sales forecasts 
for companies. However, the average improvement in accuracy across eight published comparisons in 
the field of business forecasting (relative to forecasts from, variously, naïve models, econometric models, individual judgment, and statistical groups) was zero. While the error reductions ranged from +28% (relative to naïve models) to -29% (relative to average judgmental forecasts), the comparisons
were insufficient to provide guidance on the conditions that favour prediction markets (Graefe 2011). 
 
Expert surveys 
 
Experts often have information about how others will behave. Thus, it is sometimes possible to learn a 
lot  by  asking  experts  to  make  forecasts.  To  do  so,  use  formal  questionnaires  to  ensure  that  each 
question is asked the same way for all experts and to avoid the biases associated with interviews. 
Here,  the  Delphi  technique  provides  a  useful  way  to  obtain  expert  forecasts  from  diverse 
experts while avoiding the disadvantages of traditional group meetings. It is likely to be most effective 
in situations where relevant knowledge is distributed among experts. For example, decisions regarding 
where to locate a retail outlet would benefit from forecasts obtained from experts on real estate, traffic, 
retailing, and consumers.  
To forecast with Delphi, select between five and twenty experts diverse in their knowledge of 
the situation. Ask the experts to provide forecasts and reasons for their forecasts then provide them with 
anonymous summary statistics on the panel's forecasts and their reasons. Repeat the process until there is little change in forecasts between rounds; two or three rounds are usually sufficient. The median or mode of the experts' final-round forecasts is the Delphi forecast. Software to help administer the
procedure is available at forecastingprinciples.com. 
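To make the aggregation step concrete, the following sketch (in Python, with hypothetical forecasts of annual sales in thousands of units) shows how round-by-round feedback and the final Delphi forecast might be computed; the function names and numbers are illustrative only.

from statistics import median

def delphi_round_summary(forecasts):
    # Anonymous summary statistics fed back to the panel after each round.
    return {"median": median(forecasts), "low": min(forecasts),
            "high": max(forecasts), "n": len(forecasts)}

round_1 = [120, 95, 150, 110, 130, 105, 140]   # hypothetical first-round forecasts
print(delphi_round_summary(round_1))           # fed back with anonymized reasons

round_2 = [118, 110, 135, 115, 125, 112, 130]  # revised forecasts after feedback
delphi_forecast = median(round_2)              # the panel's final-round median
print(delphi_forecast)                         # 118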
Delphi provided forecasts that were more accurate than those from traditional meetings in five 
studies, less accurate in one, and equivocal in two (Rowe and Wright 2001). Delphi was more accurate 
than expert surveys for 12 of 16 studies, with two ties and two cases in which Delphi was less accurate. 
Among these 24 comparisons, Delphi improved accuracy in 71% and harmed it in 12%.  
Delphi is attractive to managers because it is easy to understand, it supports the forecasts with 
reasons, and provides information on forecast uncertainty. It is relatively cheap because there is no need 
for the experts to meet. Delphi's advantages over prediction markets include (1) broader applicability,
(2)  ability  to  address  complex  questions,  (3)  ability  to  maintain  confidentiality,  (4)  avoidance  of 
manipulation, (5) revelation of new knowledge, and (6) avoidance of cascades (Green, Armstrong, and 
Graefe 2007). Points 5 and 6 refer to the fact that whereas the Delphi process requires participants to 
share their knowledge and reasoning and to respond to that of others, prediction markets do not. As a 
consequence of the absence of non-price information, prediction market participants might think that 
price changes are due to new information when they are not, and this erroneous belief might lead to 
extreme price movements or cascades. 
 
Structured analogies 
 
Forecasters often refer to analogies when forecasting, but tend to use them to justify rather than derive 
their forecasts. In contrast, the structured-analogies method uses a formal, unbiased process to gather 
information about analogous situations prior to making forecasts.  
 
To use the method, prepare a description of the situation for which forecasts are required (the 
target situation) and select experts who are likely to be familiar with analogous situations, preferably 
from  direct  experience.  Instruct  the  experts  to  identify  and  describe  analogous  situations,  rate  their 
similarity to the target situation, and to match the outcomes of their analogies with potential outcomes 
of the target situation. Take the outcome of each expert's top-rated analogy, and use the mode of these
as the structured analogies forecast. 
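As an illustration of the final aggregation step, the sketch below (in Python) takes each expert's top-rated analogy and uses the modal outcome as the forecast; the experts, ratings, and outcome labels are hypothetical.

from statistics import mode

# Each expert lists analogies as (description, similarity rating 0-10, matched outcome).
experts_analogies = {
    "expert_A": [("1998 strike settled quickly", 9, "accept offer"),
                 ("prolonged 2003 dispute", 6, "reject offer")],
    "expert_B": [("similar negotiation in 2008", 8, "accept offer")],
    "expert_C": [("hostile takeover case", 7, "reject offer"),
                 ("merger talks", 5, "accept offer")],
}

def top_rated_outcome(analogies):
    # Outcome of the analogy the expert rated most similar to the target.
    return max(analogies, key=lambda a: a[1])[2]

top_outcomes = [top_rated_outcome(a) for a in experts_analogies.values()]
print(mode(top_outcomes))   # the modal outcome ("accept offer") is the forecast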
The  research  to  date  on  structured  analogies,  though  limited,  has  been  promising.  Structured 
analogies  were  41%  more  accurate  than  unaided  judgment  in  forecasting  decisions  in  eight  real 
conflicts,  which  included  union-management  disputes,  a  hostile  takeover  initiative,  and  a  supply 
channel negotiation (Green and Armstrong 2007). The method is easily implemented and understood, 
and can be used for a diverse set of forecasting problems.   
 
Game theory 
 
The  authors  of  textbooks  and  research  papers  recommend  game  theory  to  make  forecasts  about 
conflicts.  Game  theory  involves  identifying  the  incentives  that  motivate  parties  and  deducing  the 
decisions they will make. In the only test of this assumption to date, game theory experts were asked to 
make  predictions  of  decisions  made  in  eight  real  conflict  situations  involving  interaction  among  the 
parties. The game theory experts' forecasts were no more accurate than university students' forecasts
(Green 2002 and 2005). 
 
Judgmental decomposition
 
Judgemental  decomposition  involves  dividing  a  forecasting  problem  into  multiplicative  parts.  For 
example,  to  forecast  sales  for  a  brand,  one  might  separately  forecast  total  market  sales  and  market 
share, and then multiply these components. Doing so makes sense when it is easier to derive forecasts 
for the parts than for the whole problem. Different methods can be used for forecasting each part.  
Forecasts from decomposition are generally more accurate than those obtained using a global 
approach.  In  particular,  decomposition  is  more  accurate  when  there  is  much  uncertainty  about  the 
aggregate  forecast  and  when  large  numbers  (over  one  million)  are  involved.  Over  three  studies 
involving  15  tests,  judgmental  decomposition  led  to  a  42%  reduction  in  error  when  there  was  high 
uncertainty about the situation (MacGregor 2001).  
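A minimal sketch of the arithmetic, with hypothetical numbers: the market-size and market-share forecasts might come from different methods, and their product is the brand sales forecast.

# Hypothetical component forecasts obtained by different methods.
market_size_units = 2_400_000     # e.g., from extrapolation of industry sales
brand_market_share = 0.085        # e.g., from a Delphi panel of experts

brand_sales_forecast = market_size_units * brand_market_share
print(round(brand_sales_forecast))   # 204000 units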
 
Judgmental bootstrapping 
 
Judgmental bootstrapping is used to estimate a forecasting model from experts' subjective judgments.
Ask experts what information they use to make predictions about a class of situations. Then ask them to 
make predictions for a set of real or hypothetical cases. The latter is preferable, as one can then ensure 
that  the  values  of  the  independent  variables  vary  widely  and  independently  of  one  another.  For 
example,  experts,  working  independently,  might  forecast  first  year  sales  for  new  stores  using 
information about proximity of competing stores, size of the local population, and traffic flows. These 
variables are used in a regression model where the dependent variable is the expert's forecast. When the first study was done in 1917, the forecasts from the model of the expert were more accurate than the expert's original forecasts. How could this be? The answer is that the models applied the experts' rules consistently, whereas the experts did not.
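The sketch below (in Python, using hypothetical cue values and expert forecasts for new store sites) shows the core of the procedure: fit a simple linear model to the expert's own forecasts, then apply that model consistently to new cases.

import numpy as np

# Cues for hypothetical sites: [competing stores nearby, population (000s), traffic (000s/day)]
cues = np.array([[0, 50, 12], [2, 80, 20], [1, 30, 8],
                 [3, 120, 35], [0, 20, 5], [2, 60, 15]])
expert_forecasts = np.array([900, 1100, 500, 1500, 400, 800])  # expert's first-year sales forecasts

X = np.column_stack([np.ones(len(cues)), cues])          # add an intercept
coefs, *_ = np.linalg.lstsq(X, expert_forecasts, rcond=None)

def bootstrap_forecast(competitors, population, traffic):
    # Apply the expert's inferred rules consistently to a new site.
    return float(coefs @ np.array([1.0, competitors, population, traffic]))

print(round(bootstrap_forecast(1, 70, 18)))              # forecast for a new site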
 
Judgemental  bootstrapping  models  are  most  useful  for  repetitive,  complex  forecasting 
problems for which data on the dependent variable are not available (e.g. demand for a new product) or 
where the available data on the causal variable do not vary sufficiently to allow for the estimation of a 
regression coefficient. Once developed, judgmental bootstrapping models provide forecasts that are less 
expensive than those provided by experts.  
A meta-analysis found that the judgmental bootstrapping model forecasts were more accurate 
than those from unaided judgment in 8 of the 11 comparisons, with two tests showing no difference 
and one showing a small loss (Armstrong 2001a). The typical error reduction was about 6%. The one 
failure occurred when the experts relied heavily on an erroneous variable. In other words, when judges 
use  a  variable  that  lacks  predictive  validity,  such  as  the  height  of  a  job  candidate  to  predict 
effectiveness, consistency is likely to harm accuracy. 
 
Expert systems 
 
Expert systems are structured implementations of the forecasting rules that experts use. Experts' rules
can  be  discovered  by  recording  the  experts  as  they  describe  what  they  are  doing  while  they  make 
forecasts. Empirical estimates of relationships from structured analyses such as econometric studies and 
experiments should also be used when available. Expert opinions, conjoint analysis, and bootstrapping 
can also provide useful information that can be used to determine rules. An expert system should be 
simple, clear, and complete.  
Expert systems' forecasts were found to be more accurate than those from unaided judgement in
a  review  (Collopy,  Adya  and  Armstrong  2001).  The  gains  in  accuracy  were  small,  however.  Thus, 
given  the  high  cost  of  developing  and  revising  expert  systems,  we  expect  that  they  will  seldom  be 
justified. 
 
Simulated interaction 
 
Simulated interaction is a form of role-playing that can be used to forecast decisions by people who are 
interacting. It is especially useful when the situation involves conflict. For example, a manager might 
want to know how best to secure an exclusive distribution arrangement with a major supplier, or how a 
competitor would respond to a 25% price reduction.  
 To  use  simulated  interaction,  prepare  a  description  of  the  situation,  describe  the  main 
protagonists' roles, and provide a list of possible decisions. If necessary, secrecy can be maintained by
disguising the situation. Ask role players to each adopt a role and then read about the situation. When 
they  are  familiar  with  the  situation,  ask  them  to  engage  in  realistic  interactions  with  the  other  role 
players, staying in their roles until they reach a decision. Simulations typically last between 30 and 60 
minutes. 
 Relative  to  the  usual  forecasting  method  (unaided  expert  judgment),  simulated  interaction 
reduced forecast errors by 57% for eight conflict situations (Green 2005).  
 Simulated  interaction  is  most  useful  when  little  or  no  quantitative  data  are  available,  the 
situation for which forecasts are required is unique or unusual, and decision makers wish to predict the 
effects  of  different  policies  or  strategies.  Simulated  interactions  can  be  conducted  inexpensively  by 
using students to play the roles.  
If the simulated interaction method seems onerous, you might wonder whether following the 
common advice to "put yourself in the other person's shoes" would help a clever person such as
yourself to predict the decisions they will make. It will not (Green and Armstrong 2011). Apparently, it is too difficult to think through the interactions of parties with divergent roles in a complex situation.
Active  role-playing  between  parties  is  needed  to  represent  such  situations  with  sufficient  realism  to 
derive useful forecasts. 
 
Intentions and expectations surveys, and experimentation 
 
Intentions surveys ask people how they intend to behave in specified situations. The data collected can 
be used, for example, to predict how people would respond to major changes in the design or price of a 
good. A meta-analysis covering 47 comparisons with over 10,000 subjects found that there is a strong 
relationship between people's intentions and their behaviour (Kim and Hunter 1993). Sheeran (2002)
reached the same conclusion with his meta-analysis of ten meta-analyses with data from over 83,000 
subjects. 
Surveys can also be used to ask people how they expect they would behave. Expectations differ 
from intentions because people know that unintended things happen. For example, if you were asked 
whether you intended to visit the dentist in the next six months you might say no. However, you realize 
that a problem might arise that would necessitate a visit, so your expectation would be that visiting the 
dentist in the next six months had a probability greater than zero.   
To  forecast  demand  using  a  survey  of  potential  consumers,  prepare  an  accurate  and 
comprehensive  description  of  the  product  and  conditions  of  sale.  Expectations  and  intentions  can  be 
obtained using probability scales such as 0 = "No chance, or almost no chance (1 in 100)" to 10 = "Certain, or practically certain (99 in 100)". Evidence-based procedures for selecting samples, obtaining
high response rates, compensating for non-response bias, and reducing response error are described in 
Dillman,  Smyth,  and  Christian  (2009).  Response  error  is  often  a  large  component  of  error.  This  is 
especially so when the situation is new to the people responding to the survey, as would be the case for 
questions about a new product. Intentions data provide unbiased forecasts of demand, so no adjustment 
is needed (Wright and MacRae 2007).  
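For illustration, the sketch below (in Python) converts hypothetical responses on the 0-10 probability scale described above into an expected market penetration and a demand forecast; the sample, responses, and population size are invented, and in practice the sampling and non-response corrections cited above would also be applied.

responses = [0, 1, 8, 3, 10, 0, 5, 2, 7, 1]        # scale points, one per surveyed consumer
purchase_probs = [r / 10 for r in responses]        # convert scale points to probabilities

expected_penetration = sum(purchase_probs) / len(purchase_probs)   # 0.37
target_population = 150_000                         # consumers the sample represents

demand_forecast = expected_penetration * target_population
print(expected_penetration, round(demand_forecast)) # 0.37 and 55500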
Intentions and expectations surveys are useful when historical demand data are not available, 
such as for new product forecasts or a new market. They are most likely to be useful in cases where 
survey respondents have had relevant experience. Other conditions favouring the use of expectations 
surveys include: (1) the behaviour is important to the respondent, (2) the behaviour is planned, (3) the 
respondent is able to fulfil the plan, and (4) the plan is unlikely to change (Morwitz 2001). 
Experimentation provides the most effective way to forecast the effects of alternative courses of 
action. Experiments can also be used to estimate relationships that can be used in forecasting models.  
Experiments can be conducted in the field or in the laboratory. Laboratory experiments allow 
greater  control,  testing  of  conditions  is  easier,  they  are  typically  cheaper  to  conduct,  and  they  avoid 
revealing sensitive information to competitors prior to implementing a plan. A field experiment might 
involve, for example, charging different prices in different geographical markets to estimate the effects 
on total revenue. A lab experiment might involve testing consumers' relative preferences by presenting
a product in different packaging, and recording their purchases in a mock retail environment.  
Focus  group  surveys  are  popular,  but  you  should  not  use  them  for  forecasting  because  they 
violate important forecasting principles. First, focus groups are seldom representative of the population 
of interest. Second, they are small samples. Third, in practice, questions for the participants are often 
not well structured or well tested. Fourth, it is difficult to avoid subjectivity and bias in summarising the 
responses  of  focus  group  participants.  Fifth,  and  most  important,  the  responses  of  participants  are 
influenced by the presence and expressed opinions of others in the group. We have been unable to find 
evidence to show that focus groups provide useful forecasts.  
 
 
Methods requiring quantitative data 
 
Extrapolation 
 
Extrapolation methods require historical data only for the variable to be forecast. They are appropriate 
when little is known about the factors affecting a variable to be forecast (Armstrong 2001b). Statistical 
extrapolations  are  cost  effective  when  many  forecasts  are  needed.  For  example,  some  firms  need 
frequent forecasts of demand for each of hundreds of inventory items.  
Perhaps the most widely used extrapolation method, with the possible exception of using last 
year's value, is exponential smoothing. It is sensible in that it weights recent data more heavily and, as a
type  of  moving  average,  it  smoothes  out  fluctuations.  Exponential  smoothing  is  understandable, 
inexpensive,  and  relatively  accurate.  Gardner  (2006)  reviewed  the  state  of  the  art  on  exponential 
smoothing. 
Given  that  extrapolation  does  not  use  information  about  causal  factors,  there  may  be  much 
forecast uncertainty about the accuracy of the forecasts, especially long-term forecasts. The proper way 
to deal with uncertainty is to be conservative. For time series, conservatism requires that estimates of 
trend be damped (i.e., reduce the forecast trend). The greater the uncertainty, the greater the damping
required. Procedures are available to damp the trend and some software packages allow for damping. A 
review of ten comparisons found that, on average, damping reduced the error by almost 5% when used 
with exponential smoothing (Armstrong 2006). In addition, it reduces risk and will moderate the effects 
of recessions. You should avoid software that does not provide proper procedures for damping. 
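The following sketch (in Python) shows one common form of damped-trend exponential smoothing; the smoothing and damping parameters and the series are hypothetical, and in practice the parameters would be chosen by validation on held-out data.

def damped_trend_forecast(series, alpha=0.3, beta=0.1, phi=0.9, horizon=4):
    # Holt-style level and trend updates with damping parameter phi.
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    # The h-step-ahead forecast damps the trend: level + (phi + ... + phi^h) * trend.
    return [level + sum(phi ** i for i in range(1, h + 1)) * trend
            for h in range(1, horizon + 1)]

sales = [112, 118, 124, 131, 129, 137, 142, 148]    # hypothetical quarterly sales
print([round(f, 1) for f in damped_trend_forecast(sales)])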
When  extrapolating  data  of  greater  than  annual  frequency,  remove  the  effects  of  seasonal 
influences  first.  Seasonality  adjustments  lead  to  substantial  gains  in  accuracy,  as  was  shown  in  a 
large-scale study of time-series forecasting: In forecasts over an 18-month horizon for 68 monthly 
economic series, they reduced forecast errors by 23 percent (Makridakis et al. 1984, Table 14). 
Because  estimates  of  seasonal  factors  are  subject  to  uncertainty,  they  should be damped. 
Miller and Williams (2003, 2004) developed procedures for damping seasonal factors. Their software 
for  calculating  damped  seasonal  adjustment  factors  is  available  at  forecastingprinciples.com.  When 
they  applied  the  procedures  to  the  1,428  monthly  time  series  from  the  M3-Competition,  forecast 
accuracy  improved  for  68%  of  the  series.  In  another  study,  damped  seasonal  estimates  were 
obtained  by  averaging  estimates  for  a  given  series  with  seasonal  factors  estimated  for  a  set  of 
time-series for related products; this damping reduced forecast error by about 20% (Bunn and
Vassilopoulos  1999).  In  another  example,  pooling  monthly  seasonal  factors  for  crime  rates  for 
six  precincts  of  a  city  increased  forecast  accuracy  by  7%  compared  to  when  seasonal  factors 
were estimated individually for each precinct (Gorr, Oligschlager, and Thompson 2003). 
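A minimal sketch of the idea, with hypothetical quarterly factors: the factors estimated from the target series are averaged with factors from a pool of related series and then shrunk toward 1.0 (no seasonality). The weights are illustrative, not the Miller and Williams procedure itself.

own_factors    = [0.70, 0.85, 1.10, 1.35]   # factors estimated from the target series
pooled_factors = [0.80, 0.90, 1.05, 1.25]   # factors estimated from related products

def damp_seasonals(own, pooled=None, weight_pooled=0.5, shrink_to_one=0.2):
    damped = []
    for i, f in enumerate(own):
        if pooled is not None:
            f = (1 - weight_pooled) * f + weight_pooled * pooled[i]  # average with the pool
        f = (1 - shrink_to_one) * f + shrink_to_one * 1.0            # shrink toward no seasonality
        damped.append(round(f, 3))
    return damped

print(damp_seasonals(own_factors, pooled_factors))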
 
Quantitative analogies 
 
When few data are available on the thing being forecast (the target) quantitative data from analogous 
situations  can  be  used  to  extrapolate  what  will  happen.  For  example,  in  order  to  assess  the  annual 
percentage  loss  in  sales  when  the  patent  protection  for  a  drug  is  removed,  one  might  examine  the 
historical pattern of sales when patents were removed for similar drugs in similar markets.  
To forecast using quantitative analogies, ask experts to identify situations that are analogous to 
the target situation and for which data are available. Analogous data may be as directly relevant as, for 
example, previous per capita ticket sales for a play that is touring from city to city.  
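For example, the sketch below (in Python, with invented figures) applies the average post-patent-expiry sales decline observed for analogous drugs to the target drug.

analog_declines = [0.45, 0.60, 0.38, 0.52]          # first-year sales loss for similar drugs
typical_decline = sum(analog_declines) / len(analog_declines)

current_annual_sales = 80_000_000                    # target drug's sales before expiry
forecast_after_expiry = current_annual_sales * (1 - typical_decline)
print(typical_decline, round(forecast_after_expiry)) # 0.4875 and 41000000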
 
Rule-based forecasting 
 
Rule-based forecasting, or RBF, allows an analyst to integrate managers' knowledge about the situation
with  time-series  forecasts  in  a  structured  and  inexpensive  way.  For  example,  managers  might  have 
good reasons to think the causal forces acting on a time-series will lead to decline, as with evidence of a 
global financial crisis. 
To use RBF, identify the features of the series. The 28 series features include such things as the 
length of the forecast horizon, the amount of data available, and the existence of outliers. The features 
can  be  identified  by  inspection,  statistical  analysis,  or  domain  knowledge.  Use  the  99  RBF  rules  to 
adjust the data and to estimate short- and long-range models. RBF forecasts are a blend of the short- 
and long-range model forecasts (Armstrong, Adya and Collopy 2001). 
For one-year ahead ex ante forecasts of 90 annual series, the median absolute percentage 
error (MdAPE) for rule-based forecasts was 13% less than that from equally weighted combined 
forecasts.  For  six-year  ahead  ex  ante  forecasts,  the  rule-based  forecasts  had  a  MdAPE  that  was 
42%  less.  Rule-based  forecasts  were  more  accurate  than  equal-weights  combined  forecasts  in 
situations  involving  significant  trends,  low  uncertainty,  stability,  and  good  domain  expertise.  In 
cases where the conditions were not met, the forecasts were no more accurate (Collopy and Armstrong 
1992). 
If implementing RBF is too big a step for your organization, at least use the contrary series rule. 
The rule states that when the expected direction of a time-series and the historical trend of this series are 
contrary to one another, set the forecasted trend to zero. The rule yielded substantial improvements, 
especially for 6-year ahead forecasts where the error reduction exceeded 40% (Armstrong and Collopy 
1993). 
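The sketch below (in Python, with a hypothetical car-sales series) shows the rule in its simplest form: if managers expect causal forces to push the series against its historical trend, the trend is set to zero and the forecast is simply the latest level.

def contrary_series_forecast(series, expected_direction, horizon=1):
    trend = (series[-1] - series[0]) / (len(series) - 1)   # simple per-period historical trend
    historical_direction = "up" if trend > 0 else "down"
    if expected_direction != historical_direction:
        trend = 0.0                                        # contrary series: ignore the trend
    return series[-1] + trend * horizon

car_sales = [410, 425, 440, 452, 468]                      # rising historical sales
# Managers judge that an emerging recession will push sales down:
print(contrary_series_forecast(car_sales, expected_direction="down"))   # 468 (flat forecast)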
 
Neural nets 
 
Neural nets are designed to pick up nonlinear patterns in long time-series. Studies on neural nets have 
been  popular  with  researchers  with  more  than  300  research  papers  published  during  the  period  from 
1994  to  1998.  Early  reviews  on  the  accuracy  of  forecasts  from  neural  nets  were  not  favorable. 
However,  Adya  and  Collopy  (1998)  found  eleven  studies  that  met  the  criteria  for  a  comparative 
evaluation,  and  in  eight  of  these,  neural  net  forecasts  were  more  accurate  than  alternative  methods. 
Tests of ex ante accuracy in forecasting 111 time series found that neural network forecasts were about
as accurate as forecasts from established extrapolation methods (Crone et al. 2011). Perhaps the fairest 
comparison  has  been  the  M3-Competition  with  its  3,003  varied  time  series.  In  that  study,  neural  net 
forecasts were 3.4% less accurate than damped trend forecasts and 4.2% less accurate than combined 
forecasts (Makridakis and Hibon 2000). 
Our advice is to avoid neural networks: The findings are inconclusive, the method ignores prior 
knowledge, and the results are difficult to understand. Furthermore, given the large number of studies 
on  neural  nets,  the  published  research  might  not  reflect  the  value  of  the  method  as  studies  with 
unfavorable  results  might  have  been  rejected  by  journals  due  to  the  well-established  bias  against 
insignificant results.  
 
Causal models
 
Causal models include models derived using regression analysis, the index method, and segmentation. 
These methods are useful if knowledge and data are available for variables that might affect the situation of interest. For situations in which large changes are expected, forecasts from causal models
are more accurate than forecasts derived from extrapolating the dependent variable (Armstrong 1985, 
p.  408-9;  Allen  and  Fildes  2001).  Theory,  prior  research,  and  expert  domain  knowledge  provide 
information  about  relationships  between  explanatory  variables  and  the  variable  to  be  forecast.  The 
models can be used to forecast the effects of different policies.  
Causal models are most useful when (1) strong causal relationships exist, (2) the directions of 
the  relationships  are  known,  (3)  large  changes  in  the  causal  variables  are  expected  over  the  forecast 
horizon, and (4) the causal variables can be accurately forecast or controlled, especially with respect to 
their direction. 
Regression analysis is used to estimate the relationship between a dependent variable and one 
or  more  causal  variables.  It  is  typically  used  to  estimate  relationships  from  historical  (non-
experimental)  data.  It  is  likely  to  be  useful  in  situations  in  which  three  or  fewer  causal  variables  are 
important, effect sizes are important, and effect sizes can be estimated from many reliable observations 
that include data in which the causal variables varied independently of one another (Armstrong 2012).
Important  principles  for  developing  regression  models  are  to  (1)  use  prior  knowledge  and 
theory,  not  statistical  fit,  for  selecting  variables  and  for  specifying  the  directions  of  their  effects,  (2) 
discard  variables  if  the  estimated  relationship  conflicts  with  prior  evidence  on  the  nature  of  the 
relationship, and (3) keep the model simple in terms of the number of equations, number of variables, 
and the functional form (Armstrong 2012). Choose between models on the basis of out-of-sample accuracy, not on the basis of R² (Armstrong 2001c).
Because  regression  models  tend  to  over-fit  data,  damp  the  estimated  coefficients  of  a  model 
toward no effect. This adjustment for uncertainty tends to improve out-of-sample forecast accuracy, 
particularly  when  one  has  small  samples  and  many  variables.  As  this  situation  is  common  for  many 
prediction problems, unit (or equal weight) models, the most extreme case of damping, often yield more accurate forecasts than models with statistically fitted regression coefficients (see Armstrong 2012 for more on this topic).
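A minimal sketch, assuming standardized predictors whose directions are supported by prior knowledge and using hypothetical coefficients: fitted coefficients are shrunk toward no effect, and the equal-weights (unit-weight) model discards the fitted magnitudes altogether.

import numpy as np

fitted_betas = np.array([0.9, 0.4, 0.1])      # coefficients estimated from a small sample

def damp(betas, shrink):
    # shrink = 0 keeps the fitted model; larger values pull coefficients toward zero.
    return (1 - shrink) * betas

print(damp(fitted_betas, 0.3))                # mildly damped coefficients

# The equal-weights alternative keeps only the signs supported by prior knowledge.
equal_weights = np.sign(fitted_betas) / len(fitted_betas)
print(equal_weights)                          # [0.333 0.333 0.333]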
The index method is suitable for situations with little data on the variable to be forecast, where 
many causal variables are important, and where there is good prior knowledge about the effects of the 
variables. Use prior empirical evidence to identify predictor variables and to assess each variable's
directional influence on the outcome. Experimental findings are especially valuable. Better yet, draw on 
findings  from  meta-analyses  of  experimental  studies.  If  prior  studies  are  not  available,  independent 
expert judgments can be used to choose the variables and determine the directions of their effects. If 
prior knowledge on a variable's effect is ambiguous or contradictory, do not include it in the model.
Index scores are the sum of the values across the variables, which might be coded as 0 or 1, or 
using  a  scale  with  more  points,  depending  on  the  nature  of  the  data  and  the  state  of  knowledge.  An 
alternative  with  a  higher  index  score  is  better,  or  more  likely.  Where  sufficient  historical  data  are 
available, it is possible to estimate a forecasting model by regressing index values against the variable 
of interest, such as sales.
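As a simple illustration of the scoring, the sketch below (in Python) codes hypothetical variables 0 or 1 for two candidate retail sites and forecasts that the site with the higher index score will perform better.

candidate_sites = {
    "site_A": {"high_foot_traffic": 1, "ample_parking": 1, "no_nearby_competitor": 1,
               "growing_population": 1, "high_income_area": 0},
    "site_B": {"high_foot_traffic": 0, "ample_parking": 1, "no_nearby_competitor": 0,
               "growing_population": 1, "high_income_area": 1},
}

index_scores = {site: sum(codes.values()) for site, codes in candidate_sites.items()}
best_site = max(index_scores, key=index_scores.get)
print(index_scores, best_site)    # {'site_A': 4, 'site_B': 3} -> site_A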
The  index  method  is  especially  useful  for  selection  problems,  such  as  for  assessing  which 
geographical location offers the highest demand for a product. The method has been tested for making 
early  forecasts  of  the  outcomes  of  U.S.  presidential  elections.  The  test  involved  a  model  that  used 
biographical  information  about  potential  candidates.  Based  on  a  list  of  59  biographical  variables,  the 
candidates' relative index scores correctly predicted the popular vote winner for 27 of the 29 elections
from 1896 to 2008 (Armstrong and Graefe 2011).  
 
Where a single variable is more important than the rest of the variables, an accurate forecast can 
be made from the most important variable. This was used, for example, in the "take-the-best" model
used to predict the outcomes of political elections (Graefe and Armstrong 2012). 
In  general,  avoid  causal  methods  that  lack  theory  or  do  not  use  prior  knowledge.  Data-
mining,  step-wise  regression,  and  neural  networks  are  such  methods.  For  example,  data  mining  uses 
sophisticated  statistical  analyses  to  identify  variables  and  relationships.  Although  it  is  popular,  we 
found  no  evidence  that  data-mining  techniques  provide  useful  forecasts.  An  extensive  review  and 
reanalysis of 50 real-world data sets also found little evidence that data mining is useful (Keogh and 
Kasetty 2002).  
 
Segmentation 
 
Segmentation  involves  breaking  a  problem  down  into  independent  parts  of  the  same  kind,  using 
knowledge and data to make a forecast about each part, and combining the forecasts of the parts. For 
example, a hardware company could forecast industry sales for each type of product and then add the 
forecasts.  
To forecast using segmentation, identify important causal variables that can be used to define 
the segments, and their priorities. Determine cut-points for each variable: the stronger the relationship with the dependent variable, the greater the non-linearity in the relationship, and the more data that are available, the more cut-points should be used. Using the best method given the
information  available,  forecast  the  population  of  each  segment  and  the  behaviour  of  the  population 
within  each  segment.  Combine  population  and  behaviour  forecasts  for  each  segment,  and  sum  the 
segment forecasts. 
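A minimal bottom-up sketch with hypothetical segments: forecast the population of each segment and its per-household purchase rate, multiply, and sum.

segments = {
    # segment: (forecast number of households, forecast units bought per household)
    "urban_young": (350_000, 1.4),
    "urban_older": (420_000, 0.9),
    "rural":       (230_000, 0.6),
}

segment_forecasts = {name: pop * rate for name, (pop, rate) in segments.items()}
total_demand = sum(segment_forecasts.values())
print(segment_forecasts, round(total_demand))   # segment demands and a total of 1,006,000 units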
Segmentation  has  advantages  over  regression  analysis  where  there  is  interaction  between 
variables, the effects of variables on demand are non-linear, and there are clear causal priorities.
Segmentation is especially useful when there is no reason to think that errors in segment forecasts will 
tend to be in the same direction. This is likely to occur where the segments are independent and are of
roughly  equal  importance,  and  when  information  on  each  segment  is  good.  For  example,  one  might 
improve  accuracy  by  forecasting  demand  for  the  products  of  each  division  of  a  company  separately, 
then  adding  the  forecasts.  But  if  there  are  only  small  samples  and  erratic  data  for  the  segments,  the 
segment forecasts might contain large errors (Armstrong 1985, pp. 412-420).  
Segmentation  based  on  a  priori  selection  of  variables  offers  the  possibility  of  improved 
accuracy at a low risk. Experts prefer segmentation's bottom-up approach as it allows them to use their knowledge about the problem effectively (Jørgensen 2004). Bottom-up forecasting produced forecasts
that  were  more  accurate  than  those  from  top-down  forecasting  for  74%  of  192  monthly  time-series 
(Dangerfield and Morris 1992). In a study involving seven teams making estimates of the time required 
to complete two software projects, the typical error from the bottom-up forecast was half of that for the 
top-down approach (Jørgensen 2004).
 
Selecting methods 
 
Selecting the best forecasting method for a given situation is not a simple task. Often more than one 
will  provide  useful  forecasts.  In  order  to  help  forecasters  choose  methods  and  procedures  that  are 
appropriate for their problems, we used empirical findings and expert opinions to develop the decision 
tree shown in Figure 4.2.  
 
Figure 4.2: Forecasting Method Selection Tree 
 
 
The  first  question  a  forecaster  confronts  is  whether  the  data  are  sufficient  to  develop  a  quantitative 
model.  If  not,  you  will  need  to  use  judgmental  procedures.  The  two  are  not  mutually  exclusive:  In 
many situations, both quantitative and judgmental methods are possible and useful. 
For  situations  involving  small  changes,  where  no  policy  analysis  is  needed,  and  where 
forecasters get good feedback (such as with the number of diners that will come to a restaurant at a given time), unaided judgement can work well. If, however, the feedback is poor or uncertainty is high,
it  will  help  to  use  experts  in  a  structured  manner  such  as  with  a  questionnaire  or,  if  the  relevant 
information  is  distributed  among  experts,  with  a  Delphi  panel.  Where  policy  analysis  is  needed, 
judgemental bootstrapping or decomposition will help to use experts' knowledge effectively.
For  situations  involving  large  changes,  but  which  do  not  involve  conflicts  among  a  few 
decision  makers,  ask  whether  policy  analysis  is  required.  If  policy  analysis  is  required,  as  with 
situations involving small changes, use judgmental bootstrapping  or decomposition to  elicit forecasts 
from  experts.  Use  conjoint  analysis  to  elicit  forecasts  from  potential  customers  of  how  they  will 
respond  to  different  offers.  Experimentation  is  likely  to  provide  the  most  useful  forecasts  of  how 
customers would respond to changes. 
If policy analysis is not required, intentions or expectations surveys of, for example, potential 
customers may be useful. Consider also expert surveys, perhaps using the Delphi technique. 
To  make  forecasts  about  situations  that  involve  conflict  among  a  few  decision  makers,  ask 
whether similar cases exist. If they do, use structured analogies. If similar cases are hard to identify or 
the  value  of  an  accurate  forecast  is  high,  such  as  where  a  competitor  reaction  might  have  major 
consequences, use simulated interaction. 
Turning now to situations where there are sufficient quantitative data to consider the estimation 
of  quantitative  models,  ask  whether  there  is  good  knowledge  about  the  relationships  between  causes 
and  effects.  If  knowledge  about  such  relationships  is  poor,  speculative,  or  contentious,  then  consider 
next the kind of data that are available.  
If  the  data  are  cross-sectional  (e.g.  for  stores  in  different  locations  or  product  launches  in 
different  countries)  use  the  method  of  quantitative  analogies.  For  example,  the  introduction  of  new 
products in U.S. markets can provide analogies for the outcomes of the subsequent release of similar 
products in other countries.  
If time-series data are available and domain knowledge is not good, use extrapolation methods 
to  forecast.  Where  good  domain  knowledge  exists  (such  as  when  a  manager  knows  that  sales  will 
increase due to the advertising of a price reduction), consider using rule-based forecasting. Much of the 
benefit of rule-based forecasting can be obtained by using the contrary series rule. The rule is easy to 
implement: ignore the historical trend when managers expect causal forces to act against the trend. For 
example, where sales of new cars have been increasing over recent times, forecast flat sales when signs 
of economic recession are emerging. 
For situations where knowledge of relationships is good and large changes are unlikely, as is 
common  in  the  short-term,  use  extrapolation.  If  large  changes  are  likely,  causal  methods  provide 
forecasts  that  are  more  accurate.  Models  estimated  using  regression  analysis,  or  econometrics,  may 
provide useful forecasts when there are few variables, much good quantitative data, linear relationships, 
low correlations among the causal variables, and an absence of interactions.  
If  the  relationships  are  complicated,  consider  segmentation.  Forecast  the  segments 
independently using appropriate methods.  
Often  the  conditions  are  not  favourable  for  regression  analysis.  In  such  situations,  consider 
using the index method. 
 
 
Combining and adjusting forecasts 
 
Combining forecasts is one of the most powerful procedures in forecasting and it is applicable to a wide 
variety  of  problems.  It  is  most  useful  in  situations  where  the  forecasts  from  different  methods  might 
bracket the true value; that is, the true value would fall between the forecasts. 
In order to increase the likelihood that two forecasts bracket the true value, use methods and 
data that differ substantially. The extent and probability of error reduction through combining is higher 
when  differences  among  the  methods  and  data  that  produced  the  component  forecasts  are  greater 
(Batchelor and Dua 1995). For example, with real GNP forecasts, combining the 5% of forecasts that 
were  most  similar  in  their  methods  reduced  the  error  compared  to  the  typical  forecast  by  11%.  By 
comparison, combining the 5% of forecasts that were most diverse in their methods yielded an error 
reduction of 23%.   
Use  trimmed  averages  or  medians  for  combining  forecasts.  Only  use  differential  weights  if 
there is strong empirical evidence about the relative accuracy of forecasts from the different methods.  
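The sketch below (in Python, with hypothetical component forecasts from different methods) shows a trimmed average and a median combination.

from statistics import median

forecasts = [520, 480, 610, 495, 900, 505]   # e.g., from Delphi, extrapolation, econometric models, ...

def trimmed_mean(values, trim=1):
    # Drop the `trim` lowest and highest forecasts, then average the rest.
    kept = sorted(values)[trim:len(values) - trim]
    return sum(kept) / len(kept)

print(trimmed_mean(forecasts))   # 532.5; the outlying 900 no longer dominates
print(median(forecasts))         # 512.5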
Under  conditions  favorable  for  combining  (i.e.,  when  forecasts  are  made  for  an  uncertain 
situation, and many forecasts are available from several reasonable methods and from using different 
data sources) combining can cut errors by half (Graefe et al. 2010). Combining forecasts is especially 
useful if the forecaster wants to avoid large errors and if there is uncertainty about which method will 
be most accurate.  
Integrate  judgmental  and  statistical  methods.  Integration  is  effective  when  judgments  are 
collected in a systematic manner and then used as inputs to the quantitative models, rather than simply 
used as adjustments to the outputs (Armstrong and Collopy 1998).  
When  making  judgmental  adjustments  of  statistical  forecasts:  (1)  Adjust  only  for  important 
information  about  future  events;  (2)  Record  reasons  for  adjustments;  (3)  Decompose  the  adjustment 
task  if  it  is  feasible  to  do  so;  (4)  Mechanically  combine  judgmental  and  statistical  forecasts;  and  (5) 
Consider  using  a  Delphi  panel  for  determining  adjustments  (Goodwin  2005).  Future  events  might 
include new government regulations coming into force, a planned promotion, the loss of an important 
client, or a competitor's actions. Mechanical combination can be as simple as averaging. Consider
estimating a regression model to correct the judgmental forecasts for biases (Goodwin et al. 2011). 
When statistical forecasts are derived using causal methods, judgmental adjustments can help 
accuracy if important variables are missing from the causal model, data are poor, relationships are mis-
specified, relationships are believed to have changed, or the environment has changed (Goodwin et al. 
2011). 
When  judgmental  forecasts  are  made  repeatedly,  regress  errors  against  variables  forecasters 
should  have  used,  then  combine  statistical  forecasts  of  error  from  the  resulting  model  with  new 
judgmental forecasts to improve accuracy (Fildes et al. 2009).  
 
On the need for forecasts  
 
Managers may need forecasts of the actions and reactions of key decision makers such as competitors, 
suppliers,  distributors,  collaborators,  or  government  officials.  Forecasts  of  these  actions  can  help  to 
forecast market share. The resulting forecasts allow one to calculate a demand forecast as illustrated in 
Figure 4.3. 
 
 
 
Figure 4.3: Needs for marketing forecasts 
 
 
We now examine the need for forecasts of the elements of demand shown in Figure 4.3. 
 
Forecasting market size 
 
Market size is influenced by environmental factors. For example, the demand for alcoholic beverages 
will  be  influenced  by  such  things  as  local  climate,  size  and  age  distribution  of  the  population, 
disposable  income,  laws,  and  culture.  To  forecast  market  size,  one  can  use  Delphi,  intentions  or 
expectations, extrapolation, causal methods, and segmentation. 
Market forecasts for relatively new or rapidly changing markets in particular are often based on 
judgement. Given the risk of bias from unaided judgement, we recommend using structured methods. 
For example, the Delphi technique could be used to answer questions about market size such as: "By what percentage will the wine market grow over the next 10 years?" or "What proportion of households will watch movies via the Internet five years from now?"
When sufficient data are available, such as when the market is well established or when data on 
analogous markets or products are available, use time-series extrapolation methods or causal methods. 
Simple  time-series  extrapolation  is  inexpensive.  Rule-based  forecasting  is  more  expensive,  but  more 
likely to avoid large errors. Use causal methods, such as econometrics and segmentation, when large 
changes are expected in the causal variables, the direction of the change can be predicted accurately, 
and good knowledge exists about the effects of such changes.  
     
Forecasting decision makers' actions
 
The development of a successful business strategy sometimes depends upon having good forecasts of 
the actions and reactions of competitors whose actions might have an influence on market shares. For 
example, if you lower your price, what will your competitors do? A variety of judgmental methods can 
be used to forecast competitors' actions. These include:
• expert opinion (ask experts who know about relevant markets);
• intentions (ask competitors how they would respond in a given situation);
• structured analogies (analyse similar situations and the decisions that were made);
• simulated interaction (act out the interactions among decision makers for the firm and its competitors); and
• experimentation (try the strategy on a small scale and monitor the results).
 
Sometimes  it  is  useful  to  forecast  the  actions  of  interest  groups.  For  example,  how  would 
organizations that lobby for environmental causes react to the introduction of packaging changes by a
large fast-food restaurant chain? Use structured analogies and simulated interaction for such problems. 
Company  plans  typically  require  the  cooperation  of  many  people.  Managers  may  decide  to 
implement  a  given  strategy,  but  will  the  organization  be  able  to  carry  out  the  plan?  Sometimes  an 
organization fails to implement a plan because of a lack of resources, misunderstanding, or opposition 
by employees or unions. The need to forecast behaviour in one's own organization is sometimes
overlooked. Better forecasting might lead to plans that are easier to implement. Intentions surveys of 
key  decision  makers  in  an  organization  may  help  to  assess  whether  a  given  strategy  can  be 
implemented successfully. Simulated interactions can also provide useful forecasts in such situations. 
It is also important to predict the effects of the various actions. One can make such forecasts by 
using expert judgment, judgmental bootstrapping, or econometric methods. 
 
Forecasting market share 
 
If one expects the same causal forces and the same types of actions to persist in the future, a simple 
extrapolation of market share, such as from a naïve (e.g., constant market share) model, is usually
sufficient. 
Draw  upon  methods  that  incorporate  causal  reasoning  when  large  changes  are  expected.  If 
small changes in the factors that affect market share are anticipated, use judgmental methods such as 
expert  surveys  or  Delphi.  If  the  changes  in  the  factors  are  expected  to  be  large,  the  causes  are  well 
understood, and data are scarce, use judgmental bootstrapping. 
Use econometric methods when (1) the effects of current marketing activity are strong relative 
to  the  residual  effects  of  previous  activity;  (2)  there  are  enough  data  and  sufficient  variability  in  the 
data; (3) models can allow for different responses by different brands; (4) models can be estimated at 
brand level; and (5) competitors' actions can be forecast (Brodie et al. 2001).
Knowledge about relationships can sometimes be obtained from prior research. For
example,  a  meta-analysis  of  price  elasticities  of  demand  for  367  branded  products,  estimated  using 
econometric models, reported a mean value of -2.5  (Tellis 1988). Estimates can also be made about 
other measures of market activity, such as advertising elasticity.  
 
Forecasting for new products 
 
New  product  forecasting  is  important  given  that  large  investments  are  commonly  involved  and 
uncertainty is high.  
The choice of a forecasting method depends on what stage the product has reached in its life 
cycle.  As  a  product  moves  from  the  concept  phase  to  prototype,  test  market,  introduction,  growth, 
maturation, and declining stages, the relative value of the alternative forecasting methods changes. In 
general, the movement is from purely judgmental approaches to quantitative models. 
Surveys of consumers' intentions and expectations are often used for new product forecasts.
Intentions to purchase new products are complicated because potential customers may not be sufficiently familiar with the proposed product and because the various features of the product affect
one  another  (e.g.,  price,  quality,  and  distribution  channel).  This  suggests  the  need  to  prepare  a  good 
description  of  the  circumstances  surrounding  the  release  of  the  proposed  product,  but  a  relatively 
simple description of the key features of the product may be sufficient (Armstrong and Overton 1971). 
A  product  description  may  involve  prototypes,  visual  aids,  product  clinics,  or  brochures.  Consumer 
surveys can also improve forecasts even when you already have some sales data (Armstrong, Morwitz 
and Kumar 2000). 
Conjoint  analysis  is  often  used  to  examine  how  demand  varies  as  important  features  of  a 
product are varied (Wittink and Bergestuen 2001). However, using conjoint analysis to forecast new-
product demand can be expensive because it requires large samples of potential customers who may be 
difficult to identify. Potential customers are asked to make selections from a set of offers such as 20 
pairs  of  products.  For  example,  various  features  of  a  personal  digital  assistant  such  as  price,  weight, 
battery life, and screen clarity could be varied substantially while ensuring that the variations in features 
do not correlate with one another. The potential customer chooses from among various offerings. The 
resulting data can be analysed by regressing respondents' choices against the product features.
The accuracy of forecasts from conjoint analysis is likely to increase with increasing realism of 
the  choices  presented  to  respondents.  The  method  is  based  on  sound  principles,  such  as  using 
experimental  design  and  soliciting  independent  intentions  from  a  representative  sample  of  potential 
customers. Unfortunately, however, there are few experimental comparisons of conjoint-analysis
forecasts with forecasts from other reasonable methods (Wittink and Bergestuen 2001). 
Expert  opinions  are  required  in  the  concept  phase.  For  example,  it  is  common  to  obtain 
forecasts from the sales force. When doing so, it is important to properly pose the questions, adjust for 
biases in experts' forecasts, and aggregate their responses. The Delphi method provides an effective
way to conduct such surveys. 
Expert forecasts can often be improved  if the problem is decomposed in such a way that the 
parts to be forecast are better known than the whole. Thus, to forecast the sales of very expensive cars, 
rather than making a direct forecast, one could break the problem into parts such as "How many households will there be in the U.S. in the forecast year?" "Of these households, what percentage will make more than $500,000 per year?" and so on. The forecasts are obtained by multiplying the
components.  
Experts are often biased when making forecasts. Those advocating a new product are likely to 
be  optimistic.  Sales  people  may  try  to  forecast  on  the  low  side  if  their  forecasts  will  be  used  to  set 
quotas. Marketing executives may forecast high, believing that this will gain approval for a project or 
motivate the sales force. Avoid experts who would have obvious reasons to be biased. Another strategy 
is to use a heterogeneous group of experts in the hope that their differing biases will be offsetting. 
Experts can make predictions about a set of situations (20 or so) involving alternative product 
designs and  alternative marketing  plans.  These  predictions  would then be  related to the situations by 
regression analysis. Expert judgments have advantages over conjoint analysis in that few experts (between five and twenty) are needed. In addition, expert judgments can incorporate policy variables,
such as advertising, that are difficult for consumers to assess. 
Information  about  analogous  products  can  be  used  to  forecast  demand  for  new  products. 
Collect  historical  data  on  the  analogous  products  and  examine  their  growth  patterns.  Use  the  typical 
pattern for the introductory phases of the products as a forecast for the new product.  
Once  a  new  product  is  on  the  market,  it  is  possible  to  use  extrapolation  methods.  Much 
attention has been given to selecting the proper functional form. The diffusion literature recommends 
an S-shaped curve to predict new product sales. That is, growth builds up slowly at first and then becomes rapid (if word-of-mouth is good, and if people see the product being used by others). Then it
slows  as  it  approaches  a  saturation  level.  Evidence  on  what  is  the  best  way  to  model  the  process  is 
limited and the benefits of choosing the best functional form are modest (Meade and Islam 2001). Our 
advice is to use simple and understandable growth curves. 
 
Forecast errors and uncertainty   
 
In addition to improving accuracy, the discipline of forecasting is concerned with measuring error and 
assessing uncertainty. Good assessments of  forecast uncertainty or  risk can help in planning, such as 
with the need for contingency plans.  
Present uncertainty estimates as prediction intervals, such as "there is an 80% chance that demand for new passenger vehicles in Australia in 2020 will be between 400,000 and 700,000." To
assess the uncertainty of forecasts, one might look at (1) how well the forecasting model fits historical 
data, (2) experts' assessments, (3) the distribution of forecasts from different methods and forecasters,
or (4) the distribution of ex ante forecast errors. Avoid the first.  
The  fit  of  a  model  to  historical  data  is  a  poor  way  to  estimate  prediction  intervals. 
Traditional  confidence  intervals,  which  are  estimated  from  historical  data  for  quantitative 
forecasts,  tend  to  be  too  narrow.  Empirical  studies  have  shown  that  the  percentage  of  actual 
values that fall outside the 95% prediction intervals is often greater than 50% (Makridakis, et al. 
1987).  This  occurs  because  confidence  interval  estimates  ignore  some  sources  of  uncertainty 
over the forecast horizon.  
In addition, forecast errors in time series are often asymmetric, which makes it difficult to estimate confidence intervals. Asymmetry of errors is likely to occur when the forecasting
model uses an additive trend. The most sensible procedure is to transform the forecast and actual 
values to logs, calculate the prediction intervals using logged differences, and present the results 
in actual values (Armstrong and Collopy 2001). 
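A minimal sketch of that procedure, using a hypothetical sample of past forecasts and actuals:

```python
import numpy as np

# Hypothetical one-step-ahead forecasts and the actuals that followed them.
past_forecasts = np.array([105, 98, 120, 110, 90, 130, 102, 115])
past_actuals   = np.array([100, 95, 140, 100, 97, 118, 110, 108])

# Errors measured as log differences, which treats over- and under-forecasts symmetrically.
log_errors = np.log(past_actuals) - np.log(past_forecasts)

# Apply the 10th and 90th percentiles of the log errors to a new point forecast,
# then convert back to actual units for presentation.
point_forecast = 115
low, high = np.percentile(log_errors, [10, 90])
interval = (point_forecast * np.exp(low), point_forecast * np.exp(high))
print(f"Approximate 80% prediction interval: {interval[0]:.0f} to {interval[1]:.0f}")
```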
Loss functions can also be asymmetric. For example, the losses due to a forecast that is too low 
by 50 units may differ from the losses if it is too high by 50 units. But this is a problem for the planner, 
not the forecaster. 
Overconfidence  arising  from  historical  fit  is  compounded  when  analysts  use  the  traditional 
statistics  provided  with  regression  programs  (Soyer  and  Hogarth  2012).  Surprisingly  to  most  people, 
tests  of  statistical  significance  are  of  no  value  even  when  properly  used  and  properly  interpreted 
(Schmidt and Hunter 1997). Moreover, the tests often mislead decision makers. We have been unable 
to find a single case where statistical significance tests have helped to improve forecasting (Armstrong 
2007). For a comprehensive review of the evidence on the value of tests of statistical significance, see 
Ziliak and McCloskey (2008). 
Experts  also  are  typically  overconfident  and  thereby  underestimate  uncertainty  (Arkes  2001). 
For  example,  in  an  examination  of  economic  forecasts  from  22  economists  over  11  years,  the  actual 
values fell outside the range of their prediction intervals about 43% of the time, and this occurred even 
when  the  economists  were  warned  in  advance  against  overconfidence  (McNees  1992).  Group 
interaction  tends  to  increase  overconfidence.  Interestingly,  when  people  are  asked  to  explain  the 
reasons for their predictions, their overconfidence increases.  
To improve the calibration of judges, ensure they receive timely and accurate information on what actually happened, along with reasons why their forecasts were right or wrong. This is part of the
reason  why  next-day  weather  forecasters  are  well  calibrated.  For  example,  for  60%  of  the  days  for 
which they say that there is a 60% chance of rain, it rains. In cases where good feedback is not possible, ask experts to write all the reasons why their forecasts might be wrong; this will correct for some of the
overconfidence (Arkes, 2001). 
Still  another  way  to  assess  uncertainty  is  to  examine  the  agreement  among  forecasts.  For 
example, Ashton (1985), in a study of forecasts of annual advertising sales for Time magazine, found 
that  the  agreement,  or  lack  of  it,  among  the  individual  judgmental  forecasts  was  a  good  proxy  for 
uncertainty.  
Uncertainty is most faithfully represented using empirical prediction intervals estimated from 
ex ante forecast errors from the same or similar forecasting situations (Chatfield 2001). Unfortunately, 
empirical prediction intervals are not widely used in practice (Dalrymple 1987). It is best to simulate 
the actual forecasting procedure as closely as possible, and use the distribution of the resulting ex ante 
forecasts  to  assess  uncertainty.  For  example,  if  you  need  to  make  forecasts  for  two  years  ahead, 
withhold enough data to be able to estimate the forecast errors for two-year-ahead ex ante forecasts. 
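The sketch below illustrates the procedure with a short hypothetical series; a naive no-change forecast stands in for whatever forecasting method is actually used.

```python
import numpy as np

# A short hypothetical demand series (e.g., annual unit sales).
series = np.array([230, 245, 250, 262, 258, 270, 281, 276, 290, 301, 297, 310])
horizon = 2  # we need two-period-ahead forecasts

# Withhold data at successive forecast origins and record the two-step-ahead
# ex ante errors of the procedure (here, a naive "no change" forecast).
errors = []
for origin in range(3, len(series) - horizon):
    forecast = series[origin]              # naive forecast made at the origin period
    actual = series[origin + horizon]
    errors.append(actual - forecast)

# Use the distribution of ex ante errors to form an empirical 80% prediction interval.
lower, upper = np.percentile(errors, [10, 90])
point_forecast = series[-1]
print(f"80% interval: {point_forecast + lower:.0f} to {point_forecast + upper:.0f}")
```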
The choice of error measures is important. Do not use mean square error (MSE). While MSE 
has  characteristics  that  make  it  attractive  to  statisticians,  it  is  not  reliable  (Armstrong  and  Collopy 
1992). Fortunately, the use of MSE by firms has dropped substantially in recent years (McCarthy, et al. 
2006). The median absolute percentage error (MdAPE) is appropriate for many situations because it is 
not affected by scale or by outliers.  
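For example, the following sketch contrasts MdAPE with MSE on hypothetical data containing one outlier:

```python
import numpy as np

# Hypothetical forecasts and actuals; the last observation is an outlier.
actuals   = np.array([100, 120, 80, 150, 2000])
forecasts = np.array([110, 115, 90, 140, 1000])

absolute_percentage_errors = np.abs(actuals - forecasts) / actuals * 100
mdape = np.median(absolute_percentage_errors)   # barely moved by the outlier
mse = np.mean((actuals - forecasts) ** 2)       # dominated by the one large error

print(f"MdAPE: {mdape:.1f}%   MSE: {mse:,.0f}")
```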
 
Gaining acceptance of forecasts 
 
Forecasts that contradict managers' expectations can be valuable. Unfortunately, they may also be
ignored, as was shown in a study by Griffith and Wellman (1979). One way to avoid this problem is to 
gain prior agreement from managers regarding which forecasting procedures should be used. Another 
way to increase the likelihood that forecasts will be accepted is to ask decision makers to determine in 
advance  what  decisions  they  will  make  when  presented  with  different  possible  forecasts.  If  the 
decisions would not be affected by the forecasts, there is no need to make forecasts.  
Scenarios can help to gain acceptance of forecasts. A scenario is a story, written in the past tense, that describes a possible future as if it had already happened, including how decision makers responded. Instructions for
writing  scenarios  are  provided  in  Gregory  and  Duran  (2001).  Scenarios  are  effective  in  getting 
managers to accept the possibility that an event might occur.  
Scenario writing should not, however, be used as a way to forecast. Scenarios lead people to 
greatly overestimate the likelihood that the portrayed event will occur.  
 
Conclusions  
 
Important advances have been made in forecasting over the past half century. These advances can be 
used  to  improve  many  aspects  of  demand  forecasting.  Some  advances  relate  to  the  use  of  judgment, 
such  as  with  Delphi,  simulated  interactions,  intentions  surveys,  expert  surveys,  judgmental 
bootstrapping, and combining. Others relate to quantitative methods such as extrapolation, rule-based 
forecasting,  and  the  index  method.  Most  recently,  gains  have  come  from  the  integration  of  statistical 
and judgmental forecasts. Finally, much has been learned about how to gain acceptance of forecasts. 
Most  firms  ignore  or  are  unaware  of  the  evidence-based  techniques  and  principles  for 
forecasting.  Consequently,  there  are  many  opportunities  to  improve  forecasting.  Over  the  past  few 
years, much effort has been expended to help practitioners by providing understandable principles that 
summarize  research  findings.  These  evidence-based  principles  are  freely  available  at 
forecastingprinciples.com. 
 
REFERENCES  
 
Adya, Monica, and Fred Collopy. 1998. How effective are neural nets at forecasting and prediction? A review and evaluation. Journal of Forecasting 17: 451–461.
Allen, P. Geoffrey, and Robert Fildes. 2001. Econometric forecasting. In Principles of Forecasting, edited by J. Scott Armstrong, 303–362. Norwell, MA: Kluwer Academic Publishers.
Arkes, Hal R. 2001. Overconfidence in judgmental forecasting. In Principles of Forecasting, edited by J. Scott Armstrong, 495–515. Norwell, MA: Kluwer Academic Publishers.
Armstrong, J. Scott. 2012. Illusions in regression analysis. International Journal of Forecasting [Forthcoming].
Armstrong, J. Scott. 2007. Significance tests harm progress in forecasting. International Journal of Forecasting 23: 321–327.
Armstrong, J. Scott. 2006. Findings from evidence-based forecasting: Methods for reducing forecast error. International Journal of Forecasting 22: 583–598.
Armstrong, J. Scott, editor. 2001. Principles of Forecasting. Norwell, MA: Kluwer Academic Publishers.
Armstrong, J. Scott. 2001a. Judgmental bootstrapping: Inferring experts' rules for forecasting. In Principles of Forecasting, edited by J. Scott Armstrong, 171–192. Norwell, MA: Kluwer Academic Publishers.
Armstrong, J. Scott. 2001b. Extrapolation of time-series and cross-sectional data. In Principles of Forecasting, edited by J. Scott Armstrong, 217–243. Norwell, MA: Kluwer Academic Publishers.
Armstrong, J. Scott. 2001c. Evaluating forecasting methods. In Principles of Forecasting, edited by J. Scott Armstrong, 365–382. Norwell, MA: Kluwer Academic Publishers.
Armstrong, J. Scott, Monica Adya, and Fred Collopy. 2001. Rule-based forecasting: Using judgment in time-series extrapolation. In Principles of Forecasting, edited by J. Scott Armstrong, 259–282. Norwell, MA: Kluwer Academic Publishers.
Armstrong, J. Scott and Fred Collopy. 2001. Identification of asymmetric prediction intervals through causal forces. Journal of Forecasting 20: 273–283.
Armstrong, J. Scott and Fred Collopy. 1998. Integration of statistical methods and judgment for time series forecasting: Principles from empirical research. In Forecasting with Judgment, edited by George Wright and Paul Goodwin. Chichester: John Wiley.
Armstrong, J. Scott and Fred Collopy. 1993. Causal forces: Structuring knowledge for time series extrapolation. Journal of Forecasting 12: 103–115.
Armstrong, J. Scott and Fred Collopy. 1992. Error measures for generalizing about forecasting methods: Empirical comparisons. International Journal of Forecasting 8: 69–80.
Armstrong, J. Scott and Andreas Graefe. 2011. Predicting elections from biographical information about candidates: A test of the index method. Journal of Business Research 64: 699–706.
Armstrong, J. Scott, Vicki Morwitz, and V. Kumar. 2000. Sales forecasts for existing consumer products and services: Do purchase intentions contribute to accuracy? International Journal of Forecasting 16: 383–397.
Armstrong, J. Scott and Terry S. Overton. 1971. Brief vs. comprehensive descriptions in measuring intentions to purchase. Journal of Marketing Research 8: 114–117.
Ashton, Alison H. 1985. Does consensus imply accuracy in accounting studies of decision making? Accounting Review 60: 173–185.
Batchelor, Roy and Pami Dua. 1995. Forecaster diversity and the benefits of combining forecasts. Management Science 41: 68–75.
Brodie, Roderick J., Peter Danaher, V. Kumar, and Peter Leeflang. 2001. Econometric models for forecasting market share. In Principles of Forecasting, edited by J. Scott Armstrong, 597–611. Norwell, MA: Kluwer Academic Publishers.
Chatfield, Christopher. 2001. Prediction intervals for time series. In Principles of Forecasting, edited by J. Scott Armstrong, 475–494. Norwell, MA: Kluwer Academic Publishers.
Collopy, Fred, Monica Adya, and J. Scott Armstrong. 2001. Expert systems for forecasting. In Principles of Forecasting, edited by J. Scott Armstrong, 285–300. Norwell, MA: Kluwer Academic Publishers.
Collopy, Fred, and J. Scott Armstrong. 1992. Rule-based forecasting: Development and validation of an expert systems approach to combining time-series extrapolations. Management Science 38: 1394–1414.
Crone, Sven F., Michèle Hibon, and Konstantinos Nikolopoulos. 2011. Advances in forecasting with neural networks? Empirical evidence from the NN3 competition on time series prediction. International Journal of Forecasting 27: 635–660.
Dangerfield, Byron J. and John S. Morris. 1992. Top-down or bottom-up: Aggregate versus disaggregate extrapolations. International Journal of Forecasting 8: 233–241.
Dillman, Don A., Jolene D. Smyth, and Leah Melani Christian. 2009. Internet, Mail, and Mixed-Mode Surveys: The Tailored Design Method, 3rd ed. Hoboken, NJ: John Wiley.
Fildes, Robert, Paul Goodwin, Michael Lawrence, and Konstantinos Nikolopoulos. 2009. Effective forecasting and judgmental adjustments: An empirical evaluation and strategies for improvement in supply-chain planning. International Journal of Forecasting 25: 3–23.
Gardner, Everette S., Jr. 2006. Exponential smoothing: The state of the art – Part II (with commentary). International Journal of Forecasting 22: 637–677.
Goodwin, Paul. 2005. How to integrate management judgment with statistical forecasts. Foresight 1: 8–12.
Goodwin, Paul, Dilek Önkal, and Michael Lawrence. 2011. Improving the role of judgment in economic forecasting. In The Oxford Handbook of Economic Forecasting, edited by Michael P. Clements and David F. Hendry, 163–189. Oxford, UK: Oxford University Press.
Gorr, Wilpen, Andreas Olligschlaeger, and Yvonne Thompson. 2003. Short-term forecasting of crime. International Journal of Forecasting 19: 579–594.
Graefe, Andreas. 2011. Prediction market accuracy for business forecasting. In Prediction Markets, edited by L. Vaughan-Williams, 87–95. New York: Routledge.
Graefe, Andreas, J. Scott Armstrong, Randall J. Jones, and Alfred G. Cuzán. 2012. Combining forecasts: An application to political elections. Working paper. [Available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1902850]
Graefe, Andreas and J. Scott Armstrong. 2012. Predicting elections from the most important issue: A test of the take-the-best heuristic. Journal of Behavioral Decision Making 25: 41–48.
Graefe, Andreas and J. Scott Armstrong. 2011. Conditions under which index models are useful: Reply to bio-index commentaries. Journal of Business Research 64: 693–695.
Gregory, W. Larry and Anne Duran. 2001. Scenarios and acceptance of forecasts. In Principles of Forecasting, edited by J. Scott Armstrong, 519–541. Norwell, MA: Kluwer Academic Publishers.
Green, Kesten C. 2005. Game theory, simulated interaction, and unaided judgment for forecasting decisions in conflicts: Further evidence. International Journal of Forecasting 21: 463–472.
Green, Kesten C. 2002. Forecasting decisions in conflict situations: A comparison of game theory, role-playing, and unaided judgement. International Journal of Forecasting 18: 321–344.
Green, Kesten C. and J. S. Armstrong. 2011. Role thinking: Standing in other people's shoes to forecast decisions in conflicts. International Journal of Forecasting 27: 69–80.
Green, Kesten C. and J. Scott Armstrong. 2007. Structured analogies for forecasting. International Journal of Forecasting 23: 365–376.
Green, K. C., J. S. Armstrong, and Andreas Graefe. 2007. Methods to elicit forecasts from groups: Delphi and prediction markets compared. Foresight 8: 17–20. Available from http://kestencgreen.com/green-armstrong-graefe-2007x.pdf
Griffith, John R., and Barry T. Wellman. 1979. Forecasting bed needs and recommending facilities plans for community hospitals: A review of past performance. Medical Care 17: 293–303.
Herzog, Stefan M., and Ralph Hertwig. 2009. The wisdom of many in one mind: Improving individual judgments with dialectical bootstrapping. Psychological Science 20: 231–237.
Jørgensen, Magne. 2004. Top-down and bottom-up expert estimation of software development effort. Journal of Information and Software Technology 46 (1): 3–16.
Keogh, Eamonn J. and Shruti Kasetty. 2002. On the need for time series data mining benchmarks: A survey and empirical demonstration. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 102–111.
Kim, Min-Sun and John E. Hunter. 1993. Relationships among attitudes, behavioral intentions, and behavior: A meta-analysis of past research. Communication Research 20: 331–364.
MacGregor, Donald G. 2001. Decomposition for judgmental forecasting and estimation. In Principles of Forecasting, edited by J. S. Armstrong, 107–123. Norwell, MA: Kluwer Academic Publishers.
Makridakis, Spyros G., A. Andersen, Robert Carbone, Robert Fildes, Michèle Hibon, Rudolf Lewandowski, Joseph Newton, Emmanuel Parzen, and Robert Winkler. 1984. The Forecasting Accuracy of Major Time Series Methods. Chichester: John Wiley.
Makridakis, Spyros G., Michèle Hibon, Ed Lusk, and Moncef Belhadjali. 1987. Confidence intervals: An empirical investigation of time series in the M-competition. International Journal of Forecasting 3: 489–508.
Makridakis, Spyros G. and Michèle Hibon. 2000. The M3-Competition: Results, conclusions and implications. International Journal of Forecasting 16: 451–476.
Makridakis, Spyros G., Steven C. Wheelwright, and Rob J. Hyndman. 1998. Forecasting Methods for Management, Third edition. New York: John Wiley.
McCarthy, Teresa M., Donna F. Davis, Susan L. Golicic, and John T. Mentzer. 2006. The evolution of sales forecasting management: A 20-year longitudinal study of forecasting practices. Journal of Forecasting 25: 303–324.
McNees, Stephen K. 1992. The uses and abuses of consensus forecasts. Journal of Forecasting 11: 703–710.
Meade, Nigel and Towhidul Islam. 2001. Forecasting the diffusion of innovations: Implications for time series extrapolation. In Principles of Forecasting, edited by J. Scott Armstrong, 577–595. Norwell, MA: Kluwer Academic Publishers.
Miller, Don M. and Dan Williams. 2004. Shrinkage estimators for damping X12-ARIMA seasonals. International Journal of Forecasting 20: 529–549.
Morwitz, Vicki G. 2001. Methods for forecasting from intentions data. In Principles of Forecasting, edited by J. Scott Armstrong, 33–56. Norwell, MA: Kluwer Academic Publishers.
Rowe, Gene and George Wright. 2001. Expert opinions in forecasting: Role of the Delphi technique. In Principles of Forecasting, edited by J. Scott Armstrong, 125–144. Norwell, MA: Kluwer Academic Publishers.
Schmidt, Frank L. and John E. Hunter. 1997. Eight common but false objections to the discontinuation of significance testing in the analysis of research data. In What If There Were No Significance Tests?, edited by Lisa L. Harlow, Stanley A. Mulaik, and James H. Steiger, 37–64. London: Lawrence Erlbaum.
Soyer, Emre and Robin Hogarth. 2012. Illusion of predictability: How regression statistics mislead experts. International Journal of Forecasting 28 [Forthcoming].
Tellis, Gerald J. 1988. The price elasticity of selective demand: A meta-analysis of econometric models of sales. Journal of Marketing Research 25: 331–341.
Tetlock, Philip E. 2005. Expert Political Judgment: How Good Is It? How Can We Know? Princeton, NJ: Princeton University Press.
Wittink, Dick R. and Trond Bergestuen. 2001. Forecasting with conjoint analysis. In Principles of Forecasting, edited by J. Scott Armstrong, 147–167. Norwell, MA: Kluwer Academic Publishers.
Wright, Malcolm and Murray MacRae. 2007. Bias and variability in purchase intention scales. Journal of the Academy of Marketing Science 35: 617–624.
Ziliak, Stephen T. and Deirdre N. McCloskey. 2008. The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives. Ann Arbor, MI: University of Michigan Press.
 
Authors 
 
J. Scott Armstrong (Ph.D., MIT, 1968), Professor of Marketing at the Wharton School, 
University of Pennsylvania, is a founder of the Journal of Forecasting, the International 
Journal of Forecasting, and the International Symposium on Forecasting. He is the creator of 
forecastingprinciples.com and editor of Principles of Forecasting (Kluwer 2001), an evidence-based 
summary of knowledge on forecasting. In 1996, he was selected as one of the first Honorary Fellows by 
the International Institute of Forecasters. In 2004 and 2008, his PollyVote.com team showed how scientific 
forecasting principles can produce highly accurate forecasts of US presidential elections. He was named by 
the Society of Marketing Advances as Distinguished Marketing Scholar of 2000. One of Wharton's most
prolific scholars, he is the most highly cited professor in the Marketing Department at Wharton. His current 
projects involve the application of scientific forecasting methods to climate change, the effectiveness of 
learning at universities, and the use of the index method to make predictions for situations with many 
variables and much knowledge. His book, Persuasive Advertising, was published by Palgrave Macmillan in 
2010. It summarizes evidence-based knowledge on persuasion and it is supported by 
advertisingprinciples.com. He can be contacted at armstrong@wharton.upenn.edu. 
 
Kesten C. Green (Ph.D., VUW, 2003) is a Senior Lecturer in the International Graduate School of 
Business of the University of South Australia and a Senior Research Associate of the Ehrenberg-Bass 
Institute for Marketing Science. He is also a Director of the International Institute of Forecasters and co-
director of the Forecasting Principles public service Internet site devoted to the advancement of evidence-
based forecasting. His research has led to improvements in forecasting the decisions people make in 
conflicts such as occur in business competition, supply chains, mergers and acquisitions, and between 
customers and businesses. His other interests include forecasting for public policy, forecasting demand, 
forecasting for recessions and recoveries, and the effect of business objectives on performance. His research 
has been covered in the Australian Financial Review, the London Financial Times, the New Yorker, and the 
Wall Street Journal. He has advised the Alaska Department of Natural Resources, the U.S. Department of 
Defense, the Defense Threat Reduction Agency, the National Security Agency (NSA) and more than 50
other business and government clients. Kesten can be contacted at kesten@me.com.