Booze'd Trees—Beer Sales Forecasting
Dan Zylberglejd
dzylber@cs.stanford.edu
We predict sales volumes of beer brands by unit size and region in Mexico during 2016 for the brewery conglomerate Anheuser-Busch InBev. Augmenting the data with environmental features and surveying suitable models leads us to an ensemble method that combines lasso and bagging with ABI's in-house predictions, achieving a 9.2% improvement in test RMSE over ABI's model. We additionally present an optimal reallocation strategy for when forecasts change after production has finished.
The Challenge
Anheuser-Busch InBev is the largest beer company in the world. It recently announced one of the largest deals in history, its acquisition of SABMiller, and it controls some of the main breweries in the world, such as Interbrew in Belgium and Anheuser-Busch in the US. Our project focuses on Grupo Modelo, its Mexican branch, which produces, among others, beers such as Corona, Modelo and Victoria.

Figure 1: Anheuser-Busch InBev logo.

Mexico is a substantial beer consumer, and a major challenge posed by ABI's new CEO is to improve the accuracy of its two-month-ahead sales forecasts. There is not much historical data on past forecasts, but an internal analysis indicates that the consolidated monthly percentage error for Mexico is around 15–20%. To reduce unnecessary waste and ensure customer demand is met, Modelo needs to decrease this number, and we implement a Machine Learning approach to tackle this challenge. We believe it can generate a considerable impact, not only from a profit perspective but also as a step towards pushing a data-driven culture into the business.
The Data
For each trio of UEN (region), unit size (bottle, can, ...) and brand, ABI provides us with monthly sales in hectolitres since January 2013, as well as discounts and other pricing measures. For each example, we use past monthly sales going back up to one year.

Figure 2: Monthly change of sales volume per UEN-Brand-Size trio on a log scale.
We also scrape the web in order to augment our dataset with temperature, holidays, soccer calendars and economic inputs (inflation, GDP). We carefully align the data with the information we would have had "at the moment of training": since we are predicting 2 months ahead, some features need to carry a 2-month delay (such as past sales), while others do not, since they are known in advance (holidays, soccer calendar).

Figure: Chronological setup. Training on 2014–2015 (19424 samples) with a 4:1 train/validation split; testing on January–August 2016 (6933 samples).
Finally, we perform feature engineering on the main features. We incorporate the difference between consecutive months' sales, and we also aggregate past sales by UEN, unit size and brand, as well as by all their pairwise combinations (6 groupings in total).
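To make the alignment and feature-engineering steps concrete, here is a minimal pandas sketch. The column names (uen, brand, size, date, volume) and the long data format are illustrative assumptions, not ABI's actual schema.

    import pandas as pd

    # Hypothetical long-format frame: one row per (uen, brand, size, month),
    # with the monthly sales volume in hectolitres in column "volume".
    df = df.sort_values(["uen", "brand", "size", "date"])
    by_trio = df.groupby(["uen", "brand", "size"])["volume"]

    # Past sales enter with a 2-month delay (we forecast 2 months ahead),
    # going back up to one year.
    for k in range(2, 14):
        df[f"volume_lag_{k}"] = by_trio.shift(k)

    # Difference between consecutive (already delayed) monthly sales.
    df["volume_diff"] = df["volume_lag_2"] - df["volume_lag_3"]

    # Aggregate delayed sales by UEN, unit size, brand and their pairwise
    # combinations (6 groupings in total).
    groupings = [["uen"], ["size"], ["brand"],
                 ["uen", "size"], ["uen", "brand"], ["size", "brand"]]
    for keys in groupings:
        df["sales_by_" + "_".join(keys)] = (
            df.groupby(keys + ["date"])["volume_lag_2"].transform("sum"))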
The final dataset then consists of 25694 examples with 149 features and 1 numerical output. We split it into training, validation and test sets following chronological order. We run all of our experiments on the training set, followed by occasional checks on the validation set. The test set was only used for the final model assessment.
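A sketch of the chronological split, assuming df carries a datetime column named date (again a hypothetical name):

    # Chronological split: 2014-2015 for training/validation, 2016 for testing.
    train_val = df[df["date"] < "2016-01-01"].sort_values("date")
    test = df[df["date"] >= "2016-01-01"]

    # 4:1 train/validation split that also respects time order (no shuffling).
    cut = int(0.8 * len(train_val))
    train, val = train_val.iloc[:cut], train_val.iloc[cut:]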
The Metric
The industrial engineering and supply-chain management literature describes the so-called bullwhip effect². This phenomenon occurs as inaccurate forecasts propagate backwards through a supply chain, leading to inefficiencies in inventory management that grow superlinearly. To account for this effect, we agreed with ABI to use a metric that penalizes errors more than linearly when evaluating model performance. We pick Root Mean Squared Error (RMSE).

Figure 3: Illustration of the bullwhip effect and the outsized distortions as we move up the supply chain.

² Dean C. Chatfield, Jeon G. Kim, Terry P. Harrison, and Jack C. Hayya. The bullwhip effect–impact of stochastic lead time, information quality, and information sharing: a simulation study. Production and Operations Management, 13(4):340–353, 2004.

We also report other metrics which ABI executives are more used to. However, it is important to stress that, from a profit-oriented perspective, RMSE is the most important evaluation criterion. In particular, their use of Mean Absolute Percentage Error (MAPE) could be misleading: it grows only linearly in the numerator, and the denominator inflates errors for small sales volumes, which goes in the opposite direction of maximizing profit.
\[
\text{RMSE:}\ \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{Y}_i - Y_i\right)^2}
\qquad
\text{MAPE:}\ \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{Y}_i - Y_i}{Y_i}\right|
\]
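For concreteness, the two metrics as plain NumPy functions, a direct transcription of the formulas above:

    import numpy as np

    def rmse(y_true, y_pred):
        # Quadratic penalty on errors, in line with the bullwhip argument.
        y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
        return np.sqrt(np.mean((y_pred - y_true) ** 2))

    def mape(y_true, y_pred):
        # Linear in the error; the denominator inflates errors on small sales.
        y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
        return np.mean(np.abs((y_pred - y_true) / y_true))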
The Model
We start our modelling with a baseline, Constant YOY, which predicts for each trio (UEN-unit size-brand) the same volume as the previous year.

We then move to linear models, applying linear regression and then adding an L1 penalty to its coefficients to obtain a lasso model. It is worth mentioning that our linear regression had high variance due to multicollinearity (especially among economic inputs that changed only 3 times per year and were highly correlated: GDP, % change in GDP, touristic GDP). Lasso took care of it and produced good results with low variance, achieving similar training and validation errors. However, there was a large bias, since the training error was still high.

Figure 4: Convergence of training and test error as the share of the training set used grows to the full data set (linear regression).
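The report does not name the toolkit used; as an illustrative sketch only, the lasso step could look like the following in scikit-learn, where X_train and y_train stand for the engineered feature matrix and the sales volumes (hypothetical names):

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LassoCV

    # L1-penalized linear regression; standardizing matters because the penalty
    # is scale-sensitive and our features (sales, GDP, dummies) differ widely.
    lasso = make_pipeline(StandardScaler(), LassoCV(cv=5))
    lasso.fit(X_train, y_train)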
To decrease our bias, we explore more flexible models such as tree-based ones, given their success in Kaggle competitions³ and other applications. For boosting, we tuned the max depth of each tree and the number of trees using 5-fold cross-validation (we set the step size to 0.01). Surprisingly for us, it does not perform well on the validation set. We applied random forests with the main hyperparameter (the number of variables considered at each split) set to its default, with some success. We then tuned it using the out-of-bag error (faster than cross-validation) and, surprisingly, the optimal number of allowed variables is the largest possible.

³ Kaggle. Forecast sales using store, promotion, and competitor data, September 2015. URL https://www.kaggle.com/c/rossmann-store-sales.

This means we are not introducing any randomness at each split and are thus implementing bagging, which led to our best results so far. We believe the common reason why boosting and random forests did not perform well is the high-order interaction effects in our data. Since each trio behaves differently from the others, we believe we have a strong interaction effect of order at least 3. For high-order interaction effects, boosting is not well suited when using simple base learners of low depth, and the restriction on which variables can be picked at each split in a random forest does not allow it to fully capture those interactions either.

Figure 5: LASSO coefficient paths. The largest coefficients are associated with brands and months.
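As an illustrative sketch (the original toolkit is again not specified), tuning the number of candidate variables per split via the out-of-bag error could look as follows; allowing all 149 features turns the random forest into plain bagging. The candidate grid and variable names are assumptions.

    from sklearn.ensemble import RandomForestRegressor

    best_m, best_oob = None, -float("inf")
    for m in (12, 40, 80, 120, 149):          # 149 = all features -> plain bagging
        rf = RandomForestRegressor(n_estimators=500, max_features=m,
                                   oob_score=True, n_jobs=-1, random_state=0)
        rf.fit(X_train, y_train)
        if rf.oob_score_ > best_oob:          # oob_score_ is OOB R^2 (higher is better)
            best_m, best_oob = m, rf.oob_score_

    # Refit with the chosen number of variables per split.
    bagging = RandomForestRegressor(n_estimators=500, max_features=best_m,
                                    n_jobs=-1, random_state=0).fit(X_train, y_train)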
Given that we obtained good performance with both lasso and bagging, we decide to combine the two. We opt to ensemble them in a naive way, by simply averaging their predictions (which acts like a uniform prior on the model weights, leading to a regularization effect). Since those functions define very different surfaces, we expect a good ensemble effect, with errors cancelling each other out (decreasing our overall bias) and less sensitivity to any particular model (decreasing our variance). We report the validation error for all the mentioned models and pick the ensemble of lasso with bagging as our final one. We are now ready to apply it to the test set.

Figure 6: Predicted over actual hectolitres on the validation set, on a log-log plot, for Constant YOY, linear regression, LASSO, random forest, boosting, bagging and the two ensembles (RF+LASSO and Bagging+LASSO), with predictions plotted against the ground truth. A perfect prediction would display as a 45° line. Note that we penalize with the square of the error, so good predictors should be sharp in the top right. LASSO performs well even though its predictions spread out in the bottom left corner, as those values are the least relevant for the resulting RMSE.
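The naive ensemble is then just an unweighted average of the two predictions; a one-line sketch, assuming the lasso and bagging estimators from the sketches above and a validation feature matrix X_val:

    # Equal-weight average of the two models' validation predictions.
    pred_ensemble = (lasso.predict(X_val) + bagging.predict(X_val)) / 2.0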
Figure 7: Performance of the different models on the validation set (bar length proportional to MSE; numbers are RMSE in hectolitres): Baseline (YOY) 1877, Linear Regression 1601, Boosting 1564, Random Forest 1512, LASSO 1492, Bagging 1476, RF + LASSO 1455, Bagging + LASSO 1437.
The Results
We run a complete backtest on the test set (January–August 2016). More specifically, for each month that we are forecasting, we use all data available up to 2 months before it, retrain our model and make the predictions. We do this on a rolling-window basis, so by the end of the backtest we have predictions for every month in the test set, exactly as if we had plugged our model into ABI's routine and made those forecasts at the time. This allows us to fully trust our results and generate reliable estimates of future test errors (possibly even pessimistic ones, since we will naturally have more data in the future).
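The backtest loop, sketched below with a hypothetical fit_ensemble helper that refits the lasso+bagging ensemble on whatever data is available at each cutoff; feature_cols stands for the list of predictors (both names are assumptions):

    import pandas as pd

    predictions = []
    for month in pd.date_range("2016-01-01", "2016-08-01", freq="MS"):
        cutoff = month - pd.DateOffset(months=2)      # only data known 2 months ahead
        history = df[df["date"] <= cutoff]
        current = df[df["date"] == month]
        model = fit_ensemble(history)                  # hypothetical helper
        predictions.append(model.predict(current[feature_cols]))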
Table 1 summarizes our results quantitatively.

Metric          ABI      Ours     Both
RMSE            1883     1743     1709
MAPE            3.8%     4.8%     4.2%
MAPE by UEN     8.4%     8.3%     7.6%
MAPE by Brand   12.2%    13.0%    11.7%
MAPE by Size    8.0%     9.8%     8.3%

Table 1: Performance on the test set for ABI's model, the model we developed (ensemble of bagging and LASSO based on the features described above), and a model that simply averages our predictions with ABI's. We report Root Mean Squared Error as well as different aggregations of Mean Absolute Percentage Error.

Some important conclusions can be drawn:

1. The model that ABI is currently developing achieves an RMSE of 1883, which is believed to be already better than their current system. Our model achieves 1743, a substantial improvement over it. When we combine our model with ABI's through simple averaging, the resulting model achieves 1709, which corresponds to a 9.2% overall improvement. We therefore consider our approach successful and recommend implementing it.

2. When analyzing other metrics, our model appears slightly inferior to ABI's. It outperforms ABI on MAPE aggregated by UEN, and underperforms it when aggregating by brand or unit size. We restate that, even though it is more interpretable, this measure is not representative of potential profit and is also distorted by aggregations with low volume.
3. Even though we do not recommend MAPE as the optimal metric, we still achieve good results on the average monthly MAPE over the test set: our model achieves less than 5% error, ABI's achieves less than 4%, and the combined one around 4%.

4. Other important advantages of the model are worth mentioning: it deals well with missing data (the bagging part of the model uses surrogate splits for this purpose); each monthly model can be trained in around 1 hour on a standard local CPU; and it is flexible enough to incorporate new types of data, such as weather/humidity (not used yet) or even analysts' forecasts.

Figure 8: Actual sales (gray) and 2-month leading predictions (red) on the validation and test sets (separated by the dashed line), for the two most consumed brands in Mexico: Corona Extra and Victoria.

The Improvement

From an Artificial Intelligence perspective, we extend the model to
adjust itself dynamically. Predictions made 1 month in advance tend to be more accurate than those made 2 months ahead. Since the production plan starts 2 months in advance, ABI cannot change the amount of beer and bottles produced; however, they can reallocate it differently across the UENs. This can enhance logistics and even sales/marketing planning. So each month we train a model to predict sales one month in advance, and adjust our previous predictions under the constraint that the sum of the hectolitres for each pair of brand and unit size remains constant. We first analyzed it from a CSP perspective, but it turns out this optimization can be solved analytically through Lagrange multipliers (see the appendix).
Running a backtest in the same format as described previously, we obtain an improvement from an RMSE of 1743 to 1701 for our model. Curiously, ABI's model gets worse, its RMSE increasing from 1883 to 1966. The combination of both models also improves, going from 1709 to 1692. Thus, again, we recommend implementing the dynamic regional reallocation.
The Next Steps
This project explored an application of Machine Learning to an industrial field, food & beverages. It shows that a data-driven approach allows for enhancement of business performance, and that there is still room for other applications and improvements. Some possible extensions of our model, which came out of meetings with senior management at Grupo Modelo, are:
1. Forecast exports and international markets.
2. Make longer term predictions, such as six months or a full year.
3. Predict sales for new brands or unit sizes that have not yet been released.
References
Dean C. Chatfield, Jeon G. Kim, Terry P. Harrison, and Jack C. Hayya. The bullwhip effect–impact of stochastic lead time, information quality, and information sharing: a simulation study. Production and Operations Management, 13(4):340–353, 2004.

Da Veiga et al. A comparison between the Holt-Winters and ARIMA models. WSEAS Transactions, 11:608–614, 2014.

Kaggle. Store sales forecasting, February 2014. URL https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/data.

Kaggle. Forecast sales using store, promotion, and competitor data, September 2015. URL https://www.kaggle.com/c/rossmann-store-sales.
Appendix
Call $q_{\text{old}}^{(1)}, q_{\text{old}}^{(2)}, \ldots, q_{\text{old}}^{(n)}$ the old estimates of the quantity of beer to be sold in each UEN for a given pair (brand, unit size). Call $q_{\text{new}}^{(1)}, q_{\text{new}}^{(2)}, \ldots, q_{\text{new}}^{(n)}$ the new estimates for the same pair (brand, unit size), and call $r^{(1)}, \ldots, r^{(n)}$ our final estimates. Our goal is to solve the following optimization problem:

\[
\begin{aligned}
\underset{r^{(1)}, \ldots, r^{(n)}}{\text{minimize}} \quad & f(r^{(1)}, \ldots, r^{(n)}) = \sum_{i=1}^{n} \left(r^{(i)} - q_{\text{new}}^{(i)}\right)^2 \\
\text{subject to} \quad & g(r^{(1)}, \ldots, r^{(n)}) = \sum_{i=1}^{n} r^{(i)} - \sum_{i=1}^{n} q_{\text{old}}^{(i)} = 0
\end{aligned}
\]
Calling $K_{\text{new}} = \sum_{i=1}^{n} q_{\text{new}}^{(i)}$ the total hectolitres in the new estimates and $K_{\text{old}} = \sum_{i=1}^{n} q_{\text{old}}^{(i)}$ the total previously predicted by the 2-month-ahead model (and therefore already produced), the constraint requires $\sum_{i=1}^{n} r^{(i)} - K_{\text{old}} = 0$. When we first designed this task, we thought about using Constraint Satisfaction Problem tools. However, it turns out that this optimization problem is convex and can be solved analytically using Lagrange multipliers. Define:

\[
\mathcal{L}(r^{(1)}, \ldots, r^{(n)}, \lambda) = f(r^{(1)}, \ldots, r^{(n)}) - \lambda\, g(r^{(1)}, \ldots, r^{(n)})
\]
From the method of Lagrange multipliers, we set the gradient of the Lagrangian to zero (absorbing the constant factor of 2 into $\lambda$):

\[
\nabla_{r^{(1)}, \ldots, r^{(n)}, \lambda}\, \mathcal{L}(r^{(1)}, \ldots, r^{(n)}, \lambda) = 0 \;\Rightarrow\;
\begin{cases}
r^{(i)} - q_{\text{new}}^{(i)} = \lambda, & i = 1, \ldots, n \\[4pt]
\displaystyle\sum_{i=1}^{n} r^{(i)} = K_{\text{old}}
\end{cases}
\tag{1}
\]

From the last equation in (1) we get $K_{\text{new}} + n\lambda = K_{\text{old}} \Rightarrow \lambda = \frac{K_{\text{old}} - K_{\text{new}}}{n}$. Solving for an arbitrary $r^{(i)}$, we get the updated estimates:

\[
r^{(i)} = q_{\text{new}}^{(i)} + \frac{K_{\text{old}} - K_{\text{new}}}{n}
\]
This makes intuitive sense: the solution reallocates the estimated excess or shortage of beer/bottles equally over all the UENs while staying as close as possible to the new, updated predictions.
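A direct transcription of this closed-form update, where q_new is the vector of 1-month-ahead estimates for one (brand, unit size) pair and k_old is the total already scheduled for production (illustrative names):

    import numpy as np

    def reallocate(q_new, k_old):
        # Shift every UEN's new estimate by the same amount so that the total
        # matches the volume already produced for this (brand, unit size) pair.
        q_new = np.asarray(q_new, dtype=float)
        return q_new + (k_old - q_new.sum()) / len(q_new)

    # Example: 250 hl newly forecast, 260 hl already produced -> +10/3 hl per UEN.
    print(reallocate([120.0, 80.0, 50.0], k_old=260.0))   # [123.33 83.33 53.33]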