Markov Chain
Monte Carlo and
model fitting
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
Michal Oleszak
Machine Learning Engineer
Bayesian data analysis in production
Grid approximation: inconvenient with many parameters
Sampling from known posterior: requires conjugate priors
Markov Chain Monte Carlo (MCMC): sampling from unknown posterior!
BAYESIAN DATA ANALYSIS IN PYTHON
Monte Carlo
Approximating some quantity by
generating random numbers
From the formula, πr2 ≃ 78.5
BAYESIAN DATA ANALYSIS IN PYTHON
Monte Carlo
Approximating some quantity by
generating random numbers
From the formula, πr2 ≃ 78.5
Draw a 10x10 square around the circle.
BAYESIAN DATA ANALYSIS IN PYTHON
Monte Carlo
Approximating some quantity by
generating random numbers
From the formula, πr2 ≃ 78.5
Draw a 10x10 square around the circle.
Sample 25 random points in the square.
How many are within the circle?
19/25 = 76%
Circle's area approximation: 76% * 100 = 76
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chains
Models a sequence of states, between
which one transitions with given
probabilities.
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chains
Models a sequence of states, between What will the bear do next:
which one transitions with given
probabilities. hunt eat sleep
hunt 0.1 0.8 0.1
eat 0.05 0.4 0.55
sleep 0.8 0.15 0.05
A er many time periods, transition
probabilities become the same no ma er
where we started.
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chains
Models a sequence of states, between What will the bear do next:
which one transitions with given
probabilities. hunt eat sleep
hunt 0.1 0.8 0.1
eat 0.05 0.4 0.55
sleep 0.8 0.15 0.05
A er many time periods, transition What will the bear do in a distant future:
probabilities become the same no ma er
where we started. hunt eat sleep
hunt 0.28 0.44 0.28
eat 0.28 0.44 0.28
sleep 0.28 0.44 0.28
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chain Monte Carlo
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chain Monte Carlo
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chain Monte Carlo
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chain Monte Carlo
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chain Monte Carlo
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chain Monte Carlo
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chain Monte Carlo
BAYESIAN DATA ANALYSIS IN PYTHON
Markov Chain Monte Carlo
BAYESIAN DATA ANALYSIS IN PYTHON
Aggregated ads data
print(ads_aggregated)
date clothes_banners_shown sneakers_banners_shown num_clicks
0 2019-01-01 20 18 2
1 2019-01-02 24 19 8
2 2019-01-03 20 20 5
.. ... ... ... ...
148 2019-05-29 24 25 8
149 2019-05-30 26 27 11
150 2019-05-31 26 24 8
[151 rows x 4 columns]
BAYESIAN DATA ANALYSIS IN PYTHON
Linear regression with pyMC3
formula = "num_clicks ~ clothes_banners_shown + sneakers_banners_shown"
with pm.Model() as model:
pm.GLM.from_formula(formula, data=ads_aggregated)
# Print model specification
print(model)
# Sample posterior draws
trace = pm.sample(draws=1000, tune=500)
BAYESIAN DATA ANALYSIS IN PYTHON
Let's practice
MCMC!
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
Interpreting results
and comparing
models
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
Michal Oleszak
Machine Learning Engineer
Running the model revisited
formula = "num_clicks ~ clothes_banners_shown + sneakers_banners_shown"
with pm.Model() as model_1:
pm.GLM.from_formula(formula, data=ads_aggregated)
trace_1 = pm.sample(draws=1000, tune=500)
BAYESIAN DATA ANALYSIS IN PYTHON
Running the model revisited
formula = "num_clicks ~ clothes_banners_shown + sneakers_banners_shown"
with pm.Model() as model_1:
pm.GLM.from_formula(formula, data=ads_aggregated)
trace_1 = pm.sample(draws=1000, tune=500, chains=4)
Number of parameters: 4
Number of draws for each parameter: 1000 × 4 = 4000
BAYESIAN DATA ANALYSIS IN PYTHON
Trace plot
pm.traceplot(trace_1)
BAYESIAN DATA ANALYSIS IN PYTHON
Trace plot: zoom in on one parameter
BAYESIAN DATA ANALYSIS IN PYTHON
Forest plot
pm.forestplot(trace_1)
BAYESIAN DATA ANALYSIS IN PYTHON
Trace summary
pm.summary(trace_1)
mean sd hdi_3% hdi_97% mcse_mean mcse_sd \
Intercept 1.307 0.886 -0.305 2.962 0.018 0.013
clothes_banners_shown 0.103 0.031 0.043 0.160 0.001 0.000
sneakers_banners_shown 0.104 0.032 0.045 0.163 0.001 0.001
sd 2.654 0.157 2.382 2.970 0.003 0.002
ess_mean ess_sd ess_bulk ess_tail r_hat
Intercept 2346.0 2318.0 2351.0 2083.0 1.0
clothes_banners_shown 2085.0 2085.0 2089.0 1868.0 1.0
sneakers_banners_shown 2105.0 1953.0 2122.0 1869.0 1.0
sd 2615.0 2590.0 2646.0 1834.0 1.0
BAYESIAN DATA ANALYSIS IN PYTHON
Fitting another model
formula = "num_clicks ~ clothes_banners_shown + sneakers_banners_shown + weekend"
with pm.Model() as model_2:
pm.GLM.from_formula(formula, data=ads_aggregated)
trace_2 = pm.sample(draws=1000, tune=500)
BAYESIAN DATA ANALYSIS IN PYTHON
Widely Applicable Information Criterion (WAIC)
comparison = pm.compare({"trace_1": trace_1, "trace_2": trace_2},
ic="waic", scale="deviance")
print(comparison)
rank waic p_waic d_waic weight se dse warning \
trace_2 0 -362.8 5.1576 0 0.513792 9.37269 0 True
trace_1 1 -362.926 4.13318 0.126236 0.486208 9.48352 1.50682 True
waic_scale
trace_2 log
trace_1 log
BAYESIAN DATA ANALYSIS IN PYTHON
Compare plot
pm.compareplot(comparison)
BAYESIAN DATA ANALYSIS IN PYTHON
Let's practice
comparing models!
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
Making predictions
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
Michal Oleszak
Machine Learning Engineer
Number-of-clicks model again
formula = "num_clicks ~ clothes_banners_shown + sneakers_banners_shown + weekend"
with pm.Model() as model_2:
pm.GLM.from_formula(formula, data=ads_aggregated)
trace_2 = pm.sample(draws=1000, tune=500)
BAYESIAN DATA ANALYSIS IN PYTHON
Ads test data
print(ads_test)
clothes_banners_shown sneakers_banners_shown num_clicks weekend
0 40 36 7 True
1 42 47 8 False
2 45 37 11 False
3 22 15 4 False
4 20 18 2 False
BAYESIAN DATA ANALYSIS IN PYTHON
Sampling predictive draws
with pm.Model() as model:
pm.GLM.from_formula(formula, data=ads_test)
posterior_predictive = pm.fast_sample_posterior_predictive(trace_2)
BAYESIAN DATA ANALYSIS IN PYTHON
Predictive draws
posterior_predictive["y"].shape
(4000, 5)
print(posterior_predictive["y"])
array([[12.83527253, 10.22454815, 11.20386868, 7.50227286, 6.85458594],
[ 3.1015655 , 6.1253004 , 11.38324931, 2.1844722 , 4.21451756],
[ 3.40141276, 9.10157964, 6.57689421, 8.26669814, 4.23812161],
...,
[10.97303606, 9.0772305 , 10.6877039 , 1.78448969, 6.75663075],
[ 8.53734584, 12.14079593, 11.00969881, 4.69875055, 8.317338 ],
[16.44713387, 17.35163824, 19.59359831, 2.84058536, 4.21108186]])
BAYESIAN DATA ANALYSIS IN PYTHON
How good is the prediction?
clothes_banners_shown sneakers_banners_shown num_clicks weekend
0 40 36 7 True
pm.plot_posterior(posterior_predictive["y"][:, 0])
BAYESIAN DATA ANALYSIS IN PYTHON
Test error distribution
errors = []
for index, test_example in ads_test.iterrows():
error = posterior_predictive["y"][:, index] - test_example["num_clicks"]
errors.append(error)
error_distribution = np.array(errors).reshape(-1)
error_distribution.shape
(20000,)
pm.plot_posterior(error_distribution)
BAYESIAN DATA ANALYSIS IN PYTHON
Test error distribution
BAYESIAN DATA ANALYSIS IN PYTHON
Let's make
predictions!
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
How much is an
avocado?
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
Michal Oleszak
Machine Learning Engineer
The Avocado, Inc.
BAYESIAN DATA ANALYSIS IN PYTHON
Case study: estimating price elasticity
Goal: estimate price elasticity of avocados and optimize the price
(price elasticity = impact of the change in price on the sales volume)
1. Fit a Bayesian regression model.
2. Inspect the model to verify its correctness.
3. Predict sales volume for di erent prices.
4. Propose the pro t-maximizing price and the associated uncertainty.
BAYESIAN DATA ANALYSIS IN PYTHON
Avocado data
print(avocado)
date price volume type_organic
0 2015-01-04 0.95 313.242777 0
1 2015-01-11 1.01 290.635427 0
2 2015-01-18 1.03 290.434588 0
3 2015-01-25 1.04 284.703108 0
.. ... ... ... ...
334 2018-03-04 1.52 16.344308 1
335 2018-03-11 1.52 16.642349 1
336 2018-03-18 1.54 16.758042 1
337 2018-03-25 1.55 15.599672 1
1 Data source: h ps://www.kaggle.com/neuromusic/avocado-prices
BAYESIAN DATA ANALYSIS IN PYTHON
Priors in pymc3
formula = "num_bikes ~ temp + work_day + wind_speed"
with pm.Model() as model:
pm.GLM.from_formula(formula, data=bikes)
trace = pm.sample(draws=1000, tune=500)
BAYESIAN DATA ANALYSIS IN PYTHON
Priors in pymc3
formula = "num_bikes ~ temp + work_day + wind_speed"
with pm.Model() as model:
priors = {"wind_speed": pm.Normal.dist(mu=-5)}
pm.GLM.from_formula(formula, data=bikes, priors=priors)
trace = pm.sample(draws=1000, tune=500)
BAYESIAN DATA ANALYSIS IN PYTHON
Extracting draws from trace
temp_draws = trace.get_values("temp")
print(temp_draws)
array([6.8705346, 6.7421152, 6.7393061, ..., 5.966574 , 6.1274128, 6.7149277])
BAYESIAN DATA ANALYSIS IN PYTHON
What you will need
Model ing: Visualization:
pm.Model() pm.forestplot()
pm.GLM.from_formula() pm.traceplot()
pm.sample()
pm.Normal()
Making predictions: Inference:
pm.fast_sample_posterior_predictive() pm.hpd()
BAYESIAN DATA ANALYSIS IN PYTHON
Let's put what
you've learned to
practice!
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
Final remarks
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N
Michal Oleszak
Machine Learning Engineer
What you know
Chapter 1: The Bayesian Way Chapter 3: Bayesian Inference
Bayesian vs. frequentist approach A/B testing
Probability theory & distributions Decision analysis
Updating beliefs with more data Forecasting & regression
Chapter 2: Bayesian Estimation Chapter 4: Bayesian Linear Regression
Grid approximation Markov Chain Monte Carlo (MCMC)
Prior distributions Fi ing and interpreting models with pymc3
Reporting Bayesian results Bayesian data analysis: a case study
BAYESIAN DATA ANALYSIS IN PYTHON
More Bayes
Hierarchical models: PyMC3 docs:
h ps://pymc3.readthedocs.io/en/latest
y = β0 + β1 x1 + β2 x2
β2 = β20 + β21 x3
Think Bayes by Allen Downey
More regression (logistic, Poisson, ...)
h p://allendowney.github.io/ThinkBayes2
Bayesian machine learning
BAYESIAN DATA ANALYSIS IN PYTHON
Congratulations and
good luck!
B AY E S I A N D ATA A N A LY S I S I N P Y T H O N