Hockey analytics

Finding good players using variable selection

Mladen Kolar (mkolar@chicagobooth.edu)

We are going to investigate data on all of the goals in the 2002–2014 seasons of the National Hockey League
(NHL).
• For more details, see Robert Gramacy, Matt Taddy, and Sen Tian. Hockey performance via regression. Handbook
of Statistical Methods for Design and Analysis in Sports, 2015.
The data is available in the gamlr package, a competitor of glmnet.
library(gamlr)
data(hockey)

• The data was scraped from NHL.com using an R package called nhlscrapr, spanning 11 seasons:
2002-03 through 2013–14, with playoffs.
• It includes other info we’re not going to use (shots, blocked shots, penalties, etc.)
Hockey is like soccer, but on ice, 6-on-6 and with rapid substitution.

Quantifying player performance in hockey is hard:

• play is continuous,
• goals are infrequent,
• the number of possible player configurations is combinatorially huge.
One popular metric of individual player performance is plus–minus (PM):
• the number of goals scored by the player’s team,
• minus the number scored by the opposing team,
while that player is on the ice.
Plus–minus is better than just goals, because it distributes the credit and blame.

The limits of this approach are obvious: there is no accounting for teammates or opponents.
In hockey, where players tend to be grouped together on “lines” and coaches will “line match” against
opponents, a player’s PM can be artificially inflated or deflated by the play of his opponents and peers.

In summary, two disadvantages to plus–minus:

• It is a marginal effect: it averages over the situations (teammates, opponents) in which the player appears.
• It doesn’t control for sample size.

A better measure of performance would be a partial effect, having controlled for the effect of
• teammates,
• opponents, etc.
An appealing aspect of such an analysis is that it requires no extra data beyond that used to calculate
plus–minus,
• just a (much) more involved calculation.

We will build a better performance metric with regression.

The setup
head(goal)

##   homegoal   season team.away team.home period differential playoffs
## 1        0 20022003       DAL       EDM      1            0        0
## 2        0 20022003       DAL       EDM      1           -1        0
## 3        1 20022003       DAL       EDM      2           -2        0
## 4        0 20022003       DAL       EDM      2           -1        0
## 5        1 20022003       DAL       EDM      3           -2        0
## 6        1 20022003       DAL       EDM      3           -1        0
Given n goals throughout the National Hockey League (NHL) over some specified time period, let

$$y_i = \begin{cases} +1 & \text{for a goal by the home team,} \\ -1 & \text{for a goal by the away team.} \end{cases}$$

Then, say that

$$q_i = P(Y_i = 1) = P(\text{home team scored goal } i).$$

• Home and away are merely organizational devices, creating a consistent binary bifurcation for goals
that can be applied across games, seasons, etc.
• Due to the symmetry in the logit transformation, player effects are unchanged when framing away team
probabilities as $q_i$ rather than $(1 - q_i)$, so we lose no generality by “privileging” home team goals in
this way.
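The symmetry claim can be checked numerically. Below is a minimal sketch in base R using the built-in logit function qlogis(). The probabilities are arbitrary illustrative values, not estimates from the hockey data.

```r
# Illustrative home-goal probabilities (made up for this check)
q <- c(0.30, 0.50, 0.52, 0.75)

# Log odds of a home goal and of an away goal
logit_home <- qlogis(q)       # log(q / (1 - q))
logit_away <- qlogis(1 - q)   # log((1 - q) / q)

# Swapping home and away only flips the sign of the log odds
symmetric <- isTRUE(all.equal(logit_away, -logit_home))
print(symmetric)
```

Because only the sign changes, relabeling which team counts as "home" flips the sign of every term in the linear predictor and leaves the fitted player effects as they were.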
Player model
The simplest version of a model for partial player effects is the so-called player model, where
• the log odds that the home team has scored a given goal, i, becomes

$$\log\frac{q_i}{1 - q_i} = \alpha + \beta_{h_{i1}} + \cdots + \beta_{h_{i6}} - \beta_{a_{i1}} - \cdots - \beta_{a_{i6}},$$

where
• $h_{i1}, \ldots, h_{i6}$ are the home team’s players (i.e., player indicators), and
• $a_{i1}, \ldots, a_{i6}$ are the away team’s players.
The coefficients $\beta_\ast$ are our partial player effects!
• What does α represent?
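To make the player model concrete, here is a small worked sketch in base R. Everything in it is hypothetical: the player labels A to D, the effect sizes, and the intercept are invented for illustration, and only two skaters per side are used instead of six to keep it short.

```r
# Hypothetical intercept and player effects (not estimates from the data)
alpha <- 0.08
beta <- c(A = 0.40, B = 0.00, C = -0.20, D = 0.10)

# One goal: players A and B are on the ice for the home team,
# players C and D for the away team.
home <- c("A", "B")
away <- c("C", "D")

# Log odds: alpha + sum of home effects - sum of away effects
log_odds <- alpha + sum(beta[home]) - sum(beta[away])

# Probability that the home team scored this goal
q <- plogis(log_odds)
print(c(log_odds = log_odds, q = q))
```

A strong home player and a weak away player both push the log odds up, which is exactly the sign convention in the equation above.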
The data
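How do we set up the data so that it is faithful to this format, in a logistic regression setup? Each goal gets one row, with +1 in the columns of the home players on the ice, −1 in the columns of the away players on the ice, and zeros elsewhere.

• Notice that the design matrix $X_P$ is sparse.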

• Sparse matrix libraries can ease storage and computational burden.

player[1:3, 2:7]

## 3 x 6 sparse Matrix of class "dgCMatrix"

## ERIC_BREWER ANSON_CARTER JASON_CHIMERA MIKE_COMRIE ULF_DAHLEN ROB_DIMAIO
## [1,] 1 . 1 . . -1
## [2,] . 1 . 1 -1 .
## [3,] . 1 . 1 . -1
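The signed-indicator layout shown above can be rebuilt by hand for a toy case. This is a sketch under stated assumptions: the six player labels and the two goals are hypothetical, and a dense base-R matrix stands in for the sparse dgCMatrix used by the real data.

```r
# Hypothetical players and two goals (illustration only)
players <- c("A", "B", "C", "D", "E", "F")
on_ice <- list(
  list(home = c("A", "B"), away = c("D", "E")),
  list(home = c("B", "C"), away = c("E", "F"))
)

# One row per goal, one column per player, all zeros to start
X_toy <- matrix(0, nrow = length(on_ice), ncol = length(players),
                dimnames = list(NULL, players))

for (i in seq_along(on_ice)) {
  X_toy[i, on_ice[[i]]$home] <- 1    # home players on the ice
  X_toy[i, on_ice[[i]]$away] <- -1   # away players on the ice
}

print(X_toy)

# Share of nonzero entries in the toy matrix
print(mean(X_toy != 0))
```

In the 6-on-6 player model each row has at most twelve nonzero entries however many players are in the league, which is why sparse storage pays off.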

Getting fancier
Beyond controlling for the effect of who else is on the ice, we also want to control for things unrelated to
player ability.
Embellishments abound. You can add:
• Player–season indicators;
• Team or team–season indicators;
• Special teams indicators (6v5, 6v4, . . . , pulled goalie, etc.);
• Special situations: overtime, playoffs, exhibition, etc.
But the idea is the same:
• These are just indicator variables,
• and it’s all just a big logistic regression.

gamlr package
We will use the gamlr package to fit the model.
• gamlr prefers a so-called log-gamma penalty, pen(β) = log(1 + |β|).

It works very similarly to glmnet, but there are some small differences. E.g., it
• supports selection via information criteria, in addition to CV;
• allows some coefficients to undergo different (e.g., ridge/no) penalization.
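To see how the log penalty written above differs from the lasso's absolute-value penalty, the two can be tabulated side by side. This is only a sketch of the formula as stated in these notes, with illustrative coefficient values. It does not call gamlr and makes no claim about the package's internal parameterization.

```r
# Log penalty as written in the notes, and the lasso (absolute value) penalty
pen_log   <- function(beta) log(1 + abs(beta))
pen_lasso <- function(beta) abs(beta)

beta <- c(0, 0.1, 0.5, 1, 2, 5)
print(round(cbind(beta, log = pen_log(beta), lasso = pen_lasso(beta)), 3))

# Extra cost of increasing |beta| by 1, starting from each value
marginal_log   <- pen_log(beta + 1) - pen_log(beta)
marginal_lasso <- pen_lasso(beta + 1) - pen_lasso(beta)
print(round(cbind(beta, marginal_log, marginal_lasso), 3))
```

The marginal cost of the log penalty falls as the coefficient grows, while the lasso's stays constant, so large effects are penalized relatively less under the log penalty.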

Differential penalization

This ability to differentially penalize could be advantageous.

If (large) player partial effects (assets and liabilities) are the main interest,
• i.e., large non-zero coefficients, separating the wheat from the chaff,
then it makes sense to “select” players, but be more lenient to special teams, etc.

Ok, we’re sold.

Assembling the design

Let’s throw everything together:
• config with special teams, playoff indicators, etc.
• team with team indicators, and
• player with player indicators
These are sparse matrices, so we need a column bind that understands them. Since R 3.2.0, base cbind() works on Matrix objects; the older cBind() from the Matrix package is deprecated.
X <- cbind(config, team, player)
y <- goal$homegoal
dim(X)

## [1] 69449 2776

• Woah! That’s quite big.

gamlr call
nhlreg <- gamlr(X, y, free=1:(ncol(config)+ncol(team)),
family="binomial", standardize=FALSE)

• free denotes unpenalized columns. These are columns of the design matrix that we do not want
penalized. We are using it to keep the special-teams and team-season variables unpenalized—we know
that we want them in the model, and so we let them enter without restriction.
• We use standardize=FALSE because the columns are already indicators. This is one of the special
cases where all of our penalized variables are on the same scale (player presence or absence). Without
standardize=FALSE, we would be multiplying the penalty for each coefficient (player effect) by that
player’s standard deviation in the player matrix. The players with big standard deviation are guys who
play a lot. Players with small standard deviation are those who play little (almost all zeros). Hence,
weighting penalty by standard deviations in this case is exactly what we do not want: a bigger penalty
for people with many minutes on ice, a smaller penalty for those who seldom play. Indeed, running the
regression without standardize=FALSE leads to a bunch of farm-team players coming up on top.
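The standard-deviation argument can be illustrated with simulated columns. This is a sketch with made-up ice-time shares: a hypothetical regular who is on the ice for roughly 20% of goals and a hypothetical call-up who is on for roughly 0.2%.

```r
set.seed(1)
n_goals <- 10000

# Hypothetical signed indicator columns: +1 home, -1 away, 0 not on the ice
regular <- sample(c(-1, 0, 1), n_goals, replace = TRUE,
                  prob = c(0.10, 0.80, 0.10))
call_up <- sample(c(-1, 0, 1), n_goals, replace = TRUE,
                  prob = c(0.001, 0.998, 0.001))

# Column standard deviations, which standardization would use as weights
sds <- c(regular = sd(regular), call_up = sd(call_up))
print(round(sds, 3))
```

The regular's column has a much larger standard deviation, so weighting the penalty by it would shrink the regular's coefficient harder than the call-up's. That is the behavior standardize=FALSE avoids.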
Now how are we going to look at this output, with nearly 3000 coefficients? Patiently.

Start with α̂, the home-ice advantage (ignoring everything else).

exp(coef(nhlreg)[1])

## [1] 1.08
• Home ice multiplies the odds that the home team has scored by about 1.08, an 8% increase. Ignoring the
other covariates, the odds that any given goal was scored by the home team are about 8% better than even,
a probability of about 0.52. That is a big home-ice advantage!
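Odds and probabilities are easy to confuse, so it may help to convert the reported multiplier explicitly. This is a minimal sketch that uses only the rounded value 1.08 printed above.

```r
# Odds multiplier reported above for the intercept
odds_home <- 1.08

# Convert odds to a probability: p = odds / (1 + odds)
p_home <- odds_home / (1 + odds_home)
print(round(p_home, 3))
```

An 8% increase in odds over even therefore moves the probability of a home goal only a couple of points above one half.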

(De-) Selected players

Let’s extract the coefficients.
Baicc <- coef(nhlreg)[colnames(player),]

By default, the reported coefficients are from the best model by AICc,
• a “corrected” AIC criterion.

How many are non-zero?

c(nonzero=sum(Baicc != 0), prop=mean(Baicc != 0),
assets=mean(Baicc > 0), liabilities=mean(Baicc < 0))

##     nonzero        prop      assets liabilities
##     646.000       0.265       0.160       0.105
• Roughly three quarters of the league (73.5%) is “average”, with a coefficient of exactly zero.
• 16% are assets, about 10% are liabilities.

Top/bottom ten
Here are the top ten players. They are almost all recognizable stars.
Baicc[order(Baicc, decreasing=TRUE)[1:10]]

## PETER_FORSBERG TYLER_TOFFOLI ONDREJ_PALAT ZIGMUND_PALFFY SIDNEY_CROSBY JOE_THORNTON

## 0.755 0.629 0.628 0.443 0.413 0.384
## PAVEL_DATSYUK LOGAN_COUTURE ERIC_FEHR MARTIN_GELINAS
## 0.376 0.368 0.368 0.358
And the bottom ten. They are not those with little ice time, but rather those with much ice time who
underperform.
Baicc[order(Baicc)[1:10]]

## TIM_TAYLOR JOHN_MCCARTHY P. J._AXELSSON NICLAS_HAVELID THOMAS_POCK MATHIEU_BIRON

## -0.864 -0.565 -0.428 -0.385 -0.384 -0.351
## CHRIS_DINGMAN DARROLL_POWE RAITIS_IVANANS RYAN_HOLLWEG
## -0.334 -0.334 -0.313 -0.299
Let us compare to what would happen if we run the regression without standardize=FALSE.
nhlreg.std <- gamlr(X, y, free=1:(ncol(config)+ncol(team)),
family="binomial")
Baicc.std <- coef(nhlreg.std)[colnames(player),]
Baicc.std[order(Baicc.std, decreasing=TRUE)[1:10]]

## JEFF_TOMS RYAN_KRAFT COLE_JARRETT TOMAS_POPPERLE DAVID_LIFFITON

## 1.738 1.483 1.212 1.111 1.097
## ALEXEY_MARCHENKO ERIC_SELLECK MIKE_MURPHY DAVID_GOVE TOMAS_KANA
## 1.030 1.006 0.960 0.926 0.879

Contribution to goal for/against

Whenever a goal is scored,
• Pittsburgh’s odds of having scored (rather than being scored on) increase by 51% if Sidney Crosby is on
the ice;
exp(Baicc["SIDNEY_CROSBY"])

## SIDNEY_CROSBY
## 1.51

• and the Blue Jackets’ (or Kings’, pre 2011-12) odds of having scored drop by 22% if Jack Johnson is on
the ice.
exp(Baicc["JACK_JOHNSON"])

## JACK_JOHNSON
## 0.781
(Remember, the data is a little old.)
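The percentages quoted above come from exponentiating the coefficients. A short sketch of that conversion follows. The helper pct_change_odds() is a name introduced here for illustration, and the inputs are the rounded values printed earlier in these notes.

```r
# Percent change in the odds implied by a log-odds coefficient
pct_change_odds <- function(beta) 100 * (exp(beta) - 1)

# Coefficient reported for SIDNEY_CROSBY in the top-ten table
print(round(pct_change_odds(0.413), 1))

# The multiplier reported for JACK_JOHNSON, converted back to a coefficient
print(round(log(0.781), 3))

# A de-selected player (coefficient zero) leaves the odds unchanged
print(pct_change_odds(0))
```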

Cross-validation
Cross-validation results instead?
cv.nhlreg <- cv.gamlr(X, y, free=1:(ncol(config)+ncol(team)), family="binomial",
standardize=FALSE)
cv.nhlreg

##
## 5-fold binomial cv.gamlr object
The cv.gamlr object stores a gamlr object (the full data path fit) as one of its entries, and you can plot
both the regularization paths and the CV experiment.
par(mfrow=c(1,2)); plot(cv.nhlreg); plot(cv.nhlreg$gamlr)

[Figure: two panels. Left: cross-validated binomial deviance versus log lambda. Right: coefficient regularization paths versus log lambda.]

Let us look at log(λ̂) under various criteria.
c(AICc=as.numeric(log(nhlreg$lambda[which.min(AICc(nhlreg))])),
AIC=as.numeric(log(nhlreg$lambda[which.min(AIC(nhlreg))])),
BIC=as.numeric(log(nhlreg$lambda[which.min(BIC(nhlreg))])),
CVmin=log(cv.nhlreg$lambda.min), CV1se=log(cv.nhlreg$lambda.1se))

##  AICc   AIC   BIC CVmin CV1se
## -9.17 -9.17 -6.65 -9.03 -8.51
Let’s compare de-selection to what we got with AICc.
Bcvmin <- coef(cv.nhlreg, select="min")[colnames(player),]
Bcv1se <- coef(cv.nhlreg)[colnames(player),]
Bbic <- coef(nhlreg,select=which.min(BIC(nhlreg)))[colnames(player),]
c(AICc=sum(Baicc!=0), CVmin=sum(Bcvmin!=0), CV1se=sum(Bcv1se!=0), BIC=sum(Bbic!=0))

## AICc CVmin CV1se BIC

## 646 569 341 0
• Woah! BIC way over-penalizes.
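One way to see why BIC is so much harsher here: in their textbook forms, AIC charges 2 per degree of freedom while BIC charges log(n). The sketch below plugs in the number of goals from dim(X) above. It uses only those textbook penalty terms and does not reproduce gamlr's own AIC() or BIC() computations.

```r
# Number of observations (goals), taken from dim(X) above
n <- 69449

# Cost added to the deviance for each additional nonzero coefficient
penalty_per_df <- c(AIC = 2, BIC = log(n))
print(round(penalty_per_df, 2))
```

With this many goals, each extra player coefficient has to reduce the deviance more than five times as much under BIC as under AIC to be worth keeping.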

Partial plus-minus
Consider the situation where you have no information beyond the fact that player “k” is on the ice.
• All other coefficients are effectively zero.
In isolation, player k’s effect is the number of goals he was on the ice for, $N_k$, times

$$P_k - (1 - P_k) = P(\text{scored}) - P(\text{scored on}).$$

• I.e., his expected “goals for” in isolation is $N_k P_k$,
• and his expected “goals against” in isolation is $N_k (1 - P_k)$.
So a partial plus-minus (PPM) could be defined as

$$\mathrm{PPM}_k = N_k P_k - N_k (1 - P_k) = N_k (2 P_k - 1),$$

• which will be on the same scale as plus-minus (PM),
• and that could help if you’re not good at thinking about “log odds”.
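A small sketch of the PPM formula helps show two of its properties. The function ppm() and the inputs (an effect of 0.4 and 1000 goals on the ice) are hypothetical illustrations, not values from the fitted model.

```r
# Partial plus-minus from a log-odds effect beta and a goal count N
ppm <- function(beta, N) {
  P <- plogis(beta)      # exp(beta) / (1 + exp(beta))
  N * (2 * P - 1)
}

# A de-selected player (beta = 0) has P = 1/2 and therefore PPM = 0
print(ppm(0, N = 1000))

# Hypothetical player: effect 0.4, on the ice for 1000 goals
print(round(ppm(0.4, N = 1000), 1))

# 2P - 1 equals tanh(beta / 2), so |PPM| can never exceed N
identity_holds <- isTRUE(all.equal(2 * plogis(0.4) - 1, tanh(0.2)))
print(identity_holds)
```

So every player whose coefficient was shrunk to zero gets a PPM of exactly zero, and a large PPM needs both a sizeable effect and a lot of goals on the ice.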

PPM calculation
Calculating PPM, and showing first 20.
P <- exp(Baicc)/(1+exp(Baicc))
N <- colSums(abs(player))
PPM <- N*(2*P-1)
sort(PPM, decreasing=TRUE)[1:20]

## JOE_THORNTON PAVEL_DATSYUK SIDNEY_CROSBY ALEX_OVECHKIN HENRIK_LUNDQVIST

## 330 321 319 255 252
## HENRIK_SEDIN MARIAN_HOSSA NICKLAS_LIDSTROM DANIEL_ALFREDSSON ANDREI_MARKOV
## 237 230 224 216 213
## MIIKKA_KIPRUSOFF MARIAN_GABORIK ALEXANDER_SEMIN CHRIS_PRONGER HENRIK_ZETTERBERG
## 209 203 200 197 193
## PETER_FORSBERG JONATHAN_TOEWS TEEMU_SELANNE LUBOMIR_VISNOVSKY RYAN_GETZLAF
## 192 182 182 180 179

PM comparison
Calculating PM for comparison, and showing first 20.
• +1 for a goal by your team, −1 for a goal against.
# c(-1,1)[y+1] maps y = 0 (away goal) to -1 and y = 1 (home goal) to +1
PM <- colSums(player*c(-1,1)[y+1])
names(PM) <- colnames(player)
sort(PM, decreasing=TRUE)[1:20]

## PAVEL_DATSYUK SIDNEY_CROSBY HENRIK_SEDIN ALEX_OVECHKIN DANIEL_SEDIN

## 599 544 542 533 520
## JOE_THORNTON NICKLAS_LIDSTROM EVGENI_MALKIN HENRIK_ZETTERBERG DANIEL_ALFREDSSON
## 510 500 473 471 470
## TOMAS_HOLMSTROM MARIAN_HOSSA DANY_HEATLEY CHRIS_KUNITZ JAROME_IGINLA
## 451 448 436 426 425
## JASON_SPEZZA TEEMU_SELANNE JAROMIR_JAGR RYAN_GETZLAF DANIEL_BRIERE
## 399 397 387 379 366
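The one-line PM computation above leans on R's recycling rules, so a toy check may be useful. This is a sketch with three hypothetical players and three hypothetical goals, and a dense matrix in place of the sparse player matrix.

```r
# Hypothetical signed indicators: rows are goals, columns are players
player_toy <- matrix(c( 1, -1,  0,
                        1,  0, -1,
                       -1,  1,  0),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(NULL, c("A", "B", "C")))
y_toy <- c(1, 0, 1)   # 1 = home goal, 0 = away goal

# Map the 0/1 outcome to -1/+1; recycling multiplies row i by sign i
sign_toy <- c(-1, 1)[y_toy + 1]
PM_toy <- colSums(player_toy * sign_toy)
print(PM_toy)
```

Player A is on the ice for one goal for and two against, B for one of each, and C for one goal for, which matches the column sums.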
bigs <- which(abs(PM)>200|abs(PPM)>200)
plot(PM[bigs],PPM[bigs],type="n", xlim=range(PM)*1.05, xlab="PM", ylab="PPM")
text(PM[bigs],PPM[bigs],labels=colnames(player)[bigs], cex=0.75); abline(a=0,b=1)

[Figure: scatter plot of PPM (vertical axis) against PM (horizontal axis) for players with |PM| > 200 or |PPM| > 200, each point labeled with the player’s name, with the line PPM = PM drawn for reference.]

If you’re interested . . .
If you want to read more, check out
• Original paper in the Journal of Quantitative Analysis in Sports.
– arXiv version
– Describes optimal line formation, and cost-benefit analysis via salary.
• Book chapter in the Handbook of Statistical Methods and Analyses in Sports.
– arXiv version

Not sure if you’ll see PPM on ESPN any time soon.
