
STAT 654

Chapter 3: Classification

Dr. Sharmistha Guha

Department of Statistics, Texas A&M University

Spring 2024

An Overview of Classification

The response variable is qualitative, e.g., y ∈ {0, 1}.

Major reasons not to perform classification using a regression method (studied earlier): (1) a regression method cannot naturally accommodate a qualitative response with more than two classes, and (2) a regression method will not provide meaningful estimates of Pr(Y|X), even with just two classes.
Consider Y = 1 if stroke; 2 if drug overdose; 3 if epileptic
seizure. This coding implies an ordering on the outcomes.
In practice there is no particular reason that this needs to be
the case.
One could choose an equally reasonable coding, Y = 1 if
epileptic seizure; 2 if stroke; 3 if drug overdose.
This would imply a totally different relationship among the
three conditions.

Default Data Illustration

Data to illustrate classification: Default data set (ISLR2).


Goal: Predict if an individual will default on credit card
payment, on the basis of (1) annual income and (2) monthly
credit card balance.

Individuals who defaulted in a given month are shown in orange, and those who did
not in blue. The ’Default’ variable is binary, i.e., defaulted or not.

Default Data Illustration - Regression vs. Classification

Classification using the Default data. Left: Estimated probability of default using linear
regression. Some estimated probabilities are negative! The orange ticks indicate the
0/1 values coded for default (No or Yes). Right: Predicted probabilities of default
using logistic regression. All probabilities lie between 0 and 1.

The Logistic Model
Model the relationship between p(X) = Pr(Y = 1|X) and X.
We model p(X) using a function that gives outputs between 0 and 1 for all values of X.
E.g., we use the logistic function (or sigmoid function) in logistic regression:

p(X) = e^(β0 + β1 X) / (1 + e^(β0 + β1 X))

[Figure: the logistic function, an S-shaped curve taking values between 0 and 1]
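As a quick illustration (not from the slides), a minimal Python sketch of the logistic function; the values β0 = 0 and β1 = 1 are arbitrary choices for display, not estimates from any data set.

```python
# Minimal sketch of the logistic (sigmoid) function; b0 = 0, b1 = 1 are
# arbitrary illustrative values, not estimates from any data.
import numpy as np

def logistic(x, b0=0.0, b1=1.0):
    # p(X) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x)); output always lies in (0, 1)
    return np.exp(b0 + b1 * x) / (1 + np.exp(b0 + b1 * x))

x = np.linspace(-6, 6, 5)
print(logistic(x))  # values rise monotonically from near 0 to near 1
```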

The Logistic Model

This implies p(X) / (1 − p(X)) = e^(β0 + β1 X).
The quantity p(X) / (1 − p(X)) is called the odds. It can take on any value between 0 and ∞.
Taking logs, log(p(X) / (1 − p(X))) = β0 + β1 X. The LHS is called the log odds or logit.
In a linear regression model, β1 gives the average change in Y associated with a 1-unit increase in X.
In a logistic regression model, increasing X by one unit changes the log odds by β1.
Because the relationship between p(X) and X is not a straight line, β1 does not correspond to the change in p(X) associated with a 1-unit increase in X.
The amount that p(X) changes due to a 1-unit change in X depends on the current value of X.

Estimating the Regression Coefficients
The coefficients β0 and β1 are unknown, and must be
estimated.
In linear regression, we used the least squares approach to
estimate the unknown linear regression coefficients.
Here the method of maximum likelihood is preferred due to
good statistical properties.
Consider the likelihood function:

L(β0, β1) = ∏_{i: yi = 1} p(xi) × ∏_{j: yj = 0} (1 − p(xj))

The estimates β̂0 and β̂1 are chosen to maximize this likelihood function.
The z-statistic for β1 is β̂1 / SE(β̂1), so a large (absolute) value of the z-statistic indicates evidence against the null hypothesis H0: β1 = 0.
H0 implies that p(X) = e^(β0) / (1 + e^(β0)), i.e., the probability of {Y = 1} does not depend on X.
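A minimal sketch (not part of the slides) of this maximum likelihood fit in Python with statsmodels, assuming a hypothetical Default.csv export of the ISLR2 Default data with columns 'default' (Yes/No) and 'balance':

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical CSV export of the ISLR2 Default data.
df = pd.read_csv("Default.csv")
df["default01"] = (df["default"] == "Yes").astype(int)  # code Yes/No as 1/0

logit_fit = smf.logit("default01 ~ balance", data=df).fit()  # MLE fit
print(logit_fit.summary())  # coefficients, standard errors, z-statistics
```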
Making Predictions

Think of the Default data.


Once the coefficients have been estimated, we can compute the
probability of default for any given credit card balance.

For the Default data, estimated coefficients of the logistic regression model that
predicts the probability of default using balance. A 1-unit increase in balance is
associated with an increase in the log odds of default by 0.0055 units.

Using the coefficient estimates given above, we predict that the default probability for an individual with a balance of $1000 is:

p̂(X) = e^(β̂0 + β̂1 X) / (1 + e^(β̂0 + β̂1 X)) = 0.00576 (plugging in the estimates and X = 1000).
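To check the arithmetic, a short sketch plugging the textbook estimates from ISLR (β̂0 = −10.6513, β̂1 = 0.0055) into the fitted logistic function:

```python
import numpy as np

# ISLR reports beta0_hat = -10.6513 and beta1_hat = 0.0055 for default ~ balance.
b0_hat, b1_hat = -10.6513, 0.0055
x = 1000
p_hat = np.exp(b0_hat + b1_hat * x) / (1 + np.exp(b0_hat + b1_hat * x))
print(round(p_hat, 5))  # 0.00576
```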

Qualitative Predictors
We can use qualitative predictors with the logistic regression
model using the dummy variable approach.
E.g., the Default dataset contains the qualitative variable student.
To use student status as a predictor variable, create a dummy
variable that takes value 1 for students, and 0 for non-students.

For the Default data, estimated coefficients of the logistic regression model that
predicts the probability of default using student status.

The coefficient associated with the dummy variable is positive and statistically significant (small p-value). This indicates that students tend to have higher default probabilities than non-students:

p̂(default = 1|student = 1) = e^(−3.5 + 0.405×1) / (1 + e^(−3.5 + 0.405×1)) = 0.0431.
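A sketch of the same fit with the dummy variable, continuing the earlier Default sketch and assuming a 'student' (Yes/No) column:

```python
# Continuing the Default sketch above; assumes a 'student' (Yes/No) column.
df["student01"] = (df["student"] == "Yes").astype(int)  # dummy variable

fit_stu = smf.logit("default01 ~ student01", data=df).fit()
print(fit_stu.params)  # positive coefficient on student01, as on the slide
```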
Multiple Logistic Regression

Predict a binary response using multiple predictors X1, ..., Xp.
We can generalize the logistic regression model as follows:

log(p(X) / (1 − p(X))) = β0 + β1 X1 + ... + βp Xp.

Hence, p(X) = e^(β0 + β1 X1 + ... + βp Xp) / (1 + e^(β0 + β1 X1 + ... + βp Xp)).
We use the maximum likelihood method to estimate β0, β1, ..., βp.
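Fitting with multiple predictors follows the same pattern; a sketch continuing the Default example, assuming an 'income' column is also present:

```python
# Continuing the Default sketches; assumes 'income' is present in the CSV.
fit_multi = smf.logit("default01 ~ balance + income + student01", data=df).fit()
print(fit_multi.summary())  # one estimate and z-statistic per predictor
```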

Multinomial Logistic Regression

Classify a response variable that has more than two classes.


E.g., we may have three categories of medical condition in the emergency room: stroke, drug overdose, epileptic seizure.
The logistic regression approach that we have seen only allows
for K = 2 classes for the response variable, e.g., Y ∈ {0, 1}.
It is possible to extend the two-class logistic regression
approach to the setting of K > 2 classes ⇒ Multinomial
logistic regression.
Select a single class to serve as the baseline. Without loss of generality, we select the Kth class as the baseline. Then the model is:

p(Y = k|X = x) = e^(βk0 + βk1 x1 + ... + βkp xp) / (1 + Σ_{l=1}^{K−1} e^(βl0 + βl1 x1 + ... + βlp xp)), for k = 1, ..., K − 1.

Multinomial Logistic Regression

For the baseline class,

p(Y = K|X = x) = 1 / (1 + Σ_{l=1}^{K−1} e^(βl0 + βl1 x1 + ... + βlp xp)).

We can show that for k = 1, ..., K − 1,

log(p(Y = k|X = x) / p(Y = K|X = x)) = βk0 + βk1 x1 + ... + βkp xp.
Once again, the log odds between any pair of classes is linear in
the features.
The decision to treat the Kth class as the baseline is unimportant.
E.g., when classifying emergency room visits into stroke, drug overdose, and epileptic seizure, suppose we fit two multinomial logistic regression models, treating (1) stroke and (2) drug overdose as the baseline, respectively.

Multinomial Logistic Regression

The coefficient estimates will differ between the two fitted models due to the differing choice of baseline.
The fitted values (predictions) and the log odds between any pair of classes will remain the same.
Be careful with interpretation of the coefficients in a
multinomial logistic regression model, since they are tied to the
choice of baseline!
E.g., setting epileptic seizure as the baseline, interpret β_stroke,0 as the log odds of stroke versus epileptic seizure, given that x1 = ... = xp = 0.
A 1-unit increase in Xj is associated with a β_stroke,j increase in the log odds of stroke over epileptic seizure, i.e., for a 1-unit increase in Xj, the ratio p(Y = stroke|X = x) / p(Y = epileptic seizure|X = x) is multiplied by e^(β_stroke,j).
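A sketch of a multinomial fit with scikit-learn on invented toy data; note that scikit-learn uses a symmetric softmax parameterization rather than a baseline class, but as noted above the fitted probabilities do not depend on that choice:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))                                # toy features
y = rng.choice(["stroke", "overdose", "seizure"], size=300)  # toy labels

clf = LogisticRegression().fit(X, y)   # softmax (multinomial) fit for K > 2
print(clf.predict_proba(X[:3]))        # each row sums to 1 over the 3 classes
```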

Probit Regression

Binary response y ∈ {0, 1}. Examples: Yes/No, Success/Failure, Disease/No Disease.
Vector of regressors X, influencing the outcome Y. We assume that the model takes the form

p(Y = 1|X) = Φ(X^T β).
Here Φ is the Cumulative Distribution Function (CDF) of the
standard normal distribution.
The parameters β are usually estimated by maximum likelihood.
We can motivate the probit model as a latent variable model.

Probit Regression

Think of a latent variable Y* as the underlying propensity that Y = 1. Note that Y* is unobserved.
E.g., for the binary variable Disease/No Disease, Y* is the propensity for Disease.
Consider Y* = X^T β + ε, where ε ∼ N(0, 1).
Then Y can be viewed as an indicator for whether this latent variable is positive: Y = 1 if Y* > 0, and Y = 0 if Y* ≤ 0.
To see that the two formulations are equivalent, note that

p(Y = 1|X) = p(Y* > 0) = p(X^T β + ε > 0) = p(ε > −X^T β) = p(ε < X^T β) = Φ(X^T β),

where the second-to-last equality uses the symmetry of the standard normal distribution.
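A sketch simulating this latent-variable story and fitting a probit model with statsmodels; the coefficients and data below are invented for illustration:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = sm.add_constant(rng.normal(size=(500, 2)))   # intercept + two regressors
beta = np.array([0.5, 1.0, -1.0])                # invented 'true' coefficients
y_star = X @ beta + rng.normal(size=500)         # latent propensity Y*
y = (y_star > 0).astype(int)                     # observed Y = 1{Y* > 0}

probit_fit = sm.Probit(y, X).fit()
print(probit_fit.params)                         # estimates near beta
```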

Maximum Likelihood Estimation

Suppose the data set {yi, xi}, i = 1, ..., n, contains n independent observations.
For a single observation, we have p(yi = 1|xi) = Φ(xi^T β), and p(yi = 0|xi) = 1 − Φ(xi^T β).
The likelihood of a single observation {yi, xi} is

L(β; yi, xi) = Φ(xi^T β)^yi [1 − Φ(xi^T β)]^(1−yi).

The observations are independent, so the likelihood of the entire sample (joint likelihood) is

L(β; Y, X) = ∏_{i=1}^n Φ(xi^T β)^yi [1 − Φ(xi^T β)]^(1−yi).
We can take the joint log likelihood, and maximize w.r.t. β .
Thus we obtain the estimator β̂ , which has desirable theoretical
properties.
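As a check on the statsmodels fit, a sketch maximizing this log likelihood directly with scipy, reusing y and X from the probit sketch above:

```python
from scipy.optimize import minimize
from scipy.stats import norm
import numpy as np

def neg_log_lik(beta, y, X):
    eta = X @ beta
    # log L(β) = Σ yi log Φ(xiᵀβ) + (1 − yi) log(1 − Φ(xiᵀβ));
    # norm.logcdf(-eta) equals log(1 − Φ(eta)) by symmetry.
    return -np.sum(y * norm.logcdf(eta) + (1 - y) * norm.logcdf(-eta))

res = minimize(neg_log_lik, x0=np.zeros(X.shape[1]), args=(y, X))
print(res.x)  # agrees with probit_fit.params up to optimizer tolerance
```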

Generalized Linear Models
So far, we have covered linear regression (for a continuous response) and logistic and probit regression (for a binary response).
All of these are generalized linear models, with different link functions.
Now we will consider count data.
Consider the Bikeshare dataset in ISLR2.
The response is ‘bikers’, the number of hourly users of a bike
sharing program in Washington, DC.
Consider predicting bikers using mnth (month of the year), hr
(hour of the day, from 0 to 23), workingday (an indicator
variable that equals 1 if it is neither a weekend nor a holiday),
temp (the normalized temperature, in Celsius), and weathersit
(a qualitative variable that takes on one of four possible values:
clear; misty or cloudy; light rain or light snow; or heavy rain or heavy snow).
Treat mnth, hr, and weathersit as qualitative variables.
Regression with Count Data

Consider using linear regression for data with a count response.
Some obvious issues appear:
Some fitted values (predictions) may be negative, which calls into question our ability to make meaningful predictions on these data.
It also raises concerns about the accuracy of the coefficient estimates, confidence intervals, and other outputs of the regression model.
Heteroscedasticity may be observed, which questions the suitability of a linear regression model.
While the response is integer-valued, the response in a linear model is necessarily continuous-valued. So a linear regression model is not entirely satisfactory for this dataset.

Poisson Regression

To overcome the inadequacies of linear regression for count data, use Poisson regression.
Recall the Poisson distribution: suppose a random variable Y takes on nonnegative integer values, i.e., Y ∈ {0, 1, 2, ...}.
If Y follows the Poisson distribution, then p(Y = k) = e^(−λ) λ^k / k! for k = 0, 1, 2, ....
Here, λ > 0 is E(Y). Also, λ = V(Y).
Thus if Y follows a Poisson distribution, the larger E(Y) is, the larger V(Y) is.
The Poisson distribution is typically used to model counts,
since counts, like the Poisson distribution, take on nonnegative
integer values.
For regression, rather than a fixed λ, we would like to allow the
mean to vary as a function of the covariates.

Poisson Regression

Consider the model:

log(λ(X1, ..., Xp)) = β0 + β1 X1 + ... + βp Xp,

or equivalently, λ(X1, ..., Xp) = e^(β0 + β1 X1 + ... + βp Xp).
To estimate the coefficients β0, β1, ..., βp, we use the same maximum likelihood approach that we adopted for logistic regression.
Specifically, given n independent observations from the Poisson regression model, the likelihood takes the form

L(β0, β1, ..., βp) = ∏_{i=1}^n e^(−λ(xi)) λ(xi)^yi / yi!, where λ(xi) = e^(β0 + β1 xi1 + ... + βp xip).

We estimate the coefficients that maximize the likelihood L(β0, β1, ..., βp), i.e., that make the observed data as likely as possible.
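A sketch of this fit with statsmodels, assuming a hypothetical Bikeshare.csv export of the ISLR2 data with the columns named earlier (bikers, mnth, hr, workingday, temp, weathersit):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical CSV export of the ISLR2 Bikeshare data.
bike = pd.read_csv("Bikeshare.csv")

# C(...) treats mnth, hr, and weathersit as qualitative, as noted earlier.
pois_fit = smf.poisson(
    "bikers ~ C(mnth) + C(hr) + workingday + temp + C(weathersit)",
    data=bike,
).fit()
print(pois_fit.params.head())  # each coefficient βj scales E(bikers) by e^βj
```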

Poisson Regression

Interpretation: To interpret the coefficients in the Poisson regression, note that an increase in Xj by 1 unit is associated with a change in E(Y) = λ by a factor of e^(βj).
Mean-variance relationship: Under the Poisson model, λ = E(Y) = V(Y).
Nonnegative fitted values: There are no negative predictions using the Poisson regression model.

Generalized Linear Models: Closing Remarks

We have now discussed the following regression models: linear, logistic, probit, and Poisson.
All approaches use predictors X1, ..., Xp to predict a response Y. Conditional on X1, ..., Xp, Y belongs to a certain family of distributions.
Linear regression: Y ∼ Normal and E(Y | X1, ..., Xp) = β0 + β1 X1 + ... + βp Xp.
Logistic regression: Y ∼ Bernoulli and E(Y | X1, ..., Xp) = p(Y = 1 | X1, ..., Xp) = e^(β0 + β1 X1 + ... + βp Xp) / (1 + e^(β0 + β1 X1 + ... + βp Xp)).

Generalized Linear Models: Closing Remarks

Poisson regression: Y ∼ Poisson and E(Y | X1, ..., Xp) = λ(X1, ..., Xp) = e^(β0 + β1 X1 + ... + βp Xp).
These equations can be expressed using a link function η.
The link function applies a transformation to E(Y | X1, ..., Xp) so that the transformed mean is a linear function of the predictors, i.e., η(E(Y | X1, ..., Xp)) = β0 + β1 X1 + ... + βp Xp.
The link functions are the following:
linear: η(µ) = µ,
logistic: η(µ) = log(µ / (1 − µ)), and
Poisson: η(µ) = log(µ).
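A sketch expressing all three models as GLMs in statsmodels, where the family fixes the response distribution and its default link; the toy responses below are simulated for illustration:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = sm.add_constant(rng.normal(size=(200, 2)))
eta = X @ np.array([0.2, 0.8, -0.5])              # linear predictor η

y_cont = eta + rng.normal(size=200)               # Normal response
y_bin = rng.binomial(1, 1 / (1 + np.exp(-eta)))   # Bernoulli response
y_cnt = rng.poisson(np.exp(eta))                  # Poisson response

for y, fam in [(y_cont, sm.families.Gaussian()),  # identity link: η(µ) = µ
               (y_bin, sm.families.Binomial()),   # logit link: log(µ/(1 − µ))
               (y_cnt, sm.families.Poisson())]:   # log link: log(µ)
    print(sm.GLM(y, X, family=fam).fit().params)
```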

Receiver Operating Characteristic (ROC) curve

The ROC curve is a graphical plot that illustrates the diagnostic ability of a binary classifier as its discrimination threshold is varied.
The method was developed for operators of military radar receivers, hence the name.
The ROC curve is created by plotting the true positive rate
(TPR) against the false positive rate (FPR) at various
thresholds.
The true-positive rate is also known as sensitivity or recall
(probability of detection).
The false-positive rate is also known as the probability of false alarm, and equals 1 − specificity.
The performance of a classifier is given by the area under the
(ROC) curve (AUC).
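A sketch tracing an ROC curve and computing the AUC with scikit-learn; the labels and scores below are invented toy data:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(3)
y_true = rng.binomial(1, 0.3, size=200)                             # toy 0/1 labels
scores = np.clip(0.6 * y_true + 0.5 * rng.uniform(size=200), 0, 1)  # toy scores

fpr, tpr, thresholds = roc_curve(y_true, scores)  # one (FPR, TPR) per threshold
print(roc_auc_score(y_true, scores))              # AUC; 0.5 means chance level
```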

Receiver Operating Characteristic (ROC) curve

An ideal ROC curve will be very close to the top left corner, so
the larger the AUC the better the classifier.
We expect a classifier that performs no better than chance to
have an AUC of 0.5.

The ideal ROC curve is close to the top left corner, indicating a high AUC. The dotted line represents the “no information” classifier.

Chapter Reference

An Introduction to Statistical Learning by G. James, D. Witten, T. Hastie, and R. Tibshirani.

