
Chapter 3

The document discusses logistic regression, a key model for categorical response data, highlighting its applications in fields such as biomedical studies, social science, and marketing. It explains the logistic regression model's structure, its interpretation using odds ratios, and provides an example using horseshoe crab data to illustrate the model's application and significance testing. Additionally, it includes a step-by-step guide on how to perform binary logistic regression using SPSS.


Logistic Regression

In introducing generalized linear models for binary data in Chapter 2, we highlighted logistic regression. It is the most important model for categorical response data and is used in an increasingly wide variety of applications. Early uses were in biomedical studies, but the past 20 years have also seen heavy use in social science research and marketing. Recently, logistic regression has become a popular tool in business applications.

Some credit-scoring applications use logistic regression to model the probability that a subject is creditworthy. For instance, a model for the probability that a subject pays a bill on time may use predictors such as the size of the bill, annual income, occupation, mortgage and debt obligations, percentage of bills paid on time in the past, and other aspects of the applicant's credit history. A company that relies on catalog sales may decide whether to send a catalog to a potential customer by modeling the probability of a sale as a function of indices of past buying behavior. In this chapter we study logistic regression more closely.

3.1 THE LOGISTIC REGRESSION MODEL

To begin, suppose there is a single explanatory variable X, which is quantitative. For a binary response
variable Y, recall that π(x) denotes the “success” probability at value x. This probability is the parameter
for the binomial distribution. The logistic regression model has linear form for the logit of this
probability,

logit[π(x)] = α + βx … … … … (1)

The formula implies that π(x) increases or decreases as an S-shaped function of x. The logistic regression
formula implies the following formula for the probability π(x),

π(x) = exp(α + βx) / [1 + exp(α + βx)] … … … … (2)
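To make formula (2) concrete, here is a minimal Python sketch of the logistic curve; the parameter values plugged in are the horseshoe crab estimates reported later in this chapter, used here only for illustration:

```python
import math

def logistic_prob(x, alpha, beta):
    """pi(x) = exp(alpha + beta*x) / (1 + exp(alpha + beta*x)), formula (2)."""
    e = math.exp(alpha + beta * x)
    return e / (1 + e)

# Fitted values for the horseshoe crab data (estimated later in the chapter)
alpha, beta = -12.3508, 0.4972

# With beta > 0 the curve is an increasing S-shape bounded between 0 and 1
probs = [logistic_prob(w, alpha, beta) for w in range(18, 36)]
```

Evaluating the function over a grid of widths confirms the S-shape: every probability lies strictly between 0 and 1, the curve is monotone increasing for β > 0, and it passes through 0.50 exactly at x = −α/β.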

3.2 Horseshoe Crabs: Viewing and Smoothing a Binary Outcome

To illustrate these interpretations, we analyze the horseshoe crab data introduced in Table 3.1. The table
comes from a study of nesting horseshoe crabs (J. Brockmann, Ethology, 102: 1–21, 1996). Each female
horseshoe crab in the study had a male crab in her nest. The study investigated factors that affect whether
the female crab had any other males, called satellites, residing near her. The response outcome for each
female crab is her number of satellites. Explanatory variables are the female crab’s color, spine condition,
weight and female crab’s shell or carapace width (in centimeters). For now, we use female crab’s shell
width alone as a predictor/explanatory variable. In the sample, this shell width had a mean of 26.3 cm and
a standard deviation of 2.1 cm.

Table 3.1 Number of Crab Satellites by Female's Color (C), Spine Condition (S), Width (W), Weight
(Wt) and Number of Satellites (Sa)

Here, Y indicates whether a female crab has any satellites. That is, Y = 1 if a female crab has at least one
satellite, and Y = 0 if she has none.

From SPSS, we get the following output after performing binary logistic regression, where the explanatory
variable (X) is Width (W) and the dependent variable (Y) is having a satellite.

Let π(x) denote the probability that a female horseshoe crab of width x has a satellite. The estimated
probability of a satellite according to the data is

π̂(x) = exp(−12.3508 + 0.4972x) / [1 + exp(−12.3508 + 0.4972x)]

In this sample the minimum width is 21.0 cm, and the estimated probability at this minimum width is

π̂(21.0) = exp(−12.3508 + 0.4972(21.0)) / [1 + exp(−12.3508 + 0.4972(21.0))] = 0.129

At the maximum width of 33.5 cm, the estimated probability equals

π̂(33.5) = exp(−12.3508 + 0.4972(33.5)) / [1 + exp(−12.3508 + 0.4972(33.5))] = 0.987

The median effective level is the width at which π̂(x) = 0.50. This is

EL50 = −α̂/β̂ = −(−12.3508)/0.4972 = 24.8

Figure 1 plots the estimated probabilities as a function of width.
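These fitted values are easy to verify with a few lines of Python; the intercept and slope below are the SPSS estimates quoted above:

```python
import math

a_hat, b_hat = -12.3508, 0.4972  # SPSS estimates for the crab data

def pi_hat(x):
    """Estimated probability of a satellite at width x (cm)."""
    e = math.exp(a_hat + b_hat * x)
    return e / (1 + e)

print(round(pi_hat(21.0), 3))    # 0.129 at the minimum width
print(round(pi_hat(33.5), 3))    # 0.987 at the maximum width
print(round(-a_hat / b_hat, 1))  # EL50 = 24.8, where pi_hat = 0.50
```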

The logistic regression formula (1) indicates that the logit (logarithm of the odds) increases by β for every
1 cm increase in x. The parameter β in equations (1) and (2) determines the rate of increase or decrease of
the S-shaped curve for π(x). The sign of β indicates whether the curve ascends (β > 0) or descends (β <
0), and the rate of change increases as |β| increases.

When β = 0, the right-hand side of equation (2) simplifies to a constant. Then, π(x) is identical at all x, so
the curve becomes a horizontal straight line. The binary response Y is then independent of X.

Figure 1 shows the S-shaped appearance of the model for π(x), for horseshoe crab data. Since it is curved
rather than a straight line, the rate of change in π(x) per 1 unit increase in x depends on the value of x. A
straight line drawn tangent to the curve at a particular x value, such as shown in Figure 1, describes the
rate of change at that point. For logistic regression parameter β, that line has slope equal to βπ(x)[1 −
π(x)]. For instance, the line tangent to the curve at x for which π(x) = 0.50 has slope β(0.50)(0.50) =
0.25β; by contrast, when π(x) = 0.90 or 0.10, it has slope 0.09β. The slope approaches 0 as the probability
approaches 1.0 or 0.

The steepest slope occurs at x for which π(x) = 0.50. That x value relates to the logistic regression
parameters by x = −α/β. This x value is sometimes called the median effective level and is denoted EL50.
It represents the level at which each outcome has a 50% chance.
At the sample mean width of 26.3 cm, π̂(x) = 0.674. By the slope formula above, the incremental rate of change
in the fitted probability at that point is β̂π̂(x)[1 − π̂(x)] = 0.4972(0.674)(0.326) = 0.11. For female
crabs near the mean width, the estimated probability of a satellite increases at the rate of 0.11 per 1 cm
increase in width. The estimated rate of change is greatest at the x value (24.8) at which ^π ( x ) = 0.50;
there, the estimated probability increases at the rate of (0.4972)(0.50)(0.50) = 0.12 per 1 cm increase in

width. Unlike the linear probability model, the logistic regression model permits the rate of change to
vary as x varies.
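The tangent-slope formula βπ(x)[1 − π(x)] is easy to check numerically with the fitted crab estimates:

```python
import math

a_hat, b_hat = -12.3508, 0.4972  # fitted intercept and slope

def pi_hat(x):
    e = math.exp(a_hat + b_hat * x)
    return e / (1 + e)

def slope(x):
    """Instantaneous rate of change of pi_hat at x: beta * pi * (1 - pi)."""
    p = pi_hat(x)
    return b_hat * p * (1 - p)

print(round(slope(26.3), 2))            # 0.11 at the mean width
print(round(slope(-a_hat / b_hat), 2))  # 0.12 at EL50, the steepest point
```

The slope at EL50 is the maximum possible, β̂/4 = 0.4972(0.25), matching the "steepest slope occurs where π(x) = 0.50" claim above.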

3.3 Odds Ratio Interpretation


An important interpretation of the logistic regression model uses the odds and the odds ratio. For model
(1), the odds of response 1 (i.e., the odds of a success) are

π(x) / [1 − π(x)] = exp(α + βx)

so increasing x by 1 multiplies the odds by exp(β).

For the horseshoe crabs, the logistic regression model is


logit[π̂(x)] = −12.3508 + 0.4972x
When
x = 26.3: π̂(x) = 0.674, odds = 0.674/0.326 = 2.07, and log(odds) = ln(2.07)
x = 27.3: π̂(x) = 0.773, odds = 0.773/0.227 = 3.40 = 2.07 × 1.64, and log(odds) = ln(3.40) = ln(2.07) + 0.4972
x = 28.3: π̂(x) = 0.848, odds = 0.848/0.152 = 5.57 = 2.07 × 1.64 × 1.64, and log(odds) = ln(5.57) = ln(3.40) + 0.4972
……..

That means that for each one-unit (1 cm) increase in x, the odds are multiplied by e^β̂ = e^0.4972 = 1.64. So the
estimated odds of a satellite multiply by 1.64 for each centimeter increase in width. The interpretations are as
follows:

β̂ = 0.4972 indicates that, for a one-unit change in width, the expected change in the log odds of having at
least one satellite is 0.4972.

Keep in mind that the estimated logistic regression coefficient also provides an estimate of
the odds ratio, i.e., OR = e^β̂.
So OR = e^β̂ = exp(0.4972) = 1.64 indicates that there is a 64% increase in the odds of having at least
one satellite for each 1 cm increase in the width of the female crab's shell.
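The multiplicative odds interpretation can be checked directly: under the fitted model the odds at width x equal exp(α̂ + β̂x), so each extra centimeter multiplies the odds by exp(β̂):

```python
import math

a_hat, b_hat = -12.3508, 0.4972

def odds(x):
    # odds = pi/(1 - pi) = exp(a + b*x) under the logistic model
    return math.exp(a_hat + b_hat * x)

print(round(odds(26.3), 2))       # about 2.07
print(round(odds(27.3), 2))       # about 3.40
print(round(math.exp(b_hat), 2))  # odds ratio per 1 cm: 1.64
```

The ratio odds(27.3)/odds(26.3) equals exp(β̂) exactly, regardless of the starting width, which is what makes the odds ratio such a convenient summary.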

3.4 INFERENCE FOR LOGISTIC REGRESSION

Confidence Intervals for Effects


A large-sample Wald confidence interval for the parameter β in the logistic regression model (1) is

β̂ ± z_(α/2) · SE
For the logistic regression analysis of the horseshoe crab data, the estimated effect of width on the
probability of a satellite is β̂ = 0.497, with SE = 0.102. A 95% Wald confidence interval for β is 0.497 ±
1.96(0.102), or (0.298, 0.697).

The confidence interval for the odds ratio is (e^0.298, e^0.697) = (1.347, 2.007). We infer that a 1 cm
increase in width yields at least a 34 percent increase, and at most a doubling, of the odds that a female
crab has a satellite.
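A quick check of the interval arithmetic, using β̂ and SE as reported (1.96 is the 97.5th percentile of the standard normal):

```python
import math

b_hat, se, z = 0.4972, 0.102, 1.96
lo, hi = b_hat - z * se, b_hat + z * se
print(round(lo, 2), round(hi, 2))                      # CI for beta, about (0.30, 0.70)
print(round(math.exp(lo), 2), round(math.exp(hi), 2))  # CI for the odds ratio, about (1.35, 2.01)
```

Exponentiating the endpoints works because exp is monotone: the interval for β maps directly to an interval for the odds ratio e^β.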

Significance Testing
For the logistic regression model, H₀: β = 0 states that the probability of success is independent of X. The
large-sample Wald test statistic,

z = β̂ / SE,

has a standard normal distribution when β = 0. Refer z to the standard normal table to get a one-sided or
two-sided P-value. Equivalently, for the two-sided Hₐ: β ≠ 0, z² has a large-sample chi-squared null
distribution with df = 1. For the horseshoe crab data, the Wald statistic is

z = β̂ / SE = 0.4972 / 0.102 = 4.9
This shows strong evidence of a positive effect of width on the presence of satellites (P < 0.0001). The
equivalent chi-squared statistic, z² = 23.9, has df = 1.
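The Wald statistic and its P-value can be reproduced in a few lines; the two-sided normal P-value is computed here via the complementary error function:

```python
import math

b_hat, se = 0.4972, 0.102
z = b_hat / se               # Wald statistic
chi_sq = z ** 2              # equivalent chi-squared statistic, df = 1
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal P-value
print(round(z, 1), round(chi_sq, 1), p_value)
```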

How to run binary logistic regression in SPSS:


The whole process shown here is done in SPSS 22; it can be performed in the same way in earlier versions of
SPSS as well.
1. You have to input the variable info in “variable view”.

2. In “data view” you will input the data.

3. Our dependent variable (Y) is binary, with categories having at least
one satellite (Y = 1) or no satellite (Y = 0), so we create a new binary
variable from "Sa".

4. Select the variable you want to change and tap the button.

5. Write the name and label of the new variable and tap “change”.

6. Tap the button

7. If the number of satellites is zero, then our new variable Y (having at least one
satellite) is zero. Tap the "Add" button.

8. If the number of satellites is one or more, then our new variable Y (having at least one
satellite) is one (that means yes). Tap the "Add" button.

9. As this new variable is categorical (string), you have to tap the button and then
“continue”.

10. Tap "OK".

11. Our new binary variable Y is ready.

12. Follow

13. Select the new variable as the dependent variable.

14.

15. Select the independent variable Width as a covariate.

16. Go to "Options".

17. Select the confidence interval for the odds ratio [OR = exp(B)] and tap "Continue".

18.

19. In the output window, scroll down and you will find the results of the binary logistic
regression.
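For readers without SPSS, the same maximum-likelihood fit can be sketched in pure Python. The Newton–Raphson iteration below is the standard algorithm such packages use internally for logistic regression; the tiny (x, y) data set is hypothetical, invented only to make the sketch runnable, and is not the crab data:

```python
import math

# Hypothetical illustrative data (NOT the crab data): x mimics shell width,
# y = 1 if the female has at least one satellite, 0 otherwise.
x = [22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

def fit_logistic(x, y, iters=25):
    """Newton-Raphson maximum likelihood for logit[pi(x)] = a + b*x."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Score vector X'(y - p)
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Information matrix X'WX with W = diag(p(1 - p))
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: (a, b) += (X'WX)^{-1} X'(y - p)
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    # Score at the final estimates (should be ~0 at the MLE)
    p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
    return a, b, (g0, g1)

a_fit, b_fit, score = fit_logistic(x, y)
print(round(a_fit, 2), round(b_fit, 2))
```

Running the same routine on the full crab data (width as x, the 0/1 satellite indicator as y) should reproduce estimates close to the SPSS output above, since both maximize the same binomial likelihood.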

