Logistic Regression

This document discusses logistic regression, including what it is, when it is used, how the logistic function relates to logistic regression, and how to evaluate logistic regression models. It also includes code to load and explore a sample dataset.

Uploaded by

dgdangelodg


6/1/23, 10:31 PM logistic_regression

Some Important Questions and Answers on Logistic Regression

What is logistic regression, and when is it used?

Logistic regression belongs to the regression family, but unlike linear regression,
which predicts a continuous value, it is used for classification problems with categorical
outcomes such as male or female, true or false, or yes or no. It models the probability
that an observation belongs to one of the classes.

When is it used?

1. Predictive modelling
2. Medical research
3. Credit scoring
4. Market and customer analytics

What is the logistic function (also known as the sigmoid function), and why is it used in
logistic regression?

The logistic function, also known as the sigmoid function, is a mathematical function that
maps any real number to a value between 0 and 1. It is an S-shaped curve and is
given by the formula σ(z) = 1 / (1 + e^(-z)),

where σ(z) is the output (interpreted as a probability) and z is the input to the function.
Logistic regression applies it to a linear combination of the features, so the model
output can be read as the probability of the positive class.
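As a minimal sketch of the formula above, the function below evaluates σ(z) with only the Python standard library. The name `sigmoid`, the two-branch form used for numerical stability, the sample inputs, and the 0.5 threshold are illustrative assumptions and are not taken from the notebook.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: sigma(z) = 1 / (1 + e^(-z))."""
    # For large negative z, e^(-z) would overflow, so that branch uses the
    # algebraically equivalent form e^z / (1 + e^z).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Illustrative inputs (assumed values, chosen only to show the S-shape).
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"sigmoid({z:+.1f}) = {sigmoid(z):.4f}")

# A probability becomes a class label by thresholding, commonly at 0.5.
label = 1 if sigmoid(0.8) >= 0.5 else 0
```

Inputs far below zero map close to 0, inputs far above zero map close to 1, and z = 0 maps to exactly 0.5.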

How do you evaluate the performance of a logistic regression model?

Here are some commonly used evaluation methods for logistic regression:

1. Confusion matrix
2. Accuracy
3. Precision
4. Recall
5. F1 Score
6. ROC Curve
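The first five items in the list can all be derived from four counts: true positives, false positives, false negatives, and true negatives. The sketch below shows those definitions in plain Python. The function names and the toy labels are made up for illustration; in practice `sklearn.metrics` provides equivalent functions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 computed from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels, invented for this example only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```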

Import Necessary Libraries

In [ ]: import numpy as np
import pandas as pd

Load the dataset

In [ ]: df = pd.read_csv('ft.csv')
df.head()

Out[ ]:    male  age  education  currentSmoker  cigsPerDay  BPMeds  prevalentStroke  prevalentHyp
        0     1   39        4.0              0         0.0     0.0                0             0
        1     0   46        2.0              0         0.0     0.0                0             0
        2     1   48        1.0              1        20.0     0.0                0             0
        3     0   61        3.0              1        30.0     0.0                0             1
        4     0   46        3.0              1        23.0     0.0                0             0

Perform EDA

In [ ]: df.shape

Out[ ]: (4238, 16)

In [ ]: # check non-null counts and dtypes for each column

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4238 entries, 0 to 4237
Data columns (total 16 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 male 4238 non-null int64
1 age 4238 non-null int64
2 education 4133 non-null float64
3 currentSmoker 4238 non-null int64
4 cigsPerDay 4209 non-null float64
5 BPMeds 4185 non-null float64
6 prevalentStroke 4238 non-null int64
7 prevalentHyp 4238 non-null int64
8 diabetes 4238 non-null int64
9 totChol 4188 non-null float64
10 sysBP 4238 non-null float64
11 diaBP 4238 non-null float64
12 BMI 4219 non-null float64
13 heartRate 4237 non-null float64
14 glucose 3850 non-null float64
15 TenYearCHD 4238 non-null int64
dtypes: float64(9), int64(7)
memory usage: 529.9 KB
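The number of missing values per column follows from the non-null counts by subtraction from the 4238 rows. The sketch below does that arithmetic on the figures copied from the `df.info()` output; the variable names are illustrative, and on the real DataFrame the same counts would normally be obtained with `df.isnull().sum()`.

```python
# Row count and non-null counts copied from the df.info() output above.
# Only columns with fewer than 4238 non-null entries are listed.
n_rows = 4238
non_null = {
    "education": 4133,
    "cigsPerDay": 4209,
    "BPMeds": 4185,
    "totChol": 4188,
    "BMI": 4219,
    "heartRate": 4237,
    "glucose": 3850,
}

# Missing values = total rows minus non-null entries.
missing = {col: n_rows - count for col, count in non_null.items()}

# Print columns from most to least affected, with the share of rows missing.
for col, n in sorted(missing.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{col:<12} {n:>4} missing ({100 * n / n_rows:.1f}%)")
```

By this count `glucose` is the column most affected, with 388 missing entries.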

In [ ]: # view descriptive statistics

df.describe()

Out[ ]:               male          age    education  currentSmoker   cigsPerDay       BPMeds  preval
        count  4238.000000  4238.000000  4133.000000    4238.000000  4209.000000  4185.000000  42
        mean      0.429212    49.584946     1.978950       0.494101     9.003089     0.029630
        std       0.495022     8.572160     1.019791       0.500024    11.920094     0.169584
        min       0.000000    32.000000     1.000000       0.000000     0.000000     0.000000
        25%       0.000000    42.000000     1.000000       0.000000     0.000000     0.000000
        50%       0.000000    49.000000     2.000000       0.000000     0.000000     0.000000
        75%       1.000000    56.000000     3.000000       1.000000    20.000000     0.000000
        max       1.000000    70.000000     4.000000       1.000000    70.000000     1.000000

In [ ]: #check duplicate rows

duplicate_rows = df.duplicated()
#count the number of True values
num_dup_rows = duplicate_rows.sum()
num_dup_rows

Out[ ]: 0

In [ ]: import matplotlib.pyplot as plt

import seaborn as sns

num_feat = ["male","age","education","currentSmoker","cigsPerDay","BPMeds","prev
for feature in num_feat:
    plt.figure(figsize=(7, 7))
    sns.histplot(df[feature], kde=True)
    plt.title(f"Histogram of {feature}")

In [ ]: num_feat = ["male","age","education","currentSmoker","cigsPerDay","BPMeds","prev
sns.pairplot(df[num_feat])
plt.show()

In [ ]: for feature in num_feat:
    plt.figure(figsize=(6, 4))
    sns.boxplot(x=df[feature])
    plt.title(f'Boxplot of {feature}')
    plt.show()

In [ ]: X = df[['age','prevalentHyp','sysBP','diaBP','glucose']]
y = df['TenYearCHD']

In [ ]: X.isnull().sum()

Out[ ]: age 0
prevalentHyp 0
sysBP 0
diaBP 0
glucose 388
dtype: int64

In [ ]: # Fill missing glucose values with the column mean. assign() returns a new
# DataFrame, avoiding the SettingWithCopyWarning raised by X['glucose'] = ...
X = X.assign(glucose=X['glucose'].fillna(df['glucose'].mean()))
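The cell above fills the gaps with the mean of the whole `glucose` column before the data is split. A common variation, sketched below in plain Python, is to compute the fill value from the training rows only and reuse it for the test rows, so that no information from the test set reaches the model. This is not what the notebook does; the helper names and the sample readings are invented for the example, and scikit-learn's `SimpleImputer(strategy="mean")` follows the same fit-then-transform pattern.

```python
from statistics import fmean

def fit_mean(values):
    """Mean of the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    return fmean(observed)

def fill_missing(values, fill_value):
    """Return a new list with every None replaced by fill_value."""
    return [fill_value if v is None else v for v in values]

# Made-up glucose readings; None marks a missing measurement.
train_glucose = [77.0, None, 70.0, 103.0, None, 85.0]
test_glucose = [None, 99.0, 80.0]

# The fill value is learned from the training split only...
train_mean = fit_mean(train_glucose)

# ...and then applied to both splits.
train_filled = fill_missing(train_glucose, train_mean)
test_filled = fill_missing(test_glucose, train_mean)
print(train_mean, test_filled)
```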

In [ ]: from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(X,y,test_size=0.3, random_st

In [ ]: from sklearn.linear_model import LogisticRegression

lr = LogisticRegression()
lr.fit(x_train, y_train)

Out[ ]: LogisticRegression()

In [ ]: # mean accuracy on the training split
score = lr.score(x_train, y_train)
score

Out[ ]: 0.8486176668914363

In [ ]: from sklearn.metrics import confusion_matrix

y_pred = lr.predict(x_test)
y_true = y_test
confusion_matrix(y_true, y_pred)

Out[ ]: array([[1080,    4],
               [ 182,    6]])
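To read this matrix, recall that scikit-learn puts the true classes in rows and the predicted classes in columns, so the four cells are TN, FP, FN, TP in reading order. The sketch below derives the usual metrics from the counts shown above and compares the accuracy with a trivial model that always predicts class 0. The variable names and the always-negative baseline are illustrative additions, not part of the notebook.

```python
# Counts copied from the confusion matrix above
# (rows: true class 0, 1; columns: predicted class 0, 1).
tn, fp = 1080, 4
fn, tp = 182, 6

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * tp / (2 * tp + fp + fn)

# Accuracy of a trivial model that predicts class 0 for every test row.
baseline_accuracy = (tn + fp) / total

print(f"accuracy  = {accuracy:.4f}")
print(f"precision = {precision:.4f}")
print(f"recall    = {recall:.4f}")
print(f"f1        = {f1:.4f}")
print(f"always-0 baseline accuracy = {baseline_accuracy:.4f}")
```

With these counts the model flags only 6 of the 188 positive cases, and its test accuracy is within a fraction of a percentage point of the always-negative baseline, which suggests that accuracy alone says little on a dataset this imbalanced.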

In [ ]: score = np.array(score).reshape(-1, 1)

In [ ]: from sklearn.metrics import roc_curve, auc

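# Use predicted probabilities for the positive class; hard 0/1 labels from
# lr.predict() would reduce the ROC curve to a single interior point.
y_score = lr.predict_proba(x_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_true, y_score)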

In [ ]: roc_auc = auc(fpr, tpr)
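For intuition about what `roc_curve` and `auc` compute, the sketch below builds the ROC points by lowering a threshold over the scores and integrates them with the trapezoidal rule. It is a simplified stand-in written for illustration, not scikit-learn's implementation, and the labels and scores are made up. It also shows that feeding hard 0/1 predictions instead of scores leaves only one point between the two corners of the plot.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over the scores."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Sort by decreasing score so that each step lowers the threshold.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample in a group of tied scores.
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr

def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )

# Toy data, invented for this example.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr = roc_points(y_true, scores)
roc_auc = trapezoid_auc(fpr, tpr)

# Hard labels at a 0.5 threshold take only two distinct values,
# so the sweep produces a single point between (0, 0) and (1, 1).
hard = [1 if s >= 0.5 else 0 for s in scores]
fpr_hard, tpr_hard = roc_points(y_true, hard)
print(roc_auc, list(zip(fpr_hard, tpr_hard)))
```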

In [ ]: plt.figure()
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area = %0.2f)' %
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
plt.show()
