
Apply Linear Regression Model Techniques To Predict Data On Any Dataset


4 - Jupyter Notebook http://localhost:8888/notebooks/Practicals_AI/4.ipynb

4. Apply Linear Regression Model techniques to predict data on any dataset.

In [18]: import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [19]: df = pd.read_csv(r"C:\Users\ABHISHEK\Downloads\LungCapData - LungCapData.csv")

In [20]: df.head()

Out[20]:    LungCap  Age  Height Smoke  Gender Caesarean
         0    6.475    6    62.1    no    male        no
         1   10.125   18    74.7   yes  female        no
         2    9.550   16    69.7    no  female       yes
         3   11.125   14    71.0    no    male        no
         4    4.800    5    56.9    no    male        no

In [21]: df.isnull().sum()

Out[21]: LungCap      0
         Age          0
         Height       0
         Smoke        0
         Gender       0
         Caesarean    0
         dtype: int64

In [22]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 725 entries, 0 to 724
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   LungCap    725 non-null    float64
 1   Age        725 non-null    int64
 2   Height     725 non-null    float64
 3   Smoke      725 non-null    object
 4   Gender     725 non-null    object
 5   Caesarean  725 non-null    object
dtypes: float64(2), int64(1), object(3)
memory usage: 34.1+ KB

1 of 6 30-10-2024, 22:06

In [23]: from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

In [24]: le = LabelEncoder()

In [25]: df.Smoke = le.fit_transform(df.Smoke)

In [26]: df.Gender = le.fit_transform(df.Gender)

In [27]: df.Caesarean = le.fit_transform(df.Caesarean)
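The three cells above reuse one `LabelEncoder` and refit it on each column, which works but discards the earlier mappings. As a minimal sketch, assuming a small made-up frame (`toy`, with hypothetical values) rather than the LungCap data, the following keeps one fitted encoder per column so the label-to-integer mapping can be read back from `classes_`. `LabelEncoder` assigns codes in sorted label order, which is why `no` becomes 0 and `male` becomes 1 in the encoded output.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy frame (hypothetical values) with two of the categorical columns used above.
toy = pd.DataFrame({
    "Smoke": ["no", "yes", "no"],
    "Gender": ["male", "female", "female"],
})

encoders = {}
for col in ["Smoke", "Gender"]:
    enc = LabelEncoder()
    toy[col] = enc.fit_transform(toy[col])
    encoders[col] = enc  # keep each fitted encoder so its mapping stays available

# classes_ holds the original labels in sorted order; a label's index is its code.
smoke_mapping = {str(label): code for code, label in enumerate(encoders["Smoke"].classes_)}
gender_mapping = {str(label): code for code, label in enumerate(encoders["Gender"].classes_)}
print(smoke_mapping)
print(gender_mapping)
```

Keeping the fitted encoders also allows `inverse_transform` to be called later to turn codes back into the original labels.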

In [28]: df.head()

Out[28]:    LungCap  Age  Height  Smoke  Gender  Caesarean
         0    6.475    6    62.1      0       1          0
         1   10.125   18    74.7      1       0          0
         2    9.550   16    69.7      0       0          1
         3   11.125   14    71.0      0       1          0
         4    4.800    5    56.9      0       1          0

In [29]: x = df.drop(['LungCap'],axis = 1)

In [30]: x

Out[30]:      Age  Height  Smoke  Gender  Caesarean
         0      6    62.1      0       1          0
         1     18    74.7      1       0          0
         2     16    69.7      0       0          1
         3     14    71.0      0       1          0
         4      5    56.9      0       1          0
         ...  ...     ...    ...     ...        ...
         720    9    56.0      0       0          0
         721   18    72.0      1       1          1
         722   11    60.5      1       0          0
         723   15    64.9      0       0          0
         724   10    67.7      0       1          0

         725 rows × 5 columns

In [31]: y = df['LungCap']


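In [32]: # NB: train_test_split returns X_train, X_test, y_train, y_test; unpacked in this order, "X_train" holds the 20% split (145 rows).
X_test,X_train,y_test,y_train = train_test_split(x,y,test_size = 0.2,random_state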

In [34]: X_train.shape,y_train.shape

Out[34]: ((145, 5), (145,))
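The shape above shows the set named `X_train` holding 145 rows, which is the 20% share of the 725 rows. `train_test_split` returns its outputs in the order train features, test features, train target, test target, so the conventional unpacking puts the larger share into the training set. A minimal sketch of that unpacking, assuming synthetic stand-in data of the same shape (`X_demo`, `y_demo`) and an arbitrary `random_state=0` that is not taken from the notebook:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as the data above (725 rows, 5 features).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(725, 5))
y_demo = rng.normal(size=725)

# Return order: train features, test features, train target, test target.
# random_state=0 is an arbitrary choice for reproducibility, not the notebook's value.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

print(X_tr.shape, y_tr.shape)  # training share (80% of the rows)
print(X_te.shape, y_te.shape)  # held-out share (20% of the rows)
```

With this order the model would be fitted on 580 rows and evaluated on 145, the reverse of what the cells in this notebook do.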

In [35]: from sklearn.linear_model import LinearRegression

In [36]: lr = LinearRegression()

In [37]: lr.fit(X_train,y_train)

Out[37]: LinearRegression()

In [38]: y_pred = lr.predict(X_test)

In [39]: from sklearn.metrics import mean_squared_error,r2_score,mean_absolute_error

In [40]: mse = mean_squared_error(y_test,y_pred)

In [41]: err_train = y_test - y_pred

In [43]: mse = np.mean(np.square(err_train))

In [44]: mse

Out[44]: 1.0559850321341964

In [45]: rmse = np.sqrt(mse)

In [46]: rmse

Out[46]: 1.0276113234750754

In [47]: r2_score(y_test,y_pred)

Out[47]: 0.8511008247863296
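The cells above compute the MSE twice, once with `mean_squared_error` and once by hand from the residuals, then take its square root for the RMSE. As a small sketch of how these quantities relate, assuming made-up actual and predicted values (`y_true`, `y_hat`) rather than the notebook's results, the manual formulas can be checked against the scikit-learn functions, including R² as one minus the ratio of the residual sum of squares to the total sum of squares:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical actual/predicted values, for illustration only.
y_true = np.array([6.3, 4.95, 7.8, 3.925, 8.675])
y_hat = np.array([6.25, 7.0, 9.08, 4.72, 8.23])

residuals = y_true - y_hat

# MSE is the mean of the squared residuals; RMSE is its square root.
mse_manual = np.mean(np.square(residuals))
rmse_manual = np.sqrt(mse_manual)

# R^2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean of y_true.
ss_res = np.sum(np.square(residuals))
ss_tot = np.sum(np.square(y_true - y_true.mean()))
r2_manual = 1.0 - ss_res / ss_tot

# The manual values agree with the library functions up to floating-point error.
print(np.isclose(mse_manual, mean_squared_error(y_true, y_hat)))
print(np.isclose(r2_manual, r2_score(y_true, y_hat)))
```

RMSE is in the same units as the target, so it is often easier to interpret than MSE when judging prediction error for a quantity such as lung capacity.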


In [48]: plt.plot(err_train,"*")

Out[48]: [<matplotlib.lines.Line2D at 0x165af248dc0>]

In [51]: plt.hist(err_train,bins=20,edgecolor='g')
plt.grid()

In [55]: y_test.shape,y_pred.shape

Out[55]: ((580,), (580,))

In [65]: d = {"Actual":(y_test),
"Predicted":(y_pred)}

In [66]: pred_actual_df = pd.DataFrame(d)


In [67]: pred_actual_df

Out[67]:      Actual  Predicted
         446   6.300   6.251456
         6     4.950   6.996499
         423   7.800   9.078604
         596   3.925   4.716138
         411   8.675   8.229591
         ...     ...        ...
         71    9.700   9.940940
         106  10.875  11.602824
         270   6.100   5.671011
         435  11.300  10.971752
         102   3.450   6.361120

         580 rows × 2 columns

In [69]: sns.jointplot(x ='Actual',y = 'Predicted',data= pred_actual_df ,kind = 'reg')
plt.grid()
