Project 1 - Car Purchase Amount Predictions Using ANNs http://localhost:8888/nbconvert/html/Documents/Project 1 - Car Purcha...
CODE TO PREDICT CAR PURCHASING DOLLAR AMOUNT USING ANNs (REGRESSION TASK)
Dr. Ryan Ahmed @STEMplicity
PROBLEM STATEMENT
You are working as a car salesman and you would like to develop a model to predict the total dollar amount that
customers are willing to pay given the following attributes:
Customer Name
Customer e-mail
Country
Gender
Age
Annual Salary
Credit Card Debt
Net Worth
The model should predict:
Car Purchase Amount
STEP #0: LIBRARIES IMPORT
In [1]: import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
STEP #1: IMPORT DATASET
1 of 16 17-04-2020, 10:28
In [2]: car_df = pd.read_csv('Car_Purchasing_Data.csv', encoding='ISO-8859-1')
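The `encoding='ISO-8859-1'` argument matters when a CSV stores accented characters as single bytes, which pandas' default UTF-8 decoding would reject. A minimal sketch of that difference, using a made-up byte string rather than the actual file (whose contents are not shown here):

```python
# Hypothetical raw bytes, as they might appear in a Latin-1 encoded CSV cell.
raw = b"Cura\xe7ao"

# ISO-8859-1 maps every byte to exactly one character, so decoding always succeeds.
latin1_text = raw.decode("ISO-8859-1")

# The same bytes are not valid UTF-8 (the default that pandas assumes).
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(latin1_text, utf8_ok)
```

If `read_csv` raises a `UnicodeDecodeError` without the argument, this kind of byte is the likely cause.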
In [3]: car_df
Out[3]:
    Customer Name    Customer e-mail                                    Country               Gender  Age        Annual Salary
 0  Martina Avila    cubilia.Curae.Phasellus@quisaccumsanconvallis.edu  Bulgaria              0       41.851720  62812.09301
 1  Harlan Barnes    eu.dolor@diam.co.uk                                Belize                0       40.870623  66646.89292
 2  Naomi Rodriquez  vulputate.mauris.sagittis@ametconsectetueradip...  Algeria               1       43.152897  53798.55112
 3  Jade Cunningham  malesuada@dignissim.com                            Cook Islands          1       58.271369  79370.03798
 4  Cedric Leach     felis.ullamcorper.viverra@egetmollislectus.net     Brazil                1       57.313749  59729.15130
 5  Carla Hester     mi@Aliquamerat.edu                                 Liberia               1       56.824893  68499.85162
 6  Griffin Rivera   vehicula@at.co.uk                                  Syria                 1       46.607315  39814.52200
 7  Orli Casey       nunc.est.mollis@Suspendissetristiqueneque.co.uk    Czech Republic        1       50.193016  51752.23445
 8  Marny Obrien     Phasellus@sedsemegestas.org                        Armenia               0       46.584745  58139.25910
 9  Rhonda Chavez    nec@nuncest.com                                    Somalia               1       43.323782  53457.10132
10  Jerome Rowe      ipsum.cursus@dui.org                               Sint Maarten          1       50.129923  73348.70745
11  Akeem Gibson     turpis.egestas.Fusce@purus.edu                     Greenland             1       53.180158  55421.65733
12  Quin Smith       nulla@ipsum.edu                                    Nicaragua             0       44.396494  37336.33830
13  Tatum Moon       Cras.sed.leo@Seddiamlorem.ca                       Palestine, State of   0       48.496515  68304.47298
14  Sharon Sharpe    eget.metus@aaliquetvel.co.uk                       United Arab Emirates  0       55.244866  72776.00382
15  Thomas Williams  aliquet.molestie@ut.org                            Gabon                 1       53.289768  64662.30061
16  Blaine Bender    ultrices.posuere.cubilia@pedenonummyut.net         Tokelau               0       44.742200  63259.87837
17  Stephen Lindsey  erat.eget.ipsum@tinciduntpede.org                  Portugal              1       48.127085  52682.06401
18  Sloane Mann      at.augue@augue.net                                 Chad                  1       51.853474  54503.14423
19  Athena Wolf      volutpat.Nulla.facilisis@primis.ca                 Iraq                  0       58.741842  55368.23716
20  Blythe Romero    Sed.eu@risusNuncac.co.uk                           Sudan                 1       51.900471  63435.86304
21  Zelenia Byers    auctor.non@sapien.co.uk                            Angola                0       48.081120  64347.34531
22  Nola Wiggins     Aliquam@augue.edu                                  Nigeria               1       45.531842  65176.69055
23  Micah Wheeler    arcu.eu@tincidunt.org                              Madagascar            1       47.022284  52027.63837
24  Caryn Hendrix    condimentum.Donec@duiCum.com                       Macedonia             0       39.942995  69612.01230
25  Hedda Miranda    scelerisque@magnased.com                           Oman                  0       52.577441  53065.57175
26  Ulric Lynn       sociis@vulputateveliteu.com                        Colombia              0       28.009676  82842.53385
27  Alma Pope        Nunc.mauris.Morbi@turpis.org                       Namibia               0       55.630317  61388.62709
28  Gemma Hendrix    lobortis@non.co.uk                                 Denmark               1       46.124036  100000.00000
Castor Dominican
STEP #2: VISUALIZE DATASET
In [4]: sns.pairplot(car_df)
Out[4]: <seaborn.axisgrid.PairGrid at 0x20d2b487cc0>
STEP #3: CREATE TESTING AND TRAINING DATASET/DATA CLEANING
In [5]: X = car_df.drop(['Customer Name', 'Customer e-mail', 'Country', 'Car Purchase Amount'], axis=1)
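The drop above removes the free-text identifier columns and the target, leaving only numeric inputs. A small sketch of the same separation on a made-up two-row frame (the values and the names `toy_df`, `X_toy`, `y_toy` are illustrative, not part of the dataset):

```python
import pandas as pd

# Tiny made-up frame with the same column names as the notebook's dataset.
toy_df = pd.DataFrame({
    "Customer Name": ["A", "B"],
    "Customer e-mail": ["a@example.com", "b@example.com"],
    "Country": ["X", "Y"],
    "Gender": [0, 1],
    "Age": [40.0, 50.0],
    "Annual Salary": [60000.0, 70000.0],
    "Credit Card Debt": [10000.0, 9000.0],
    "Net Worth": [300000.0, 500000.0],
    "Car Purchase Amount": [35000.0, 45000.0],
})

target = "Car Purchase Amount"
identifiers = ["Customer Name", "Customer e-mail", "Country"]

# Features: everything except the identifiers and the target.
X_toy = toy_df.drop(columns=identifiers + [target])
y_toy = toy_df[target]

print(list(X_toy.columns))
```

Naming the target and identifier columns once makes it harder to leave the target inside the feature matrix by accident.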
In [6]: X
Out[6]:
Gender Age Annual Salary Credit Card Debt Net Worth
0 0 41.851720 62812.09301 11609.380910 238961.2505
1 0 40.870623 66646.89292 9572.957136 530973.9078
2 1 43.152897 53798.55112 11160.355060 638467.1773
3 1 58.271369 79370.03798 14426.164850 548599.0524
4 1 57.313749 59729.15130 5358.712177 560304.0671
5 1 56.824893 68499.85162 14179.472440 428485.3604
6 1 46.607315 39814.52200 5958.460188 326373.1812
7 1 50.193016 51752.23445 10985.696560 629312.4041
8 0 46.584745 58139.25910 3440.823799 630059.0274
9 1 43.323782 53457.10132 12884.078680 476643.3544
10 1 50.129923 73348.70745 8270.707359 612738.6171
11 1 53.180158 55421.65733 10014.969290 293862.5123
12 0 44.396494 37336.33830 10218.320920 430907.1673
13 0 48.496515 68304.47298 9466.995128 420322.0702
14 0 55.244866 72776.00382 10597.638140 146344.8965
15 1 53.289768 64662.30061 11326.034340 481433.4324
16 0 44.742200 63259.87837 11495.549990 370356.2223
17 1 48.127085 52682.06401 12514.520290 549443.5886
18 1 51.853474 54503.14423 7377.820914 431098.9998
19 0 58.741842 55368.23716 13272.946470 566022.1306
20 1 51.900471 63435.86304 11878.037790 480588.2345
21 0 48.081120 64347.34531 10905.366280 307226.0977
22 1 45.531842 65176.69055 7698.552234 497526.4566
23 1 47.022284 52027.63837 11960.853770 688466.0503
24 0 39.942995 69612.01230 8125.598993 499086.3442
25 0 52.577441 53065.57175 17805.576070 429440.3297
26 0 28.009676 82842.53385 13102.158050 315775.3207
27 0 55.630317 61388.62709 14270.007310 341691.9337
28 1 46.124036 100000.00000 17452.921790 188032.0778
29 1 40.245327 62891.86556 12522.940520 583230.9760
... ... ... ... ... ...
470 0 59.619615 81565.95967 9072.063059 544291.9504
471 0 43.542528 65364.06334 7839.414396 579640.7982
472 1 39.281245 65019.15701 4931.560160 341330.7344
473 1 41.679623 58243.17992 15149.034260 649323.7878
474 0 32.308876 73558.87334 11164.526520 301245.7708
475 1 52.289799 66088.02369 6769.181833 557098.9636
476 0 56.287509 54441.72437 4362.720324 432850.4157
477 1 40.754052 60101.79725 12989.367840 340720.5185
478 1 50.769362 50153.43545 6596.013690 266939.1746
479 0 57.615456 61430.93415 11561.073650 421891.8460
480 0 50.801934 65846.50960 9141.668545 531840.3342
481 1 29.034521 55433.61187 10769.750590 276466.6203
482 1 52.967762 62979.60196 14297.253660 247421.9185
In [7]: y = car_df['Car Purchase Amount']
y.shape
Out[7]: (500,)
In [8]: from sklearn.preprocessing import MinMaxScaler
scaler_x = MinMaxScaler()
X_scaled = scaler_x.fit_transform(X)
In [9]: scaler_x.data_max_
Out[9]: array([1.e+00, 7.e+01, 1.e+05, 2.e+04, 1.e+06])
In [10]: scaler_x.data_min_
Out[10]: array([ 0., 20., 20000., 100., 20000.])
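The per-column minima and maxima above are all that min-max scaling needs: each value is mapped to (x - min) / (max - min). As a quick check, here is the first row of X rescaled by hand with NumPy; the variable names are illustrative and the numbers are copied from the outputs above:

```python
import numpy as np

# Per-feature minimum and maximum, copied from scaler_x.data_min_ / data_max_.
# Column order: Gender, Age, Annual Salary, Credit Card Debt, Net Worth
data_min = np.array([0.0, 20.0, 20000.0, 100.0, 20000.0])
data_max = np.array([1.0, 70.0, 100000.0, 20000.0, 1000000.0])

# First row of X as printed earlier in the notebook.
row0 = np.array([0.0, 41.851720, 62812.09301, 11609.380910, 238961.2505])

# Min-max scaling: (x - min) / (max - min), applied column by column.
row0_scaled = (row0 - data_min) / (data_max - data_min)
print(row0_scaled)
```

The result should agree with the first row of `X_scaled` up to rounding.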
In [11]: print(X_scaled)
[[0. 0.4370344 0.53515116 0.57836085 0.22342985]
[0. 0.41741247 0.58308616 0.476028 0.52140195]
[1. 0.46305795 0.42248189 0.55579674 0.63108896]
...
[1. 0.67886994 0.61110973 0.52822145 0.75972584]
[1. 0.78321017 0.37264988 0.69914746 0.3243129 ]
[1. 0.53462305 0.51713347 0.46690159 0.45198622]]
In [12]: X_scaled.shape
Out[12]: (500, 5)
In [13]: y.shape
Out[13]: (500,)
In [14]: y = y.values.reshape(-1,1)
In [15]: y.shape
Out[15]: (500, 1)
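The reshape is needed because scikit-learn transformers such as MinMaxScaler expect a 2-D array of shape (n_samples, n_features), while a single DataFrame column is 1-D. A minimal illustration with a placeholder array of the same length:

```python
import numpy as np

# Stand-in for the target column: a 1-D array with 500 entries.
y_1d = np.zeros(500)

# -1 lets NumPy infer the number of rows; 1 fixes a single column.
y_2d = y_1d.reshape(-1, 1)

print(y_1d.shape, y_2d.shape)
```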
In [16]: y
Out[16]: array([[35321.45877],
[45115.52566],
[42925.70921],
[67422.36313],
[55915.46248],
[56611.99784],
[28925.70549],
[47434.98265],
[48013.6141 ],
[38189.50601],
[59045.51309],
[42288.81046],
[28700.0334 ],
[49258.87571],
[49510.03356],
[53017.26723],
[41814.72067],
[43901.71244],
[44633.99241],
[54827.52403],
[51130.95379],
[43402.31525],
[47240.86004],
[46635.49432],
[45078.40193],
[44387.58412],
[37161.55393],
[49091.97185],
[58350.31809],
[43994.35972],
[17584.56963],
[44650.36073],
[66363.89316],
[53489.46214],
[39810.34817],
[51612.14311],
[38978.67458],
[10092.22509],
[35928.52404],
[54823.19221],
[45805.67186],
[41567.47033],
[28031.20985],
[27815.73813],
[68678.4352 ],
[68925.09447],
[34215.7615 ],
[37843.46619],
[37883.24231],
[48734.35708],
[27187.23914],
[63738.39065],
[48266.75516],
[46381.13111],
[31978.9799 ],
[48100.29052],
[47380.91224],
[41425.00116],
[38147.81018],
[32737.80177],
[37348.13737],
[47483.85316],
[49730.53339],
[40093.61981],
[42297.5062 ],
[52954.93121],
[48104.11184],
[43680.91327],
[52707.96816],
In [17]: scaler_y = MinMaxScaler()
y_scaled = scaler_y.fit_transform(y)
In [18]: y_scaled
Out[18]: array([[0.37072477],
[0.50866938],
[0.47782689],
[0.82285018],
[0.66078116],
[0.67059152],
[0.28064374],
[0.54133778],
[0.54948752],
[0.4111198 ],
[0.70486638],
[0.46885649],
[0.27746526],
[0.56702642],
[0.57056385],
[0.61996151],
[0.46217916],
[0.49157341],
[0.50188722],
[0.64545808],
[0.59339372],
[0.48453965],
[0.53860366],
[0.53007738],
[0.50814651],
[0.49841668],
[0.3966416 ],
[0.56467566],
[0.6950749 ],
[0.49287831],
[0.12090943],
[0.50211776],
[0.80794216],
[0.62661214],
[0.43394857],
[0.60017103],
[0.42223485],
[0.01538345],
[0.37927499],
[0.64539707],
[0.51838974],
[0.45869677],
[0.26804521],
[0.2650104 ],
[0.84054134],
[0.84401542],
[0.35515157],
[0.406246 ],
[0.40680623],
[0.55963883],
[0.2561583 ],
[0.77096325],
[0.55305289],
[0.5264948 ],
[0.3236476 ],
[0.55070832],
[0.54057623],
[0.45669016],
[0.41053254],
[0.33433524],
[0.39926954],
[0.5420261 ],
[0.57366948],
[0.43793831],
[0.46897896],
[0.61908354],
[0.55076214],
[0.48846357],
[0.61560519],
STEP #4: TRAINING THE MODEL
In [19]: from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_scaled, test_size=0.25)
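With `test_size=0.25`, the 500 rows are divided into 375 for training and 125 for testing. The call above is unseeded, so the rows assigned to each set change between runs; `train_test_split` accepts a `random_state` argument to fix that. The idea behind a seeded hold-out split can be sketched in plain NumPy. This mimics the concept only, not scikit-learn's implementation, and the seed and placeholder arrays are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed (arbitrary) for repeatability

n_samples, test_fraction = 500, 0.25
X_demo = rng.random((n_samples, 5))   # placeholder features
y_demo = rng.random((n_samples, 1))   # placeholder target

# Shuffle the row indices, then cut off the first 25% as the test set.
indices = rng.permutation(n_samples)
n_test = int(n_samples * test_fraction)
test_idx, train_idx = indices[:n_test], indices[n_test:]

X_train_demo, X_test_demo = X_demo[train_idx], X_demo[test_idx]
y_train_demo, y_test_demo = y_demo[train_idx], y_demo[test_idx]

print(X_train_demo.shape, X_test_demo.shape)
```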
In [20]: from keras.models import Sequential
from keras.layers import Dense
# Two hidden layers of 25 ReLU units each, fed by the 5 scaled input features
model = Sequential()
model.add(Dense(25, input_dim=5, activation='relu'))
model.add(Dense(25, activation='relu'))
# Single linear output unit, as usual for a regression target
model.add(Dense(1, activation='linear'))
model.summary()
C:\Users\Dr. Ryan\Anaconda3\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 25)                150
_________________________________________________________________
dense_2 (Dense)              (None, 25)                650
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 26
=================================================================
Total params: 826
Trainable params: 826
Non-trainable params: 0
_________________________________________________________________
Using TensorFlow backend.
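The parameter counts in the summary follow directly from the layer sizes: each Dense layer holds inputs × units weights plus one bias per unit. The NumPy sketch below rebuilds the same 5-25-25-1 shape with random placeholder weights (not the trained ones), counts the parameters, and runs one forward pass to show how a (1, 5) input becomes a (1, 1) output. The `forward` helper is an illustrative stand-in, not Keras code:

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [5, 25, 25, 1]  # input, two hidden layers, output

# One weight matrix and one bias vector per Dense layer (random placeholders).
weights = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

# Parameters per layer = inputs * units + units (the biases).
params_per_layer = [w.size + b.size for w, b in zip(weights, biases)]
print(params_per_layer, sum(params_per_layer))

def forward(x):
    """ReLU on the hidden layers, linear (identity) on the output layer."""
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ w + b, 0.0)
    return x @ weights[-1] + biases[-1]

sample = np.array([[0, 0.4370344, 0.53515116, 0.57836085, 0.22342985]])
print(forward(sample).shape)
```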
In [21]: model.compile(optimizer='adam', loss='mean_squared_error')
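Because the network is trained on scaled targets, the `mean_squared_error` loss is measured in squared scaled units, not squared dollars. A minimal sketch of the quantity being minimized, using the first three scaled targets from the notebook and made-up predictions:

```python
import numpy as np

# First three scaled targets from the notebook; the predictions are made up.
y_true = np.array([0.37072477, 0.50866938, 0.47782689])
y_pred = np.array([0.40, 0.50, 0.45])

# Mean squared error: the average of the squared differences.
mse = np.mean((y_true - y_pred) ** 2)
print(mse)
```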
In [22]: epochs_hist = model.fit(X_train, y_train, epochs=20, batch_size=25, verbose=1, validation_split=0.2)
Train on 300 samples, validate on 75 samples
Epoch 1/20
300/300 [==============================] - 0s 1ms/step - loss: 0.4438 - val_loss: 0.2914
Epoch 2/20
300/300 [==============================] - 0s 47us/step - loss: 0.2172 - val_loss: 0.1303
Epoch 3/20
300/300 [==============================] - 0s 43us/step - loss: 0.0842 - val_loss: 0.0490
Epoch 4/20
300/300 [==============================] - 0s 43us/step - loss: 0.0302 - val_loss: 0.0238
Epoch 5/20
300/300 [==============================] - 0s 40us/step - loss: 0.0202 - val_loss: 0.0204
Epoch 6/20
300/300 [==============================] - 0s 40us/step - loss: 0.0181 - val_loss: 0.0173
Epoch 7/20
300/300 [==============================] - 0s 43us/step - loss: 0.0145 - val_loss: 0.0145
Epoch 8/20
300/300 [==============================] - 0s 47us/step - loss: 0.0125 - val_loss: 0.0126
Epoch 9/20
300/300 [==============================] - 0s 50us/step - loss: 0.0112 - val_loss: 0.0114
Epoch 10/20
300/300 [==============================] - 0s 40us/step - loss: 0.0104 - val_loss: 0.0109
Epoch 11/20
300/300 [==============================] - 0s 43us/step - loss: 0.0097 - val_loss: 0.0101
Epoch 12/20
300/300 [==============================] - 0s 47us/step - loss: 0.0093 - val_loss: 0.0093
Epoch 13/20
300/300 [==============================] - 0s 43us/step - loss: 0.0087 - val_loss: 0.0090
Epoch 14/20
300/300 [==============================] - 0s 43us/step - loss: 0.0082 - val_loss: 0.0084
Epoch 15/20
300/300 [==============================] - 0s 47us/step - loss: 0.0077 - val_loss: 0.0078
Epoch 16/20
300/300 [==============================] - 0s 50us/step - loss: 0.0074 - val_loss: 0.0076
Epoch 17/20
300/300 [==============================] - 0s 37us/step - loss: 0.0069 - val_loss: 0.0071
Epoch 18/20
300/300 [==============================] - 0s 40us/step - loss: 0.0065 - val_loss: 0.0069
Epoch 19/20
300/300 [==============================] - 0s 46us/step - loss: 0.0062 - val_loss: 0.0065
Epoch 20/20
300/300 [==============================] - 0s 50us/step - loss: 0.0059 - val_loss: 0.0062
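The numbers in this log can be cross-checked with a little arithmetic. The 375 training rows are divided again by `validation_split=0.2`, which accounts for the "300 samples / 75 samples" line, and a batch size of 25 gives 12 weight updates per epoch. Taking the square root of the final validation loss turns the MSE into an RMSE in the same (scaled) units as the target. A short sketch, with illustrative variable names:

```python
import math

n_train_rows = 375          # 75% of the 500 rows after train_test_split
validation_split = 0.2
batch_size = 25

n_val = round(n_train_rows * validation_split)  # rows held out for validation
n_fit = n_train_rows - n_val                    # rows used for weight updates
steps_per_epoch = math.ceil(n_fit / batch_size)

# Final validation loss from the log, converted from MSE to RMSE (scaled units).
final_val_loss = 0.0062
rmse_scaled = math.sqrt(final_val_loss)

print(n_fit, n_val, steps_per_epoch, round(rmse_scaled, 4))
```

An RMSE near 0.08 on a 0-to-1 scale means a typical validation error of about 8% of the target's range.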
STEP #5: EVALUATING THE MODEL
In [23]: print(epochs_hist.history.keys())
dict_keys(['val_loss', 'loss'])
In [24]: plt.plot(epochs_hist.history['loss'])
plt.plot(epochs_hist.history['val_loss'])
plt.title('Model Loss Progression During Training/Validation')
plt.ylabel('Training and Validation Losses')
plt.xlabel('Epoch Number')
plt.legend(['Training Loss', 'Validation Loss'])
Out[24]: <matplotlib.legend.Legend at 0x20d32676be0>
In [26]: # Gender, Age, Annual Salary, Credit Card Debt, Net Worth
# ***(Note that input data must be normalized)***
X_test_sample = np.array([[0, 0.4370344, 0.53515116, 0.57836085, 0.22342985]])
#X_test_sample = np.array([[1, 0.53462305, 0.51713347, 0.46690159, 0.45198622]])
y_predict_sample = model.predict(X_test_sample)
print('Expected Purchase Amount=', y_predict_sample)
y_predict_sample_orig = scaler_y.inverse_transform(y_predict_sample)
print('Expected Purchase Amount=', y_predict_sample_orig)
Expected Purchase Amount= [[0.42582208]]
Expected Purchase Amount= [[39233.367]]
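The second printed value is the first one mapped back to dollars. `inverse_transform` undoes min-max scaling via y = y_scaled × (max − min) + min. The notebook never prints the target's minimum and maximum, so the sketch below assumes 9000 and 80000, values inferred from the printed y / y_scaled pairs; treat them as an approximation. The helper `inverse_min_max` is hypothetical:

```python
# Assumed target range, inferred from the printed y / y_scaled pairs
# (not shown explicitly in the notebook).
y_min, y_max = 9000.0, 80000.0

def inverse_min_max(y_scaled, y_min=y_min, y_max=y_max):
    """Undo min-max scaling: y = y_scaled * (max - min) + min."""
    return y_scaled * (y_max - y_min) + y_min

# Scaled prediction printed above.
print(inverse_min_max(0.42582208))

# Sanity check against a known pair: the first target and its scaled value.
print(inverse_min_max(0.37072477))
```

Under that assumption, the first call gives roughly the 39233 dollar figure printed above, and the second recovers the first row's actual purchase amount of about 35321 dollars.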
EXCELLENT JOB! NOW YOU'VE MASTERED THE USE OF ANNs FOR REGRESSION TASKS!