Customer Purchase Data Analysis with Python code using Pandas
Cust_Purch_FakeData.csv: Raw data file
Cust_Purch_Data_Exercise.ipynb: Python code file
- prefix
- first
- last
- email
- gender
- age
- company
- profession
- phone
- postal
- province
- cc_no
- cc_exp
- cc_type
- price(CAD)
- fav_color
- ip
- weekday
- ampm
- date
import pandas as pd
df = pd.read_csv("Cust_Purch_FakeData.csv")
df.head(5)
df.info()
print("Max. age of the customer is:", df['age'].max())
print("Min. age of the customer is:", df['age'].min())
print("Avg. age of the customer is:", df['age'].mean())
print(df['first'].value_counts().head(3))
df['phone'].value_counts().head(2)
df[df['profession'] == 'Structural Engineer'].count()
df[(df['profession'] == 'Structural Engineer')&(df['gender'] == 'Male')].count()
df[(df['profession'] == 'Structural Engineer')&(df['gender'] == 'Female')&(df['province']=='AB')]
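The filters above combine several conditions with `&`, and calling `.count()` on the filtered frame reports one non-null count per column rather than a single row count. The sketch below illustrates this on a small made-up frame. The name `toy` and its values are hypothetical, not taken from the CSV.

```python
import pandas as pd

# Toy data standing in for the real CSV (values are made up for illustration).
toy = pd.DataFrame({
    "profession": ["Structural Engineer", "Structural Engineer", "Lawyer", "Structural Engineer"],
    "gender": ["Male", "Female", "Male", "Male"],
    "province": ["AB", "AB", "ON", None],
})

# Each comparison needs its own parentheses because & binds tighter than ==.
mask = (toy["profession"] == "Structural Engineer") & (toy["gender"] == "Male")

# DataFrame.count() gives the number of non-null values in every column,
# so columns with missing entries can show a smaller number.
per_column_counts = toy[mask].count()

# Summing the boolean mask gives one number: the rows where it is True.
matching_rows = int(mask.sum())

print(per_column_counts)
print("Matching rows:", matching_rows)
```

If only the number of matching rows is needed, `mask.sum()` or `len(toy[mask])` avoids the per-column output.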
print('Max. spending:',df['price(CAD)'].max())
print('Min. spending:',df['price(CAD)'].min())
print('Avg. spending:',df['price(CAD)'].mean())
11. Who did not spend anything? The company wants to send these customers a deal to encourage them to buy.
df[df['price(CAD)'] == 0]
12. As a loyalty reward, the company wants to send a thank-you coupon to customers who spent 100 CAD or more. Find those customers.
df[df['price(CAD)'] >= 100]
df[df['cc_no'] == 5020000000000230]['email']
14. We need to send new cards to customers well before their current cards expire. How many cards expire in 2019?
Use sum() and count() and compare how each one is used :)
df['cc_exp'].apply(lambda x: x[5:] == '19').sum()
df[df['cc_exp'].apply(lambda x: x[5:]) == '19']['cc_exp'].count()
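To make the difference between `sum()` and `count()` concrete, here is a minimal sketch on made-up expiry strings. It assumes `cc_exp` is stored as text in an "MM/YYYY" layout, which is what the `x[5:] == '19'` slice suggests; the sample values are hypothetical.

```python
import pandas as pd

# Made-up expiry strings in an assumed "MM/YYYY" layout.
cc_exp = pd.Series(["03/2019", "11/2020", "07/2019", "01/2021"])

# Vectorized equivalent of apply(lambda x: x[5:] == '19'): a boolean Series.
expires_2019 = cc_exp.str[5:] == "19"

# sum() adds up the True values (True counts as 1, False as 0).
n_by_sum = int(expires_2019.sum())

# count() on the boolean Series counts every non-null entry, True or False.
n_by_count_all = int(expires_2019.count())

# Filtering first and then calling count() counts only the matching rows.
n_by_count_filtered = int(cc_exp[expires_2019].count())

print(n_by_sum, n_by_count_all, n_by_count_filtered)  # 2 4 2
```

So `sum()` works directly on the boolean result, while `count()` only answers the question after the rows have been filtered.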
df[df['cc_type'] == 'Visa']['cc_type'].count()
df[(df['price(CAD)'] == 100)&(df['cc_type'] == 'Visa')]
df['profession'].value_counts().head(2)
email = df['email'].str.split('@').str[1]
email.value_counts().head(5)
df[df['email'].apply(lambda x: x.split('@')[1] == 'am.edu')]
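The `apply` with a lambda above raises an error if any `email` value is missing, because a missing value has no `split` method. A possible alternative, sketched on a hypothetical frame `toy` with made-up addresses, is the vectorized `.str` accessor, which passes missing values through.

```python
import pandas as pd

# Made-up addresses, including one missing value.
toy = pd.DataFrame({"email": ["a@am.edu", "b@gmail.com", None, "c@am.edu"]})

# na=False treats missing emails as "no match" instead of raising an error.
is_am_edu = toy["email"].str.endswith("@am.edu", na=False)
am_edu_customers = toy[is_am_edu]

# The domain itself can be extracted the same way; missing values stay missing.
domain = toy["email"].str.split("@").str[1]

print(am_edu_customers)
print(domain.value_counts())
```

Whether this matters for the exercise depends on whether the CSV actually contains missing emails; `df.info()` shows the non-null count for each column.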
df['weekday'].value_counts().head()