Customer Clustering with K-Means

The Python code performs K-Means clustering on a customer dataset and determines the optimal number of clusters. It loads the customer data, checks for null values, creates a scatter plot of Age vs Spending Score, uses the elbow method to find the best k between 1 and 10 (inclusive), and plots the data points colored by cluster. It also shows an alternative, sub-optimal approach that draws the elbow curve by hand and manually selects k=6.

Uploaded by Vigneshwaran Ganapathi


Que: You work at XYZ Company as a Python developer. The company officials want you to write code for a clustering problem.

Dataset: customers.csv

Tasks to be performed:
1. K-Means Clustering:
- Load customer data.
- Count the cells with null values in each column.
- Create a scatter plot with Age as X and Spending Score as Y.
- Find the best number of clusters between 1 and 10 (inclusive) using the elbow method.
- Draw a scatter plot displaying data points colored on the basis of clusters.

Optimal approach (elbow located automatically with Yellowbrick):
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from yellowbrick.cluster import KElbowVisualizer

# Load customer data
data = pd.read_csv('customers.csv')

# Check for null values: print the number of null cells in each column
print(data.isnull().sum())

# Create a scatter plot with Age as X and Spending Score as Y
plt.scatter(data['Age'], data['Spending Score (1-100)'])
plt.xlabel('Age')
plt.ylabel('Spending Score (1-100)')
plt.show()

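# Use the elbow method to find the best number of clusters for k from 1 to 10 (inclusive).
# Yellowbrick treats a (start, stop) tuple like range(), so stop must be 11 for k=10 to be tried.
# n_init and random_state are fixed so that repeated runs give the same elbow.
model = KMeans(n_init=10, random_state=0)
visualizer = KElbowVisualizer(model, k=(1, 11))
visualizer.fit(data[['Age', 'Spending Score (1-100)']])
visualizer.show()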

# Draw a scatter plot displaying data points colored on the basis of clusters
optimal_k = visualizer.elbow_value_  # None if Yellowbrick could not locate an elbow
if optimal_k is None:
    raise ValueError('No elbow detected; choose k manually from the elbow plot.')
kmeans = KMeans(n_clusters=optimal_k, init='k-means++', max_iter=300, n_init=10, random_state=0)
clusters = kmeans.fit_predict(data[['Age', 'Spending Score (1-100)']])
data['Cluster'] = clusters
plt.scatter(data['Age'], data['Spending Score (1-100)'], c=data['Cluster'], cmap='viridis')
plt.xlabel('Age')
plt.ylabel('Spending Score (1-100)')
plt.show()
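
The elbow method compares, for each k, the sum of squared distances from every point to its nearest cluster centre (scikit-learn exposes this on a fitted model as `inertia_`). The sketch below computes that quantity by hand in plain Python to show why it drops sharply once k matches the real grouping. The toy points, the chosen centres, and the helper names `squared_distance` and `inertia` are all made up for illustration and are not part of the original script.

```python
# Sketch only: toy data and helper names are hypothetical, not from customers.csv.

def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def inertia(points, centres):
    """Sum of squared distances from each point to its nearest centre."""
    return sum(min(squared_distance(p, c) for c in centres) for p in points)


# Made-up (Age, Spending Score) pairs that form two obvious groups.
points = [(20, 80), (22, 78), (21, 82), (50, 20), (52, 22), (51, 18)]

one_centre = [(36, 50)]             # mean of all six points
two_centres = [(21, 80), (51, 20)]  # mean of each group

print('k=1 inertia:', inertia(points, one_centre))
print('k=2 inertia:', inertia(points, two_centres))
```

Going from one centre to two removes almost all of the within-cluster spread on this toy data; adding a third centre could only shave off a little more. That flattening is the "elbow" the visualizer looks for.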

Sub-optimal approach (elbow curve plotted by hand, k chosen by eye):
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Load customer data
data = pd.read_csv('customers.csv')

# Check for null values: print the number of null cells in each column
print(data.isnull().sum())

# Create a scatter plot with Age as X and Spending Score as Y
plt.scatter(data['Age'], data['Spending Score (1-100)'])
plt.xlabel('Age')
plt.ylabel('Spending Score (1-100)')
plt.show()

# Elbow method by hand: fit K-Means for k = 1..10 and record the inertia
# (within-cluster sum of squared distances) for each k
sum_of_squared_distances = []
K = range(1, 11)
for k in K:
    km = KMeans(n_clusters=k, init='k-means++', max_iter=300, n_init=10, random_state=0)
    km = km.fit(data[['Age', 'Spending Score (1-100)']])
    sum_of_squared_distances.append(km.inertia_)

# Plot inertia against k; the elbow is where the curve stops dropping sharply
plt.plot(K, sum_of_squared_distances, 'bx-')
plt.xlabel('Number of Clusters')
plt.ylabel('Sum of Squared Distances')
plt.title('Elbow Method For Optimal k')
plt.show()

# Draw a scatter plot displaying data points colored on the basis of clusters
suboptimal_k = 6  # a value selected as an example
kmeans = KMeans(n_clusters=suboptimal_k, init='k-means++', max_iter=300, n_init=10, random_state=0)
clusters = kmeans.fit_predict(data[['Age', 'Spending Score (1-100)']])
data['Cluster'] = clusters
plt.scatter(data['Age'], data['Spending Score (1-100)'], c=data['Cluster'], cmap='viridis')
plt.xlabel('Age')
plt.ylabel('Spending Score (1-100)')
plt.show()
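
The script above hard-codes k=6 after reading the elbow plot by eye. As a rough alternative, the elbow can be estimated from the recorded values themselves. The sketch below uses a simple heuristic: rescale both axes to [0, 1], draw a straight line from the first point of the curve to the last, and pick the k whose point sits farthest below that line. The function name `find_elbow` and the sample inertia values are hypothetical, and this heuristic is an assumption for illustration; it is not claimed to match what Yellowbrick computes.

```python
# Sketch only: find_elbow and the sample numbers are hypothetical.

def find_elbow(ks, inertias):
    """Return the k whose point lies farthest below the first-to-last chord."""
    if len(ks) != len(inertias) or len(ks) < 3:
        raise ValueError('need at least three (k, inertia) pairs')
    x_span = ks[-1] - ks[0]
    y_span = inertias[0] - inertias[-1]
    if x_span == 0 or y_span == 0:
        raise ValueError('curve is flat; there is no elbow to find')

    best_k, best_gap = ks[0], float('-inf')
    for k, value in zip(ks, inertias):
        x = (k - ks[0]) / x_span             # 0 at the first k, 1 at the last
        y = (value - inertias[-1]) / y_span  # 1 at the first k, 0 at the last
        gap = 1.0 - x - y                    # how far the point sits below the chord x + y = 1
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


# Made-up inertia values for k = 1..10 with a visible bend.
ks = list(range(1, 11))
inertias = [1000, 600, 300, 150, 120, 100, 90, 82, 76, 70]
print('estimated elbow at k =', find_elbow(ks, inertias))
```

In the script above, the same function could be called as `find_elbow(list(K), sum_of_squared_distances)` in place of the hard-coded 6. The result is still worth checking against the plot, since a smooth curve may have no clear elbow.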
