Dr. Rafiq Zakaria Campus

Maulana Azad College of Arts, Science & Commerce
P.G. Dept. of Computer Science
M.Sc. III Semester
Data Mining and Warehousing
Practical 03
Date : 9th October 2024

Aim : To study Data Clustering using Python.

Description:

Data clustering is an unsupervised technique that groups similar data points together without
using predefined labels. This practical walks through clustering in Python with the `scikit-learn` library.

1. Install Required Libraries

Make sure you have the necessary libraries installed:

pip install numpy pandas matplotlib scikit-learn

2. Import Libraries

Start by importing the required libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

3. Generate Sample Data

For this example, we’ll create synthetic data using `make_blobs`:

# Generate synthetic data
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
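
Before plotting, it can help to check what `make_blobs` returned. The short sketch below repeats the same call and prints the array shapes and the number of points per blob; the use of `np.bincount` here is only an illustrative choice.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same call as in the practical
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

print(X.shape)       # (300, 2): 300 points with 2 features (the default)
print(y_true.shape)  # (300,): one true blob index per point

# Number of points generated for each of the 4 blobs
print(np.bincount(y_true))  # [75 75 75 75]
```

Note that `y_true` is not used by K-Means itself; it is only available because the data is synthetic.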

4. Visualize the Data

Visualizing the data helps you understand its structure before clustering:

plt.scatter(X[:, 0], X[:, 1], s=30)

plt.title('Sample Data')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()

5. Perform K-Means Clustering

Now, let's apply the K-Means clustering algorithm:

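# Choose the number of clusters
k = 4
# Set n_init and random_state so the result is reproducible across runs
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
kmeans.fit(X)
# Get the cluster labels
y_kmeans = kmeans.predict(X)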
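
To see roughly what K-Means does internally, the following is a minimal sketch of the basic assign-and-update loop (Lloyd's algorithm) in NumPy. It is not scikit-learn's implementation: the function name `lloyd_kmeans` is hypothetical, and the starting centers are simply random data points rather than the k-means++ initialization that `KMeans` uses by default.

```python
import numpy as np
from sklearn.datasets import make_blobs

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Hypothetical helper: plain Lloyd iterations with random initial centers."""
    rng = np.random.default_rng(seed)
    # Pick k distinct data points as the starting centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: distance from every point to every center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its assigned points
        # (a center with no points keeps its previous position)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Stop once the centers no longer move
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
labels, centers = lloyd_kmeans(X, k=4)
print(centers.shape)  # (4, 2): one 2-D center per cluster
```

Because the result depends on the random starting centers, a single run of this sketch can end in a poorer solution than `KMeans`, which by design can repeat the procedure from several initializations and keep the best one.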

6. Visualize the Clusters

You can visualize the resulting clusters:

plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, s=30, cmap='viridis')

centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], c='red', s=200, alpha=0.75, marker='X')
plt.title('K-Means Clustering Results')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()

7. Evaluate the Clustering

Evaluate the clustering using the silhouette score, which ranges from -1 to 1; values closer to 1 indicate compact, well-separated clusters:

score = silhouette_score(X, y_kmeans)
print(f'Silhouette Score: {score}')
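
The overall silhouette score is the mean of a per-point value. As a small sketch, `silhouette_samples` from `sklearn.metrics` returns those per-point values, which makes it possible to spot individual points that sit poorly in their cluster. The variable names below are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# One silhouette value per data point, each between -1 and 1
per_point = silhouette_samples(X, labels)
overall = silhouette_score(X, labels)

print(f'Lowest per-point value:  {per_point.min():.3f}')
print(f'Highest per-point value: {per_point.max():.3f}')
# The overall score is the average of the per-point values
print(np.isclose(per_point.mean(), overall))  # True
```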

8. Choosing the Right Number of Clusters

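To choose the number of clusters, you can use the Elbow method. Plot the inertia (the within-cluster sum of squared distances) against K and look for the point where the curve stops dropping sharply: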

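inertia = []
K = range(1, 11)
for n_clusters in K:
    # Fit a separate model for each candidate number of clusters
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    model.fit(X)
    # inertia_ is the within-cluster sum of squared distances
    inertia.append(model.inertia_)
plt.figure(figsize=(8, 4))
plt.plot(K, inertia, 'bx-')
plt.xlabel('Number of clusters K')
plt.ylabel('Inertia')
plt.title('Elbow Method For Optimal K')
plt.show()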
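
The elbow is read off the plot by eye, so it can be useful to cross-check it with a numeric criterion. The sketch below, offered as one possible complement rather than part of the Elbow method itself, computes the silhouette score for each candidate K and reports the K with the highest score. The range starts at 2 because the silhouette score is undefined for a single cluster.

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

scores = {}
for n_clusters in range(2, 11):
    # Fit K-Means for this candidate K and score the resulting labels
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    scores[n_clusters] = silhouette_score(X, labels)

# The K with the highest average silhouette score
best_k = max(scores, key=scores.get)
for n_clusters, score in scores.items():
    print(f'K={n_clusters}: silhouette={score:.3f}')
print(f'Best K by silhouette: {best_k}')
```

If the elbow and the silhouette score point to different values of K, that is a hint to inspect the cluster plots for both values before deciding.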

Conclusion

In this practical, you learned how to perform clustering using K-Means in Python. Adjusting
parameters such as the number of clusters, and preprocessing your data (for example, scaling
the features), can yield better clustering results.
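
As one example of preprocessing, K-Means relies on Euclidean distances, so features measured on very different scales can dominate the result. A minimal sketch, assuming standardization is an appropriate choice for your data, scales each feature to zero mean and unit variance with `StandardScaler` before clustering:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# Rescale every feature to mean 0 and standard deviation 1
X_scaled = StandardScaler().fit_transform(X)
print(X_scaled.mean(axis=0).round(6))  # approximately 0 for each feature
print(X_scaled.std(axis=0).round(6))   # approximately 1 for each feature

# Cluster the scaled data exactly as before
labels_scaled = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_scaled)
print(labels_scaled.shape)  # (300,)
```

For the synthetic blobs used here both features already have similar ranges, so scaling matters far more on real datasets that mix units.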

Prepared by Khan Shagufta (Assistant Professor, P.G. Dept. of Computer Science)
