Dbscan Implementation in Python

This document details an experiment on Density-based spatial clustering (DBSCAN) conducted by Anvita Singh. It includes Python code for data preprocessing, applying PCA for dimensionality reduction, and implementing the DBSCAN algorithm to identify clusters and noise in a dataset. The results are visualized using scatter plots and count plots to illustrate the clustering output.

Uploaded by

Anvita Singh

Pattern Recognition & Anomaly Detection Lab

EXPERIMENT – 12
Density-Based Spatial Clustering of Applications with Noise (DBSCAN)

NAME – ANVITA SINGH

ROLL NO – R2142221063

SAP_ID – 500107712

BATCH – 8

CODE –
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN

# Load dataset
df = pd.read_csv("/content/city_day.csv")
print("Initial Data Sample:")
print(df.head())

# Remove rows with missing values
df.dropna(inplace=True)

# Feature Selection (only numeric columns)
X = df.select_dtypes(include=['float64', 'int64'])

# Feature Scaling
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Apply PCA for dimensionality reduction
# (2 components for a 2D view; use more to keep more dimensions)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Explained variance ratio

print("\nExplained Variance Ratio of the PCA Components:")
print(pca.explained_variance_ratio_)

# Train-Test Split (optional for DBSCAN, but we'll do it for visualization)
X_train, X_test = train_test_split(X_pca, test_size=0.2, random_state=42)

# DBSCAN Model (adjust eps and min_samples based on your data)
dbscan = DBSCAN(eps=0.5, min_samples=5)
dbscan.fit(X_train)

# Predict Clusters
# DBSCAN assigns labels, where -1 represents noise (outliers)
y_pred_train = dbscan.labels_

# Show sample predictions
print("\nSample DBSCAN Clusters (Noise = -1, Clusters = 0, 1, 2, ...):")
print(y_pred_train[:10])

# Count of Clusters vs Noise

unique, counts = np.unique(y_pred_train, return_counts=True)
result_counts = dict(zip(unique, counts))

print("\nCluster Counts:")
print(result_counts)

# Visualize Clusters and Noise

plt.figure(figsize=(8,5))
sns.countplot(x=y_pred_train)
plt.title("DBSCAN Clustering Output")
plt.xlabel("Cluster/Noise")
plt.ylabel("Count")
plt.show()
# Visualizing the PCA-reduced data with clusters highlighted
plt.figure(figsize=(10, 6))
sns.scatterplot(x=X_train[:, 0], y=X_train[:, 1], hue=y_pred_train,
palette="coolwarm", style=y_pred_train, legend="full")
plt.title("DBSCAN Clustering on PCA-reduced Data (Train Set)")
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.show()

# Cluster the test set separately.
# Note: DBSCAN has no predict() method; fit_predict() refits the model on
# X_test, so these labels are independent of the training-set labels.
y_pred_test = dbscan.fit_predict(X_test)

# Visualizing the PCA-reduced test set with clusters and noise highlighted
plt.figure(figsize=(10, 6))
sns.scatterplot(x=X_test[:, 0], y=X_test[:, 1], hue=y_pred_test,
palette="coolwarm", style=y_pred_test, legend="full")
plt.title("DBSCAN Clustering on PCA-reduced Data (Test Set)")
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.show()

print("✅ DBSCAN Clustering Model Trained, Clusters Identified, and Visualized.")
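
The script above refits DBSCAN on the test set because scikit-learn's DBSCAN has no predict() method. As a rough sketch only (not part of the original experiment), the example below shows two common follow-ups on synthetic data: a k-distance curve that can guide the choice of eps, and a helper that labels held-out points with the cluster of their nearest core sample when that sample lies within eps. The function names k_distance_curve and assign_to_clusters, the synthetic blobs, and the choice k = min_samples - 1 are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors


def k_distance_curve(X, k):
    """Return the sorted distance from each point to its k-th nearest other point."""
    # Ask for k + 1 neighbors: when the fitted data is queried explicitly,
    # each point's nearest neighbor is itself (distance 0).
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    distances, _ = nn.kneighbors(X)
    return np.sort(distances[:, -1])


def assign_to_clusters(model, X_new):
    """Label new points by their nearest core sample, or -1 if none is within eps."""
    labels = np.full(len(X_new), -1, dtype=int)
    if len(model.components_) == 0:
        return labels  # no core samples, so everything stays noise
    # Cluster label of each core sample, aligned with model.components_
    core_labels = model.labels_[model.core_sample_indices_]
    nn = NearestNeighbors(n_neighbors=1).fit(model.components_)
    dist, idx = nn.kneighbors(X_new)
    within_eps = dist[:, 0] <= model.eps
    labels[within_eps] = core_labels[idx[within_eps, 0]]
    return labels


# Synthetic stand-in for the PCA-reduced data: two well-separated 2D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.2, size=(100, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.2, size=(100, 2)),
])

min_samples = 5
# min_samples counts the point itself, so look at the (min_samples - 1)-th neighbor.
curve = k_distance_curve(X, k=min_samples - 1)

model = DBSCAN(eps=0.5, min_samples=min_samples).fit(X)

# Two points near the blob centers and one far from both blobs.
X_new = np.array([[0.0, 0.1], [5.0, 4.9], [2.5, 2.5]])
new_labels = assign_to_clusters(model, X_new)
print(new_labels)
```

Plotting the curve (for example with plt.plot(curve)) and looking for the point where it bends sharply upward gives a candidate eps. This is a heuristic, not a rule, and the value should still be checked against the resulting clusters. The assignment helper is one possible convention for held-out points; it leaves the fitted clusters unchanged and marks any point farther than eps from every core sample as noise.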

OUTPUT –
