0% found this document useful (0 votes)
42 views20 pages

Heirarchical Clustering

Hierarchical clustering is a method that organizes data into a tree-like structure of clusters, allowing exploration at various levels. It can be performed using bottom-up or top-down approaches, with the agglomerative method being a common technique that merges smaller clusters. The document details the single linkage clustering method, which calculates distances between clusters based on the shortest distance between points, and provides a step-by-step example of forming clusters.

Uploaded by

Asiya Anjum S
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
42 views20 pages

Heirarchical Clustering

Hierarchical clustering is a method that organizes data into a tree-like structure of clusters, allowing exploration at various levels. It can be performed using bottom-up or top-down approaches, with the agglomerative method being a common technique that merges smaller clusters. The document details the single linkage clustering method, which calculates distances between clusters based on the shortest distance between points, and provides a step-by-step example of forming clusters.

Uploaded by

Asiya Anjum S
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

MODULE 5

HIERARCHICAL CLUSTERING
ALGORITHMS
HIERARCHICAL CLUSTERING ALGORITHMS
Hierarchical Clustering is a method of cluster analysis that builds a hierarchy of
clusters
A hierarchy of clusters is an organized tree-like structure that shows how data points
are progressively grouped into larger and larger clusters
This structure allows us to explore data at
different levels — from individual points to one
big super-cluster
Therefore hierarchical clustering builds clusters
step by step by merging smaller clusters into
bigger ones
1.Bottom-up 1.Top-up
2.Start with each point as its 2.Start with all points in one
own cluster and merge the cluster and recursively split
closest clusters step by step them to get individual data
until a single final cluster is points
formed
AGGLOMERATIVE CLUSTERING METHOD
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
Definition:
In single linkage, the distance between two clusters is defined as the shortest distance
between any pair of points, one from each cluster

Here,
· DSL is the single linkage distance
· Ci , Cj are clusters
· d(a,b) is the distance between the elements a and b
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
Objects X Y

A 4 3

B 1 4

C 2 1

D 3 8

E 6 9

F 5 1
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
Step 1: Compute the Distance or Proximity Matrix

Find the Euclidean Distance between each pairs of points


SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)

Objects X Y

x1 y1
A 4 3 2 2
d(A,B) = √ (x2-x1) + (y2-y1)
x2 y2
B 1 4

C 2 1 Substituting the values of x and y


D 3 8 2 2
d(A,B)= √ (1-4) + (4-3) = 3.16
E 6 9

F 5 1
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
2 2
Objects X Y
d(A,C)= √ (2-4) + (1-3) = 2.822
2 2
d(A,D)= √ (3-4) + (8-3) = 5.09
A 4 3 2 2
d(A,E)= √ (6-4) + (9-3) = 6.32
B 1 4 2 2
d(A,F)= √ (5-4) + (1-3) = 2.23
C 2 1
Similarly find
D 3 8
d(B,C), d(B,C), d(B,C), d(B,C)
E 6 9 d(C,D), d(C,E), d(C,F)
F 5 1
d(E,F)
DISTANCE MATIRX
A B C D E F

A 0 3.16 2.82 5.09 6.32 2.23

B -- 0 3.16 4.47 7.07 5

C -- -- 0 7.07 8.94 3

D -- -- -- 0 3.16 7.28

E -- -- -- -- 0 8.06

F -- -- -- -- -- 0
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
Step 2: Merging the two closest members
Here the minimum value is 2.23 and hence we combine A and F
Now, form clusters of elements corresponding to the minimum value and
update the distance matrix
D E

B
A

C F
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
A B C D E F

A 0 3.16 2.82 5.09 6.32 2.23

B -- 0 3.16 4.47 7.07 5

C -- -- 0 7.07 8.94 3

D -- -- -- 0 3.16 7.28

E -- -- -- -- 0 8.06

F -- -- -- -- -- 0
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
A B C D E F (A,F) B C D E

A 0 3.16 2.82 5.09 6.32 2.23 (A,F) 0 3.16 2.82 5.09 6.32

B -- 0 3.16 4.47 7.07 5 B -- 0 3.16 4.47 7.07

C -- -- 0 7.07 8.94 3 C -- -- 0 7.07 8.94

D -- -- -- 0 3.16 7.28 D -- -- -- 0 3.16

E -- -- -- -- 0 8.06 E -- -- -- -- 0

F -- -- -- -- -- 0 Cluster Formed = (A,F)


SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
(A,F) B C D E

E
(A,F) 0 3.16 2.82 5.09 6.32 D

B -- 0 3.16 4.47 7.07

B
C -- -- 0 7.07 8.94

D -- -- -- 0 3.16 (A,F)
C

E -- -- -- -- 0
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)
(A,F) B C D E
(A,F),C B D E
(A,F) 0 3.16 2.82 5.09 6.32
(A,F),C 0 3.16 5.09 6.32
B -- 0 3.16 4.47 7.07
B -- 0 4.47 7.07
C -- -- 0 7.07 8.94
D -- -- 0 3.16
D -- -- -- 0 3.16
E -- -- -- 0
E -- -- -- -- 0
Cluster Formed = {(A,F), C }
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)

(A,F),C B D E
D
E
(A,F),C 0 3.16 5.09 6.32

B -- 0 4.47 7.07 B

D -- -- 0 3.16
(A,F), C

E -- -- -- 0
SINGLE LINKAGE CLUSTERING (MINIMUM LINKAGE)

(A,F),C B D E [{(A,F),C}, B] (D,E)

2
(A,F),C 0 3.16 5.09 6.32 [{(A,F),C}, B] 0 4.47

B -- 0 4.47 7.07 (D,E) -- 0

1
D -- -- 0 3.16 Cluster Formed = [{(A,F), C }, B]
and (D, E)
E -- -- -- 0
Final Cluster
{ [{(A,F), C }, B] , (D, E) }
DENDROGRAM FOR { [{(A,F), C }, B] , (D, E) }
5

1 3

A F C B D E

You might also like