0% found this document useful (0 votes)
362 views2 pages

A Diana Algoritma

DIANA is a hierarchical clustering technique that builds the cluster hierarchy in the inverse order of agglomerative methods. It starts with all objects in one cluster and recursively splits the largest cluster until each object is its own cluster. To determine how to split clusters, it finds the object most dissimilar to others and assigns it to a new "splinter group". It then calculates the difference between each remaining object's average distance to the splinter group and non-splinter group, splitting the cluster based on which objects have a positive difference. This process repeats recursively until each object is its own cluster. The divisive coefficient metric indicates the strength of clustering found, ranging from 0 (no structure) to 1 (clear structure).
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
362 views2 pages

A Diana Algoritma

DIANA is a hierarchical clustering technique that builds the cluster hierarchy in the inverse order of agglomerative methods. It starts with all objects in one cluster and recursively splits the largest cluster until each object is its own cluster. To determine how to split clusters, it finds the object most dissimilar to others and assigns it to a new "splinter group". It then calculates the difference between each remaining object's average distance to the splinter group and non-splinter group, splitting the cluster based on which objects have a positive difference. This process repeats recursively until each object is its own cluster. The divisive coefficient metric indicates the strength of clustering found, ranging from 0 (no structure) to 1 (clear structure).
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 2

7.1.

5 Divisive Analysis (DIANA)


DIANA is a hierarchical clustering technique, but its main
difference with the agglomerative method (AGNES) is that it
constructs the hierarchy in the inverse order.
Initially (Step 0), there is one large cluster consisting of
all n objects. At each subsequent step, the largest available
cluster is split into two clusters until finally all clusters, comprise
of single objects. Thus, the hierarchy is built in n-1 steps.
In the first step of an agglomerative method, all possible fusions
of two objects are considered leading to
combinations. In
the divisive method based on the same principle, there are
possibilities to split the data into two clusters. This number is
considerably larger than that in the case of an agglomerative
method.
To avoid considering all possibilities, the algorithm proceeds as
follows.
1. Find the object, which has the highest average dissimilarity to all
other objects. This object initiates a new cluster a sort of
a splinter group.
2. For each object i outside the splinter group compute
3. Di = [average d(i,j) j Rsplinter group ] - [average d(i,j) j Rsplinter group]
4. Find an object h for which the difference Dh is the largest. If Dh is
positive, then h is, on the average close to the splinter group.
5. Repeat Steps 2 and 3 until all differences Dh are negative. The data
set is then split into two clusters.
6. Select the cluster with the largest diameter. The diameter of a
cluster is the largest dissimilarity between any two of its objects.
Then divide this cluster, following steps 1-4.
7. Repeat Step 5 until all clusters contain only a single object.

Divisive Coefficient (DC)


For each object i, let d ( i ) denote the diameter of the last cluster
to which it belongs (before being split off as a single object),
divided by the diameter of the whole data set.
The divisive coefficient (DC), given by

indicates the strength of the clustering structure found by the


algorithm.
Graphical display
The hierarchy obtained from Diana can be represented
graphically by the dissimilarity banner.
Dissimilarity banner consists of lines with stars and stripes,
which repeat the identifiers of the objects. The banner is read
from left to right, but the fixed scales above and below the
banner range from 1.00 (corresponding to the diameter of the
entire data set) to 0.00 (corresponding to the diameter of the
singletons)
Each line with stars ends at the diameter at which the cluster is
split. The actual diameter of the data set (corresponding to 1.00
in the banner) is displayed just below the banner.
The divisive coefficient defined above can be seen as the average
width of the divisive banner.

where li is the length of the line containing the identifier of


object i. Thus it can be seen as the counterpart of the
agglomerative coefficient (AC):
DC = 0

No cluster structure

DC = 1

Clear cluster structure

http://www.unesco.org/webworld/idams/advguide/Chapt7_1_5.htm

anggal 27 april 2016

You might also like