Spectral Clustering
Eyal David
Image Processing seminar
May 2008
Lecture Outline
Motivation
Graph overview and construction
Demo
Spectral Clustering
Demo
Cool implementations
2
3
A Tutorial on Spectral Clustering\Arik Azran
Spectral Clustering Example – 2 Spirals
2
1.5 Dataset exhibits complex
1 cluster shapes
K-means performs very
0.5
-2 -1.5 -1 -0.5
0
0 0.5 1 1.5 2 poorly in this space due bias
-0.5
toward dense spherical
-1
clusters.
-1.5
-2
0.8
0.6
0.4
0.2
In the embedded space -0.709 -0.7085 -0.708 -0.7075 -0.707 -0.7065
0
-0.706
given by two leading -0.2
eigenvectors, clusters are -0.4
trivial to separate. -0.6
4
-0.8
Spectral Clustering - Derek Greene
Lecture Outline
Motivation
Graph overview and construction
Graph demo
Spectral Clustering
Spectral Clustering demo
Cool implementation
5
6
Matthias Hein and Ulrike von Luxburg August 2007
7
Matthias Hein and Ulrike von Luxburg August 2007
8
Matthias Hein and Ulrike von Luxburg August 2007
9
Matthias Hein and Ulrike von Luxburg August 2007
10
Matthias Hein and Ulrike von Luxburg August 2007
11
Matthias Hein and Ulrike von Luxburg August 2007
12
Matthias Hein and Ulrike von Luxburg August 2007
13
Matthias Hein and Ulrike von Luxburg August 2007
14
Matthias Hein and Ulrike von Luxburg August 2007
15
Matthias Hein and Ulrike von Luxburg August 2007
16
Matthias Hein and Ulrike von Luxburg August 2007
17
Matthias Hein and Ulrike von Luxburg August 2007
Demo
(Live example)
Lecture Outline
Motivation
Graph overview and construction
Demo
Spectral Clustering
Demo
Cool implementations
19
20
Matthias Hein and Ulrike von Luxburg August 2007
21
Matthias Hein and Ulrike von Luxburg August 2007
22
Matthias Hein and Ulrike von Luxburg August 2007
23
Matthias Hein and Ulrike von Luxburg August 2007
24
Matthias Hein and Ulrike von Luxburg August 2007
25
Matthias Hein and Ulrike von Luxburg August 2007
26
Matthias Hein and Ulrike von Luxburg August 2007
27
Matthias Hein and Ulrike von Luxburg August 2007
28
Matthias Hein and Ulrike von Luxburg August 2007
Eigenvectors & Eigenvalues
29
30
Matthias Hein and Ulrike von Luxburg August 2007
31
Matthias Hein and Ulrike von Luxburg August 2007
Demo
(Live example)
Spectral Clustering Algorithm
Ng, Jordan, and Weiss
Motivation
Given a set of points
S s1 ,..., sn Rl
We would like to cluster them into k
subsets
33
Slides from Spectral Clustering by Rebecca Nugent, Larissa Stanberry
based on Ng et al On Spectral clustering: analysis and algorithm
Algorithm
Form the affinity matrix W R nxn
||si s j || / 2
if i j
2 2
DefineWij e
Wii 0
Scaling parameter chosen by user
Define D a diagonal matrix whose
(i,i) element is the sum of A’s row i
Slides from Spectral Clustering by Rebecca Nugent, Larissa Stanberry 34
based on Ng et al On Spectral clustering: analysis and algorithm
Algorithm
1/ 2 1/ 2
Form the matrix LD WD
Find x1 , x2 ,..., xk , the k largest eigenvectors of
L
These form the the columns of the new
matrix X
Note: have reduced dimension from nxn to nxk
35
Slides from Spectral Clustering by Rebecca Nugent, Larissa Stanberry
based on Ng et al On Spectral clustering: analysis and algorithm
Algorithm
Form the matrix Y
Renormalize each of X’s rows to have unit length
Yij X ij /( X ij 2 ) 2
Y Rnxk j
Treat each row of Y as a point in R k
Cluster into k clusters via K-means
36
Slides from Spectral Clustering by Rebecca Nugent, Larissa Stanberry
based on Ng et al On Spectral clustering: analysis and algorithm
Algorithm
Final Cluster Assignment
Assign point si to cluster j iff row i of Y was
assigned to cluster j
37
Slides from Spectral Clustering by Rebecca Nugent, Larissa Stanberry
based on Ng et al On Spectral clustering: analysis and algorithm
Why?
If we eventually use K-means, why not just
apply K-means to the original data?
This method allows us to cluster non-convex
regions
38
Slides from Spectral Clustering by Rebecca Nugent, Larissa Stanberry
based on Ng et al On Spectral clustering: analysis and algorithm
Some Examples
39
40
41
42
43
44
45
46
47
User’s Prerogative
Affinity matrix construction
Choice of scaling factor
Realistically, search over 2
and pick value that
gives the tightest clusters
Choice of k, the number of clusters
Choice of clustering method
48
Slides from Spectral Clustering by Rebecca Nugent, Larissa Stanberry
based on Ng et al On Spectral clustering: analysis and algorithm
How to select k?
Eigengap: the difference between two consecutive eigenvalues.
Most stable clustering is generally given by the value k that
maximises the expression
k k k 1
Largest eigenvalues 50
of Cisi/Medline data 45 λ1
40
35
max k 2 1
Eigenvalue
30
25
λ2
Choose k=2
20
15
10
5
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
K 49
Spectral Clustering - Derek Greene
Recap – The bottom line
50
Matthias Hein and Ulrike von Luxburg August 2007
Summary
Spectral clustering can help us in hard
clustering problems
The technique is simple to understand
The solution comes from solving a simple
algebra problem which is not hard to
implement
Great care should be taken in choosing the
“starting conditions”
51
The End