INSTANCE-BASED LEARNING
k-NEAREST NEIGHBOR LEARNING
The k-NEAREST NEIGHBOR algorithm is the most basic instance-based method.
The algorithm assumes all instances correspond to points in the n-dimensional space $\mathbb{R}^n$.
The nearest neighbors of an instance are defined in terms of the
standard Euclidean distance.
Let an arbitrary instance $x$ be described by the feature vector
$$\langle a_1(x), a_2(x), \ldots, a_n(x) \rangle$$
where $a_r(x)$ denotes the value of the $r$th attribute of instance $x$.
The distance between two instances $x_i$ and $x_j$ is defined to be $d(x_i, x_j)$, where
$$d(x_i, x_j) \equiv \sqrt{\sum_{r=1}^{n} \bigl(a_r(x_i) - a_r(x_j)\bigr)^2}$$
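A minimal sketch of this distance computation in Python (the function name and the use of plain lists for instances are illustrative assumptions, not from the slides):

    import math

    def euclidean_distance(xi, xj):
        """Standard Euclidean distance between two instances, each given
        as a sequence of n attribute values a_r(x)."""
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

    # Example: distance between two 3-dimensional instances
    print(euclidean_distance([1.0, 2.0, 3.0], [4.0, 6.0, 3.0]))  # 5.0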
In nearest-neighbor learning the target function may be either discrete-valued or real-valued.
Discrete-Valued:
The target function is of the form $f : \mathbb{R}^n \rightarrow V$, where $V$ is the finite set $\{v_1, \ldots, v_s\}$.
Let $x_q$ be the query instance.
The value $\hat{f}(x_q)$ returned by the algorithm as its estimate of $f(x_q)$ is just the most common value of $f$ among the $k$ training examples nearest to $x_q$.
If we choose $k = 1$, the 1-NEAREST NEIGHBOR algorithm assigns to $\hat{f}(x_q)$ the value $f(x_i)$, where $x_i$ is the training instance nearest to $x_q$.
For larger values of k, the algorithm assigns the most common value
among the k nearest training examples.
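A rough sketch of this discrete-valued algorithm (the layout of the data as (instance, label) pairs and the name knn_classify are assumptions made for illustration):

    import math
    from collections import Counter

    def knn_classify(training_data, xq, k):
        """Return the most common label f(x) among the k training examples
        nearest to the query instance xq.
        training_data: list of (instance, label) pairs."""
        dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
        neighbors = sorted(training_data, key=lambda pair: dist(pair[0], xq))[:k]
        return Counter(label for _, label in neighbors).most_common(1)[0][0]

    # Toy example with a Boolean-valued target ('+' / '-')
    data = [([1, 1], '+'), ([2, 1], '+'), ([5, 5], '-'), ([6, 5], '-'), ([5, 6], '-')]
    print(knn_classify(data, [1.5, 1.0], k=3))  # '+'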
The figure shows the operation of the k-NEAREST NEIGHBOR algorithm for the case where the instances are points in a two-dimensional space and where the target function is Boolean-valued.
The positive and negative training examples are shown as '+' and '-' respectively.
A query point $x_q$ is shown as well.
The 1-NEAREST NEIGHBOR algorithm classifies $x_q$ as a positive example in this figure, whereas the 5-NEAREST NEIGHBOR algorithm classifies it as a negative example.
The k-NEAREST NEIGHBOR algorithm never forms an explicit general hypothesis $\hat{f}$ for the target function $f$.
It simply computes the classification of each new query instance as
needed.
The diagram shows the shape of the decision surface induced by 1-
NEAREST NEIGHBOR over the entire instance space.
The decision surface is a combination of convex polyhedra surrounding each of the training examples.
For every training example, the polyhedron indicates the set of query
points whose classification will be completely determined by that training
example.
Query points outside the polyhedron are closer to some other training example.
This kind of diagram is often called the Voronoi diagram of the set of training examples.
Continuous-Valued:
The k-NEAREST NEIGHBOR algorithm can be adapted to approximating a continuous-valued target function.
We calculate the mean value of the k nearest training examples rather than their most common value.
We approximate the real-valued target function $f : \mathbb{R}^n \rightarrow \mathbb{R}$ by:
$$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} f(x_i)}{k}$$
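A minimal sketch of this continuous-valued variant, using the same illustrative data layout as the classification sketch above (names are assumptions):

    import math

    def knn_regress(training_data, xq, k):
        """Estimate a real-valued target as the mean of f(x) over the
        k training examples nearest to xq.
        training_data: list of (instance, target_value) pairs."""
        dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
        neighbors = sorted(training_data, key=lambda pair: dist(pair[0], xq))[:k]
        return sum(y for _, y in neighbors) / k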
DISTANCE-WEIGHTED NEAREST NEIGHBOR ALGORITHM
One refinement of the k-NEAREST NEIGHBOR algorithm is to weight the contribution of each of the k neighbors according to its distance from the query point $x_q$.
This can be accomplished by
$$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta\bigl(v, f(x_i)\bigr)$$
where
$$w_i \equiv \frac{1}{d(x_q, x_i)^2}$$
and $\delta(a, b) = 1$ if $a = b$ and $0$ otherwise.
If the query point $x_q$ exactly matches one of the training instances $x_i$, the denominator $d(x_q, x_i)^2$ is zero; in this case we assign $\hat{f}(x_q)$ to be $f(x_i)$.
If there are several such training examples, we assign the majority classification among them.
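A sketch of this distance-weighted classification rule, including the exact-match special case just described (function and variable names are illustrative):

    import math
    from collections import defaultdict

    def weighted_knn_classify(training_data, xq, k):
        """Each of the k nearest neighbors votes for its label with weight
        w_i = 1 / d(xq, x_i)^2; exact matches (distance 0) decide directly."""
        dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
        scored = sorted(((dist(x, xq), y) for x, y in training_data),
                        key=lambda t: t[0])[:k]
        exact = [y for d, y in scored if d == 0]
        if exact:  # query coincides with training instance(s): majority label
            return max(set(exact), key=exact.count)
        votes = defaultdict(float)
        for d, y in scored:
            votes[y] += 1.0 / d ** 2
        return max(votes, key=votes.get)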
We can distance-weight the instances for real-valued target functions in a similar fashion by
$$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i \, f(x_i)}{\sum_{i=1}^{k} w_i}$$
where the denominator is a constant that normalizes the contributions of the various weights.
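A corresponding sketch for the real-valued case (again with illustrative names; an exact match simply returns the stored target value):

    import math

    def weighted_knn_regress(training_data, xq, k):
        """Distance-weighted average of the k nearest target values,
        normalized by the sum of the weights w_i = 1 / d(xq, x_i)^2."""
        dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
        scored = sorted(((dist(x, xq), y) for x, y in training_data),
                        key=lambda t: t[0])[:k]
        exact = [y for d, y in scored if d == 0]
        if exact:
            return sum(exact) / len(exact)
        weights = [1.0 / d ** 2 for d, _ in scored]
        return sum(w * y for w, (_, y) in zip(weights, scored)) / sum(weights)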
All of these variants of the k-NEAREST NEIGHBOR algorithm consider only the k nearest neighbors when classifying the query point.
Once we add distance weighting, there is really no harm in allowing all training examples to influence the classification of $x_q$, because very distant examples have very little effect on $\hat{f}(x_q)$.
Disadvantage: the classifier will run more slowly.
If all training examples are considered when classifying a new query instance, we call the algorithm a global method; if only the nearest training examples are considered, we call it a local method.
Remarks:
The distance-weighted k-NEAREST NEIGHBOR algorithm is a highly effective inductive inference method for many practical problems.
The inductive bias corresponds to an assumption that the
classification of an instance xq will be most similar to the
classification of other instances that are nearby in Euclidean
distance.
The distance between instances is calculated based on all attributes
of the instance.
Although an instance may be described by a large number of attributes, only a few of them may be relevant to determining the classification.
In that case, instances that have identical values for the few relevant attributes may nevertheless be distant from one another in the n-dimensional space.
The similarity metric used by k-NN, because it depends on all attributes, will then be misleading.
The distance between two neighbors will be dominated by the large number of irrelevant attributes. This difficulty, which arises when many irrelevant attributes are present, is sometimes referred to as the curse of dimensionality.
We can overcome this problem by weighting each attribute differently when calculating the distance between two instances.
This corresponds to stretching the axes in the Euclidean space: shortening the axes that correspond to less relevant attributes and lengthening the axes that correspond to more relevant attributes.
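A minimal sketch of such an attribute-weighted (axis-stretched) distance; the scaling factors shown are illustrative and would in practice be chosen as discussed next:

    import math

    def weighted_distance(xi, xj, attr_weights):
        """Euclidean distance with one non-negative scaling factor per attribute.
        A weight of 0 effectively eliminates that attribute; larger weights
        stretch (lengthen) the corresponding axis."""
        return math.sqrt(sum(w * (a - b) ** 2
                             for w, a, b in zip(attr_weights, xi, xj)))

    # Only the first two attributes are treated as relevant here
    print(weighted_distance([1, 2, 9], [1, 5, 0], [1.0, 1.0, 0.0]))  # 3.0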
We can use cross-validation to automatically determine the amount by which each axis should be stretched.
An alternative is to eliminate the least relevant attributes from the instance space entirely, which amounts to setting some of the scaling factors to zero.
These approaches can be based on leave-one-out cross-validation, in which the set of m training examples is repeatedly divided into a training set of size m - 1 and a test set of size 1.
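A sketch of leave-one-out cross-validation, shown here selecting k for concreteness (the slides describe using it to choose the axis-stretching amounts); knn_classify is the earlier illustrative classifier, repeated so the snippet stands alone:

    import math
    from collections import Counter

    def knn_classify(training_data, xq, k):
        dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
        neighbors = sorted(training_data, key=lambda pair: dist(pair[0], xq))[:k]
        return Counter(label for _, label in neighbors).most_common(1)[0][0]

    def loocv_error(training_data, k):
        """Leave-one-out cross-validation: each of the m examples is held out
        once as a test set of size 1 while the remaining m - 1 examples are
        used for training; return the fraction misclassified."""
        m = len(training_data)
        mistakes = 0
        for i, (x, y) in enumerate(training_data):
            rest = training_data[:i] + training_data[i + 1:]
            if knn_classify(rest, x, k) != y:
                mistakes += 1
        return mistakes / m

    # Illustrative usage: pick the k with the lowest leave-one-out error
    # best_k = min([1, 3, 5], key=lambda k: loocv_error(training_data, k))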