UNIT-3 INSTANCE-BASED LEARNING
Instance Based Learning
 It consists of simply storing the presented training
 data. When a new query instance is encountered, a set
 of similar related instances is retrieved from memory
 and used to classify the new query instance.
 Instance-based approaches can construct a different
 approximation to the target function for each distinct
 query instance that must be classified. In fact, many
 techniques construct only a local approximation to the
 target function that applies in the neighborhood of the
 new query instance, and never construct an
 approximation designed to perform well over the entire
 instance space.
Disadvantages of instance-based approaches:
 The cost of classifying new instances can be high, because nearly all computation takes place at classification time rather than when the training examples are first encountered.
 A second disadvantage of many instance-based approaches, especially nearest-neighbor approaches, is that they typically consider all attributes of the instances when attempting to retrieve similar training examples from memory. If the target concept depends on only a few of these attributes, the instances that are truly most "similar" may nevertheless be a large distance apart.
k-NEAREST NEIGHBOR LEARNING:
 The most basic instance-based method is the k-NEAREST NEIGHBOR algorithm. This algorithm assumes all instances correspond to points in the n-dimensional space ℝⁿ.
 The nearest neighbors of an instance are defined in terms of the standard Euclidean distance. More precisely, let an arbitrary instance x be described by the feature vector ⟨a1(x), a2(x), ..., an(x)⟩, where ar(x) denotes the value of the rth attribute of x. The distance between two instances xi and xj is then
   d(xi, xj) = sqrt( Σr=1..n (ar(xi) − ar(xj))² )
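A minimal sketch of the k-NEAREST NEIGHBOR classifier described above (the function names and the tiny dataset are illustrative, not from the text): training simply stores the examples, and classification takes a majority vote among the k examples closest to the query in Euclidean distance.

```python
import math
from collections import Counter

def euclidean(xi, xj):
    # standard Euclidean distance between two feature vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_classify(query, training, k=3):
    """Return the majority class among the k nearest training examples.
    `training` is a list of (feature_vector, label) pairs."""
    neighbors = sorted(training, key=lambda ex: euclidean(query, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# "Training" is just keeping the examples in memory (lazy learning).
data = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
        ((5.0, 5.0), "B"), ((5.5, 4.5), "B")]
print(knn_classify((1.1, 0.9), data, k=3))  # "A"
```

Note that all computation happens inside `knn_classify` at query time, which is exactly the cost profile of lazy learners discussed earlier.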
LOCALLY WEIGHTED REGRESSION:
 Locally weighted regression is a generalization of the nearest-neighbor approach.
 The name describes the method: local because the function is approximated using only data near the query point, weighted because the contribution of each training example is weighted by its distance from the query point, and regression because this is the term used widely in the statistical learning community for the problem of approximating real-valued functions.
Linear regression cannot be used for making
predictions when there exists a non-linear relationship
between X and Y. In such cases, locally weighted linear
regression is used.
Locally weighted linear regression is a non-parametric algorithm: no single global parameter vector is learned in advance. Instead, parameters are computed individually for each query point x. During this computation, a higher "preference" is given to the points in the training set lying in the vicinity of x than to the points lying far away from x.
We modify this procedure to derive a local approximation rather than a global one. The simple way is to redefine the error criterion E to emphasize fitting the local training examples. Three possible criteria are given below:
1. Minimize the squared error over just the k nearest neighbors:
   E1(xq) = 1/2 Σ (f(x) − f̂(x))²   (sum over the k nearest neighbors of xq)
2. Minimize the squared error over the entire set D of training examples, weighting the error of each example by some decreasing function K of its distance from xq:
   E2(xq) = 1/2 Σx∈D (f(x) − f̂(x))² K(d(xq, x))
3. Combine 1 and 2:
   E3(xq) = 1/2 Σ (f(x) − f̂(x))² K(d(xq, x))   (sum over the k nearest neighbors of xq)
If we choose criterion three above and re-derive the gradient descent rule, we obtain the following training rule:
   Δwj = η Σ K(d(xq, x)) (f(x) − f̂(x)) aj(x)   (sum over the k nearest neighbors of xq)
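A 1-D sketch of locally weighted linear regression (the Gaussian kernel and the hyperparameters tau, eta, epochs are illustrative choices, not prescribed by the text): each training example's contribution to the gradient is scaled by a kernel of its distance from the query point, and a fresh local line is fit for every query.

```python
import math

def lwr_predict(xq, X, y, tau=0.3, eta=0.1, epochs=2000):
    """Locally weighted linear regression at a single query point xq (1-D).
    Each training example is weighted by a Gaussian kernel
    K(d) = exp(-d^2 / (2 tau^2)) of its distance d from xq, and a local
    line is fit by gradient descent, with each example's error term
    scaled by its kernel weight. Inputs are centered at xq, so the
    prediction at xq is simply the learned intercept w0."""
    K = [math.exp(-(x - xq) ** 2 / (2 * tau ** 2)) for x in X]
    U = [x - xq for x in X]  # centered inputs
    w0, w1 = 0.0, 0.0
    for _ in range(epochs):
        # batch gradient step on the kernel-weighted squared error
        g0 = sum(k * (t - (w0 + w1 * u)) for k, u, t in zip(K, U, y))
        g1 = sum(k * (t - (w0 + w1 * u)) * u for k, u, t in zip(K, U, y))
        w0 += eta * g0
        w1 += eta * g1
    return w0  # local fit evaluated at xq

# y = x^2 is nonlinear in x, so one global line fits poorly,
# but the locally weighted fit tracks the curve near each query point.
X = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [x * x for x in X]
print(lwr_predict(1.0, X, y))  # near 1.0
```

Note that nothing is learned until a query arrives: the parameters w0, w1 are recomputed from scratch for every query point, which is what makes the method non-parametric and lazy.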
Radial Basis Functions:
 One approach to function approximation that is closely
 related to distance-weighted regression and also to
 artificial neural networks is learning with radial basis
 functions.
 In this approach, the learned hypothesis is a function of the form
   f̂(x) = w0 + Σu=1..k wu Ku(d(xu, x))
 where each xu is an instance from X and the kernel function Ku(d(xu, x)) is defined so that it decreases as the distance d(xu, x) increases; a common choice is the Gaussian kernel Ku(d) = e^(−d²/(2σu²)).
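A minimal 1-D sketch of such a hypothesis (Gaussian kernels centered at the training points; all names and hyperparameters are illustrative assumptions): with the kernel centers and widths held fixed, only the output weights w0, wu are trained, so the optimization is linear in the parameters.

```python
import math

def gaussian(d, sigma=1.0):
    # Ku(d) = exp(-d^2 / (2 sigma^2)), decreasing in the distance d
    return math.exp(-d * d / (2 * sigma * sigma))

def rbf_predict(x, centers, weights, w0, sigma=1.0):
    # fhat(x) = w0 + sum_u wu * Ku(d(xu, x))
    return w0 + sum(w * gaussian(abs(x - c), sigma)
                    for w, c in zip(weights, centers))

def train_rbf(X, y, centers, sigma=1.0, eta=0.1, epochs=5000):
    """Fit the output-layer weights by gradient descent on squared error.
    Centers and widths stay fixed; only w0 and the wu are learned."""
    w0 = 0.0
    w = [0.0] * len(centers)
    for _ in range(epochs):
        for x, t in zip(X, y):
            err = t - rbf_predict(x, centers, w, w0, sigma)
            w0 += eta * err
            for u, c in enumerate(centers):
                w[u] += eta * err * gaussian(abs(x - c), sigma)
    return w0, w

# One center per training example, as in the text's description.
X = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 0.0, 1.0]
w0, w = train_rbf(X, y, centers=X)
print(rbf_predict(1.0, X, w, w0))  # near the target 1.0
```

Placing one kernel at every training instance is one common choice; alternatives include clustering the data and centering a kernel on each cluster.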
CASE-BASED REASONING (CBR):
 THREE KEY PROPERTIES OF INSTANCE
 BASED LEARNING:
 ◦ First, they are lazy learning methods in that they defer the
   decision of how to generalize beyond the training data until a
   new query instance is observed.
 ◦ Second, they classify new query instances by analyzing similar
   instances while ignoring instances that are very different from
   the query.
 ◦ Third, they represent instances as real-valued points in an
   n-dimensional Euclidean space.
CBR is a learning paradigm based on the first two of these principles, but not the third. In CBR, instances are typically represented using richer symbolic descriptions, and the methods used to retrieve similar instances are correspondingly more elaborate.
CBR has been applied to problems such as conceptual
design of mechanical devices based on a stored library
of previous designs.
Example:
 The CADET system employs case-based reasoning to
 assist in the conceptual design of simple mechanical
 devices such as water faucets.
 The system uses a library containing approximately 75
 previous designs and design fragments to suggest
 conceptual designs to meet the specifications of new
 design problems.
 Each instance stored in memory (e.g., a water pipe) is
 represented by describing both its structure and its
 qualitative function.