Modeling with
Instances Prof. Sonia F. Panesar
Department of Computer Sci. & Eng.
Babaria Institute of
Technology
Classificatio
n
Classificatio
n
• Classification is a form of supervised machine
learning
• The classification algorithm learns from
labeled data. Data labels make it easier for
your models to make decisions based on the
logic rules you’ve defined
Classificatio
n
• Instance-based learning classifiers are
supervised
• lazy learners — they have no training
phase, and they simply memorize training
data, in- memory, to predict classifications
for new data points
Instance-based classifiers
• k-nearest neighbor (kNN)
• Self-organizing maps
• Locally weighted learning
limitations
• These classifiers are not well-suited for
– Noisy data (data with unexplainable
random variation)
– Datasets with unimportant or irrelevant
features
– Datasets with missing value
Clustering Vs Classification
Paramenter CLASSIFICATION CLUSTERING
Type used for supervised learning used for unsupervised learning
grouping the instances based on
process of classifying the input instances their similarity without the help of
Basic based on their corresponding class labels class labels
it has labels so there is need of training and
testing dataset for verifying the model there is no need of training and
Need testing dataset
created
less complex as compared to
Complexity more complex as compared to clustering classification
k-means clustering algorithm, Fuzzy
Example Logistic regression, Naive Bayes classifier, c-means clustering algorithm,
Algorithms Support vector machines etc. Gaussian (EM) clustering algorithm
etc.
Nearest Neighbour Analysis
• Nearest neighbor methods work by taking an
observation’s attribute value and then locating
another observation whose attribute value is
numerically nearest to it
Average Nearest Neighbour Analysis
• Average nearest neighbor algorithms are basic
yet powerful classification algorithms. They’re
useful for finding and classifying observations
that are most similar on average
• used in pattern recognition, in chemical and
biological structural analysis, and in spatial data
modeling.
• The purpose of using an average nearest
neighbor algorithm is to classify observations
based on the average of the arithmetic distances
between them.
K Nearest Neighbour Analysis
How it Works
Usefulness & Limitation
• Useful for multi-label learning —
supervised learning
• It takes a lot longer than other
classification methods to classify a sample
• performance depends on the distance
function and on the of
neighborhood parameter k. value the
Applications
• website categorization, web page ranking
• customer relationship management (CRM)
Real-World Problems with Nearest
Neighbor Algorithms
• used in retail to detect patterns in credit card
usage.
• use for visual pattern recognition to scan and
detect hidden packages in the bottom bin of a
shopping cart at checkout. If an object is
detected that is an exact match for an object
listed in the database, the price of the spotted
product could even automatically be added to
the customer’s bill.
Reference
s• https://analyticsindiamag.com/7-types
- classification-algorithms/
• https://matthew- brett.github.io/dsfe/chapters/
09/Nearest_Neighb ors
• https://www.analyticsvidhya.com/blog/2018/08
/ k-nearest-neighbor-introduction-regression-
python/
• https://www.analyticsvidhya.com/blog/2018/03
/ introduction-k-neighbours-algorithm-clustering
/