Machine Learning
Dr. Shylaja S S
Professor
Department of Computer Science & Engineering
Eager vs Lazy Learning

[Diagram: in eager learning, the training data is used to build a MODEL that is stored in memory and answers every test query; in lazy learning, the training data itself is stored in memory and is processed for each test query to produce a prediction.]

Eager Learning
● Creates a model using the training data, which is used across all test queries.

Lazy Learning
● For every test query, it processes the training data to make a prediction.
Lazy Learning
● No global model is created; the training data is simply stored in memory.
● For every test instance, a common procedure is run on the training data to find the prediction value.
Inductive Bias
The inductive bias of a learning algorithm is the set of assumptions that the learner uses to predict outputs for inputs it has not encountered.
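For k-NN, this bias is the assumption that instances lying close to each other in feature space tend to share the same label.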
K-NN for Classification and Regression
● All instances correspond to points in the n-D space.
● Given a query point, find where it belongs in the space.
● Assign the mode of the neighbours for classification and the mean for regression.

Euclidean distance between two instances x and y:
d(x, y) = [(x1 - y1)^2 + (x2 - y2)^2 + ... + (xd - yd)^2]^(1/2)
where,
● d: number of attributes
● xi and yi: the ith attributes of x and y
● Any data instance xi will have a label f(xi), one of the classes from the set V.
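As an illustration, a minimal Python sketch of this distance measure, checked against the worked example later in the deck:

import math

def euclidean(x, y):
    # Euclidean distance between two d-dimensional points.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Query point xq = (4, 10) and point A = (2, 8) from the example below:
print(euclidean((4, 10), (2, 8)))   # 2.8284..., shown as 2.82 on the slides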
Practice Problem 1: K-NN Classification

Practice Problem 2: K-NN Regression
Id   Weight
id6  60
id5  72
id4  59

We choose the mean of these 3 nearest neighbours as our answer.
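Working that out, the predicted weight is the mean of the 3 nearest neighbours: (60 + 72 + 59) / 3 ≈ 63.67.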
Choices a designer has to make while implementing KNN
Point  X  Y   Class
A      2  8   triangle
B      3  9   triangle
C      5  11  circle
D      6  13  circle
Find appropriate value of k

[Scatter plot of points A-D in the X-Y plane, with the query point xq]
Euclidean distance: let us calculate the distance from the query point xq = (4, 10) to all other points.

Pair    Distance  Class
xq, A   2.82      triangle
xq, B   1.414     triangle
xq, C   1.414     circle
xq, D   3.6055    circle

• Let's set k = 2.
• We have a dilemma while choosing the mode: the two nearest neighbours, B (triangle) and C (circle), give a tie.
What k value should I start with?

• Let's set k = 3.
• Now there is no problem: the three nearest neighbours are B (triangle), C (circle) and A (triangle), so we can assign the query point as a triangle.
• But we won't stop here!
Suppose the training data also contains points E, F (heart) and G (square). Let's calculate the distance between each point and the query point xq = (4, 10):

Pair    Distance  Class
xq, A   2.82      triangle
xq, B   1.414     triangle
xq, C   1.414     circle
xq, D   3.6055    circle
xq, E   1.414     heart
xq, F   2.23606   heart
xq, G   4.4721    square

• The label set now has 4 classes, which is even, so by the earlier reasoning k should be odd.
• So let k = 5.
• We are stuck again!
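Why stuck? With k = 5 the nearest neighbours are B, C and E (1.414 each), F (2.23606) and A (2.82): two triangles (B, A), two hearts (E, F) and one circle (C), so the mode is tied between triangle and heart.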
Find appropriate value of k

[Elbow plot: error-rate (y-axis) against K-value (x-axis)]

• Plot the error-rate for a range of K values and pick the K at the elbow of the curve (a code sketch follows below).
• Run the K-NN algorithm with the best K value found and redo the classification report and the confusion matrix.
• Any changes to the data will require you to redo the elbow method and re-determine the best K.
• K is very low: the query point is labelled by only the few points nearest to it, since only a small number of points are considered. Chance of high variance, i.e. overfitting.
• K is very high: the query point mostly gets the class label that holds a majority in the dataset. Chance of high bias, i.e. underfitting.
• K is neither high nor low: a good trade-off without high bias or high variance.
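A minimal sketch of this elbow search, assuming scikit-learn is available and using the small red/green dataset from the next slide with its 80:20 split:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Red/green dataset from the next slide; first 8 rows train, last 2 test.
X = np.array([[3, 5], [3, 6], [4, 6], [4, 4],
              [7, 10], [8, 9], [8, 8], [9, 10], [5, 5], [4, 7]])
y = np.array(["red"] * 4 + ["green"] * 4 + ["red", "green"])
X_train, y_train, X_test, y_test = X[:8], y[:8], X[8:], y[8:]

error_rate = []
for k in range(1, 9):                                    # candidate K values
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    error_rate.append(1 - knn.score(X_test, y_test))     # error = 1 - accuracy

best_k = int(np.argmin(error_rate)) + 1                  # K at the minimum of the curve
print(best_k, error_rate)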
kNN algorithm

• For most practical purposes, the dataset is divided into production (training) and test data sets (80:20).
• Some kind of pre-processing (learning) is being done: it is not used as a naive (plain-vanilla) version.

X  Y   Class
3  5   red
3  6   red
4  6   red
4  4   red
7  10  green
8  9   green
8  8   green
9  10  green
5  5   red
4  7   green
Working algorithm for kNN Classification

• Upload the production data into memory.
• Decide the distance measure (we will choose Euclidean).
• Decide the k value (we will choose k = 4).
• For each data point in the test data, find the distance to all stored points, choose the k nearest neighbours, and assign the prediction as the mode of the neighbours' labels (a code sketch follows below).
• Calculate the error and accuracy.

Training data:
X  Y   Class
3  5   red
3  6   red
4  6   red
4  4   red
7  10  green
8  9   green
8  8   green
9  10  green

Test data:
X  Y  Class
5  5  red
4  7  green
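A minimal plain-Python sketch of these steps on the slide's data (a lazy learner: no model is built, the stored points are scanned per query):

from collections import Counter
import math

# Production (training) data from the slide: (X, Y) -> class.
train = [((3, 5), "red"), ((3, 6), "red"), ((4, 6), "red"), ((4, 4), "red"),
         ((7, 10), "green"), ((8, 9), "green"), ((8, 8), "green"), ((9, 10), "green")]
test = [((5, 5), "red"), ((4, 7), "green")]
k = 4

def predict(query):
    # Sort stored points by Euclidean distance to the query (the lazy step).
    nearest = sorted(train, key=lambda p: math.dist(query, p[0]))[:k]
    # Mode of the k nearest neighbours' labels.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

wrong = sum(predict(q) != label for q, label in test)
print(f"error = {wrong}/{len(test)}")   # 1/2: (4, 7) gets predicted red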
Working algorithm for kNN Classification

Tally TP, TN, FP and FN over the test queries, then:
Error = (FP + FN) / all predictions

With k = 4, the test point (5, 5) is correctly predicted as red, while (4, 7) is also predicted red (all four of its nearest neighbours are red) although its actual class is green. One of the two predictions is wrong, so Error = 1/2 = 0.5.
Working algorithm for kNN Regression

Training data:
X  Y   Value
3  5   3.5
3  6   2.6
4  6   2.8
4  4   3.1
7  10  8.7
8  9   8.5
8  8   9.3
9  10  9.7

Test data:
X  Y  Value
5  5  3.7
7  8  8.4

For the test point (5, 5), the mean of the k = 4 nearest values (2.8, 3.1, 3.5, 2.6) is 3.0, so:
e1 = (3.7 - 3.0)^2 = 0.49
Working algorithm for kNN Regression

For the test point (7, 8), the mean of the k = 4 nearest values (9.3, 8.5, 8.7, 9.7) is 9.05, so:
e2 = (8.4 - 9.05)^2 = 0.4225

So far: e1 = 0.49, e2 = 0.4225.
Working algorithm for kNN Regression

Mean squared error over the test set:
MSE = (e1 + e2) / 2 = (0.49 + 0.4225) / 2 = 0.45625
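A minimal plain-Python sketch reproducing these numbers (the same lazy procedure, with the mean in place of the mode):

import math

# Training data from the slides: (X, Y) -> value.
train = [((3, 5), 3.5), ((3, 6), 2.6), ((4, 6), 2.8), ((4, 4), 3.1),
         ((7, 10), 8.7), ((8, 9), 8.5), ((8, 8), 9.3), ((9, 10), 9.7)]
test = [((5, 5), 3.7), ((7, 8), 8.4)]
k = 4

def predict(query):
    nearest = sorted(train, key=lambda p: math.dist(query, p[0]))[:k]
    return sum(value for _, value in nearest) / k   # mean of the k nearest values

squared_errors = [(actual - predict(q)) ** 2 for q, actual in test]
mse = sum(squared_errors) / len(squared_errors)
print(squared_errors, mse)   # ~[0.49, 0.4225] -> MSE ~0.45625, up to float rounding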
Weighted KNN
• Weigh the contribution of each of the k neighbours according to their distance to the query xq (a sketch follows below).
• Give greater weight to closer neighbours.
• Let's say with k = 5 we get A, B, C, D, E as the closest points, with distances as depicted in the plot:

[Plot of the distances of A-E from xq]
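One common way to implement this; the inverse-square weighting below is an assumption, as the slides do not fix a formula:

import math
from collections import defaultdict

def weighted_predict(query, train, k):
    # train: list of ((x, y), label) pairs, as in the earlier sketches.
    nearest = sorted(train, key=lambda p: math.dist(query, p[0]))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        d = math.dist(query, point)
        votes[label] += 1.0 / (d * d + 1e-9)   # closer neighbours weigh more
    return max(votes, key=votes.get)           # label with the largest total weight

With the 7-point example above, these 1/d^2 weights give heart 0.5 + 0.2 = 0.7 versus triangle 0.5 + 0.126 ≈ 0.63, breaking the earlier triangle/heart tie.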
Disadvantages of KNN
• Time: prediction is computationally expensive, since we need to compute the distance between the query point and all other points (and N is often in the 1000s).
• Space: high memory requirement, because this lazy algorithm stores all of the training data.
• It does not work well with large datasets: the cost of calculating the distance between the new point and each existing point is high.
• It is sensitive to scaling.
Pre-normalizing:
Euclidean distance between points 1 and 2 =
[(100000 - 80000)^2 + (30 - 25)^2]^(1/2) ≈ 20000

• The high magnitude of income dominated the distance between the two points.
• This will impact performance, since higher weightage is effectively given to variables with higher magnitude.
Effect of normalizing

How to normalize? For example, rescale each variable to z-scores: x' = (x - mean) / standard deviation (a sketch follows below).

Post-normalizing:
Euclidean distance between points 1 and 2 =
[(0.608 + 0.260)^2 + (-0.447 + 1.192)^2]^(1/2) = 1.14

• The distance is not biased towards the income variable anymore.
• Similar weightage is given to both variables.
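A sketch of z-score normalization with NumPy; only the first two rows (income, age) come from the slides, and the third row is made up so that the standard deviation is meaningful:

import numpy as np

# Columns: income, age. Rows 1 and 2 are from the slide; row 3 is invented.
data = np.array([[100000.0, 30.0],
                 [80000.0, 25.0],
                 [60000.0, 35.0]])

normed = (data - data.mean(axis=0)) / data.std(axis=0)   # x' = (x - mean) / std
print(np.linalg.norm(normed[0] - normed[1]))             # distance now weighs both columns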
The curse of dimensionality

• One might think that one rescue could be to increase the number of training samples, n, until the nearest neighbours are truly close to the test point.
• How many data points would we need so that ℓ, the edge length of the small hypercube around the test point that contains its k nearest neighbours, becomes truly small? For data spread uniformly over a d-dimensional unit cube, ℓ ≈ (k/n)^(1/d), so n ≈ k / ℓ^d.
• Fix ℓ = 1/10 = 0.1: then n ≈ k · 10^d.
• For d > 100 we would need far more data points than there are electrons in the universe...
How to deal with it?

• Assign weights to the attributes when calculating distances. Say we are predicting the price of a house: we give higher weights to features like area and locality than to colour (a sketch follows below).
• Iteratively leave out one of the attributes and test the algorithm; this exercise can lead us to the best set of attributes.
• Non-binary characterizations: if there is a natural order among the data, convert to numerical values based on that order.
  e.g. Educational attainment: HS, College, MS, PhD -> 1, 2, 3, 4
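A sketch of the attribute-weighting idea; the features and weight values here are hypothetical:

# Hypothetical house features: area, locality score, colour code.
weights = [0.5, 0.4, 0.1]   # made-up weights: colour matters least

def weighted_euclidean(x, y, w):
    # One weight per attribute, applied to the squared differences.
    return sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y)) ** 0.5

print(weighted_euclidean((120, 8, 3), (100, 7, 9), weights))   # sample call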
Quiz

Which of the following statements about k-NN are true?
1. k-NN performs much better if all of the data have the same scale.
2. k-NN works well with a small number of input variables (p), but struggles when the number of inputs is very large.
3. k-NN makes no assumptions about the functional form of the problem being solved.
Options: 1 and 2; 1 and 3

True or False: to be more sure of which classifications you make, you can try increasing the value of k. (True)
Machine Learning
Dr. Shylaja S S
Department of Computer Science & Engineering
shylaja.sharath@pes.edu