Python for Data Science
Prof. Ragunathan Rengasamy
Department of Computer Science and Engineering
Indian Institute of Technology, Madras
Lecture – 37
K – means Clustering
So, we are into the last theory lecture of this course. I am going to talk about K-means
clustering today; following this lecture there will be a demonstration of this technique on
a case study by the TAs of this course.
(Refer Slide Time: 00:53)
So, what is K-means clustering? K-means clustering is a technique that you can use to
cluster or partition a certain number of observations, let us say N observations, into K
clusters. The number of clusters K is something that you either choose, or you can run
the algorithm for several values of K and find the optimum number of clusters that this
data should be partitioned into.
Just a little bit of semantics: I am teaching clustering here under the heading of
classification. In general, the typical classification algorithms that we see are supervised
algorithms, in the sense that the data is partitioned into different classes and these labels
are given. So, these are labeled data points and the classification algorithm's job is to
actually find decision boundaries between these different classes. So, those are
supervised algorithms.
So, an example of that would be the k-nearest neighbor method that we saw before, where
we have labeled data and, when you get a test data point, you bin it into the most likely
class that it belongs to. However, many clustering algorithms are unsupervised
algorithms in the sense that you have N observations, as we mentioned here, but they are
not labeled into different classes. So, you might think of clustering as slightly different
from classification: you are actually just finding out whether there are data points which
share some common characteristics or attributes.
However, as far as this course is concerned, we will still think of this as some form of
classification or categorization of data into groups; it is just that we do not know a priori
how many groups there are. For a clustering technique to be useful, once you partition
this data into different groups, as a second step one would like to look at whether there
are certain characteristics that pop out from each of these groups, so as to understand and
label the individual groups in some way or the other.
So, in that sense we will still think of this as some form of categorization, which I could
also call classification into different groups without any supervision. Having said all of
that, K-means is one of the simplest unsupervised algorithms: if you give the number of
clusters or categories that exist in the data, then the algorithm will partition the N
observations into those categories. As we will see, the algorithm needs a distance metric,
and it works well whenever that metric lets us clearly define a mean for the samples.
(Refer Slide Time: 04:32)
So, when we were talking about optimization for data science, I told you that all kinds of
algorithms that you come up with in machine learning have some optimization basis. So,
let us start by describing what K-means clustering optimizes, or what objective is driving
the K-means clustering algorithm. As we described in the previous slide, there are N
observations x1 to xN and we are asking the algorithm to partition them into K clusters.
What does it mean to partition the data into K clusters? We will generate K sets S1 to SK
and we will say this data point belongs to this set, that data point belongs to the other set,
and so on.
To give a very simple example, let us say you have the observations x1, x2, all the way
up to xN, and just for the sake of argument let us take it that we are going to partition this
data into 2 clusters. So, there is going to be one set for cluster 1 and another set for
cluster 2. Now, the job of the K-means clustering algorithm is to put these data points
into these two bins. So, let us say this point could go here, this one could go there, x3
could go here, xN could go there, maybe an x4 could go here, and so on.
So, all you are doing is taking all of these data points and putting them into two bins, and
while you do it, what you are really looking for in a clustering algorithm is to make sure
that all the data points in a bin share certain characteristics and are alike; in the sense that
if I take two data points from this bin and two data points from that bin, the two points
within a bin will be more like each other, but if I take one data point from each bin, they
will be, in some sense, unlike each other. So, if you cluster the data this way, then it is
something we can use: we could say that all of these data points share certain common
characteristics, and then we are going to make some judgments based on what those
characteristics are.
So, what we would really like to do is the following: we would like to keep these groups
as compact as possible, so that like data points are grouped together, which translates to
minimizing the within-cluster sum of squared distances. I want each group to be a
compact group. A mathematical way of doing this is the following: if I have K sets into
which I am putting all of these points, you take each set, calculate a mean for all of its
data, and make sure that the difference between the data and the mean, squared in a norm
sense, is as small as possible for that cluster; and since you have K clusters, you sum all
of them together and say you want this whole function to be minimized. So, this is the
basic idea of K-means clustering.
So, there is a double summation: the inner summation runs over all the data that has been
identified to belong to a set, for which I calculate the mean of that set; and I will find a
solution for these means in such a way that this within-cluster distance, x (which is in the
set) minus the mean, squared, is minimized not only for one cluster, but for all clusters.
So, this is how you define the objective. This is the objective function that is being
optimized and, as we have been mentioning, this µi is the mean of all the points in set Si.
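Written out, the objective described verbally above is the standard within-cluster sum of squares; with S_1, ..., S_K denoting the clusters and µ_i the mean of the points in S_i, it can be expressed as

\[ \arg\min_{S_1,\dots,S_K} \; \sum_{i=1}^{K} \sum_{x \in S_i} \lVert x - \mu_i \rVert^{2} \]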
(Refer Slide Time: 09:09)
Now, let us look at how an algorithm such as K-means works; it is a very simple idea. Let
us again illustrate this with a very simple example. Let us say there are two dimensions
that we are interested in, x and y, and there are eight observations. So, this is one data
point, this is another data point, and so on; there are eight data points, and if you simply
plot them you see this picture. When we talk about clusters, notionally you would think
that there is a very clear separation of this data into 2 clusters.
Now, as I mentioned before in one of the earlier lectures on data science, it is very easy
to see this in 2 dimensions, but the minute we increase the number of dimensions, and the
minute we increase the complexity of the organization of the data, which you will see at
the end of this lecture itself, you will find that finding the number of clusters is really not
that obvious. In any case, let us say we want to identify or partition this data into
different clusters, and let us assume, for the sake of illustration with this example, that
we are interested in partitioning this data into 2 clusters.
So, we are interested in partitioning this data into 2 clusters, and right at the beginning we
have no information. We do not know that all of these data points belong to one cluster
and all of those belong to another cluster; we are just saying that there are going to be
2 clusters, that is it. So, how do we find the 2 clusters, given that we have no labels for
this data and this is an unsupervised algorithm? What we do is the following: we know
that ultimately there are going to be 2 clusters, so we start off 2 cluster centers in some
random locations. That is the first thing we are going to do. You could start off the 2
cluster centers somewhere here and here, or you could actually pick some points from
the data itself to start these 2 clusters.
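As a rough illustration of this initialization step in Python (a minimal sketch; the NumPy-based helper and its names are my own, not from the lecture):

import numpy as np

def init_centers(X, K, seed=None):
    """Pick K distinct observations at random to serve as the initial cluster centers."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=K, replace=False)
    return X[idx].copy()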
(Refer Slide Time: 11:42)
So, for example, let us say we randomly choose two points as cluster centers. Let us say
one point we have chosen is (1, 1), which would be this point here, and let us assume this
represents group 1; and let us assume that for group 2 we choose (3, 3), which is here. The
way we have chosen these, if we go back to the previous slide and look at it, is that the
two cluster centers have been picked from the data itself: one group center was
observation 1 and the other group center was observation 8.
Now, you could do this or, like I mentioned before, you could pick a point which is not
in the data at all; and we will see the impact of the choice of cluster centers later in this
lecture. Now that we have two cluster centers, what the algorithm does is the following:
for every data point in our database, it first finds the distance of that data point from each
one of the cluster centers. So, in this table we have distance 1, which is the distance from
the point (1, 1), and we have distance 2, which is the distance from the point (3, 3).
Now, if you notice, since the first center is a point from the data itself, the distance of
(1, 1) from (1, 1) is 0. So, you see that distance 1 is 0, and distance 2 is the distance of
the point (1, 1) from (3, 3). Similarly, since we chose (3, 3) as the representative of
group 2, if you look at point 8, which was the point (3, 3), its distance from (3, 3) is 0,
and this is its distance from (1, 1), which will be the same as the earlier one. For every
other point, since it is neither (1, 1) nor (3, 3), there will be distances you can calculate.
So, for example, this is the distance of the second point from (1, 1) and this is the
distance of the second point from (3, 3); and if you go back to the previous slide, the
second point is actually (1.5, 1.5), and so on. So, you go through each of these points,
which are different from (1, 1) and (3, 3), and generate these distances. For every point
you will generate two distances: one from (1, 1) and the other from (3, 3).
Now, since we want all the points that are like (1, 1) to be in one group and all the points
which are like (3, 3) to be in the other group, we use distance as the metric for likeness.
If a point is closer to (1, 1), then it is more like (1, 1) than (3, 3), and similarly, if a point
is closer to (3, 3), it is more like (3, 3) than (1, 1). So, we are using a very simple logic
here.
So, you compare these two distances and, whichever is the smaller distance, you assign
that point to that group. Here, distance 1 is less than distance 2, so this observation, which
is basically the point (1, 1), is assigned to group 1. For the second observation, again,
distance 1 is less than distance 2, so the second observation is also assigned to group 1.
You will notice, through this process, that the third and fourth points are assigned to
group 1, and points 5, 6, 7, 8 are assigned to group 2, because for those points the second
distance is smaller than the first.
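A minimal Python sketch of this distance-and-assignment step, assuming NumPy arrays and Euclidean distance (the small data array below is illustrative, not the full eight-point table from the slide):

import numpy as np

def assign_to_groups(X, centers):
    """For each observation, compute its distance to every center and
    assign it to the group whose center is closest."""
    # dists[n, k] = Euclidean distance from observation n to center k
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

# Illustrative data: (1, 1), (1.5, 1.5) and (3, 3) appear in the lecture; (2.8, 3.2) is made up
X = np.array([[1.0, 1.0], [1.5, 1.5], [3.0, 3.0], [2.8, 3.2]])
centers = np.array([[1.0, 1.0], [3.0, 3.0]])
labels = assign_to_groups(X, centers)   # array([0, 0, 1, 1])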
So, by starting at some random initial point, we have been able to group these data points
into two different groups: one group characterized by these data points and the other
group characterized by those data points. Now, just as a quick note: whether this is in 2
dimensions or N dimensions really does not matter, because the distance formula simply
translates, and you could have done this for two classes in N dimensions just as easily.
So, while visualizing data in N dimensions is hard, this calculation is exactly the same
whether it is in 2 or N dimensions.
Now, you know that these group centers were chosen randomly, but we now have more
information with which to update the centers, because I know that all of these data points
belong to group 1 and all of those belong to group 2. So, a better representation for
group 1, whose initial representation was (1, 1), would be the mean of its four samples;
and the initial representation for group 2 was (3, 3), but a better representation for group
2 would be the mean of all of its points. So, that is step 3.
(Refer Slide Time: 17:23)
So, we compute the new means: because group 1 has points 1, 2, 3, 4, we take the mean of
those points, and the x is 1.075 and y is 1.05. Similarly, for group 2 we take the mean of
the data points 5, 6, 7, 8, and you see the mean here. Notice how the means have been
updated from (1, 1) and (3, 3); in this case, because we chose a very simple example, the
update is only slight, so these points move a little bit.
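The mean-update step can be sketched like this (again using NumPy; the helper name and the small example data are my own, continuing the illustrative points used above rather than the slide's eight observations):

import numpy as np

def update_centers(X, labels, K):
    """Recompute each group's center as the mean of the points assigned to it."""
    # Note: if a group received no points, its mean would be undefined;
    # the initialization discussion later in this lecture touches on that case.
    return np.array([X[labels == k].mean(axis=0) for k in range(K)])

# Continuing the earlier illustration: points 0, 1 in group 0 and points 2, 3 in group 1
X = np.array([[1.0, 1.0], [1.5, 1.5], [3.0, 3.0], [2.8, 3.2]])
labels = np.array([0, 0, 1, 1])
print(update_centers(X, labels, 2))   # -> [[1.25, 1.25], [2.9, 3.1]]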
Now, you redo the same computation, because you still have the same eight observations,
but now group 1 is represented not by (1, 1) but by (1.075, 1.05), and group 2 is
represented by (3.095, 3.145) instead of (3, 3). So, for each of these points you can again
calculate a distance 1 and a distance 2. Notice that previously this distance was 0 because
the representative point was (1, 1); now that the representative point has become
(1.075, 1.05), this distance is no longer 0, but it is still a small number.
So, for each of these data points, with these new means, you calculate the distances and
again use the same logic to see whether distance 1 or distance 2 is smaller, and depending
on that you assign the groups. Now, notice that by doing this there was no reassignment:
whatever samples were originally in group 1 remain in group 1, and whatever samples
were in group 2 remain in group 2.
So, if you again compute the means, because the points have not changed, the means are
not going to change; even after this step they will be the same. So, if you keep repeating
this process, there is never going to be any reassignment, and the clustering procedure
stops. Now, if it were the case that, because of the new means, this point had gone into
group 2 and, say, these other points had gone into group 1, then correspondingly you
would collect all the points in group 1 and calculate a new mean, collect all the points in
group 2 and calculate a new mean, and then do this process again.
(Refer Slide Time: 19:50)
And you keep doing this process until either the change in the means is negligible or
there is no reassignment. When one of these two things happens, you can stop the
process. So, this is a very simple technique. Notice that it is very easily implementable
and that the dimensionality of the data does not matter: you could have 20 variables or
30 variables, and the procedure remains the same. The only thing is that we have to
specify how many clusters there are in the data; and also remember that this is
unsupervised learning, so we do not know any labels to begin with.
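Putting the steps together, a minimal K-means loop might look like the sketch below; this is my own illustrative NumPy code under the assumptions above, not the TAs' case-study implementation:

import numpy as np

def kmeans(X, K, max_iter=100, tol=1e-6, seed=None):
    """Simple K-means: random initial centers, then alternate assignment and
    mean-update until there is no reassignment or the centers barely move."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned points
        new_centers = np.array([
            X[new_labels == k].mean(axis=0) if np.any(new_labels == k) else centers[k]
            for k in range(K)
        ])
        # Stopping criteria: no reassignment, or negligible change in the means
        converged = (np.array_equal(new_labels, labels)
                     or np.linalg.norm(new_centers - centers) < tol)
        labels, centers = new_labels, new_centers
        if converged:
            break
    return centers, labels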
But at the end of this process, at least what we have done is categorize this data into
classes or groups. Now, if you find something specific about a group by further analysis,
then you could sometimes even be able to label these groups with broad categories. If that
is not possible, at least you know, from a distance viewpoint, that these two groups might
show different behavior from a system viewpoint. So, from an engineering viewpoint,
why would something like this be important?
Let us say I give you data for several variables: temperatures, pressures, concentrations,
flows and so on. You could run a clustering algorithm purely on this data and find that
there are actually two clusters. A natural question an engineer might ask is: if the process
is stable, I would expect all of the data points to be alike; but since there seem to be two
distinct groups of data, either both are normal and there is some reason why there are two
distinct groups, or maybe one is normal and the other is not. Then you can actually go and
probe these two groups to see which is normal and which is not, and so on.
So, in that sense, an algorithm like this allows you to work with just raw data without any
annotation; what I mean by annotation here is without any more information in terms of
labeling of the data, and then start making some judgments about how the data might be
organized. You can look at this multi-dimensional data and find the optimum number of
clusters; maybe there are five groups in this multi-dimensional data, which would be
impossible to find out by just looking at an Excel sheet.
But once you run an algorithm like this, it organizes the data into maybe 5 or 6 or 7
groups, and that allows you to go and probe more into what these groups might mean in
terms of process operations and so on. So, it is an important algorithm in that sense.
Now, we kept talking about finding the number of clusters. Till now I said you let the
algorithm know how many clusters to look for, but there is something called the elbow
method, where you look at the results of the clustering for different numbers of clusters
and use a graph to find the optimum number of clusters. So, this is called the elbow
method.
And you will see a demonstration of this in an example: in the case study that follows this
lecture, you will see how this plot looks and how you can make judgments about the
optimal number of clusters you should use. Basically, this approach uses what is called
the percentage of variance explained as a function of the number of clusters. These are
very typical-looking plots, and you can look at them and figure out what the optimal
number of clusters is.
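One common way to produce such an elbow plot in Python is with scikit-learn's KMeans, whose inertia_ attribute is the within-cluster sum of squared distances (closely related to the variance-explained view mentioned above). This is a sketch of the general idea on made-up data; the case-study demonstration may do it differently:

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Illustrative data with a few well-separated groups (hypothetical, not the case-study data)
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

inertias = []
ks = range(1, 11)
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)   # within-cluster sum of squared distances

plt.plot(list(ks), inertias, marker='o')
plt.xlabel('Number of clusters K')
plt.ylabel('Within-cluster sum of squares')
plt.title('Elbow plot')
plt.show()

The "elbow" is the value of K beyond which adding more clusters gives only a small further reduction in the within-cluster sum of squares.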
(Refer Slide Time: 23:55)
So, I explained how, in an unsupervised fashion, you can actually start making sense out
of data, and this is a very important idea for engineers, because it allows a first-level
categorization purely based on the data itself; and once that is done, one could bring in
domain knowledge to understand these classes and see whether there are some judgments
that one could make.
There are a couple of disadvantages of K-means that I would like to mention. The
algorithm can be quite sensitive to the initial guess that you use, and it is very easy to see
why this might be. Let me show you a very simple example. Let us say I have data like
this. If I start my initial cluster points here and here, you can very clearly see that all of
these data points will be closer to this center.
So, all of these will be assigned to this group, and all of those data points are closer to the
other center, so they will be assigned to that group. Then you calculate the means and,
once the means are calculated, there will never be any reassignment afterwards, and you
clearly get these two well-separated clusters. However, just to make this point, suppose I
start the two cluster centers, for example, one here and one somewhere over here. When
you calculate the distance between the first center and all of the data points, and between
the second center and all of the data points, it is going to happen that every data point is
closer to the first center than to the second. So, after the first round of the clustering
calculations, you will see that the first center might not even move much, because the
mean of these points might already be somewhere near it. The second center will never
move, but the algorithm will say all of the data points belong to the first center, and the
second center will never have any data points assigned to it. This is a very trivial case,
but it still makes the point that, with the same algorithm, depending on how you start
your cluster centers, you can get different results.
So, you would like to avoid situations like this, and these are actually easy situations to
avoid; but when you have multi-dimensional data and lots of data, which you cannot
visualize like I showed you here, it turns out it is not really that obvious how you should
initialize. There are ways of solving this problem, but it is something to keep in mind:
every time you run the algorithm, if the initial guesses are different, you are likely to get
at least minor differences in your results.
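To see this sensitivity in practice, one can run K-means several times with a single random initialization each time; the sketch below uses scikit-learn on made-up data, and the seeds are arbitrary, so this is only an illustration of the idea:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Hypothetical data with two well-separated groups
X, _ = make_blobs(n_samples=200, centers=2, random_state=1)

# With init='random' and n_init=1, each seed corresponds to exactly one
# random starting configuration, so the final centers can differ between runs
for seed in (0, 1, 2):
    km = KMeans(n_clusters=2, init='random', n_init=1, random_state=seed).fit(X)
    print(seed, round(km.inertia_, 2), km.cluster_centers_.round(2))

# In practice, using several restarts (a larger n_init) or scikit-learn's default
# 'k-means++' initialization keeps the best of the runs, which mitigates the
# sensitivity to the initial guess.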
The other important thing to notice is how I have been plotting this data: I have typically
been plotting it so that the clusters are, in general, spherical. But let us say I have data
like this, where all of these points belong to one class and all of those belong to another
class. Now, K-means clustering can have difficulty with this kind of data, simply because
if you look at data within one class, this point and this point are quite far apart even
though they are in the same class, whereas this point and that point, from different
classes, might be closer to each other.
So, if you ask K-means clustering to work on this in an unsupervised fashion, then
depending on where you start and so on, you might get different results. For example, if
you start with three clusters, you might in some instances find three clusters like this. So,
even though the data is actually organized differently underneath, and even if these
happen to be two different classes in reality, when you run the K-means clustering
algorithm in an unsupervised fashion, you might say all of this belongs to one class, all of
that belongs to another class, and this other part belongs to yet another class, and so on.
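A quick way to see this limitation is with the two-moons toy data set from scikit-learn; the sketch below is my own illustration, not part of the lecture:

from sklearn.cluster import KMeans
from sklearn.datasets import make_moons

# Two crescent-shaped classes that are not compact, spherical blobs
X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)
y_pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# The K-means labels typically cut across the crescents rather than recovering
# the two true classes, because the algorithm prefers compact, roughly
# spherical groups around each mean.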
So, these kinds of issues could be there. Of course, as I said before, there are other
clustering algorithms which can quite easily handle data that is organized like this, but if
you were thinking about K-means, then these are some of the things to think about. So,
with this we come to the conclusion of the theory part of this course on data science for
engineers; there will be one more lecture, which will demonstrate the use of K-means on
a case study. With that, all the material that we intended to cover will have been covered.
I hope this was a useful course for you; we tried to pick these data science techniques so
that you get a good flavor of the things to think about, and also techniques that you can
use for some problems right away.
But, more importantly, our hope has been that this has increased your interest in the field
of data science. As you can see, there are several fascinating aspects to think about when
one thinks about machine learning algorithms, and this course will hopefully have given
you a good feel for the kind of thinking and aptitude that you might need to follow up on
data science, and also for the mathematical foundations that are going to be quite
important as you try to learn more advanced and complicated machine learning
techniques, either on your own or through courses such as this that are available.
Thank you very much and I hope a large number of you plan to take the exams and get a
certification out of this course.
Thanks again.