A Research Study on Unsupervised Machine
Learning Algorithms for Early Fault Detection
         in Predictive Maintenance
            Summer Internship under the guidance of
                Dr. M. Punniyamoorthy
                       Professor (HAG)
          National Institute of Technology
                  Tiruchirapalli-620015
                       Submitted by
                        NAVEENRAJ P
                BACHELOR OF TECHNOLOGY (PRODUCTION)
              National Institute of Technology
                      Tiruchirapalli-620015
                         JUNE 2025
                         INSTITUTE BONAFIDE CERTIFICATE
This is to certify that the project entitled "A Research Study on
Unsupervised Machine Learning Algorithms for Early Fault Detection in
Predictive Maintenance" is a record of the work done by Naveenraj P
(24AC0043) in fulfillment of the Summer Internship at the National
Institute of Technology, Tiruchirapalli, during June 2025.
I declare that I have carried out the work presented in this report and that
I have not submitted the results in any form previously for the award of any
degree or diploma.
Dr. M. Punniyamoorthy,
Guide,
Department of Management Studies,
National Institute of Technology,
Tiruchirapalli-620015
Project submitted on 25.06.2025
CONTENTS
ABSTRACT
I. INTRODUCTION
II. LITERATURE REVIEW
III. FAULT DETECTION
     A. Data Collection
     B. Feature Selection Using PCA
     C. T2 Statistic
     D. Cluster Analysis
     E. Optimal Number of Clusters
     F. Hierarchical Clustering
     G. K-Means and Fuzzy C-Means Clustering
     H. Model-Based Clustering
IV. RESULTS
V. CONCLUSION
VI. FUTURE SCOPE OF WORK
REFERENCES
APPENDIX
  A Research Study on Unsupervised Machine
Learning Algorithms for Early Fault Detection in
            Predictive Maintenance
ABSTRACT
The area of predictive maintenance has gained prominence in the last
couple of years for various reasons. With new algorithms and
methodologies emerging across different learning methods, it has
remained a challenge for industries to decide which method is fit,
robust, and provides the most accurate detection. Fault detection is
one of the critical components of predictive maintenance; industries
very much need to detect faults early and accurately. In a production
environment, to minimize the cost of maintenance, it is sometimes
necessary to build a model with minimal or no historical data. In such
cases, unsupervised learning is the better option for model building.
In this paper, we take simple vibration data collected from an exhaust
fan and fit different unsupervised learning algorithms, such as the
PCA T2 statistic, hierarchical clustering, K-Means, Fuzzy C-Means
clustering, and model-based clustering, to test their accuracy,
performance, and robustness. In the end, we propose a methodology for
benchmarking the different algorithms and choosing the final model.
I. INTRODUCTION
The concept of predictive maintenance (PdM) was proposed a few
decades ago. PdM is a subset of planned maintenance, but it did not
gain prominence until the recent decade. This rapid advance is mainly
due to emerging internet technologies, connected sensors, systems
capable of handling big data sets, and the realization of the need to
use these techniques. The abrupt growth can also be attributed to the
demand for high-quality products at the least cost and with the
shortest lead time. It is estimated that every year U.S. industry
spends $200 billion on maintenance of plant equipment and facilities,
and that ineffective maintenance leads to a loss of more than $60
billion [1]. In the food and beverage industry, it was estimated that
failures and downtime accounted for 18% of OEE [2].
Over the years, different architectures, algorithms, and
methodologies have been proposed. One of the most prominent methods
is the watchdog agent, a design built around various machine learning
algorithms [3] [11]. Some of the other architectures are the OSA-CBM
architecture [4], the SIMAP architecture [5], and the predictive
maintenance framework [6]. Emerging technologies such as Internet of
Things (IoT) devices have formed a gateway to connect to machines and
their subcomponents, not only to collect process data and parameters
but also to collect physical health aspects of the machine such as
vibration, pressure, temperature, acoustics, viscosity, and flow
rate. This information is widely used for early fault detection,
fault identification, health assessment of the machine, and
prediction of the machine's future state. Much of this is made
possible by machine learning algorithms available across the
different learning domains.
Machine learning is a subfield of Artificial Intelligence (Figure 1).
Machine learning can be defined as a program or algorithm that is
capable of learning with minimal or no additional support. Machine
learning helps in solving many problems such as big data, vision,
speech recognition, and robotics [7]. Machine learning is classified
into three types: in supervised learning, both the predictors and the
response variables are known when building the model; in unsupervised
learning, only the predictors are known and no labeled responses are
available; and in reinforcement learning, an agent learns actions and
their consequences by interacting with the environment. In this
research, the main focus is the unsupervised learning methodology.
One of the most commonly used approaches in unsupervised learning is
clustering, where observations are grouped into clusters, either
user-defined or model-determined, based on the distance, model,
density, class, or characteristics of the variables. For this
research, vibration data has been used; data collection, feature
selection, and extraction are described in later sections.
II. LITERATURE REVIEW
The primary goal of PdM is to reduce the cost of a product or
service and to gain a competitive advantage in the market. Today,
business analytics is embedded across PdM to realize the need for it
and to make appropriate decisions. Business analytics can be viewed
from three different perspectives: (i) descriptive analytics, (ii)
predictive analytics, and (iii) prescriptive analytics [16].
Descriptive analytics answers questions such as "what happened in
the past?" by analyzing historical data and summarizing it in
charts; in maintenance, this step is performed using control charts.
Predictive analytics is an extension of descriptive analytics where
historical data is analyzed to predict future outcomes; in
maintenance, it is used to predict the type of failure and the time
to complete failure. Finally, prescriptive analytics is an
optimization process that identifies the best alternatives to
minimize or maximize an objective, answering questions such as "what
can be done?"; in maintenance, it can be used to optimize
maintenance schedules to minimize the cost of maintenance. In this
paper, our primary focus is on descriptive and predictive analytics
to detect faults. Predictive analytics has spread into various
applications such as railway track maintenance, vehicle monitoring
[23], automotive subcomponents [8], utility systems [19], computer
systems, electrical grids [13], aircraft maintenance [21], the oil
and gas industry, computational finance, and many more.
Fault detection is one of the concepts of predictive maintenance
that is well accepted in industry. Early failure detection could
potentially eliminate catastrophic machine failures. In one recent
research study, this process is classified into different methods:
quantitative model-based methods, qualitative model-based methods,
and process-history-based methods [25]. Principal component analysis
(PCA) is one of the oldest and most prominent algorithms still
widely used today. It was first invented by Karl Pearson in 1901.
Since then, there have been many hybrid approaches to PCA for fault
detection, such as kernel PCA [17], adaptive thresholds using
exponentially weighted moving averages for the T2 and Q statistics
[9], the multiscale neighborhood normalization-based multiple
dynamic principal component analysis (MNN-MDPCA) method [27], and
Independent Component Analysis. Another common method used for fault
detection is clustering. Similar to PCA, there are various
algorithms, such as neural network clustering and subtractive
clustering [28], K-means [10], Gaussian mixture models [15],
C-Means, hierarchical clustering [22], and Modified Rank Order
Clustering (MROC) [33].
III. FAULT DETECTION
Fault detection is one of the most critical components of
predictive maintenance. It can be defined as the process of
identifying abnormal behavior of a subsystem; any deviation from
standard behavior can be categorized as a failure. In this section,
we discuss different algorithms, namely the Principal Component
Analysis (PCA) T2 statistic, hierarchical clustering, K-Means
clustering, C-Means clustering, and model-based clustering, for
fault detection, and benchmark their results on vibration
monitoring data.
  A. Data Collection
     Vibration data is one of the most commonly used signals for
     detecting abnormalities in a submachine. In this research, a
     vibration monitoring sensor was set up on an exhaust fan.
     Vibration was collected every 240 minutes for 12 days at a
     sampling frequency of 2048 Hz on both the X and Y axes. From
     this data, different features were extracted, such as peak
     acceleration, peak velocity, turning speed, RMS velocity, and
     damage accumulation. Figure 2 shows the time series plots of
     the data.
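As an illustration of this feature-extraction step, the sketch below computes two of the named features, peak acceleration and RMS velocity, from a raw acceleration signal. The synthetic signal and the integration-based velocity estimate are assumptions for illustration; the paper does not specify how its features were computed.

import numpy as np
from scipy.integrate import cumulative_trapezoid

fs = 2048                                  # sampling frequency (Hz), as stated above
t = np.arange(0, 1.0, 1.0 / fs)
# Synthetic acceleration signal: one rotational tone plus noise (dummy values)
accel = np.sin(2 * np.pi * 29.0 * t) + 0.1 * np.random.randn(t.size)

peak_acceleration = np.max(np.abs(accel))
# Velocity obtained by numerically integrating acceleration over time
velocity = cumulative_trapezoid(accel, t, initial=0.0)
rms_velocity = np.sqrt(np.mean(velocity ** 2))
print(peak_acceleration, rms_velocity)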
Modern Definition of IMF (e.g., in Variational Mode Decomposition, VMD)
In modern methods like VMD, the concept of an IMF shifts towards a
mathematical, frequency-domain definition:
    1. Band-limited modes:
       Each IMF is modeled as a mode with a compact support in
       the frequency domain, i.e., each mode is centered around
       a specific frequency (ωₖ).
    2. Mode constraints:
       Modes are derived such that they are:
        o   Smooth in the frequency domain (minimal bandwidth).
        o   Non-overlapping in frequency.
        o   Optimized through variational principles to extract
            meaningful components.
    3. No need for envelopes or zero-crossings:
       Modern IMFs do not require extrema/zero-crossing counts
       or mean-zero envelopes as in EMD.
Use Case: VMD and other recent methods define IMFs through
optimization problems that provide better mathematical
robustness, particularly for noisy or closely spaced signal
components.
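As a sketch of how such modes can be extracted in practice, the snippet below uses the third-party vmdpy package; the package, its VMD call, and the parameter values are assumptions for illustration, not something used in this paper.

import numpy as np
from vmdpy import VMD  # third-party VMD implementation (assumed installed)

fs = 2048
t = np.arange(0, 1.0, 1.0 / fs)
# Two-tone test signal with a little noise
signal = (np.sin(2 * np.pi * 30 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)
          + 0.05 * np.random.randn(t.size))

alpha = 2000   # bandwidth constraint (penalty weight)
tau = 0.0      # noise tolerance (0 enforces exact reconstruction)
K = 2          # number of modes (IMFs) to extract
DC = 0         # do not impose a DC mode
init = 1       # initialize center frequencies uniformly
tol = 1e-7     # convergence tolerance

# u: the K band-limited modes; u_hat: their spectra; omega: center frequencies per iteration
u, u_hat, omega = VMD(signal, alpha, tau, K, DC, init, tol)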
In Figure 2, we can see a trend developing near the 60th
observation. In this paper, we test how different algorithms help
detect this fault earlier.
B. Feature Selection Using PCA
   Not all extracted features provide a true correlation. If the
   right features are not selected, a significant amount of noise is
   added to the final model, reducing its accuracy. One of the most
   prominent algorithms used for dimensionality reduction is
   principal component analysis. Principal component analysis (PCA)
   is a mathematical algorithm that reduces the dimensionality of
   the data while retaining most of the variation (information) in
   the data set [18]. In a simple context, it is an algorithm that
   identifies patterns in data and expresses the data in a way that
   showcases those similarities and differences [29]. Algorithm:
   Step 1: Consider a data matrix $[X]_{m \times n}$, where m is the
   number of rows and n is the number of columns.
   Step 2: Subtract the mean from each dimension:
   $[X_{adj}] = [X] - [\bar{X}]$ (2)
   Step 3: Calculate the covariance matrix
   $[C] = \frac{1}{m-1}[X_{adj}]^T[X_{adj}]$ (3)
   Step 4: Calculate the eigenvectors and eigenvalues of the
   covariance matrix: $([C] - \lambda_i I)\{v_i\} = \{0\}$ (4)
   Step 5: Store the eigenvectors in a matrix
   $[P] = [\{v_1\}\{v_2\}\dots\{v_n\}]$ (5)
   Step 6: Store the eigenvalues in a diagonal matrix $[\Lambda]$,
   where the eigenvalues correspond to the principal components and
   $[P]$ contains the loading vectors. (6)
   Step 7: Rank the eigenvalues in decreasing order and choose the
   top r eigenvectors to retain. (7)
   Step 8: Retain the r eigenvectors
   $[P_r] = [\{v_1\}\{v_2\}\dots\{v_r\}]$ (8)
   Step 9: Calculate the principal components $[U]$ by projecting
   the adjusted data matrix: $[U] = [X_{adj}][P_r]$ (9)
   The summary of the PCA indicates that the first two principal
   components account for 95.65% of the variance compared to the
   rest of the components. A scree plot of eigenvalues versus
   principal components, shown in Figure 4, can be used to identify
   the components that explain significant variance in the data.
   From the summary data and the scree plot, we can conclude that
   the first two principal components carry the maximum variation
   compared to the rest of the principal components.
C. T2 Statistic
   The T2 statistic is a multivariate statistical measure. The T2
   statistic for a data observation x can be calculated by [12]
   $T^2 = \sum_{i=1}^{a} \frac{t_i^2}{\lambda_i}$ (10)
   where $t_i$ is the score of the observation on the i-th principal
   component and $\lambda_i$ is the corresponding eigenvalue. The
   upper confidence limit for T2 is obtained using the
   F-distribution:
   $T^2_{a,n,\alpha} = \frac{a(n-1)}{n-a} F_{a,n-a,\alpha}$
   where n is the number of samples in the data, a is the number of
   principal components, and α is the level of significance [24].
   Values are measured against this threshold, and any value above
   it can be concluded to be out-of-control data, in this case
   faulty data. The results for the vibration data are shown in
   Figure 5.
   Based on the T2 statistic results in Figure 5, we can observe
   that faults can be detected as early as observation 41. This
   early detection would help maintenance teams monitor these
   process changes and take corrective actions accordingly.
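The threshold computation is short enough to show directly; a sketch consistent with the formula above (and with the appendix code) follows, where the sample size and level of significance are illustrative assumptions:

import numpy as np
from scipy.stats import f

n, a, alpha = 71, 2, 0.05                  # samples, retained PCs, significance level
F_val = f.ppf(1 - alpha, a, n - a)         # F_{a, n-a, alpha} quantile
T2_limit = a * (n - 1) / (n - a) * F_val   # upper confidence limit for T2
# Observations whose T2 value exceeds T2_limit are flagged as out of control (faulty)
print(T2_limit)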
D. Cluster Analysis
   Clustering is one of the unsupervised learning methods. In
   cluster analysis, similar data points are grouped into clusters.
   Some of the most prominent clustering methods are K-Means
   clustering, C-Means clustering, and hierarchical clustering.
   Clustering algorithms follow various organizing principles:
   iterative, hierarchical, density-based, metasearch-controlled,
   and stochastic. In this paper, we discuss one of the commonly
   used hierarchical clustering methods.
E. Optimal Number of Clusters
   In cluster analysis, we need to know the optimal number of
   clusters that can be formed. Although we know that we have
   healthy data and faulty data, identifying the optimal number of
   cluster formations in our data helps in understanding the
   different states in the data and representing the data more
   accurately. To identify the number of clusters, many procedures
   are available, such as the elbow method, the Bayesian Information
   Criterion, and the NbClust package in R. The results of the elbow
   method are shown in Figure 6, and the results using NbClust [30]
   are shown in Figure 7.
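A minimal elbow-method sketch with scikit-learn follows (the feature matrix is a stand-in; the elbow is read off where the curve flattens):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

X = np.random.normal(size=(71, 2))   # stand-in for the first two principal components
wss = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    wss.append(km.inertia_)          # within-cluster sum of squares at k clusters
plt.plot(range(1, 11), wss, 'o-')
plt.xlabel('Number of clusters k')
plt.ylabel('Within-cluster sum of squares')
plt.title('Elbow Method')
plt.show()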
   From both procedures, shown in Figure 6 and Figure 7, we can
   identify three as the optimal number of clusters. For fault
   detection, we can use three clusters and theorize that the
   clusters represent a normal condition, a warning condition, and a
   faulty condition. In the following sections on cluster analysis,
   we observe the results each clustering algorithm provides.
F. Hierarchical Clustering
   Start by assigning each item to its own cluster, so that with N
   items there are N clusters, each containing just one item. Let
   the distances (similarities) between the clusters equal the
   distances (similarities) between the items they contain [24].
   Algorithm:
   Step 1: Find the closest (most similar) pair of clusters and
   merge them into a single cluster, so that there is now one less
   cluster.
   Step 2: Compute the distances (similarities) between the new
   cluster and each of the old clusters.
   Step 3: Repeat steps 1 and 2 until all N items are clustered into
   a single cluster.
   In Figure 8, the clusters are formed from the feature data using
   Ward's method. Whether the feature data or the principal
   components were used, the results were identical. Three clusters
   were formed: the first cluster includes observations 1 to 40, the
   second cluster observations 41 to 67, and the third cluster
   observations 68 to 71. Based on domain knowledge, we can label
   cluster 1 as the healthy data set, cluster 2 as the warning data
   set, and cluster 3 as the faulty data set.
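A sketch of this procedure with SciPy follows (Ward linkage on stand-in data, cut into three clusters as in Figure 8):

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

X = np.random.normal(size=(71, 2))               # stand-in for the feature data
Z = linkage(X, method='ward')                    # agglomerative merge tree (Ward's method)
labels = fcluster(Z, t=3, criterion='maxclust')  # cut the tree into three clusters
dendrogram(Z)
plt.title("Hierarchical Clustering (Ward's Method)")
plt.show()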
G. K-Means and Fuzzy C-Means Clustering
   K-means is one of the most common unsupervised clustering
   algorithms. The goal of this straightforward algorithm is to
   divide the data set into a pre-determined number of clusters
   based on distance; here, we have used the Euclidean distance. The
   graphical results are shown in Figure 9. C-means is a clustering
   technique where each data point belongs to every cluster to some
   degree. Fuzzy C-Means was first introduced by Bezdek [14] and has
   been applied in areas such as agriculture, engineering,
   astronomy, chemistry, geology, image analysis [14], medical
   diagnosis, shape analysis, and target recognition [26]. The
   graphical results for C-Means are also shown in Figure 9. From
   Table III, the summary of K-means and C-means clustering, we can
   observe that clusters of sizes 4, 27, and 40 are formed:
   observations 1 to 40 form one cluster, observations 41 to 67 the
   second, and observations 68 to 71 the third. These results are
   the same as those of hierarchical clustering.
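scikit-learn has no fuzzy C-means, so the sketch below implements the standard update equations directly in NumPy (three clusters, membership exponent m = 2, stand-in data); it illustrates the technique rather than reproducing the paper's exact runs.

import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, max_iter=100, tol=1e-5, seed=42):
    """Plain-NumPy fuzzy C-means; returns cluster centers and the membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # memberships of each point sum to 1
    for _ in range(max_iter):
        Um = U ** m                              # fuzzified memberships
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-10)              # guard against division by zero
        inv = dist ** (-2.0 / (m - 1.0))         # standard membership update
        U_new = inv / inv.sum(axis=1, keepdims=True)
        if np.linalg.norm(U_new - U) < tol:
            return centers, U_new
        U = U_new
    return centers, U

X = np.random.normal(size=(71, 2))               # stand-in for the first two PCs
centers, U = fuzzy_c_means(X)
hard_labels = U.argmax(axis=1)                   # harden memberships to compare with K-means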
Figure 4. Scree plot to determine the variation between principal components.
Figure 5. T2 statistic results for the training and testing data sets.
Figure 6. Determining the optimal number of clusters using the elbow method.
Figure 7. Determining the number of clusters using the NbClust package.
        CLUSTER MEANS OF THE K-MEANS ALGORITHM
                         1         2
           Cluster 1  -9.665   -1.609
           Cluster 2  -0.497    1.856
           Cluster 3   1.301   -1.092
      Within-cluster sum of squares by cluster:
      16.758705, 39.575966, 8.823486 (between_SS / total_SS = 90.2%)
      FUZZY C-MEANS CLUSTER CENTERS WITH 3 CLUSTERS
                         1         2
           Cluster 1   1.275   -1.071
           Cluster 2  -0.289    1.920
           Cluster 3  -9.935   -1.723
H. Model-Based Clustering
   A Gaussian mixture model (GMM) is used for modeling data that
   comes from one of several groups: the groups might be different
   from each other, but data points within the same group can be
   well modeled by a Gaussian distribution [20]. A Gaussian finite
   mixture model is fitted by the EM algorithm, an iterative
   algorithm that starts from an initial estimate and updates it at
   every iteration until convergence is detected [31] [32].
   Initialization can begin either with a set of initial parameters
   followed by the E-step, or with a set of initial weights followed
   by the M-step; the starting point can be set randomly or chosen
   by some method.
   Summary of the classification: Mclust EVV (ellipsoidal, equal
   volume) model with five components:
   log.likelihood: -57.23501, n: 71, df: 25, BIC: -221.037,
   ICL: -222.0734
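The paper's model was fitted with R's mclust; an analogous sketch in Python using scikit-learn's GaussianMixture, selecting the number of components by BIC (note scikit-learn's BIC is lower-is-better, the opposite sign convention to mclust), might look like this on stand-in data:

import numpy as np
from sklearn.mixture import GaussianMixture

X = np.random.normal(size=(71, 2))       # stand-in for the first two PCs
best_bic, best_gmm = np.inf, None
for k in range(1, 10):
    gmm = GaussianMixture(n_components=k, covariance_type='full',
                          random_state=42).fit(X)
    bic = gmm.bic(X)                     # lower BIC = better fit/complexity trade-off
    if bic < best_bic:
        best_bic, best_gmm = bic, gmm
labels = best_gmm.predict(X)             # EM-based component assignment per observation
print(best_gmm.n_components, best_bic)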
Figure 9. K-Means and C-Means clustering for fault identification.
The results are summarized in Table III. From the classification by
the Gaussian finite mixture model fitted with the EM algorithm, a
total of five component groups were formed. Components 1 and 2 are
assigned to observations 1 to 40, component group 3 consists of
observations 41 to 63, component group 4 of observations 64 to 67,
and component 5 of observations 68 to 71. It is interesting to note
that the critical fault is detected at the same point as in the
other clustering algorithms.
IV. RESULTS
In this research, we initially hypothesized two states in the data:
a healthy data set and an unhealthy data set. Using PCA and the T2
statistic, we were able to fit our hypothesized states and detect
the faults 31 observations ahead, whereas without a tool, based on
the data plots alone, we could observe the trend only 11
observations ahead. As we moved on to fitting different unsupervised
clustering algorithms, we found that most of them provided much more
insight than the T2 statistic. Using the elbow method and the
NbClust package, we identified three as the optimal number of
clusters. When the data was then fitted with hierarchical
clustering, K-means, and C-means, the results were nearly identical.
Based on prior knowledge of the data, we were able to identify each
of the three states: the first state was identified as the healthy
state (since it was calibrated on healthy data), the second as the
warning state, and the third as the faulty state. These matching
results are not surprising, as all of these algorithms are based on
a distance measure.
Figure 10. Gaussian finite mixture model fitted by the EM algorithm: classification.
     For our final model, the Gaussian finite mixture model fitted
     by the EM algorithm was used. Unlike the methods above, where
     the number of clusters is supplied, this model identifies the
     optimal number of clusters and classifies the observations into
     groups accordingly. Here, the model recognized a total of five
     components. Upon closer investigation of the five components,
     we could observe an overlap between components 1 and 2 and
     between components 3 and 4. When these components are merged,
     we can observe a pattern much like the previous cluster
     analyses.
     V. CONCLUSION
     This research started out as a test bed for benchmarking
     different machine learning algorithms for early fault detection
     using unsupervised learning. In our results, the T2 statistic
     provided more accurate results than the GMM method, and no
     hypothesis was required to identify the relationship between
     cluster and state. One of the main benefits of this method is
     that, even when deployed to a manufacturing environment with
     minimal or no domain knowledge, one can identify a fault or
     critical condition; in clustering, by contrast, some
     information about the data is needed to label the clusters as
     healthy, warning, or critical. Clustering methodology is
     undoubtedly the better tool for detecting different levels of
     faults, where the T2 statistic would struggle beyond a certain
     level. To emphasize this: when machine maintenance is
     expensive, clustering is the more flexible option, as machine
     health can be monitored continuously until a critical level is
     reached.
In conclusion, although most algorithms provided nearly similar
results, each algorithm provided deeper insight into the data.
Hence, if the application is only to detect faults, the T2 statistic
would be an excellent tool; but if fault detection needs to be
performed at different levels, clustering algorithms are the better
choice.
VI. FUTURE SCOPE OF WORK
Fault detection is one of the preliminary analytics for predictive
maintenance; hence, detecting faults accurately is regarded as
important. This work was performed on vibration data. The scope of
this research can be extended to other physics-based parameters and
to combinations of these parameters. It would also be interesting to
observe the detection accuracy for larger sample sizes and multiple
fault states.
REFERENCES
[1] R. K. Mobley, An Introduction to Predictive Maintenance, 2nd
ed., 2002. ISBN 0-7506-7531-4.
[2] D. Battini, M. Calzavara, A. Persona, and F. Sgarbossa,
"Sustainable packaging development for fresh food supply chains,"
Packaging Technology and Science, vol. 29, pp. 25-43, 2016. doi:
10.1002/pts.2185.
[3] J. Lee, H.-A. Kao, and S. Yang, "Service innovation and smart
analytics for Industry 4.0 and big data environment," Procedia
CIRP, vol. 16, pp. 3-8, 2014.
[4] M. Lebold and M. Thurston, "Open standards for condition-based
maintenance and prognostic systems," in Proceedings of MARCON 2001,
the Fifth Annual Maintenance and Reliability Conference, Gatlinburg,
USA, 2001.
[5] E. Garcia, H. Guyennet, J.-C. Lapayre, and N. Zerhouni, "A new
industrial cooperative tele-maintenance platform," Computers &
Industrial Engineering, vol. 46, no. 4, pp. 851-864, 2004.
[6] C. Groba, S. Cech, F. Rosenthal, and A. Gossling, "Architecture
of the predictive maintenance framework," in 6th International
Conference on Computer Information Systems and Industrial Management
Applications, IEEE, 2007.
Appendix
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from scipy.stats import f

# Simulated feature data (dummy values standing in for the paper's vibration features)
np.random.seed(42)
n_samples = 72
n_features = 5
X = np.random.normal(0, 1, (n_samples, n_features))

# Standardize the data to zero mean and unit variance
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# PCA from first principles: covariance matrix and its eigendecomposition
cov_matrix = np.cov(X_scaled, rowvar=False)
eigen_values, eigen_vectors = np.linalg.eigh(cov_matrix)

# Sort eigenvalues and eigenvectors in decreasing order of variance
sorted_index = np.argsort(eigen_values)[::-1]
eigen_values = eigen_values[sorted_index]
eigen_vectors = eigen_vectors[:, sorted_index]
# Scree plot of eigenvalues versus principal components
plt.figure(figsize=(8, 5))
plt.plot(range(1, len(eigen_values) + 1), eigen_values, 'o-', color='blue')
plt.title('Scree Plot')
plt.xlabel('Principal Component')
plt.ylabel('Eigenvalue')
plt.grid(True)
plt.tight_layout()
plt.show()

# Project data onto the top 2 principal components
k = 2
top_eigenvectors = eigen_vectors[:, :k]
PC_scores = np.dot(X_scaled, top_eigenvectors)
# T² statistic for each observation (PC scores scaled by their eigenvalues)
T2 = np.sum((PC_scores ** 2) / eigen_values[:k], axis=1)

# T² upper control limit from the F-distribution
n = n_samples
a = k
alpha = 0.05
F_val = f.ppf(1 - alpha, a, n - a)
T2_limit = a * (n - 1) / (n - a) * F_val

# T² plot with the control limit; points above the limit are flagged as faults
plt.figure(figsize=(10, 4))
plt.plot(T2, label='T² Statistic', color='darkred')
plt.axhline(y=T2_limit, color='green', linestyle='--', label='Control Limit')
plt.title("T² Statistic Plot")
plt.xlabel("Observation Index")
plt.ylabel("T² Value")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
import seaborn as sns

# Use the first two principal components for clustering
data_for_clustering = PC_scores[:, :2]

# K-Means clustering with three clusters
kmeans = KMeans(n_clusters=3, random_state=42)
kmeans_labels = kmeans.fit_predict(data_for_clustering)

# Hierarchical (agglomerative) clustering with Ward linkage
hierarchical = AgglomerativeClustering(n_clusters=3, linkage='ward')
hierarchical_labels = hierarchical.fit_predict(data_for_clustering)

# Gaussian mixture model clustering
gmm = GaussianMixture(n_components=3, random_state=42)
gmm_labels = gmm.fit_predict(data_for_clustering)
# Plot the three clusterings side by side in PC space
fig, axs = plt.subplots(1, 3, figsize=(18, 5))

# K-Means
sns.scatterplot(x=data_for_clustering[:, 0], y=data_for_clustering[:, 1],
                hue=kmeans_labels, palette="tab10", ax=axs[0])
axs[0].set_title("K-Means Clustering")
axs[0].set_xlabel("PC1")
axs[0].set_ylabel("PC2")

# Hierarchical
sns.scatterplot(x=data_for_clustering[:, 0], y=data_for_clustering[:, 1],
                hue=hierarchical_labels, palette="tab10", ax=axs[1])
axs[1].set_title("Hierarchical Clustering")
axs[1].set_xlabel("PC1")
axs[1].set_ylabel("PC2")

# GMM
sns.scatterplot(x=data_for_clustering[:, 0], y=data_for_clustering[:, 1],
                hue=gmm_labels, palette="tab10", ax=axs[2])
axs[2].set_title("Gaussian Mixture Model")
axs[2].set_xlabel("PC1")
axs[2].set_ylabel("PC2")
plt.tight_layout()
plt.show()