Report Combined

This document discusses the importance of predictive maintenance in industrial settings, focusing on the use of single-class Support Vector Machines (SVMs) for anomaly detection in machine health monitoring. It highlights the challenges of traditional maintenance methods and the need for advanced machine learning techniques to analyze complex sensor data effectively. The research aims to enhance anomaly detection capabilities and improve maintenance strategies through the development of a robust system utilizing sensor data from long stroke engines.

Uploaded by

alankritadeka12

Introduction:

Predictive maintenance is important in industrial settings because it identifies possible problems in
advance, which improves efficiency. Traditional maintenance methods are built on reactive repairs and
inspection; they can cause considerable downtime and carry high maintenance costs. Early fault-finding
methods, such as anomaly detection, have been developed to catch issues before they surface in later
machine health checks. Anomaly detection is the process of identifying patterns in data that deviate
from the norm and point to potential problems. The presence of noise, the unpredictability of
anomalous patterns and the sheer complexity of machine data all make effective anomaly detection
difficult to implement. This research investigates the use of single-class SVMs for detecting
anomalies in predictive maintenance to overcome these challenges. We use this approach because
single-class SVMs are adept at handling high-dimensional datasets and can distinguish between typical
and atypical operating conditions. Finally, this paper shows on real-world datasets how well the
method detects abnormalities that indicate possible machine failures.

Background and motivation:

Background:
Industry 4.0 technologies have driven the extensive adoption of industrial sensors, leading to a
deluge of real-time sensor data. This data can be used to develop better predictive maintenance
techniques because it describes the status and performance of plant machinery. However, the volume
and intricacy of current industrial data make it difficult for traditional analytical approaches to
cope, so making good use of this information for predictive maintenance remains challenging.
Motivation:
Older ways of analyzing extensive and intricate industrial sensor data have been found wanting in
their ability to handle it effectively, hence the need for new machine learning approaches. Methods
such as one-class support vector machines help to encode complex interactions and pinpoint
peculiarities in the sensor data. These advanced analytic capabilities allow organizations to gather
more information on equipment condition, which helps them make decisions about preventative
maintenance strategies.
Advantages and Disadvantages of Existing Surveys:
When it comes to anomaly detection in industrial control systems, present-day surveys have a lot to
offer. They provide a comprehensive overview of several machine learning techniques, helping us
understand what they can do and how they work. These surveys are good at stating the strengths and
weaknesses of various anomaly detection techniques, giving a clear idea of which ones work best under
which conditions. In addition, they highlight emerging trends and future directions, which is useful
for future research. They also often compare the overall performance of different techniques, giving
insight into their pros and cons. Some surveys include case studies and practical applications that
show how these techniques can be used in real life. Overall, these surveys are a valuable resource
for anyone interested in anomaly detection in industrial control systems.
However, the surveys also have notable limitations. A focus on a few specific techniques and
platforms often overlooks significant improvements in other areas. Rapid progress in machine learning
and anomaly detection can make some surveys out of date, diminishing their accuracy. It is difficult
to compare results across surveys directly because of variability in the evaluation metrics and the
datasets used. Practical implications may also differ from those discussed in theory.

Objectives and Gaps Addressed by This Study


The primary concern of this research is to address gaps and shortfalls in current knowledge. The
first objective is to make anomaly detection systems more applicable across different factories and
classes of machines using a single-class support vector machine (SVM). The second is to improve
anomaly detection for datasets with few or no labels using one-class SVMs. Other objectives include
devising adaptive thresholding methods that change over time according to what occurs in reality.
3. PROBLEM STATEMENT:

In industry, predictive maintenance techniques depend on sensor-based technologies to monitor the
condition of machines and their equipment. These sensors gather information about variables such as
pressure, temperature, vibration and sound, which is then examined to forecast probable problems and
plan maintenance within the required time. We define this challenge broadly as follows.

“The objective of the project is to create an anomaly detection system using a support vector machine
(SVM) method, given a dataset of sound and vibration measurements from healthy and unhealthy
components of a long stroke engine (bike), represented as X_healthy and X_unhealthy. The system aims
to accurately classify healthy and unhealthy components based on anomaly scores derived from the SVM
model. The model is trained using healthy data, optimized using gradient descent to reduce the loss
function, and incorporates a linear kernel. Accuracy metrics and a confusion matrix are used to
assess the system's performance by counting the true positives (TP), true negatives (TN), false
positives (FP), and false negatives (FN). For additional research and validation, the system also
offers visualizations of the datasets and predicted classifications.”

Aim of our project:

Our project's goal is to use sensor data to create an efficient anomaly detection system using
single-class Support Vector Machines (SVM). This makes it possible to detect faults early and take
the necessary action quickly, which increases overall machine dependability and maintenance
effectiveness. Through data analysis, we want to improve predictive maintenance capabilities, which
will ultimately lead to more efficient industrial operations.

Objectives:

• Dimensionality reduction of sensor data using PCA
• Anomaly detection in unlabeled sensor data using one-class SVM
• Development of a user interface for visualization

4. METHODOLOGY:

DATASET:

Training dataset

The healthy data reflects the motor's overall operating state: how it produces vibration and noise
under normal conditions. It is used to train the anomaly detection algorithm. The healthy data is
recorded through sensors while the motor is functioning properly. These data points are crucial for
the model because they provide a baseline, capturing typical vibration and sound levels during normal
operation. By training on the healthy dataset, the model learns to recognize and characterize the
normal behavior of the motor.

Testing dataset:

In contrast, to provide a varied dataset for testing, the unhealthy data, which represents times when
the motor shows signs of failure or deviates from its normal operating parameters, is combined with
healthy data. The unhealthy dataset consists of vibration and noise data recorded when the motor
shows outliers or anomalies, which indicate potential faults or degradation. By combining both
healthy and unhealthy data during testing, the model is exposed to a wider range of scenarios
covering both normal and abnormal conditions.

4.1 Data Preparation

Long Stroke Motor:

A long stroke motor is the primary source of vibration and noise data. This motor operates in a
variety of environments, and its vibration and sound indicate the state of its health.

Arduino and Sensors:

An Arduino microcontroller is used to collect the necessary data, in combination with sensors
specially designed to measure vibration and sound. These sensors continuously monitor the motor’s
operation, detecting any changes in its performance that could signal a healthy or unhealthy state.

Data collection:

The first step of our project is collecting sound and vibration data from long stroke engines
(motorbikes) to build the datasets needed for training and testing our anomaly detection algorithm.
For healthy bikes, we collect the data during normal operation from bikes considered to be in good
operating condition, using sensors to record sound and vibration under various conditions (e.g.,
different speeds, loads, and terrains) so that the data covers normal operation. For unhealthy bikes,
we gather data that reflects mechanical faults, identifying the faults present in the motorbikes and
using the same sensors and procedures to record sound and vibration under diverse operating
conditions. These datasets form the foundation of our anomaly detection algorithm. The healthy
dataset trains the One-Class SVM (support vector machine) to recognize normal conditions, while the
unhealthy dataset tests the algorithm's ability to detect anomalies. Once collected, we use a data
analysis tool, such as Python's Pandas module, to convert the data into a structured format with each
row representing a sample and each column representing a characteristic such as sound or vibration
level, enabling simpler manipulation and analysis. We randomly shuffle the data to avoid order-related
bias affecting model training and testing, so that the model is tested on a wide variety of
circumstances, resulting in a more robust and generalized anomaly detection system. Next, we
determine the ratio for splitting the data into training and testing sets, such as using 75% of the
healthy data for training and 25% for testing, and divide the healthy dataset accordingly. To
evaluate whether the model can distinguish between normal and anomalous conditions, we combine the
held-out testing subset of healthy data with all the unhealthy data to create the full testing dataset.
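
The shuffle-and-split step can be sketched as follows. This is a minimal, dependency-free illustration rather than the project's actual code: the function name `split_datasets`, the fixed seed and the sample readings are all assumptions, and it assumes that the unhealthy samples are held out for testing only, since the one-class model is trained on healthy data.

```python
import random

def split_datasets(healthy, unhealthy, train_fraction=0.75, seed=0):
    """Shuffle the healthy samples, keep `train_fraction` of them for
    training, and build a test set from the remaining healthy samples
    plus every unhealthy sample."""
    rng = random.Random(seed)      # fixed seed gives a repeatable shuffle
    shuffled = list(healthy)       # copy so the caller's list is untouched
    rng.shuffle(shuffled)

    n_train = int(len(shuffled) * train_fraction)
    train = shuffled[:n_train]     # healthy only: the model sees no faults
    held_out = shuffled[n_train:]

    # Label 0 = healthy, 1 = unhealthy; labels are used only for evaluation.
    test = [(x, 0) for x in held_out] + [(x, 1) for x in unhealthy]
    rng.shuffle(test)
    return train, test

# Hypothetical (vibration, sound) readings, for illustration only.
healthy = [(0.10 + 0.01 * i, 0.50) for i in range(8)]
unhealthy = [(0.90, 1.40), (1.10, 1.60)]
train, test = split_datasets(healthy, unhealthy)
print(len(train), len(test))
```

With pandas, the same idea would typically be expressed by shuffling a DataFrame and slicing it; the plain-list version is used here only to keep the sketch self-contained.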

1 ALGORITHM
1.1 ONE-CLASS SUPPORT VECTOR MACHINE
A Single-Class Support Vector Machine is a machine learning algorithm used for
identifying outliers or anomalies in a dataset. It is specifically designed for novelty
detection, which is the identification of new or unknown data during the testing phase.
Traditional SVMs handle binary classification tasks where they aim to separate two
classes, whereas a one-class SVM trains exclusively on data points from a single class,
known as the target class. During training, the goal of this algorithm is to define
a boundary capturing the normal instances in the feature space. This creates a region
of familiarity. The boundary is purposefully placed to maximize the margin
around the normal data points, giving a clear delineation between what is considered
ordinary and what may be regarded as unusual. It is like drawing a protective circle
around the typical instances to separate them from the outliers or anomalies. New
instances falling outside this boundary are identified as outliers or anomalies. The
choice of hyperparameters, such as the regularization parameter and the kernel function,
plays an important role in the SVM’s ability to capture the underlying structure
of the data. There are several real-world use cases for One-Class SVM, such as fraud
detection in commercial systems, network intrusion detection and quality control in
manufacturing.
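
As an illustration of this train-on-one-class idea, the following sketch uses scikit-learn's `OneClassSVM`. The report does not state which library was used, so the library choice, the synthetic data and the parameter values (`nu=0.05`, `gamma="scale"`) are assumptions made purely for demonstration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Synthetic stand-in for healthy (vibration, sound) readings.
X_healthy = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# nu is an upper bound on the fraction of training points treated as
# outliers; gamma controls the width of the RBF kernel.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(X_healthy)               # trained on the target class only

X_new = np.array([[0.1, -0.2],     # close to the training cloud
                  [8.0, 8.0]])     # far from anything seen in training
labels = model.predict(X_new)      # +1 = inside the boundary, -1 = outlier
print(labels)
```

In scikit-learn, `predict` returns +1 for points inside the learned boundary and -1 for points outside it, which corresponds to the normal and anomalous cases described above.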

1.1.1 Converting raw data to Comma Separated Values (CSV) format
After the raw data has been collected, it is converted to CSV format because CSV
files provide a standardized way to store data and are easier to manage and
share. CSV files are compatible with various data analysis tools and programming
environments, facilitating further processing and analysis.
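
A conversion of this kind might look like the sketch below. The raw format is not described in the report, so the code assumes, hypothetically, that each raw line holds two whitespace-separated numbers (vibration, then sound); the function name `raw_to_csv` and the column names are likewise illustrative.

```python
import csv
import io

def raw_to_csv(raw_lines, out_file):
    """Write whitespace-separated 'vibration sound' readings as CSV rows,
    skipping lines that do not parse as two numbers."""
    writer = csv.writer(out_file)
    writer.writerow(["vibration", "sound"])   # header row
    written = 0
    for line in raw_lines:
        parts = line.split()
        if len(parts) != 2:
            continue                          # malformed line
        try:
            vibration, sound = float(parts[0]), float(parts[1])
        except ValueError:
            continue                          # non-numeric reading
        writer.writerow([vibration, sound])
        written += 1
    return written

# Hypothetical raw sensor log lines; an in-memory buffer stands in for a file.
raw = ["0.12 0.55", "garbled", "0.15 0.61"]
buffer = io.StringIO()
count = raw_to_csv(raw, buffer)
print(buffer.getvalue())
```

When writing to a real file, Python's `csv` documentation recommends opening it with `newline=""` so that row terminators are not doubled on some platforms.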

1.1.2 Training Support Vector Machine with Radial Basis Function (RBF) Kernel
1. RBF kernel

The RBF kernel is effective at handling complex, non-linear relationships. It
transforms data into a space where complicated decision boundaries can be
drawn. It is well suited to cases where the exact form of the relationships is
unknown or complicated.

It is calculated using the following formula:

$$K(x_i, x_j) = \exp\left(-\gamma \, \lVert x_i - x_j \rVert^2\right)$$

where $\gamma = \frac{1}{2\sigma^2}$ and $\sigma$ is the RBF kernel width.
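
The kernel formula, with γ = 1/(2σ²), translates directly into code. The following is a minimal sketch; the function name `rbf_kernel` and the default `sigma=1.0` are illustrative choices, not values taken from the report.

```python
import math

def rbf_kernel(x_i, x_j, sigma=1.0):
    """K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2), gamma = 1 / (2 * sigma^2)."""
    gamma = 1.0 / (2.0 * sigma ** 2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)

print(rbf_kernel([0.0, 0.0], [0.0, 0.0]))  # identical points give 1.0
print(rbf_kernel([0.0, 0.0], [3.0, 4.0]))  # distant points approach 0
```

A larger σ (smaller γ) makes the kernel decay more slowly with distance, so each training point influences a wider neighbourhood.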

2. Initializing the decision function

The decision function is initialized with a vector of weights, one per training point,
treating all training points as support vectors. These weights are the Lagrange
multipliers, denoted by α. Along with this vector, the decision function is also
initialized with a bias denoted by b.

Formula for decision function:

$$f(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x) - b$$

where $x_i$ are the support vectors and $x$ is the point being evaluated.

The decision function is used to classify new data points as similar to or different
from the training set by calculating a weighted sum of kernel values between new
data points and the support vectors.
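
The weighted kernel sum can be sketched as below. The support vectors, multipliers and bias are hypothetical numbers chosen for illustration; in practice they come out of training. The kernel is redefined locally, with `gamma=0.5` as an assumed default, so that the block runs on its own.

```python
import math

def rbf_kernel(x_i, x, gamma):
    """RBF kernel between a support vector x_i and a query point x."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x_i, x)))

def decision_function(x, support_vectors, alphas, b, gamma=0.5):
    """f(x) = sum_i alpha_i * K(x_i, x) - b."""
    weighted_sum = sum(
        alpha * rbf_kernel(sv, x, gamma)
        for sv, alpha in zip(support_vectors, alphas)
    )
    return weighted_sum - b

# Hypothetical values for illustration; real ones come from training.
support_vectors = [[0.0, 0.0], [1.0, 0.0]]
alphas = [0.5, 0.5]
b = 0.2

near = decision_function([0.5, 0.0], support_vectors, alphas, b)
far = decision_function([5.0, 5.0], support_vectors, alphas, b)
print(near, far)
```

With this formula, a point close to the support vectors receives a larger value than a point far from them, because the kernel terms shrink towards zero with distance and only the bias remains.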

Gradient Descent

Gradient descent finds optimal values for the bias and the Lagrange multipliers to
minimize the SVM’s objective function, or cost function.

The gradient descent update is:

$$\alpha_i \leftarrow \alpha_i - \eta \, \frac{\partial L}{\partial \alpha_i}$$

where L is the cost function and η is the learning rate. This update is applied
iteratively until the cost function stops decreasing, ideally close to or at zero.
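
The update rule can be illustrated with a short sketch in which each multiplier is moved against the gradient of the cost with respect to that multiplier. The report does not give the exact cost function, so the code below uses a simple quadratic stand-in, which is explicitly not the SVM objective; the names `gradient_descent` and `toy_gradient`, the learning rate and the step count are assumptions.

```python
def gradient_descent(alphas, grad_fn, learning_rate=0.1, steps=200):
    """Repeat alpha_i <- alpha_i - eta * dL/dalpha_i for a fixed number of steps."""
    alphas = list(alphas)
    for _ in range(steps):
        grads = grad_fn(alphas)
        alphas = [a - learning_rate * g for a, g in zip(alphas, grads)]
    return alphas

# Stand-in cost, NOT the SVM objective: L(alpha) = sum_i (alpha_i - t_i)^2,
# whose gradient is 2 * (alpha_i - t_i) and whose minimum is at alpha = t.
targets = [0.3, 0.7]

def toy_gradient(alphas):
    return [2.0 * (a - t) for a, t in zip(alphas, targets)]

result = gradient_descent([0.0, 0.0], toy_gradient)
print(result)
```

For this toy cost, each step shrinks the distance to the minimum by a constant factor, so the iterates converge to `targets`; a real SVM objective would also have to respect the constraints on the multipliers.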

3. Decision function and thresholding

The decision function outputs a value on which the decision is based. This value
indicates how strongly a sample belongs to a particular class based on its position
relative to the support vectors.
Its formula is:

$$f(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x) - b$$

Threshold Calculation:

Calculate the threshold as the mean of the decision function values for healthy samples.
Compute the Mean Threshold:

$$\text{Threshold} = \frac{1}{m} \sum_{j=1}^{m} f(x_j)$$

where m is the number of healthy samples.

The threshold separates healthy from faulty samples based on their decision function values.
Healthy samples have decision function values below the threshold, while faulty samples have values
at or above it.

Classification:

Classify each sample as healthy or faulty based on its decision function value relative to the
threshold. Assign Labels Based on Threshold:

$$\text{Label}(x) = \begin{cases} \text{Healthy} & \text{if } f(x) < \text{threshold} \\ \text{Faulty} & \text{if } f(x) \geq \text{threshold} \end{cases}$$

This final step uses the threshold to make a binary classification, determining whether each sample
is healthy or faulty based on the calculated decision function values.
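
The mean threshold and the labeling rule can be sketched together. The rule as stated treats larger values as more anomalous, so this sketch assumes the scores passed in are already oriented that way; with a library whose decision function is larger for normal points (scikit-learn's, for example), the scores would need to be negated first. The score values and function names are hypothetical.

```python
def mean_threshold(healthy_scores):
    """Threshold = (1/m) * sum of the scores of the m healthy samples."""
    return sum(healthy_scores) / len(healthy_scores)

def classify(scores, threshold):
    """Label a sample Faulty when its score is at or above the threshold."""
    return ["Faulty" if s >= threshold else "Healthy" for s in scores]

# Hypothetical anomaly scores (larger = more anomalous), for illustration only.
healthy_scores = [0.10, 0.20, 0.30]
threshold = mean_threshold(healthy_scores)
labels = classify([0.05, 0.90], threshold)
print(threshold, labels)
```

Because the threshold is the mean of the healthy scores, healthy samples scoring above their own average will be flagged as faulty, which is worth keeping in mind when reading the false positive count.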

4.3 Flow of work to attain our objective:

The first step was data collection, in which vibration and sound data were gathered from a long
stroke motor using sensors integrated with an Arduino microcontroller. The data were gathered in
two states: healthy (normal operation) and unhealthy (malfunctioning). These raw records were
standardized into Comma-Separated Values (CSV) format to ensure uniformity and facilitate subsequent
analysis. We trained a single-class Support Vector Machine (SVM) using the Radial
Basis Function (RBF) kernel. The training was carried out entirely on the healthy
data, allowing the model to learn the characteristics of the normal operational state, which
is particularly useful when faulty data are scarce. For anomaly detection, we applied the
trained SVM model to new data to detect deviations from the learned normal state. The
performance of this anomaly detection model was assessed using several metrics, including
accuracy, precision, recall, F1-score, and a confusion matrix, to ensure its
effectiveness in distinguishing between normal and anomalous data points. Finally, we developed a
user interface and visualization. This involved developing graphical tools to help users more
easily grasp the data by visualizing the healthy and unhealthy records. To promote interaction
with the system, a user-friendly graphical user interface (GUI) was created, enabling users to
generate results and view statistical results. This all-encompassing strategy ensured that we
achieved our objectives of efficient anomaly identification and visualization, offering a reliable
solution for predictive maintenance in an industrial environment.

5. IMPLEMENTATION:

Data Collection equipment:

In the hardware part of the data collection process, several components are used. First, the Arduino
serves as the central control unit, providing the essential hardware and software
infrastructure for the project. Accompanying the Arduino is the Arduino Integrated Development
Environment (IDE), a software application used for writing, compiling, and uploading code to
the Arduino board. Sensors play a vital role in capturing and measuring vibrations or acceleration
forces. These sensors are essential for tracking the movements of motorbike components, such as the
engine, wheels, or frame, to detect anomalies. Finally, long stroke motors are used to simulate
mechanical movements or vibrations for testing and calibration purposes. These motors allow
controlled experimentation and data collection in a laboratory or test environment, improving
the accuracy and reliability of anomaly detection algorithms.

Algorithm:

In the algorithmic part of our anomaly detection system, we chose the one-class Support
Vector Machine (SVM) algorithm for several reasons. First, the one-class SVM is well suited to
anomaly detection tasks in which the majority of the data is normal or healthy, and anomalies
are rare or unseen during training. This aligns with our use case, where we aim to
detect faults or anomalies in motorbike components, which are relatively uncommon compared
with normal operating conditions. To train the one-class SVM model, we used
the healthy dataset exclusively. This choice ensures that the model learns only from instances
representing normal behavior, allowing it to identify anomalies based on deviations from
this learned normal behavior. By training on a healthy dataset, we give the model a clear
understanding of what constitutes normal operation, enhancing its ability to identify anomalies
correctly. After training the model on the healthy dataset, we created a diverse testing dataset by
combining both unhealthy and healthy data. This combined dataset ensures that the model
encounters a wide range of scenarios during testing, including both normal and
anomalous instances. By exposing the model to a variety of conditions, we evaluate its robustness and
generalization capabilities, assessing its performance in real-world situations in which
anomalies may be encountered occasionally or unpredictably. This approach allows us to validate the
model's effectiveness in accurately detecting anomalies while minimizing false positives and
false negatives, ultimately enhancing the reliability and usability of our anomaly detection system.

Results
This section covers a detailed evaluation of the Single-Class Support Vector Machine's performance
in detecting anomalies in industrial machinery, along with a discussion of the implications of our
results. The results include the model performance metrics, a detailed analysis of the
confusion matrix, and the practical significance of these outcomes for the predictive maintenance
strategy.

Confusion matrix :
A confusion matrix is a table for assessing a model's performance on given test data. It consists of
the counts of True Positives, True Negatives, False Positives and False Negatives. The confusion
matrix is also called the error matrix because it shows the model's errors.
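
The four counts can be tallied from true and predicted labels as in the sketch below. The label convention (1 = faulty, treated as the positive class) and the example labels are assumptions for illustration, not the project's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN, treating `positive` as the anomaly class."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1            # anomaly correctly flagged
            else:
                fp += 1            # healthy sample flagged as anomaly
        else:
            if actual == positive:
                fn += 1            # anomaly missed
            else:
                tn += 1            # healthy sample correctly passed
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

# Hypothetical labels: 1 = faulty, 0 = healthy.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
```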

Accuracy:

Our classification model achieved an accuracy of 79%. This reflects the model's robustness and the
effectiveness of the training process. With TP representing the true positives, TN the true
negatives, FP the false positives, and FN the false negatives, the formula can be expressed as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Recall:

Recall, also called sensitivity or the true positive rate, is an important measure for evaluating
how well a classification model identifies positive instances. It is calculated as:

$$\text{Recall} = \frac{TP}{TP + FN}$$

A recall score of 1 indicates that our model identifies all true positives correctly.

Precision:
Precision measures the proportion of correctly identified positive instances out of all instances
that are predicted as positive by the model. In our model, achieving 80% precision means that out of
all the instances predicted as anomalies, 80% are truly anomalies. This indicates a relatively low
rate of misclassification of healthy instances as faulty, enhancing the model's reliability in
identifying genuine anomalies. The formula can be expressed as:

$$\text{Precision} = \frac{TP}{TP + FP}$$

F1 Score:

The F1 score is a metric that combines both precision and recall into a single score,
providing a balanced measure of a model's overall performance. It is calculated as the harmonic
mean of precision and recall, emphasizing the balance between the two metrics. An F1 score of 0.8
suggests that the model has relatively low false positives and false negatives. This score implies
that the model performs well both in correctly identifying positive instances (precision) and in
capturing all positive instances (recall), making it appropriate for tasks in which both metrics
are critical. The F1 score can be represented as:

$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
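
The four metrics above can be computed from the confusion matrix counts as follows. This is a minimal sketch; the counts passed in are hypothetical and are not the results reported in this section, and returning 0.0 when a denominator is zero is a convention chosen here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from confusion matrix counts.
    A ratio with a zero denominator is reported as 0.0 by convention."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, tn=7, fp=2, fn=3)
print(metrics)
```

Because F1 is the harmonic mean, it always lies between precision and recall and sits closer to the smaller of the two.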

1 Conclusion

This anomaly detection system uses a One-Class Support Vector Machine (SVM) algorithm to
check asset health through careful examination of sound and vibration data.

During result generation, the system evaluates how accurately unhealthy data is classified.
By computing key indicators such as true positives, true negatives, false positives and
false negatives, it provides valuable insight into the model’s performance.

The system improves user interaction through a Graphical User Interface that
provides smooth functionality for result generation and dataset visualization.
The GUI makes asset health monitoring straightforward.
In summary, this anomaly detection system delivers strong performance and
reliability in detecting anomalies and in managing assets.

1.1 Future Work

To enable swift action on potential issues, automated alerts can be developed
that notify maintenance teams immediately when an anomaly is detected.
Integrating this feature with existing maintenance management systems will
ensure a cohesive workflow. Data processing and model training could be
migrated to cloud platforms for scalability and performance, which would make
handling larger datasets and more complex models efficient. Code optimization
is also important for fast execution, as it would enable real-time anomaly
detection and immediate response to critical situations.

Taken together, these improvements contribute to an agile and robust anomaly
detection system that will also improve asset maintenance and ensure
optimal performance.
