
IoT Analytics: ML Tools & Challenges

The document discusses the importance of sophisticated data analytics in IoT systems, highlighting traditional tools like k-means and decision trees. It explains machine learning (ML) as a crucial component for intelligent IoT applications, detailing its advantages, challenges, and various types such as supervised, unsupervised, semi-supervised, and reinforcement learning. The text emphasizes the need for effective data processing and the role of ML in improving system performance and accuracy.

IoT Analytics

IoT Analytics – Introduction


Textbook 1: Chapter 17- 17.1

17.1 Introduction:

 The raw data from a sensor require processing to draw inferences.


 An IoT based system generates data with complex structures; therefore, conventional data
processing on these data is not sufficient.
 Sophisticated data analytics are necessary to identify hidden patterns.
 In this chapter, we discuss a few traditional data analytics tools that are popular in the
context of IoT applications.
 These tools include k-means, decision tree (DT), random forest (RF), k-nearest neighbor
(KNN), and density-based spatial clustering of applications with noise (DBSCAN)
algorithms. Before discussing these algorithms, let us understand some of the basics
related to machine learning (ML).

17.1.1 Machine learning


 The term “machine learning” has been defined as a “field of study that gives computers the
ability to learn without being explicitly programmed”.
 ML is a powerful tool that allows a computer to learn from past experiences and its
mistakes and improve itself without user intervention.
 Different ML models play a crucial role in designing intelligent systems in IoT by
leveraging the massive amount of generated data and increasing the accuracy in their
operations.
 The main components of ML are statistics, mathematics, and computer science for
drawing inferences, constructing ML models, and implementation, respectively.
 ML is an important tool that is used by social networking websites such as
Facebook and Twitter.
 Autonomous vehicles use ML to determine their paths and speeds.
17.1.2 Advantages of ML
ML enables a system to identify changes and to take intelligent actions that closely
imitate those of a human. As ML demonstrates a myriad of advantages, its popularity in IoT
applications is increasing rapidly. In this section, we discuss the different advantages of ML, as
depicted in Figure 17.1.

Self-learner:
An ML-empowered system is capable of learning from its prior and run-time experiences, which
helps in improving its performance continuously. For example, an ML-assisted weather
monitoring system predicts the weather report of the next seven days with high accuracy from
data collected in the last six months.
Time-efficient:
ML tools are capable of producing faster results as compared to human interpretation. For
example, the weather monitoring system generates a weather prediction report for the upcoming
seven days, using data that goes back to 6–9 months. A manual analysis of such sizeable data for
predicting the weather is difficult and time-consuming. Moreover, the manual process of data
analysis also affects accuracy.
Self-guided:
An ML tool uses a huge amount of data for producing its results. These tools have the capability
of analyzing the huge amount of data for identifying trends autonomously. As an example, when
we search for a particular item on an online e-commerce website, an ML tool analyzes our search
trends. As a result, it shows a range of products similar to the original item that we searched for
initially.
Minimum Human Interaction Required:
In an ML algorithm, a human does not need to participate in every step of its execution. The
ML algorithm trains itself automatically. For example, in traditional systems, humans need to
determine a disease by analyzing different symptoms using standard “if–else” observations. In
contrast, an ML algorithm determines the same disease based on the health data available in the
system, matching these data with the symptoms of the patient.
Diverse Data Handling:
IoT systems consist of different sensors and produce diverse and multi-dimensional data, which
are easily analyzed by ML algorithms.
Diverse Applications:
ML is flexible and can be applied to different application domains such as healthcare, industry,
smart traffic, smart home, and many others. Two similar ML algorithms may serve two different
applications.

17.1.3 Challenges in ML
A few major challenges in ML are listed as follows:
i. Data Description: The data acquired from different sensors are required to be
informative and meaningful. Description of data is a challenging part of ML.
ii. Amount of Data: In order to provide an accurate output, a model must have a sufficient
amount of data. The availability of a huge amount of data is a challenge in ML.
iii. Erroneous Data: A dataset may contain noisy or erroneous data. The learning of a model
is heavily dependent on the quality of data, so erroneous data misleads the ML
model.
iv. Selection of Model: Multiple models may be suitable for serving a particular purpose.
However, one model may perform better than others. In such cases, the proper selection
of the model is pertinent for ML.
v. Quality of Model: After the selection of a model, it is difficult to determine the quality
of the selected model. The quality of the model is essential in an ML-based system.
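One common way to approach challenges iv and v is to hold back part of the data and compare candidate models on it. The following is a minimal sketch of that idea, not a method prescribed by the text: the two candidate models (`fit_mean`, `fit_line`), the error measure, and the data values are all hypothetical and chosen only for illustration.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the average training output."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Candidate model: straight line fitted by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def mean_squared_error(model, xs, ys):
    """Average squared gap between predictions and held-out outputs."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical sensor data, split into a training part and a held-out part.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
test_x, test_y = [4.0, 5.0], [9.0, 11.0]

# Fit every candidate on the training part, score it on the held-out part.
candidates = {"mean": fit_mean, "line": fit_line}
scores = {
    name: mean_squared_error(fit(train_x, train_y), test_x, test_y)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)  # model with the lowest held-out error
print(best, scores)
```

The held-out error gives a rough estimate of model quality on unseen data, which is why the model is never scored on the same points it was trained on.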
17.1.4 Types of ML
Typically, ML algorithms fall into four categories:
1. Supervised
2. Unsupervised
3. Semi-supervised
4. Reinforcement Learning

Supervised Learning:
This type of learning supervises or directs a machine to learn certain activities using labeled
datasets. The labeled data are used as a supervisor to make the machine understand the relation
of the labels with the properties of the corresponding input data.
Supervised ML algorithms are popular in solving classification and regression problems.
Typically, classification deals with predictive models that are capable of approximating a
mapping function from input data to categorical output. On the other hand, regression provides
the mapping function from input data to numerical output. There are different classification
algorithms in ML.
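As an illustration of a mapping from input data to a categorical output, the sketch below uses k-nearest neighbor (KNN), one of the algorithms named in the introduction. The function name `knn_predict`, the feature choice, and the labeled readings are assumptions made for this example only.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query.

    train is a list of (features, label) pairs; features are tuples of numbers.
    """
    # Sort the labeled points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Vote among the labels of the k nearest points.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled readings: (temperature in C, relative humidity in %) -> label.
readings = [
    ((22.0, 40.0), "normal"),
    ((23.5, 42.0), "normal"),
    ((21.0, 38.0), "normal"),
    ((35.0, 80.0), "alert"),
    ((36.5, 85.0), "alert"),
    ((34.0, 78.0), "alert"),
]

prediction = knn_predict(readings, (33.0, 75.0))
print(prediction)
```

Here the labels play the role of the supervisor: the prediction for a new reading is decided entirely by the labeled examples nearest to it.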
We use regression to estimate the relationship between a set of dependent variables and
independent variables, as shown in Figure 17.3. The dependent variables are the primary factors
that we want to predict. However, these dependent variables are affected by the independent
variables. Let x and y be the independent and dependent variables, respectively. Mathematically,
a simple regression model is represented as:

$$y = \beta_0 x_0 + \beta_1 x_1 + \epsilon$$

where
β = the amount of impact of variable x on y and
ϵ = error.
In the given equation, x0 creates β0 impact on y, which indicates that the value of y can never be
0. Similarly, for multiple variables, say n, the regression model is represented as:

$$y = \beta_0 x_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \epsilon$$
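As a minimal sketch of how the β values of a simple regression model can be estimated, the code below uses ordinary least squares. It assumes that x0 is fixed at 1, so that β0 acts as an intercept; the text does not state the value of x0, and the function name and data are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Estimate (beta0, beta1) for y = beta0 + beta1 * x by ordinary least squares.

    Assumes x0 = 1, so beta0 is the intercept.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    # Intercept: makes the fitted line pass through the point of means.
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

# Hypothetical data generated from y = 2 + 3x with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0, 5.0, 8.0, 11.0, 14.0]

beta0, beta1 = fit_simple_regression(xs, ys)
# The error term is what remains after subtracting the fitted values.
residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
print(beta0, beta1)
```

With noisy sensor data the residuals would not vanish; they are the per-sample estimates of the error term ϵ.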

Unsupervised Learning:
Unsupervised learning algorithms use unlabeled datasets to find trends in the data. Unsupervised
learning does not use any labels in its operations. Instead, the ML algorithms in this category try
to identify the nature and properties of the input data on their own.
Unsupervised learning algorithms try to create different clusters based on the features of
the data and relate the inputs to these clusters. Unsupervised learning is usually applied to
solve two types of problems: clustering and association.
Clustering divides the data into multiple groups. In contrast, association discovers the
relationship or association among the data in a dataset.
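To make the clustering idea concrete, here is a small sketch of k-means, which the introduction lists among the popular tools. It is a simplified version under stated assumptions: the first k points are used as starting centroids to keep the run deterministic, the iteration count is fixed, and the sample points are hypothetical.

```python
import math

def kmeans(points, k, iterations=10):
    """Group points into k clusters by alternating assignment and update steps.

    The first k points seed the centroids, a simplification chosen to keep
    this sketch deterministic.
    """
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical unlabeled 2-D readings forming two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
assignments, centroids = kmeans(points, k=2)
print(assignments, centroids)
```

No labels are supplied anywhere; the grouping emerges only from the distances between the points.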

Semi-Supervised Learning:
Semi-supervised learning belongs to a category between supervised and unsupervised learning.
Algorithms under this category use a combination of both labeled and unlabeled datasets for
training. Labeled data are typically expensive to obtain, and labeling them correctly is
relatively difficult. Unlabeled data are less expensive than labeled data. Therefore,
semi-supervised learning uses both labeled and unlabeled datasets to design the learning model.
Traditionally, semi-supervised learning uses mostly unlabeled data, which makes it efficient to
use and capable of handling samples with missing labels.
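One possible way to combine a few labeled samples with many unlabeled ones is self-training (pseudo-labeling). The text does not prescribe a specific algorithm, so the sketch below is only an illustration: the nearest-centroid classifier, the function name `self_train`, and the 1-D data are all assumptions.

```python
def self_train(labeled, unlabeled):
    """Self-training with a nearest-centroid classifier on 1-D values.

    labeled: list of (value, label) pairs; unlabeled: list of values.
    Each round pseudo-labels the unlabeled value that lies closest to a
    class centroid, then treats it as labeled data in later rounds.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        # Recompute one centroid per class from the current labeled set.
        sums = {}
        for value, label in labeled:
            total, count = sums.get(label, (0.0, 0))
            sums[label] = (total + value, count + 1)
        centroids = {label: total / count for label, (total, count) in sums.items()}
        # Most confident candidate: smallest distance to any class centroid.
        value, label = min(
            ((v, l) for v in pool for l in centroids),
            key=lambda pair: abs(pair[0] - centroids[pair[1]]),
        )
        labeled.append((value, label))
        pool.remove(value)
    return labeled

# Hypothetical data: two labeled samples and four unlabeled ones.
seed_labels = [(1.0, "low"), (9.0, "high")]
unlabeled_values = [2.0, 8.0, 4.0, 6.5]

result = dict(self_train(seed_labels, unlabeled_values))
print(result)
```

Only two samples needed manual labels here; the remaining four received labels from the model itself, which is the cost saving that motivates semi-supervised learning.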

Reinforcement Learning:
In reinforcement learning, an agent establishes a pattern from its experiences by interacting
with the environment. The agent therefore plays a crucial role in reinforcement learning
models. It aims to achieve a particular goal in an uncertain environment.
The model starts with an initial state of a problem, for which different solutions are
available. Based on the output, the model receives either a reward or a penalty from the
environment. The output and reward act as inputs for proceeding to the next state. Thus,
reinforcement learning models continue learning iteratively from their experiences while
improving the correctness of their output.
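The state, reward, and penalty loop can be sketched with tabular Q-learning, one well-known reinforcement learning algorithm (the text does not name a specific one). Everything concrete here is an assumption for illustration: the five-cell corridor environment, the reward of 1.0 at the goal, the penalty of 0.1 per other move, and the learning parameters.

```python
import random

N_STATES = 5         # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)   # index 0 moves left, index 1 moves right

def step(state, action):
    """Environment: return (next_state, reward, done) for one move."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True   # reward for reaching the goal
    return next_state, -0.1, False     # small penalty for every other move

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by repeated interaction."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy action for every non-goal cell after training.
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(policy)
```

Each update uses the output of the previous move and the reward it earned as input for the next decision, so the table improves iteratively as the agent gains experience.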
