MACHINE
LEARNING
PALLAVI SHUKLA
Assistant Professor
United College Of Engineering & Research, Prayagraj
Email-Id : pallavi.shkl@gmail.com
UNIT 1 - INTRODUCTION
Learning, Types of Learning, Well-defined learning problems,
Designing a Learning System, History of ML,
Introduction of Machine Learning Approaches (Artificial Neural Network,
Clustering, Reinforcement Learning, Decision Tree Learning,
Bayesian networks, Support Vector Machine, Genetic Algorithm),
Issues in Machine Learning, and Data Science vs. Machine Learning.
What is Human learning?
• Human learning is all about observing things, recognizing a pattern,
elaborating a theory or model that explains that pattern, and then
putting that theory to the test and checking whether it matches most
or all of the observations.
• Learning is, basically, a model that represents a pattern within a
collection of observations.
• Without a feasible model, there is no learning.
Types of Human Learning :
• (1) either somebody who is an expert in the subject directly teaches us,
• (2) we build our own notion indirectly based on what we have learned
from the expert in the past,
• (3) we do it ourselves, maybe after multiple attempts, some being
unsuccessful.
Learning under Expert Guidance -
• An infant may inculcate certain traits and characteristics, learning
straight from its guardians.
• He calls his hand a ‘hand’ because that is the information he gets
from his parents.
• The sky is ‘blue’ to him because that is what his parents have taught
him.
• We say that the baby ‘learns’ things from his parents.
Learning guided by knowledge gained
from experts -
• In all these situations, there is no direct learning from an expert. Instead,
past information, shared in a different context, is used as learning to
make decisions.
Learning by Self -
• In many situations, humans are left to learn on their own.
• A lot of things need to be learned only from mistakes made in the past.
• We tend to form a checklist of things that we should do, and things
that we should not do, based on our experiences.
Machine Learning:
• Machine learning is a branch of artificial
intelligence (AI) and computer science that focuses
on the use of data and algorithms to imitate the
way that humans learn, gradually improving its
accuracy
• Machine Learning is the field of study that gives
computers the capability to learn without being
explicitly programmed. (‘A computer program is
said to learn from experience E with respect to
some class of tasks T and performance measure P,
if its performance at tasks in T, as measured by P,
improves with experience E.’)
Examples –
Handwriting recognition learning problem –
• Task T: Recognizing and classifying handwritten words within images
• Performance P: Percent of words correctly classified
• Training experience E: A dataset of handwritten words with given
classifications
A robot driving learning problem -
• Task T: Driving on highways using vision sensors
• Performance measure P: Average distance traveled before an error
• Training experience E: A sequence of images and steering commands
recorded while observing a human driver.
A chess learning problem -
• Task T: Playing chess.
• Performance measure P: Percent of games won against opponents.
• Training experience E: Playing practice games against itself.
• A computer program which learns from experience is called a machine
learning program or simply a learning program.
• Such a program is sometimes also referred to as a learner.
How do machines learn?
The basic Machine Learning process can be divided into four parts.
1. Data Input: Past data or information is utilized as a basis for future
decision-making.
2. Abstraction: The input data is represented in a broader way through
the underlying algorithm.
3. Generalization: The abstracted representation is generalized to form
a framework for making decisions.
4. Evaluation: A feedback mechanism is provided to measure the utility of
learned knowledge and inform potential improvements.
Machine Learning Process -
Data Storage:
• Facilities for storing and retrieving huge amounts of data are an
important component of the learning process. Humans and computers
alike utilize data storage as a foundation for advanced reasoning.
• In a human being, the data is stored in the brain, and data is retrieved
using electrochemical signals.
• Computers use hard disk drives, flash memory, random access
memory, and similar devices to store data and use cables and other
technology to retrieve data.
Abstraction:
• Abstraction is the process of extracting knowledge about stored data.
• This involves creating general concepts about the data as a whole.
• The creation of knowledge involves the application of known models and the
creation of new models.
• The process of fitting a model to a dataset is known as training.
• When the model has been trained, the data is transformed into an abstract
form that summarizes the original information.
• This work of assigning a broader meaning to stored data occurs during the
abstraction process, in which raw data comes to represent a wider, more
abstract concept or idea.
• This type of connection, say between an object and its representation, is
exemplified by the famous René Magritte painting The Treachery of Images.
• There are many different types of models. You may already be familiar
with some. Examples include:
• Mathematical equations
• Relational diagrams, such as trees and graphs
• Logical if/else rules
• Groupings of data known as clusters.
Generalization -
• The term generalization describes the process of turning the knowledge
about stored data into a form that can be utilized for future action.
• These actions are to be carried out on tasks that are similar, but not
identical, to those that have been seen before.
• In generalization, the goal is to discover those properties of the data that
will be most relevant to future tasks.
• It acts as a search through the entire set of models (that is, theories or
inferences) that could be established from the data during training.
• In generalization, the learner is tasked with limiting the patterns it
discovers to only those that will be most relevant to its future tasks.
• Normally, it is not feasible to reduce the number of patterns by
examining them one by one and ranking them by future utility.
• Instead, machine learning algorithms generally employ shortcuts that
reduce the search space more quickly.
• To this end, the algorithm will employ heuristics, which are educated
guesses about where to find the most useful inferences.
Evaluation -
• Evaluation is the last component of the learning process.
• It is the process of giving feedback to the user to measure the utility of
the learned knowledge.
• This feedback is then utilized to effect improvements in the whole
learning process.
• The final step in the learning process is to evaluate its success and to
measure the learner's performance in spite of its biases.
• The information gained in the evaluation phase can then be used to
inform additional training if needed.
• Generally, evaluation occurs after a model has been trained on an
initial training dataset.
• Then, the model is evaluated on a separate test dataset in order to
judge how well its characterization of the training data generalizes to
new, unseen cases.
• It's worth noting that it is exceedingly rare for a model to perfectly
generalize to every unforeseen case—mistakes are almost always
inevitable.
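To make this train-and-evaluate cycle concrete, here is a minimal Python sketch (assuming scikit-learn is installed; the Iris dataset and the k-nearest-neighbors model are illustrative choices, not part of the original text):

```python
# A minimal sketch of the train/evaluate cycle, assuming scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Hold out a separate test set to judge generalization to unseen cases.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)            # training = fitting a model to data

# Evaluation: performance on data the learner has never seen before.
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```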
Well-posed learning problem:
• For defining a new problem, which can be solved using
machine learning, a simple framework, highlighted below, can
be used.
• This framework also helps in deciding whether the problem is a
right candidate to be solved using machine learning.
• The framework involves answering three questions:
• 1. What is the problem?
• 2. Why does the problem need to be solved?
• 3. How to solve the problem?
• Step 1: What is the Problem?
• Several pieces of information should be collected to understand the problem.
• Informal description of the problem, e.g.
• I need a program that will prompt the next word as and when I type a word.
• Formalism - Use Tom Mitchell’s machine learning formalism stated above to define the
T, P, and E for the problem.
• For example: Task (T): Prompt the next word when I type a word.
Experience (E): A corpus of commonly used English words and phrases.
Performance (P): The number of correct words prompted, considered as
a percentage (which in the machine learning paradigm is known as learning accuracy).
• Assumptions - Create a list of assumptions about the problem.
• Similar problems - What other problems have you seen, or can you think of, that are
similar to the problem you are trying to solve?
Step 2: Why does the problem need to
be solved?
Motivation
• What is the motivation for solving the problem?
• What requirement will it fulfill?
• For example, does this problem solve any long-standing business issue
like finding out potentially fraudulent transactions?
• Or is the purpose more trivial, like trying to suggest some movies for
the upcoming weekend?
Step 3: How would I solve the problem?
• Try to explore how to solve the problem manually.
• Detail out step-by-step data collection, data preparation, and program
design to solve the problem.
• Collect all these details and update the previous sections of the
problem definition, especially the assumptions.
Introduction to
ML -
PALLAVI SHUKLA
Assistant professor
Applications of Machine Learning-
• Application of machine learning methods to large databases is called data mining.
• In data mining, a large volume of data is processed to construct a simple model
with valuable use, for example, having high predictive accuracy.
• The following is a list of some of the typical applications of machine learning.
1. In retail business, machine learning is used to study consumer behavior.
2. In finance, banks analyze their past data to build models to use in credit
applications, fraud detection, and the stock market.
3. In manufacturing, learning models are used for optimization, control, and
troubleshooting.
4. In medicine, learning programs are used for medical diagnosis.
5. In telecommunications, call patterns are analyzed for network optimization and
maximizing the quality of service.
• In science, large amounts of data in physics, astronomy, and biology can
only be analyzed fast enough by computers.
• The World Wide Web is huge and constantly growing; searching it for
relevant information cannot be done manually.
• In artificial intelligence, machine learning is used to teach a system to learn
and adapt to changes so that the system designer does not have to foresee
and provide solutions for all possible situations.
• It is used to find solutions to many problems in vision, speech recognition,
and robotics.
• Machine learning methods are applied in the design of computer-controlled
vehicles to steer correctly when driving on a variety of roads.
• Machine learning methods have been used to develop programs for
playing games such as chess, backgammon, and Go.
History of ML:
• 1950s
– Samuel’s checker player
– Selfridge’s Pandemonium
• 1960s:
– Neural networks: Perceptron
– Pattern recognition
– Learning in the limit theory
– Minsky and Papert prove limitations of Perceptron
• 1970s:
– Symbolic concept induction
– Winston’s arch learner
– Expert systems and the knowledge acquisition bottleneck
– Quinlan’s ID3
– Michalski’s AQ and soybean diagnosis
– Scientific discovery with BACON
– Mathematical discovery with AM
History of ML:
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism,
backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning
History of ML:
• 2000s
– Support vector machines & kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications (Compilers, Debugging, Graphics, Security)
– E-mail management
– Personalized assistants that learn
– Learning in robotics and vision
• 2010s
– Deep learning systems
– Learning for big data
– Bayesian methods
– Multi-task & lifelong learning
– Applications to vision, speech, social networks, learning to read, etc.
TYPES OF MACHINE LEARNING -
Supervised Learning -
• It involves the presence of a supervisor acting as a teacher.
• It is the task of learning a function that maps an input to an output based on
example input-output pairs.
• A training set of examples with the correct responses (targets) is provided
and, based on this training set, the algorithm generalizes to respond
correctly to all possible inputs. This is also called learning from exemplars.
• A supervised learning algorithm analyzes the training data and produces
a function, which can be used for mapping new examples.
• If the shape of the object is rounded, it has a depression
at the top, and it is red in color, then it will be labeled as ‘Apple’.
• If the shape of the object is a long curving cylinder with a
green-yellow color, then it will be labeled as ‘Banana’.
Types of Supervised Learning -
Classification:
• Classification algorithms are used to predict/classify discrete values such as
Male or Female, True or False, Spam or Not Spam, etc.
• A computer program is trained on the training dataset and, based on that
training, it categorizes the data into different classes.
Regression:
• Regression algorithms are used to predict continuous values such as price, salary, age, etc.
• They work by finding the correlations between dependent and independent variables
(see the sketch below).
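As an illustrative sketch of the two task types, the following Python snippet (assuming scikit-learn; the tiny datasets are invented purely for demonstration) fits one classifier for discrete labels and one regressor for continuous values:

```python
# A minimal sketch contrasting classification (discrete labels) with
# regression (continuous values), assuming scikit-learn.
from sklearn.linear_model import LogisticRegression, LinearRegression

# Classification: spam (1) vs. not spam (0) from two made-up features.
X_cls = [[0.1, 5], [0.9, 1], [0.8, 2], [0.2, 7]]
y_cls = [0, 1, 1, 0]
clf = LogisticRegression().fit(X_cls, y_cls)
print(clf.predict([[0.85, 1.5]]))      # -> a discrete class label

# Regression: predicting a continuous salary from years of experience.
X_reg = [[1], [2], [3], [4]]
y_reg = [30000, 35000, 41000, 45000]
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[5]]))              # -> a continuous value
```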
Advantages of Supervised Algorithm -
• Supervised learning allows collecting data and producing output
based on previous experience.
• Helps to optimize performance criteria with the help of experience.
• Supervised machine learning helps to solve various types of real-
world computation problems.
• It performs classification and regression tasks.
• It allows estimating or mapping the result to a new sample.
• We have complete control over choosing the number of classes we
want in the training data.
Disadvantages Of Supervised Algorithm
-
• Classifying big data can be challenging.
• Training for supervised learning needs a lot of computation time, so it
requires a lot of time.
• Supervised learning cannot handle all complex tasks in Machine
Learning.
• It requires a labeled data set.
• It requires a training process.
Introduction to
Machine Learning
– Lecture 3
PALLAVI SHUKLA
Assistant professor
Unsupervised Learning -
• It is the training of a machine using information that is neither classified nor
labeled and allowing the algorithm to act on that information without guidance.
• The task of the machine is to group unsorted information according to
similarities, patterns, and differences without any prior training of data.
• Type of machine learning algorithm used to draw inferences from datasets
consisting of input data without labeled responses.
Types of Unsupervised Learning -
Clustering -
• A clustering problem is one where you want to discover the inherent
groupings in the data, such as grouping customers by purchasing behavior.
• Clustering is a method of grouping objects into clusters such that objects with
the most similarities remain in one group and have few or no similarities with
the objects of another group (see the sketch below).
Association -
• It is used for finding relationships between variables in a large database.
• It determines the sets of items that occur together in the dataset.
Association rules make marketing strategies more effective.
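A minimal clustering sketch follows, assuming scikit-learn; the two-dimensional points below are invented stand-ins for, say, customer purchasing-behavior features:

```python
# A minimal clustering sketch, assuming scikit-learn; the data is invented.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],      # one natural grouping
              [10, 2], [10, 4], [10, 0]])  # another natural grouping

# No labels are given; the algorithm discovers the groupings itself.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)                      # cluster assignment of each point
print(kmeans.predict([[0, 0], [12, 3]]))   # group new, unlabeled points
```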
Advantages of Unsupervised Learning:
• It does not require training data to be labeled.
• Dimensionality reduction can be easily accomplished using unsupervised learning.
• Capable of finding previously unknown patterns in data.
• Flexibility: Unsupervised learning is flexible in that it can be applied to a wide
variety of problems, including clustering, anomaly detection, and association rule
mining.
• Exploration: Unsupervised learning allows for the exploration of data and the
discovery of novel and potentially useful patterns that may not be apparent from the
outset.
• Low cost: Unsupervised learning is often less expensive than supervised learning
because it doesn’t require labeled data, which can be time-consuming and costly to
obtain.
Disadvantages of Unsupervised
Learning :
• Difficult to measure accuracy or effectiveness due to the lack of predefined answers
during training.
• The results often have lower accuracy.
• The user needs to spend time interpreting and labeling the classes that follow
from the clustering.
• Lack of guidance: Unsupervised learning lacks the guidance and feedback
provided by labeled data, which can make it difficult to know whether the
discovered patterns are relevant or useful.
• Sensitivity to data quality: Unsupervised learning can be sensitive to data
quality, including missing values, outliers, and noisy data.
• Scalability: Unsupervised learning can be computationally expensive,
particularly for large datasets or complex algorithms, which can limit its scalability.
REINFORCEMENT LEARNING -
• It is the problem of getting an agent to act in the world so as to
maximize its rewards.
• A learner (the program) is not told which actions to take, as in most
forms of machine learning, but instead must discover which actions
yield the most reward by trying them.
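To make the idea of discovering rewarding actions by trial concrete, here is a minimal tabular Q-learning sketch in Python; the five-state corridor environment and the learning-rate, discount, and exploration parameters are all invented for illustration:

```python
# A minimal tabular Q-learning sketch on a hypothetical 5-state corridor:
# the agent starts at state 0, and only reaching state 4 yields a reward.
import random

n_states, actions = 5, [-1, +1]          # move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # illustrative parameter values

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # Explore occasionally; otherwise exploit the best known action.
        a = random.choice(actions) if random.random() < epsilon \
            else max(actions, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0   # reward only at the goal
        # Update the action-value estimate from the observed reward.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions)
                              - Q[(s, a)])
        s = s2

# The learned policy: the best action in each non-terminal state.
print({s: max(actions, key=lambda act: Q[(s, act)])
       for s in range(n_states - 1)})
```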
Application of Reinforcement
Learning -
• 1. Robotics: Robots with pre-programmed behavior are useful in
structured environments, such as the assembly line of an
automobile manufacturing plant, where the task is repetitive in
nature.
• 2. A master chess player makes a move. The choice is informed
by planning: anticipating possible replies and counter-replies.
• 3. An adaptive controller adjusts parameters of a petroleum
refinery’s operation in real time.
Advantages of Reinforcement Learning-
• 1. Can be used to solve very complex problems that cannot be solved by
conventional techniques.
• 2. The model can correct the errors that occurred during the training process.
• 3. In RL, training data is obtained via the direct interaction of the agent with the
environment.
• 4. Can handle environments that are non-deterministic, meaning that the
outcomes of actions are not always predictable. This is useful in real-world
applications where the environment may change over time or is uncertain.
• 5. Can be used to solve a wide range of problems, including those that involve
decision-making, control, and optimization.
• 6. A flexible approach that can be combined with other machine learning
techniques, such as deep learning, to improve performance.
Disadvantages of Reinforcement Learning -
• 1. It is not preferable to use it for solving simple problems.
• 2. It needs a lot of data and a lot of computation.
• 3. It is highly dependent on the quality of the reward function. If the
reward function is poorly designed, the agent may not learn the
desired behavior.
• 4. It can be difficult to debug and interpret. It is not always clear why
the agent is behaving in a certain way, which can make it difficult to
diagnose and fix problems.
Introduction to
machine learning
approaches- Lecture 4
Pallavi Shukla
Assistant professor
Computer science & engineering
Introduction to Machine Learning Approaches-
• Artificial Neural Network
• Clustering
• Reinforcement Learning
• Decision Tree Learning
• Bayesian Networks
• Support Vector Machine
• Genetic Algorithm
Artificial Neural Network(ANN):
• It is an information processing paradigm that is inspired by the
way biological nervous systems, such as the brain, process
information.
• It is composed of a large number of highly interconnected
processing elements (neurons) working in unison to solve a specific
problem.
• Biological neurons (also called nerve cells), or simply neurons, are
the fundamental units of the brain and nervous system: the cells
responsible for receiving sensory input from the external world via
dendrites, processing it, and giving the output through axons.
• Cell body (Soma): The body of the neuron cell contains the
nucleus and carries out biochemical transformation necessary to
the life of neurons.
• Dendrites: Each neuron has fine, hair-like tubular structures
(extensions) around it. They branch out into a tree around the cell
body. They accept incoming signals.
• Axon: It is a long, thin, tubular structure that works like a
transmission line.
• Synapse: Neurons are connected to one another in a complex
spatial arrangement. When the axon reaches its final destination, it
branches again, in what is called terminal arborization. At the end of
the axon are highly complex and specialized structures called synapses.
• Dendrites receive input through the synapses of other neurons.
• The soma processes these incoming signals over time and
converts that processed value into an output, which is sent out
to other neurons through the axon and the synapses.
• The following diagram represents the general model of ANN
which is inspired by a biological neuron. It is also called
Perceptron.
• A single layer neural network is called a Perceptron.
• It gives a single output.
• In the above figure, for one single observation, x0, x1, x2,
x3...x(n) represents various inputs (independent variables) to the
network.
• Each of these inputs is multiplied by a connection weight or
synapse.
• The weights are represented as w0, w1, w2, w3…. w(n).
• Weight shows the strength of a particular node. b is a bias value.
• A bias value allows you to shift the activation function up or
down.
• In the simplest case, these products are summed, fed to a
transfer function (activation function) to generate a result,
and this result is sent as output.
• Mathematically: x1·w1 + x2·w2 + x3·w3 + ... + xn·wn = ∑ xi·wi
• The activation function is then applied: 𝜙(∑ xi·wi)
• The Activation function is important for an ANN to learn and
make sense of something really complicated. Their main
purpose is to convert an input signal of a node in an ANN to
an output signal. This output signal is used as input to the
next layer in the stack
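The computation just described can be sketched in a few lines of Python with NumPy; the input, weight, and bias values below are arbitrary illustrative numbers, and a simple threshold stands in for the activation function 𝜙:

```python
# A minimal NumPy sketch of the perceptron computation described above:
# weighted sum of inputs plus bias, passed through an activation function.
import numpy as np

def step(z):                    # a simple threshold activation, phi
    return 1 if z >= 0 else 0

x = np.array([1.0, 0.5, -0.2])  # inputs x1..xn (illustrative values)
w = np.array([0.4, -0.6, 0.9])  # connection weights w1..wn
b = 0.1                         # bias: shifts the activation up or down

z = np.dot(x, w) + b            # sum of xi * wi, plus bias
print(step(z))                  # the single output of the perceptron
```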
CLUSTERING -
• Clustering is the task of dividing the population or data
points into a number of groups such that data points in
the same group are more similar to each other than to
data points in other groups.
• In simple words, the aim is to segregate groups with
similar traits and assign them into clusters.
Example of Clustering -
• Suppose you are the head of a rental store and wish to understand
the preferences of your customers to scale up your business.
• Is it possible for you to look at the details of each customer and devise
a unique business strategy for each one of them?
• Definitely not.
• But, what you can do is to cluster all of your customers into say 10
groups based on their purchasing habits and use a separate
strategy for customers in each of these 10 groups. And this is what
we call clustering.
Types of Clustering:
• Hard Clustering: In hard clustering, each data point either
belongs to a cluster completely or not. For example, in the above
example, each customer is put into one group out of the 10
groups.
• Soft Clustering: In soft clustering, instead of putting each data
point into a single cluster, a probability or likelihood of that
data point belonging to each cluster is assigned. For example, in
the above scenario, each customer is assigned a probability of
being in any of the 10 clusters of the retail store (see the sketch below).
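To illustrate the distinction, the following sketch (assuming scikit-learn; the points are invented) shows hard assignments from k-means next to per-cluster membership probabilities from a Gaussian mixture model, one common way to obtain soft clusters:

```python
# A minimal sketch contrasting hard and soft clustering, assuming
# scikit-learn; the 2-D points are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.array([[1, 1], [1.5, 2], [8, 8], [8.5, 9], [5, 5]])

# Hard clustering: each point gets exactly one cluster label.
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X))

# Soft clustering: each point gets a probability for every cluster.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gmm.predict_proba(X).round(2))   # per-cluster membership probabilities
```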
Decision tree -
• It is like a tree structure that works on the principle of conditions.
• It is efficient and has strong algorithms used for predictive analysis.
• Its main attributes include internal nodes, branches, and terminal
nodes.
• Every internal node holds a “test” on an attribute, branches hold the
conclusion of the test, and every leaf node represents a class label.
• This is the most used algorithm when it comes to supervised learning
techniques.
• It is used for both classification as well as regression.
• It is often termed “CART”, which means Classification and Regression Tree.
• Tree algorithms are always preferred due to stability and reliability.
• Branches - Division of the whole tree is called branches.
• Root Node - Represent the whole sample that is further divided
• Splitting - Division of nodes is called splitting.
• Terminal Node - Node that does not split further is called a
terminal node.
• Decision Node - A sub-node that gets further divided into
different sub-nodes.
• Pruning - Removal of sub-nodes from a decision node.
• Parent and Child Node - When a node gets divided further,
that node is termed the parent node, whereas the divided nodes,
or sub-nodes, are termed child nodes of the parent node.
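A minimal decision-tree sketch follows, assuming scikit-learn; the tiny fruit-like dataset (two made-up features, two classes) is purely illustrative:

```python
# A minimal decision-tree sketch, assuming scikit-learn; data is invented.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features: [roundness, redness]; labels: 0 = banana, 1 = apple.
X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.2]]
y = [1, 1, 0, 0]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# Each internal node holds a test on an attribute; leaves hold class labels.
print(export_text(tree, feature_names=["roundness", "redness"]))
print(tree.predict([[0.85, 0.7]]))     # classify a new object
```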
Advantages of the Decision Tree:
1. It is simple to understand, as it follows the same process which a
human follows while making any decision in real life.
2. It can be very useful for solving decision-related problems.
3. It helps to think about all the possible outcomes for a problem.
4. There is less requirement of data cleaning compared to other
algorithms.
Disadvantages of the Decision Tree:
1.The decision tree contains lots of layers, which makes it complex.
2. It may have an overfitting issue, which can be resolved using the
Random Forest algorithm.
3. For more class labels, the computational complexity of the decision
tree may increase.
Bayesian networks -
• Bayesian networks are a type of probabilistic graphical model that
uses Bayesian inference for probability computations.
• Bayesian networks aim to model conditional dependence, and
thereby causation, by representing conditional dependencies as edges in a
directed graph.
• Through these relationships, one can efficiently conduct inference on
the random variables in the graph through the use of factors.
• Using the relationships specified by our Bayesian network, we can
obtain a compact, factorized representation of the joint probability
distribution by taking advantage of conditional independence.
• Bayesian network is a directed acyclic graph in which each
edge corresponds to a conditional dependency, and each
node corresponds to a unique random variable.
• Formally, if an edge (A, B) exists in the graph connecting
random variables A and B, it means that P(B|A) is a factor
in the joint probability distribution, so we must know P(B|A)
for all values of B and A in order to conduct inference.
• In the above example, since Rain has an edge going into
WetGrass, it means that P(WetGrass|Rain) will be a factor,
whose probability values are specified next to the
WetGrass node in a conditional probability table.
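The Rain → WetGrass factorization mentioned above can be sketched in plain Python; the probability values in these conditional probability tables are invented for illustration:

```python
# A minimal plain-Python sketch of the Rain -> WetGrass factorization:
# P(Rain, WetGrass) = P(Rain) * P(WetGrass | Rain).
# All probability values here are invented for illustration.
P_rain = {True: 0.2, False: 0.8}                     # P(Rain)
P_wet_given_rain = {True: {True: 0.9, False: 0.1},   # P(WetGrass | Rain)
                    False: {True: 0.2, False: 0.8}}

def joint(rain, wet):
    """Joint probability built from the network's factors."""
    return P_rain[rain] * P_wet_given_rain[rain][wet]

# Inference by summing out Rain: P(WetGrass = True).
p_wet = sum(joint(r, True) for r in (True, False))
print(p_wet)   # 0.2 * 0.9 + 0.8 * 0.2 = 0.34
```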
Introduction to machine
learning
approaches(PART II)
Lecture 5
Pallavi Shukla
Assistant professor
Computer science & engineering
Support Vector Machines -
• Support Vector Machine (SVM) is a powerful machine learning
algorithm used for linear or nonlinear classification, regression,
and even outlier detection tasks.
• SVMs can be used for a variety of tasks, such as text
classification, image classification, spam detection, handwriting
identification, gene expression analysis, face detection, and
anomaly detection.
• SVMs are adaptable and efficient in a variety of applications
because they can manage high-dimensional data and nonlinear
relationships.
Support Vector Machines -
• In SVM, we plot each data item in the dataset in an N-dimensional
space, where N is the number of features/attributes in the data.
• Next, find the optimal hyperplane to separate the data.
• So, by this, you must have understood that inherently, SVM can
only perform binary classification (i.e., choose between two
classes)
• The main objective of the SVM algorithm is to find the optimal
hyperplane in an N-dimensional space that can separate the data
points in different classes in the feature space.
Support Vector Machines -
• Let’s consider two independent variables, x1 and x2, and one
dependent variable which is either a blue circle or a red circle.
• From the figure above it’s very clear that there are multiple lines
(our hyperplane here is a line because we are considering only two
input features x1, x2) that segregate our data points or do a
classification between red and blue circles. So how do we choose
the best line or in general the best hyperplane that segregates our
data points?
Support Vector Machine Terminology -
1.Hyperplane: Hyperplane is the decision boundary that is used to separate the
data points of different classes in a feature space. In the case of linear
classifications, it will be a linear equation i.e. wx+b = 0.
2. Support Vectors: Support vectors are the data points closest to the
hyperplane, which play a critical role in deciding the hyperplane and the
margin.
3. Margin: Margin is the distance between the support vectors and the hyperplane.
The main objective of the support vector machine algorithm is to maximize the
margin. A wider margin indicates better classification performance.
Support Vector Machine Terminology -
4. Kernel: The kernel is the mathematical function used in SVM to map the original input
data points into high-dimensional feature spaces, so that the hyperplane can easily be found
even if the data points are not linearly separable in the original input space. Some common
kernel functions are linear, polynomial, radial basis function (RBF), and sigmoid.
5. Hard Margin: The maximum-margin hyperplane or the hard margin hyperplane is a
hyperplane that properly separates the data points of different categories without any
misclassifications.
6. Soft Margin: When the data is not perfectly separable or contains outliers, SVM permits a soft
margin technique. Each data point has a slack variable introduced by the soft-margin SVM
formulation, which softens the strict margin requirement and permits certain misclassifications or
violations. It discovers a compromise between increasing the margin and reducing violations.
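A minimal SVM sketch follows, assuming scikit-learn; the XOR-like points are invented to show data that is not linearly separable, where an RBF kernel lets a separating hyperplane be found in the mapped feature space:

```python
# A minimal SVM sketch, assuming scikit-learn; the XOR-like points below
# are invented to illustrate data that is not linearly separable.
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [0, 1], [1, 0]]   # XOR pattern
y = [0, 0, 1, 1]

# C trades off a wider (soft) margin against misclassifications; the RBF
# kernel maps the points into a space where a hyperplane can separate them.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print(clf.predict([[0.9, 0.9], [0.1, 0.9]]))
print(clf.support_vectors_)            # the points that define the margin
```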
Genetic Algorithm -
• A genetic algorithm (GA) is a heuristic search algorithm used to solve
search and optimization problems. This algorithm is a subset of
evolutionary algorithms, which are used in computation. Genetic
algorithms employ the concept of genetics and natural selection to
provide solutions to problems.
• These algorithms have better intelligence than random search
algorithms because they use historical data to take the search to the
best performing region within the solution space.
• GAs are also based on the behavior of chromosomes and their genetic
structure.
• Every chromosome plays the role of providing a possible solution.
• The fitness function helps in providing the characteristics of all
individuals within the population.
• The greater the fitness value, the better the solution.
Phases of genetic algorithm :
• Initialization
• Fitness assignment
• Selection
• Reproduction
• Crossover
• Mutation
INITIALIZATION
• The genetic algorithm starts by generating an
initial population.
• This initial population consists of all the probable
solutions to the given problem.
• The most popular technique for initialization is the
use of random binary strings.
Fitness assignment
• The fitness function helps in establishing the fitness of all
individuals in the population.
• It assigns a fitness score to every individual, which further
determines the probability of being chosen for reproduction.
• The higher the fitness score, the higher the chances of
being chosen for reproduction.
Selection
• In this phase, individuals are selected for the reproduction of offspring.
• The selected individuals are then arranged in pairs of two to enhance
reproduction.
• These individuals pass on their genes to the next generation.
• The main objective of this phase is to establish the region with high
chances of generating the best solution to the problem (better than the
previous generation).
• The genetic algorithm uses the fitness proportionate selection
technique to ensure that useful solutions are used for recombination.
Reproduction
• This phase involves the creation of a child
population.
• The algorithm employs variation operators that are
applied to the parent population.
• The two main operators in this phase include
crossover and mutation.
Crossover:
• This operator swaps the genetic information of two
parents to reproduce an offspring.
• It is performed on parent pairs that are selected
randomly to generate a child population of equal
size as the parent population.
Mutation:
• This operator adds new genetic information to the new child
population.
• This is achieved by flipping some bits in the chromosome.
• Mutation solves the problem of local minimum and enhances
diversification.
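The phases above can be combined into a short, self-contained Python sketch; the OneMax objective (maximize the number of 1-bits in a string), the population size, and the rates are all illustrative assumptions:

```python
# A minimal genetic-algorithm sketch for the OneMax problem; each phase
# below mirrors the slides above. All parameter values are illustrative.
import random

LENGTH, POP, GENS, MUT_RATE = 20, 30, 40, 0.02

def fitness(chrom):                       # fitness assignment
    return sum(chrom)

def select(pop):                          # fitness-proportionate selection
    return random.choices(pop, weights=[fitness(c) + 1 for c in pop], k=2)

def crossover(p1, p2):                    # single-point crossover
    cut = random.randint(1, LENGTH - 1)
    return p1[:cut] + p2[cut:]

def mutate(chrom):                        # flip bits with small probability
    return [1 - g if random.random() < MUT_RATE else g for g in chrom]

# Initialization: a population of random binary strings.
population = [[random.randint(0, 1) for _ in range(LENGTH)]
              for _ in range(POP)]

for _ in range(GENS):                     # reproduction over generations
    population = [mutate(crossover(*select(population))) for _ in range(POP)]

best = max(population, key=fitness)
print(fitness(best), best)
```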
ISSUES IN MACHINE
LEARNING- Lecture
6
Pallavi Shukla
Assistant professor
Computer science & engineering
Inadequate Training Data -
• Lack of quality as well as quantity of data.
• Many data scientists claim that inadequate, noisy, and unclean data
are extremely taxing for machine learning algorithms.
• For example, a simple task requires thousands of samples, while an
advanced task such as speech or image recognition needs millions of
examples.
• Further, data quality is also important for the algorithms to work ideally, but
poor data quality is frequently encountered in Machine Learning applications.
Factors that affect data quality -
• Noisy Data- It is responsible for an inaccurate prediction that affects the decision
as well as accuracy in classification tasks.
• Incorrect data- It is also responsible for faulty results from machine
learning models. Hence, incorrect data may affect the accuracy of the
results.
• Generalizing of output data- Sometimes it is also found that generalizing
output data becomes complex, which results in comparatively poor future actions.
Poor quality of data-
• Noisy data, incomplete data, inaccurate data, and
unclean data lead to less accuracy in classification
and low-quality results.
• Hence, data quality can also be considered a major
common problem while processing machine learning
algorithms.
Non-representative training data -
• To make sure our training model generalizes well, we have to ensure that the
sample training data is representative of the new cases to which we need to generalize.
• The training data must cover all cases that have already occurred as well as
those currently occurring.
• Further, if we are using non-representative training data in the model, it results in
less accurate predictions.
• A machine learning model is said to be ideal if it predicts well for generalized cases
and provides accurate decisions.
• If there is too little training data, there will be sampling noise in the model;
this is called a non-representative training set.
• Such a model won't be accurate in its predictions and will be biased
against one class or group.
Overfitting and Underfitting -
Overfitting –
Overfitting is one of the most common issues faced by Machine Learning engineers and data
scientists.
Whenever a machine learning model is trained with a huge amount of data, it starts capturing noise
and inaccurate values from the training data set.
It negatively affects the performance of the model.
Let's understand with a simple example where the training data set contains 1000
mangoes, 1000 apples, 1000 bananas, and 5000 papayas.
Then there is a considerable probability of identifying an apple as a papaya, because we have a
massive amount of biased data in the training data set; hence, the prediction is negatively affected.
The main reason behind overfitting is the use of non-linear methods in machine learning algorithms,
as they can build unrealistic data models.
We can reduce overfitting by using linear and parametric algorithms in machine learning
models.
Methods to reduce overfitting:
• Increase training data in a dataset.
• Reduce model complexity by simplifying the model by selecting one with fewer
parameters
• Ridge Regularization and Lasso Regularization (see the sketch after this list)
• Early stopping during the training phase
• Reduce the noise
• Reduce the number of attributes in training data.
• Constraining the model.
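As a hedged illustration of the regularization item above, the following sketch (assuming scikit-learn; the synthetic data is invented) compares plain linear regression with Ridge and Lasso, which constrain the model's weights so it cannot chase noise in the training data:

```python
# A minimal sketch of regularization as an overfitting remedy, assuming
# scikit-learn; the noisy synthetic data below is purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(30, 10))            # 10 features, few samples
y = 3 * X[:, 0] + rng.normal(0, 0.1, size=30)   # only feature 0 matters

# Ridge (L2) shrinks all weights; Lasso (L1) can zero out irrelevant ones.
for model in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.05)):
    model.fit(X, y)
    print(type(model).__name__, np.round(model.coef_, 2))
```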
Underfitting :
• Underfitting is just the opposite of overfitting.
• Whenever a machine learning model is trained with too little data, it gives
incomplete and inaccurate predictions, and this destroys the accuracy of
the machine learning model.
• Underfitting occurs when our model is too simple to capture the underlying
structure of the data, just like an undersized pant.
• This generally happens when we have limited data in the data set and we
try to build a linear model with non-linear data.
• In such scenarios, the model loses the needed complexity, its rules become
too simple to apply to the data set, and it starts making wrong predictions
as well.
Methods to reduce Underfitting:
• Increase model complexity
• Remove noise from the data
• Train on more and better features
• Reduce the constraints
• Increase the number of epochs to get better results.
Monitoring and Maintenance -
• As we know that generalized output data is mandatory for
any machine learning model; hence, regular monitoring
and maintenance become compulsory for the same.
• Different actions produce different results and require data changes;
hence, editing code, as well as the resources for monitoring the model,
also becomes necessary.
Getting Bad Recommendations-
• A machine learning model operates under a specific context, and when that
context changes, it gives bad recommendations; this is known as concept
drift in the model.
• Let's understand with an example where, at a specific time, a customer is
looking for some gadgets; the customer's requirements change over time,
but the machine learning model still shows the same recommendations even
though the customer's expectations have changed.
• This incident is called Data Drift.
• It generally occurs when new data is introduced or the interpretation of
data changes. However, we can overcome this by regularly updating
and monitoring the data according to expectations.
Lack of skilled resources -
• Although Machine Learning and Artificial Intelligence are
continuously growing in the market, these industries are still
younger in comparison to others.
• The absence of skilled resources in the form of manpower is
also an issue.
• Hence, we need manpower having in-depth knowledge of
mathematics, science, and technologies for developing and
managing scientific substances for machine learning.
Customer Segmentation -
• Customer segmentation is also an important issue while
developing a machine learning algorithm.
• The challenge is to identify the customers who act on the recommendations
shown by the model and those who do not even check them.
• Hence, an algorithm is necessary to recognize the customer
behavior and trigger a relevant recommendation for the user
based on past experience.
Process Complexity of Machine Learning
• The machine learning process is very complex, which is also another
major issue faced by machine learning engineers and data scientists.
• Machine Learning and Artificial Intelligence are very new
technologies, still in an experimental phase and continuously
changing over time.
• The majority of the work involves hit-and-trial experiments; hence, the
probability of error is higher than expected.
• Further, it also includes analyzing the data, removing data bias,
training data, applying complex mathematical calculations, etc.,
making the procedure more complicated and quite tedious.
Data Bias
• Data bias is also a big challenge in Machine Learning.
• These errors exist when certain elements of the dataset are heavily
weighted or given more importance than others.
• Biased data leads to inaccurate results, skewed outcomes, and other
analytical errors.
• However, we can resolve this error by determining where data is
actually biased in the dataset.
• Further, take necessary steps to reduce it.
Methods to remove Data Bias:
• Research more for customer segmentation.
• Be aware of your general use cases and potential outliers.
• Combine inputs from multiple sources to ensure data diversity.
• Include bias testing in the development process.
• Analyze data regularly and keep tracking errors to resolve them easily.
• Review the collected and annotated data.
• Use multi-pass annotation such as sentiment analysis, content
moderation, and intent recognition.
Lack of Explainability -
• The outputs cannot be easily comprehended, as the model is programmed in
specific ways to deliver results for certain conditions.
• Hence, a lack of explainability is also found in machine learning
algorithms, which reduces the credibility of the algorithms.
Slow implementations and results -
• This issue is also very commonly seen in machine learning
models.
• Machine learning models are highly efficient in
producing accurate results but are time-consuming.
• Slow programming, excessive requirements, and overloaded
data take more time to provide accurate results than
expected.
• This needs continuous maintenance and monitoring of the
model for delivering accurate results.
Irrelevant features -
• Although machine learning models are intended to
give the best possible outcome, if we feed garbage
data as input, then the result will also be garbage.
• Hence, we should use relevant features in our
training sample.
• A machine learning model is said to be good if the
training data has a good set of features, with few to no
irrelevant features.
DATA SCIENCE VS
MACHINE LEARNING

| Data Science | Machine Learning |
|---|---|
| Data Science is a field about processes and systems to extract data from structured and semi-structured data. | Machine Learning is a field of study that gives computers the capability to learn without being explicitly programmed. |
| Needs the entire analytics universe. | A combination of machines and data science. |
| A branch that deals with data. | Machines utilize data science techniques to learn about the data. |
| Data in Data Science may or may not have evolved from a machine or mechanical process. | It uses various techniques like regression and supervised clustering. |
| Data Science, as a broader term, not only focuses on algorithms and statistics but also takes care of data processing. | It is focused only on algorithms and statistics. |
| It is a broad term for multiple disciplines. | It fits within data science. |
| It involves many operations: data gathering, data cleaning, data manipulation, etc. | It is of three types: Unsupervised Learning, Reinforcement Learning, Supervised Learning. |
| Example: Netflix uses Data Science technology. | Example: Facebook uses Machine Learning technology. |
• For any query, write to
pallavishukla@united.ac.in