WELL POSED LEARNING PROBLEMS
Well-Posed Learning Problem – A computer program is said to learn from experience E
with respect to some task T and some performance measure P, if its performance on T,
as measured by P, improves with experience E.
A problem can be characterized as a well-posed learning problem if it specifies three traits –
Task
Performance Measure
Experience
An example that illustrates a well-posed learning problem is –
1. To better filter emails as spam or not
Task – Classifying emails as spam or not
Performance Measure – The fraction of emails accurately classified as spam or not
spam
Experience – Observing you label emails as spam or not spam
APPLICATIONS OF MACHINE LEARNING
Image Recognition
Image recognition is one of the reasons behind the boom in the field of deep
learning. The task, which started with classifying cat and dog images, has now
evolved to the level of face recognition and real-world use cases based on it,
such as employee attendance tracking.
Image recognition has also helped revolutionize the healthcare industry by
employing smart systems in disease recognition and diagnosis.
Speech Recognition
Most of us have come across speech-recognition-based smart systems like Alexa and
Siri and used them to communicate. In the backend, these systems are built on
speech recognition: they are designed to convert voice instructions into text.
One more application of speech recognition that we encounter in our day-to-day life
is performing Google searches just by speaking.
Fraud Detection
In today’s world, most things have been digitalized: from buying a toothbrush to
making transactions of millions of dollars, everything is accessible and easy to
use. But with this process of digitization, cases of fraudulent transactions and
fraudulent activities have increased. Identifying them is not easy, but machine
learning systems are very efficient at these tasks.
Thanks to these applications, whenever the system detects red flags in a user’s
activity, a suitable notification is sent to the administrator so that these cases
can be monitored properly for any spam or fraudulent activity.
Self Driving Cars
If we ever saw a car being driven without a driver, we might once have assumed
that a ghost was driving it; thanks to machine learning and deep learning, in
today's world this is possible and no longer a story from a fiction book. Even
though the algorithms and tech stack behind these technologies are highly
advanced, at the core it is machine learning that has made these applications
possible.
The most common example of this use case is Tesla's cars, which are well tested
and proven for autonomous driving.
MODEL SELECTION AND GENERALIZATION
In machine learning, model selection is the process of choosing the best
predictive model for a given problem, while generalization is the ability of a
model to perform well on new data:
Model selection
Involves comparing the performance of multiple models, and choosing the one
that best fits the data. This process can be iterative, involving testing multiple
models and hyperparameters. It's important to consider other factors besides
performance, such as complexity, maintainability, and available resources.
Generalization
A model's ability to perform well on new data that's drawn from the same distribution as
the data used to create the model. A good model should generalize well to new
data. Overfitting happens when a model performs well on training data but
generalizes poorly.
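As a concrete illustration of both ideas (a minimal sketch using scikit-learn and a
synthetic dataset; the candidate models are arbitrary choices, not from the notes),
cross-validation compares candidates, and a large gap between training and test
accuracy signals overfitting:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Model selection: compare candidates by cross-validated accuracy
for model in (LogisticRegression(max_iter=1000), DecisionTreeClassifier()):
    scores = cross_val_score(model, X_train, y_train, cv=5)
    print(type(model).__name__, "CV accuracy:", round(scores.mean(), 3))

# Generalization check: a large train/test gap indicates overfitting
tree = DecisionTreeClassifier().fit(X_train, y_train)
print("train:", tree.score(X_train, y_train), "test:", round(tree.score(X_test, y_test), 3))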
CONCEPT LEARNING
Concept learning in machine learning is a process that teaches a computer program
to recognize a concept or function by analyzing a set of labeled examples:
Explanation
A concept is an idea that's formed by combining all of its attributes or
features. In concept learning, a model is trained to identify a concept or pattern
in a set of examples, and then use that concept to make predictions about new
data.
How it works
The model learns by searching for a hypothesis that best fits the training
examples. This search can be viewed as a process of learning a pattern in the
data and creating a function based on that pattern.
Approaches
Concept learning can be approached in a variety of ways, including rule-based
learning, neural networks, and decision trees. Case-based learning is a
prominent approach that involves building a repository of cases, each with a set
of features and their corresponding outcomes.
Importance
Concept learning is a fundamental part of many automated decision-making and
learning processes.
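The notes don't fix a particular search procedure; as an illustrative sketch, the
classic Find-S algorithm (assuming attribute-value examples with yes/no labels,
made up here) searches from the most specific hypothesis and generalizes it only
as much as the positive examples require:

def find_s(examples):
    # Find-S: start from the most specific hypothesis and generalize it
    # just enough to cover each positive example; "?" matches any value.
    hypothesis = None
    for features, label in examples:
        if label != "yes":              # negative examples are ignored
            continue
        if hypothesis is None:          # first positive example: copy it
            hypothesis = list(features)
        else:                           # generalize attributes that disagree
            hypothesis = [h if h == f else "?"
                          for h, f in zip(hypothesis, features)]
    return hypothesis

data = [(("sunny", "warm", "normal"), "yes"),
        (("sunny", "warm", "high"), "yes"),
        (("rainy", "cold", "high"), "no")]
print(find_s(data))  # ['sunny', 'warm', '?']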
INDUCTIVE LEARNING
Inductive learning is a machine learning technique that uses specific examples to
make generalizations or predictions. It's also known as inductive reasoning or
inductive inference.
Here's some more information about inductive learning:
How it works: Inductive learning algorithms (ILAs) use a labeled dataset to
train a model that can make predictions on new data. The model is trained to
map inputs to outputs based on the labeled examples.
How it's used: Inductive learning is often used in supervised learning, where
the data is labeled with the correct answer for each example.
Why it's used: Inductive learning is widely used because it's flexible and
generalizable.
How it's related to inductive bias: Inductive learning is closely related to the
concept of inductive bias.
HYPOTHESIS
A hypothesis in machine learning is the model’s presumption regarding the
connection between the input features and the result.
It is a representation of the mapping function that the algorithm is attempting to
discover using the training set.
To minimize the discrepancy between the expected and actual outputs, the
learning process involves modifying the weights that parameterize the
hypothesis.
The objective is to optimize the model’s parameters to achieve the best
predictive performance on new, unseen data, and a cost function is used to
assess the hypothesis’ accuracy.
Hypothesis Space (H)
Hypothesis space is the set of all possible legal hypotheses. This is the set from
which the machine learning algorithm determines the single best hypothesis that
describes the target function or the outputs.
Hypothesis (h)
A hypothesis is a function that best describes the target in supervised machine
learning. The hypothesis that an algorithm comes up with depends upon the data
and also upon the restrictions and bias that we have imposed on the data.
For a simple linear model, the hypothesis can be written as:
y = mx + b
Where,
y = predicted output (range)
m = slope of the line
x = input (domain)
b = intercept
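As a small sketch of this hypothesis in code (NumPy with made-up data points, not
part of the original notes), fitting h(x) = mx + b by least squares:

import numpy as np

# Made-up points that roughly follow y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.05, 7.0, 8.95])

# Least-squares fit of the hypothesis h(x) = m*x + b
m, b = np.polyfit(x, y, deg=1)
h = lambda x_new: m * x_new + b
print(round(m, 2), round(b, 2), round(h(5.0), 2))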
INDUCTIVE BIAS
Inductive bias can be defined as the set of assumptions or biases that a
learning algorithm employs to make predictions on unseen data based on its
training data. These assumptions are inherent in the algorithm’s design and
serve as a foundation for learning and generalization.
The inductive bias of an algorithm influences how it selects a hypothesis (a
possible explanation or model) from the hypothesis space (the set of all
possible hypotheses) that best fits the training data.
It helps the algorithm navigate the trade-off between fitting the training data
too closely (overfitting) and failing to capture its structure (underfitting), so
that it generalizes well to unseen data.
DIRECTIONAL DERIVATIVE
Directional Derivative measures how a function changes along a specified
direction at a given point, providing insights into its rate of change in that
direction. Directional Derivative can be defined as:
Dv(f) = ∇f · v
where:
∇f represents the gradient of the function
v is the direction vector along which we want to find the derivative
How to Calculate Directional Derivative
To calculate the directional derivative of a function at a given point in a specific
direction, follow these steps:
Step 1: Find the Gradient
Compute the gradient (∇f) of the function. The gradient is a vector that points in
the direction of the steepest increase of the function.
Step 2: Normalize Direction Vector
Normalize the direction vector (v) to ensure it has a length of 1. This is done by
dividing each component of the vector by its magnitude.
Step 3: Dot Product
Take the dot product of the normalized direction vector and the gradient. The dot
product is obtained by multiplying corresponding components of the two vectors
and then summing them up.
Dv(f) = ∇f⋅v
Step 4: Evaluate at a Point: Plug in the coordinates of the point where you want to
find the directional derivative into the gradient and the normalized direction vector.
Dv(f)(a, b) = ∇f(a, b)⋅v
Example 1: Compute the directional derivative of the function f(x, y) = x² + 3y
at the point P(1, 2) in the direction of the vector v = ⟨1, −1⟩.
Solution:
To compute the directional derivative of the function f(x, y) = x² + 3y at the point
P(1, 2) in the direction of the vector v = ⟨1, −1⟩, we use the following formula:
Dvf = ∇f⋅v
First, let’s find the gradient of f:
∇f = (∂f/∂x, ∂f/∂y)
∂f/∂x = 2x
∂f/∂y = 3
So, ∇f = (2x, 3).
Now, evaluate ∇f at the point P(1, 2):
∇f(1, 2) = (2(1), 3) = (2, 3)
Next, we compute the dot product of ∇f(1, 2) and v:
∇f(1, 2)⋅v = (2, 3)⋅⟨1, −1⟩ = 2⋅1 + 3⋅(−1) = 2 − 3 = −1
Therefore, the directional derivative of f(x, y) = x² + 3y at the point P(1, 2) in the
direction of the vector v = ⟨1, −1⟩ is Dvf = −1.
(Note that v = ⟨1, −1⟩ was not normalized here; if Step 2 is applied first, the unit
vector v/|v| gives Dvf = −1/√2.)
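A quick numerical check of this example (a NumPy sketch; the gradient (2x, 3) is
hard-coded from the derivation above):

import numpy as np

# f(x, y) = x^2 + 3y, so the gradient is (2x, 3)
def grad_f(x, y):
    return np.array([2.0 * x, 3.0])

v = np.array([1.0, -1.0])
g = grad_f(1, 2)                       # gradient at P(1, 2)

print(g @ v)                           # -1.0, as in the example
print(g @ (v / np.linalg.norm(v)))     # -1/sqrt(2) with a unit direction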
MINIMA/ MAXIMA
In machine learning, local minima and global minima are two important concepts
related to the optimization of loss functions.
A loss function is a function that measures the error between a model's predictions
and the ground truth. The goal of machine learning is to find a model that minimizes
the loss function.
A minimum is a point where the loss function is minimized, indicating the point where
the model has the least error.
Local minima
A local minimum is a point where the function's value is the lowest in its
immediate neighborhood.
Global minima
A global minimum is a point where the function's value is the lowest across
the entire domain.
OR
Local Minima
A point x = a is said to be a point of local minimum if the value of the function at
this point is the lowest in its neighborhood.
In simple terms, if we consider a small interval around x = a, the function will obtain
its minimum value in this interval.
Mathematically,
A function f(x) has a local minimum at x = a if there exists an open interval I (such
that I is contained in the domain of f(x)) containing a, and
f(a) <= f(x), for all x in I.
Local Maxima
Similar to local minima, a point x = b is said to be a point of local maximum if the
value of the function at this point is the highest in its neighborhood.
In simple terms, if we consider a small interval around x = b, the function will obtain
its maximum value in this interval.
Mathematically
A function f(x) has a local maximum at x = b if there exists an open
interval I (such that I is contained in the domain of f(x)) containing b, and
f(x) <= f(b), for all x in I.
Global Minima
The point within the entire domain at which the function obtains its lowest value is
known as the global minimum of the function.
There can be only one global minimum value of a function (though it may be
attained at more than one point).
It is the lowest of all the local minimum values.
i.e., x = a, is a point of global minima, if and only if f(a) <= f(x) for
all x in the domain of f(x).
Global Maxima
Similar to global minima, the point within the entire domain at which the function
obtains its highest value is known as the global maxima of the function.
There can be only one global maximum value of a function (though it may be
attained at more than one point).
It is the highest of all the local maximum values.
i.e., x = b is a point of global maxima if and only if f(x) <= f(b) for
all x in the domain of f(x).
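As an illustrative sketch (the function f(x) = x⁴ − 3x² + x and the grid are made
up for demonstration), local minima can be found by comparing each grid point with
its neighbors, while the global minimum is the overall lowest value:

import numpy as np

# f has two local minima; only one of them is the global minimum
f = lambda x: x**4 - 3 * x**2 + x
xs = np.linspace(-2.5, 2.5, 10001)
ys = f(xs)

# A grid point lower than both neighbors approximates a local minimum
idx = np.where((ys[1:-1] < ys[:-2]) & (ys[1:-1] < ys[2:]))[0] + 1
print("local minima near x =", np.round(xs[idx], 2))         # [-1.3, 1.13]
print("global minimum near x =", round(xs[ys.argmin()], 2))  # -1.3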
K-NEAREST NEIGHBORS ALGORITHM
The K-NN algorithm works by finding the K nearest neighbors to a given data
point based on a distance metric, such as Euclidean distance.
The class or value of the data point is then determined by the majority vote or
average of the K neighbors.
This approach allows the algorithm to adapt to different patterns and make
predictions based on the local structure of the data.
DISTANCE METRICS USED IN KNN
ALGORITHM
Euclidean distance
Manhattan distance
Minkowski distance
Supremum distance (Chebyshev distance)
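A minimal sketch of K-NN classification (scikit-learn's KNeighborsClassifier on the
built-in iris dataset; the choice of k = 5 and the Euclidean metric are illustrative
assumptions):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# k=5 neighbors; Euclidean distance (metric="minkowski", p=2) is the default
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))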
LINEAR CLASSIFIER
LOGISTIC REGRESSION
Logistic regression is a supervised machine learning algorithm used for classification
tasks where the goal is to predict the probability that an instance belongs to a given
class.
Logistic regression is used for binary classification, where we use the sigmoid
function, which takes the independent variables as input and produces a probability
value between 0 and 1.
For example, suppose we have two classes, Class 0 and Class 1. If the value of the
logistic function for an input is greater than 0.5 (the threshold value), it belongs
to Class 1; otherwise it belongs to Class 0.
Key Points:
Logistic regression predicts the output of a categorical dependent variable.
Therefore, the outcome must be a categorical or discrete value.
It can be Yes or No, 0 or 1, True or False, etc., but instead of giving the
exact values 0 and 1, it gives probabilistic values which lie between 0
and 1.
In Logistic regression, instead of fitting a regression line, we fit an “S” shaped
logistic function, which predicts two maximum values (0 or 1).
Logistic Function – Sigmoid Function
The sigmoid function is a mathematical function used to map the predicted
values to probabilities.
It maps any real value into another value within the range of 0 and 1. Since
the output of logistic regression must be between 0 and 1 and cannot go
beyond this limit, it forms a curve like the “S” shape.
The S-form curve is called the Sigmoid function or the logistic function.
In logistic regression, we use the concept of a threshold value, which
separates the two classes: values above the threshold tend towards 1, and
values below the threshold tend towards 0.
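A minimal sketch (scikit-learn on a synthetic binary dataset; the dataset is made
up, and the 0.5 threshold follows the description above):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)

# predict_proba returns sigmoid outputs in (0, 1); thresholding at 0.5
# reproduces the Class 0 / Class 1 decision rule described above
probs = clf.predict_proba(X[:5])[:, 1]
print(probs, (probs > 0.5).astype(int))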
DERIVATION OF SIGMOID FUNCTION
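The notes do not reproduce the derivation itself; a standard one, assuming the
model's linear score z = w·x + b equals the log-odds of the positive class, runs
as follows:
z = log(p / (1 − p))
e^z = p / (1 − p)
e^z − p·e^z = p
e^z = p(1 + e^z)
p = e^z / (1 + e^z) = 1 / (1 + e^(−z)) = σ(z)
Solving for p gives the sigmoid function σ(z), which maps any real value of z into
the range 0 to 1, producing the “S”-shaped curve described above.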
SVM
It is used in supervised learning.
SVM can classify linearly separable as well as non-linearly separable data points.
SVM is used for
o Regression analysis
o Outlier analysis
o Pattern analysis
o Classification
SVM is used to classify the data points of an n-dimensional data set,
where n = number of attributes or features of the data set.
It classifies the data points with a linear plane (hyperplane).
If the data set is 2-dimensional, the features are
F1 = X1
F2 = X2
Data points that lie on the margin hyperplanes H1 and H2 are called support vectors.
If we have two separating hyperplanes, the hyperplane with the larger margin (width)
is considered better (more accurate) and we should use that hyperplane.
The support vectors define the hyperplane with the maximum width,
the maximum marginal hyperplane (MMH).
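A minimal sketch of a linear SVM classifier (scikit-learn on a synthetic 2-feature
dataset; the data and parameters are illustrative):

from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters in 2 dimensions (features F1 = X1, F2 = X2)
X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=0)

# A linear kernel finds the maximum marginal hyperplane (MMH)
svm = SVC(kernel="linear").fit(X, y)
print("support vectors per class:", svm.n_support_)
print("training accuracy:", svm.score(X, y))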
QUESTION 1 ON SVM
QUESTION 2 ON SVM
PRINCIPAL COMPONENT ANALYSIS
It is also known as the Karhunen-Loève (K-L) method.
It searches for k d-dimensional orthonormal vectors (principal components)
that can best be used to represent the data,
where k (number of principal components) <= d.
It uses a feature extraction technique.
It is also a dimensionality reduction technique. While reducing dimensions it does
not lose the important features of the data.
Two components
PC1 – the component having maximum variance (maximum spread)
PC2 – the second principal component. This is always orthogonal to PC1
(perpendicular to PC1)
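A minimal sketch (scikit-learn's PCA on the iris data; reducing d = 4 features to
k = 2 components is an illustrative choice):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)   # 4-dimensional data (d = 4)

# Keep k = 2 orthonormal principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print("reduced shape:", X_reduced.shape)
print("variance explained by PC1, PC2:", pca.explained_variance_ratio_)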
MULTI-CLASS CLASSIFIER
DECISION TREE CLASSIFIER
Decision tree induction is the learning of decision trees from class-labelled
training tuples.
A decision tree is a flowchart-like tree structure,
where each internal node denotes a test on an attribute/feature/column,
each branch represents an outcome of the test,
and each leaf node holds a class label.
PSEUDOCODE
Begin with your training dataset, which should have some feature variables
and a classification or regression output.
Determine the “best feature” in the dataset to split the data on.
Split the data into subsets that contain the correct values for this best feature.
This splitting basically defines a node on the tree, i.e., each node is a splitting
point based on a certain feature from our data.
Recursively generate new tree nodes by using the subsets of data created in
the step above.
Mathematically speaking, decision trees use hyperplanes that run parallel to any
one of the axes to cut your coordinate system into hyper-cuboids.
CART – classification and regression trees
The logic of decision trees can also be applied to regression problems, hence
the name CART.
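A minimal sketch (scikit-learn's DecisionTreeClassifier on the iris dataset; the
criterion and depth limit are illustrative choices):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# criterion="entropy" uses the information gain described below for splits
tree = DecisionTreeClassifier(criterion="entropy", max_depth=2).fit(X, y)
print(export_text(tree))   # internal nodes are tests, leaves are class labels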
DECISION TREE ENTROPY
Entropy is nothing but a measure of disorder,
OR
a measure of purity/impurity.
More knowledge means less entropy.
How to calculate entropy?
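For a dataset D whose tuples belong to one of m classes, with pi the proportion of
tuples in class i (the standard formula, which the notes leave implicit):
Entropy(D) = −Σ pi log2(pi), summing over i = 1, …, m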
Observations
1. If all class labels are the same, then the entropy is zero.
2. The more the uncertainty, the more the entropy.
3. Either log2 or loge can be used to calculate entropy.
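A quick sketch of this calculation in Python (the label lists are made-up examples):

from collections import Counter
from math import log2

def entropy(labels):
    # Entropy(D) = -sum(p_i * log2(p_i)) over the class proportions p_i
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

print(entropy(["yes"] * 6))               # all labels the same -> entropy 0
print(entropy(["yes"] * 5 + ["no"] * 5))  # maximum uncertainty -> entropy 1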
NOW, WHICH ATTRIBUTE DO WE TAKE AS THE ROOT NODE?
Information gain is a metric used to train decision trees.
This metric measures the quality of a split.
The information gain is based on the decrease in entropy after the dataset
is split on an attribute.
It is the expected reduction in the information requirement caused by knowing the
value of an attribute A.
We want to partition on the attribute A that would do the “best classification”, so
that the amount of information still required to finish classifying the tuples is
minimal (minimum InfoA(D)).
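In symbols (the standard definitions matching this description, where attribute A
splits D into partitions Dj):
InfoA(D) = Σ (|Dj| / |D|) × Entropy(Dj)
Gain(A) = Entropy(D) − InfoA(D)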
STEPS TO CALCULATE INFORMATION GAIN
STEP 1: Calculate the entropy of the parent node.
STEP 2: Calculate the entropy for each partition of the column (attribute).
STEP 3: Calculate the weighted entropy of the children.
STEP 4: Calculate the information gain.
STEP 5: Calculate the information gain for all the columns.
STEP 6: Repeat, computing information gain recursively for each child node.
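A minimal worked sketch of these steps (toy data made up for illustration; the
entropy helper repeats the one from the sketch above):

from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    # STEP 1: entropy of the parent node
    parent = entropy(labels)
    # STEP 2: partition the labels by the attribute's values
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr_index]].append(label)
    # STEP 3: weighted entropy of the children, Info_A(D)
    info_a = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    # STEP 4: Gain(A) = Entropy(D) - Info_A(D)
    return parent - info_a

rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "hot")]
labels = ["no", "no", "yes", "yes"]
# STEP 5: compute the gain for every column; the best becomes the root node
print([information_gain(rows, i, labels) for i in range(2)])  # [1.0, 0.0]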