BAYES OPTIMAL CLASSIFIER
• Focus so far: "What is the most probable hypothesis given the training data?"
• More significant in practice:
"What is the most probable classification of a new instance given the training
data?"
• At first glance:
It might seem sufficient to apply the MAP (Maximum A Posteriori) hypothesis to
classify new instances.
• However:
There are methods that can outperform simply using the MAP hypothesis for
classification.
Example to Build Intuition:
• Hypothesis space: Contains three hypotheses — h₁, h₂, h₃
• Posterior probabilities:
– h₁: 0.4 → MAP hypothesis
– h₂: 0.3
– h₃: 0.3
• New instance x:
– Classified positive by h₁
– Classified negative by h₂ and h₃
• Overall classification probabilities:
– Positive: ∑_{hi ∈ H} P(+ | hi) P(hi | D) = 0.4
– Negative: ∑_{hi ∈ H} P(− | hi) P(hi | D) = 0.6
Observation:
Although h₁ is the MAP hypothesis and classifies x as positive, the most
probable classification considering all hypotheses is negative
• In general, the most probable classification of the new instance
is obtained by combining the predictions of all hypotheses,
weighted by their posterior probabilities.
• If the possible classification of the new example can take on
any value vj from some set V, then the probability P(vj|D) that
the correct classification for the new instance is vj, is just
P(vj | D) = ∑_{hi ∈ H} P(vj | hi) P(hi | D)
• The optimal classification of the new instance is the value vj,
for which P(vj|D) is maximum.
arg max_{vj ∈ V} ∑_{hi ∈ H} P(vj | hi) P(hi | D)
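The rule above can be sketched in a few lines of Python (an illustrative sketch, not part of the original slides; the function name bayes_optimal_classify is hypothetical). It reproduces the three-hypothesis example: although the MAP hypothesis h₁ votes positive, the posterior-weighted vote selects negative.

def bayes_optimal_classify(posteriors, predictions, values):
    """Return the value v maximizing sum over h of P(v | h) * P(h | D).

    posteriors  : dict mapping hypothesis -> P(h | D)
    predictions : dict mapping hypothesis -> {value: P(v | h)}
    values      : candidate classifications V
    """
    scores = {v: sum(predictions[h][v] * posteriors[h] for h in posteriors)
              for v in values}
    return max(scores, key=scores.get), scores

# Three-hypothesis example: h1 predicts +, h2 and h3 predict -.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {"h1": {"+": 1.0, "-": 0.0},
               "h2": {"+": 0.0, "-": 1.0},
               "h3": {"+": 0.0, "-": 1.0}}
best, scores = bayes_optimal_classify(posteriors, predictions, ["+", "-"])
print(best, scores)   # '-' {'+': 0.4, '-': 0.6}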
• Any system that classifies new instances according to equation
above is called a Bayes optimal classifier, or Bayes optimal
learner.
• No other classification method using the same hypothesis
space and same prior knowledge can outperform this method
on average.
• This method maximizes the probability that the new instance is
classified correctly, given the available data, hypothesis space,
and prior probabilities over the hypotheses.
• The Bayes optimal classifier can make predictions that do not match
any single hypothesis in the hypothesis space H
• When the equation above is used to classify every instance in X, the resulting
labeling may not correspond to any individual hypothesis h ∈ H
Interpretation:
• The Bayes optimal classifier behaves as if it’s using a different
hypothesis space H′
Where H′ includes:
• Hypotheses that combine predictions from multiple hypotheses in
H
• Often through linear combinations or comparisons of these
predictions
The Bayes optimal classifier provides the best result on average, but it can be
expensive when the hypothesis space contains many hypotheses.
Gibbs algorithm:
1. Choose one hypothesis h at random, according to P(h|D)
2. Use this h to classify new instance
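A minimal Python sketch of these two steps (illustrative; the helper name gibbs_classify and the predict callback are hypothetical):

import random

def gibbs_classify(posteriors, predict, x):
    """Gibbs algorithm: sample h ~ P(h | D), then classify x with that h."""
    hypotheses = list(posteriors)
    weights = [posteriors[h] for h in hypotheses]
    h = random.choices(hypotheses, weights=weights, k=1)[0]   # step 1
    return predict(h, x)                                      # step 2

# Reusing the three-hypothesis example: each call returns '+' with
# probability 0.4 and '-' with probability 0.6.
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
labels = {"h1": "+", "h2": "-", "h3": "-"}
print(gibbs_classify(posteriors, lambda h, x: labels[h], x=None))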
Surprising fact: assume target concepts are drawn at random from H
according to priors on H. Then:
E[errorGibbs] ≤ 2E[errorBayesOptimal]
Suppose we have a correct, uniform prior distribution over H. Then:
• Pick any hypothesis from the version space VS, with uniform probability
• Its expected error is no worse than twice that of the Bayes optimal classifier
Classification approach:
• Chooses a hypothesis at random based on the posterior
probability distribution
Key insight:
• Despite its simplicity, it has provable performance guarantees
• Theoretical result (Haussler et al., 1994):
Under certain conditions:
– The expected misclassification error of the Gibbs algorithm
is at most twice that of the Bayes optimal classifier
Conditions:
• Target concepts are assumed to be drawn from the prior probability
distribution
• The expected value is taken over this distribution
Naive Bayes Classifier:
• A practical and widely-used Bayesian learning method
• Also known as the naive Bayes learner
Strengths:
– Performs well in many real-world domains
When to use:
– A moderate or large training set is available
– The attributes that describe instances are conditionally independent given the
classification
• Performance:
– Often comparable to that of neural networks and
decision tree learning methods
Assume a target function f: X→V, where each instance x is described by attributes
(a1, a2, …, an).
The most probable value of f(x) is:
vMAP = arg max_{vj ∈ V} P(vj | a1, a2, ..., an)
     = arg max_{vj ∈ V} P(a1, a2, ..., an | vj) P(vj) / P(a1, a2, ..., an)
     = arg max_{vj ∈ V} P(a1, a2, ..., an | vj) P(vj)
Naïve Bayes assumption:
• The naive Bayes classifier is based on the simplifying assumption that the
attribute values are conditionally independent given the target value. In
other words, the assumption is that given the target value of the instance,
the probability of observing the conjunction a1, a2, ..., an is just the
product of the probabilities for the individual attributes:
P(a1, a2, ..., an | vj) = ∏_i P(ai | vj)
which gives the Naïve Bayes classifier:
vNB = arg max_{vj ∈ V} P(vj) ∏_i P(ai | vj)
Naive_Bayes_Learn(examples)
  For each target value vj
    P̂(vj) ← estimate P(vj)
    For each attribute value ai of each attribute a
      P̂(ai | vj) ← estimate P(ai | vj)

Classify_New_Instance(x)
  vNB = arg max_{vj ∈ V} P̂(vj) ∏_{ai ∈ x} P̂(ai | vj)
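A runnable Python sketch of the two procedures above (the class name NaiveBayes and its method names are illustrative; probabilities are estimated by simple relative frequencies, with no smoothing):

from collections import Counter, defaultdict

class NaiveBayes:
    """Sketch of Naive_Bayes_Learn / Classify_New_Instance."""

    def fit(self, examples, targets):
        n = len(targets)
        class_totals = Counter(targets)
        self.priors = {v: c / n for v, c in class_totals.items()}   # P^(vj)
        counts = defaultdict(lambda: defaultdict(Counter))          # v -> i -> ai -> count
        for x, v in zip(examples, targets):
            for i, a in enumerate(x):
                counts[v][i][a] += 1
        self.likelihoods = {                                        # P^(ai | vj)
            v: {i: {a: c / class_totals[v] for a, c in vals.items()}
                for i, vals in attrs.items()}
            for v, attrs in counts.items()}
        return self

    def predict(self, x):
        def score(v):
            s = self.priors[v]
            for i, a in enumerate(x):
                s *= self.likelihoods[v][i].get(a, 0.0)
            return s
        return max(self.priors, key=score)                          # arg max

Calling NaiveBayes().fit(examples, targets).predict(x) performs exactly the two procedures above on a list of attribute tuples and their target values.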
Consider CoolCar and new instance
(Color=Blue, Type=SUV, Doors=2, Tires=White)
Want to compute
vNB = arg max_{vj ∈ V} P(vj) ∏_i P(ai | vj)
P(+) * P(Blue|+) * P(SUV|+) * P(2|+) * P(White|+) =
5/14 * 1/5 * 2/5 * 4/5 * 3/5 = 0.0137
P(−) * P(Blue|−) * P(SUV|−) * P(2|−) * P(White|−) =
9/14 * 3/9 * 4/9 * 3/9 * 3/9 = 0.0106
Since 0.0137 > 0.0106, the naïve Bayes classification of the new instance is positive.
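A quick numeric check of the two products (plain Python, using the probability estimates shown above):

# Scores proportional to the posterior for (Color=Blue, Type=SUV, Doors=2, Tires=White)
score_pos = 5/14 * 1/5 * 2/5 * 4/5 * 3/5   # ≈ 0.0137
score_neg = 9/14 * 3/9 * 4/9 * 3/9 * 3/9   # ≈ 0.0106
print("+" if score_pos > score_neg else "-")   # prints "+"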
1. Conditional independence assumption is often violated
P(a1, a2, ..., an | vj) = ∏_i P(ai | vj)
• … but it works surprisingly well anyway. Note that you do not need the
estimated posteriors to be correct; you need only that
arg max_{vj ∈ V} P̂(vj) ∏_i P̂(ai | vj) = arg max_{vj ∈ V} P(vj) P(a1, ..., an | vj)
• Naïve Bayes posteriors often unrealistically close to 1 or 0
2. What if none of the training instances with target value vj have
attribute value ai? Then
P̂(ai | vj) = 0, and therefore
P̂(vj) ∏_i P̂(ai | vj) = 0
Typical solution is the Bayesian m-estimate:
P̂(ai | vj) ← (nc + m·p) / (n + m)
where
• n is the number of training examples for which v = vj
• nc is the number of examples for which v = vj and a = ai
• p is the prior estimate for P̂(ai | vj)
• m is the weight given to the prior (i.e., the number of “virtual” examples)
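A minimal Python sketch of this estimate (the function name m_estimate is illustrative; p is often taken to be the uniform prior 1/k for an attribute with k values):

def m_estimate(nc, n, m=1.0, p=0.5):
    """Smoothed estimate of P(ai | vj): (nc + m*p) / (n + m)."""
    return (nc + m * p) / (n + m)

# Hypothetical case: no training example with v = vj has value ai (nc = 0).
# The raw estimate would be 0/5 = 0; the m-estimate stays strictly positive.
print(m_estimate(nc=0, n=5, m=3.0, p=1/3))   # 0.125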
• A probabilistic classifier based on Bayes' Theorem
• Assumes strong (naive) independence between features
• Simple, fast, and effective for high-dimensional data
• All features are conditionally independent given the class label
• Despite the unrealistic assumption, it often performs surprisingly
well
• Fast and scalable – works well with large datasets
• Handles both binary and multiclass classification
• Requires less training data than many other models
• Performs well even with noisy or missing data
1. Training Phase:
1. Learn prior probabilities: P(y)
2. Learn likelihoods: P(xi∣y)
2. Prediction Phase:
1. Use Bayes' Theorem to compute the posterior: P(y | x1, …, xn) ∝ P(y) ∏_i P(xi | y)
2. Choose the class y with the highest posterior probability
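For completeness, the same two phases with an off-the-shelf implementation (a sketch assuming scikit-learn is installed; CategoricalNB is its estimator for integer-encoded discrete attributes):

import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Toy data: rows are instances with integer-encoded categorical attributes.
X = np.array([[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]])
y = np.array([1, 1, 0, 0])

clf = CategoricalNB()                    # training phase: learns P(y) and P(xi | y)
clf.fit(X, y)
print(clf.predict([[0, 1, 1]]))          # prediction phase: arg max posterior
print(clf.predict_proba([[0, 1, 1]]))    # posterior probabilities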