Naive Bayes
NAIVE BAYES MODEL
● It is a classification technique based on Bayes’ Theorem with an
independence assumption among predictors. In simple terms, a Naive
Bayes classifier assumes that the presence of a particular feature in a
class is unrelated to the presence of any other feature.
● For example, a fruit may be considered to be an apple if it is red,
round, and about 3 inches in diameter. Even if these features depend
on each other or on the existence of the other features, all of these
properties independently contribute to the probability that this fruit is
an apple, and that is why it is known as ‘Naive’.
● Naive Bayes can be used for many tasks, such as face recognition,
weather prediction, medical diagnosis, news classification,
sentiment analysis, and more.
EXAMPLE

S.No.   Outlook    Play
1       Rainy      Yes
2       Sunny      Yes
3       Overcast   Yes
4       Overcast   Yes
5       Sunny      No
6       Rainy      Yes
7       Sunny      Yes
8       Overcast   Yes
9       Rainy      No
10      Sunny      No
11      Sunny      Yes
12      Rainy      No
13      Overcast   Yes
14      Overcast   Yes
Problem: If the weather is Sunny, should the player play or not?
Applying Bayes' theorem:
P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
P(Sunny|Yes) = 3/10 = 0.30
P(Sunny) = 5/14 ≈ 0.36
P(Yes) = 10/14 ≈ 0.71
So
P(Yes|Sunny) = (3/10 * 10/14) / (5/14) = 3/5 = 0.60
P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny)
P(Sunny|No) = 2/4 = 0.50
P(No) = 4/14 ≈ 0.29
P(Sunny) = 5/14 ≈ 0.36
So P(No|Sunny) = (2/4 * 4/14) / (5/14) = 2/5 = 0.40
From the above calculation, P(Yes|Sunny) > P(No|Sunny), and the two
posteriors sum to 1, as they should.
Hence, on a Sunny day, the player can play the game.
Frequency table for the weather conditions:

Weather    Yes   No
Overcast   5     0
Rainy      2     2
Sunny      3     2
Total      10    4

Likelihood table for the weather conditions:

Weather    No            Yes            P(Weather)
Overcast   0             5              5/14 ≈ 0.36
Rainy      2             2              4/14 ≈ 0.29
Sunny      2             3              5/14 ≈ 0.36
All        4/14 ≈ 0.29   10/14 ≈ 0.71
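To make the calculation concrete, here is a minimal Python sketch that reproduces the two posteriors directly from the 14 observations above. The function name posterior is illustrative, not a library API.

```python
from collections import Counter

# The 14 (Outlook, Play) observations from the example above.
data = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Overcast", "Yes"), ("Sunny", "No"), ("Rainy", "Yes"),
    ("Sunny", "Yes"), ("Overcast", "Yes"), ("Rainy", "No"),
    ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]

n = len(data)
play_counts = Counter(play for _, play in data)           # {'Yes': 10, 'No': 4}
outlook_counts = Counter(outlook for outlook, _ in data)  # {'Sunny': 5, ...}
joint_counts = Counter(data)                              # (Outlook, Play) pairs

def posterior(play, outlook):
    """P(play | outlook) via Bayes' theorem."""
    prior = play_counts[play] / n                                   # P(play)
    likelihood = joint_counts[(outlook, play)] / play_counts[play]  # P(outlook | play)
    evidence = outlook_counts[outlook] / n                          # P(outlook)
    return likelihood * prior / evidence

print(posterior("Yes", "Sunny"))  # 0.6
print(posterior("No", "Sunny"))   # 0.4
```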
Advantages
● It is easy and fast to predict the class of a test data set, and it
also performs well in multi-class prediction.
● When the assumption of independence holds, the classifier
performs better than other machine learning models such as
logistic regression or decision trees, and requires less training
data.
● It performs well with categorical input variables compared to
numerical variable(s). For numerical variables, a normal
distribution is assumed (a bell curve, which is a strong
assumption); see the sketch after this list.
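The "bell curve" assumption for numerical variables corresponds to Gaussian Naive Bayes. Below is a minimal sketch using scikit-learn's GaussianNB; the toy temperature/humidity numbers are invented for illustration.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy numeric features (invented): [temperature_C, humidity_%].
X = np.array([[30, 85], [27, 90], [25, 80], [21, 70], [19, 65], [18, 60]])
y = np.array(["No", "No", "No", "Yes", "Yes", "Yes"])

# GaussianNB fits one normal distribution per feature per class --
# the "bell curve" assumption mentioned above.
clf = GaussianNB().fit(X, y)
print(clf.predict([[20, 68]]))        # likely 'Yes'
print(clf.predict_proba([[20, 68]]))  # class probabilities
```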
Drawbacks
● If a categorical variable has a category in the test data set that
was not observed in the training data set, the model will assign it
zero probability and will be unable to make a prediction. This is
often known as the "Zero Frequency" problem. To solve it, we can
use a smoothing technique; one of the simplest is Laplace
(additive) smoothing, sketched after this list.
● Naive Bayes is also known to be a bad estimator, so the
probability outputs from predict_proba should not be taken too
seriously.
● Another limitation of this algorithm is the assumption of
independent predictors. In real life, it is almost impossible to get
a set of predictors that are completely independent.
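Here is a minimal sketch of Laplace (add-one) smoothing applied to the likelihoods from the weather example; the function name smoothed_likelihood and its parameters are illustrative. In scikit-learn, the same idea is exposed as the alpha parameter of MultinomialNB and CategoricalNB.

```python
def smoothed_likelihood(count, class_total, n_categories, alpha=1.0):
    """Laplace (additive) smoothing for P(feature_value | class).

    count:        times this feature value co-occurs with the class
    class_total:  number of training examples in the class
    n_categories: number of distinct values the feature can take
    alpha:        smoothing strength (alpha=1 is add-one / Laplace)
    """
    return (count + alpha) / (class_total + alpha * n_categories)

# Without smoothing, P(Overcast | No) = 0/4 = 0, which zeroes out the
# posterior for "No" whenever the outlook is Overcast.
print(smoothed_likelihood(0, 4, 3))  # 1/7 ~= 0.14 instead of 0.0
print(smoothed_likelihood(2, 4, 3))  # 3/7 ~= 0.43 instead of 0.5
```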
Applications
● Real-time Prediction: The Naive Bayes classifier is an eager learner and
is very fast, so it can be used for making predictions in real time.
● Multi-class Prediction: The algorithm is also well known for multi-class
prediction; it can predict the probability of each class of the target
variable.
● Text Classification / Spam Filtering / Sentiment Analysis: Naive Bayes
classifiers are widely used in text classification (due to good results on
multi-class problems and the independence assumption) and often achieve a
higher success rate than other algorithms. As a result, they are widely used
in spam filtering (identifying spam e-mail) and sentiment analysis (in social
media analysis, to identify positive and negative customer sentiment); a
small pipeline sketch follows this list.
● Recommendation System: A Naive Bayes classifier and collaborative
filtering together build a recommendation system that uses machine
learning and data mining techniques to filter unseen information and predict
whether a user would like a given resource or not.
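As an illustration of the spam-filtering use case, here is a minimal scikit-learn pipeline; the toy messages and labels are invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data (invented for illustration).
messages = [
    "win a free prize now", "limited offer click here",
    "free entry to win cash", "meeting rescheduled to monday",
    "lunch tomorrow?", "project status update attached",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Bag-of-words counts feed a multinomial Naive Bayes model;
# alpha=1.0 applies the Laplace smoothing discussed above.
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(messages, labels)

print(model.predict(["click here to win a free prize"]))   # likely 'spam'
print(model.predict(["monday meeting about the project"])) # likely 'ham'
```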