Introduction to Bayes' Theorem and Naive
Bayes Classifier in Artificial Intelligence
Presented by:-
           22/IT/062 Buddhadeb Giri
           22/IT/063 Debajyoti Mandal
           22/IT/064 Debjit Misra
           22/IT/065 Deep Roy
           22/IT/066 Deepak Kumar
           22/IT/067 Deepak Kumar Choudhary
           22/IT/068 Deepak Kumar Rai
1. Bayes' Theorem:-
Bayes' Theorem is a fundamental concept in probability
theory and statistics. It describes the probability of an event,
based on prior knowledge of conditions that might be related
to the event. Named after the Reverend Thomas Bayes, it
provides a way to update our beliefs about the world when
new evidence or data is available.
Mathematically, Bayes' Theorem is expressed as:
       P(A∣B) = [P(B∣A) ⋅ P(A)] / P(B)
Where:
  •   P(A∣B) is the posterior probability: the probability of
      event A occurring given the evidence B.
  •   P(B∣A) is the likelihood: the probability of observing
      evidence B given that A is true.
  •   P(A) is the prior probability: the initial probability of
      event A before observing evidence.
  •   P(B) is the marginal likelihood: the total probability of
      observing evidence B across all possible outcomes.
In simple terms, Bayes' Theorem allows us to revise our
predictions or beliefs about the world based on new data.
Example:
Suppose you are trying to diagnose a disease (Event A) based
on a test result (Event B). Bayes' Theorem helps you calculate
the probability of the disease given a positive test result,
using the prior probability of having the disease, the
probability of testing positive if you have the disease, and the
overall likelihood of a positive test.
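To make this concrete, the short Python sketch below plugs assumed, purely illustrative numbers into the theorem: a prior of 1% for the disease, a 95% chance of testing positive if diseased, and a 5% false-positive rate.

    # Bayes' Theorem on the disease-test example, with illustrative numbers.
    p_disease = 0.01             # P(A): prior probability of the disease
    p_pos_given_disease = 0.95   # P(B|A): probability of a positive test if diseased
    p_pos_given_healthy = 0.05   # false-positive rate for healthy people

    # P(B): total probability of a positive test (marginal likelihood)
    p_pos = (p_pos_given_disease * p_disease
             + p_pos_given_healthy * (1 - p_disease))

    # P(A|B) = P(B|A) * P(A) / P(B)
    p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
    print(round(p_disease_given_pos, 3))   # about 0.161

Note how a positive result raises the probability of disease from 1% to roughly 16%, but does not make it certain, because the prior is low.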
2. Naive Bayes Classifier:-
The Naive Bayes Classifier is a classification algorithm based
on Bayes' Theorem. It's called "naive" because it makes the
simplifying assumption that the features used to describe
each instance are conditionally independent of one another,
given the class label. This assumption greatly simplifies the
computation and
makes the classifier efficient, though it doesn't always hold in
real-world data.
Naive Bayes Formula:-
For a given class C and a set of features {x1,x2,...,xn}, the goal
is to predict the class C based on the values of the features.
The Naive Bayes classifier calculates the probability of each
class given the observed features and selects the class with
the highest posterior probability.
The formula is:
       P(C∣x1,x2,...,xn) = [P(x1,x2,...,xn∣C) ⋅ P(C)] / P(x1,x2,...,xn)
By applying the naive independence assumption, the
likelihood term P(x1,x2,...,xn∣C) becomes:
       P(x1,x2,...,xn∣C) = P(x1∣C)⋅P(x2∣C)⋅...⋅P(xn∣C)
Where:
  •   P(C) is the prior probability of the class.
  •   P(xi∣C) is the likelihood of feature xi given the class C.
  •   P(x1,x2,...,xn) is the normalization factor (which can be
      ignored during classification, as it is the same for all
      classes).
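To see this factorization in action, the small sketch below scores two hypothetical classes for an email described by two binary word features; the probability tables are made up for illustration, and the normalization factor is dropped as noted above.

    # Unnormalized posteriors under the naive independence assumption,
    # with made-up priors P(C) and per-feature likelihoods P(xi|C).
    priors = {"spam": 0.4, "ham": 0.6}
    likelihoods = {
        "spam": {"contains_offer": 0.7, "contains_meeting": 0.1},
        "ham":  {"contains_offer": 0.1, "contains_meeting": 0.6},
    }

    observed = ["contains_offer", "contains_meeting"]   # features of a new email

    scores = {}
    for c in priors:
        score = priors[c]                 # start from the prior P(C)
        for x in observed:
            score *= likelihoods[c][x]    # multiply in each P(xi|C)
        scores[c] = score

    print(scores)   # the class with the larger score is predicted ("ham" here)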
Steps in Naive Bayes Classification:-
  1. Training:
        o   Compute the prior probabilities for each class P(C).
        o   Compute the likelihood P(xi∣C) for each feature in
            each class.
  2. Prediction:
        o   For a new instance with features x1,x2,...,xn calculate
            the posterior probability for each class.
        o   Assign the class with the highest posterior
            probability.
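The two steps above can be written out directly for categorical features. The sketch below is a minimal illustration (no smoothing, no log-probabilities), not a production implementation.

    from collections import Counter, defaultdict

    def train(X, y):
        # Step 1 (Training): estimate priors P(C) and likelihoods P(xi|C).
        n = len(y)
        class_counts = Counter(y)
        priors = {c: class_counts[c] / n for c in class_counts}
        # likelihoods[c][i][v] = P(feature i takes value v | class c)
        likelihoods = {c: defaultdict(Counter) for c in class_counts}
        for features, label in zip(X, y):
            for i, value in enumerate(features):
                likelihoods[label][i][value] += 1
        for c in likelihoods:
            for i in likelihoods[c]:
                total = sum(likelihoods[c][i].values())
                for v in likelihoods[c][i]:
                    likelihoods[c][i][v] /= total
        return priors, likelihoods

    def predict(features, priors, likelihoods):
        # Step 2 (Prediction): pick the class with the highest posterior.
        best_class, best_score = None, -1.0
        for c, prior in priors.items():
            score = prior
            for i, value in enumerate(features):
                score *= likelihoods[c][i].get(value, 0.0)
            if score > best_score:
                best_class, best_score = c, score
        return best_class

    # Toy weather data: (outlook, windy) -> play?
    X = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"), ("overcast", "no")]
    y = ["yes", "no", "no", "yes"]
    priors, likelihoods = train(X, y)
    print(predict(("sunny", "no"), priors, likelihoods))   # "yes"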
Types of Naive Bayes Classifiers:-
  1. Gaussian Naive Bayes: Assumes that the features follow
     a normal (Gaussian) distribution.
  2. Multinomial Naive Bayes: Used for discrete data, such
     as text classification where features are counts or
     frequencies of words.
  3. Bernoulli Naive Bayes: Suitable for binary/boolean
     features.
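If scikit-learn is available, all three variants are provided as ready-made classes sharing the same fit/predict interface. A tiny sketch with made-up numeric data, assuming scikit-learn is installed:

    from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

    # Made-up continuous features (e.g., height, weight) with two classes.
    X = [[170, 65], [180, 80], [160, 55], [175, 75]]
    y = [0, 1, 0, 1]

    model = GaussianNB()      # swap in MultinomialNB() for count features
    model.fit(X, y)           # or BernoulliNB() for binary features
    print(model.predict([[172, 68]]))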
3. Why Use Naive Bayes?
  •   Efficiency: Naive Bayes classifiers are computationally
      efficient and need only a small amount of training data.
  •   Interpretability: The model is simple and easy to
      understand.
  •   Good performance: Despite its simplicity, Naive Bayes
      often performs surprisingly well, especially in problems
      like spam filtering and text classification.
  •   Works well with categorical data: It is particularly
      effective when the features are discrete (e.g., words in a
      document, which are often represented as binary or
      count data).
4. Applications of Naive Bayes:-
Naive Bayes classifiers are widely used in various applications,
including:
  •   Spam filtering: Identifying whether an email is spam or
      not based on the words in the email (a short sketch
      follows this list).
  •   Document categorization: Classifying news articles, blog
      posts, or reviews into categories like sports, politics, or
      technology.
  •   Sentiment analysis: Determining the sentiment
      (positive, negative, or neutral) of a given text.
  •   Medical diagnosis: Classifying medical conditions based
      on symptoms or test results.
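As a sketch of the spam-filtering use case, the example below classifies tiny made-up emails with word-count features, again assuming scikit-learn is installed; the corpus and labels are invented purely for illustration.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    emails = [
        "win a free prize now",
        "limited offer click here",
        "meeting agenda for monday",
        "project report attached",
    ]
    labels = ["spam", "spam", "ham", "ham"]

    # Turn each email into word counts, then fit Multinomial Naive Bayes.
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(emails)
    model = MultinomialNB()
    model.fit(X, labels)

    new_email = ["free offer click now"]
    print(model.predict(vectorizer.transform(new_email)))   # likely ['spam']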
5. Limitations of Naive Bayes:-
  •   Independence assumption: The major limitation is the
      assumption that features are independent, which often
      doesn't hold true in real-world data. For instance, in text
      classification, the occurrence of one word (e.g., "cat")
      might be correlated with the occurrence of another
      word (e.g., "kitten").
  •   Poor performance with highly correlated features: If
      the features are highly dependent on each other, Naive
      Bayes might not perform well.
Conclusion:-
Bayes' Theorem provides a powerful framework for reasoning
under uncertainty, and the Naive Bayes classifier is a simple
yet effective algorithm that leverages this framework for
classification tasks. Its assumption of feature independence
makes it both fast and easy to implement, though it may not
always capture complex dependencies between features.
Despite its simplicity, Naive Bayes is still widely used in
practical applications, particularly when dealing with large
datasets or when computational efficiency is important.