Chapter 4
🔍 Quantifying Uncertainty in AI
Quantifying uncertainty refers to how an AI system estimates how confident it is in its predictions or decisions.
Why Is It Important?
Ways to Quantify Uncertainty:
• Monte Carlo Dropout: Keep dropout active at test time to simulate an ensemble of models (see the sketch below).
• Confidence Scores: Output probabilities for classifications (e.g., 90% confidence it's a cat).
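A minimal PyTorch sketch of Monte Carlo Dropout; the layer sizes, dropout rate, and sample count are illustrative assumptions, not part of these notes:

```python
import torch
import torch.nn as nn

# Toy classifier with a dropout layer (shapes are assumptions).
model = nn.Sequential(
    nn.Linear(10, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 3),
)

def mc_dropout_predict(model, x, n_samples=50):
    """Average several stochastic forward passes; their spread is the uncertainty."""
    model.train()  # train() keeps dropout active, simulating an ensemble
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    return probs.mean(dim=0), probs.std(dim=0)

x = torch.randn(1, 10)                      # one dummy input
mean_probs, std_probs = mc_dropout_predict(model, x)
print(mean_probs, std_probs)                # high std => low confidence
```

The standard deviation across the sampled predictions acts as the confidence signal: the larger the spread, the less sure the model is.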
🤖 Learning Systems in AI
Learning systems are AI systems that improve their performance over time by learning from data or experiences
instead of being explicitly programmed.
• It also provides a confidence score (quantifying uncertainty) for each diagnosis.
Summary
• Quantifying uncertainty helps AI systems know how sure they are, improving trust and safety.
• Learning systems allow AI to adapt and get better over time using data.
Both are critical for building intelligent, safe, and reliable AI applications.
Acting under uncertainty in Artificial Intelligence (AI) refers to the ability of an AI system to make decisions or take
actions when it doesn't have complete or perfect knowledge about the environment, future outcomes, or the
consequences of its actions.
🔍 Why Is It Important?
In real-world situations, perfect information is rare. AI systems need to act even when:
• Data is incomplete, noisy, or ambiguous
Despite these challenges, the AI must choose the best possible action to achieve its goal.
💡 Real-Life Examples
1. Self-driving cars
Don’t always know how pedestrians or other drivers will behave. They must predict and act accordingly.
2. Robots in search-and-rescue
Often work in unknown or partially destroyed environments and must make safe decisions without full
information.
1. Probability Theory
• Represents beliefs as probabilities (e.g., “there is a 70% chance the object is a cat”).
2. Bayesian Networks
• Graphical models that capture probabilistic dependencies among variables.
3. Decision Theory
• When the state is only partially known, it becomes a Partially Observable MDP (POMDP).
• AI must balance trying new actions (exploration) and using known good ones (exploitation), especially in
reinforcement learning; a minimal sketch follows.
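A minimal ε-greedy sketch of the exploration/exploitation balance; the three arms and their win rates are invented for illustration:

```python
import random

true_win_rates = [0.3, 0.5, 0.7]   # hidden payout of each arm (assumed)
counts = [0, 0, 0]                 # pulls per arm
values = [0.0, 0.0, 0.0]           # running average reward per arm
epsilon = 0.1                      # 10% of the time: explore

for step in range(1000):
    if random.random() < epsilon:
        arm = random.randrange(3)          # explore a random arm
    else:
        arm = values.index(max(values))    # exploit the best arm so far
    reward = 1.0 if random.random() < true_win_rates[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

print(values)  # the estimates should approach the true win rates
```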
⚙ General Process
1. Sense the environment (may be imperfect or partial).
2. Reason using probabilities to estimate the current state and possible future outcomes.
3. Act by choosing the action with the best expected outcome.
Probability notation is a standardized way to represent events, outcomes, and their likelihood in mathematics and AI.
Here are the most common basic probability notations you need to know:
🧮 1. Probability of an Event
• Notation: P(A)
• Example:
If A is "getting heads when flipping a coin", then
P(A) = 0.5
🔁 2. Complement of an Event
• Formula:
P(A') = 1 - P(A)
• Example:
If P(A) = 0.7, then P(A') = 0.3
🔗 3. Joint Probability
• Notation: P(A ∩ B)
• Example:
Probability of drawing a red card and a king from a deck.
🔀 4. Union of Events
• Notation: P(A ∪ B)
• Formula:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
🔄 5. Conditional Probability
• Notation: P(A | B)
• Meaning: Probability that event A happens given that B has already happened.
• Formula:
P(A | B) = P(A ∩ B) / P(B) (if P(B) ≠ 0)
• Example:
If it’s raining (B), what’s the probability you’ll carry an umbrella (A)?
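As a quick sanity check, the notations above can be verified numerically. This sketch uses the standard 52-card deck counts; the variable names are just illustrative:

```python
from fractions import Fraction

p_red      = Fraction(26, 52)   # P(A): drawing a red card
p_king     = Fraction(4, 52)    # P(B): drawing a king
p_red_king = Fraction(2, 52)    # P(A ∩ B): red king (hearts, diamonds)

p_not_red       = 1 - p_red                         # P(A') = 1 - P(A)
p_red_or_king   = p_red + p_king - p_red_king       # P(A ∪ B)
p_red_given_king = p_red_king / p_king              # P(A | B)

print(p_not_red, p_red_or_king, p_red_given_king)   # 1/2 7/13 1/2
```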
⚖ 6. Bayes’ Theorem
• Formula:
P(A | B) = P(B | A) · P(A) / P(B)
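A worked example of the rule, with invented numbers for a medical test (the prior, sensitivity, and false positive rate are assumptions for illustration):

```python
p_disease            = 0.01   # P(A): prior probability of the disease
p_pos_given_disease  = 0.95   # P(B | A): test sensitivity
p_pos_given_healthy  = 0.05   # false positive rate

# P(B): total probability of a positive test
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# P(A | B) = P(B | A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))   # ≈ 0.161
```

Even after a positive test, the posterior stays low because the disease is rare; this belief update is exactly what Bayes' Theorem formalizes.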
🧠 Summary Table

Notation          Meaning
P(A)              Probability of event A
P(A') or P(¬A)    Probability A does not happen
P(A ∩ B)          Probability A and B both happen
P(A ∪ B)          Probability A or B (or both) happen
P(A | B)          Probability of A given that B has happened
Bayes' Rule, also known as Bayes' Theorem, is a fundamental principle in probability theory that allows us to update our
beliefs based on new evidence. It is widely used in statistics, machine learning, and AI for decision-making under
uncertainty.
4. Autonomous Systems – Helps robots and AI agents make decisions under uncertainty.
5. Financial Risk Analysis – Assesses risks based on prior market trends and new data.
Bayes' Rule is essential for updating probabilities dynamically as new information becomes available.
Representing knowledge in an uncertain domain is a crucial aspect of artificial intelligence, as real-world environments
often involve incomplete, ambiguous, or noisy information. AI systems must use specialized techniques to model and
reason about uncertainty effectively.
1. Probabilistic Reasoning – Uses probability theory to quantify uncertainty and make informed decisions.
2. Bayesian Networks – Graphical models that represent probabilistic dependencies among variables.
3. Hidden Markov Models (HMMs) – Used for sequential data where states are uncertain, such as speech
recognition.
4. Markov Decision Processes (MDPs) – Help AI agents make optimal decisions in uncertain environments.
5. Fuzzy Logic – Allows reasoning with imprecise or vague information by assigning degrees of truth.
6. Dempster-Shafer Theory – A generalization of probability theory that handles uncertainty in evidence-based
reasoning.
7. Belief Networks – Represent uncertain knowledge using probability distributions over multiple variables.
8. Case-Based Reasoning – AI learns from past cases and applies them to new situations with uncertainty.
Applications
• Financial Forecasting – AI models predict market trends despite uncertain economic conditions.
In addition to Bayesian reasoning, AI uses several other methods to reason under uncertainty — especially when
data is incomplete, ambiguous, or noisy.
1. Certainty Factors (CF)
• Idea: Attach a certainty factor (CF) to each rule to reflect confidence.
• Example:
IF symptom = fever
THEN disease = flu [CF = 0.7]
• Limitation: Not based on formal probability theory; lacks consistency in combining uncertainties (see the sketch below).
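A toy sketch of one common (MYCIN-style) way to combine positive certainty factors for the same conclusion; the second rule and its CF are assumptions added for illustration:

```python
def combine_cf(cf1: float, cf2: float) -> float:
    """Combine two positive certainty factors supporting the same conclusion."""
    return cf1 + cf2 * (1 - cf1)

cf_fever = 0.7   # IF fever THEN flu [CF = 0.7]  (from the notes)
cf_cough = 0.5   # IF cough THEN flu [CF = 0.5]  (assumed second rule)
print(combine_cf(cf_fever, cf_cough))   # 0.85: evidence accumulates
```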
2. 🌫 Fuzzy Logic
• Key Idea: Allows partial truth (values between 0 and 1), not just true/false.
• Example:
"Temperature is high" → 0.8 (80% true)
Useful in control systems, like washing machines or air conditioning.
• Each logical rule is assigned a weight to represent how strong the rule is under uncertainty.
• Used for complex reasoning tasks (e.g., medical diagnosis, NLP, vision).
6. 🧬 Possibility Theory
7. Non-Monotonic Reasoning
• Example:
Assume "Birds fly" → conclude "Tweety flies"
But if you learn "Tweety is a penguin" → retract the earlier conclusion.
Vagueness refers to situations where the boundaries of a concept are not clearly defined — unlike uncertainty (which
is about not knowing the truth), vagueness is about the blurry nature of the concept itself.
🤔 Example of Vagueness:
When is a person "tall"? There is no exact height at which "not tall" becomes "tall", so borderline cases always remain.
This kind of imprecision cannot be handled well with classic Boolean logic (true/false). Instead, AI uses fuzzy logic
and similar approaches to represent vagueness.
1. Fuzzy Logic – Allows reasoning with imprecise data by assigning degrees of truth between 0 and 1, rather
than binary true/false values.
4. Dempster-Shafer Theory – Handles uncertainty by assigning belief functions instead of precise probabilities.
5. Vague Predicates – Words like "tall" or "young" lack clear-cut boundaries, leading to borderline cases.
6. Non-Monotonic Reasoning – Allows AI to revise conclusions when new evidence contradicts previous
assumptions.
Applications
• Medical Diagnosis – Handles cases where symptoms don't fit neatly into predefined categories.
Vagueness is distinct from ambiguity, where a term has multiple meanings (e.g., "bank" referring to a financial
institution or a riverbank).
Fuzzy sets and fuzzy logic are mathematical frameworks designed to handle uncertainty and imprecision, making them
useful in AI, control systems, and decision-making.
Fuzzy Sets
A fuzzy set is an extension of classical set theory where elements have degrees of membership rather than a strict
binary classification (true or false). In classical sets, an element either belongs to a set or it doesn’t. In fuzzy sets,
membership is represented by a membership function that assigns values between 0 and 1, indicating the degree to
which an element belongs to the set (see the sketch below).
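A minimal sketch of a membership function for the fuzzy set "tall"; the 160 cm and 190 cm breakpoints are arbitrary assumptions:

```python
def tall_membership(height_cm: float) -> float:
    """Degree (0..1) to which a height belongs to the fuzzy set 'tall'."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / (190 - 160)   # linear ramp in between

for h in (150, 170, 185, 195):
    print(h, round(tall_membership(h), 2))   # 0.0, 0.33, 0.83, 1.0
```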
Fuzzy Logic
Fuzzy logic is a multi-valued logic system that allows reasoning with imprecise or vague information. Unlike classical
Boolean logic, which operates on strict true (1) or false (0) values, fuzzy logic allows intermediate values.
1. Fuzzification – Converts crisp inputs into fuzzy values using membership functions.
• Control Systems – Used in washing machines, air conditioners, and traffic control.
Component           Description
Fuzzifier           Converts crisp input into fuzzy values
Rule Base           Set of fuzzy if-then rules
Inference Engine    Applies logic to fuzzy inputs
Defuzzifier         Converts fuzzy output into a crisp decision
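A hedged end-to-end sketch of these four components for a toy fan-speed controller; the membership ramp and the rule outputs are invented for illustration:

```python
def fuzzify(temp_c: float) -> dict:
    """Fuzzifier: crisp temperature -> degrees of 'cold' and 'hot'."""
    hot = min(max((temp_c - 15) / 15, 0.0), 1.0)   # ramp from 15 to 30 °C
    return {"cold": 1.0 - hot, "hot": hot}

# Rule base: IF temperature is <label> THEN fan speed is <value>
RULES = {"cold": 0.2, "hot": 0.9}

def infer_and_defuzzify(memberships: dict) -> float:
    """Inference engine + defuzzifier: weighted average of rule outputs."""
    num = sum(deg * RULES[label] for label, deg in memberships.items())
    den = sum(memberships.values())
    return num / den

print(infer_and_defuzzify(fuzzify(24.0)))   # crisp fan speed ≈ 0.62
```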
📦 Real-World Applications
Fuzzy Logic
Fuzzy logic is a mathematical framework that deals with imprecise or vague information. Unlike classical Boolean
logic, which operates on strict true (1) or false (0) values, fuzzy logic allows intermediate values between 0 and 1,
representing degrees of truth.
1. Fuzzy Sets – Elements have degrees of membership rather than binary classification.
3. Fuzzy Rules – "If-Then" statements that describe relationships between fuzzy variables.
• Fuzzy logic helps computers reason and make decisions in these gray areas.
• It’s widely used in control systems, pattern recognition, natural language processing, and robotics.
How it Works:
• A Decision Tree is a flowchart-like tree structure used for classification or regression.
• Each internal node tests an attribute (feature).
• Used in machine learning for classification problems like spam detection, medical diagnosis, etc.
How it Works:
• The tree is built by splitting data on attributes that best separate the classes.
• Once the tree is trained, you classify new data by traversing the tree from root to a leaf.
1. Data Preparation
• Feature Selection – Identify relevant attributes that influence the decision-making process.
• Handling Missing Data – Use techniques like imputation or removal to deal with incomplete records.
• Categorical vs. Numerical Data – Convert categorical variables into numerical representations if needed.
2. Splitting Criteria
• Entropy & Information Gain (ID3 Algorithm) – Measures the reduction in uncertainty after a split.
• Gini Index (CART Algorithm) – Evaluates the purity of a node by measuring class distribution.
• Variance Reduction (Regression Trees) – Used for continuous target variables.
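A small sketch computing entropy and the Gini index for a node's labels; the class counts are made up for illustration:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """H = sum over classes of -p * log2(p)."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """G = 1 - sum over classes of p^2."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

mixed = ["spam"] * 4 + ["ham"] * 4
print(entropy(mixed), gini(mixed))                   # 1.0 0.5 (maximally mixed)
print(entropy(["spam"] * 8), gini(["spam"] * 8))     # 0.0 0.0 (pure node)
```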
3. Tree Construction
4. Model Evaluation
• Accuracy Metrics – Use precision, recall, and F1-score for classification trees.
• Overfitting Prevention – Techniques like pruning and limiting depth help avoid excessive complexity.
5. Implementation in Python
Libraries like Scikit-learn provide built-in functions for decision tree implementation; the minimal sketch below shows a practical example.
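A minimal Scikit-learn sketch; the Iris dataset, split sizes, and hyperparameters are illustrative choices rather than prescriptions from these notes:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# max_depth limits tree growth, one of the overfitting controls above
clf = DecisionTreeClassifier(criterion="gini", max_depth=3)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```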
Machine learning involves different types of learning methods based on how data is provided and how models improve
over time. The primary forms of learning include:
1. Supervised Learning – The model learns from labeled data, where each input has a corresponding correct
output.
2. Unsupervised Learning – The model identifies patterns and structures in unlabeled data without predefined
outputs.
3. Semi-Supervised Learning – A mix of supervised and unsupervised learning, where some data is labeled and
some is not.
4. Reinforcement Learning – The model learns by interacting with an environment and receiving rewards or
penalties.
Supervised Learning
Supervised learning is one of the most widely used machine learning techniques. It involves training a model using
labeled data, meaning each input is paired with the correct output.
Key Characteristics:
1. Classification – Predicts discrete labels (e.g., spam detection, image recognition).
2. Regression – Predicts continuous values (e.g., stock price prediction, temperature forecasting).
• Email Spam Detection – Classifies emails as spam or not based on labeled examples.
• Medical Diagnosis – Predicts diseases based on patient symptoms and historical data.
• Speech Recognition – Converts spoken words into text using labeled speech data.
1. Feature Selection – The algorithm selects the best attribute to split the data at each node.
2. Splitting Criteria – Methods like entropy (ID3), Gini index (CART), or variance reduction determine the best
split.
3. Recursive Partitioning – The tree grows by repeatedly splitting data into subsets.
4. Stopping Conditions – Limits like maximum depth or minimum samples per node prevent overfitting.
• Classification Trees – Used for categorical predictions (e.g., spam detection).
• Regression Trees – Used for continuous predictions (e.g., stock price forecasting).
A Decision Tree is a tree-like model used to make decisions or predictions based on input data. It represents a flow of
decisions, breaking down a complex decision into simpler, sequential tests on attributes.
Component                     Description
Root Node                     The top node representing the entire dataset; where decision-making starts.
Internal Nodes                Nodes that test an attribute (feature) of the data. Each node corresponds to a decision rule (e.g., “Is Age > 30?”).
Branches (Edges)              Outcomes of the test leading to the next nodes or leaves. Each branch corresponds to a possible value or range of the attribute.
Leaf Nodes (Terminal Nodes)   Final nodes that assign a class label (classification) or value (regression). No further splits happen here.
How it Works
• Start at the root node and test its attribute.
• Follow the branch corresponding to the attribute value for the given input.
• Repeat at each internal node until a leaf node gives the final prediction.
Expressiveness refers to the ability of decision trees to represent complex decision boundaries and patterns in data.
• It’s about how well a decision tree can model relationships between input features and output classes or values.
Limitations
• Trees that are too deep can overfit (model noise rather than signal).
• May require large trees for very complex functions, impacting interpretability.
• Sometimes struggle with very smooth or highly complex decision boundaries compared to other models (like neural
networks).
Inducing a decision tree means building the tree automatically from a set of training examples (data points with
features and known labels).
• Given a dataset of examples with input features and output labels, create a decision tree that classifies or
predicts correctly.
• The tree should generalize well to unseen data, not just memorize training examples.
• Choose the feature that best separates the data into distinct classes.
• Divide data into subsets according to the chosen attribute’s possible values or split points.
5. Repeat Recursively
◦ If all examples belong to the same class, make a leaf node with that class.
6. Stopping Conditions
◦ No attributes left.
◦ Number of examples is too small.
• To avoid overfitting, prune branches that do not improve accuracy on validation data.
Example Illustration
Suppose a dataset with attributes Weather (Sunny, Rainy) and Temperature (High, Low), and the label Play (Yes/No). A small induction sketch on such a dataset follows.
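A hedged sketch of the recursive induction loop (ID3-style, using the entropy criterion) on a tiny invented version of that dataset; the rows and attribute key names are assumptions:

```python
from collections import Counter
from math import log2

DATA = [  # (features, Play label) — made-up rows for illustration
    ({"Weather": "Sunny", "Temp": "High"}, "No"),
    ({"Weather": "Sunny", "Temp": "Low"},  "Yes"),
    ({"Weather": "Rainy", "Temp": "High"}, "No"),
    ({"Weather": "Rainy", "Temp": "Low"},  "No"),
]

def label_entropy(rows):
    n = len(rows)
    counts = Counter(label for _, label in rows)
    return sum(-(c / n) * log2(c / n) for c in counts.values())

def build_tree(rows, attrs):
    labels = [label for _, label in rows]
    if len(set(labels)) == 1 or not attrs:            # pure node or no attributes left
        return Counter(labels).most_common(1)[0][0]   # leaf: majority class
    def weighted_child_entropy(attr):
        total = 0.0
        for value in set(feats[attr] for feats, _ in rows):
            subset = [row for row in rows if row[0][attr] == value]
            total += len(subset) / len(rows) * label_entropy(subset)
        return total
    best = min(attrs, key=weighted_child_entropy)     # maximum information gain
    remaining = [a for a in attrs if a != best]
    return {best: {value: build_tree(
                [row for row in rows if row[0][best] == value], remaining)
            for value in set(feats[best] for feats, _ in rows)}}

# e.g. {'Weather': {'Sunny': {'Temp': {'High': 'No', 'Low': 'Yes'}}, 'Rainy': 'No'}}
print(build_tree(DATA, ["Weather", "Temp"]))
```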
Summary Table

Step                 Description
Start                Use all training examples at the root
Attribute Selection  Pick the feature that best splits the data
Splitting            Partition data based on the selected attribute
Recursion            Repeat for each subset
Stopping Criteria    Stop when nodes are pure or no further split is possible
Pruning (optional)   Remove branches that cause overfitting