NCVRT - MACHINE LEARNING
UNIT-5
▪ Learning Sets of Rules – Sequential Covering Algorithm –
Learning Rule Sets – First-Order Rules – Sets of First-Order
Rules – Induction as Inverted Deduction – Inverting
Resolution – Analytical Learning – Perfect Domain
Theories – Explanation-Based Learning – FOCL Algorithm
– Reinforcement Learning – Task – Q-Learning –
Temporal Difference Learning
1. LEARNING SETS OF RULES
▪ Learning sets of rules is a machine learning approach where the learned model is represented
as a collection of if-then rules. Each rule typically consists of a precondition (antecedent) and a
conclusion (consequent).
▪ For classification tasks, the consequent usually specifies a class label
▪ A rule can be represented as:
▪ IF <condition_1> AND <condition_2> AND ... AND <condition_n> THEN <conclusion>
▪ Where each condition is a test on an attribute (e.g., "Color = Red", "Size > 10"). The
conclusion is the predicted outcome (e.g., "Class = Positive")
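▪ As a concrete illustration, the following is a minimal Python sketch (not part of the original notes; the rule structure and attribute names are assumptions made purely for illustration) of how such an if-then rule could be represented and matched against an example:

    # Represent a rule as a list of (attribute, test) conditions plus a conclusion.
    def matches(conditions, example):
        """Return True if every condition holds for the example."""
        return all(test(example[attr]) for attr, test in conditions)

    # IF Color = Red AND Size > 10 THEN Class = Positive
    rule = {
        "conditions": [("Color", lambda v: v == "Red"),
                       ("Size",  lambda v: v > 10)],
        "conclusion": ("Class", "Positive"),
    }

    example = {"Color": "Red", "Size": 12}
    if matches(rule["conditions"], example):
        print("Predict:", rule["conclusion"])   # Predict: ('Class', 'Positive')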
ADVANTAGES
▪ Interpretability: Rules are often easy for humans to understand, providing insights into the
model's decision-making process
▪ Modularity: Rules can be added, removed, or modified relatively independently
▪ Flexibility: Rules can represent complex relationships and handle both categorical and
continuous attributes
2. SEQUENTIAL COVERING ALGORITHM
▪ The Sequential Covering Algorithm is a general strategy for learning a set of rules that cover
the positive examples in a classification problem while avoiding the negative examples.
▪ It learns one rule at a time, removes the examples covered by that rule, and then repeats the
process until all positive examples are covered.
▪ Learning a single rule often involves a greedy search. Start with a very general rule (e.g., an
empty antecedent) and iteratively add conditions (literals) to the antecedent that improve the
rule's quality.
▪ Common measures of rule quality used to guide this search include accuracy, precision, coverage, and statistical significance
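▪ The overall loop can be sketched as follows (a hedged Python sketch; learn_one_rule, covers, rule_quality and min_quality are placeholders for whatever greedy rule learner and quality measure a concrete system uses):

    def sequential_covering(positives, negatives, learn_one_rule, covers,
                            rule_quality, min_quality):
        rules = []
        remaining = list(positives)
        while remaining:
            # Greedily grow a single rule that covers some of the remaining positives
            rule = learn_one_rule(remaining, negatives)
            if rule is None or rule_quality(rule, remaining, negatives) < min_quality:
                break   # stop once no sufficiently good rule can be found
            rules.append(rule)
            # Remove the positive examples this rule already covers, then repeat
            remaining = [ex for ex in remaining if not covers(rule, ex)]
        return rules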
3. LEARNING RULE SETS
▪ This is a broader term encompassing various methods for learning a collection of rules. The Sequential
Covering Algorithm is one such method. Other approaches might learn multiple rules concurrently or
use different search strategies for finding rules.
▪ Variations and Considerations:
▪ Rule Representation: The format of the conditions and the conclusion can vary (e.g., using relational
operators, allowing disjunctions in the antecedent)
▪ Search Strategy: Different algorithms employ different search heuristics and strategies for finding
good rules (e.g., top-down greedy search, bottom-up generalization)
▪ Rule Evaluation Metrics: The criteria used to evaluate the quality of individual rules and the overall
rule set can vary depending on the task
▪ Pruning: Techniques to remove overly specific or redundant rules to improve generalization
4. FIRST ORDER RULES
▪ First-order rules extend propositional rules by allowing variables, predicates, and quantifiers (though
quantifiers are less common in typical rule learning systems). This enables the representation of more
complex relationships between objects.
▪ Representation:
▪ A first-order rule might look like
▪ IF Person(x) AND Parent(x, y) AND Male(x) THEN Father(x, y)
▪ Here, Person, Parent, Male and Father are predicates, and x and y are variables.
▪ Advantages of First Order Rules:
▪ 1. Increased Expressiveness: Can represent relationships between multiple objects and handle more
complex domains.
▪ 2. Generalization over Objects: Rules with variables can apply to multiple individuals
5. SETS OF FIRST ORDER RULES
▪ Learning a set of first-order rules involves finding a collection of such rules that collectively
describe a target concept or relationship. Algorithms for learning sets of first-order rules often
adapt the principles of sequential covering or other rule learning strategies to the first-order
setting.
▪ Examples of First Order Rule Learning Systems:
▪ FOIL (First Order Inductive Learner): A classic algorithm that learns first-order Horn
clauses (clauses with at most one positive literal, which serves as the head of the rule) using a
greedy, general-to-specific search guided by an information-gain measure.
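▪ A small sketch of the standard FOIL gain formula used to score a candidate literal (variable names are illustrative): p0 and n0 are the positive and negative bindings covered by the current rule, p1 and n1 the bindings covered after adding the literal, and t the positive bindings that remain covered:

    from math import log2

    def foil_gain(p0, n0, p1, n1, t):
        # Gain = t * (log2(p1/(p1+n1)) - log2(p0/(p0+n0)))
        return t * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))

    # Example: a literal that keeps 6 of 8 positive bindings while removing
    # most negative ones yields a clearly positive gain.
    print(foil_gain(p0=8, n0=8, p1=6, n1=1, t=6))   # ~4.67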
6. INDUCTION AS INVERTED DEDUCTION
▪ Induction as inverted deduction is a framework for inductive learning that aims to
invert the process of logical deduction.
▪ Given a set of examples and background knowledge (in the form of logical rules), the goal is to
learn new rules that, when combined with the background knowledge, can explain the positive
examples and not the negative ones.
▪ Instead of starting with general hypotheses and specializing them based on data (as in many
inductive methods), this approach starts from the examples and tries to generalize them by inverting the
steps of logical deduction that could have led to these examples from a set of rules (including the
background knowledge).
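▪ Formally, the learner searches for a hypothesis h such that, for every positive training example ⟨xi, f(xi)⟩, the background knowledge B, the hypothesis h, and the example description xi together entail the example's classification: (B ∧ h ∧ xi) ⊢ f(xi).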
PROCESS
▪ 1. Proof Tracing: For a positive example, try to find a proof of it using the background
knowledge.
▪ 2. Inverting Deduction Steps: Invert the inference rules used in the proof to obtain candidate
inductive hypotheses. This involves determining what premises would have been needed to
derive the conclusion (the example) using the known inference rules.
▪ 3. Generalization: Generalize the derived hypotheses by introducing variables and finding
common patterns across multiple examples.
▪ 4. Hypothesis Evaluation: Evaluate the generalized hypotheses based on their ability to
explain other positive examples and not explain negative examples.
▪ Relation to Explanation-Based Learning (EBL): Induction as inverted deduction can be seen as a
generalization of EBL, in which the explanation of a single example is used to guide the learning of
more general rules.
7. INVERTING RESOLUTION
▪ Inverting Resolution is a specific technique within the inverted-deduction framework that focuses on
inverting the resolution inference rule used in automated theorem proving.
▪ Resolution is a powerful rule of inference that derives new clauses from existing ones.
▪ Inverting resolution aims to find the premises (parent clauses) that could have led to a given
conclusion (e.g., a positive example) through resolution
▪ Use in Inductive Learning:
▪ By inverting resolution on positive examples (treated as clauses), and using background
knowledge (also as clauses), the learner can generate candidate hypotheses (new clauses or
rules) that, when used with the background knowledge, can entail the positive examples
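▪ In the propositional case, resolution derives the resolvent C = (C1 − {L}) ∪ (C2 − {¬L}) from parent clauses C1 and C2 that contain the complementary literals L and ¬L. The inverse step recovers a missing parent from the resolvent and one known parent, e.g. C2 = (C − (C1 − {L})) ∪ {¬L}. For example, given the resolvent PassExam ∨ ¬KnowMaterial and the parent clause PassExam ∨ ¬Study, inverting the resolution step on L = ¬Study yields the candidate hypothesis Study ∨ ¬KnowMaterial, i.e., KnowMaterial → Study.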
8. ANALYTICAL LEARNING
▪ Analytical learning methods aim to learn new knowledge or improve problem-solving efficiency by
analyzing existing knowledge and a single training example. Unlike empirical learning, which relies
on statistical patterns from many examples, analytical learning uses reasoning and deduction.
▪ Given a domain theory (a set of rules or axioms describing the domain) and a training example,
analytical learning constructs an explanation of how the example is entailed by the domain theory.
▪ This explanation is then generalized to form a new rule or concept that can be applied to future similar
situations.
▪ Contrast with Empirical Learning:
▪ Empirical Learning: Learns from statistical patterns in many examples, can handle incomplete or
noisy theories, but learned knowledge might be less interpretable
▪ Analytical Learning: Learns from a single example by analyzing a complete and correct domain
theory, resulting in logically sound and interpretable knowledge, but relies on the availability of such a
theory
9. PERFECT DOMAIN THEORIES
▪ A perfect domain theory is a set of rules or axioms that completely and correctly describes a
domain of interest. It should be able to explain all relevant phenomena in the domain without
contradictions or omissions.
▪ Importance for Analytical Learning:
▪ Analytical learning methods, like Explanation Based Learning (EBL), heavily rely on the
existence of a perfect domain theory. If the domain theory is flawed or incomplete, the
explanations derived from it might be incorrect or lead to inaccurate generalizations.
▪ Challenge:
▪ Creating a perfect domain theory for complex real-world domains is often very difficult or
impossible. This is a major limitation of purely analytical learning approaches
10. EXPLANATION-BASED LEARNING
▪ Explanation Based Learning (EBL) is a type of analytical learning where a learner uses a domain
theory to construct an explanation of why a given training example is an instance of a target concept.
This explanation is then generalized to create a new rule that can be used to recognize other instances
of the concept more efficiently.
▪ Example:
▪ Domain Theory: Flies(x) ∧ HasWings(x) → CanFly(x)
▪ Bird(x) → HasWings(x)
▪ Training Example: Tweety is a bird and Tweety flies. Target Concept: CanFly(Tweety)
▪ Explanation: From Bird(Tweety) the theory derives HasWings(Tweety); together with
Flies(Tweety), this derives CanFly(Tweety).
▪ Generalization: Replace the constant Tweety with a variable z throughout the explanation,
keeping exactly the conditions the proof depended on: Bird(z) and Flies(z).
▪ New Rule: Bird(z) ∧ Flies(z) → CanFly(z)
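▪ The generalization step can be sketched in Python (a toy sketch; the tuple representation of literals is an assumption made for illustration): the constant Tweety is replaced by a variable throughout the leaves and the conclusion of the explanation:

    # Leaves of the explanation for the single training instance, and its conclusion
    explanation_leaves = [("Bird", "Tweety"), ("Flies", "Tweety")]
    conclusion = ("CanFly", "Tweety")

    def generalize(leaves, conclusion, constant, variable="z"):
        def sub(atom):
            return tuple(variable if term == constant else term for term in atom)
        return [sub(a) for a in leaves], sub(conclusion)

    body, head = generalize(explanation_leaves, conclusion, "Tweety")
    print(body, "->", head)   # [('Bird', 'z'), ('Flies', 'z')] -> ('CanFly', 'z')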
11. FOCL ALGORITHM
▪ FOCL is an algorithm that combines aspects of empirical inductive learning (like FOIL) with
explanation-based learning. It aims to overcome some of the limitations of purely inductive or
purely analytical approaches.
▪ Uses a domain theory: Similar to EBL, FOCL can utilize background knowledge in the form
of first-order rules.
▪ Learns from examples: Like FOIL, it uses a set of positive and negative examples to guide
learning
▪ Combines deduction and induction: FOCL can use the domain theory to explain positive
examples (as in EBL) and then generalize these explanations
▪ Handles imperfect domain theories: By incorporating empirical evidence, FOCL is more
robust to inaccuracies or incompleteness in the domain theory compared to pure EBL
12. REINFORCEMENT LEARNING
▪ Reinforcement Learning (RL) is a subfield of machine learning where an agent learns to make decisions in an environment to maximize
a cumulative reward signal. It's inspired by how humans and animals learn through trial and error. The agent takes actions, receives
feedback in the form of rewards or penalties, and learns to adjust its behavior over time to achieve its goal
▪ Key Concepts:
▪ Agent: The learner and decision-maker.
▪ Environment: The world with which the agent interacts
▪ State: A representation of the current situation of the agent in the environment
▪ Action: A choice made by the agent that affects the environment and the agent's state
▪ Reward: A scalar signal received by the agent from the environment after taking an action, indicating how good or bad the action was
▪ Policy: A strategy that dictates which action the agent should take in each state
▪ Value Function: An estimate of the expected future reward the agent can accumulate by starting from a particular state and following a
policy
▪ Q-function (Action-Value Function): An estimate of the expected future reward of taking a specific action in a specific state and then
following a policy
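▪ In the usual formulation, with discount factor γ ∈ [0, 1], the return from time t is Gt = r(t+1) + γ·r(t+2) + γ²·r(t+3) + ..., the state-value function under policy π is Vπ(s) = E[Gt | st = s], and the action-value function is Qπ(s, a) = E[Gt | st = s, at = a]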
13. TASK
▪ A reinforcement learning task defines the problem the agent needs to solve
▪ It Involves:
▪ Defining the Environment: Specifying the rules and dynamics of the world.
▪ Defining the State Space: Describing all possible situations the agent can be in
▪ Defining the Action Space: Specifying all possible actions the agent can take
▪ Defining the Reward Function: Determining the feedback the agent receives for its actions
and the goals it should strive for
▪ Defining the Goal: Usually to maximize the cumulative reward over time
▪ Episode Structure (if applicable): Whether the task has distinct episodes (start and end) or is
continuous
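▪ The sketch below (a toy example in Python; every specific detail is an assumption for illustration) pins these elements down for a tiny one-dimensional corridor task:

    N_STATES = 5                      # state space: positions 0..4, goal at position 4
    ACTIONS = ["left", "right"]       # action space

    def step(state, action):
        """Environment dynamics and reward function for one interaction step."""
        next_state = min(state + 1, N_STATES - 1) if action == "right" else max(state - 1, 0)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0   # reward only at the goal
        done = next_state == N_STATES - 1                     # episode ends at the goal
        return next_state, reward, done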
14. Q-LEARNING
▪ Q-Learning is a model-free, off-policy reinforcement learning algorithm that aims to find the
optimal action-value function. The Q-function, Q(s,a), estimates the expected future reward of
taking action a in state s and following the optimal policy thereafter
▪ Key Aspects
▪ 1. Q-Table: a table (or function approximator) holding the current estimate of Q(s, a) for every state-action pair
▪ 2. Update Rule: after taking action a in state s, receiving reward r and reaching state s', the estimate is updated as Q(s, a) ← Q(s, a) + α [ r + γ max_a' Q(s', a') − Q(s, a) ], where α is the learning rate and γ the discount factor
▪ 3. Off-Policy: because the update uses the maximizing action in s', Q-Learning learns about the optimal policy even while the agent follows a different (e.g., exploratory) behaviour policy
▪ 4. Exploration-Exploitation: the agent must balance trying new actions (exploration) against choosing the action with the highest current Q-value (exploitation), commonly with an ε-greedy strategy
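▪ A minimal tabular Q-Learning sketch (hyperparameter values are illustrative) using the toy corridor environment sketched in the previous section (step, N_STATES and ACTIONS are the assumed helpers defined there):

    import random

    alpha, gamma, epsilon = 0.1, 0.9, 0.1    # learning rate, discount factor, exploration rate
    Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]    # Q-table

    def choose_action(state):
        # Epsilon-greedy exploration with random tie-breaking
        if random.random() < epsilon:
            return random.randrange(len(ACTIONS))
        best = max(Q[state])
        return random.choice([i for i, q in enumerate(Q[state]) if q == best])

    for episode in range(500):
        state = 0
        for _ in range(200):              # cap episode length for safety
            a = choose_action(state)
            next_state, reward, done = step(state, ACTIONS[a])
            # Off-policy update toward the greedy target r + gamma * max_a' Q(s', a')
            Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
            state = next_state
            if done:
                break

    print(Q)   # the learned values come to favour "right" in every non-goal state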
15. TEMPORAL DIFFERENCE LEARNING
▪ Temporal Difference (TD) Learning is a class of model-free reinforcement learning algorithms
that learn from incomplete episodes by bootstrapping from the current estimate of the value
function. TD methods update their value function estimates based on the difference (the TD
error) between the current estimate and a temporally later estimate.
▪ Key Aspects:
▪ 1. Learning from Incomplete Episodes: value estimates are updated after every step, without waiting for the end of an episode (in contrast to Monte Carlo methods)
▪ 2. Bootstrapping: each update is based partly on the current estimate of the next state's value rather than on the full observed return
▪ 3. TD Error: the update is driven by the error δ = r + γ V(s') − V(s), the difference between the bootstrapped target and the current estimate
▪ TD learning is a fundamental concept in reinforcement learning and forms the basis for many
popular algorithms. It offers efficiency and the ability to learn online, making it suitable for a
wide range of problems.
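▪ A minimal TD(0) sketch for state-value prediction under a fixed random policy, again reusing the toy corridor environment assumed above (step, N_STATES, ACTIONS); hyperparameter values are illustrative:

    import random

    alpha, gamma = 0.1, 0.9
    V = [0.0] * N_STATES                  # value estimate for each state

    for episode in range(500):
        state = 0
        done = False
        while not done:
            action = random.choice(ACTIONS)                          # fixed (random) policy
            next_state, reward, done = step(state, action)
            td_error = reward + gamma * V[next_state] - V[state]     # the TD error
            V[state] += alpha * td_error                             # bootstrapped update
            state = next_state

    print(V)   # estimates are larger for states closer to the goal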