PROGRAM NAME: III Year – ECE – (A & B)
COURSE NAME: MACHINE LEARNING
COURSE INSTRUCTOR: Ms. S. JayaMangala, M.Tech., (Ph.D.)
                   Assistant Professor, Dept. of ECE
                   Santhiram Engineering College
                   jaya.ece@srecnandyal.edu.in
Syllabus
UNIT-I:
Introduction: Definition of learning systems, Goals and applications of
machine learning, Aspects of developing a learning system: training
data, concept representation, function approximation. Inductive
Classification: The concept learning task, Concept learning as search
through a hypothesis space, General-to-specific ordering of
hypotheses, Finding maximally specific hypotheses, Version spaces and
the candidate elimination algorithm, Learning conjunctive concepts,
The importance of inductive bias.
ML UNIT-1             Ms.S.JayaMangala
                                Machine Learning
• Learning ↔ Intelligence
• Def. (Learning): The acquisition of knowledge or skills through study, experience, or being taught.
• Def. (Intelligence): The ability to learn and use concepts to solve problems.
• Machine Learning ↔ Artificial Intelligence
• Def. (AI): The science of making machines do things that would require intelligence if done by humans (Minsky, 1986).
• Def. (Machine Learning): An area of AI concerned with the development of techniques that allow machines to learn.
                            Why Machine Learning?
Why Machine Learning? ↔ Why Artificial Intelligence? ≡ To build machines
exhibiting intelligent behavior (i.e., able to reason, predict, and adapt) while
helping humans work, study, and entertain themselves.
• Recent progress in algorithms and theory.
• Growing flood of online data.
• More computational power is available than ever before.
• A budding industry.
Designing a Learning System
1. Choosing the training experience – examples of best moves, game outcomes.
2. Choosing the target function – board → move, or board → value.
3. Choosing a representation for the target function – e.g., a linear function with weights (the hypothesis space).
4. Choosing a learning algorithm for approximating the target function – a method for parameter estimation.
Choosing the Training Experience
• One key attribute is whether the training experience provides direct or indirect feedback regarding the choices made by the performance system.
• A second important attribute is the degree to which the learner controls the sequence of training examples.
• A third important attribute is how well the training experience represents the distribution of examples over which the final system performance P must be measured.
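Steps 3 and 4 of the design above (a linear board-value function with weights, fit by parameter estimation) can be sketched in Python. This is a minimal illustration, not the full checkers learner: the feature values, learning rate, and training value below are hypothetical placeholders.

```python
def v_hat(weights, features):
    """Linear approximation of board value: w0 + w1*x1 + ... + wn*xn."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, lr=0.1):
    """LMS-style update: nudge each weight toward the training value of the board."""
    error = v_train - v_hat(weights, features)
    new = [weights[0] + lr * error]  # bias weight uses an implicit feature x0 = 1
    new += [w + lr * error * x for w, x in zip(weights[1:], features)]
    return new

# Hypothetical example: two board features, target value 5.0
w = [0.0, 0.0, 0.0]
for _ in range(100):
    w = lms_update(w, [2.0, 1.0], v_train=5.0)
print(round(v_hat(w, [2.0, 1.0]), 2))  # converges toward 5.0
```

Repeated updates shrink the prediction error on the training board, which is exactly the "method for parameter estimation" role the algorithm in step 4 plays.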
A Formal Definition of Concept Learning:
Inferring a boolean-valued function from training examples of its input and output.
• An example of concept learning is learning the concept "bird" from given examples of birds (positive examples) and non-birds (negative examples).
• We are trying to learn the definition of a concept from given examples.
• A set of example days, each described by six attributes.
• The task is to learn to predict the value of EnjoySport for an arbitrary day, based on the values of its attributes.
EnjoySport – Hypothesis Representation
• Each hypothesis consists of a conjunction of constraints on the instance attributes.
• Each hypothesis will be a vector of six constraints, specifying the values of the six attributes (Sky, AirTemp, Humidity, Wind, Water, and Forecast).
• Each attribute constraint will be one of:
  ? – indicating any value is acceptable for the attribute (don't care)
  a single value – specifying a single required value (e.g., Warm) (specific)
  0 – indicating no value is acceptable for the attribute (no value)
Hypothesis Representation
• A hypothesis (over Sky, AirTemp, Humidity, Wind, Water, Forecast):
  <Sunny, ?, ?, Strong, ?, Same>
• The most general hypothesis – that every day is a positive example:
  <?, ?, ?, ?, ?, ?>
• The most specific hypothesis – that no day is a positive example:
  <0, 0, 0, 0, 0, 0>
• The EnjoySport concept learning task requires learning the set of days for which EnjoySport = Yes, describing this set by a conjunction of constraints over the instance attributes.
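The representation above can be sketched directly in Python: a hypothesis is a 6-tuple of constraints, and an instance satisfies it when every constraint is either "?" or matches the attribute value (a "0" constraint matches nothing). This is a minimal sketch of the slide's representation, not code from the course.

```python
def satisfies(hypothesis, instance):
    """True if the instance is classified positive by the hypothesis.

    Constraints: '?' accepts any value, a specific value must match exactly,
    and '0' never matches any attribute value (so it rejects every instance).
    """
    return all(c == '?' or c == v for c, v in zip(hypothesis, instance))

h = ('Sunny', '?', '?', 'Strong', '?', 'Same')
day = ('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same')
print(satisfies(h, day))                  # -> True
print(satisfies(('0',) * 6, day))         # -> False: the most specific hypothesis
```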
                      EnjoySport Concept Learning Task
Given
– Instances X: the set of all possible days, each described by the attributes
  • Sky (values: Sunny, Cloudy, Rainy)
  • AirTemp (values: Warm, Cold)
  • Humidity (values: Normal, High)
  • Wind (values: Strong, Weak)
  • Water (values: Warm, Cold)
  • Forecast (values: Same, Change)
– Target Concept (Function) c: EnjoySport: X → {0, 1}
– Hypotheses H: each hypothesis is described by a conjunction of constraints on the attributes.
– Training Examples D: positive and negative examples of the target function.
Determine
– A hypothesis h in H such that h(x) = c(x) for all x in D.
The Inductive Learning Hypothesis
• Although the learning task is to determine a hypothesis h identical to the target concept c over the entire set of instances X, the only information available about c is its value over the training examples.
– Inductive learning algorithms can at best guarantee that the output hypothesis fits the target concept over the training data.
– Lacking any further information, our assumption is that the best hypothesis regarding unseen instances is the hypothesis that best fits the observed training data. This is the fundamental assumption of inductive learning.
EnjoySport – Hypothesis Space
• Sky has 3 possible values; the other 5 attributes have 2 possible values each.
• There are 96 (= 3·2·2·2·2·2) distinct instances in X.
• There are 5120 (= 5·4·4·4·4·4) syntactically distinct hypotheses in H – two more values per attribute: ? and 0.
• Every hypothesis containing one or more 0 symbols represents the empty set of instances; that is, it classifies every instance as negative.
• There are 973 (= 1 + 4·3·3·3·3·3) semantically distinct hypotheses in H – only one more value per attribute (?), plus one hypothesis representing the empty set of instances.
• Although EnjoySport has a small, finite hypothesis space, most learning tasks have much larger (even infinite) hypothesis spaces – we need efficient search algorithms over the hypothesis space.
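The counts quoted above follow directly from the attribute value lists and can be reproduced in a few lines:

```python
# Sky has 3 values; the other 5 attributes have 2 each.
instances = 3 * 2 * 2 * 2 * 2 * 2        # distinct instances in X
syntactic = 5 * 4 * 4 * 4 * 4 * 4        # each attribute also allows '?' and '0'
semantic = 1 + 4 * 3 * 3 * 3 * 3 * 3     # one empty-set hypothesis, plus '?' per attribute
print(instances, syntactic, semantic)    # -> 96 5120 973
```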
• The Inductive Learning Hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.
                         Concept Learning As Search
• Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation.
• The goal of this search is to find the hypothesis that best fits the training examples.
• By selecting a hypothesis representation, the designer of the learning algorithm implicitly defines the space of all hypotheses that the program can ever represent, and therefore can ever learn.
General-to-Specific Ordering of Hypotheses
• Many algorithms for concept learning organize the search through the hypothesis space by relying on a general-to-specific ordering of hypotheses.
• By taking advantage of this naturally occurring structure over the hypothesis space, we can design learning algorithms that exhaustively search even infinite hypothesis spaces without explicitly enumerating every hypothesis.
• Consider two hypotheses:
  h1 = <Sunny, ?, ?, Strong, ?, ?>
  h2 = <Sunny, ?, ?, ?, ?, ?>
• Now consider the sets of instances that are classified positive by h1 and by h2.
– Because h2 imposes fewer constraints on the instance, it classifies more instances as positive.
– In fact, any instance classified positive by h1 will also be classified positive by h2.
– Therefore, we say that h2 is more general than h1.
• For any instance x in X and hypothesis h in H, we say that x satisfies h if and only if h(x) = 1.
• More-General-Than-Or-Equal Relation:
  Let h1 and h2 be two boolean-valued functions defined over X. Then h1 is more-general-than-or-equal-to h2 (written h1 ≥ h2) if and only if any instance that satisfies h2 also satisfies h1.
• h1 is more-general-than h2 (h1 > h2) if and only if h1 ≥ h2 is true and h2 ≥ h1 is false. We also say h2 is more-specific-than h1.
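For the conjunctive representation used here, the more-general-than-or-equal relation can be checked constraint by constraint: "?" covers anything, a specific value covers only itself (and "0"), and "0" covers only "0". A minimal sketch (the constraint-wise test, not a general-purpose implementation over arbitrary boolean functions):

```python
def more_general_or_equal(h1, h2):
    """h1 >= h2 for conjunctive hypotheses: every constraint in h1 is at least
    as permissive as the corresponding constraint in h2."""
    return all(a == '?' or a == b or b == '0' for a, b in zip(h1, h2))

h1 = ('Sunny', '?', '?', '?', '?', '?')
h2 = ('Sunny', '?', '?', 'Strong', '?', '?')
print(more_general_or_equal(h1, h2))   # -> True: h1 drops the Wind constraint
print(more_general_or_equal(h2, h1))   # -> False: so h1 > h2 (strictly more general)
```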
FIND-S Algorithm
• The FIND-S algorithm starts from the most specific hypothesis and generalizes it by considering only positive examples.
• The FIND-S algorithm ignores negative examples.
– As long as the hypothesis space contains a hypothesis that describes the true target concept, and the training data contains no errors, ignoring negative examples does not cause any problems.
• FIND-S finds the most specific hypothesis within H that is consistent with the positive training examples.
– The final hypothesis will also be consistent with the negative examples, provided the correct target concept is in H and the training examples are correct.
                                FIND-S Algorithm
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
     For each attribute constraint a_i in h
       If the constraint a_i is satisfied by x
         Then do nothing
         Else replace a_i in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
                      Candidate-Elimination Algorithm
• FIND-S outputs one hypothesis from H that is consistent with the training examples, but this is just one of many hypotheses from H that might fit the training data equally well.
• The key idea in the Candidate-Elimination algorithm is to output a description of the set of all hypotheses consistent with the training examples.
  – The Candidate-Elimination algorithm computes the description of this set without explicitly enumerating all of its members.
  – This is accomplished by using the more-general-than partial ordering and maintaining a compact representation of the set of consistent hypotheses.
                              Version Spaces
 • The Candidate-Elimination algorithm represents the set of all
 hypotheses consistent with the observed training examples.
 • This subset of all hypotheses is called the version space with respect
 to the hypothesis space H and the training examples D, because it
 contains all plausible versions of the target concept.
                      List-Then-Eliminate Algorithm
 • List-Then-Eliminate algorithm initializes the version space to contain all
 hypotheses in H, then eliminates any hypothesis found inconsistent with any
 training example.
 • The version space of candidate hypotheses thus shrinks as more examples are
 observed, until ideally just one hypothesis remains that is consistent with all the
 observed examples.
  – Presumably, this is the desired target concept.
   – If insufficient data is available to narrow the version space to a single
 hypothesis, then the algorithm can output the entire set of hypotheses
 consistent with the observed data.
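List-Then-Eliminate can be sketched on a toy domain with just two attributes (the attribute names and values below are illustrative, not from the slides), since it requires enumerating all of H and is only feasible for trivial spaces:

```python
from itertools import product

SKY = ['Sunny', 'Rainy']     # hypothetical 2-value toy domain
TEMP = ['Warm', 'Cold']

def satisfies(h, x):
    """'?' accepts anything; a specific value must match; '0' matches nothing."""
    return all(c == '?' or c == v for c, v in zip(h, x))

# Enumerate the whole hypothesis space: every value-or-'?' combination,
# plus one hypothesis representing the empty set of instances.
H = list(product(SKY + ['?'], TEMP + ['?'])) + [('0', '0')]

def list_then_eliminate(examples):
    version_space = list(H)          # start with all hypotheses in H
    for x, label in examples:
        # keep only hypotheses whose classification of x agrees with its label
        version_space = [h for h in version_space if satisfies(h, x) == label]
    return version_space

data = [(('Sunny', 'Warm'), True), (('Rainy', 'Cold'), False)]
print(list_then_eliminate(data))
# -> [('Sunny', 'Warm'), ('Sunny', '?'), ('?', 'Warm')]
```

Even with only two attributes, H has ten members; with the six EnjoySport attributes it already has 5120, which is why the algorithm is impractical beyond trivial spaces.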
 • List-Then-Eliminate algorithm can be applied whenever the
 hypothesis space H is finite.
 – It has many advantages, including the fact that it is guaranteed to
 output all hypotheses consistent with the training data.
  – Unfortunately, it requires exhaustively enumerating all hypotheses
 in H - an unrealistic requirement for all but the most trivial hypothesis
 spaces.
                   Compact Representation of Version Spaces
• A version space can be represented by its general and specific boundary sets.
• The Candidate-Elimination algorithm represents the version space by storing only its most general members G and its most specific members S.
• Given only these two sets S and G, it is possible to enumerate all members of the version space by generating the hypotheses that lie between these two sets in the general-to-specific partial ordering over hypotheses.
• Every member of the version space lies between these boundaries.
                            Candidate-Elimination Algorithm
• The Candidate-Elimination algorithm computes the version space containing all hypotheses from H that are consistent with an observed sequence of training examples.
• It begins by initializing the version space to the set of all hypotheses in H; that is, by initializing the G boundary set to contain the most general hypothesis in H,
  G0 = { <?, ?, ?, ?, ?, ?> }
  and initializing the S boundary set to contain the most specific hypothesis,
  S0 = { <0, 0, 0, 0, 0, 0> }
• These two boundary sets delimit the entire hypothesis space, because every other hypothesis in H is both more general than S0 and more specific than G0.
• As each training example is considered, the S and G boundary sets are generalized and specialized, respectively, to eliminate from the version space any hypotheses found inconsistent with the new training example.
• After all examples have been processed, the computed version space contains all the hypotheses consistent with these examples, and only these hypotheses.
                      Candidate-Elimination Algorithm
• Initialize G to the set of maximally general hypotheses in H
• Initialize S to the set of maximally specific hypotheses in H
• For each training example d, do
  – If d is a positive example
    • Remove from G any hypothesis inconsistent with d
    • For each hypothesis s in S that is not consistent with d
      – Remove s from S
      – Add to S all minimal generalizations h of s such that
        » h is consistent with d, and some member of G is more general than h
      – Remove from S any hypothesis that is more general than another hypothesis in S
  – If d is a negative example
    • Remove from S any hypothesis inconsistent with d
    • For each hypothesis g in G that is not consistent with d
      – Remove g from G
      – Add to G all minimal specializations h of g such that
        » h is consistent with d, and some member of S is more specific than h
      – Remove from G any hypothesis that is less general than another hypothesis in G
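The algorithm above can be sketched compactly for the conjunctive EnjoySport representation. This is a minimal sketch assuming the attribute domains from the earlier slides; it is not a general-purpose implementation.

```python
DOMAINS = [
    ['Sunny', 'Cloudy', 'Rainy'], ['Warm', 'Cold'], ['Normal', 'High'],
    ['Strong', 'Weak'], ['Warm', 'Cold'], ['Same', 'Change'],
]

def satisfies(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def mgoe(h1, h2):
    """h1 more-general-than-or-equal h2 (constraint-wise check)."""
    return all(a == '?' or a == b or b == '0' for a, b in zip(h1, h2))

def min_generalize(s, x):
    """Unique minimal generalization of s that covers positive example x."""
    return tuple(v if c in ('0', v) else '?' for c, v in zip(s, x))

def min_specializations(g, x):
    """Replace one '?' at a time by a value that rules out negative example x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == '?'
            for v in DOMAINS[i] if v != x[i]]

def candidate_elimination(examples):
    S = {('0',) * 6}                 # maximally specific boundary
    G = {('?',) * 6}                 # maximally general boundary
    for x, positive in examples:
        if positive:
            G = {g for g in G if satisfies(g, x)}
            for s in [s for s in S if not satisfies(s, x)]:
                S.remove(s)
                h = min_generalize(s, x)
                if any(mgoe(g, h) for g in G):
                    S.add(h)
            S = {s for s in S if not any(s != t and mgoe(s, t) for t in S)}
        else:
            S = {s for s in S if not satisfies(s, x)}
            for g in [g for g in G if satisfies(g, x)]:
                G.remove(g)
                for h in min_specializations(g, x):
                    if any(mgoe(h, s) for s in S):
                        G.add(h)
            G = {g for g in G if not any(g != h and mgoe(h, g) for h in G)}
    return S, G

data = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),
    (('Sunny', 'Warm', 'High',   'Strong', 'Warm', 'Same'), True),
    (('Rainy', 'Cold', 'High',   'Strong', 'Warm', 'Change'), False),
    (('Sunny', 'Warm', 'High',   'Strong', 'Cold', 'Change'), True),
]
S, G = candidate_elimination(data)
print(S)   # -> {('Sunny', 'Warm', '?', 'Strong', '?', '?')}
print(G)   # -> {('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')}
```

On this data the final version space is bounded by one maximally specific hypothesis (the same one FIND-S returns) and two maximally general ones; every hypothesis between them is consistent with all four examples.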