Decision Tree Induction
Non-metric Methods
• Numerical attributes
– Nearest neighbour: based on a distance measure
– Neural networks: two similar inputs lead to
similar outputs
– SVMs: based on dot products
Non-metric data
• Nominal attributes
• Color, taste
• Strings: DNA
• Probability-based
• Rule-based
– Decision trees
Decision Tree
• Rules in the form of a hierarchy.
• Why are decision trees so popular?
Definition of Decision Tree
Definition 9.1: Decision Tree
A decision tree is a tree in which each internal (non-leaf) node denotes a test on an attribute, each branch corresponds to an outcome of the test, and each leaf (terminal) node holds a class label. The topmost node is the root.
We need to work with a training set
[Figure: Training Data is fed to the decision tree induction process, which outputs a Classifier (Decision Tree).]
Output: A Decision Tree for “buys_computer”
age?
  <=30   → student?  (no: buys_computer = no;  yes: buys_computer = yes)
  31..40 → buys_computer = yes
  >40    → credit_rating?  (excellent: buys_computer = no;  fair: buys_computer = yes)
• What criteria should be used for choosing a splitting attribute?
• You can achieve 100% accuracy on the training set, but at what cost?
– Overfitting
• When do you stop growing the tree?
• Are there different types of DT induction methods? Yes: ID3, C4.5 and CART.
Decision tree induction
• These algorithms adopt a greedy (i.e., no backtracking),
top-down, recursive divide-and-conquer approach.
• Node 🡪 subset of training patterns
• Root 🡪 training set.
• Leaf 🡪 class label.
Impurity measures
• Entropy impurity (information impurity)
• Gini impurity (variance impurity)
• Misclassification impurity
For a two-category case
Note: there is a mistake in the original slide. For a two-class problem, entropy impurity has its maximum value of 1 (when the two classes are equally likely), whereas the Gini and misclassification impurities have a maximum value of 0.5.
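To make the comparison concrete, here is a minimal Python sketch (my own, not from the slides) that evaluates the three impurity measures for a two-class node with class-1 probability p; it confirms the maxima quoted above (entropy peaks at 1 bit, Gini and misclassification at 0.5, all at p = 0.5).

```python
import math

def entropy_impurity(p):
    """Entropy impurity: -sum p_i log2 p_i for a two-class node."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def gini_impurity(p):
    """Gini (variance) impurity: 1 - sum p_i^2 for a two-class node."""
    return 1.0 - (p ** 2 + (1 - p) ** 2)

def misclassification_impurity(p):
    """Misclassification impurity: 1 - max_i p_i for a two-class node."""
    return 1.0 - max(p, 1 - p)

if __name__ == "__main__":
    for p in (0.0, 0.1, 0.3, 0.5, 0.7, 1.0):
        print(f"p={p:.1f}  entropy={entropy_impurity(p):.3f}  "
              f"gini={gini_impurity(p):.3f}  "
              f"misclass={misclassification_impurity(p):.3f}")
```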
Which test?
• Choose the test that reduces the impurity the most.
– Try to reach pure nodes as quickly as possible.
Information gain
• Info(D) = − Σi pi log2(pi), where pi is the proportion of tuples in D belonging to class i.
• InfoA(D) = Σj (|Dj| / |D|) · Info(Dj), the expected information after splitting D on attribute A into partitions D1, …, Dv.
• Gain(A) = Info(D) − InfoA(D).
Gain(age) = ?
The class distribution in the full training set is (yes, no) = (9, 5).
For the other attributes, the gains turn out to be smaller.
• So we choose age as the splitting attribute.
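As a concrete check, the sketch below (mine, not from the slides) computes Gain(age) for the classic buys_computer data, using the well-known partition counts for age: <=30 has (2 yes, 3 no), 31..40 has (4 yes, 0 no), >40 has (3 yes, 2 no). It yields Info(D) ≈ 0.940, Info_age(D) ≈ 0.694 and Gain(age) ≈ 0.246.

```python
import math

def info(counts):
    """Entropy of a class-count distribution, in bits."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)

# Class counts (yes, no) in the whole set and in each age partition.
whole = (9, 5)
age_partitions = {"<=30": (2, 3), "31..40": (4, 0), ">40": (3, 2)}

info_d = info(whole)
n = sum(whole)
info_age = sum(sum(part) / n * info(part) for part in age_partitions.values())

print(f"Info(D)     = {info_d:.3f}")              # ~0.940
print(f"Info_age(D) = {info_age:.3f}")            # ~0.694
print(f"Gain(age)   = {info_d - info_age:.3f}")   # ~0.246
```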
• Similarly one can use other impurity measures
Gini Index (IBM IntelligentMiner)
• If a data set T contains examples from n classes, the gini index gini(T) is defined as
  gini(T) = 1 − Σj pj², where pj is the relative frequency of class j in T.
• If a data set T is split into two subsets T1 and T2 with sizes N1 and N2 respectively, the gini index of the split data, ginisplit(T), is defined as
  ginisplit(T) = (N1/N) · gini(T1) + (N2/N) · gini(T2), where N = N1 + N2.
• The attribute that provides the smallest ginisplit(T) is chosen to split the node (all possible splitting points for each attribute need to be enumerated).
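A minimal Python sketch of these two formulas (my own illustration, not the IntelligentMiner implementation):

```python
def gini(labels):
    """gini(T) = 1 - sum_j p_j^2 over the class labels in T."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def gini_split(left_labels, right_labels):
    """Weighted gini of a binary split T -> (T1, T2)."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n * gini(left_labels)
            + len(right_labels) / n * gini(right_labels))

# Example: a node with 9 'yes' and 5 'no' split into (6 yes, 1 no) and (3 yes, 4 no).
parent = ["yes"] * 9 + ["no"] * 5
t1, t2 = ["yes"] * 6 + ["no"], ["yes"] * 3 + ["no"] * 4
print(gini(parent), gini_split(t1, t2))
```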
• But, there is one drawback with this approach!
• A split with a large branching factor is often preferred by information gain.
– For example, an attribute such as telephone number, which takes a distinct value for every tuple, would be chosen.
So, we penalize large branching factors
• This is the idea behind the gain ratio (commonly used together with information gain, as in C4.5).
• The larger the branching factor, the larger the denominator (the split information), and hence the smaller the gain ratio.
Building Decision Tree
• In principle, exponentially many decision trees can be constructed from a given database (also called training data).
– Some of the trees may not be optimal.
– Some of them may give inaccurate results.
• How is a decision tree built?
– Greedy strategy
• A top-down recursive divide-and-conquer
– Modification of greedy strategy
• ID3
• C4.5
• CART, etc.
BuildDT Algorithm
• Algorithm BuildDT
• Input: D : training data set
• Output: T : decision tree
Steps
1. If all tuples in D belong to the same class Cj
      add a leaf node labeled as Cj
      return            // termination condition
2. Select an attribute Ai (one not already selected on the same branch)
3. Partition D = {D1, D2, …, Dp} based on the p different values of Ai in D
4. For each Dk ∈ D
      create a node and add an edge between D and Dk labeled with Ai’s attribute value in Dk
5. For each Dk ∈ D
      BuildDT(Dk)       // recursive call
6. Stop
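The following is a minimal Python sketch of BuildDT as described above (my own illustration, not code from the slides). It assumes the training data is a list of (attribute-dict, class-label) pairs and uses the simple "pick any unused attribute" rule from step 2 rather than an impurity-based choice; the majority-class tie-break for an empty attribute list is also my addition.

```python
def build_dt(data, attributes):
    """data: list of (dict of attribute -> value, class label) pairs."""
    labels = [label for _, label in data]
    # Step 1: all tuples belong to the same class -> leaf node.
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}
    # No attribute left on this branch -> majority-class leaf (not in the slides).
    if not attributes:
        return {"leaf": max(set(labels), key=labels.count)}
    # Step 2: select an attribute not already used on this branch.
    attr = attributes[0]
    remaining = attributes[1:]
    # Step 3: partition D on the distinct values of the selected attribute.
    partitions = {}
    for row, label in data:
        partitions.setdefault(row[attr], []).append((row, label))
    # Steps 4-5: create a child per partition and recurse.
    return {"attr": attr,
            "children": {v: build_dt(subset, remaining)
                         for v, subset in partitions.items()}}

# Tiny usage example with hypothetical attribute values.
data = [({"Gender": "M", "Height": "tall"}, "T"),
        ({"Gender": "F", "Height": "short"}, "S"),
        ({"Gender": "F", "Height": "tall"}, "M")]
print(build_dt(data, ["Gender", "Height"]))
```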
Node Splitting in BuildDT Algorithm
• The BuildDT algorithm must provide a method for expressing an attribute test
condition and the corresponding outcomes for different attribute types
• Case: Binary attribute
– This is the simplest case of node splitting
– The test condition for a binary attribute generates only two outcomes
Node Splitting in BuildDT Algorithm
• Case: Nominal attribute
– Since a nominal attribute can have many values, its test condition can be
expressed in two ways:
• A multi-way split
• A binary split
– Multi-way split: the outcome depends on the number of distinct values of the
corresponding attribute
– Binary splitting by grouping attribute values
Node Splitting in BuildDT Algorithm
• Case: Ordinal attribute
– It also can be expressed in two ways:
• A multi-way split
• A binary split
– Multi-way split: same as in the case of a nominal attribute
– Binary split: attribute values should be grouped while maintaining the order
property of the attribute values
Node Splitting in BuildDT Algorithm
• Case: Numerical attribute
– For a numeric attribute (with discrete or continuous values), a test condition can
be expressed as a comparison
• Binary outcome: A > v or A ≤ v
– In this case, decision tree induction must consider all possible split positions
• Range query: vi ≤ A < vi+1 for i = 1, 2, …, q (if q ranges are chosen)
– Here, q should be decided a priori
– For a numeric attribute, choosing the split is thus a combinatorial optimization
problem
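The sketch below (an illustration of mine, not code from the slides) shows one way a BuildDT-style implementation could represent these test conditions: a binary or nominal attribute gets a multi-way split on its distinct values, while an ordinal or numeric attribute gets binary tests A ≤ v versus A > v, with every observed value treated as a candidate threshold. The attribute names and type labels are assumptions for illustration.

```python
def candidate_tests(rows, attr, attr_type):
    """Return a list of (description, branch_function) pairs for one attribute.

    rows: list of dicts mapping attribute name -> value.
    attr_type: 'binary', 'nominal', 'ordinal' or 'numeric' (labels assumed here).
    """
    values = sorted({row[attr] for row in rows})
    tests = []
    if attr_type in ("binary", "nominal"):
        # Multi-way split: one branch per distinct value.
        tests.append((f"{attr} = ?", lambda row: row[attr]))
    if attr_type in ("ordinal", "numeric"):
        # Binary splits A <= v vs A > v, one candidate per observed value.
        for v in values[:-1]:
            tests.append((f"{attr} <= {v}",
                          lambda row, t=v: "yes" if row[attr] <= t else "no"))
    return tests

rows = [{"Height": 1.5}, {"Height": 1.8}, {"Height": 2.1}]
for desc, _ in candidate_tests(rows, "Height", "numeric"):
    print(desc)
```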
Illustration : BuildDT Algorithm
Example 9.4: Illustration of BuildDT Algorithm
– Consider a training data set as shown.
Attributes:
Gender = {Male(M), Female (F)} // Binary attribute
Height = {1.5, …, 2.5} // Continuous attribute
Class = {Short (S), Medium (M), Tall (T)}
Given a person, we want to determine the class to which s/he belongs
Illustration : BuildDT Algorithm
• To build a decision tree, we can select the attributes in two different orderings:
<Gender, Height> or <Height, Gender>
• Further, for each ordering, we can choose different ways of splitting
• Different instances are shown in the following.
• Approach 1 : <Gender, Height>
Illustration : BuildDT Algorithm
• Approach 2 : <Height, Gender>
Illustration : BuildDT Algorithm
Example 9.5: Illustration of BuildDT Algorithm
– Consider an anonymous database as shown.
• Is there any “clue” that enables us to select the “best” attribute first?
• Suppose, following are two
attempts:
• A1🡪A2🡪A3🡪A4 [naïve]
• A3🡪A2🡪A4🡪A1 [Random]
• Draw the decision trees in the
above-mentioned two cases.
• Do the two trees classify test data differently?
• If any other sample data is added into the
database, is that likely to alter the
decision tree already obtained?
Algorithm ID3
ID3: Decision Tree Induction Algorithms
• Quinlan [1986] introduced ID3 (short for Iterative Dichotomizer 3), a popular algorithm for inducing decision trees from a set of training data.
• In ID3, each node corresponds to a splitting attribute and each arc is a
possible value of that attribute.
• At each node, the splitting attribute is selected to be the most informative
among the attributes not yet considered in the path starting from the root.
Algorithm ID3
• In ID3, entropy is used to measure how informative a node is.
– It is observed that splitting on any attribute has the property that the average entropy of the resulting training subsets is less than or equal to that of the parent node’s training subset.
• ID3 algorithm defines a measurement of a splitting called Information
Gain to determine the goodness of a split.
– The attribute with the largest value of information gain is chosen as the
splitting attribute and
– it partitions the data into a number of smaller training sets based on the distinct values
of the attribute under split.
Entropy of a Training Set
Example 9.10: OPTH dataset
Consider the OPTH data shown in the following table, with 24 instances in total. Coded values are used for all attributes to avoid cluttering the table.

Age  Eye-sight  Astigmatic  Use Type  Class
 1      1          1           1        3
 1      1          1           2        2
 1      1          2           1        3
 1      1          2           2        1
 1      2          1           1        3
 1      2          1           2        2
 1      2          2           1        3
 1      2          2           2        1
 2      1          1           1        3
 2      1          1           2        2
 2      1          2           1        3
 2      1          2           2        1
 2      2          1           1        3
 2      2          1           2        2
 2      2          2           1        3
 2      2          2           2        3
 3      1          1           1        3
 3      1          1           2        3
 3      1          2           1        3
 3      1          2           2        1
 3      2          1           1        3
 3      2          1           2        2
 3      2          2           1        3
 3      2          2           2        3
Information Gain Calculation
• Consider first the subset of OPTH with Age = 1, shown below; its class distribution is (class 1, class 2, class 3) = (2, 2, 4), giving an entropy of 1.5 bits.
Age Eye-sight Astigmatism Use type Class
1 1 1 1 3
1 1 1 2 2
1 1 2 1 3
1 1 2 2 1
1 2 1 1 3
1 2 1 2 2
1 2 2 1 3
1 2 2 2 1
Calculating Information Gain
Age Eye-sight Astigmatism Use type Class
2 1 1 1 3
2 1 1 2 2
2 1 2 1 3
2 1 2 2 1
2 2 1 1 3
2 2 1 2 2
2 2 2 1 3
2 2 2 2 3
Calculating Information Gain
Age Eye-sight Astigmatism Use type Class
3 1 1 1 3
3 1 1 2 3
3 1 2 1 3
3 1 2 2 1
3 2 1 1 3
3 2 1 2 2
3 2 2 1 3
3 2 2 2 3
Information Gains for Different Attributes
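The computed gains from the original slide are not reproduced here, but the following Python sketch (my own) computes the information gain of each attribute directly from the OPTH table above; Use Type should come out with by far the largest gain, which is consistent with the split chosen on the next slide.

```python
import math
from collections import Counter

# The 24 OPTH tuples: (Age, Eye-sight, Astigmatic, Use Type, Class)
opth = [
    (1,1,1,1,3),(1,1,1,2,2),(1,1,2,1,3),(1,1,2,2,1),
    (1,2,1,1,3),(1,2,1,2,2),(1,2,2,1,3),(1,2,2,2,1),
    (2,1,1,1,3),(2,1,1,2,2),(2,1,2,1,3),(2,1,2,2,1),
    (2,2,1,1,3),(2,2,1,2,2),(2,2,2,1,3),(2,2,2,2,3),
    (3,1,1,1,3),(3,1,1,2,3),(3,1,2,1,3),(3,1,2,2,1),
    (3,2,1,1,3),(3,2,1,2,2),(3,2,2,1,3),(3,2,2,2,3),
]
attrs = {"Age": 0, "Eye-sight": 1, "Astigmatic": 2, "Use Type": 3}

def entropy(rows):
    counts = Counter(r[-1] for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain(rows, idx):
    n = len(rows)
    expected_info = 0.0
    for v in {r[idx] for r in rows}:
        subset = [r for r in rows if r[idx] == v]
        expected_info += len(subset) / n * entropy(subset)
    return entropy(rows) - expected_info

for name, idx in attrs.items():
    print(f"Gain({name}) = {gain(opth, idx):.3f}")
```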
Decision Tree Induction : ID3 Way
[Figure: among Age, Eye-sight, Astigmatic and Use Type, the ✔ marks Use Type as the attribute selected at the root. The split produces two subsets of 12 tuples each: for Use Type = 1 every tuple has class 3, so that branch becomes a pure leaf, while the Use Type = 2 subset is split further on Age, Eye-sight or Astigmatic.]
Splitting of Continuous Attribute Values
Algorithm CART
CART Algorithm
• CART (Classification and Regression Trees), due to Breiman, Friedman, Olshen and Stone [1984], builds strictly binary decision trees and uses the Gini index (rather than entropy) as its impurity measure.
Gini Index of Diversity
Definition 9.6: Gini Index
Given a data set D containing tuples from k classes, the Gini index of D is
Gini(D) = 1 − Σj pj², where pj is the relative frequency of class j in D.
Gini Index of Diversity
Definition 9.7: Gini Index of Diversity
For an attribute A that splits D into partitions D1, …, Dm, the Gini index of diversity is the reduction in impurity
γ(A, D) = Gini(D) − GiniA(D), where GiniA(D) = Σj (|Dj| / |D|) · Gini(Dj).
Gini Index of Diversity and CART
• CART selects the splitting attribute (and split point) that maximizes γ(A, D), i.e., that minimizes the weighted Gini index GiniA(D).
n-ary Attribute Values to Binary Splitting
• Case 1: Discrete-valued attributes. An attribute A with m distinct values is turned into a binary split by grouping the values into two non-empty subsets; in general 2^(m−1) − 1 distinct groupings have to be examined.
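A small Python sketch of this grouping step (my own illustration; the attribute and value names are hypothetical): it enumerates the candidate binary groupings of a discrete attribute and scores each with the weighted Gini index, so the best yes/no test can be picked.

```python
from itertools import combinations

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_binary_grouping(rows, attr):
    """rows: list of (value_dict, class_label); returns (best subset, its weighted gini)."""
    values = sorted({r[attr] for r, _ in rows})
    first = values[0]
    best = None
    # The 2^(m-1) - 1 distinct groupings: every proper subset containing the first value.
    for size in range(1, len(values)):
        for subset in combinations(values, size):
            if first not in subset:
                continue
            yes = [label for r, label in rows if r[attr] in subset]
            no = [label for r, label in rows if r[attr] not in subset]
            w = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
            if best is None or w < best[1]:
                best = (set(subset), w)
    return best

rows = [({"Age": "Y"}, "N"), ({"Age": "Y"}, "Y"), ({"Age": "M"}, "Y"),
        ({"Age": "O"}, "Y"), ({"Age": "O"}, "N")]
print(best_binary_grouping(rows, "Age"))
```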
n-ary Attribute Values to Binary Splitting
Case 2: Continuous-valued attributes
• For a continuous-valued attribute, each possible split point must be taken into account.
• The strategy is similar to the one followed in ID3 to calculate information gain for continuous-valued attributes.
• According to that strategy, the mid-point between each pair of adjacent values ai and ai+1, say vi = (ai + ai+1)/2, is taken as a candidate split point, giving the binary test A ≤ vi (yes) or A > vi (no).
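A minimal sketch (my own) of this midpoint strategy, choosing the threshold that minimizes the weighted Gini index; the attribute values used in the example are hypothetical.

```python
def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_numeric_split(pairs):
    """pairs: list of (numeric value, class label); returns (threshold, weighted_gini)."""
    pairs = sorted(pairs)
    best = None
    for (a, _), (b, _) in zip(pairs, pairs[1:]):
        if a == b:
            continue
        v = (a + b) / 2.0                      # candidate mid-point
        left = [lbl for x, lbl in pairs if x <= v]
        right = [lbl for x, lbl in pairs if x > v]
        w = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or w < best[1]:
            best = (v, w)
    return best

heights = [(1.5, "S"), (1.6, "S"), (1.7, "M"), (1.8, "M"), (1.9, "T"), (2.0, "T")]
print(best_numeric_split(heights))
```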
CART Algorithm : Illustration
Example 9.15 : CART Algorithm
Suppose we want to build a decision tree for the data set EMP given in the table below.
Tuple#  Age  Salary  Job  Performance  Select
  1      Y     H      P       A          N
  2      Y     H      P       E          N
  3      M     H      P       A          Y
  4      O     M      P       A          Y
  5      O     L      G       A          Y
  6      O     L      G       E          N
  7      M     L      G       E          Y
  8      Y     M      P       A          N
  9      Y     L      G       A          Y
 10      O     M      G       A          Y
 11      Y     M      G       E          Y
 12      M     M      P       E          Y
 13      M     H      G       A          Y
 14      O     M      P       E          N

Coding: Age (Y: young, M: middle-aged, O: old); Salary (L: low, M: medium, H: high); Job (G: government, P: private); Performance (A: average, E: excellent); Class Select (Y: yes, N: no).
CART Algorithm : Illustration
• The class distribution of EMP is (Y, N) = (9, 5), so Gini(EMP) = 1 − (9/14)² − (5/14)² ≈ 0.459.
CART Algorithm : Illustration
• Candidate binary split on Age: the grouping {O} (yes branch) versus {Y, M} (no branch).
CART Algorithm : Illustration
• Candidate binary split on Salary: the grouping {H} (yes branch) versus {L, M} (no branch).
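To illustrate how these candidate splits would be scored, here is a Python sketch (mine, not from the slides) that evaluates the weighted Gini index, and hence γ, of the two groupings shown above on the EMP data.

```python
emp = [  # (Age, Salary, Job, Performance, Select)
    ("Y","H","P","A","N"), ("Y","H","P","E","N"), ("M","H","P","A","Y"),
    ("O","M","P","A","Y"), ("O","L","G","A","Y"), ("O","L","G","E","N"),
    ("M","L","G","E","Y"), ("Y","M","P","A","N"), ("Y","L","G","A","Y"),
    ("O","M","G","A","Y"), ("Y","M","G","E","Y"), ("M","M","P","E","Y"),
    ("M","H","G","A","Y"), ("O","M","P","E","N"),
]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gamma(rows, attr_idx, yes_values):
    """Reduction in Gini for the binary grouping: attr value in yes_values vs the rest."""
    yes = [r[-1] for r in rows if r[attr_idx] in yes_values]
    no = [r[-1] for r in rows if r[attr_idx] not in yes_values]
    weighted = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
    return gini([r[-1] for r in rows]) - weighted

print("gamma(Age, {O} vs {Y,M})    =", round(gamma(emp, 0, {"O"}), 3))
print("gamma(Salary, {H} vs {L,M}) =", round(gamma(emp, 1, {"H"}), 3))
```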
Calculating γ using Frequency Table
Illustration: Calculating γ using Frequency Table
• Suppose a node contains 24 training tuples from three classes, and a candidate attribute takes the three values 1, 2 and 3. The class counts for each attribute value can be arranged in a frequency table:

            Value 1   Value 2   Value 3
Class 1        2         1         1
Class 2        2         2         1
Class 3        4         5         6
Column sum     8         8         8
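The Gini quantities can be computed directly from such a frequency table, without going back to the individual tuples; the sketch below (my own) does this for the table above.

```python
def gini_from_counts(counts):
    """Gini index from a list of class counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

# freq[i][j] = number of tuples of class i having the j-th attribute value.
freq = [
    [2, 1, 1],   # Class 1
    [2, 2, 1],   # Class 2
    [4, 5, 6],   # Class 3
]

col_sums = [sum(row[j] for row in freq) for j in range(len(freq[0]))]
row_sums = [sum(row) for row in freq]
n = sum(col_sums)

gini_d = gini_from_counts(row_sums)
gini_a = sum(col_sums[j] / n * gini_from_counts([row[j] for row in freq])
             for j in range(len(col_sums)))
print(f"Gini(D) = {gini_d:.3f}, Gini_A(D) = {gini_a:.3f}, gamma = {gini_d - gini_a:.3f}")
```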
Decision Trees with ID3 and CART Algorithms
Example 9.17: Comparing decision trees for the EMP data set
Compare the two decision trees obtained using ID3 and CART for the EMP data set. The decision tree according to ID3 is given below for reference (subject to verification).
Decision tree using ID3:
Age?
  Y → Job?          (P: N,  G: Y)
  M → Y
  O → Performance?  (A: Y,  E: N)

Decision tree using CART: ?  (to be constructed)
Algorithm C4.5
Algorithm C4.5 : Introduction
• Consider an attribute (say, a record identifier) that takes a distinct value for every tuple in the training set D. Splitting on it yields one pure, single-tuple partition per value, so InfoA(D) = 0 and Gain(A) is maximal, yet such a split is useless for classifying unseen records.
Algorithm: C 4.5 : Introduction
• Although the previous situation is an extreme case, we can intuitively infer that ID3 favours splitting attributes having a large number of values
– compared to other attributes, which have fewer distinct values.
• Such a partition appears to be useless for classification.
• This type of problem is called the overfitting problem.
Note:
Decision Tree Induction Algorithm ID3 may suffer from overfitting problem.
Algorithm: C 4.5 : Gain Ratio
Definition 9.8: Gain Ratio
The gain ratio of an attribute A with respect to a data set D is
GainRatio(A) = Gain(A) / SplitInfoA(D), where
SplitInfoA(D) = − Σj (|Dj| / |D|) · log2(|Dj| / |D|)
and D1, …, Dv are the partitions of D induced by the v distinct values of A.
• Example: Suppose an attribute A divides a set of 32 training tuples among its four distinct values. Consider the following possible frequency distributions over the four values:
– Distribution 1: Frequency 32 0 0 0
– Distribution 2: Frequency 16 16 0 0
– Distribution 3: Frequency 16 8 8 0
– Distribution 4: Frequency 16 8 4 4
– Distribution 5 (uniform distribution of attribute values): Frequency 8 8 8 8
• The corresponding split information values are 0, 1, 1.5, 1.75 and 2 bits respectively; the more uniformly the tuples are spread over the attribute values, the larger the split information.
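These values can be checked with a few lines of Python (my own sketch); the same helper also gives the gain ratio once the information gain is known.

```python
import math

def split_info(frequencies):
    """SplitInfo_A(D) = -sum (|Dj|/|D|) log2(|Dj|/|D|) over non-empty partitions."""
    n = sum(frequencies)
    return -sum(f / n * math.log2(f / n) for f in frequencies if f)

def gain_ratio(gain, frequencies):
    """Gain ratio = Gain / SplitInfo (undefined when the split information is 0)."""
    return gain / split_info(frequencies)

distributions = [(32, 0, 0, 0), (16, 16, 0, 0), (16, 8, 8, 0),
                 (16, 8, 4, 4), (8, 8, 8, 8)]
for d in distributions:
    print(d, "->", split_info(d))   # 0.0, 1.0, 1.5, 1.75, 2.0
```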
• Information gain signifies how much information will be gained on
partitioning the values of attribute A
– Higher information gain means splitting of A is more desirable.
• On the other hand, split information forms the denominator in the gain ratio formula.
– This implies that the higher the split information, the lower the gain ratio.
– In turn, this reduces the influence of attributes with many outcomes.
• Further, information gain tends to be large when there are many distinct attribute values.
– But with many distinct values, the split information is also large.
– In this way split information damps the gain, resulting in a more balanced criterion than information gain alone.
• Like information gain (in ID3), the attribute with the maximum gain ratio is
selected as the splitting attribute in C4.5.
Summary of Decision Tree Induction
Algorithms
• We have learned how to build a decision tree from a given training data set.
– The decision tree is then used to classify test data.
• For a given training data D, the important task is to build the decision tree
so that:
– All test data can be classified accurately
– The tree is balanced and has the minimum possible depth, so that classification can be done quickly.
• In order to build a decision tree, several algorithms have been proposed.
These algorithms differ in the splitting criteria they use, chosen so that the above-mentioned objectives are met and the decision tree can be induced with minimum time complexity. We have studied three decision tree induction algorithms, namely ID3, CART and C4.5. A summary of these three algorithms is presented in the following table.
Table 11.6

Algorithm   Splitting Criteria   Remark
ID3         Information gain     Works with categorical attributes; multi-way splits; no pruning in the basic algorithm; biased towards attributes with many values
CART        Gini index           Builds strictly binary trees; handles both categorical and continuous attributes
C4.5        Gain ratio           Successor of ID3; handles continuous attributes and missing values; includes pruning
In addition to this, we also highlight few important characteristics
of decision tree induction algorithms in the following.
Notes on Decision Tree Induction
algorithms
1. Optimal Decision Tree: Finding an optimal decision tree is an NP-complete
problem. Hence, decision tree induction algorithms employ a heuristic based
approach to search for the best in a large search space. Majority of the algorithms
follow a greedy, top-down recursive divide-and-conquer strategy to build
decision trees.
2. Missing data and noise: Decision tree induction algorithms are quite robust to
the data set with missing values and presence of noise. However, proper data
pre-processing can be followed to nullify these discrepancies.
3. Redundant Attributes: The presence of redundant attributes does not adversely
affect the accuracy of decision trees. It is observed that if an attribute is chosen
for splitting, then another attribute which is redundant with it is unlikely to be chosen for
splitting.
4. Computational complexity: Decision tree induction algorithms are
computationally inexpensive, even when the training sets are large. Moreover,
once a decision tree has been built, classifying a test record is extremely fast,
with a worst-case time complexity of O(d), where d is the maximum depth of the tree.
Notes on Decision Tree Induction algorithms
5. Data Fragmentation Problem: Since the decision tree induction
algorithms employ a top-down, recursive partitioning approach, the number
of tuples becomes smaller as we traverse down the tree. At a time, the
number of tuples may be too small to make a decision about the class
representation, such a problem is known as the data fragmentation. To deal
with this problem, further splitting can be stopped when the number of
records falls below a certain threshold.
6. Tree Pruning: A sub-tree may be replicated two or more times in a decision tree
(see the figure below). This makes the decision tree unnecessarily large and harder
to interpret. To avoid such a sub-tree replication problem, all copies of the sub-tree
except one can be pruned from the tree.

[Figure: a decision tree in which the same sub-tree appears under two different branches, illustrating the replication problem.]
Avoid Overfitting in Classification
• The generated tree may overfit the training data
– Too many branches, some may reflect anomalies due to
noise or outliers
– The result is poor accuracy for unseen samples
• Two approaches to avoid overfitting
– Prepruning: Halt tree construction early—do not split a
node if this would result in the goodness measure falling
below a threshold
• Difficult to choose an appropriate threshold
– Postpruning: Remove branches from a “fully grown”
tree—get a sequence of progressively pruned trees
• Use a set of data different from the training data to
decide which is the “best pruned tree”
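A minimal sketch (mine) of the postpruning idea: starting from a fully grown tree in the nested-dict representation used in the BuildDT sketch earlier, each internal node is tentatively replaced by a majority-class leaf, and the replacement is kept if it does not hurt accuracy on a separate pruning (validation) set. For brevity each sub-tree is evaluated on the whole validation set rather than only on the records routed to it.

```python
def classify(tree, row, default="?"):
    while "leaf" not in tree:
        tree = tree["children"].get(row.get(tree["attr"]), {"leaf": default})
    return tree["leaf"]

def accuracy(tree, data):
    return sum(classify(tree, r) == y for r, y in data) / len(data)

def reduced_error_prune(tree, train, validation):
    """Post-prune a nested-dict tree using a validation set (reduced-error pruning sketch)."""
    if "leaf" in tree:
        return tree
    # Prune the children bottom-up first.
    for v, child in list(tree["children"].items()):
        sub_train = [(r, y) for r, y in train if r.get(tree["attr"]) == v]
        # Fall back to the parent's training data if the partition is empty.
        tree["children"][v] = reduced_error_prune(child, sub_train or train, validation)
    # Tentatively replace this node by a majority-class leaf.
    labels = [y for _, y in train]
    leaf = {"leaf": max(set(labels), key=labels.count)}
    if accuracy(leaf, validation) >= accuracy(tree, validation):
        return leaf
    return tree
```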
Reference
⚫ The detailed material related to this lecture can be found in:
Data Mining: Concepts and Techniques, (3rd Edn.), Jiawei Han, Micheline Kamber, Morgan
Kaufmann, 2015.
Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, and Vipin Kumar,
Addison-Wesley, 2014
APPENDIX
Extracting Classification Rules from Trees
• Represent the knowledge in the form of IF-THEN rules
• One rule is created for each path from the root to a leaf
• Each attribute-value pair along a path forms a conjunction
• The leaf node holds the class prediction
• Rules are easier for humans to understand
• Example
IF age = “<=30” AND student = “no” THEN buys_computer = “no”
IF age = “<=30” AND student = “yes” THEN buys_computer = “yes”
IF age = “31…40” THEN buys_computer = “yes”
IF age = “>40” AND credit_rating = “excellent” THEN buys_computer = “no”
IF age = “>40” AND credit_rating = “fair” THEN buys_computer = “yes”
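A short Python sketch (mine) of this path-walking idea, again using the nested-dict tree representation from the earlier BuildDT sketch; each root-to-leaf path becomes one IF-THEN rule. The tree literal below simply encodes the buys_computer tree shown earlier.

```python
def extract_rules(tree, conditions=()):
    """Yield (conditions, class_label) pairs, one per root-to-leaf path."""
    if "leaf" in tree:
        yield conditions, tree["leaf"]
        return
    for value, child in tree["children"].items():
        yield from extract_rules(child, conditions + ((tree["attr"], value),))

tree = {"attr": "age", "children": {
    "<=30": {"attr": "student",
             "children": {"no": {"leaf": "no"}, "yes": {"leaf": "yes"}}},
    "31..40": {"leaf": "yes"},
    ">40": {"attr": "credit_rating",
            "children": {"excellent": {"leaf": "no"}, "fair": {"leaf": "yes"}}},
}}

for conds, label in extract_rules(tree):
    antecedent = " AND ".join(f'{a} = "{v}"' for a, v in conds) or "TRUE"
    print(f'IF {antecedent} THEN buys_computer = "{label}"')
```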
Classification in Large Databases
• Classification—a classical problem extensively studied by
statisticians and machine learning researchers
• Scalability: Classifying data sets with millions of examples and
hundreds of attributes with reasonable speed
• Why decision tree induction in data mining?
– relatively faster learning speed (than other classification
methods)
– convertible to simple and easy to understand classification
rules
– can use SQL queries for accessing databases
– comparable classification accuracy with other methods
Scalable Decision Tree Induction
Methods in Data Mining Studies
• SLIQ (EDBT’96 — Mehta et al.)
– builds an index for each attribute and only class list and the
current attribute list reside in memory
• SPRINT (VLDB’96 — J. Shafer et al.)
– constructs an attribute list data structure
• PUBLIC (VLDB’98 — Rastogi & Shim)
– integrates tree splitting and tree pruning: stop growing the
tree earlier
• RainForest (VLDB’98 — Gehrke, Ramakrishnan & Ganti)
– separates the scalability aspects from the criteria that
determine the quality of the tree
– builds an AVC-list (attribute, value, class label)
Drawbacks
• The trees discussed here use axis-parallel splits (each test involves a single attribute).
• For continuous-valued attributes, cut points can be found,
– or the attribute can be discretized (as CART does).