CSCI417
Machine Intelligence
Lecture # 3
Spring 2024
Tentative Course Topics
1. Machine Learning Basics
2. Classifying with k-Nearest Neighbors
3. Splitting datasets one feature at a time: decision trees
4. Classifying with probability theory: naïve Bayes
5. Logistic regression
6. Support vector machines
7. Model Evaluation and Improvement: Cross-validation, Grid Search, Evaluation Metrics, and Scoring
8. Ensemble learning and improving classification with the AdaBoost meta-algorithm
9. Introduction to Neural Networks - Building NNs for classification (binary/multiclass)
10. Convolutional Neural Networks (CNN)
11. Pretrained models (VGG, AlexNet, ...)
12. Machine learning pipeline and use cases
Recommending App
For a woman who works at an office, which app do we recommend?
For a man who works at a factory, which app do we recommend?
The ML question:
Between Gender and Occupation, which feature seems more decisive for predicting which app the users will download?
Recommending App
A possible decision tree for this example:
  Split on Occupation: School -> Pokemon Go; Work -> split on Gender
  Split on Gender (under Work): F -> WhatsApp; M -> Snapchat
Between a horizontal and a vertical line, which one would cut the data better?
Non-parametric Estimation
• A non-parametric model is not fixed, but its complexity
depends on the size of the training set or, rather, the
complexity of the problem inherent in the data.
• Here, a non-parametric model does not mean that the model
has no parameters; it means that the number of parameters
is not fixed and that their number can grow depending on the
size of the data or, better still, depending on the complexity of
the regularity that underlies the data.
Decision tree
• A decision tree is a hierarchical data structure implementing
the divide-and-conquer strategy.
• It is an efficient non-parametric method that can be used for
both classification and regression.
• A decision tree is also a non-parametric model in the sense that we do not assume any parametric form for the class densities, and the tree structure is not fixed a priori: the tree grows as branches and leaves are added during learning, depending on the complexity of the problem inherent in the data.
Function Approximation
Ref: https://www.seas.upenn.edu, Eric Eaton.
Sample Dataset (Will Nadal Play Tennis?)
• Columns denote features X_i
• Rows denote labeled instances ⟨x_i, y_i⟩
• Class label denotes whether a tennis game was played
Ref: https://www.seas.upenn.edu, Eric Eaton.
Decision Tree
• A possible decision tree for the data:
• Each internal node: test one attribute X_i
• Each branch from a node: selects one value for X_i
• Each leaf node: predict Y
Ref: https://www.seas.upenn.edu, Eric Eaton.
Decision Tree
• A possible decision tree for the data:
• What prediction would we make for
<outlook=sunny, temperature=hot, humidity=high, wind=weak> ?
Ref: https://www.seas.upenn.edu, Eric Eaton.
Decision Tree
• Decision trees divide the feature space into axis-parallel (hyper-)rectangles
• Each rectangular region is labeled with one label
  – or a probability distribution over labels
Ref: https://www.seas.upenn.edu, Eric Eaton.
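To see the axis-parallel structure concretely, here is a small sketch (assuming scikit-learn, which these slides do not prescribe): every test printed by export_text is a single-feature threshold, so each decision region is a rectangle in feature space.

```python
# Hypothetical illustration: fit a small tree on 2-D points and print its rules.
# Every test has the form "feature <= threshold", i.e. an axis-parallel cut.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[1, 1], [1, 4], [2, 3], [6, 1], [7, 2], [8, 5]]  # toy 2-D points
y = [0, 0, 0, 1, 1, 1]                                # two classes

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["x1", "x2"]))
# The output is a nest of "x1 <= ..." / "x2 <= ..." tests: rectangles in (x1, x2) space.
```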
Stages of Machine Learning
Given: labeled training data X, Y = {⟨x_i, y_i⟩}, i = 1..n
• Assumes each x_i ~ D(X) with y_i = f_target(x_i)
Train the model:
model ← classifier.train(X, Y)
Apply the model to new data:
• Given: new unlabeled instance x ~ D(X)
y_prediction ← model.predict(x)
Ref: https://www.seas.upenn.edu, Eric Eaton.
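The train/apply stages above as a minimal runnable sketch; the use of scikit-learn's DecisionTreeClassifier and the toy data are assumptions for illustration, not part of the slides.

```python
from sklearn.tree import DecisionTreeClassifier

# Labeled training data X, Y = {<x_i, y_i>}, i = 1..n
X = [[0, 0], [0, 1], [1, 0], [1, 1]]   # feature vectors x_i
Y = [0, 1, 1, 0]                       # labels y_i

# Train the model:  model <- classifier.train(X, Y)
model = DecisionTreeClassifier().fit(X, Y)

# Apply the model to a new unlabeled instance x ~ D(X)
x_new = [[1, 0]]
y_prediction = model.predict(x_new)    # y_prediction <- model.predict(x)
print(y_prediction)                    # e.g. [1]
```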
Top-down learning
node ← root of decision tree
Main loop:
1. A ← the "best" decision attribute for the next node.
2. Assign A as the decision attribute for node.
3. For each value of A, create a new descendant of node.
4. Sort the training examples to the leaf nodes.
5. If the training examples are perfectly classified, stop; else recurse over the new leaf nodes.
How do we choose which attribute is best? (A compact sketch of this loop follows below.)
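A compact Python sketch of this main loop (ID3-style), using information gain as the "best attribute" criterion introduced on the following slides. The dict-based example format and the helper names are illustrative assumptions, not the lecture's reference code.

```python
from collections import Counter
from math import log2

def entropy(examples):
    """Entropy of the class labels in a list of {attribute: value, 'label': y} dicts."""
    n = len(examples)
    counts = Counter(e['label'] for e in examples)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def information_gain(examples, attribute):
    """Entropy(parent) minus the weighted entropy of the children after splitting."""
    n = len(examples)
    weighted = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        weighted += (len(subset) / n) * entropy(subset)
    return entropy(examples) - weighted

def build_tree(examples, attributes):
    labels = {e['label'] for e in examples}
    if len(labels) == 1:                                # perfectly classified: stop
        return labels.pop()
    if not attributes:                                  # nothing left to split on: majority label
        return Counter(e['label'] for e in examples).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a))   # step 1
    tree = {best: {}}                                   # step 2: best attribute at this node
    for value in {e[best] for e in examples}:           # step 3: one branch per value
        subset = [e for e in examples if e[best] == value]                # step 4: sort examples
        rest = [a for a in attributes if a != best]
        tree[best][value] = build_tree(subset, rest)    # step 5: recurse
    return tree
```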
Choosing the Best Attribute
Key problem: choosing which attribute to split a given set of examples
• Some possibilities are:
– Random: Select any attribute at random
– Least-Values: Choose the attribute with the smallest number of possible values
– Most-Values: Choose the attribute with the largest number of possible values
– Max-Gain: Choose the attribute that has the largest expected information gain
• i.e., attribute that results in smallest expected size of subtrees rooted at its children
• The ID3 algorithm uses the Max-Gain method of selecting the best
attribute
Information Gain
Impurity/Entropy (informal)
– Measures the level of impurity in a group of examples
Entropy can be roughly thought of as how much variance the data has.
Entropy: a common way to measure impurity
Entropy
Entropy = 0: all examples belong to the same class (minimum impurity).
  entropy = -1 * log2(1) = 0
Entropy = 1: examples are evenly split between the two classes (maximum impurity).
  entropy = -0.5 * log2(0.5) - 0.5 * log2(0.5) = 1
https://analyticsindiamag.com/a-complete-guide-to-decision-tree-split-using-information-gain/
2-Class Cases:
Entropy: H(X) = - Σ (over i = 1..n) P(x = i) * log2 P(x = i)
• What is the entropy of a group in which all examples belong to the same class?
  – entropy = -1 * log2(1) = 0 (minimum impurity)
  – not a good training set for learning
• What is the entropy of a group with 50% in either class?
  – entropy = -0.5 * log2(0.5) - 0.5 * log2(0.5) = 1 (maximum impurity)
  – a good training set for learning
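A quick numeric check of the two extreme cases above:

```python
from math import log2

print(-1 * log2(1))                        # -0.0, i.e. zero: pure group, minimum impurity
print(-0.5 * log2(0.5) - 0.5 * log2(0.5))  # 1.0: 50/50 group, maximum impurity
```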
Information Gain
https://analyticsindiamag.com/a-complete-guide-to-decision-tree-split-using-information-gain/
Information Gain of feature WIND
Information Gain of feature HUMIDITY
Information Gain of feature TEMP
Information Gain of feature OUTLOOK
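These per-feature gains can be reproduced from class counts. The sketch below assumes the standard 14-example PlayTennis table (9 Yes / 5 No) used in the referenced Eric Eaton slides; if the lecture's table differs, the counts (and gains) would change.

```python
from math import log2

def entropy(counts):
    """Entropy of a class-count list, e.g. [9, 5] for 9 Yes / 5 No."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)

def information_gain(parent, children):
    """IG = H(parent) - sum_k (n_k / n) * H(child_k), all given as class-count lists."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = [9, 5]  # assumed PlayTennis totals: 9 Yes, 5 No
splits = {
    "Outlook":     [[2, 3], [4, 0], [3, 2]],  # Sunny, Overcast, Rain
    "Humidity":    [[3, 4], [6, 1]],          # High, Normal
    "Wind":        [[6, 2], [3, 3]],          # Weak, Strong
    "Temperature": [[2, 2], [4, 2], [3, 1]],  # Hot, Mild, Cool
}
for feature, children in splits.items():
    print(feature, round(information_gain(parent, children), 3))
# Outlook has the largest gain (~0.247), so ID3 would split on it first.
```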
After several iterations
Information Gain
Example: a family of 10 members, where 5 members are pursuing their studies and the rest have completed or never pursued them.

Node A (mixed):                     Node B (pure):
  % pursuing = 50%                    % pursuing = 0%
  % not pursuing = 50%                % not pursuing = 100%

First, calculate the entropy of each node:
  Node A: Entropy = -(0.5) * log2(0.5) - (0.5) * log2(0.5) = 1
  Node B: Entropy = -(0) * log2(0) - (1) * log2(1) = 0   (taking 0 * log2(0) = 0)

If a node contains only one class (the node is pure), its entropy is zero, and by the information gain formula the gain for such a node is higher. The higher the entropy, the lower the information gain, and the less pure the node.
https://analyticsindiamag.com/a-complete-guide-to-decision-tree-split-using-information-gain/
Entropy for Parent Node
Now, according to the performance of the students:
  Students = 20
  Curricular activity = 10
  No curricular activity = 10
  Entropy = -(0.5) * log2(0.5) - (0.5) * log2(0.5) = 1

Entropy for Child Nodes
  Child 1: Students = 14
    Curricular activity = 8/14 = 57%
    No curricular activity = 6/14 = 43%
    Entropy = -(0.43) * log2(0.43) - (0.57) * log2(0.57) = 0.98
  Child 2: Students = 6
    Curricular activity = 2/6 = 33%
    No curricular activity = 4/6 = 67%
    Entropy = -(0.33) * log2(0.33) - (0.67) * log2(0.67) = 0.91

Having calculated the entropy for the parent and child nodes, the weighted sum of the child entropies gives the weighted entropy of the split.
Weighted Entropy: (14/20) * 0.98 + (6/20) * 0.91 = 0.959
https://analyticsindiamag.com/a-complete-guide-to-decision-tree-split-using-information-gain/
Split based on the class
Entropy for parent node:
  Entropy = -(0.5) * log2(0.5) - (0.5) * log2(0.5) = 1
Entropy for child nodes:
  Class 11th: Entropy = -(0.8) * log2(0.8) - (0.2) * log2(0.2) = 0.722
  Class 12th: Entropy = -(0.2) * log2(0.2) - (0.8) * log2(0.8) = 0.722
Weighted Entropy: (10/20) * 0.722 + (10/20) * 0.722 = 0.722
Calculation of Information Gain
• Entropy measures the amount of uncertainty due to a process or a given random variable.
• Information gain measures how much that uncertainty is reduced by splitting a node, and is used to decide which split to make.
• Information Gain = Entropy(parent) - Weighted Entropy(children). Here the parent entropy is 1, so Information Gain = 1 - Weighted Entropy. A higher information gain means more entropy removed, which is what we want.

Split                      Weighted Entropy    Information Gain
Performance of the class   0.959               0.041
Class (11th / 12th)        0.722               0.278

The split on class yields the higher information gain, so it is the better split.
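A quick check of the two candidate splits above, using the counts from the worked example. Note that the exact weighted entropy of the performance split is 0.965 (gain 0.035); the slide's 0.959 / 0.041 come from using the child entropies rounded to 0.98 and 0.91.

```python
from math import log2

def entropy(counts):
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)

def weighted_entropy(children):
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * entropy(child) for child in children)

parent = [10, 10]                     # 10 with curricular activity, 10 without
by_performance = [[8, 6], [2, 4]]     # children of the "performance of the class" split
by_class = [[8, 2], [2, 8]]           # children of the 11th / 12th class split

for name, children in [("performance", by_performance), ("class", by_class)]:
    w = weighted_entropy(children)
    print(name, round(w, 3), round(entropy(parent) - w, 3))
# performance 0.965 0.035
# class 0.722 0.278
```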
Which Tree Should We Output?
Overfitting in DTs
Avoiding overfitting
How can we avoid overfitting?
• Stop growing when data split is not statistically significant
• Acquire more training data
• Remove irrelevant attributes (manual process – not always possible)
• Grow full tree, then post-prune
How to select “best” tree:
• Measure performance over training data
• Measure performance over separate validation data set
• Add complexity penalty to performance measure
(heuristic: simpler is better)
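In practice, several of these ideas map onto decision-tree hyperparameters. A hedged sketch with scikit-learn (an assumed library choice): pre-pruning via max_depth / min_samples_leaf, and post-pruning via cost-complexity pruning with the pruning strength chosen on a validation set (similar in spirit to, though not identical to, reduced-error pruning).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Pre-pruning: stop growing early via depth / leaf-size limits.
pre = DecisionTreeClassifier(max_depth=4, min_samples_leaf=5, random_state=0)
pre.fit(X_train, y_train)

# Post-pruning: enumerate candidate pruning strengths for the full tree, then
# keep the one that scores best on the held-out validation set.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)
alphas = [a for a in path.ccp_alphas if a >= 0.0]   # guard against tiny negative round-off
best_alpha = max(
    alphas,
    key=lambda a: DecisionTreeClassifier(ccp_alpha=a, random_state=0)
    .fit(X_train, y_train)
    .score(X_val, y_val),
)
post = DecisionTreeClassifier(ccp_alpha=best_alpha, random_state=0).fit(X_train, y_train)
print(pre.score(X_val, y_val), post.score(X_val, y_val))
```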
Reduced-error pruning
Split training data further into training and validation sets
Grow tree based on training set
Do until further pruning is harmful:
1. Evaluate impact on validation set of pruning each
possible node (plus those below it)
2. Greedily remove the node that most improves
validation set accuracy
Effect of Reduced-Error Pruning
The tree is pruned back to the point (the red line in the figure) where it gives more accurate results on the test data.
Decision Tree in summary
• Representation: decision trees
• Bias: prefer small decision trees
• Search algorithm: greedy
• Heuristic function:
information gain or information
content or others
• Overfitting / pruning
Decision Tree PROS & CONS
Pros:
• Fast and simple to implement.
• DTs work well with any type of nonlinearly separable data.
• A "path" through the possibilities, with alternatives, always leads toward a desirable outcome.
Cons:
• Prone to over-fitting, especially when many features are to be considered.
• While DTs visualize the decisions to be made, they condense a complex process into discrete steps (which may be a good or a bad thing).
• Non-incremental (batch method).