Bagging and Boosting
Roy Ian MSc(UOM), BSc. Eng (Hons.)(UOM), AMIESL, CCNP
Traditional modeling
• A single model is selected and trained with the dataset
• The full dataset is used to train the model
• All the features, or the features selected from the feature extraction process, will be used
• But,
• There can be overfitting issues - variance
• There can be underfitting issues - bias
• Final performance might not be as expected
Traditional modeling
• So why don't we combine multiple models?
General idea
[Diagram: the training data S is resampled into multiple data sets S1, S2, …, Sn; a classifier C1, C2, …, Cn is trained on each; the individual classifiers are combined into a single classifier H.]
Building ensemble classifiers
• Basic idea:
Build different “experts”, and let them vote
• Advantages:
• Improve predictive performance
• Other types of classifiers can be directly included
• Easy to implement
• Not much parameter tuning
• Disadvantages:
• The combined classifier is not so transparent (black box)
• Not a compact representation
Combining multiple models
1. Simple (unweighted) votes
• Standard choice
2. Weighted votes
• e.g., weight by tuning-set accuracy
3. Train a combining function
1. Prone to overfitting?
2. “Stacked generalization” (Wolpert)
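As a concrete illustration of options 1 and 2 above, here is a minimal sketch using scikit-learn's VotingClassifier; the dataset, base classifiers, and vote weights are illustrative assumptions, not part of the original slides:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

base = [("lr", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
        ("dt", DecisionTreeClassifier(max_depth=3))]

# 1. Simple (unweighted) majority vote
simple_vote = VotingClassifier(estimators=base, voting="hard")

# 2. Weighted vote, e.g. weights derived from tuning-set accuracy (values here are made up)
weighted_vote = VotingClassifier(estimators=base, voting="hard", weights=[2, 1, 3])

for name, clf in [("simple", simple_vote), ("weighted", weighted_vote)]:
    clf.fit(X, y)
    print(name, "training accuracy:", clf.score(X, y))
```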
Why do they work?
• Suppose there are 25 base classifiers
• Each classifier has error rate ε = 0.35
• Assume independence among classifiers
• Probability that the ensemble classifier makes a wrong prediction (at least 13 of the 25 classifiers err):
$$ \sum_{i=13}^{25} \binom{25}{i}\,\varepsilon^{i}\,(1-\varepsilon)^{25-i} = 0.06 $$
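A quick check of this number, using only Python's standard library:

```python
from math import comb

eps = 0.35
# Majority vote of 25 classifiers is wrong only if 13 or more individual classifiers are wrong
p_wrong = sum(comb(25, i) * eps**i * (1 - eps)**(25 - i) for i in range(13, 26))
print(round(p_wrong, 3))  # ~0.06
```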
Common ensemble techniques
1. Bagging
2. Boosting
3. Stacking
Bootstrapping
Bagging : Bootstrap aggregating
• Training
o Given a dataset S, at each iteration i, a training set Si is sampled with
replacement from S (i.e. bootstrapping)
o A classifier Ci is learned for each Si
• Classification: given an unseen sample X,
o Each classifier Ci returns its class prediction
o Each model in the ensemble votes with equal weight
o The bagged classifier H counts the votes and assigns the class with the most
votes to X
• Regression: bagging can be applied to the prediction of continuous values by taking
the average of the individual predictions.
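A minimal sketch of this train/vote procedure with scikit-learn's BaggingClassifier; the dataset and parameter values are illustrative (the default base classifier is a decision tree):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Training: each of the 50 base classifiers C_i is fit on a bootstrap sample S_i drawn
# with replacement from the training set S (the default base estimator is a decision tree)
bagged = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=0)
bagged.fit(X_train, y_train)

# Classification: each C_i votes with equal weight; predict()/score() use the majority class
print("test accuracy:", bagged.score(X_test, y_test))
```

For regression, BaggingRegressor follows the same scheme but averages the individual predictions.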
Bagging
• Bagging works because it reduces variance by voting/averaging
o In some pathological hypothetical situations the overall error might
increase
o Usually, the more classifiers the better
• Problem: we only have one dataset.
• Solution: generate new datasets of size n by bootstrapping, i.e. sampling the
original dataset with replacement
• Can help a lot if data is noisy
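The bootstrapping step itself is just sampling row indices with replacement. A minimal sketch with NumPy (the arrays are illustrative stand-ins for a real dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))    # stand-in feature matrix
y = rng.integers(0, 2, size=1000)  # stand-in labels

n = len(X)
# Draw one bootstrap replicate: n row indices sampled with replacement
idx = rng.integers(0, n, size=n)
X_boot, y_boot = X[idx], y[idx]

# Roughly 63% of the original rows appear in each replicate; the rest are "out of bag"
oob_mask = np.ones(n, dtype=bool)
oob_mask[idx] = False
print("out-of-bag fraction:", round(oob_mask.mean(), 2))
```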
When to use Bagging
• Learning algorithm is unstable: if small changes to the training set cause
large changes in the learned classifier.
• If the learning algorithm is unstable, then Bagging almost always
improves performance
• Some candidates:
1. Decision tree
2. Decision stump
3. Regression tree
4. Linear regression
5. SVMs
Random Forest
• Decision trees are individual learners that are combined. They are one of
the most popular learning methods commonly used for data exploration.
• One type of decision tree is called CART: the classification and regression
tree.
• CART uses greedy, top-down, binary recursive partitioning that divides the
feature space into sets of disjoint rectangular regions.
• Regions should be pure with respect to the response variable
• A simple model is fit in each region: majority vote for classification, a constant value
for regression.
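scikit-learn's DecisionTreeClassifier implements this CART-style partitioning. A short sketch (the dataset and depth limit are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Greedy, top-down, binary recursive partitioning into rectangular regions;
# each leaf predicts the majority class of the training points that fall in it
cart = DecisionTreeClassifier(max_depth=3, random_state=0)
cart.fit(X, y)

print(export_text(cart))  # the learned axis-aligned splits
```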
Random Forest
• Random forest (or random forests) is an ensemble classifier that consists
of many decision trees and outputs the class that is the mode of the
classes output by the individual trees.
• The term came from random decision forests, first proposed by
Tin Kam Ho of Bell Labs in 1995.
• The method combines Breiman's "bagging" idea and the random
selection of features.
Random Forest (Algorithm)
• Each tree is constructed using the following algorithm:
1. Let the number of training cases be N, and the number of variables in the classifier be M.
2. We are told the number m of input variables to be used to determine the decision at a node of
the tree; m should be much less than M.
3. Choose a training set for this tree by sampling N times with replacement from all N available
training cases (i.e. take a bootstrap sample). Use the remaining cases to estimate the error of the
tree by predicting their classes.
4. For each node of the tree, randomly choose m variables on which to base the decision at that
node. Calculate the best split based on these m variables in the training set.
5. Each tree is fully grown and not pruned (as may be done in constructing a normal tree classifier).
• For prediction, a new sample is pushed down the tree.
• It is assigned the label of the training samples in the terminal node it ends up in.
• This procedure is iterated over all trees in the ensemble, and the average vote of all
trees is reported as the random forest prediction.
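A minimal sketch of the algorithm above with scikit-learn's RandomForestClassifier; the dataset and hyperparameter values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=200,     # number of bootstrapped, fully grown trees
    max_features="sqrt",  # m variables considered at each node (m << M)
    random_state=0,
)
forest.fit(X_train, y_train)

# Prediction: each sample is pushed down every tree and the votes are aggregated
print("test accuracy:", forest.score(X_test, y_test))
```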
Features and Advantages
The advantages of random forest are:
• It is one of the most accurate learning algorithms available. For many data
sets, it produces a highly accurate classifier.
• It runs efficiently on large databases.
• It can handle thousands of input variables without variable deletion.
• It gives estimates of what variables are important in the classification.
• It generates an internal unbiased estimate of the generalization error as
the forest building progresses.
• It has an effective method for estimating missing data and maintains
accuracy when a large proportion of the data are missing.
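Two of these properties are exposed directly by scikit-learn's RandomForestClassifier: the internal (out-of-bag) estimate of generalization error and the variable-importance scores. A brief sketch on an illustrative synthetic dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_informative=3, random_state=0)

forest = RandomForestClassifier(n_estimators=300, oob_score=True, random_state=0)
forest.fit(X, y)

# Generalization estimate computed from the out-of-bag samples as the forest is built
print("OOB accuracy:", round(forest.oob_score_, 3))

# Estimates of which variables are important in the classification
for i, imp in enumerate(forest.feature_importances_):
    print(f"feature {i}: importance {imp:.3f}")
```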
Features and Advantages
• It has methods for balancing error in data sets with unbalanced class
populations.
• Generated forests can be saved for future use on other data.
• Prototypes are computed that give information about the relation between
the variables and the classification.
• It computes proximities between pairs of cases that can be used in
clustering, locating outliers, or (by scaling) give interesting views of the
data.
• The capabilities of the above can be extended to unlabeled data, leading to
unsupervised clustering, data views and outlier detection.
• It offers an experimental method for detecting variable interactions.
Disadvantages
• Random forests have been observed to overfit for some datasets with
noisy classification/regression tasks.
• For data including categorical variables with different numbers of
levels, random forests are biased in favor of those attributes with
more levels. Therefore, the variable importance scores from random
forests are not reliable for this type of data.
Tuning Random Forests
• Random forest models have multiple hyperparameters to tune:
1. The sampling scheme: number of predictors to randomly select at each split : max_features
{“auto”, “sqrt”, “log2”}, int or float, default=”auto”.
2. The total number of trees in the forest: n_estimators, int, default=100
3. The complexity of each tree: stop when a leaf has <= min_samples_leaf samples or when we
reach a certain max_depth.
4. In theory, each tree in the random forest is fully grown, but in practice this can be
computationally expensive (and add redundancy to the model); thus, imposing a minimum node
size is not unusual.
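A minimal tuning sketch over these hyperparameters with GridSearchCV; the grid values are illustrative, not a recommendation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

param_grid = {
    "max_features": ["sqrt", "log2", 0.5],  # 1. predictors sampled at each split
    "n_estimators": [100, 300],             # 2. number of trees in the forest
    "min_samples_leaf": [1, 5],             # 3. complexity of each tree
    "max_depth": [None, 10],
}

search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)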
Tuning Random Forests
• When the number of predictors is large, but the number of relevant
predictors is small, you need to set max_features to a larger number.
Why?
If the number of chosen features is small, then at each split the chance of selecting a relevant
predictor will be low, and hence most trees in the ensemble will be weak models.
Boosting
• Boosting is considered to be one of the most significant developments
in machine learning
• Finding many weak rules of thumb is easier than finding a single, highly
accurate prediction rule
• The key is in combining the weak rules
• Incremental
• Build new models that try to do better on the previous models'
misclassifications
• Can get better accuracy
• Tends to overfit
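A minimal boosting sketch with scikit-learn's AdaBoostClassifier, whose default weak learner is a decision stump; the dataset and parameter values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each new weak learner concentrates on the examples the previous ones misclassified
boost = AdaBoostClassifier(n_estimators=200, learning_rate=0.5, random_state=0)
boost.fit(X_train, y_train)
print("test accuracy:", boost.score(X_test, y_test))
```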
Boosting
[Diagram: the training sample yields a classifier C(0)(x); misclassified events are re-weighted and the next classifier C(1)(x) is trained on the re-weighted sample; re-weighting and training repeat for C(2)(x), C(3)(x), …, C(m)(x).]
The final classifier is the weighted sum over the N_Classifier members:
$$ y(x) = \sum_{i}^{N_\text{Classifier}} w_i \, C^{(i)}(x) $$
AdaBoost (Algorithm)
• AdaBoost re-weights events misclassified by the previous classifier by the factor
$$ \frac{1 - f_\text{err}}{f_\text{err}}, \qquad \text{with } f_\text{err} = \frac{\text{misclassified events}}{\text{all events}} $$
• AdaBoost also weights the classifiers using the error rate of the individual classifier, according to
$$ y(x) = \sum_{i}^{N_\text{Classifier}} \log\!\left(\frac{1 - f_\text{err}^{(i)}}{f_\text{err}^{(i)}}\right) C^{(i)}(x) $$
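A from-scratch sketch of these two update rules, using decision stumps as the weak classifiers. It mirrors the weighting on the slide rather than any particular library implementation, and it assumes labels in {-1, +1}:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    """y must be in {-1, +1}; returns the weak classifiers and their log-odds weights."""
    n = len(y)
    w = np.full(n, 1.0 / n)  # start with uniform event weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        miss = stump.predict(X) != y
        f_err = w[miss].sum() / w.sum()  # f_err = misclassified events / all events
        if f_err == 0 or f_err >= 0.5:
            break
        w[miss] *= (1 - f_err) / f_err   # re-weight misclassified events by (1 - f_err)/f_err
        w /= w.sum()
        stumps.append(stump)
        alphas.append(np.log((1 - f_err) / f_err))  # classifier weight log((1 - f_err)/f_err)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    # y(x) = sum_i log((1 - f_err_i)/f_err_i) * C_i(x), then take the sign
    score = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(score)
```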
AdaBoost (Algorithm)
[Figures: a worked AdaBoost example over rounds 1, 2, and 3, showing the re-weighting after each round.]
Popular boosting algorithms
• XGBoost
• LightGBM
• CatBoost
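These libraries all expose a scikit-learn-style interface. A brief sketch, assuming the xgboost and lightgbm packages are installed (parameter values are illustrative):

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in [XGBClassifier(n_estimators=200, learning_rate=0.1, max_depth=4),
              LGBMClassifier(n_estimators=200, learning_rate=0.1)]:
    model.fit(X_train, y_train)
    print(type(model).__name__, "test accuracy:", model.score(X_test, y_test))
```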
Comparison
[Figure: comparison of the methods using 5-fold cross validation.]
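A sketch of such a comparison with 5-fold cross-validation; the models and dataset are illustrative stand-ins for those compared in the original figure:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "single tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(n_estimators=100, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "adaboost": AdaBoostClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross validation
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```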