Random Forests
Deep Dive into Random Forests
How Random Forests Work
Random Forests are an ensemble learning method that builds multiple decision trees and merges them together to get a more accurate and stable prediction. The basic idea is to combine the output of multiple (randomly created) decision trees to generate a single result.
1. Training Process:
Bootstrap Sampling: Each tree is trained on a different bootstrapped sample of the original dataset; for each tree, a subset of the training data is randomly chosen with replacement.
Decision Tree Construction: Each tree is grown to its full extent without pruning. During the construction of each tree, a random subset of features is selected at each split point to determine the best split.
Aggregation of Predictions: For classification tasks, the final prediction is the majority vote of the individual trees. For regression tasks, it is the average of the predictions from all the individual trees.
2. Feature Randomness:
Random Feature Selection: At each split in the tree, a random subset of the features is considered. This randomness helps to create a diverse set of trees and ensures that the ensemble is not overly dependent on any single feature.
Reduces Correlation: By using different subsets of features, the correlation between the individual trees is reduced, which improves the overall performance of the Random Forest. (A short sketch of the whole procedure follows this list.)
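The following is a minimal, illustrative sketch of this procedure, written from scratch around scikit-learn's DecisionTreeClassifier and a synthetic dataset (the parameter values are arbitrary choices for illustration); in practice you would simply use RandomForestClassifier, which bundles bootstrap sampling, per-split feature randomness, and vote aggregation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data, purely for illustration.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n_trees = 25
trees = []

for _ in range(n_trees):
    # Bootstrap sampling: draw len(X_train) rows with replacement.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    X_boot, y_boot = X_train[idx], y_train[idx]

    # Each tree is grown fully (no pruning); max_features="sqrt" makes the
    # tree consider only a random subset of features at every split.
    tree = DecisionTreeClassifier(max_features="sqrt")
    tree.fit(X_boot, y_boot)
    trees.append(tree)

# Aggregation: majority vote across the individual trees (ties go to class 1).
all_preds = np.stack([t.predict(X_test) for t in trees])   # shape (n_trees, n_test)
majority_vote = (all_preds.mean(axis=0) >= 0.5).astype(int)

print("Ensemble accuracy:", (majority_vote == y_test).mean())
```

For regression, the only changes would be a DecisionTreeRegressor base learner and averaging the predictions instead of voting.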
Building and Tuning Random Forests
Practical Considerations
1. Data Preprocessing:
Handling Missing Values: Some Random Forest implementations can handle missing values internally, but it is still good practice to deal with them during preprocessing.
Feature Scaling: Not strictly necessary, since Random Forests are not sensitive to the scale of the features, but it can be beneficial for other preprocessing steps.
2. Training the Model:
Number of Trees (n_estimators): The number of trees in the forest. More trees generally lead to better performance, but at the cost of increased computation time.
Number of Features (max_features): The number of features to consider when looking for the best split. This can be set as a fixed number or as a fraction of the total number of features. (A minimal scikit-learn sketch follows this list.)
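A rough scikit-learn sketch of these two steps (the dataset, imputation strategy, and parameter values below are arbitrary choices for illustration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# A small benchmark dataset; it has no missing values, so the imputer is
# included only to show where missing-value handling would fit.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# No feature scaling is added: trees split on thresholds, so the forest is
# insensitive to the scale of the features.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(
        n_estimators=200,      # number of trees
        max_features="sqrt",   # features considered at each split
        random_state=42,
        n_jobs=-1,             # use all available CPU cores
    ),
)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```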
Hyperparameter Tuning
1. Key Hyperparameters:
n_estimators: The number of trees in the forest. A larger number of trees generally leads to better performance but also increases the computational cost.
max_features: The maximum number of features considered for splitting a node. Can be a fixed number or a fraction of the total features.
max_depth: The maximum depth of each tree. Deeper trees can capture more detail but are more likely to overfit.
min_samples_split: The minimum number of samples required to split an internal node. Higher values prevent the model from learning overly specific patterns (overfitting).
min_samples_leaf: The minimum number of samples required at a leaf node. A higher number makes the model more robust by smoothing its predictions.
bootstrap: Whether bootstrap samples are used when building trees. If False, the whole dataset is used to build each tree.
2. Grid Search and Cross-Validation:
Grid Search: A systematic way to work through multiple combinations of hyperparameter values, cross-validating each combination to determine which gives the best performance (see the sketch below).
Cross-Validation: Used to assess how the model will generalize to an independent dataset, helping to avoid overfitting.
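A minimal sketch of grid search with cross-validation over the hyperparameters listed above (the grid values and dataset are arbitrary choices for illustration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Candidate values for the key hyperparameters (illustrative, not tuned).
param_grid = {
    "n_estimators": [100, 300],
    "max_features": ["sqrt", 0.5],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 10],
    "min_samples_leaf": [1, 4],
}

# GridSearchCV runs 5-fold cross-validation for every combination in the grid
# and keeps the combination with the best average score.
search = GridSearchCV(
    RandomForestClassifier(random_state=0, n_jobs=-1),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

Note that the grid grows multiplicatively with each added hyperparameter, so large grids become expensive; RandomizedSearchCV is a common cheaper alternative.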
Performance Evaluation
1. Metrics for Classification:
Accuracy: The fraction of correctly classified instances.
Precision: The fraction of predicted positives that are actually positive, TP / (TP + FP).
Recall: The fraction of actual positives that are correctly identified, TP / (TP + FN).
F1 Score: The harmonic mean of precision and recall, providing a balance between the two.
2. Metrics for Regression:
Mean Squared Error (MSE): The average of the squared errors, giving higher weight to larger errors.
Mean Absolute Error (MAE): The average of the absolute errors, providing a linear score that does not over-penalize large errors.
R-squared: The proportion of the variance in the dependent variable that is predictable from the independent variables.
3. Out-of-Bag (OOB) Error Estimate:
Definition: An internal validation method where each tree is evaluated on the data not used in its bootstrap sample.
Purpose: Provides an unbiased estimate of the generalization error without the need for a separate validation set.
4. Confusion Matrix:
Definition: A table used to evaluate the performance of a classification algorithm by comparing actual vs. predicted classifications.
Components: True Positive (TP), True Negative (TN), False Positive (FP), False Negative (FN).
5. Receiver Operating Characteristic (ROC) Curve and AUC:
ROC Curve: A graphical representation of the true positive rate vs. the false positive rate at various threshold settings.
AUC (Area Under the Curve): A single scalar value summarizing the ROC curve, used to compare the performance of different models. (The sketch after this list computes these evaluation quantities.)
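A short sketch that computes these evaluation quantities with scikit-learn (dataset and parameter values are arbitrary choices for illustration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# oob_score=True scores each tree on the samples left out of its bootstrap
# sample, giving the out-of-bag estimate described above.
clf = RandomForestClassifier(n_estimators=300, oob_score=True, random_state=0)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
y_prob = clf.predict_proba(X_test)[:, 1]   # probability of the positive class

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))   # TP / (TP + FP)
print("Recall   :", recall_score(y_test, y_pred))      # TP / (TP + FN)
print("F1 score :", f1_score(y_test, y_pred))
print("ROC AUC  :", roc_auc_score(y_test, y_prob))
print("Confusion matrix (rows = actual, columns = predicted):")
print(confusion_matrix(y_test, y_pred))
print("OOB score estimated during training:", clf.oob_score_)
```

For regression, the analogous calls are mean_squared_error, mean_absolute_error, and r2_score from sklearn.metrics, used with a RandomForestRegressor.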
Summary
Random Forests are a powerful and versatile machine learning technique that improves accuracy and robustness by combining multiple decision trees.
Feature Randomness and Bootstrap Sampling are key to reducing variance and preventing overfitting.
Hyperparameter Tuning and Performance Evaluation are essential to optimizing and assessing the model.
Random Forests can be applied to both classification and regression tasks and are widely used in various domains for their effectiveness and ease of use.