Decision Tree

A Decision Tree is a supervised learning algorithm used for both classification and regression
tasks. It works by recursively splitting the dataset into subsets based on feature values, creating a
tree-like model of decisions. The key goal of a decision tree is to split data in such a way that
each branch represents a likely outcome based on feature conditions, leading to a prediction or
decision at each leaf node.

Key Components of a Decision Tree:

1. Root Node:
o This is the starting point of the decision tree, representing the entire dataset. It
contains the feature that provides the best split (based on some criterion, like Gini
impurity or entropy for classification, or variance for regression).
2. Internal Nodes:
o These are the points where the dataset is split based on a certain feature. The
feature used for splitting is chosen according to a specific splitting criterion, such
as maximizing information gain or minimizing Gini impurity.
3. Branches:
o The branches are the results of the feature test at each node. They represent
different possible values or ranges of a feature, leading to further subdivisions.
4. Leaf Nodes (Terminal Nodes):
o These are the end nodes of the tree where no further splitting occurs. Each leaf
node represents a class label (for classification tasks) or a continuous value (for
regression tasks).
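
These components can be read directly off a trained model. The snippet below is a minimal sketch, assuming scikit-learn is installed; the iris dataset is used only as convenient sample data.

# Minimal sketch (assumes scikit-learn): fit a small tree and inspect its components.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

t = clf.tree_                                   # low-level tree structure
print("Root node tests:", iris.feature_names[t.feature[0]])
print("Tree depth:", clf.get_depth())
print("Number of leaf (terminal) nodes:", clf.get_n_leaves())

# Leaf nodes have no children (children_left == -1); all other nodes are
# internal nodes that test a feature, and each outgoing edge is a branch.
is_leaf = t.children_left == -1
print("Internal nodes:", int((~is_leaf).sum()), "| Leaf nodes:", int(is_leaf.sum()))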

How a Decision Tree Works:


1. Feature Selection:
o The algorithm begins by selecting the feature that best divides the data. This is
done by evaluating all features using a criterion like Gini impurity, information
gain (for classification), or reduction in variance (for regression).
o For classification, the goal is to select features that produce the purest nodes
(where most data points in a node belong to one class).
2. Recursive Splitting:
o After selecting the best feature for the root node, the algorithm splits the dataset
into smaller subsets and continues the process recursively for each branch. At
each node, it selects the feature that best splits the data in that particular subset.
o This recursive process continues until the stopping criteria are met, which could
be reaching a certain tree depth, a minimum number of samples in a node, or if
further splitting does not improve the model significantly.
3. Stopping Criteria:
o To prevent overfitting, trees are usually not grown indefinitely. Some common
stopping criteria include:
 Maximum depth: Limits the number of levels in the tree.
 Minimum samples per leaf: Ensures that leaf nodes have a minimum
number of samples.
 Minimum impurity decrease: Stops splitting when the improvement in
impurity is below a threshold.
4. Prediction:
o For classification: Once the tree is fully constructed, predictions are made by
following the splits from the root node down to a leaf node. The class label at the
leaf node is the predicted class.
o For regression: The prediction is the average of the target values at the leaf node.
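
A minimal end-to-end sketch of these steps, assuming scikit-learn is available (the synthetic data and parameter values are purely illustrative): the stopping criteria are passed as constructor arguments, and prediction routes each sample from the root down to a leaf.

# Sketch only: train a tree with explicit stopping criteria, then predict.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

clf = DecisionTreeClassifier(
    criterion="gini",             # feature selection criterion ("entropy" also works)
    max_depth=4,                  # stopping criterion: maximum depth
    min_samples_leaf=10,          # stopping criterion: minimum samples per leaf
    min_impurity_decrease=0.001,  # stopping criterion: minimum impurity decrease
    random_state=42,
)
clf.fit(X_train, y_train)         # recursive splitting happens inside fit()

# Prediction: each sample follows the splits from the root down to a leaf,
# and the majority class stored at that leaf is returned.
print("Predicted classes for 5 samples:", clf.predict(X_test[:5]))
print("Test accuracy:", round(clf.score(X_test, y_test), 3))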

Splitting Criteria:
1. Gini Impurity (Classification):
o Gini impurity measures how often a randomly chosen element from the subset would be incorrectly classified if it were labeled at random according to the distribution of labels in that subset. Lower values indicate purer nodes.

2. Information Gain/Entropy (Classification):
o Entropy measures the amount of uncertainty or randomness in the data. The goal is to reduce entropy with each split.
o Information Gain is calculated as the difference between the entropy of the parent node and the weighted sum of the entropies of the child nodes.
3. Mean Squared Error (MSE) for Regression:
o In regression trees, splits are made by minimizing the mean squared error, which
measures the average squared difference between the actual target values and the
predicted values.
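
The three criteria can be written down directly from these definitions. The functions below are a small sketch using only NumPy; the example label arrays are made up to show typical values.

# Sketch of the splitting criteria above (NumPy only; example data are illustrative).
import numpy as np

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """Entropy: minus the sum of p * log2(p) over the class proportions p."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    """Entropy of the parent minus the weighted entropies of the two children."""
    n = len(parent)
    child_entropy = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - child_entropy

def mse(targets):
    """Mean squared error of a node that predicts its mean target value (regression)."""
    targets = np.asarray(targets, dtype=float)
    return np.mean((targets - targets.mean()) ** 2)

parent = np.array([0, 0, 0, 1, 1, 1])                     # mixed node
left, right = np.array([0, 0, 0]), np.array([1, 1, 1])    # a perfect split
print(gini(parent))                           # 0.5
print(entropy(parent))                        # 1.0
print(information_gain(parent, left, right))  # 1.0 (the split removes all uncertainty)
print(mse([3.0, 5.0, 7.0]))                   # ~2.67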

Advantages of Decision Trees:

1. Interpretability: Decision trees are easy to interpret and visualize. You can follow the
decision process step-by-step by reading the tree structure.
2. Handles Both Types of Data: Decision trees can handle both numerical and categorical
data.
3. Non-linear Relationships: They can model non-linear relationships between features
and target variables.
4. No Need for Data Normalization: Unlike some algorithms, decision trees don’t require
feature scaling or normalization.

Disadvantages of Decision Trees:

1. Overfitting: If not controlled, decision trees can become too complex and overfit the
training data, meaning they will perform poorly on unseen data. This is why pruning or
setting maximum depth constraints is important.
2. Instability: Small variations in the data can lead to a completely different tree. Decision
trees are sensitive to noise in the data.
3. Biased towards features with many unique values: Splitting criteria such as information gain tend to favour features with a large number of distinct values, so the algorithm may choose them for splits even when they are not very informative.

Applications of Decision Trees


 Business Decision Making: Used in strategic planning and resource allocation.
 Healthcare: Assists in diagnosing diseases and suggesting treatment plans.
 Finance: Helps in credit scoring and risk assessment.
 Marketing: Used to segment customers and predict customer behavior.

Pruning:
To avoid overfitting, pruning is used to reduce the size of the tree. There are two main types:

1. Pre-pruning (Early Stopping): The tree construction is stopped early based on certain
conditions (maximum depth, minimum number of samples in a node, etc.).
2. Post-pruning: The tree is fully grown, and then nodes are removed if they do not provide
significant information.
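
As a sketch (assuming scikit-learn), pre-pruning corresponds to growth constraints passed when the tree is built, while post-pruning can be done with cost-complexity pruning after the tree is fully grown. The parameter values below are illustrative; in practice they would be tuned, for example by cross-validation.

# Sketch of pre- vs. post-pruning on a standard dataset (values are illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Pre-pruning (early stopping): constrain the tree while it is being grown.
pre_pruned = DecisionTreeClassifier(max_depth=4, min_samples_leaf=20,
                                    random_state=0).fit(X, y)

# Post-pruning: grow the tree fully, then prune with cost-complexity pruning.
# A larger ccp_alpha removes more subtrees.
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)
alphas = full_tree.cost_complexity_pruning_path(X, y).ccp_alphas
post_pruned = DecisionTreeClassifier(ccp_alpha=alphas[len(alphas) // 2],
                                     random_state=0).fit(X, y)

print("Leaves - full:", full_tree.get_n_leaves(),
      "| pre-pruned:", pre_pruned.get_n_leaves(),
      "| post-pruned:", post_pruned.get_n_leaves())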

Use Cases of Decision Trees:

• Classification tasks: Identifying whether an email is spam or not, or determining whether a patient has a particular disease.
• Regression tasks: Predicting house prices, stock prices, or other continuous values.

Example:

Consider a dataset where we want to classify whether a person will buy a car based on their
income and age. The decision tree might split first on the person's income (high vs. low) and then
further split on age. Each split reduces the uncertainty, eventually leading to a prediction (e.g.,
will buy or will not buy).

                 [Income]
                /        \
             High         Low
               |             \
             [Age]            No
            /      \
      Age > 30    Age <= 30
          |            |
         Yes           No
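
The same toy example can be reproduced in code. The sketch below assumes scikit-learn and pandas; the eight rows and the exact thresholds are invented, but the printed rules should resemble the hand-drawn tree above.

# Sketch of the car-purchase example (invented data; assumes scikit-learn and pandas).
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "income": [80, 95, 85, 90, 30, 28, 35, 25],   # in thousands (high vs. low)
    "age":    [45, 38, 52, 26, 44, 35, 51, 23],
    "buys":   [ 1,  1,  1,  0,  0,  0,  0,  0],   # 1 = will buy, 0 = will not buy
})

clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(data[["income", "age"]], data["buys"])

# Print the learned rules: first a split on income, then (for high income) on age.
print(export_text(clf, feature_names=["income", "age"]))

new_people = pd.DataFrame({"income": [90, 27], "age": [40, 27]})
print(clf.predict(new_people))   # expected: [1 0] -> will buy, will not buy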

In this simple tree:

• The root node splits on "Income."
• The internal nodes make further decisions based on age.
• The leaf nodes represent the final classification (Yes or No).

Application of Decision Tree Algorithms in Real-World Domains
1. Healthcare

In healthcare, decision tree algorithms assist in diagnosis, treatment planning, and predicting
patient outcomes. Medical data often involve complex relationships between symptoms,
diagnoses, and treatments, making decision trees a suitable tool for analyzing these relationships.

 Example: Decision trees can be used to predict whether a patient is at risk of developing a certain
condition, such as diabetes or heart disease, based on factors like age, BMI, blood pressure, and
family history. The tree splits patients into groups according to their health parameters, helping
doctors determine a treatment plan. For instance, in cancer diagnosis, decision trees can assist in
classifying whether a tumor is benign or malignant using features like size, shape, and biopsy
results.

2. Finance

In finance, decision tree algorithms are used for credit risk analysis, fraud detection, and
investment decisions. The structured nature of financial data and the need to make clear, rule-
based decisions align well with decision trees.

 Example: Banks use decision trees to assess the creditworthiness of loan applicants. By
analyzing factors like income, employment status, credit score, and past repayment
history, a decision tree can classify applicants into categories such as high-risk, moderate-
risk, or low-risk. This allows banks to make informed decisions on whether to approve a
loan, set interest rates, or require additional documentation.
 Fraud detection is another critical application. Decision trees can flag suspicious
transactions by analyzing patterns such as unusually large withdrawals, sudden
international transactions, or deviations from a customer’s normal spending habits. Each
suspicious characteristic can form a branch in the decision tree, leading to a final decision
on whether a transaction is fraudulent or legitimate.
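
As a rough sketch of the credit-scoring idea (assuming scikit-learn and pandas; the applicant features, categories, and values below are invented for illustration), a tree can map applicant attributes to a risk category:

# Toy credit-risk classifier; not a real scoring model.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

applicants = pd.DataFrame({
    "income":          [25, 40, 85, 60, 30, 120, 45, 75],    # thousands per year
    "credit_score":    [520, 610, 740, 680, 550, 790, 600, 720],
    "employed":        [0, 1, 1, 1, 0, 1, 1, 1],              # 1 = currently employed
    "missed_payments": [4, 2, 0, 1, 5, 0, 3, 0],
    "risk":            ["high", "moderate", "low", "moderate",
                        "high", "low", "moderate", "low"],
})

features = ["income", "credit_score", "employed", "missed_payments"]
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(applicants[features], applicants["risk"])

# Classify a new applicant into a high-, moderate-, or low-risk category.
new_applicant = pd.DataFrame({"income": [55], "credit_score": [640],
                              "employed": [1], "missed_payments": [1]})
print(clf.predict(new_applicant))   # one of ['high'], ['moderate'], ['low']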

3. Marketing

In marketing, decision tree algorithms are used to segment customers, personalize marketing
campaigns, and predict customer behavior. The ability of decision trees to classify customers
based on their demographics, behavior, and preferences makes them a valuable tool for
marketers.

 Example: A company may use a decision tree to segment its customer base based on
purchasing patterns, age, and engagement with promotional campaigns. For instance, a
decision tree might identify a group of customers who are likely to respond to discount
offers and another group that prefers loyalty programs. This allows the company to tailor
its marketing strategies for different customer segments, improving conversion rates and
customer satisfaction.
 Customer churn prediction is another common use. A decision tree can analyze past
customer behavior, such as purchase frequency and service complaints, to predict which
customers are likely to leave and why. This helps businesses implement retention
strategies before customers churn.
