IDML Presentation

The document presents an overview of the concepts of training, validation, and test datasets in machine learning, emphasizing their distinct roles in model development. The training set is used to teach the model, the validation set helps in tuning hyperparameters, and the test set evaluates the model's performance on unseen data. It also discusses the importance of data splitting, typically following the 80:20 rule for effective model training and testing.

Uploaded by

vishwasuded265


PRESENTATION ON

NOTION OF TRAINING-VALIDATION AND TESTING

BY:
• Pn Sameera-22BCAR261
• K.pooja-22BCAR262
• Priyadharshini-22BCAR263
• Vishwas-22BCAR265
INTRODUCTION
• Understanding the concepts of training data, validation data, and test
data is important in machine learning. In the world of machine learning,
data reigns supreme.

• The trio of training data sets, validation data sets, and test data sets
plays an important role in shaping your machine learning model.

• Machine learning (ML) is a branch of artificial intelligence (AI) that uses
data and algorithms to mimic real-world situations. Machine learning
helps you forecast, analyze, and study human behaviors and events.
• Machine learning helps you understand customer behaviors and spot
process-related patterns and operational gaps.
• Machine learning also helps you predict trends and developments.
• Constructing a machine learning model depends on how its data is
collected and organized. In this process, the data is divided into three
types:
1. Training data.
2. Validation data.
3. Test data.
THREE TYPES OF DATASET SPLITS IN MACHINE LEARNING
TRAINING SET
• It is the set of data used to train the model, so that it learns the hidden
features/patterns in the data.
• The training set includes the features as well as the labels in the case of
supervised learning. In the case of unsupervised learning, it can simply
be the feature sets.
• These labels are used in the training phase to get the training accuracy
score. The training set is usually taken as 70% of the original dataset
but can be changed per the use case or available data.
• The training set should cover the kinds of input the model is expected to process.
• For example, if your model must classify pictures of cats and dogs, the training set
must include both cats and dogs.
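The points above can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed method: the helper names `split_train_rest` and `missing_labels`, the toy cat/dog records, and the fixed seed are all assumptions made for the example. It takes 70% of the data as the training set and checks that every class appears in it.

```python
import random

def split_train_rest(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of `samples` and return (train, rest)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def missing_labels(train, all_labels):
    """Return the labels that never appear in the training set."""
    seen = {label for _features, label in train}
    return set(all_labels) - seen

# Hypothetical toy data: (features, label) pairs for a cat/dog classifier.
dataset = [([i, i % 3], "cat" if i % 2 == 0 else "dog") for i in range(20)]

train_set, rest = split_train_rest(dataset)
print(len(train_set), len(rest))  # 14 6

# An empty result means both classes are present in the training set.
print(missing_labels(train_set, {"cat", "dog"}))
```

The remaining 30% (`rest`) is what would later be divided into validation and test data.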
VALIDATION SET
• The validation set is used to provide an unbiased evaluation of the model fit
during hyperparameter tuning of the model.
• It is the set of examples used to tune the parameters of the learning
process (the hyperparameters).
• Candidate hyperparameter values are evaluated by training the model on the
training set and scoring it on the validation set.
• In Machine Learning or Deep Learning, we generally need to test multiple
models with different hyperparameters and check which model gives the
best result. This process is carried out with the help of a validation set.
• Applications of Validation Set
• Validation sets are used for hyperparameter tuning of AI models in domains
such as healthcare, analytics, and cyber security.
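As a rough sketch of hyperparameter tuning with a validation set, the example below picks the number of neighbours `k` for a tiny one-dimensional k-nearest-neighbours classifier. The classifier, the function names, and the toy numbers are assumptions chosen for illustration; the slides do not name a specific model.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D features)."""
    neighbours = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = [label for _value, label in neighbours]
    return max(set(votes), key=votes.count)

def accuracy(train, eval_set, k):
    """Fraction of eval_set points that k-NN (fitted on train) labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in eval_set)
    return hits / len(eval_set)

def select_k(train, validation, candidates):
    """Return the candidate k with the highest validation accuracy."""
    return max(candidates, key=lambda k: accuracy(train, validation, k))

# Hypothetical toy data: (value, label) pairs; (4.0, 1) acts as a noisy label.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1),
         (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
validation = [(3.6, 0), (4.4, 0), (6.5, 1), (8.5, 1)]

candidates = [1, 3, 5]  # odd values avoid tied votes with two classes
best_k = select_k(train, validation, candidates)
for k in candidates:
    print(k, accuracy(train, validation, k))
print("selected k:", best_k)
```

The model is fitted only on the training data; the validation data is used solely to compare the candidate values of `k`.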
TEST SET

Test data is used to perform a realistic check on an algorithm.

Test data, also known as a testing set or test set, is used to confirm
whether the machine learning model is accurate.

Once the machine learning model is confirmed as accurate, it can be
used for predictive analytics. Test data is similar to validation data in
that the model is not trained on it. Unlike validation data, however, the
test set is used only once, on the final model.
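A minimal sketch of that final check is shown below. The threshold model and the five test examples are hypothetical stand-ins: the point is only that accuracy is computed a single time, on examples that played no part in training or tuning.

```python
def accuracy(model, examples):
    """Fraction of (x, label) examples the model labels correctly."""
    correct = sum(model(x) == label for x, label in examples)
    return correct / len(examples)

# Hypothetical final model: training and tuning are assumed to be finished.
def final_model(x):
    return 1 if x >= 5.0 else 0

# Hypothetical held-out test examples, never used for training or tuning.
test_set = [(1.5, 0), (4.8, 0), (5.2, 1), (7.1, 1), (4.9, 1)]

# Evaluate once and report the result; do not go back and re-tune on it.
test_accuracy = accuracy(final_model, test_set)
print(f"test accuracy: {test_accuracy:.2f}")  # test accuracy: 0.80
```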
Data splitting for training and testing your machine learning model

• Training a machine learning model involves data splitting.

• You will need to denote which type of data you are working with: training data, validation data, or test
data. At a minimum, training your machine learning model requires splitting the data into two primary
datasets: training data and test data.
• Data splitting lets analysts check whether the features the model relies on actually predict the outcome
on data it has not seen. A common data splitting approach borrows its ratio from the Pareto principle,
also known as the 80:20 rule.
• The Pareto principle states that roughly 80% of effects come from 20% of causes. Applied to data
splitting, the 80:20 ratio is a widely used rule of thumb rather than a fixed requirement, and it can be
adjusted to the use case or the available data. Under this rule, your data splitting approach should:
1. Use 80% of your data as training data.
2. Use the remaining 20% of your data as testing data.
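The two steps above can be sketched as follows. The function name `split_80_20`, the seed, and the integer records are assumptions for illustration; integer arithmetic is used for the cut point so that the 80% boundary does not depend on floating-point rounding.

```python
import random

def split_80_20(records, seed=42):
    """Shuffle a copy of `records`, then return (training 80%, testing 20%)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 8) // 10  # index where the first 80% ends
    return shuffled[:cut], shuffled[cut:]

# Hypothetical dataset of 50 record ids.
records = list(range(50))
training_data, testing_data = split_80_20(records)

print(len(training_data), len(testing_data))  # 40 10
# No record appears in both parts, and none is lost.
print(set(training_data).isdisjoint(testing_data))  # True
```

Shuffling before cutting matters when the records are stored in a meaningful order (for example, sorted by class), because otherwise the test portion could contain only one kind of example.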
• In summary, training, testing, and validation sets serve distinct purposes in machine learning. The
training set is used to train the model; the test set evaluates its performance on unseen data; and the
validation set aids in model selection and hyperparameter tuning.
10/21/2024
