3.1. Introduction to Machine Learning Concepts
Learning Objectives
By the end of this lecture, candidates will be able to:
• Understand the basic concepts of machine learning, including features, classification,
regression, and model training.
• Recognize different ML models like KNN, Decision Trees, SVM, and Linear/Logistic
Regression, and their uses in drug discovery.
• Learn the working principle of ML models and their use in drug discovery and
development.
Why Machine Learning for Drug Discovery?
• Traditional drug discovery is slow and costly:
• Average timeline: 10-15 years
• Average cost: $2.6 billion per drug
• High failure rates — 90% of clinical trials fail
• ML Helps in:
• Hit Identification: Screening millions of compounds virtually (e.g., DeepDock,
VirtualFlow)
• Lead Optimization: Predicting molecule activity, ADMET (Absorption,
Distribution, Metabolism, Excretion, Toxicity) profiles
• Drug Repurposing: Finding new uses for existing drugs (e.g., Remdesivir for
COVID-19)
• Personalized Medicine: Predicting patient-specific drug responses
What is Machine Learning (ML)?
• Machine learning is a subset of artificial intelligence (AI) where computers learn from data
without being explicitly programmed.
• The model learns patterns from data to make predictions or decisions.
• Three Types of Machine Learning:
• Supervised Learning – Model learns from labeled data (e.g., predicting if a molecule is
active/inactive)
• Unsupervised Learning – Model finds hidden patterns in unlabeled data (e.g., clustering compounds
by chemical similarity)
• Reinforcement Learning – Model learns through trial and error (e.g., optimizing synthesis pathways)
Features
• A feature is an individual measurable property or characteristic of the data — like a descriptor for a
molecule.
• In drug discovery, features describe molecular properties. Examples:
• Molecular weight — How "heavy" the molecule is
• LogP (Partition coefficient) — Lipophilicity (fat vs. water solubility)
• Hydrogen bond donors/acceptors — Essential for binding interactions
• Topological polar surface area (TPSA) — Affects cell permeability
• Molecular fingerprints — Encoded bit strings representing chemical structure
• The better the features, the smarter the model.
• Example: predicting blood-brain barrier (BBB) permeability — features like LogP, TPSA, and rotatable bond count help determine whether a compound crosses the BBB (see the sketch below).
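A minimal sketch of computing such descriptors in Python with RDKit (assuming RDKit is installed); the aspirin SMILES string and the chosen descriptors are just illustrative, not a full featurization pipeline:

```python
# Minimal sketch: compute common molecular features with RDKit.
from rdkit import Chem
from rdkit.Chem import Descriptors

smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin, used here only as an example
mol = Chem.MolFromSmiles(smiles)

features = {
    "MolWt": Descriptors.MolWt(mol),                  # molecular weight (g/mol)
    "LogP": Descriptors.MolLogP(mol),                 # lipophilicity
    "HBD": Descriptors.NumHDonors(mol),               # hydrogen bond donors
    "HBA": Descriptors.NumHAcceptors(mol),            # hydrogen bond acceptors
    "TPSA": Descriptors.TPSA(mol),                    # topological polar surface area (Å²)
    "RotatableBonds": Descriptors.NumRotatableBonds(mol),
}
print(features)
```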
Discrete Features (Categorical)
• These are features that take on distinct, separate values — usually categories or
counts.
• Characteristics:
• Can’t be broken down into finer values
• Often encoded as integers (e.g., 0, 1, 2) or one-hot encoded for ML models (see the sketch after this list)
• Examples:
• Drug Class: Antibiotic (0), Antiviral (1), Anticancer (2)
• Chemical Substructures: Presence of benzene ring (Yes/No → 1/0)
• Toxicity Class: Non-toxic (0), Low toxicity (1), High toxicity (2)
• Amino Acid Type: Hydrophobic, Polar, Charged
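A minimal sketch of one-hot encoding a categorical feature with scikit-learn; the drug-class labels are toy data:

```python
# Minimal sketch: one-hot encode a discrete feature.
# Note: sparse_output requires scikit-learn >= 1.2; older versions use sparse=False.
import numpy as np
from sklearn.preprocessing import OneHotEncoder

drug_class = np.array([["Antibiotic"], ["Antiviral"], ["Anticancer"], ["Antibiotic"]])

encoder = OneHotEncoder(sparse_output=False)
onehot = encoder.fit_transform(drug_class)  # one 0/1 column per category

print(encoder.categories_)
print(onehot)
# A binary feature (e.g., benzene ring present) can stay as a single 0/1 column.
```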
Binary Features
• A feature that has only two possible values — typically 0/1, Yes/No, or True/False.
• Characteristics:
• Encodes presence/absence or positive/negative states
• Often represents qualitative data in a simplified form
• Helps models make quick binary decisions
• Examples:
• Lipinski’s Rule of 5 Compliance: (Yes = 1, No = 0)
• Toxicity Flag: (Toxic = 1, Non-toxic = 0)
• Hydrogen Bond Donor Presence: (Yes = 1, No = 0)
• Molecular Scaffold Presence: (Aromatic ring present = 1, Absent = 0)
• Activity Classification: (Active = 1, Inactive = 0)
Continuous Features (Numerical)
• These are features that can take any value within a range — they’re measured
on a continuous scale.
• Characteristics:
• Can be infinitely divided into smaller values (e.g., 5.3, 5.31, 5.314)
• Often require scaling/normalization (e.g., Min-Max, StandardScaler); see the sketch after this list
• Examples:
• Molecular Weight: e.g., 342.3 g/mol
• LogP (Lipophilicity): e.g., 2.5 (hydrophobicity measure)
• Topological Polar Surface Area (TPSA): e.g., 78.9 Å² (predicts membrane permeability)
• IC₅₀ (Half Maximal Inhibitory Concentration): e.g., 12.7 nM (measure of drug
potency)
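A minimal sketch of the two scalers named above, applied to toy MW/LogP/TPSA values:

```python
# Minimal sketch: scale continuous features before training.
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([
    [342.3, 2.5, 78.9],   # [MolWt, LogP, TPSA] for compound 1 (toy values)
    [180.2, 1.2, 63.6],
    [451.6, 4.1, 110.8],
])

print(StandardScaler().fit_transform(X))  # zero mean, unit variance per column
print(MinMaxScaler().fit_transform(X))    # each column rescaled to [0, 1]
```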
Derived Features
• Features that are calculated or engineered from existing data to capture
more insight.
• Characteristics:
• Can be continuous or categorical
• Helps enhance predictive power
• Involves domain knowledge for meaningful transformations
• Examples in Drug Discovery:
• Ligand Efficiency: (pIC₅₀ / Molecular Weight) → Measures binding efficiency (see the sketch after this list)
• LogD (Distribution Coefficient): Derived from LogP and pKa for drug
permeability prediction
• Hydrophobic Surface Area: Calculated from molecular structure
• Drug-likeness Score: Composite of multiple descriptors (MW, LogP, H-bond
donors, etc.)
• Polar to Nonpolar Ratio: Ratio of polar atoms to non-polar atoms — useful for
predicting solubility
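A minimal sketch computing two of these derived features from toy values, following the definitions above (the atom counts are hypothetical):

```python
# Minimal sketch: derive features from raw descriptors.
import numpy as np

ic50_nM = 12.7          # measured IC50 in nanomolar (toy value)
mol_weight = 342.3      # g/mol (toy value)

pIC50 = -np.log10(ic50_nM * 1e-9)        # convert nM -> M, then take -log10
ligand_efficiency = pIC50 / mol_weight   # pIC50 per unit molecular weight

polar_atoms, nonpolar_atoms = 6, 18      # hypothetical atom counts
polar_ratio = polar_atoms / nonpolar_atoms

print(f"pIC50 = {pIC50:.2f}, LE = {ligand_efficiency:.4f}, polar ratio = {polar_ratio:.2f}")
```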
Machine Learning: Types of Predictions
[Figure: several inputs (Input 1–Input 8, the features) feed into the Model, which produces a single Output (the Prediction).]
Supervised Learning
• Learn from labeled data to predict outcomes.
• Types:
• Classification (e.g., active vs. inactive compound, soluble vs. insoluble) — SVM, Logistic Regression
• Regression (e.g., predicting solubility, permeability, or IC₅₀ values) — Linear Regression, Random Forest
Classification vs. Regression
| Category | Type | Description | Example | Algorithms |
| --- | --- | --- | --- | --- |
| Classification | Binary Classification | Two possible outcomes | Active vs. Inactive compound | Logistic Regression, SVM, Random Forest |
| Classification | Multiclass Classification | More than two classes | Agonist vs. Antagonist vs. Neutral ligand | Decision Trees, KNN, Neural Networks |
| Classification | Multilabel Classification | Each sample can belong to multiple classes | Antimicrobial + Anti-inflammatory properties | One-vs-Rest, Neural Networks, Adapted Random Forest |
| Regression | Simple Regression | One feature → one output | Molecular weight predicting IC₅₀ | Linear Regression, SVR |
| Regression | Multiple Regression | Multiple features → one output | MW, LogP, H-bond donors predicting bioavailability | Ridge Regression, Lasso Regression |
| Regression | Polynomial Regression | Captures curved relationships | Dose vs. response curve | Polynomial Regression models |
| Regression | Log/Exponential Regression | Handles non-linear, exponential, or decay relationships | Drug concentration decay in plasma over time | Nonlinear regression models |
Supervised Learning Dataset
• Each row = a different sample in the dataset.
• Each column (apart from the label) = a different feature.
• The label is the value the model learns to predict.
• Together, the labels form the label matrix (Y) and the features form the feature matrix (X).
Data from J. Chem. Inf. Comput. Sci. 2004, 44, 1000–1005 by John S. Delaney.
Model Training, Validation, and Testing
• The dataset is split into training, validation, and test sets; common ratios are 60:20:20 or 80:10:10 (see the sketch below).
• Training: the model makes a prediction (e.g., −1.8 mol/L), the loss (Loss = Actual − Predicted) is computed against the actual value, and the model makes adjustments to reduce it.
• Validation: a reality check during/after training to ensure the model can handle unseen data. Candidate models are compared by validation loss (e.g., Model 1: 1.3, Model 2: 1.0, Model 3: 0.5, Model 4: 0.8), and the lowest-loss model (Model 3) is chosen as the best model.
• Testing: the test set checks how generalizable the final chosen model is; its performance on the test set is the reported performance.
Data from J. Chem. Inf. Comput. Sci. 2004, 44, 1000–1005 by John S. Delaney.
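A minimal sketch of a 60:20:20 split with scikit-learn's train_test_split; the synthetic arrays stand in for a real feature matrix and labels:

```python
# Minimal sketch: 60:20:20 train/validation/test split.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))   # hypothetical feature matrix (100 samples, 4 descriptors)
y = rng.normal(size=100)        # hypothetical labels (e.g., log solubility)

# First carve out 20% for testing, then split the rest 75:25 -> 60:20 overall.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```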
Loss Functions
• L1 loss minimizes the sum of the absolute differences between the true and predicted values: L1 = Σ|yᵢ − ŷᵢ|.
• L2 loss minimizes the sum of the squared differences between the true and predicted values: L2 = Σ(yᵢ − ŷᵢ)². Because errors are squared, predictions close to the true value incur a small penalty while large errors are penalized heavily (see the sketch below).
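A minimal sketch computing both losses with NumPy on toy values:

```python
# Minimal sketch: L1 (absolute) and L2 (squared) losses.
import numpy as np

y_true = np.array([-1.8, -0.5, -3.2])   # actual values (e.g., log solubility, toy data)
y_pred = np.array([-1.5, -0.9, -3.0])   # model predictions

l1 = np.sum(np.abs(y_true - y_pred))    # sum of absolute differences
l2 = np.sum((y_true - y_pred) ** 2)     # sum of squared differences

print(f"L1 = {l1:.2f}, L2 = {l2:.2f}")
```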
Supervised Learning Algorithms
• Classification: Predicts categories (e.g., "soluble" vs "insoluble").
• Regression: Predicts continuous values (e.g., LogP, IC₅₀).
| Algorithm | Classification | Regression |
| --- | --- | --- |
| k-Nearest Neighbors (KNN) | ✅ | ✅ |
| Support Vector Machine (SVM) | ✅ | ✅ |
| Decision Tree | ✅ | ✅ |
| Random Forest | ✅ | ✅ |
| Logistic Regression | ✅ | ❌ |
| Linear Regression | ❌ | ✅ |
| Naive Bayes | ✅ | ❌ |
Unsupervised Learning
• Clustering: Groups data (e.g., compound libraries clustering by chemical similarity).
• Dimensionality Reduction: Reduces features (e.g., PCA on molecular descriptors).
| Algorithm | Clustering | Dimensionality Reduction |
| --- | --- | --- |
| k-Means | ✅ | ❌ |
| Hierarchical Clustering | ✅ | ❌ |
| DBSCAN | ✅ | ❌ |
| Principal Component Analysis (PCA) | ❌ | ✅ |
| t-SNE | ❌ | ✅ |
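A minimal sketch of k-Means clustering a (synthetic) compound library by descriptor similarity:

```python
# Minimal sketch: group compounds into 3 clusters with k-Means.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
X = rng.normal(size=(120, 8))            # hypothetical compound descriptors

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)           # cluster index for each compound
print(np.bincount(labels))               # number of compounds per cluster
```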
k-Nearest Neighbors (KNN)
• Works for Classification & Regression
• How it works:
• Finds the "k" closest data points to a new sample (based on
distance, like Euclidean distance).
• For classification, it picks the majority class among the
neighbors (e.g., most are "soluble" → predicts "soluble").
• For regression, it averages the values of the neighbors (e.g.,
averages logP values).
• Use case in drug discovery:
• Predicting whether a compound is an inhibitor (yes/no).
• Estimating a molecule’s binding affinity by averaging nearby
known compounds.
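A minimal sketch of KNN classification with scikit-learn; the random bit vectors are stand-ins for real molecular fingerprints:

```python
# Minimal sketch: KNN classification (majority vote among 5 neighbors).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(50, 16))   # hypothetical 16-bit fingerprints
y = rng.integers(0, 2, size=50)         # 1 = inhibitor, 0 = non-inhibitor (toy labels)

knn = KNeighborsClassifier(n_neighbors=5)  # k = 5 nearest neighbors (Euclidean by default)
knn.fit(X, y)
print(knn.predict(X[:3]))                  # predicted classes for the first 3 compounds
```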
Support Vector Machine (SVM)
• Supports Classification (SVC) and Regression (SVR)
• How it works:
• Finds a hyperplane that best separates data into classes (for
classification).
• For regression, it tries to fit data within a margin of error
while keeping the model simple (less overfitting).
• Can handle non-linear data using kernels (e.g., RBF kernel).
• Use case in drug discovery:
• Classifying compounds as active/inactive based on molecular
fingerprints.
• Predicting biological properties like solubility or toxicity.
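A minimal sketch of an RBF-kernel SVM classifier on toy descriptors (SVR would be the regression counterpart):

```python
# Minimal sketch: SVM classification with a non-linear RBF kernel.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))                  # hypothetical molecular descriptors
y = (X[:, 0] + X[:, 1] > 0).astype(int)       # synthetic active/inactive labels

svm = SVC(kernel="rbf", C=1.0)                # RBF kernel handles non-linear boundaries
svm.fit(X, y)
print(svm.predict(X[:3]))                     # predicted classes for 3 compounds
```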
Decision Tree
• Works for Classification & Regression
• How it works:
• Splits data into "yes/no" decisions based on features (e.g., "Does
logP > 2?").
• Grows branches until data is pure (all samples in a leaf belong to
the same class or close in value).
• Prone to overfitting, but Random Forest (ensemble of trees) fixes
that.
• Use case in drug discovery:
• Classifying molecules based on structural alerts for toxicity.
• Predicting ADMET properties (Absorption, Distribution,
Metabolism, Excretion, Toxicity).
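A minimal sketch of a decision tree on toy data; capping max_depth is one simple guard against the overfitting noted above:

```python
# Minimal sketch: decision tree learning a "Does LogP > 2?" style split.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(2)
X = rng.normal(loc=2.0, size=(80, 2))     # hypothetical [LogP, TPSA] values
y = (X[:, 0] > 2).astype(int)             # synthetic rule the tree should recover

tree = DecisionTreeClassifier(max_depth=3)  # shallow tree to limit overfitting
tree.fit(X, y)
print(export_text(tree, feature_names=["LogP", "TPSA"]))  # human-readable splits
```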
Random Forest (RF)
• Supports Classification & Regression
• How it works:
• Builds multiple decision trees on random
subsets of data.
• Takes a majority vote (classification) or averages
predictions (regression).
• Reduces overfitting compared to a single
decision tree.
• Use case in drug discovery:
• Predicting IC50 values of kinase inhibitors.
• Identifying active compounds from high-
throughput screening data.
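A minimal sketch of Random Forest regression on synthetic descriptors, averaging the predictions of many trees:

```python
# Minimal sketch: Random Forest regression (e.g., IC50-like target).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 6))                                  # hypothetical descriptors
y = X[:, 0] * 2 - X[:, 1] + rng.normal(scale=0.1, size=100)    # synthetic target

rf = RandomForestRegressor(n_estimators=100, random_state=0)   # 100 trees on random subsets
rf.fit(X, y)
print(rf.predict(X[:3]))                                       # averaged tree predictions
```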
Linear Regression
• Regression only
• How it works:
• Finds the best-fit line through data by minimizing the
difference between predicted and actual values.
• Assumes a linear relationship between features and
target (e.g., molecular weight vs solubility).
• Use case in drug discovery:
• Predicting LogP, LogD, or other physicochemical
properties.
• Modeling dose-response curves.
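A minimal sketch of fitting a best-fit line; the molecular weight/solubility values are hypothetical:

```python
# Minimal sketch: linear regression of solubility on molecular weight.
import numpy as np
from sklearn.linear_model import LinearRegression

mw = np.array([[180.2], [251.1], [342.3], [451.6]])   # molecular weights (toy values)
logS = np.array([-1.2, -2.0, -2.9, -4.1])             # hypothetical log solubility

lin = LinearRegression().fit(mw, logS)
print(lin.coef_, lin.intercept_)          # slope and intercept of the best-fit line
print(lin.predict([[300.0]]))             # predicted logS for a 300 g/mol compound
```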
Logistic Regression
• Classification only
• How it works:
• Despite the name, it's for classification (binary/multiclass).
• Uses a sigmoid function to squash predictions between 0 and 1
(probabilities).
• Predicts the likelihood of a compound being "active/inactive"
based on molecular descriptors.
• Use case in drug discovery:
• Classifying compounds as hits/non-hits in virtual screening.
• Predicting whether a molecule crosses the blood-brain barrier
(yes/no).
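A minimal sketch of logistic regression returning sigmoid probabilities on toy data (e.g., BBB-permeable yes/no):

```python
# Minimal sketch: logistic regression for binary classification.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(80, 3))              # hypothetical scaled descriptors
y = (X[:, 0] - X[:, 1] > 0).astype(int)   # synthetic permeable/not-permeable labels

logreg = LogisticRegression().fit(X, y)
print(logreg.predict_proba(X[:3]))        # sigmoid probabilities between 0 and 1
print(logreg.predict(X[:3]))              # thresholded class predictions
```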
Principal Component Analysis (PCA)
• Dimensionality Reduction (unsupervised)
• How it works:
• Reduces a high-dimensional dataset to fewer components while preserving most of the information.
• Helps visualize data and improves model performance by removing noise.
• Use case in drug discovery:
• Reducing thousands of molecular descriptors to 2D/3D for visualization.
• Preprocessing large compound datasets for faster training.
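A minimal sketch of PCA reducing 50 synthetic descriptors to 2 components for visualization:

```python
# Minimal sketch: PCA for dimensionality reduction.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 50))            # 200 compounds x 50 descriptors (toy data)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)               # project onto the top 2 components
print(X_2d.shape)                         # (200, 2)
print(pca.explained_variance_ratio_)      # fraction of variance each PC preserves
```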
Summary
• Machine learning models learn from data to recognize patterns, make predictions, or classify
new information.
• Features and labels are essential parts of the dataset, where features describe data points and
labels define the outcome.
• Supervised learning focuses on labeled data (e.g., classification and regression), while
unsupervised learning finds hidden patterns in unlabeled data (e.g., clustering).
• Model training, validation, and testing ensure the model
generalizes well to new, unseen data.
Further Reading
• Bzdok, D., Krzywinski, M. & Altman, N. Machine learning: a primer. Nat Methods 14, 1119–1120
(2017).
• https://medium.com/acing-ai/machine-learning-techniques-primer-60edd9d14863
• Badrulhisham F, Pogatzki-Zahn E, Segelcke D, Spisak T, Vollert J. Machine learning and artificial intelligence in neuroscience: A primer for researchers. Brain Behav Immun. 2024 Jan;115:470–479.
Think about it
Suppose you built a model that predicts a molecule as "active" with 95% accuracy, yet in the lab most of the predicted actives still fail. Why might this happen?
Thank You!