In machine learning, a classification problem involves predicting the categorical class label
of a data point based on its features or attributes. The goal is to learn a model from labeled
training data that can accurately classify new, unseen data points into predefined classes or
categories. The classes could represent different categories, groups, or outcomes.
Mathematically, a classification problem can be represented as follows:
Given a dataset of n samples, each with p features, the dataset can be denoted as:
D = {(x1, y1), (x2, y2), ..., (xn, yn)}
Where:
xi represents the feature vector of the i-th sample in the p-dimensional predictor space.
yi is the corresponding class label for the i-th sample, belonging to one of K classes (K ≥ 2).
The goal is to learn a classifier f that can predict the class labels for new, unseen samples. The classifier maps input feature vectors to one of the K classes:
f : R^p → {C1, C2, ..., CK}
In this context:
p represents the dimension of the predictor space, which is the number of features in
the dataset. Each feature contributes to the decision-making process of the classifier.
K is the number of classes in the classification problem. For a binary classification
problem, K=2, while for multi-class problems, K>2.
The classifier's task is to learn the underlying relationships between the feature vectors and
the class labels by identifying decision boundaries or decision functions that separate the
classes as accurately as possible. Various algorithms and techniques, such as decision trees,
support vector machines, neural networks, and k-nearest neighbors, can be used to build
classification models and solve these types of problems.
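As a concrete illustration of a classifier f : R^p → {C1, ..., CK}, here is a minimal sketch using k-nearest neighbors. The library (scikit-learn), data set (Iris), and parameter choices are illustrative assumptions, not taken from the source:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Toy data set: n = 150 samples, p = 4 features, K = 3 classes
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# f maps feature vectors in R^p to one of the K class labels
f = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
pred = f.predict(X_test)
print("predicted class labels:", sorted(set(pred)))
```

Any of the other algorithms mentioned above (decision trees, SVMs, neural networks) could be dropped in place of the k-NN estimator with the same fit/predict interface.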
Three methods are compared: two runs of one algorithm, LDA (called LDA and LDA2), and one QDA. The only difference in the second LDA is some feature engineering: instead of taking the original pixel values alone as predictors, we take the original values together with their squares.
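The notebook's actual code is not shown in the source; the following is a minimal sketch of the three models, assuming scikit-learn and its bundled digits data set (the train/test split and the QDA regularization parameter are my choices, not the source's):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# LDA2's feature engineering: original predictors plus their squares
X_tr2 = np.hstack([X_tr, X_tr ** 2])
X_te2 = np.hstack([X_te, X_te ** 2])

models = {
    "LDA":  (LinearDiscriminantAnalysis().fit(X_tr, y_tr), X_tr, X_te),
    "LDA2": (LinearDiscriminantAnalysis().fit(X_tr2, y_tr), X_tr2, X_te2),
    # reg_param stabilizes the per-class covariance estimates
    "QDA":  (QuadraticDiscriminantAnalysis(reg_param=0.1).fit(X_tr, y_tr),
             X_tr, X_te),
}

errs = {}
for name, (m, Xtr, Xte) in models.items():
    errs[name] = (1 - m.score(Xtr, y_tr), 1 - m.score(Xte, y_te))
    print(name, "train/test misclassification:", errs[name])
```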
Assumptions:
LDA assumes that the means differ conditional on the class label, but that the covariance matrix is the same for every class; formally, x | y = k ~ N(mu_k, Sigma) with a shared Sigma.
QDA also assumes different means for different labels, but in addition the covariance matrix is allowed to differ label by label, x | y = k ~ N(mu_k, Sigma_k), which gives QDA more flexibility.
The errors computed here are simply the misclassification rates. As shown before, since this is a classification problem, the accuracy is obtained from the confusion matrix: take the entries on the diagonal and divide by the number of observations. The misclassification rate is just the opposite: take everything off the diagonal and divide by the number of observations.
Looking at the misclassification rates, the first model (LDA) has about 6% misclassification on the training set. The second model (LDA2), where we do a bit of feature engineering by also adding the squared terms, already does better on the training set and also on the test set, with a 10% test error. The third method is a simple setting where you can start seeing overfitting: QDA has a very low training error, around 1-2%, but a 13% test misclassification rate.
So, answering very briefly the question of which method is best for this data set: I would say the second one, LDA2, which does not overfit as much as QDA but is still flexible enough, because the feature-engineering transformation helped reduce the test misclassification from about 11% to 10%.
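The accuracy and misclassification computations from the confusion matrix can be sketched as follows (the matrix below is a made-up 3-class example, not the one from the lecture):

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted)
cm = np.array([[50,  2,  1],
               [ 3, 45,  4],
               [ 0,  5, 40]])

n = cm.sum()                                        # number of observations
accuracy = np.trace(cm) / n                         # diagonal entries over n
misclassification = (cm.sum() - np.trace(cm)) / n   # off-diagonal over n

print("accuracy:", accuracy)                    # 0.9
print("misclassification:", misclassification)  # 0.1
```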
finally it asks what
could we do with multinomial regression with the last two so here I mean it's again a bit more
theoretical question last regularization the idea is also it also applies to to regression is that as we
said many times with the last two you want to four yeah induce some sort of sparsity in the weights
that you're going to estimate sparsity means that you will force some of the weights of course when
you tune up your Lambda you will force some of the estimated weights to be 0 uh and what is the
two end of the tuning parameter in this method of course it well of course it's Lambda so this is
something it might sound easy question but that's also it's very important so everyone has to know
that that's why we will ask us or something like this and what do we expect about the regularization
OK for this data set in principle it could could help by thinking that OK the the in a sense OK what do
we have we have each row OK so on the on the in the X axis is an image and most of the predictors
right of the columns that are the the pixels are zero so in fact adding some regularization here
doesn't seem to be a bad idea in the sense that the ground truth here is that while we have many
many entries exactly equal to zero so and some entries not equal to to zero so we would like our
coefficient to kind of be activated only for the entries where you have some pixels I don't know if
this again this is a qualitative answer I don't know you know what he would say about that for this
specific image data set with regularization on logistic but I think that's something
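A sketch of what this could look like with scikit-learn's L1-penalized multinomial logistic regression. The data set, scaling, and penalty strength are my assumptions; note that scikit-learn parameterizes the penalty as C = 1/lambda, so a small C means strong regularization:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixels into [0, 1]; many entries are exactly zero

# L1 (lasso) penalty; C = 1/lambda, so C = 0.05 is a fairly strong penalty
clf = LogisticRegression(penalty="l1", solver="saga", C=0.05,
                         max_iter=2000).fit(X, y)

# Sparsity: fraction of estimated weights forced exactly to zero
sparsity = float(np.mean(clf.coef_ == 0))
print(f"{sparsity:.0%} of the coefficients are exactly zero")
```

Increasing lambda (decreasing C) drives more coefficients to exactly zero, which matches the intuition above: coefficients should stay active only where pixels actually carry signal.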
I agree with what you are saying. It is hard to say whether it would be better than the LDA, of course, but it is certainly something worth trying. OK, this is again a theory question, but a very important one: model parameter versus tuning parameter.
A model parameter is a parameter that you fit to the data: it is learned on the data, if you want, by the algorithm. It is like the beta in linear regression or in logistic regression; it is something the specific model learns from the data. A tuning parameter, on the other hand, is something that you, the user, can change by hand; it is also called a hyperparameter. And how do you choose it? Hyperparameters, or tuning parameters (the two are synonyms), can always be set by hand, but the best way to choose them is to use cross-validation, as we said. Given only the training data, the model parameters are learned when you fit the model, in a sense automatically by the model itself, whereas the hyperparameters must be chosen, and this is usually done with cross-validation.
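The standard pattern can be sketched with scikit-learn (the estimator and the grid of C values are illustrative assumptions): calling .fit() learns the model parameters (the betas), while cross-validation chooses the tuning parameter:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_digits(return_X_y=True)
X = X / 16.0

# The betas are model parameters, learned automatically by .fit();
# C = 1/lambda is a tuning parameter, chosen here by 5-fold cross-validation
grid = GridSearchCV(
    LogisticRegression(penalty="l1", solver="liblinear"),
    param_grid={"C": [0.01, 0.1, 1.0]},
    cv=5,
)
grid.fit(X, y)
print("C chosen by cross-validation:", grid.best_params_["C"])
```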
Now, regarding the other questions, the multiple choice and so on: we will upload the answers on Moodle. Very briefly, they were about the bias-variance decomposition that we saw in the Jupyter notebook, and some of the questions were about the notation. For those, maybe I can discuss with the person who sent them to me at some point, but I would not say these notation-related questions are important for the midterm. What is important is a typo that the person discovered, so I want to share it with you. It is a minor typo, but still, it is good to fix. I will just open the Jupyter notebook; let us see if I can share my screen now. OK.