Week 6 Sol

Uploaded by Rama Bhushan

1. Use the information provided below and compute the following.

A regression model is trained to predict defaulter applications with 1 lakh data points. The model is tested on 100 out-of-sample data points. The following classification matrix is obtained at a thresholding level of 0.5. Here, 0's are considered non-defaulters and 1's are defaulters. The model is evaluated on its ability to accurately predict (i.e., classify) the observations from the test dataset.

Actual/Predicted    Predicted = 0    Predicted = 1
Actual = 0               30               10
Actual = 1               20               40

What is the sensitivity (the ability to correctly classify defaulters, 1's, as 1's) of the model?

(a) 40-50%
(b) 50-60%
(c) 60-70%
(d) 70-80%

Hint: Sensitivity = True Positives/(True Positives + False Negatives) = 40/(40 + 20) = 66.67%
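The hint can be verified with a short sketch in plain Python, using the cell counts from the confusion matrix above:

```python
# Confusion matrix at threshold 0.5 (from the question):
# TP = Actual 1 predicted as 1, FN = Actual 1 predicted as 0
TP = 40
FN = 20
sensitivity = TP / (TP + FN)
print(round(sensitivity * 100, 2))  # 66.67, i.e., option (c)
```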

2. Use the information provided below and compute the following. A regression model is trained to predict defaulter applications with 1 lakh data points. The model is tested on 100 out-of-sample data points. The following classification matrix is obtained at a thresholding level of 0.5. Here, 0's are considered non-defaulters and 1's are defaulters. The model is evaluated on its ability to accurately predict (i.e., classify) the observations from the test dataset.

Actual/Predicted    Predicted = 0    Predicted = 1
Actual = 0               30               10
Actual = 1               20               40

What is the specificity (the ability to correctly classify non-defaulters, 0's, as 0's) of the model?

(a) 40-50%
(b) 50-60%
(c) 60-70%
(d) 70-80%

Hint: Specificity = True Negatives/(True Negatives + False Positives) = 30/(30 + 10) = 75.0%
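As with sensitivity, the computation can be checked with a short sketch using the counts from the matrix above:

```python
# Confusion matrix at threshold 0.5 (from the question):
# TN = Actual 0 predicted as 0, FP = Actual 0 predicted as 1
TN = 30
FP = 10
specificity = TN / (TN + FP)
print(round(specificity * 100, 1))  # 75.0, i.e., option (d)
```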

3. Use the information provided below and compute the following. A regression model is trained to predict defaulter applications with 1 lakh data points. The model is tested on 100 out-of-sample data points. The following classification matrix is obtained at a thresholding level of 0.5. Here, 0's are considered non-defaulters and 1's are defaulters. The model is evaluated on its ability to accurately predict (i.e., classify) the observations from the test dataset.

Actual/Predicted    Predicted = 0    Predicted = 1
Actual = 0               30               10
Actual = 1               20               40

If the thresholding level is increased, which of the following correctly reflects the impact on model performance?

(a) Sensitivity increases and Specificity increases
(b) Sensitivity increases and Specificity decreases
(c) Sensitivity decreases and Specificity increases
(d) Sensitivity decreases and Specificity decreases

Hint: Increasing the thresholding level decreases sensitivity and increases specificity.
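The direction of the effect can be seen on a small made-up example. The scores and labels below are hypothetical predicted default probabilities, not data from the question:

```python
# Hypothetical scores: higher score = more likely defaulter (class 1)
scores = [0.2, 0.4, 0.6, 0.9, 0.1, 0.8, 0.7, 0.3]
labels = [0,   0,   1,   1,   0,   1,   1,   0]

def sens_spec(tau):
    # Classify as 1 whenever the score reaches the threshold Tau
    preds = [1 if s >= tau else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    return tp / (tp + fn), tn / (tn + fp)

low_sens, low_spec = sens_spec(0.35)   # low threshold
high_sens, high_spec = sens_spec(0.75)  # high threshold
# Raising Tau drops sensitivity and raises specificity
```

Raising the threshold moves borderline cases from "predicted 1" to "predicted 0", which can only lose true positives (sensitivity falls) and gain true negatives (specificity rises).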

4. Use the information provided below and compute the following. A regression model is trained to predict defaulter applications with 1 lakh data points. The model is tested on 100 out-of-sample data points. The following classification matrix is obtained at a thresholding level of 0.5. Here, 0's are considered non-defaulters and 1's are defaulters. The model is evaluated on its ability to accurately predict (i.e., classify) the observations from the test dataset.

Actual/Predicted    Predicted = 0    Predicted = 1
Actual = 0               30               10
Actual = 1               20               40

What is the overall accuracy of the model (count R-square)?

(a) 45-55%
(b) 55-65%
(c) 65-75%
(d) 75-85%

Hint: Overall accuracy (count R^2) = (TN + TP)/N = (30 + 40)/100 = 70%
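The same arithmetic as a sketch, with all four cells taken from the confusion matrix above:

```python
# Confusion matrix cells at threshold 0.5 (from the question)
TN, FP, FN, TP = 30, 10, 20, 40
N = TN + FP + FN + TP           # 100 test observations
accuracy = (TN + TP) / N        # correct predictions over all predictions
print(round(accuracy * 100))    # 70, i.e., option (c)
```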

5. A classification model uses a simple coin-tossing game with a fair coin as its prediction model. Heads are counted as 0's and tails as 1's. What would be the sensitivity and specificity of the model?

(a) Sensitivity = 20-40%, Specificity = 40-60%
(b) Sensitivity = 40-60%, Specificity = 40-60%
(c) Sensitivity = 40-60%, Specificity = 20-40%
(d) Sensitivity = 20-40%, Specificity = 20-40%

Hint: A coin-tossing game with a fair coin has an accuracy of 50%, so 50% of the 0's and 1's will be accurately classified and, conversely, 50% of them will be inaccurately classified.
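A quick simulation makes the hint concrete. Both the labels and the coin-flip predictions below are randomly generated for illustration:

```python
import random

random.seed(0)
n = 100_000
labels = [random.randint(0, 1) for _ in range(n)]  # true 0/1 classes
preds = [random.randint(0, 1) for _ in range(n)]   # fair-coin "model"

tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
# Both land near 50%, i.e., option (b)
```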

6. A classification model uses a simple coin-tossing game with a fair coin as its prediction model. Heads are counted as 0's and tails as 1's. What would be the area under the receiver operating characteristic (ROC) curve, i.e., the AUC, of the model?

a) 45-55%
b) 55-65%
c) 65-75%
d) 75-85%

Hint: A coin-tossing game with a fair coin has an accuracy of 50%, so the area under the curve, i.e., the performance of the model, is 50%.
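This can be checked by simulation, using the rank interpretation of AUC (the probability that a randomly chosen positive outscores a randomly chosen negative). The labels and scores below are randomly generated for illustration:

```python
import random

random.seed(1)
n = 4_000
labels = [random.randint(0, 1) for _ in range(n)]
scores = [random.random() for _ in range(n)]  # coin-flip model: scores carry no signal

pos = [s for s, y in zip(scores, labels) if y == 1]
neg = [s for s, y in zip(scores, labels) if y == 0]
# AUC = probability that a random positive outscores a random negative
auc = sum(p > q for p in pos for q in neg) / (len(pos) * len(neg))
# auc comes out close to 0.5, i.e., option (a)
```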

7. For a thresholding value of Tau = 1, which of the following is correct?

(a) Sensitivity = 1, Specificity = 1
(b) Sensitivity = 0, Specificity = 0
(c) Sensitivity = 1, Specificity = 0
(d) Sensitivity = 0, Specificity = 1

Hint: For Tau = 1, all the 1's will be classified as 0's, hence sensitivity = 0; and all the 0's will be classified as 0's, hence specificity = 1.

8. For a thresholding value of Tau = 0, which of the following is correct?

a) Sensitivity = 1, Specificity = 1
b) Sensitivity = 0, Specificity = 0
c) Sensitivity = 1, Specificity = 0
d) Sensitivity = 0, Specificity = 1

Hint: For Tau = 0, all the 1's will be classified as 1's, hence sensitivity = 1; and all the 0's will be classified as 1's, hence specificity = 0.
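Both threshold extremes can be checked on a tiny example; the scores below are hypothetical predicted probabilities, made up for illustration:

```python
scores = [0.1, 0.35, 0.6, 0.9]  # hypothetical predicted probabilities of default
labels = [0,   0,    1,   1]    # true classes

def sens_spec(tau):
    # Classify as 1 whenever the score reaches the threshold Tau
    preds = [1 if s >= tau else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    return tp / (tp + fn), tn / (tn + fp)

print(sens_spec(1.0))  # (0.0, 1.0): everything predicted 0
print(sens_spec(0.0))  # (1.0, 0.0): everything predicted 1
```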

9. As a data scientist, you realize that the conventional R-square measure is not a very appropriate goodness-of-fit indicator for classification algorithms. Your teammate suggests using Pseudo-R^2 = 1 - LLF/LLF0. Here LLF is the log-likelihood function, i.e., the log of the joint probability of observing the data, and LLF0 is the log-likelihood function of the naive model. For a good classification model, you expect the joint probability of observing the data to be very high, and low for a poor naive model. If you believe that your trained model is very poor and only as good as the naive model, your Pseudo-R^2 value can be

a) 0-10%
b) 10-20%
c) 20-30%
d) 30-40%

Hint: If your model is as good as the naive model, LLF will be equal to LLF0, and Pseudo-R^2 = 1 - LLF/LLF0 = 0.

10. As a data scientist, you realize that the conventional R-square measure is not a very appropriate goodness-of-fit indicator for classification algorithms. Your teammate suggests using Pseudo-R^2 = 1 - LLF/LLF0. Here LLF is the log-likelihood function, i.e., the log of the joint probability of observing the data, and LLF0 is the log-likelihood function of the naive model. For a good classification model, you expect the joint probability of observing the data to be very high, and low for a poor naive model. If you believe that your trained model is far better than the naive model, your Pseudo-R^2 value can be as high as

a) 90-100%
b) 80-90%
c) 70-80%
d) 60-70%

Hint: If the model is far better than the naive model (LLF0), the joint probability of observing the data approaches 1, so LLF will be close to 0 and Pseudo-R^2 = 1 - LLF/LLF0 will be close to 1.
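Both hints (questions 9 and 10) can be illustrated with a small sketch; the test labels and predicted probabilities below are made up for illustration:

```python
import math

labels = [0, 0, 1, 1, 0, 1]  # hypothetical test labels

def llf(probs):
    # Log-likelihood: log of the joint probability of observing the labels
    return sum(math.log(p if y == 1 else 1 - p) for p, y in zip(probs, labels))

base_rate = sum(labels) / len(labels)      # naive model predicts the base rate
llf0 = llf([base_rate] * len(labels))

good_probs = [0.05, 0.1, 0.95, 0.9, 0.2, 0.98]  # close to the true labels
pseudo_r2_good = 1 - llf(good_probs) / llf0     # close to 1
pseudo_r2_naive = 1 - llf0 / llf0               # exactly 0
```

A model identical to the naive model gives LLF = LLF0, so the ratio is 1 and Pseudo-R^2 is 0; probabilities that track the true labels push LLF toward 0 and Pseudo-R^2 toward 1.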
