Performance Measures

Classification Metrics
• Classification Accuracy

• Logarithmic Loss

• Confusion Matrix

• Area Under Curve (AUC)

• F1 Score
Regression Metrics

• Mean Absolute Error

• Mean Squared Error

• R-Squared
Classification Accuracy
• Classification Accuracy is what we usually mean when we use the term accuracy.
It is the ratio of the number of correct predictions to the total number of input samples.

• It is a reliable measure only when each class has roughly the same number of samples.

• For example, suppose 98% of the samples in our training set belong to class A and 2%
to class B. A model can then reach 98% training accuracy simply by predicting
class A for every training sample.
• When the same model is tested on a test set with 60% samples of class A and 40%
samples of class B, the test accuracy drops to 60%.
Classification Accuracy is easy to compute and interpret, but on imbalanced data it
can give a false sense of high performance.
• The real problem arises when the cost of misclassifying the minority-class
samples is very high. If we deal with a rare but fatal disease, the cost of failing to
diagnose a sick person is much higher than the cost of sending a
healthy person for more tests.
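The 98%/2% example can be checked with a few lines of Python. This is a minimal sketch: the label lists and the helper name majority_class_accuracy are made up for illustration and only mirror the class proportions quoted above.

```python
# Hypothetical label sets mirroring the 98%/2% and 60%/40% splits in the text.
train_labels = ["A"] * 98 + ["B"] * 2
test_labels = ["A"] * 60 + ["B"] * 40


def majority_class_accuracy(labels, predicted_class="A"):
    """Accuracy of a model that predicts the same class for every sample."""
    correct = sum(1 for label in labels if label == predicted_class)
    return correct / len(labels)


train_accuracy = majority_class_accuracy(train_labels)
test_accuracy = majority_class_accuracy(test_labels)

# The same model never predicts class B, so it finds none of the B samples.
predictions = ["A"] * len(test_labels)
found_b = sum(1 for p, y in zip(predictions, test_labels) if p == "B" and y == "B")
recall_b = found_b / test_labels.count("B")

print(train_accuracy, test_accuracy, recall_b)
```

The accuracy figures look respectable, while the fraction of class B samples that are detected is zero, which is exactly what accuracy alone hides.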
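Python Example
# Cross Validation Classification Accuracy
import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
# The CSV file has no header row, so column names are supplied here
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv("pima-indians-diabetes.data.csv", names=names)
X = dataframe.iloc[:, :-1]  # feature columns
Y = dataframe.iloc[:, 8]    # class label
seed = 7
# random_state only takes effect (and is only accepted) when shuffle=True
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
# max_iter is raised above the default to reduce convergence warnings
model = LogisticRegression(max_iter=1000)
scoring = 'accuracy'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results.mean())
print(results.std())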
Logarithmic Loss

• Logarithmic Loss, or Log Loss, works by penalizing false classifications, and
penalizes confident wrong predictions most heavily. It
works well for multi-class classification. When working with Log Loss, the classifier
must assign a probability to each class for every sample. Suppose there are N
samples belonging to M classes; then the Log Loss is calculated as below:

$$\text{LogLoss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log(p_{ij})$$
where,
• y_ij, indicates whether sample i belongs to class j or not
• p_ij, indicates the probability of sample i belonging to class j
• Log Loss has no upper bound and takes values in the range [0, ∞). A Log Loss close to
0 indicates that the predicted probabilities agree well with the true classes, whereas
a Log Loss far from 0 indicates poor predictions.
• In general, minimising Log Loss tends to give greater accuracy for the classifier.
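As a rough illustration, Log Loss can be computed directly as the negative average of y_ij · log(p_ij) over all samples. The sketch below uses only the standard library; the three-class labels and probabilities are invented for the example, and the eps clipping is an added safeguard against log(0) that the slides do not mention.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Multi-class log loss.

    y_true: one-hot rows (y_ij); y_prob: predicted probability rows (p_ij).
    """
    n_samples = len(y_true)
    total = 0.0
    for y_row, p_row in zip(y_true, y_prob):
        for y_ij, p_ij in zip(y_row, p_row):
            # Clip to avoid log(0) when a probability is exactly 0 or 1.
            p_ij = min(max(p_ij, eps), 1 - eps)
            total += y_ij * math.log(p_ij)
    return -total / n_samples


# Hypothetical data: three samples, one per class.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
confident_right = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
confident_wrong = [[0.05, 0.9, 0.05], [0.9, 0.05, 0.05], [0.05, 0.05, 0.9]]

loss_right = log_loss(y_true, confident_right)
loss_wrong = log_loss(y_true, confident_wrong)
print(round(loss_right, 3), round(loss_wrong, 3))
```

Only the probability assigned to the true class contributes to each sample's term, so two confidently wrong predictions are enough to push the loss far from 0.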
Python Example
# Cross Validation Classification LogLoss
import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv("pima-indians-diabetes.data.csv", names=names)
X = dataframe.iloc[:, :-1]  # feature columns
Y = dataframe.iloc[:, 8]    # class label
seed = 7
# random_state only takes effect (and is only accepted) when shuffle=True
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
model = LogisticRegression(max_iter=1000)
# scikit-learn reports the negated loss so that larger scores are always better
scoring = 'neg_log_loss'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("Logloss: %.3f (%.3f)" % (results.mean(), results.std()))
Confusion Matrix
• A Confusion Matrix, as the name suggests, gives us a matrix as output and
describes the complete performance of the model.
• Let's assume we have a binary classification problem. We have some
samples belonging to two classes: YES or NO. We also have our own
classifier, which predicts a class for a given input sample. On testing our
model on 165 samples, we get the following result.
• [Table: confusion matrix of predicted vs. actual classes for the 165 test samples]
• There are 4 important terms:
• True Positives: The cases in which we predicted YES and the actual
output was also YES.
• True Negatives: The cases in which we predicted NO and the actual
output was NO.
• False Positives: The cases in which we predicted YES and the actual
output was NO.
• False Negatives: The cases in which we predicted NO and the actual
output was YES.
• Accuracy for the matrix can be calculated by dividing the sum of the values
lying on the "main diagonal" by the total number of samples, i.e.
Accuracy = (True Positives + True Negatives) / Total number of samples
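The four terms can be counted by hand from paired actual and predicted labels. The sketch below is illustrative only: the eight labels are invented (they are not the 165-sample result from the slide) and confusion_counts is a hypothetical helper name.

```python
def confusion_counts(actual, predicted, positive="YES"):
    """Count true/false positives and negatives for a binary problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted YES, actually YES
        elif p != positive and a != positive:
            tn += 1  # predicted NO, actually NO
        elif p == positive and a != positive:
            fp += 1  # predicted YES, actually NO
        else:
            fn += 1  # predicted NO, actually YES
    return tp, tn, fp, fn


# Hypothetical labels for eight test samples.
actual = ["YES", "YES", "YES", "NO", "NO", "NO", "YES", "NO"]
predicted = ["YES", "NO", "YES", "NO", "YES", "NO", "YES", "NO"]

tp, tn, fp, fn = confusion_counts(actual, predicted)
# Accuracy is the main-diagonal total divided by the number of samples.
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(tp, tn, fp, fn, accuracy)
```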
Python Example
# Confusion Matrix on a Hold-out Test Set
import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv("pima-indians-diabetes.data.csv", names=names)
X = dataframe.iloc[:, :-1]  # feature columns
Y = dataframe.iloc[:, 8]    # class label
seed = 7
# Hold out 20% of the samples for testing
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(
    X, Y, test_size=0.2, random_state=seed)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, Y_train)
predicted = model.predict(X_test)
# Rows are actual classes, columns are predicted classes
matrix = confusion_matrix(Y_test, predicted)
print(matrix)
Area Under Curve
• Area Under Curve (AUC) is one of the most widely used metrics for
evaluation. It is used for binary classification problems. The AUC of a classifier is
equal to the probability that the classifier will rank a randomly chosen
positive example higher than a randomly chosen negative example. Before
defining AUC, let us understand two basic terms:
• True Positive Rate (Sensitivity) : True Positive Rate is defined as TP/
(FN+TP). True Positive Rate corresponds to the proportion of positive data
points that are correctly considered as positive, with respect to all positive
data points.
• False Positive Rate (1-Specificity) : False Positive Rate is defined as FP / (FP+TN). False Positive
Rate corresponds to the proportion of negative data points that are mistakenly considered as
positive, with respect to all negative data points.

• Specificity is also known as the True Negative Rate: Specificity = TN / (TN + FP).
• So, False Positive Rate = 1 - Specificity = FP / (FP + TN).
• False Positive Rate and True Positive Rate both have values in the range
[0, 1]. FPR and TPR are both computed at a series of threshold values such as (0.00,
0.02, 0.04, …, 1.00) and plotted against each other. The resulting curve is the
ROC curve, and AUC is the area under this plot of True Positive Rate (y-axis)
against False Positive Rate (x-axis).
• AUC therefore has a range of [0, 1]. A value of 0.5 corresponds to random
ranking, and the greater the value, the better the performance of our model.
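The threshold sweep can be sketched in plain Python. The four labels and scores below are invented for illustration, and the helper names (roc_points, trapezoid_auc, rank_auc) are hypothetical. The sketch computes AUC twice: once as the trapezoidal area under the TPR vs FPR points, and once from the ranking interpretation (the probability that a random positive scores higher than a random negative).

```python
def roc_points(y_true, scores, thresholds):
    """Return sorted (FPR, TPR) pairs; a sample is positive if score >= threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return sorted(points)


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through the sorted points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def rank_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = positive) and classifier scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
thresholds = [i / 50 for i in range(51)]  # 0.00, 0.02, ..., 1.00

auc_from_curve = trapezoid_auc(roc_points(y_true, scores, thresholds))
auc_from_ranking = rank_auc(y_true, scores)
print(auc_from_curve, auc_from_ranking)
```

The two values agree here because the 0.02 grid separates every pair of distinct scores; with scores closer together than the grid spacing, the curve-based estimate would only approximate the ranking-based one.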
Python example
# Cross Validation Classification ROC AUC
import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv("pima-indians-diabetes.data.csv", names=names)
X = dataframe.iloc[:, :-1]  # feature columns
Y = dataframe.iloc[:, 8]    # class label
seed = 7
# random_state only takes effect (and is only accepted) when shuffle=True
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
model = LogisticRegression(max_iter=1000)
scoring = 'roc_auc'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print(results)          # AUC for each of the 10 folds
print(results.mean())
print(results.std())
F1 Score

• F1 Score is used to measure a test's accuracy.

• F1 Score is the harmonic mean of precision and recall. The range
for F1 Score is [0, 1]. It tells you how precise your classifier is (how many
of the instances it labels positive are actually positive), as well as how robust
it is (it does not miss a significant number of positive instances).
• High precision with low recall gives you a classifier that is very accurate on the
instances it flags, but that misses a large number of instances that are difficult
to classify. The greater the F1 Score, the better the performance of our model.
Mathematically, it can be expressed as:
$$F1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
F1 Score tries to find the balance between precision and recall.
• Precision : It is the number of correct positive results divided by the number of
positive results predicted by the classifier.
• Recall : It is the number of correct positive results divided by the number of all
relevant samples (all samples that should have been identified as positive).
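Precision, recall and their harmonic mean follow directly from the true positive, false positive and false negative counts. In this sketch the counts are invented and precision_recall_f1 is a hypothetical helper; returning 0.0 when a denominator is zero is an assumed convention, not something the slides specify.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical classifier that flags very few samples, all of them correctly.
cautious = precision_recall_f1(tp=10, fp=0, fn=90)
# Hypothetical classifier with matching precision and recall.
balanced = precision_recall_f1(tp=80, fp=20, fn=20)

print(cautious)
print(balanced)
```

The cautious classifier has perfect precision but a low F1, because the harmonic mean is pulled toward the smaller of the two values.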
Python example
# Classification Report on a Hold-out Test Set
import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv("pima-indians-diabetes.data.csv", names=names)
X = dataframe.iloc[:, :-1]  # feature columns
Y = dataframe.iloc[:, 8]    # class label
# test_size was undefined in the original listing; 0.2 matches the confusion matrix example
test_size = 0.2
seed = 7
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(
    X, Y, test_size=test_size, random_state=seed)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, Y_train)
predicted = model.predict(X_test)
# The report lists precision, recall and F1 score for each class
report = classification_report(Y_test, predicted)
print(report)
Mean Absolute Error
• Mean Absolute Error is the average of the absolute differences between the Original
Values and the Predicted Values.
• It gives us a measure of how far the predictions were from the actual output.
However, it doesn't give us any idea of the direction of the error, i.e. whether we
are under-predicting or over-predicting the data.
• Mathematically, with y_j the original value and \hat{y}_j the predicted value of
sample j, it is represented as:
$$\text{MAE} = \frac{1}{N} \sum_{j=1}^{N} \left| y_j - \hat{y}_j \right|$$
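A short sketch makes the loss of direction concrete: MAE is the mean of |original - predicted|, so consistently over-predicting and consistently under-predicting by the same amount give the same score. The values and the mean_signed_error helper are invented for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_signed_error(actual, predicted):
    """Average of (predicted - actual); the sign shows over- or under-prediction."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical targets with two sets of predictions, each off by 2 everywhere.
actual = [10.0, 20.0, 30.0, 40.0]
over = [12.0, 22.0, 32.0, 42.0]
under = [8.0, 18.0, 28.0, 38.0]

mae_over = mean_absolute_error(actual, over)
mae_under = mean_absolute_error(actual, under)
bias_over = mean_signed_error(actual, over)
bias_under = mean_signed_error(actual, under)
print(mae_over, mae_under, bias_over, bias_under)
```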
Boston Housing dataset
Attribute Information:

1. CRIM per capita crime rate by town

2. ZN proportion of residential land zoned for lots over 25,000 sq.ft.
3. INDUS proportion of non-retail business acres per town
4. CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
5. NOX nitric oxides concentration (parts per 10 million)
6. RM average number of rooms per dwelling
7. AGE proportion of owner-occupied units built prior to 1940
8. DIS weighted distances to five Boston employment centres
9. RAD index of accessibility to radial highways
10. TAX full-value property-tax rate per $10,000
11. PTRATIO pupil-teacher ratio by town
12. B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
13. LSTAT % lower status of the population
14. MEDV Median value of owner-occupied homes in $1000’s
(The dataset is available on Kaggle.com as Boston Housing Dataset)
Python Example
# Cross Validation Regression MAE
import pandas
from sklearn import model_selection
from sklearn.linear_model import LinearRegression
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
# The file is whitespace-delimited and has no header row
dataframe = pandas.read_csv("housing.data", sep=r"\s+", names=names)
array = dataframe.values
X = array[:,0:13]  # the 13 input attributes
Y = array[:,13]    # MEDV, the target
seed = 7
# random_state only takes effect (and is only accepted) when shuffle=True
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
model = LinearRegression()
# scikit-learn reports the negated error so that larger scores are always better
scoring = 'neg_mean_absolute_error'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("MAE: %.3f (%.3f)" % (results.mean(), results.std()))
Mean Squared Error
• Mean Squared Error (MSE) is quite similar to Mean Absolute Error, the
only difference being that MSE takes the average of the square of the
difference between the original values and the predicted values.
• The advantage of MSE is that its gradient is easy to compute, because the
squared error is differentiable everywhere, whereas the absolute error used in
Mean Absolute Error is not differentiable at zero and needs more involved
optimisation tools (such as linear programming).
• As we take the square of the error, the effect of larger errors becomes more
pronounced than that of smaller errors, hence the model can now focus more on
the larger errors.
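The effect of squaring can be seen by comparing two sets of predictions that share the same MAE. This is a sketch with invented numbers; the mse_gradient helper is hypothetical and simply evaluates the derivative of MSE with respect to each prediction, 2 · (predicted - actual) / N.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mse_gradient(actual, predicted):
    """Derivative of MSE with respect to each individual prediction."""
    n = len(actual)
    return [2 * (p - a) / n for a, p in zip(actual, predicted)]


# Hypothetical targets and two prediction sets with the same total absolute error.
actual = [3.0, 5.0, 7.0, 9.0]
small_errors = [4.0, 6.0, 8.0, 10.0]     # every prediction off by 1
one_large_error = [3.0, 5.0, 7.0, 13.0]  # a single prediction off by 4

mae_small = mean_absolute_error(actual, small_errors)
mae_large = mean_absolute_error(actual, one_large_error)
mse_small = mean_squared_error(actual, small_errors)
mse_large = mean_squared_error(actual, one_large_error)
grad_large = mse_gradient(actual, one_large_error)
print(mae_small, mae_large, mse_small, mse_large, grad_large)
```

MAE cannot tell the two cases apart, while MSE is four times larger for the single big miss, and the gradient is non-zero only for the prediction that is wrong.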
Python Example
# Cross Validation Regression MSE
import pandas
from sklearn import model_selection
from sklearn.linear_model import LinearRegression
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
# The file is whitespace-delimited and has no header row
dataframe = pandas.read_csv("housing.data", sep=r"\s+", names=names)
array = dataframe.values
X = array[:,0:13]  # the 13 input attributes
Y = array[:,13]    # MEDV, the target
seed = 7
# random_state only takes effect (and is only accepted) when shuffle=True
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
model = LinearRegression()
# scikit-learn reports the negated error so that larger scores are always better
scoring = 'neg_mean_squared_error'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("MSE: %.3f (%.3f)" % (results.mean(), results.std()))
R-Squared Metric
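• The R^2 (or R Squared) metric provides an indication of the goodness of fit of a set
of predictions to the actual values. In statistical literature, this measure is called the
coefficient of determination.
• For a model that fits at least as well as always predicting the mean, this is a value
between 0 and 1, for no fit and perfect fit respectively. It can be negative when the
model fits worse than the mean.
• With \bar{y} the mean of the actual values: $$R^2 = 1 - \frac{\sum_j (y_j - \hat{y}_j)^2}{\sum_j (y_j - \bar{y})^2}$$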
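R^2 can be computed by hand as 1 minus the ratio of the residual sum of squares to the total sum of squares around the mean of the actual values. The sketch below uses invented numbers and a hypothetical r_squared helper to show a perfect fit, a mean-only prediction, and a fit that is worse than the mean.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical target values.
actual = [1.0, 2.0, 3.0, 4.0]

perfect = r_squared(actual, actual)                 # predictions equal the targets
mean_only = r_squared(actual, [2.5] * len(actual))  # always predict the mean
reversed_fit = r_squared(actual, [4.0, 3.0, 2.0, 1.0])  # worse than the mean
print(perfect, mean_only, reversed_fit)
```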
Python Example
# Cross Validation Regression R^2
import pandas
from sklearn import model_selection
from sklearn.linear_model import LinearRegression
# Remote copy of the file as given in the source; not used below, but it can be
# passed to read_csv in place of the local path
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.data"
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
# The file is whitespace-delimited and has no header row
dataframe = pandas.read_csv("housing.data", sep=r"\s+", names=names)
array = dataframe.values
X = array[:,0:13]  # the 13 input attributes
Y = array[:,13]    # MEDV, the target
seed = 7
# random_state only takes effect (and is only accepted) when shuffle=True
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
model = LinearRegression()
scoring = 'r2'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("R^2: %.3f (%.3f)" % (results.mean(), results.std()))
Thank You
