
Performance Evaluation

Lili Ayu Wulandhari, Ph.D.


Test Sets

• The learning process is carried out to recognize patterns in datasets.

• To avoid biased pattern-recognition results, the data is divided into
  in-sample and out-sample data, and these two sets must be independent
  of each other.
• In-sample (training) data is used to obtain the learning model, such as
  the number of hidden layers, the number of neurons per hidden layer, and
  acceptable weights that fit the pattern.
• Out-sample data is used for validation, which yields the selected model,
  and for testing, which evaluates the model and the weights obtained from
  training.
• In-sample and out-sample data can be divided by several methods
  (a short sketch follows this list), for instance:
  1. Cross-validation
  2. Split average
• The appropriate model and weights are determined by high accuracy,
  recall, and precision.
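
A minimal sketch of both splitting approaches, assuming scikit-learn is available; the iris dataset and decision tree classifier are placeholder choices for illustration, not part of the lecture:

```python
# Minimal sketch: holdout split vs. k-fold cross-validation (scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Holdout: independent in-sample (training) and out-sample (test) data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))

# 5-fold cross-validation: each sample serves once as out-sample data.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print("fold accuracies:", scores)
```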
Confusion Matrix

Table 5.1: Confusion Matrix

                   Predicted    Predicted
                   Positives    Negatives
Actual Positives       a            b
Actual Negatives       c            d
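
The four cells can be tallied directly from paired actual and predicted labels. A minimal sketch; the label lists below are made up for illustration:

```python
# Tallying the four cells of Table 5.1 from paired labels (1 = positive).
actual    = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 1, 1, 0, 0]

pairs = list(zip(actual, predicted))
a = sum(y == 1 and p == 1 for y, p in pairs)  # true positives
b = sum(y == 1 and p == 0 for y, p in pairs)  # false negatives
c = sum(y == 0 and p == 1 for y, p in pairs)  # false positives
d = sum(y == 0 and p == 0 for y, p in pairs)  # true negatives
print(a, b, c, d)  # 3 1 1 3
```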
Confusion Matrix

• As an example, assume a sample of 23 fruits divided into 3 classes,
  namely 8 oranges, 10 apples, and 5 grapes. The confusion matrix can be
  described as in Table 5.2 below.
Table 5.2: Classification Matrix for 3-Class Classification

                  Predicted    Predicted    Predicted
                  Oranges      Apples       Grapes
Actual Oranges        6            1            1
Actual Apples         2            8            0
Actual Grapes         1            1            3
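
A sketch reproducing Table 5.2 with scikit-learn's confusion_matrix; the label lists here are constructed to match the table's counts:

```python
# Reproducing Table 5.2; labels constructed to match the fruit example.
from sklearn.metrics import confusion_matrix

actual = ["orange"] * 8 + ["apple"] * 10 + ["grape"] * 5
predicted = (["orange"] * 6 + ["apple", "grape"]      # oranges: 6 correct, 1 as apple, 1 as grape
             + ["orange"] * 2 + ["apple"] * 8         # apples: 2 as orange, 8 correct
             + ["orange", "apple"] + ["grape"] * 3)   # grapes: 1 as orange, 1 as apple, 3 correct

print(confusion_matrix(actual, predicted, labels=["orange", "apple", "grape"]))
# [[6 1 1]
#  [2 8 0]
#  [1 1 3]]
```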
Confusion Matrix

Table 5.3: Confusion Table of the Orange Class

6 true positives                           2 false negatives
(actual oranges which are correctly        (oranges which are incorrectly
classified as oranges)                     identified as apples and grapes)

3 false positives                          12 true negatives
(non-oranges which are incorrectly         (all the remaining fruits which are
identified as oranges)                     correctly identified as non-oranges)
Confusion Matrix

Table 5.4: Confusion Table of the Apple Class

8 true positives                           2 false negatives
(actual apples which are correctly         (apples which are incorrectly
classified as apples)                      identified as oranges)

2 false positives                          11 true negatives
(non-apples which are incorrectly          (all the remaining fruits which are
identified as apples)                      correctly identified as non-apples)
Confusion Matrix

Table 5.5: Confusion Table of the Grape Class

3 true positives                           2 false negatives
(actual grapes which are correctly         (grapes which are incorrectly
classified as grapes)                      identified as oranges and apples)

1 false positive                           17 true negatives
(non-grapes which are incorrectly          (all the remaining fruits which are
identified as grapes)                      correctly identified as non-grapes)
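
Tables 5.3 to 5.5 follow mechanically from Table 5.2: for each class, TP is its diagonal cell, FN is the rest of its row, FP is the rest of its column, and TN is everything else. A short sketch of that derivation:

```python
# Deriving each class's TP/FN/FP/TN from the 3x3 matrix of Table 5.2.
# Rows are actual classes, columns are predicted classes.
M = [[6, 1, 1],   # actual oranges
     [2, 8, 0],   # actual apples
     [1, 1, 3]]   # actual grapes
total = sum(sum(row) for row in M)  # 23 fruits

for k, name in enumerate(["orange", "apple", "grape"]):
    tp = M[k][k]                              # diagonal cell
    fn = sum(M[k]) - tp                       # rest of the row
    fp = sum(M[i][k] for i in range(3)) - tp  # rest of the column
    tn = total - tp - fn - fp                 # everything else
    print(f"{name}: TP={tp} FN={fn} FP={fp} TN={tn}")
# orange: TP=6 FN=2 FP=3 TN=12
# apple:  TP=8 FN=2 FP=2 TN=11
# grape:  TP=3 FN=2 FP=1 TN=17
```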
Classification Accuracy

• Accuracy shows the proportion of correct predictions obtained from the
  model against the actual classes. Accuracy is calculated using:

                   Predicted    Predicted
                   Positives    Negatives
Actual Positives      TP           FN
Actual Negatives      FP           TN

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%$$
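
As a worked example, plugging the orange-class counts from Table 5.3 into the formula:

```python
# Accuracy for the orange class, using the counts from Table 5.3.
tp, fn, fp, tn = 6, 2, 3, 12
accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
print(f"orange accuracy = {accuracy:.1f}%")  # 78.3%

# Overall 3-class accuracy is the diagonal sum over the total: (6+8+3)/23.
print(f"overall accuracy = {(6 + 8 + 3) / 23 * 100:.1f}%")  # 73.9%
```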
Recall

Recall, or true positive rate (TPR), is the rate of correct positive
predictions to all actual positive classes:

                   Predicted    Predicted
                   Positives    Negatives
Actual Positives      TP           FN
Actual Negatives      FP           TN

$$\text{Recall} = \frac{TP}{TP + FN} \times 100\%$$
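
Using the orange-class counts from Table 5.3:

```python
# Recall for the orange class: 6 of the 8 actual oranges were found.
tp, fn = 6, 2
recall = tp / (tp + fn) * 100
print(f"recall = {recall:.1f}%")  # 75.0%
```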
Precision

Precision is the rate of correct positive predictions to all positive
predicted classes:

                   Predicted    Predicted
                   Positives    Negatives
Actual Positives      TP           FN
Actual Negatives      FP           TN

$$\text{Precision} = \frac{TP}{TP + FP} \times 100\%$$
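
Again for the orange class of Table 5.3:

```python
# Precision for the orange class: 6 of the 9 predicted oranges were real.
tp, fp = 6, 3
precision = tp / (tp + fp) * 100
print(f"precision = {precision:.1f}%")  # 66.7%
```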
F1 Score
The F1 score presents the balance between precision and recall:

$$\text{F1 Score} = \frac{2 \times (\text{Recall} \times \text{Precision})}{\text{Recall} + \text{Precision}}$$
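
Combining the orange-class recall and precision computed above:

```python
# F1 score for the orange class (harmonic mean of recall and precision).
recall, precision = 75.0, 200 / 3  # per cent, from the examples above
f1 = 2 * (recall * precision) / (recall + precision)
print(f"F1 = {f1:.1f}")  # 70.6
```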
