Week 3

Introductory Applied Machine Learning

Generalization, Overfitting, Evaluation

Victor Lavrenko and Nigel Goddard


School of Informatics
Generalization
• Training data: {xi, yi}
  • examples that we used to train our predictor
  • e.g. all emails that our users labelled ham / spam
• Future data: {xi, ?}
  • examples that our classifier has never seen before
  • e.g. emails that will arrive tomorrow
• Want to do well on future data, not training
  • not very useful: we already know yi
  • easy to be perfect on training data (DT, kNN, kernels)
  • does not mean you will do well on future data
  • can over-fit to idiosyncrasies of our training data
Under- and Over-fitting
• Over-fitting:
  • predictor too complex (flexible)
  • fits “noise” in the training data
    • patterns that will not re-appear
  • predictor F over-fits the data if:
    • we can find another predictor F’
    • which makes more mistakes on training data: Etrain(F’) > Etrain(F)
    • but fewer mistakes on unseen future data: Egen(F’) < Egen(F)
• Under-fitting:
  • predictor too simplistic (too rigid)
  • not powerful enough to capture salient patterns in data
  • can find another predictor F’ with smaller Etrain and Egen
[Figure: an under-fit F and an over-fit F compared on training and future data]
Under- and Over-fitting examples
• Regression:
  • predictor too inflexible: cannot capture the pattern
  • predictor too flexible: fits noise in the data
• Classification:
[Figure: under-fit vs. over-fit examples for regression and classification]


Flexible vs. inflexible predictors
• Each dataset needs a different level of “flexibility”
  • depends on task complexity + available data
  • want a “knob” to get rigid / flexible predictors
• Most learning algorithms have such knobs:
  • regression: order of the polynomial
  • NB: number of attributes, limits on σ², ε
  • DT: #nodes in the tree / pruning confidence
  • kNN: number of nearest neighbors
  • SVM: kernel type, cost parameter
• Tune to minimize generalization error (see the sketch below)
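A minimal sketch of turning such a knob for polynomial regression, under assumptions not in the slides: a hypothetical noisy sine dataset and numpy's polyfit/polyval as the learner, with training and held-out error compared at each polynomial order.

```python
# Sketch (not from the slides): vary the "flexibility knob" for polynomial
# regression and compare training vs. held-out error on made-up data.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

x_train, y_train = make_data(30)   # training set
x_test, y_test = make_data(1000)   # stand-in for "future" data

for order in [0, 1, 3, 9]:         # the knob: order of the polynomial
    coeffs = np.polyfit(x_train, y_train, order)
    mse = lambda x, y: np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"order={order}  E_train={mse(x_train, y_train):.3f}  "
          f"E_test={mse(x_test, y_test):.3f}")
# E_train can only go down as the order grows; E_test typically falls and
# then rises again once the polynomial starts fitting noise (over-fitting).
```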
Training vs. Generalization Error
• Training error: the error measured on the training examples
  • Etrain = (1/n) Σi error(f(xi), yi), averaged over the n training examples
  • f(xi) = the value we predicted, yi = the true value
• Generalization error: how well we will do on future data
  • don’t know what future data xi will be
  • don’t know what labels yi it will have
  • but we know the “range” of all possible {x,y}
    • e.g. x: all possible 20x20 black/white bitmaps, y: {0,1,…,9} (digits)
  • Egen = Σx,y p(x,y) · error(f(x), y), where p(x,y) is how often we expect to see such an x and y
  • can never compute it as before: cannot enumerate all possible x,y
• Are the two the same? If different, by how much? Usually Etrain ≤ Egen
Estimating Generalization Error
• Testing error: the same error measure, computed over a testing set
  • set aside part of the training data (the testing set)
  • learn a predictor without using any of this data
  • predict values for the testing set, compute the error
  • gives an estimate of the true generalization error
  • if the testing set is an unbiased sample from p(x,y): Etest → Egen as n → ∞
• How close? Depends on n
  • Ex: binary classification, 100 instances
  • assume: 75 classified correctly, 25 incorrectly
  • Etest = 0.25, Egen around 0.25, but how close?
Confidence Interval for Future Error
• What range of errors can we expect for future test sets?
  • Etest ± ΔE such that 95% of future test sets fall within that interval
• Etest is an unbiased estimate of E = true error rate
  • E = probability our system will misclassify a random instance
  • take a random set of n instances, how many are misclassified? our test set is one such set
  • flip an E-biased coin n times, how many heads will we get?
  • Binomial distribution with mean = nE, variance = nE(1-E)
  • Efuture = #misclassified / n, approximately Gaussian with mean E, variance = E(1-E)/n
  • 2/3 of future test sets will have error in E ± √(E(1-E)/n)
• p% confidence interval for future error: Etest ± zp·√(Etest(1-Etest)/n)
  • for n = 100 examples, p = 0.95 and E = 0.25
  • σ = √(0.25×0.75/100) = 0.043
  • CI = 0.25 ± 1.96·σ = 0.25 ± 0.08
  • n = 100, p = 0.99 → CI = 0.25 ± 0.11
  • n = 10000, p = 0.95 → CI = 0.25 ± 0.008
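A small sketch of this interval calculation; z = 1.96 and z = 2.58 are the standard normal quantiles for 95% and 99% confidence used in the numbers above.

```python
# Sketch of the normal-approximation interval: Etest ± z * sqrt(Etest(1-Etest)/n)
import math

def error_confidence_interval(n_wrong, n_total, z=1.96):
    e = n_wrong / n_total                            # observed test error rate
    half_width = z * math.sqrt(e * (1 - e) / n_total)
    return e - half_width, e + half_width

print(error_confidence_interval(25, 100))            # ~(0.17, 0.33), i.e. 0.25 ± 0.08
print(error_confidence_interval(25, 100, z=2.58))    # 0.25 ± ~0.11
print(error_confidence_interval(2500, 10000))        # 0.25 ± ~0.008
```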
Training, Validation, Testing sets
• Training set: construct classifier
  • NB: count frequencies, DT: pick attributes to split on
• Validation set: pick algorithm + knob settings
  • pick best-performing algorithm (NB vs. DT vs. …)
  • fine-tune knobs (tree depth, k in kNN, c in SVM …)
• Testing set: estimate future error rate
  • never report best of many runs
  • run only once, or report results of every run
• Split randomly to avoid bias
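A minimal sketch of such a random three-way split in plain numpy; the 60/20/20 proportions are an illustrative assumption, not something the slides prescribe.

```python
# Sketch: random train / validation / test split of n instances.
import numpy as np

def split_indices(n, frac_train=0.6, frac_val=0.2, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)                  # shuffle to avoid ordering bias
    n_train = int(frac_train * n)
    n_val = int(frac_val * n)
    return (idx[:n_train],                    # training: construct the classifier
            idx[n_train:n_train + n_val],     # validation: pick algorithm / knobs
            idx[n_train + n_val:])            # testing: one final error estimate

train_idx, val_idx, test_idx = split_indices(1000)
```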
Cross-validation
• Conflicting priorities when splitting the dataset
  • estimate future error as accurately as possible
    • large testing set: big ntest → tight confidence interval
  • learn classifier as accurately as possible
    • large training set: big ntrain → better estimates
  • training and testing cannot overlap: ntrain + ntest = const
• Idea: evaluate Train → Test, then Test → Train, average results
  • every point is both training and testing, never at the same time
  • reduces chances of getting an unusual (biased) testing set
• 5-fold cross-validation
  • randomly split the data into 5 sets
  • test on each in turn (train on 4 others)
  • average the results over 5 folds
  • more common: 10-fold
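A plain-numpy sketch of k-fold cross-validation; `fit` and `error` are hypothetical callables standing in for whatever learner and error measure is being evaluated.

```python
# Sketch: k-fold cross-validation over disjoint random folds.
import numpy as np

def cross_val_error(X, y, fit, error, k=10, seed=0):
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)   # k random, disjoint folds
    fold_errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])           # train on the other k-1 folds
        fold_errors.append(error(model, X[test_idx], y[test_idx]))  # test on held-out fold
    return np.mean(fold_errors)                           # average over the k folds
```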
Leave-one-out
• n-fold cross-validation (n = total number of instances)
  • predict each instance, training on all (n-1) other instances
• Pros and cons:
  • best possible classifier learned: n-1 training examples
  • high computational cost: re-learn everything n times
    • not an issue for instance-based methods like kNN
    • there are tricks to make such learning faster
  • classes not balanced in training / testing sets
    • random data, 2 equi-probable classes → wrong 100% of the time
    • testing balance: {1 of A, 0 of B} vs. training: {n/2 of B, n/2-1 of A}
    • duplicated data → nothing can beat 1NN (0% error)
    • wouldn’t happen with 10-fold cross-validation
Stratification
• Problems with leave-one-out:
  • training / testing sets have classes in different proportions
  • not limited to leave-one-out:
    • K-fold cross-validation: random splits → imbalance
• Stratification
  • keep class labels balanced across training / testing sets
  • simple way to guard against unlucky splits
  • recipe:
    • randomly split each class into K parts
    • assemble the i-th part from all classes to make the i-th fold
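A sketch of that recipe in numpy: shuffle each class separately, cut it into K parts, and build the i-th fold from the i-th part of every class.

```python
# Sketch: stratified folds with roughly equal class proportions in each fold.
import numpy as np

def stratified_folds(y, k=10, seed=0):
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))  # this class, shuffled
        for i, part in enumerate(np.array_split(idx, k)):
            folds[i].extend(part)                          # i-th part -> i-th fold
    return [np.array(fold) for fold in folds]
# Each fold now has roughly the same class proportions as the whole dataset.
```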


Evaluation measures
• Are we doing well? Is system A better than B?
• A measure of how (in)accurate a system is on a task
  • in many cases Error (Accuracy / PC) is not the best measure
  • using the appropriate measure will help select the best algorithm
• Classification
  • how often we classify something right / wrong
• Regression
  • how close are we to what we’re trying to predict
• Unsupervised
  • how well do we describe our data
  • in general – really hard
Classification measures: basics
• Confusion matrix for two-class classification:

                            Predict positive?
                            Yes     No
  Really positive?   Yes    TP      FN
                     No     FP      TN

• Want: large diagonal, small FP, FN
[Figure: all testing instances split by whether the system predicts positive and whether they are really positive, giving the True Positives (TP), False Positives (FP), False Negatives (FN) and True Negatives (TN)]
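A minimal sketch of tallying these four counts from 0/1 true labels and predictions; the toy arrays are purely illustrative.

```python
# Sketch: two-class confusion counts from 0/1 labels and predictions.
import numpy as np

def confusion_counts(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    return tp, fp, fn, tn

print(confusion_counts([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))   # (2, 1, 1, 1)
```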
Classification Error
• Classification error = (FP + FN) / (TP + FP + TN + FN)
• Accuracy = (1 – error) = (TP + TN) / (TP + FP + TN + FN)
• Basic measure of “goodness” of a classifier
• Problem: cannot handle unbalanced classes
  • ex1: predict whether an earthquake is about to happen
    • earthquakes happen very rarely, so accuracy is very good if we always predict “No”
    • solution: make FNs much more “costly” than FPs
  • ex2: web search: decide if a webpage is relevant to the user
    • 99.9999% of pages are not relevant to any query → best accuracy is to retrieve nothing
    • solution: use measures that don’t involve TN (recall / precision)


Accuracy and un-balanced classes
• You’re predicting Nobel prize (+) vs. not (-)
• A human would prefer classifier A
• Accuracy will prefer classifier B (fewer errors)
• Accuracy is a poor metric here
[Figure: classifiers A and B on data with very few positive (+) instances among many negatives (-)]
Misses and False Alarms
• False Alarm rate = False Positive rate = FP / (FP+TN)
  • % of negatives we misclassified as positive
• Miss rate = False Negative rate = FN / (TP+FN)
  • % of positives we misclassified as negative
• Recall = True Positive rate = TP / (TP+FN)
  • % of positives we classified correctly (1 – Miss rate)
• Precision = TP / (TP + FP)
  • % positive out of what we predicted was positive
• Meaningless to report just one of these
  • trivial to get 100% recall or 0% false alarm
  • typical: recall/precision or Miss / FA rate or TP/FP rate


Evaluation (recap)
• Predicting class C (e.g. spam)
• Classifier can make two types of mistakes:
  • FP: false positives – non-spam emails mistakenly classified as spam
  • FN: false negatives – spam emails mistakenly classified as non-spam
  • TP/TN: true positives/negatives – correctly classified spam/non-spam
• Common error/accuracy measures:
  • Classification Error = (FP + FN) / (TP + FP + TN + FN)
  • Accuracy = 1 – Error = (TP + TN) / (TP + FP + TN + FN)
  • False Alarm = False Positive rate = FP / (FP+TN)
  • Miss = False Negative rate = FN / (TP+FN)
  • Recall = True Positive rate = TP / (TP+FN)
  • Precision = TP / (TP+FP)
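A small sketch computing the measures above from the four confusion counts; it assumes both classes and both predictions actually occur, so no denominator is zero.

```python
# Sketch: standard classification measures from TP, FP, FN, TN.
def classification_measures(tp, fp, fn, tn):
    n = tp + fp + fn + tn
    return {
        "error": (fp + fn) / n,
        "accuracy": (tp + tn) / n,
        "false alarm rate": fp / (fp + tn),   # FP rate
        "miss rate": fn / (tp + fn),          # FN rate
        "recall": tp / (tp + fn),             # TP rate
        "precision": tp / (tp + fp),
    }

print(classification_measures(tp=40, fp=10, fn=20, tn=30))
```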


Utility and Cost
• Sometimes need a single-number evaluation measure
  • optimizing the learner (automatically), competitive evaluation
  • may know the costs of different errors, e.g. earthquakes:
    • false positive: cost of preventive measures (evacuation, lost profit)
    • false negative: cost of recovery (reconstruction, liability)
• Detection cost: weighted average of FP, FN rates
  • Cost = CFP · FP + CFN · FN [event detection]
• F-measure: harmonic mean of recall and precision
  • F1 = 2 / (1/Recall + 1/Precision) [Information Retrieval]
• Domain-specific measures:
  • e.g. observed profit/loss from +/- market prediction
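A tiny sketch of these two single-number summaries; the cost weights are hypothetical, chosen only to make false negatives (missed events) more expensive, in the spirit of the earthquake example.

```python
# Sketch: detection cost and F1 from error rates / recall and precision.
def detection_cost(fp_rate, fn_rate, c_fp=1.0, c_fn=10.0):
    return c_fp * fp_rate + c_fn * fn_rate     # weighted combination of error rates

def f1_score(recall, precision):
    return 2 / (1 / recall + 1 / precision)    # harmonic mean

print(detection_cost(fp_rate=0.20, fn_rate=0.05))   # 0.7 with these example costs
print(f1_score(recall=0.8, precision=0.5))          # ~0.615
```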
Thresholds in Classification
• Two systems have the following performance:
  • A: True Positive rate = 50%, False Positive rate = 20%
  • B: True Positive rate = 100%, False Positive rate = 60%
• Which is better? (assuming no a-priori utilities)
  • a very misleading question
  • A and B could be the exact same system
  • operating at different thresholds


ROC curves
• Many algorithms compute a “confidence” f(x)
  • threshold to get a decision: spam if f(x) > t, non-spam if f(x) ≤ t
  • Naïve Bayes: P(spam|x) > 0.5, Linear/Logistic/SVM: wᵀx > 0, Decision Tree: p+/p- > 1
• Threshold t determines the error rates
  • False Positive rate = P(f(x) > t | ham), True Positive rate = P(f(x) > t | spam)
• Receiver Operating Characteristic (ROC):
  • plot TP rate vs. FP rate as t varies from ∞ to -∞
  • shows the performance of a system across all possible thresholds
  • AUC = area under the ROC curve, a popular alternative to Accuracy
[Figure: ROC curves for systems A, B and C in the TP-rate vs. FP-rate plane; perfect performance is the top-left corner]
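A sketch of tracing the ROC curve by sweeping the threshold over the confidence scores f(x), then integrating the area under it; the scores and labels in the example are made up, and ties between scores are not handled specially.

```python
# Sketch: ROC curve and AUC from confidence scores and 0/1 labels.
import numpy as np

def roc_curve(scores, labels):
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(-scores)              # sweep t from +inf downwards
    tp = np.cumsum(labels[order] == 1)       # positives with f(x) > t
    fp = np.cumsum(labels[order] == 0)       # negatives with f(x) > t
    tpr = tp / max(tp[-1], 1)                # True Positive rate
    fpr = fp / max(fp[-1], 1)                # False Positive rate
    return np.concatenate(([0.0], fpr)), np.concatenate(([0.0], tpr))

def auc(fpr, tpr):
    # area under the curve by the trapezoid rule
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

fpr, tpr = roc_curve([0.9, 0.8, 0.6, 0.4, 0.2], [1, 1, 0, 1, 0])
print(auc(fpr, tpr))   # ~0.83: 5 of the 6 (positive, negative) pairs are ranked correctly
```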
Evaluating regression
• Classification: count how often we are wrong
• Regression: predict numbers yi from inputs xi
  • always wrong, but by how much?
  • distance between predicted & true values over the testing set
• (Root) mean squared error:
  • popular, well-understood, nicely differentiable
  • sensitive to single large errors (outliers)
• Mean absolute error:
  • less sensitive to outliers
• Correlation coefficient:
  • insensitive to mean & scale
Mean Squared Error
• Average (squared) deviation from the truth
• Very sensitive to outliers
  • 99 predictions exact, 1 off by $10 has the same (large) effect on MSE as all 100 wrong by $1
• Sensitive to mean / scale
• Relative squared error (Weka):
  • RSE = MSE of the predictor / MSE when using the mean as the predictor
  • the mean µy = 1/n Σi yi is a good baseline


Mean Absolute Error
• Mean Absolute Error (MAE):
  • less sensitive to outliers
  • many small errors count the same as one large error of the same total amount
  • best 0th-order baseline: median{yi}
    • not the mean, as for MSE
• Median Absolute Deviation (MAD): med{|f(xi) - yi|}
  • robust, completely ignores outliers
  • can define a similar squared error: median{(f(xi) - yi)²}
  • difficult to work with (can’t take derivatives)
• Sensitive to mean, scale


Correlation Coefficient
• Completely insensitive to mean / scale:
  • CC compares the prediction relative to its mean with the truth relative to its mean
• Intuition: did you capture the relative ordering?
  • output larger f(xi) for larger yi
  • output smaller f(xi) for smaller yi
• Useful for ranking tasks:
  • e.g. recommend a movie to a user
• Important to visualize the data
  • very different predictors can have the same CC (same CC for 4 predictors)
[Figure: scatter plots of predicted vs. true values for 4 different predictors with the same correlation coefficient]
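A plain-numpy sketch of the regression measures from these slides (MSE, RMSE, MAE, MAD, relative squared error, and the correlation coefficient), applied to made-up values.

```python
# Sketch: regression error measures from true and predicted values.
import numpy as np

def regression_measures(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = np.mean(err ** 2)
    baseline_mse = np.mean((np.mean(y_true) - y_true) ** 2)   # predict the mean
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "MAD": np.median(np.abs(err)),                  # robust, ignores outliers
        "relative squared error": mse / baseline_mse,   # Weka-style RSE
        "correlation": np.corrcoef(y_true, y_pred)[0, 1],
    }

print(regression_measures([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 4.4]))
```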
Summary
• Training vs. generalization error
  • under-fitting and over-fitting
• Estimate how well your system is doing its job
  • how does it compare to other approaches?
  • what will be the error rate on future data?
• Training and testing
  • cross-validation, leave-one-out, stratification, significance
• Evaluation measures
  • accuracy, miss / false alarm rates, detection cost
  • ROC curves
  • regression: (root) mean squared/absolute error, correlation
Copyright © 2014 Victor Lavrenko
