STAT6121-ML
Introduction
Machine learning
Machine learning approaches to data analysis cannot be done without computers
Common with statistical modelling/analysis
• For prediction and classification
• Requires an optimisation procedure
• Obtain parameters or functions from observations
• Uncertainty of learning vs. prediction/classification
New elements or techniques
• Explanation or theoretical constructs are not the emphasis
• Data can be ‘organic’, such as text or images
• Distinction of training vs. validation/test data
• Reliance on ready-made software for implementation
                Some broad remarks
Supervised vs. unsupervised learning
• Target outcome y and covariates/features x?
  NB. log-linear models of contingency tables
• Can the learned result be applied to unseen units?
  NB. principal components, clustering
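A small R illustration of the ‘unseen units’ question (simulated data; predict() has a method for prcomp objects, whereas kmeans returns no prediction rule, so an unseen unit must be assigned by hand):

x = matrix(rnorm(200), 100, 2)
pc = prcomp(x)
predict(pc, newdata=matrix(rnorm(10), 5, 2))   # PC scores for unseen units

km = kmeans(x, centers=3)
x.new = rnorm(2)                               # an unseen unit
which.min(colSums((t(km$centers) - x.new)^2))  # nearest-centroid assignment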
Prediction vs. classification
• Best prediction of y is its expectation µx = E(y | x)
          E{(y − µ)² | x} = (µx − µ)² + E{(y − µx)² | x}
• Best classification of categorical y is
                    y0 = arg max_{y′} Pr(y = y′ | x)
  e.g. if y ∼ N(µ, σ²), then E(y) = µ but Pr(y = µ) = 0;
  however, let z = I(y > µ − σ), then z0 = 1, since Pr(z = 1) = Φ(1) ≈ 0.84 > 0.5
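A quick numerical check of this example in R (µ = 0 and σ = 1 assumed for illustration):

mu = 0; sigma = 1
# Pr(y = mu) = 0 for continuous y, but
# Pr(z = 1) = Pr(y > mu - sigma) = pnorm(1) ~ 0.84 > 0.5, hence z0 = 1
pnorm(mu - sigma, mean=mu, sd=sigma, lower.tail=FALSE)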
                  Some broad remarks
Parametric vs. non-parametric models
• function/model f(x; θ) fixed given θ, i.e. the parameters
              f(x) = E(y | x)   or   f(x) = Pr(y | x)
• parametric if θ contains a fixed number of constants
  NB. linear regression model as a typical example
• non-parametric if the number of unknowns in θ grows with the
  number of observations, or if f is indeterminate in advance
Error vs. residual
• Given f(x) = E(y | x) or y0 = arg max_{y′} Pr(y = y′ | x), the error is
               e = y − f(x)     or     e = I(y ≠ y0)
• Given fˆ or ŷ0 as the estimate of f or y0, the residual is
               ê = y − fˆ(x)    or     ê = I(y ≠ ŷ0)
  if (y, x) are used for obtaining fˆ or ŷ0
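A minimal illustration of the distinction in R, under an assumed linear setup (β0 = 0.5, β1 = 1, σ = 1, as in the exercise code below):

x = runif(100, 1, 10)
f = 0.5 + x                # true f(x), known here only because we simulate
y = f + rnorm(100)
fit = lm(y ~ x)
e = y - f                  # errors: unobservable in practice
e.hat = residuals(fit)     # residuals: computed from the fitted model
c(var(e), var(e.hat))      # residuals tend to be slightly less variable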
              Bias-variance trade-off
Eq. (2.7), mean squared error (MSE) of fˆ(x) for y given x
    E{(y − fˆ(x))²} = E{(y − f(x) + f(x) − fˆ(x))²}
                    = E{(y − f(x))²} + E{(f(x) − fˆ(x))²}
                      − 2 E{(y − f(x))(f(x) − fˆ(x))}
                    = V(e(x)) + V(fˆ(x)) + Bias(fˆ(x))²
over fˆ(x) and y that are independent of each other, so the cross
term vanishes, since E{y − f(x) | x} = 0
NB. V(e(x)) is unaffected by whichever fˆ is used
Q: Can we reduce V(fˆ(x)) and Bias(fˆ(x))² at the same time?
• to reduce V(fˆ(x)), let fˆ(x) be obtained based on many
  observations, e.g. by using parametric f(x; θ)...
• to reduce Bias(fˆ(x))², let fˆ(x) only depend on close-by
  observations, provided f is reasonably smooth...
• hence, the bias-variance trade-off (see the R sketch below)
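A minimal simulation sketch of the trade-off in R, under assumed settings (true f(x) = log(x), σ = 1; the name sim.bv and the two estimators, a global linear fit and a K-nearest average, are illustrative choices, not prescribed here):

sim.bv <- function(R=1000, n=100, x0=5, K=5)
{
  f = function(x) log(x)            # assumed true mean function
  x = seq(1, 10, length=n)
  fhat1 = fhat2 = numeric(R)
  for (r in 1:R) {
    y = f(x) + rnorm(n)
    b = coef(lm(y ~ x))             # global fit: low variance, biased here
    fhat1[r] = b[1] + b[2]*x0
    idx = order(abs(x - x0))[1:K]   # K closest observations to x0
    fhat2[r] = mean(y[idx])         # local average: low bias, higher variance
  }
  c(var.lin=var(fhat1), bias2.lin=(mean(fhat1)-f(x0))^2,
    var.knn=var(fhat2), bias2.knn=(mean(fhat2)-f(x0))^2)
}
sim.bv()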
                        Ch. 3, exercise 4
Answer by ML, e.g. x ∈ (1, 10) and f(x) = β0 + β1 log(x) for (c)
get.dta <- function(n=100, beta=c(0.5,1), nonlnr=FALSE)
{
  # equally spaced x over (1, 10)
  x = seq(1, 10, length=n)
  # true mean function: linear or log-linear in x
  if (nonlnr) { f = beta[1] + beta[2]*log(x) }
  else { f = beta[1] + beta[2]*x }
  # standard normal noise around the true mean
  y = f + rnorm(n, 0, 1)
  x2 = x^2; x3 = x^3
  data.frame(y, x, x2, x3, f)
}
main <- function(n=100, beta=c(0.5,1), nonlnr=FALSE, vis=FALSE)
{
  dta = get.dta(n=n, beta=beta, nonlnr=nonlnr)
  if (nonlnr) { cat("data generated under nonlinear model\n\n") }
  else { cat("data generated under linear model\n\n") }
  cat("fitting simple linear regression:\n")
  print(summary(lm(y ~ x, data=dta)))
  cat("fitting cubic (polynomial) regression:\n")
  print(summary(lm(y ~ x + x2 + x3, data=dta)))
  # scatter plot of the data with the true mean function overlaid
  if (vis) { plot(dta$x, dta$y); lines(dta$x, dta$f) }
}
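For instance (illustrative calls; output not shown):

main(n=100, nonlnr=FALSE)            # data generated under the linear model
main(n=100, nonlnr=TRUE, vis=TRUE)   # the log model of (c), with the plot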
                    Additional exercise
[Figure: scatter plot of y against equally spaced x, with the two fitted lines]
Equally spaced x, fˆ1(x) = β̂x (solid), fˆ2(x) = y (dashed)
• What is V(fˆ(x)) at any given x, for fˆ = fˆ1 or fˆ2?
  What can you say about Bias(fˆ(x))?
Consider KNN predictor given K
• How would you apply the method if x = 5 or 10?
• What about V(fˆ(x)) and Bias(fˆ(x)) in this case?
                      Additional exercise
        β̂ = Σᵢ xᵢyᵢ / Σᵢ xᵢ²    (sums over i = 1, ..., n)
V(fˆ1(x)) = V(β̂x) = x² V(β̂)
          = x² V(yᵢ | xᵢ) Σᵢ xᵢ² / (Σᵢ xᵢ²)² = x² V(yᵢ | xᵢ) / Σᵢ xᵢ²
V̂(yᵢ | xᵢ) = Σᵢ (yᵢ − β̂xᵢ)² / (n − 1)
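In R, with dta as produced by get.dta above (a small sketch; the names beta.hat, v.res and v.f1 are illustrative):

dta = get.dta(n=100)                                       # data as in Ch. 3 ex. 4
beta.hat = sum(dta$x * dta$y) / sum(dta$x^2)               # LS through the origin
v.res = sum((dta$y - beta.hat*dta$x)^2) / (nrow(dta) - 1)  # Vhat(y_i | x_i)
v.f1 = function(x0) x0^2 * v.res / sum(dta$x^2)            # Vhat(fhat1(x0))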
V(fˆ2(x)) = V(y | x) = V(yᵢ | xᵢ)    NB. V̂(yᵢ | xᵢ) non-existent here, with
only one observation at each xᵢ
For the KNN predictor given K, averaging the K nearest neighbours yⱼ(x):
      fˆ(x) = Σⱼ yⱼ(x) / K    (sums over j = 1, ..., K)
V(fˆ(x)) = Σⱼ V(yⱼ(x)) / K² = V(y | x) / K, if V(yⱼ(x)) ≈ V(y | x) close by
V̂(yⱼ(x)) = Σⱼ (yⱼ(x) − fˆ(x))² / (K − 1)    NB. from K obs.
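A corresponding R sketch of the KNN predictor and its variance estimate (the function knn.fit is illustrative):

knn.fit <- function(x0, x, y, K=5)
{
  idx = order(abs(x - x0))[1:K]     # the K nearest neighbours of x0
  f.hat = mean(y[idx])              # fhat(x0)
  v.y = var(y[idx])                 # Vhat(y_j(x0)), divisor K - 1 as above
  c(f.hat=f.hat, v.fhat=v.y/K)      # Vhat(fhat(x0)) = Vhat(y_j(x0))/K
}
knn.fit(5, x=dta$x, y=dta$y)        # e.g. at x0 = 5, with dta from above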
Assume unbiasedness in all the cases...