Notation
{ · }    set; event (in probability)
| · |    absolute value of a number, or cardinality (number of elements) of a set, or determinant of a matrix
‖ · ‖^2    square of the norm; sum of the squared components of a vector
⌊ · ⌋    floor; largest integer which is not larger than the argument
[a, b]    the interval of real numbers from a to b
⟦ · ⟧    evaluates to 1 if argument is true, and to 0 if it is false
∇    gradient operator, e.g., ∇E_in (gradient of E_in(w) with respect to w)
( · )^-1    inverse
( · )^†    pseudo-inverse
( · )^T    transpose (columns become rows and vice versa)
(N k)    number of ways to choose k objects from N distinct objects (equals N!/((N-k)! k!), where `!' is the factorial)
A \ B    the set A with the elements from set B removed
0    zero vector; a column vector whose components are all zeros
{1} × R^d    d-dimensional Euclidean space with an added `zeroth coordinate' fixed to 1
ε    tolerance in approximating a target
δ    bound on the probability of exceeding ε (the approximation tolerance)
η    learning rate (step size in iterative learning, e.g., in stochastic gradient descent)
λ    regularization parameter
λ_C    regularization parameter corresponding to weight budget C
Ω    penalty for model complexity; either a bound on generalization error, or a regularization term
θ    logistic function θ(s) = e^s/(1 + e^s)
Φ    feature transform, z = Φ(x)
Φ_Q    Qth-order polynomial transform
φ    a coordinate in the feature transform Φ, z_i = φ_i(x)
μ    probability of a binary outcome
ν    fraction of a binary outcome in a sample
σ^2    variance of the noise
A    learning algorithm
argmin_a(·)    the value of a at which the minimum of the argument is achieved
B    an event (in probability), usually a `bad' event
b    the bias term in a linear combination of inputs, also called w_0
B(N, k)    maximum number of dichotomies on N points with a break point k
bias    the bias term in the bias-variance decomposition
C    bound on the size of weights in the soft order constraint
d    dimensionality of the input space X = R^d or X = {1} × R^d
d̃    dimensionality of the transformed space Z
d_vc, d_vc(H)    VC dimension of hypothesis set H
D    data set D = (x_1, y_1), . . . , (x_N, y_N); technically not a set, but a vector of elements (x_n, y_n). D is often the training set, but sometimes split into training and validation/test sets.
D_train    subset of D used for training when a validation or test set is used.
D_val    validation set; subset of D used for validation.
e(h(x), f(x))    pointwise version of E(h, f), e.g., (h(x) - f(x))^2
E(h, f)    error measure between hypothesis h and target function f
e^x    exponent of x in the natural base e = 2.71828...
e_n    leave-one-out error on example n when this nth example is excluded in training [cross validation]
E[·]    expected value of the argument
E_x[·]    expected value with respect to x
E[y|x]    expected value of y given x
E_aug    augmented error (in-sample error plus regularization term)
E_in, E_in(h)    in-sample error (training error) for hypothesis h
E_cv    cross validation error
E_out, E_out(h)    out-of-sample error for hypothesis h
E_out^D    out-of-sample error when D is used for training
Ē_out    expected out-of-sample error
E_val    validation error
E_test    test error
f    target function, f : X → Y
g    final hypothesis g ∈ H selected by the learning algorithm; g : X → Y
g^(D)    final hypothesis when the training set is D
ḡ    average final hypothesis [bias-variance analysis]
g^-    final hypothesis when trained using D minus some points
g    gradient, e.g., g = ∇E_in
h    a hypothesis h ∈ H; h : X → Y
h̃    a hypothesis in the transformed space
H    hypothesis set
H̃    hypothesis set that corresponds to perceptrons in the transformed space
H(C)    restricted hypothesis set by weight budget C [soft order constraint]
H(x_1, . . . , x_N)    dichotomies (patterns of ±1) generated by H on the points x_1, . . . , x_N
H    the hat matrix [linear regression]
I    identity matrix; square matrix whose diagonal elements are 1 and off-diagonal elements are 0
K    size of validation set
L_q    qth-order Legendre polynomial
ln    logarithm in base e
log_2    logarithm in base 2
M    number of hypotheses
m_H(N)    the growth function; maximum number of dichotomies generated by H on any N points
max(·, ·)    maximum of the two arguments
N    number of examples (size of D)
o(·)    absolute value of this term is asymptotically negligible compared to the argument
O(·)    absolute value of this term is asymptotically smaller than a constant multiple of the argument
P(x)    (marginal) probability or probability density of x
P(y | x)    conditional probability or probability density of y given x
P(x, y)    joint probability or probability density of x and y
P[·]    probability of an event
Q    order of polynomial transform
Q_f    complexity of f (order of polynomial defining f)
R    the set of real numbers
R^d    d-dimensional Euclidean space
s    signal s = w^T x = Σ_i w_i x_i (i goes from 0 to d or 1 to d depending on whether x has the x_0 = 1 coordinate or not)
sign(·)    sign function, returning +1 for positive and -1 for negative
sup_a(·)    supremum; smallest value that is ≥ the argument for all a
T    number of iterations, number of epochs
t    iteration number or epoch number
tanh(·)    hyperbolic tangent function; tanh(s) = (e^s - e^(-s))/(e^s + e^(-s))
trace(·)    trace of a square matrix (sum of its diagonal elements)
V    number of subsets in V-fold cross validation (V × K = N)
v    direction in gradient descent (not necessarily a unit vector)
v̂    unit vector version of v [gradient descent]
var    the variance term in the bias-variance decomposition
w    weight vector (column vector)
w̃    weight vector in the transformed space
ŵ    selected weight vector [pocket algorithm]
w*    weight vector that separates the data
w_lin    solution weight vector to linear regression
w_reg    regularized solution to linear regression with weight decay
w_PLA    solution weight vector of the perceptron learning algorithm
w_0    added coordinate in weight vector w to represent the bias b
x    the input x ∈ X. Often a column vector x ∈ R^d or x ∈ {1} × R^d; x is used if the input is a scalar.
x_0    added coordinate to x, fixed at x_0 = 1 to absorb the bias term in linear expressions
X    input space whose elements are x ∈ X
X    matrix whose rows are the data inputs x_n [linear regression]
XOR    exclusive OR function (returns 1 if the number of 1's in its input is odd)
y    the output y ∈ Y
y    column vector whose components are the data set outputs y_n [linear regression]
ŷ    estimate of y [linear regression]
Y    output space whose elements are y ∈ Y
Z    transformed input space whose elements are z = Φ(x)
Z    matrix whose rows are the transformed inputs z_n = Φ(x_n) [linear regression]
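As a minimal sketch (not part of the text), the snippet below writes out a few of the quantities defined above in NumPy: the logistic function θ(s), the signal s = w^T x, and the in-sample error E_in as an average of pointwise errors e(h(x_n), y_n). The function names, the toy weight vector, and the toy data are assumptions made only for this illustration.

import numpy as np

# Illustrative sketch: a few quantities from the notation list, written in NumPy.
# All names and data here are hypothetical examples, not definitions from the text.

def theta(s):
    """Logistic function theta(s) = e^s / (1 + e^s)."""
    return np.exp(s) / (1.0 + np.exp(s))

def signal(w, x):
    """Signal s = w^T x, where x includes the added coordinate x_0 = 1."""
    return np.dot(w, x)

def E_in(h, X, y):
    """In-sample error: average of pointwise errors e(h(x_n), y_n) = (h(x_n) - y_n)^2."""
    return np.mean([(h(x_n) - y_n) ** 2 for x_n, y_n in zip(X, y)])

# Toy usage: a linear hypothesis h(x) = sign(w^T x) on inputs in {1} x R^2.
w = np.array([-0.5, 1.0, 2.0])       # w_0 = -0.5 absorbs the bias b
X = np.array([[1.0, 0.3, -0.2],      # each row is (1, x_1, x_2)
              [1.0, -1.0, 0.5]])
y = np.array([1.0, -1.0])

h = lambda x: np.sign(signal(w, x))  # sign of the signal
print("theta(0) =", theta(0.0))      # 0.5
print("tanh(0)  =", np.tanh(0.0))    # 0.0
print("E_in     =", E_in(h, X, y))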