Example Notation for Deep Learning
Ian Goodfellow
Yoshua Bengio
Aaron Courville
Contents
Notation
1 Commentary
1.1 Examples
Bibliography
Index
Notation
This section provides a concise reference describing notation used throughout this
document. If you are unfamiliar with any of the corresponding mathematical
concepts, Goodfellow et al. (2016) describe most of these ideas in chapters 2–4.
Numbers and Arrays
a A scalar (integer or real)
a A vector
A A matrix
A A tensor
I_n Identity matrix with n rows and n columns
I Identity matrix with dimensionality implied by context
e^{(i)} Standard basis vector [0, . . . , 0, 1, 0, . . . , 0] with a 1 at position i
diag(a) A square, diagonal matrix with diagonal entries given by a
a A scalar random variable
a A vector-valued random variable
A A matrix-valued random variable
Sets and Graphs
A A set
R The set of real numbers
{0, 1} The set containing 0 and 1
{0, 1, . . . , n} The set of all integers between 0 and n
[a, b] The real interval including a and b
(a, b] The real interval excluding a but including b
A \ B Set subtraction, i.e., the set containing the elements of A that are not in B
G A graph
Pa_G(x_i) The parents of x_i in G
Indexing
a_i Element i of vector a, with indexing starting at 1
a_{−i} All elements of vector a except for element i
A_{i,j} Element i, j of matrix A
A_{i,:} Row i of matrix A
A_{:,i} Column i of matrix A
A_{i,j,k} Element (i, j, k) of a 3-D tensor A
A_{:,:,i} 2-D slice of a 3-D tensor
a_i Element i of the random vector a
Linear Algebra Operations
A^⊤ Transpose of matrix A
A^+ Moore-Penrose pseudoinverse of A
A ⊙ B Element-wise (Hadamard) product of A and B
det(A) Determinant of A
Calculus
dy/dx Derivative of y with respect to x
∂y/∂x Partial derivative of y with respect to x
∇_x y Gradient of y with respect to x
∇_X y Matrix derivatives of y with respect to X
∇_X y Tensor containing derivatives of y with respect to X
∂f/∂x Jacobian matrix J ∈ R^{m×n} of f : R^n → R^m
∇²_x f(x) or H(f)(x) The Hessian matrix of f at input point x
∫ f(x) dx Definite integral over the entire domain of x
∫_S f(x) dx Definite integral with respect to x over the set S
Probability and Information Theory
a⊥b The random variables a and b are independent
a⊥b | c They are conditionally independent given c
P (a) A probability distribution over a discrete variable
p(a) A probability distribution over a continuous variable, or over a variable whose type has not been specified
a∼P Random variable a has distribution P
E_{x∼P}[f(x)] or E f(x) Expectation of f(x) with respect to P(x)
Var(f (x)) Variance of f (x) under P (x)
Cov(f (x), g(x)) Covariance of f (x) and g(x) under P (x)
H(x) Shannon entropy of the random variable x
D_KL(P ‖ Q) Kullback-Leibler divergence of P and Q
N(x; µ, Σ) Gaussian distribution over x with mean µ and covariance Σ
Functions
f :A→B The function f with domain A and range B
f ◦g Composition of the functions f and g
f(x; θ) A function of x parametrized by θ. (Sometimes we write f(x) and omit the argument θ to lighten notation)
log x Natural logarithm of x
σ(x) Logistic sigmoid, 1 / (1 + exp(−x))
ζ(x) Softplus, log(1 + exp(x))
||x||_p L^p norm of x
||x|| L^2 norm of x
x+ Positive part of x, i.e., max(0, x)
1_condition is 1 if the condition is true, 0 otherwise
Sometimes we use a function f whose argument is a scalar but apply it to a
vector, matrix, or tensor: f (x), f (X), or f (X). This denotes the application of f
to the array element-wise. For example, if C = σ(X), then Ci,j,k = σ(Xi,j,k ) for all
valid values of i, j and k.
Datasets and Distributions
p_data The data generating distribution
p̂_data The empirical distribution defined by the training set
X A set of training examples
x^{(i)} The i-th example (input) from a dataset
y^{(i)} or y^{(i)} The target associated with x^{(i)} for supervised learning
X The m × n matrix with input example x^{(i)} in row X_{i,:}
Chapter 1
Commentary
This document is an example of how to use the accompanying files, together with some
commentary on them. The files are math_commands.tex and notation.tex. The
file math_commands.tex includes several useful LaTeX macros, and notation.tex
defines a notation page that could be used at the front of any publication.
We developed these files while writing Goodfellow et al. (2016). We release
these files for anyone to use freely, in order to help establish some standard notation
in the deep learning community.
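As a rough sketch of how the two files might be wired into a document (the package
list below is an assumption; check the top of math_commands.tex for what it actually
requires):

\documentclass{book}
\usepackage{amsmath,amssymb}    % assumed math packages
\usepackage{makeidx}\makeindex  % only if you want an index
\input{math_commands.tex}       % macro definitions used throughout

\begin{document}
\input{notation.tex}            % typesets the notation pages up front
% ... chapters go here ...
\printindex                     % only if you used \index entries
\end{document}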
1.1 Examples
We include this section as an example of some LaTeX commands and the macros
we created for the book.
Citations that support a sentence without actually being used in the sentence
should appear at the end of the sentence using citep:
Inventors have long dreamed of creating machines that think. This
desire dates back to at least the time of ancient Greece. The mythical
figures Pygmalion, Daedalus, and Hephaestus may all be interpreted
as legendary inventors, and Galatea, Talos, and Pandora may all be
regarded as artificial life (Ovid and Martin, 2004; Sparkes, 1996; Tandy,
1997).
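In LaTeX source, the final sentence of that passage might look roughly like this
(the bibliography keys here are hypothetical):

... and Galatea, Talos, and Pandora may all be regarded as artificial
life \citep{ovid2004, sparkes1996, tandy1997}.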
When the authors of a document, or the document itself, serve as a noun in the
sentence, use the citet command:
Mitchell (1997) provides a succinct definition of machine learning: “A
computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P , if its performance
at tasks in T , as measured by P , improves with experience E.”
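A sketch of the corresponding source (again with a hypothetical bibliography key):

\citet{mitchell1997} provides a succinct definition of machine learning:
``A computer program is said to learn from experience $E$ with respect to
some class of tasks $T$ and performance measure $P$, if its performance at
tasks in $T$, as measured by $P$, improves with experience $E$.''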
When introducing a new term, use the newterm macro to highlight it. If
there is a corresponding acronym, put the acronym in parentheses afterward. If
your document includes an index, also use the index command.
Today, artificial intelligence (AI) is a thriving field with many prac-
tical applications and active research topics.
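One plausible way to write that sentence, assuming newterm takes the term as its
only argument and pairing it with the standard \index command:

Today, \newterm{artificial intelligence}\index{Artificial intelligence} (AI)
is a thriving field with many practical applications and active research
topics.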
Sometimes you may want to make many entries in the index that all point to a
canonical index entry:
One of the simplest and most common kinds of parameter norm penalty
is the squared L2 parameter norm penalty commonly known as weight
decay. In other academic communities, L2 regularization is also known
as ridge regression or Tikhonov regularization.
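With the standard \index command, those cross-references could be written along
these lines (the |see{...} syntax is ordinary makeidx usage, not something defined
by these files):

... is the squared $L^2$ parameter norm penalty commonly known as
\newterm{weight decay}\index{Weight decay}. In other academic communities,
$L^2$ regularization is also known as \newterm{ridge regression}%
\index{Ridge regression|see{weight decay}} or \newterm{Tikhonov
regularization}\index{Tikhonov regularization|see{weight decay}}.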
To refer to a figure, use either figref or Figref depending on whether you
want to capitalize the resulting word in the sentence.
See figure 1.1 for an example of how to include graphics in your
document. Figure 1.1 shows how to include graphics in your document.
Similarly, you can refer to different sections of the book using partref, Partref,
secref, Secref, etc.
You are currently reading section 1.1.
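The sentences above might be produced with something like the following, assuming
each macro takes a label as its argument (the label names here are hypothetical):

See \figref{dl_venn} for an example of how to include graphics in your
document. \Figref{dl_venn} shows how to include graphics in your document.

You are currently reading \secref{examples}.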
Acknowledgments
We thank Catherine Olsson and Úlfar Erlingsson for proofreading and review of
this manuscript.
[Figure 1.1: a diagram of the categories AI, Machine learning, Representation learning, and Deep learning, annotated with examples: Knowledge bases, Logistic regression, Shallow autoencoders, and MLPs.]
Figure 1.1: An example of a figure. The figure is a PDF displayed without being rescaled
within LaTeX. The PDF was created at the right size to fit on the page, with the fonts at
the size they should be displayed. The fonts in the figure are from the Computer Modern
family so they match the fonts used by LaTeX.
Bibliography
Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.
Mitchell, T. M. (1997). Machine Learning. McGraw-Hill, New York.
Ovid and Martin, C. (2004). Metamorphoses. W.W. Norton.
Sparkes, B. (1996). The Red and the Black: Studies in Greek Pottery. Routledge.
Tandy, D. W. (1997). Works and Days: A Translation and Commentary for the Social
Sciences. University of California Press.
Index
Artificial intelligence, 2
Conditional independence, iv
Covariance, iv
Derivative, iv
Determinant, iii
Element-wise product, see Hadamard product
Graph, iii
Hadamard product, iii
Hessian matrix, iv
Independence, iv
Integral, iv
Jacobian matrix, iv
Kullback-Leibler divergence, iv
Matrix, ii, iii
Norm, v
Ridge regression, see weight decay
Scalar, ii, iii
Set, iii
Shannon entropy, iv
Sigmoid, v
Softplus, v
Tensor, ii, iii
Tikhonov regularization, see weight decay
Transpose, iii
Variance, iv
Vector, ii, iii
Weight decay, 2