Basic
I. One Node
II. Four Nodes
III. One Hidden Layer
IV. Three Inputs
V. Seven Layers
Advanced
1. Mixture of Experts (MoE)
2. Recurrent Neural Network (RNN)
3. Mamba
4. Matrix Multiplication
5. LLM Sampling
6. MLP in PyTorch
7. Backpropagation
8. Transformer
9. Batch Normalization
10. Generative Adversarial Network (GAN)
11. Self Attention
12. Dropout
13. Autoencoder
14. Vector Database
15. CLIP
16. Residual Network (ResNet)
17. Graph Convolution Network (GCN)
18. SORA’s Diffusion Transformer (DiT)
19. Gemini 1.5's Switch Transformer
20. Reinforcement Learning with Human Feedback (RLHF)
© 2024 Tom Yeh
Each exercise page includes a link to my original LinkedIn post (with animation and explanation) and the date it was originally posted.
I. One Node (406)
Originally posted 12.5.23
II. Four Nodes (148)
Originally posted 12.6.23
III. One Hidden Layer (82)
Originally posted 12.7.23
IV. Three Inputs (105)
Originally posted 12.13.23
V. Seven Layers (197)
Originally posted 12.13.23
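The Basic exercises build up from a single neuron to a small multi-layer network, computed entirely by hand. As a minimal sketch of what Exercise I asks you to do, here is a single node (weights, bias, ReLU) in plain Python; the weights and inputs below are illustrative, not the worksheet's values.

# One node: y = ReLU(w . x + b)
# Illustrative values only; the worksheet supplies its own weights and input.
def relu(z):
    return max(0.0, z)

def one_node(x, w, b):
    return relu(sum(wi * xi for wi, xi in zip(w, x)) + b)

print(one_node(x=[2, 1], w=[1, -1], b=1))   # ReLU(2 - 1 + 1) = 2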
1. Mixture of Experts (683)
[Worksheet: inputs X1, X2 are scored by a gate network; a max over the gate values selects which expert (Expert 1 or Expert 2, each a small ReLU network) computes the outputs Y1, Y2.]
Originally posted 12.15.23
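A minimal sketch of the same idea in PyTorch, assuming hard (top-1) gating and two small ReLU experts; the sizes and weights are illustrative, not the worksheet's.

import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    # Hard-gated mixture of experts: the gate picks one expert per input.
    def __init__(self, d_in=2, d_out=2, n_experts=2):
        super().__init__()
        self.gate = nn.Linear(d_in, n_experts)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU()) for _ in range(n_experts)]
        )

    def forward(self, x):
        expert_id = self.gate(x).argmax(dim=-1)           # max over gate values
        out = torch.stack([e(x) for e in self.experts])   # (n_experts, batch, d_out)
        return out[expert_id, torch.arange(x.shape[0])]   # pick the chosen expert's output

x = torch.tensor([[2.0, 3.0], [1.0, 1.0]])
print(TinyMoE()(x))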
2. Recurrent Neural Network (RNN) (406)
[Worksheet: process the input sequence X = 3, 4, 5, 6 with parameters A, B, C, activation function ɸ = ReLU, and initial hidden state H0 = 0, producing the output sequence Y step by step.]
Originally posted 12.18.23
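A minimal sketch of the same recurrence in NumPy, assuming the standard form h_t = ReLU(A h_{t-1} + B x_t) and y_t = C h_t; the matrices below are illustrative, not the worksheet's parameters.

import numpy as np

def relu(z):
    return np.maximum(z, 0)

# Illustrative parameters; the worksheet supplies its own A, B, C.
A = np.array([[1, -1], [0, 1]])   # hidden-to-hidden
B = np.array([[1], [1]])          # input-to-hidden
C = np.array([[1, 2]])            # hidden-to-output

h = np.zeros((2, 1))              # H0
for x in [3, 4, 5, 6]:
    h = relu(A @ h + B * x)       # ɸ = ReLU
    y = C @ h
    print(float(y))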
3. Mamba's S6 Model (263)
[Worksheet: run the input sequence 3, 4, 5, 6 through a selective structured state-space (S6) scan, using the given parameter matrices to update the state at each step and produce the output sequence.]
Originally posted 12.19.23
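As a rough sketch of the recurrence at the heart of the exercise, here is a plain state-space scan h_t = A h_{t-1} + B x_t, y_t = C h_t in NumPy. This deliberately omits what makes S6 selective (input-dependent B, C, and step size), and all values are illustrative.

import numpy as np

# Illustrative state-space parameters (not the worksheet's).
A = np.array([[1, 0], [0, -1]])
B = np.array([[1], [0]])
C = np.array([[1, -1]])

h = np.zeros((2, 1))
for x in [3, 4, 5, 6]:
    h = A @ h + B * x      # state update (one scan step)
    y = C @ h              # readout
    print(float(y))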
4. Matrix Multiplication (127)
[Worksheet: multiply two small integer matrices by hand and fill in the result.]
Originally posted 1.5.24
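The residue of this page suggests a 2x2 matrix times a 2x3 matrix, but the exact grouping is unclear, so this NumPy check uses assumed matrices purely to illustrate the row-times-column rule.

import numpy as np

# Assumed matrices for illustration; the worksheet defines its own.
A = np.array([[1, 1],
              [-1, 1]])
B = np.array([[1, 5, 2],
              [2, 4, 2]])
print(A @ B)   # each entry is a row of A dotted with a column of B: [[3 9 4] [1 -1 0]]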
5. How does an LLM sample a sentence? (1123)
[Worksheet: given the input embeddings, the LLM outputs a probability distribution over a ten-word vocabulary (I, you, they, are, am, how, why, where, who, what) at each step; a random number is drawn per step and used to pick the next word from that distribution, filling in the blanks of the sentence.]
Originally posted 1.6.24
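A minimal sketch of sampling one word from a distribution with a uniform random number, the way the worksheet does it by hand: walk down the cumulative probabilities until the random number is exceeded. The distribution and random number below are illustrative.

import itertools

vocab = ["I", "you", "they", "are", "am", "how", "why", "where", "who", "what"]
probs = [.01, .01, .01, .01, .01, .50, .10, .10, .15, .10]   # illustrative distribution
r = 0.92                                                     # illustrative random number

def sample(vocab, probs, r):
    # Return the first word whose cumulative probability exceeds r.
    for word, cum in zip(vocab, itertools.accumulate(probs)):
        if r < cum:
            return word
    return vocab[-1]

print(sample(vocab, probs, r))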
6. Multi-Layer Perceptron in PyTorch (337)
[Worksheet: trace an input through three Linear layers with ReLU, ReLU, and Sigmoid activations by hand, then fill in the blanks of the matching nn.Sequential definition.]

mlp_model = nn.Sequential(
    nn._______( ___, ___, bias = ___ ),
    nn._______(),
    nn._______( ___, ___, bias = ___ ),
    nn._______(),
    nn._______( ___, ___, bias = ___ ),
    nn._______()
)

Hints:
Linear Layer: { Identity | Linear | Bilinear }
Activation Function: { ReLU | Tanh | Sigmoid }
in_features: { int }
out_features: { int }
bias: { T | F }
Originally posted 1.8.24
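One possible completion, with layer sizes assumed rather than taken from the hand-drawn weights (the actual in_features and out_features are whatever the worksheet's matrices imply); a sketch, not the answer key.

import torch
import torch.nn as nn

# Assumed sizes for illustration: 4 inputs -> 4 -> 2 -> 1 output.
mlp_model = nn.Sequential(
    nn.Linear(4, 4, bias=True),
    nn.ReLU(),
    nn.Linear(4, 2, bias=True),
    nn.ReLU(),
    nn.Linear(2, 1, bias=True),
    nn.Sigmoid(),
)

x = torch.tensor([[2.0, 1.0, 3.0, 1.0]])   # the worksheet's input column as a row vector
print(mlp_model(x))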
7. Backpropagation (1260)
[Worksheet: forward the input X = 2, 1, 3, 1 through Layer 1 (Linear + ReLU), Layer 2 (Linear + ReLU), and Layer 3 (Linear + Softmax), compare YPred with YTarget under the cross-entropy loss L, and backpropagate the gradients by hand.]
Originally posted 1.9.24
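A minimal sketch of the same computation with autograd, useful as a check on hand-derived gradients; the weights and target below are illustrative, not the worksheet's.

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(4, 4), nn.ReLU(),
    nn.Linear(4, 2), nn.ReLU(),
    nn.Linear(2, 2),              # logits; CrossEntropyLoss applies softmax internally
)
x = torch.tensor([[2.0, 1.0, 3.0, 1.0]])
target = torch.tensor([1])        # illustrative target class

loss = nn.CrossEntropyLoss()(model(x), target)
loss.backward()                   # backpropagation
print(model[0].weight.grad)       # gradient with respect to Layer 1 weights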
8. Transformer (1281)
[Worksheet: take features X1..X5 from the previous block, compute Q and K, form the attention weight matrix (A), apply it to get the attention-weighted features Z1..Z5, then push them through the position-wise feed-forward network (FFN, Linear + ReLU + Linear) and on to the next block.]
Originally posted 1.11.24
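A minimal sketch of the block's two stages in PyTorch, assuming single-head attention and a two-layer ReLU FFN; residual connections and layer norm are left out because the worksheet focuses on the attention-then-FFN flow, and all sizes are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

d = 4                                       # assumed feature size
wq, wk, wv = nn.Linear(d, d), nn.Linear(d, d), nn.Linear(d, d)
ffn = nn.Sequential(nn.Linear(d, 8), nn.ReLU(), nn.Linear(8, d))

x = torch.randn(5, d)                       # features X1..X5 from the previous block
q, k, v = wq(x), wk(x), wv(x)
a = F.softmax(q @ k.T / d ** 0.5, dim=-1)   # attention weight matrix (A)
z = a @ v                                   # attention-weighted features Z1..Z5
out = ffn(z)                                # position-wise FFN, then on to the next block
print(out.shape)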
9. Batch Normalization (504)
[Worksheet: push a mini-batch X1..X4 through a linear layer and ReLU, compute the batch statistics per feature (sum Σ, mean µ, variance σ², standard deviation σ), normalize by subtracting µ and dividing by σ, then scale and shift with the trainable parameters before the next layer.]
Originally posted 1.14.24
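A minimal sketch of the normalize-then-scale-and-shift arithmetic in NumPy; the mini-batch and the trainable gamma/beta values are illustrative.

import numpy as np

h = np.array([[1., 0., 3., 0.],     # illustrative post-ReLU activations,
              [0., 3., 1., 1.],     # one row per feature, one column per example
              [2., 1., 0., 2.]])
gamma = np.array([1., 2., 1.])      # trainable scale
beta  = np.array([0., 0., 1.])      # trainable shift

mu  = h.mean(axis=1, keepdims=True)              # mean µ per feature across the batch
var = h.var(axis=1, keepdims=True)               # variance σ²
h_norm = (h - mu) / np.sqrt(var + 1e-5)          # normalize
out = gamma[:, None] * h_norm + beta[:, None]    # scale & shift
print(out)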
10. Generative Adversarial Network (GAN) (1004)
[Worksheet: the generator maps noise vectors N1..N4 through Linear + ReLU layers into fake samples F1..F4; the discriminator scores the fakes alongside real samples X1..X4 and ends in a sigmoid, producing predictions Y. To train the discriminator, the targets YD are 0 for fakes and 1 for reals; to train the generator, the targets YG are 1 for its fakes. The loss gradients ∂L_D/∂Z and ∂L_G/∂Z are then computed by hand.]
Originally posted 1.15.24
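A minimal sketch of the two target assignments in PyTorch, assuming tiny MLPs for both networks; architectures and sizes are illustrative, and optimizer steps and zero_grad are omitted.

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 4))                  # generator
D = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 1), nn.Sigmoid())    # discriminator
bce = nn.BCELoss()

noise = torch.randn(4, 3)
real  = torch.randn(4, 4)
fake  = G(noise)

# Train the discriminator: fakes -> 0, reals -> 1.
d_loss = bce(D(fake.detach()), torch.zeros(4, 1)) + bce(D(real), torch.ones(4, 1))
d_loss.backward()

# Train the generator: it wants its fakes scored as 1.
g_loss = bce(D(fake), torch.ones(4, 1))
g_loss.backward()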
11. Self Attention (1376)
[Worksheet: project the features x1..x4 with WQ, WK, WV into queries, keys, and values; matrix-multiply (KᵀQ), scale, and apply softmax (exponentiate, then divide by the column sum) to get the attention weight matrix (A); matrix-multiply again with V to get the attention-weighted features z1..z4, which feed the FFN.]
Originally posted 1.16.24
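A minimal sketch of those steps in NumPy, keeping the worksheet's column-per-token layout and its KᵀQ orientation (softmax over each column); the projection matrices and features are illustrative.

import numpy as np

def softmax_cols(s):
    e = np.exp(s - s.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

d, n, dk = 3, 4, 2                                    # assumed sizes
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(d, n)).astype(float)     # features x1..x4 as columns
WQ, WK, WV = (rng.integers(-1, 2, size=(dk, d)).astype(float) for _ in range(3))

Q, K, V = WQ @ X, WK @ X, WV @ X
A = softmax_cols(K.T @ Q / np.sqrt(dk))   # attention weight matrix (keys x queries)
Z = V @ A                                 # attention-weighted features z1..z4
print(Z)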
12. Dropout (557)
[Worksheet: run the training inputs X1, X2 and unseen inputs through Linear + ReLU layers; during training, a random sequence decides which activations are zeroed by Dropout (p=0.5, then p=0.33), while at inference dropout is disabled; compare the outputs Y against the targets Y' with the MSE loss and compute the training gradient ∂L/∂Y, which for squared error is 2(Y − Y').]
Originally posted 1.19.24
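A minimal sketch of the train-versus-inference difference in PyTorch; the layer sizes are assumptions, and the point is that model.train() applies the random mask (PyTorch also rescales survivors by 1/(1-p)) while model.eval() passes activations through unchanged.

import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Dropout(p=0.5))
x = torch.tensor([[3.0, 5.0]])

layer.train()
print(layer(x))   # roughly half the activations zeroed, survivors scaled by 1/(1-p)

layer.eval()
print(layer(x))   # dropout disabled at inference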
13. Autoencoder (849)
[Worksheet: encode the inputs X1..X4 with Linear + ReLU layers down to a low-dimensional bottleneck, decode back up with Linear + ReLU layers to outputs Y, compare against the targets Y' (the inputs themselves) with the reconstruction MSE loss, and compute the gradients ∂L/∂Y.]
Originally posted 1.22.24
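A minimal sketch in PyTorch with assumed sizes; the targets are the inputs themselves, which is what makes the loss a reconstruction loss.

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2), nn.ReLU())   # bottleneck of 2
decoder = nn.Sequential(nn.Linear(2, 3), nn.ReLU(), nn.Linear(3, 4))

x = torch.tensor([[1.0, 2.0, 3.0, 1.0]])
y = decoder(encoder(x))
loss = nn.MSELoss()(y, x)     # reconstruction loss: the target is the input
loss.backward()
print(loss.item())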
14. Vector Database (2224)
[Worksheet: embed the stored texts ("how are you", "who are you", "who am I", "am I you") and the query by looking up word embeddings, encoding with Linear + ReLU, mean pooling, and projecting; index the resulting vectors into storage, then retrieve by taking dot products between the query vector and the stored vectors and returning the nearest neighbor (argmax).]
Originally posted 2.1.24
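A minimal sketch of the index-then-retrieve step in NumPy, assuming the embedding function is just a mean of random per-word vectors; a real system would use a learned encoder and projection as the worksheet does.

import numpy as np

rng = np.random.default_rng(0)
vocab = {w: rng.normal(size=4) for w in "how are you who am I".split()}

def embed(text):
    # Mean-pool the word vectors (stand-in for the worksheet's encoder + projection).
    return np.mean([vocab[w] for w in text.split()], axis=0)

storage = ["how are you", "who are you", "who am I", "am I you"]
index = np.stack([embed(t) for t in storage])    # indexing: vector storage

query = embed("who are you")
scores = index @ query                           # retrieval: dot products
print(storage[int(np.argmax(scores))])           # nearest neighbor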
15. CLIP (885)
[Worksheet: a mini-batch of text-image pairs (e.g. "big table", "mini chair", "top hat"; CLIP itself trains on 400 million such pairs and more). The text encoder embeds the words (word2vec), mean-pools, and projects to T1, T2, T3; the image encoder flattens the patches, embeds, mean-pools, and projects, all into a shared embedding space. The image-to-text and text-to-image similarity matrices go through a softmax and are compared against identity targets with the cross-entropy loss, whose gradients are computed by hand.]
Originally posted 2.10.24
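A minimal sketch of the symmetric contrastive objective in PyTorch, with random stand-ins for the two encoders' outputs; the batch size, embedding size, and temperature are assumptions.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d = 3, 4                                           # 3 text-image pairs, embedding size 4
text_emb  = F.normalize(torch.randn(n, d), dim=-1)    # stand-in for the text encoder
image_emb = F.normalize(torch.randn(n, d), dim=-1)    # stand-in for the image encoder

logits = image_emb @ text_emb.T / 0.07                # similarity matrix (image -> text)
targets = torch.arange(n)                             # matching pairs lie on the diagonal
loss = (F.cross_entropy(logits, targets) +            # image -> text
        F.cross_entropy(logits.T, targets)) / 2       # text -> image
print(loss.item())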
16. Residual Network (625)
[Worksheet, part 1: push X through two weight layers with ReLU, then add the skip connection (the input itself) before the final ReLU. Part 2: the same add-and-skip idea inside a Transformer encoder block: input embeddings go through Q/K attention, Add & Norm, a feed-forward layer, Add & Norm, and on to the next block.]
Originally posted 2.15.24
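A minimal sketch of the residual connection itself in PyTorch; the two weight layers are assumed to be plain Linear layers of matching size so the skip addition is shape-compatible.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, d=3):
        super().__init__()
        self.layer1 = nn.Linear(d, d)
        self.layer2 = nn.Linear(d, d)

    def forward(self, x):
        out = self.layer2(torch.relu(self.layer1(x)))
        return torch.relu(out + x)       # add the skip connection, then ReLU

x = torch.tensor([[2.0, 1.0, 0.0]])
print(ResidualBlock()(x))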
17. Graph Convolutional Network (573)
[Worksheet: for a graph over nodes A..E with its adjacency matrix, each layer multiplies the node features by the adjacency matrix (message passing), applies a weight matrix and ReLU, and repeats; the final node features go through a fully connected network to produce the output.]
Originally posted 2.17.24
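A minimal sketch of one message-passing layer in NumPy, using an assumed adjacency matrix with self-loops; the adjacency normalization used in Kipf and Welling's GCN is omitted to stay close to the worksheet's by-hand arithmetic.

import numpy as np

# Assumed graph over nodes A..E, with self-loops on the diagonal.
A = np.array([[1, 1, 0, 0, 1],
              [1, 1, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 1, 1],
              [1, 0, 0, 1, 1]], dtype=float)
X = np.eye(5)                      # illustrative one-hot node features
W = np.random.default_rng(0).integers(-1, 2, size=(5, 3)).astype(float)

H = np.maximum(A @ X @ W, 0)       # messages, then weights, then ReLU
print(H)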
18. SORA's Diffusion Transformer (2118)
[Worksheet: a training video is cut into spacetime patches (pixels) and mapped by the visual encoder to a latent; the prompt ("sora is sky") goes through the text encoder. At diffusion step t = 3, sampled noise is added to the latent, and the noised latent passes through self-attention (Q, K), adaptive layer norm, and a pointwise FFN to predict the noise. The MSE loss between the predicted and sampled noise gives the training gradients; at generation time the predicted noise is subtracted to recover a noise-free latent, which the visual decoder turns back into video.]
Originally posted 2.19.24
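A minimal sketch of the noise-prediction objective in PyTorch, with a single Linear layer standing in for the whole diffusion transformer (attention, adaptive layer norm, and text conditioning are omitted); everything here is an assumption meant only to show the add-noise, predict-noise, MSE pattern.

import torch
import torch.nn as nn

torch.manual_seed(0)
denoiser = nn.Linear(8, 8)                # stand-in for the diffusion transformer

latent = torch.randn(1, 8)                # latent from the visual encoder
noise = torch.randn_like(latent)          # sampled noise at step t
noised = latent + noise                   # noised latent

pred_noise = denoiser(noised)             # the model predicts the added noise
loss = nn.MSELoss()(pred_noise, noise)    # MSE between predicted and sampled noise
loss.backward()
print(loss.item())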
19. Switch Transformer (576)
(Gemini 1.5's Sparse Mixture of Experts)
[Worksheet: features X1..X5 from the previous block go through Q/K attention to produce the attention-weighted features Z1..Z5; the switch computes gate values for experts A, B, and C, an argmax picks one expert ID per position, and only that expert's position-wise FFN processes the token before the next block.]
Originally posted 2.24.24
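A minimal sketch of the top-1 routing step in PyTorch, with three small expert FFNs and assumed sizes; load-balancing losses and capacity limits are omitted.

import torch
import torch.nn as nn

torch.manual_seed(0)
d, n_experts = 4, 3
switch = nn.Linear(d, n_experts)          # gate values for experts A, B, C
experts = nn.ModuleList(
    [nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d)) for _ in range(n_experts)]
)

z = torch.randn(5, d)                              # attention-weighted features Z1..Z5
expert_ids = switch(z).argmax(dim=-1)              # argmax: one expert ID per position
out = torch.stack([experts[int(i)](zi) for i, zi in zip(expert_ids, z)])
print(expert_ids, out.shape)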
20. Reinforcement Learning with Human Feedback (RLHF) (550)
[Worksheet: the LLM continues the prompt "[S] CEO is" over a small vocabulary (him, her, them, is/are, doc, CEO) by sampling the max; human preferences label one continuation the winner and one the loser (e.g. "doc is him" vs. "doc is them"). Each continuation is embedded, mean-pooled, and scored by the reward model (RM); the RM is trained so that σ(winner reward − loser reward) matches the target, and its loss gradient (predicted − target) and reward are then used to align the LLM.]
Originally posted 3.4.24
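A minimal sketch of the reward-model training step in PyTorch, with a Linear layer standing in for the RM over mean-pooled embeddings; the pairwise loss shown is the standard -log σ(r_winner − r_loser), and all sizes and inputs are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
reward_model = nn.Linear(4, 1)        # stand-in RM over pooled embeddings

winner = torch.randn(1, 4)            # pooled embedding of the preferred continuation (illustrative)
loser  = torch.randn(1, 4)            # pooled embedding of the rejected continuation (illustrative)

r_w, r_l = reward_model(winner), reward_model(loser)
loss = -F.logsigmoid(r_w - r_l).mean()    # push the winner's reward above the loser's
loss.backward()
print(loss.item())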