Kolmogorov–Arnold Networks
Umar Jamil
Downloaded from: https://github.com/hkproj/kan-notes
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0):
https://creativecommons.org/licenses/by-nc/4.0/legalcode
Not for commercial use
Topics
• Review of the Multilayer Perceptron
• Introduction to data fitting
• Bézier Curves
• B-Splines
• Universal Approximation Theorem
• Kolmogorov-Arnold Representation Theorem
• MLPs vs KAN
• Properties
• Multi-layer KANs
• Parameters count: MLPs vs KANs
• Grid extension
• Interpretability
• Continual training

Prerequisites
• Basics of calculus (derivatives)
• Basics of deep learning (backpropagation)
The Multi-layer Perceptron (MLP)
A multilayer perceptron is a neural network made up of multiple layers of neurons, organized in a feed-forward way, with nonlinear activation functions in between.
How does it work?
[Figure: an MLP with an input layer, two hidden layers, and an output layer producing the logits for five classes (Class 1 – Class 5).]
The Linear layer in PyTorch
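A minimal sketch of a linear layer in PyTorch (the layer sizes here are chosen to match the example on the next slide):

import torch
import torch.nn as nn

# A linear layer mapping 3 input features to 5 output features.
linear = nn.Linear(in_features=3, out_features=5)

print(linear.weight.shape)  # torch.Size([5, 3])  -> the weight matrix W
print(linear.bias.shape)    # torch.Size([5])     -> the bias vector b

x = torch.randn(10, 3)      # a batch of 10 items, 3 features each
out = linear(x)             # computes x @ W.T + b
print(out.shape)            # torch.Size([10, 5])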
The Linear layer in detail
A linear layer in an MLP is made of a weight matrix and a bias vector.
The output is computed as O = XW^T + b, where:
• X has shape (10, 3): 10 items, each with 3 features (a_1, a_2, a_3).
• W^T has shape (3, 5), with rows w_1, w_2, w_3 and one column per output neuron (n_1 … n_5).
• XW^T and O both have shape (10, 5).
• The bias vector b, of shape (1, 5), is broadcast to every row of the XW^T table.
For a single item and the first output neuron:
z_1 = r_1 + b_1 = \sum_{i=1}^{3} a_i w_i + b_1
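A short sketch (with random values) checking that nn.Linear computes exactly XW^T + b, with the bias broadcast to every row:

import torch
import torch.nn as nn

linear = nn.Linear(in_features=3, out_features=5)
X = torch.randn(10, 3)                        # (10, 3): 10 items, 3 features each

O_layer = linear(X)                           # (10, 5)
# Same computation by hand: X @ W^T, then the (5,) bias broadcast to every row.
O_manual = X @ linear.weight.T + linear.bias  # (10, 5)

print(torch.allclose(O_layer, O_manual))      # True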
Why do we need activation functions?
After each Linear layer, we usually apply a nonlinear activation function. Why?
O_1 = x W_1^T + b_1
O_2 = O_1 W_2^T + b_2
O_2 = (x W_1^T + b_1) W_2^T + b_2
O_2 = x W_1^T W_2^T + b_1 W_2^T + b_2
As you can see, if we do not apply any activation function, the output is just a linear combination of the inputs, which means that our MLP will not be able to learn any non-linear mapping between the input and the output; and most real-world data is non-linear.
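A quick sketch (with illustrative sizes) showing that two stacked linear layers without an activation in between collapse into a single linear layer:

import torch
import torch.nn as nn

l1 = nn.Linear(3, 4)
l2 = nn.Linear(4, 5)

x = torch.randn(10, 3)
two_layers = l2(l1(x))                        # no activation in between

# Fold the two layers into one: W = W_2 W_1, b = b_1 W_2^T + b_2
W = l2.weight @ l1.weight                     # (5, 3)
b = l1.bias @ l2.weight.T + l2.bias           # (5,)
one_layer = x @ W.T + b

print(torch.allclose(two_layers, one_layer, atol=1e-6))  # True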
Introduction to data fitting
Imagine you’re making a 2D game and you want to animate your sprite (character) so that it passes through a series of points. One way would be to draw a straight line from one point to the next, but that wouldn’t look very good. What if you could create a smoother path, like the one below?
Smooth curves through polynomial curves
How to find the equation of such a smooth curve?
One way is to write the generic equation of a polynomial curve and force it to pass through the series of points to get the coefficients of the equation.
We have 4 points, so we can set up a system of 4 equations, which means we can solve for 4 unknowns: we get a polynomial of degree 3.
y = ax^3 + bx^2 + cx + d
We can write our system of equations as follows and solve to find the equation of the curve:
5 = a(0)^3 + b(0)^2 + c(0) + d
1 = a(1)^3 + b(1)^2 + c(1) + d
3 = a(2)^3 + b(2)^2 + c(2) + d
2 = a(5)^3 + b(5)^2 + c(5) + d
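A small sketch that solves this system with NumPy, using the four points implied by the equations above, (0, 5), (1, 1), (2, 3), (5, 2):

import numpy as np

# Points (x, y) taken from the system above.
xs = np.array([0.0, 1.0, 2.0, 5.0])
ys = np.array([5.0, 1.0, 3.0, 2.0])

# One row per point: [x^3, x^2, x, 1] @ [a, b, c, d] = y
A = np.stack([xs**3, xs**2, xs, np.ones_like(xs)], axis=1)
a, b, c, d = np.linalg.solve(A, ys)

print(a, b, c, d)
# The resulting cubic passes through all four points:
print(np.allclose(A @ np.array([a, b, c, d]), ys))  # True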
What if I have hundreds of points?
If you have N points, you need a polynomial of degree N – 1 if you want the curve to pass through all of them. But as you can see, when we have lots of points, the polynomial starts going crazy at the extremes. We wouldn’t want the character in our 2D game to go off the screen while we’re animating it, right?
Thankfully, someone took the time to solve this problem: we have Bézier curves!
Source: https://arachnoid.com/polysolve/
Bézier curves
A Bézier curve is a parametric curve (which means that all the coordinates of the curve depend on an independent variable t, between 0 and 1).
For example, given two points, we can calculate the linear Bézier curve as the following interpolation:
B(t) = P_0 + t(P_1 − P_0) = (1 − t)P_0 + tP_1
Given three points, we can calculate the quadratic Bézier curve that interpolates them.
Source: Wikipedia
Q_0(t) = (1 − t)P_0 + tP_1
Q_1(t) = (1 − t)P_1 + tP_2
B(t) = (1 − t)Q_0 + tQ_1
     = (1 − t)[(1 − t)P_0 + tP_1] + t[(1 − t)P_1 + tP_2]
     = (1 − t)^2 P_0 + 2(1 − t)t P_1 + t^2 P_2
With four points, we can proceed with a similar reasoning.
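A small sketch (with hypothetical control points) that evaluates the quadratic Bézier curve above:

import numpy as np

def quadratic_bezier(P0, P1, P2, t):
    """Evaluate the quadratic Bézier curve at parameters t in [0, 1]."""
    t = np.asarray(t)[:, None]           # column vector so it broadcasts against the 2D points
    return (1 - t) ** 2 * P0 + 2 * (1 - t) * t * P1 + t ** 2 * P2

# Hypothetical control points.
P0, P1, P2 = np.array([0.0, 0.0]), np.array([1.0, 2.0]), np.array([3.0, 0.0])
print(quadratic_bezier(P0, P1, P2, np.linspace(0.0, 1.0, 5)))
# At t = 0 the curve is at P0, at t = 1 it is at P2.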
Bézier curves: going deeper
Yes, we can go deeper! If we have n + 1 points, we can find the degree-n Bézier curve using the following formula:
B(t) = \sum_{i=0}^{n} \binom{n}{i} (1 − t)^{n−i} t^i P_i = \sum_{i=0}^{n} b_{i,n}(t) P_i
The b_{i,n}(t) are the Bernstein basis polynomials, and \binom{n}{i} = n! / (i! (n − i)!) is the binomial coefficient.
[Figure: the four cubic Bernstein basis polynomials; blue: b_{0,3}(t), green: b_{1,3}(t), red: b_{2,3}(t), cyan: b_{3,3}(t).]
Source: Wikipedia
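A sketch of the general formula in code (math.comb gives the binomial coefficient; the control points are hypothetical):

import math
import numpy as np

def bezier(points, t):
    """Evaluate a degree-n Bézier curve defined by n + 1 control points at parameters t."""
    n = len(points) - 1
    t = np.asarray(t)[:, None]
    curve = np.zeros((len(t), points.shape[1]))
    for i, P in enumerate(points):
        # Bernstein basis polynomial b_{i,n}(t)
        b = math.comb(n, i) * (1 - t) ** (n - i) * t ** i
        curve += b * P
    return curve

# Hypothetical control points for a cubic (n = 3) Bézier curve.
points = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 3.0], [4.0, 0.0]])
print(bezier(points, np.linspace(0.0, 1.0, 5)))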
From Bézier curves to B-Splines
If you have lots of points (say n), you need a Bézier curve of degree n – 1 to approximate them well, but that can be quite expensive to compute.
Someone wise thought: why don’t we stitch together many Bézier curves between all these points, instead of one big Bézier curve that interpolates all of
them?
Source: Wikipedia
B-splines in detail
A k-degree B-Spline curve defined by n control points will consist of n − k Bézier curves.
For example, if we want to use a quadratic Bézier curve and we have 6 points, we need 6 − 2 = 4 Bézier curves.
In this case we have n=6 and k=2
Source: Wolfram Alpha
B-splines in detail
The degree of our B-Spline also tells us what kind of continuity we get where the pieces join.
Source: MIT
Calculating B-splines: algorithm
Source: MIT
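The standard way to evaluate B-Spline basis functions is the Cox–de Boor recursion (I'm assuming this is the algorithm shown in the MIT notes). A minimal Python sketch:

def bspline_basis(i, k, t, knots):
    """Cox-de Boor recursion: value of the basis function N_{i,k} at parameter t."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left, right = 0.0, 0.0
    if knots[i + k] != knots[i]:
        left = (t - knots[i]) / (knots[i + k] - knots[i]) * bspline_basis(i, k - 1, t, knots)
    if knots[i + k + 1] != knots[i + 1]:
        right = (knots[i + k + 1] - t) / (knots[i + k + 1] - knots[i + 1]) * bspline_basis(i + 1, k - 1, t, knots)
    return left + right

# The six quadratic (k = 2) basis functions N_{0,2} ... N_{5,2} on a uniform knot vector.
knots = [0, 1, 2, 3, 4, 5, 6, 7, 8]
print([bspline_basis(i, 2, 2.5, knots) for i in range(6)])
# Inside the valid parameter range, the basis function values sum to 1 at every t.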
B-Splines: basis functions
[Figure: the six quadratic basis functions N_{0,2}, N_{1,2}, N_{2,2}, N_{3,2}, N_{4,2}, N_{5,2}.]
B-splines: local control
Moving a control point only changes the curve locally (in the proximity of the control point), leaving the Bézier pieces far from it unchanged!
Universal Approximation Theorem
We can think of neural networks as function approximators. Usually, we have access to some data points generated by an ideal function to which we do not have access. The goal of training a neural network is to approximate this ideal function.
But how do we know if a neural network is powerful enough to model our ideal function? What can we say about the expressive power of neural networks?
This is what the universal approximation theorem is all about: it is a series of results that characterize the limits of what neural networks can learn.
It has been proven that neural networks with a certain width (number of neurons) and depth (number of layers) can approximate any continuous function when using specific non-linear activation functions, for example the ReLU function. Check Wikipedia for more theoretical results.
I want to emphasize what it means to be a universal approximator: it means that given an ideal function (or a family of functions) that models the training data, the network can learn to approximate it as well as we want, that is, given an error ε, we can always find an approximating function that is within this error of the ideal function.
This is however a theoretical result; it doesn’t tell us how to do it practically. On a practical level, we have many problems:
• Achieving good approximations may take enormous amounts of computational power
• We may need a very large quantity of training data
• Our hardware may not be able to represent certain weights in 32 bits
• Our optimizer may remain stuck in a local minimum
So as you can see, just because a neural network can learn anything doesn’t mean we are able to learn it in practice. But at least we know that the limits are practical.
Kolmogorov-Arnold representation theorem
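For reference, the theorem states that any multivariate continuous function on a bounded domain can be written as a finite sum of compositions of continuous single-variable functions and addition. In the form used by the KAN paper:

f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q \left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right)

where the inner functions \varphi_{q,p} and the outer functions \Phi_q are continuous functions of a single variable.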
Kolmogorov-Arnold Networks
This network can be thought of as two layers applied in sequence:
• The first layer maps the n = 2 input features (x_1, x_2) into 2n + 1 = 5 intermediate features (h_1 … h_5), with one learnable function per edge (φ_{1,1} … φ_{5,1} applied to x_1, and φ_{1,2} … φ_{5,2} applied to x_2).
• The second layer maps the 5 intermediate features into 1 output feature (o_1) through the learnable functions φ_1 … φ_5.
Instead of having learnable weights, we have learnable functions, and each node simply sums the outputs of its incoming learnable functions.
MLP vs KAN
Multi-layer KAN
Layer 1: 2 input features, 5 output features, for a total of 2 × 5 = 10 functions to “learn”.
Layer 2: 5 input features, 1 output feature, for a total of 5 × 1 = 5 functions to “learn”.
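This is not the paper's implementation (which parameterizes each edge function with B-Splines plus a SiLU base term); the following is a simplified sketch of the idea, with each edge function written as a learnable combination of fixed Gaussian bumps on a grid:

import torch
import torch.nn as nn

class SimpleKANLayer(nn.Module):
    """Simplified KAN layer: one learnable 1D function per (input, output) edge,
    parameterized as a linear combination of fixed Gaussian basis functions."""
    def __init__(self, in_features, out_features, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        centers = torch.linspace(grid_range[0], grid_range[1], num_basis)
        self.register_buffer("centers", centers)
        self.width = (grid_range[1] - grid_range[0]) / (num_basis - 1)
        # One coefficient vector per edge: (out_features, in_features, num_basis).
        self.coeffs = nn.Parameter(torch.randn(out_features, in_features, num_basis) * 0.1)

    def forward(self, x):                          # x: (batch, in_features)
        # Evaluate every basis function at every input value: (batch, in_features, num_basis).
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        # phi[b, o, i] = value of the edge function from input i to output o.
        phi = torch.einsum("oik,bik->boi", self.coeffs, basis)
        # A KAN node has no weights or bias: it just sums its incoming edge functions.
        return phi.sum(dim=-1)                     # (batch, out_features)

# The 2 -> 5 -> 1 network from the slides: 10 + 5 = 15 learnable edge functions in total.
kan = nn.Sequential(SimpleKANLayer(2, 5), SimpleKANLayer(5, 1))
print(kan(torch.randn(16, 2)).shape)               # torch.Size([16, 1])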
Implementation details
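Roughly, in the KAN paper each learnable activation φ is parameterized as a base function plus a B-Spline with trainable coefficients (the exact parameterization varies slightly between versions of the paper):

φ(x) = w_b · silu(x) + w_s · \sum_i c_i B_i(x)

where the B_i are B-Spline basis functions on a grid and the coefficients c_i (together with the scales) are learned.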
Parameters count
Compared to MLP, we also have (G+k) parameters for each activation, because we need to learn where to put the control points for the B-Splines.
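A rough worked example of that count (the grid size G = 5 and spline degree k = 3 are just assumed values here): an MLP layer with n_in inputs and n_out outputs has n_in · n_out + n_out parameters, while a KAN layer has roughly n_in · n_out · (G + k) parameters, one set of spline coefficients per edge. For the 2 → 5 → 1 KAN above, that is 2 · 5 · 8 + 5 · 1 · 8 = 120 parameters, versus 2 · 5 + 5 + 5 · 1 + 1 = 21 for an MLP with the same shape.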
Grid extension
We can increase the number of “control points” in the B-Spline to give it more “degrees of freedom” to better approximate more complex functions, meaning
that we can extend the grid of an existing pre-trained network.
Interpretability
Continual learning
Thanks for watching!
Don’t forget to subscribe for
more amazing content on AI
and Machine Learning!