How do Neural Networks solve the XOR problem?
ARCHITECTURE → 1 input layer, x1 and x2
               1 middle (hidden) layer, h1 and h2
               1 output layer, y1
# Middle layer consists of 2 ReLU-based units
[Diagram]
x1  x2  XOR output
0   0   0
0   1   1
1   0   1
1   1   0
When the input (0, 0) reaches the middle-layer unit h1, the weights [1, 1] are applied to the input values to obtain:
0(1) + 0(1) = 0
Adding the bias term (bias weight -1 applied to the bias unit +1):
0 + (-1)(+1) = -1
However, since ReLU-based units produce an output of 0 for all negative values, the output value for h1 = 0.
Similarly, h2 also yields a value of 0.
→ The middle layer therefore yields [0, 0]. Applying the output weights [-2, 1] and the bias term (0)(+1) = 0, the output of the entire neural network is (-2)(0) + (1)(0) + 0 = 0.
When this process is repeated for all of the input pairs in the table, the values produced match those of the XOR operation.
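As a quick sanity check, here is a minimal NumPy sketch of this two-layer ReLU network. The weights and biases follow the walk-through above (h1: weights [1, 1], bias -1; h2: weights [1, 1], bias 0; output weights [-2, 1], output bias 0); the exact assignment of the two biases to h1 and h2 is an assumption made for illustration.

import numpy as np

def relu(z):
    return np.maximum(0, z)

# Hidden layer: two ReLU units, each with input weights [1, 1].
# h1 has bias -1, h2 has bias 0 (assumed assignment, see note above).
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
b = np.array([-1.0, 0.0])

# Output layer: weights [-2, 1] on [h1, h2], bias 0.
U = np.array([-2.0, 1.0])
c = 0.0

def xor_net(x):
    h = relu(W @ x + b)   # hidden-layer activations
    return U @ h + c      # network output

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, xor_net(np.array(x)))   # prints 0.0, 1.0, 1.0, 0.0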
PREREQUISITES
=> Concept of Neural Networks:
A neural network is simply a network of neural computing units, each of which takes in a vector of
inputs and produces a single output value.
[Diagram]
A neural computing unit is the fundamental building block of a neural network.
It takes in a vector of input values, performs some computation on them and produces an output
value.
When neural computing units receive a vector of input values, they perform a weighted sum on
these input values and then they add a bias to the result of this weighted sum.
The result is then passed into some non-linear function, known as the activation function, to produce an output value.
Eg: w = [0.2, 0.2, 0.2, 0.1], b = 0.5
x = [5.0, 4.0, 1.0, 2.0]
weighted sum = 0.2(5) + 0.2(4) + 0.2(1) + 0.1(2)
                = 2.2
adding bias = 2.2 + 0.5 = 2.7 → value supplied to the activation function g
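A small sketch of this single computing unit, using the numbers above (a sigmoid is assumed for the activation function g, purely for illustration; sigmoid is introduced just below):

import numpy as np

def unit_output(x, w, b, activation):
    # weighted sum of the inputs, plus the bias, passed through the activation
    z = np.dot(w, x) + b
    return activation(z)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

w = np.array([0.2, 0.2, 0.2, 0.1])
b = 0.5
x = np.array([5.0, 4.0, 1.0, 2.0])

print(unit_output(x, w, b, sigmoid))   # z = 2.7, sigmoid(2.7) ≈ 0.937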
=> Activation Functions → a function added to an ANN in order to help the network learn complex patterns in the data.
Compared with the neuron-based model in our brain, the activation function is what ultimately decides what is to be fired to the next neuron.
In an ANN, the activation function of a node defines the output of that node given an input or set of inputs.
They are simply non-linear functions that convert the values they receive into output values for the
neural computational units.
1. Sigmoid Activation Function
f(z) = 1 / (1 + e^-z)
[Graph of sigmoid]
2. Tanh or Hyperbolic Tangent Activation Function (Sigmoidal)
f(x) = tanh(x) = (2 / (1 + e^-2x)) - 1
[Graph of tanh]
→ Why is tanh better than the sigmoid activation function?
- Like sigmoid, when the input is very large or very small, the output is almost flat and the gradient is small, which is not conducive to weight updates.
The difference is the output interval: tanh's output interval is [-1, 1] and the function is zero-centered, which is better than sigmoid.
- A major advantage is that negative inputs are mapped strongly negative and an input of 0 is mapped near 0 in the tanh graph.
→ In binary classification problems, tanh is typically used for the hidden layer and the sigmoid function for the output layer.
3. ReLU (Rectified Linear Unit) activation function:
f(x) = max(0, x), i.e. f(x) = x for x ≥ 0
                       f(x) = 0 for x < 0
[Graph of ReLU]
Range: [0, ∞)
→ Better than tanh and sigmoid:
- When the input is positive, there is no gradient saturation problem.
- The calculation speed is much faster: ReLU involves only a linear relationship, so whether forward or backward it is faster than the other two (sigmoid and tanh need to compute an exponential, which is slower).
(* Drawback: the dead ReLU problem, where a unit whose input is always negative outputs 0 and its weights stop being updated.)
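A small sketch of the three activation functions discussed above, evaluated on a few arbitrary sample inputs:

import numpy as np

def sigmoid(z):
    # squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # zero-centered sigmoidal function with range (-1, 1)
    return np.tanh(z)

def relu(z):
    # 0 for negative inputs, identity for non-negative inputs; range [0, ∞)
    return np.maximum(0, z)

z = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(sigmoid(z))   # ≈ [0.047,  0.378,  0.5,  0.622, 0.953]
print(tanh(z))      # ≈ [-0.995, -0.462, 0.0,  0.462, 0.995]
print(relu(z))      #   [0.0,    0.0,    0.0,  0.5,   3.0]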
=> Feed-Forward Neural Networks
A multilayer network of neural units in which the outputs from the units in each layer are passed to
the units in the higher layer.
These networks don't have any cycles within them, i.e., the outputs from these units don't flow back in a cyclical manner.
[Diagram]
x = (x1, x2, ..., xn) → input values
y = (y1, y2, ..., ym) → output values
x1 to xn → the n input values of the network reside on the first layer (Layer 0)
y1 to ym → the m output values reside on the last layer (Layer 2)
W → matrix containing the weights to be applied to the input values
U → matrix containing the weights to be applied to the output values of the hidden layer
b → vector containing the bias terms added to the weighted input values
Mathematical Representation:
Multinomial classification:
h = g(W·x + b), where g is the activation function
z = U·h
y = softmax(z)
For multinomial classification, a softmax function is used to normalize the vector of real values z obtained from the matrix multiplication of U and h.
The normalization transforms this vector of real values into a vector that represents a probability distribution.
softmax(zi) = e^(zi) / Σj e^(zj),  for 1 ≤ i ≤ d, with the sum running over j = 1 .. d
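A minimal sketch of this forward pass (the layer sizes, the weight values, and the choice of tanh for the hidden activation below are made up purely for illustration):

import numpy as np

def softmax(z):
    # subtract the max for numerical stability; the result sums to 1
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(x, W, b, U):
    h = np.tanh(W @ x + b)   # hidden layer: h = g(W·x + b)
    z = U @ h                # raw output scores: z = U·h
    return softmax(z)        # y = softmax(z), a probability distribution

# Made-up dimensions: 3 inputs, 4 hidden units, 2 output classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
b = np.zeros(4)
U = rng.normal(size=(2, 4))

x = np.array([0.5, -1.0, 2.0])
y = forward(x, W, b, U)
print(y, y.sum())   # two probabilities that sum to 1.0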
A feed-forward neural network is a supervised ML algorithm.
To train a neural network means to figure out the right values of W and U for each layer in the neural
network to enable it to predict accurate values of y when given input values of x.