
Multi-Layer Perceptron Learning

A Multi-Layer Perceptron (MLP) consists of fully connected dense layers that transform input
data from one dimension to another. It is called multi-layer because it contains an input
layer, one or more hidden layers, and an output layer. The purpose of an MLP is to model
complex relationships between inputs and outputs.

Components of Multi-Layer Perceptron (MLP)

 Input Layer: Each neuron or node in this layer corresponds to an input feature. For
instance, if you have three input features, the input layer will have three neurons.

 Hidden Layers: An MLP can have any number of hidden layers, each containing any
number of nodes. These layers process the information received from the input
layer.

 Output Layer: The output layer generates the final prediction or result. If there are
multiple outputs, the output layer will have a corresponding number of neurons.


Every connection between nodes reflects the fully connected nature of an MLP: every node
in one layer connects to every node in the next layer. As the data moves through the
network, each layer transforms it until the final output is generated in the output layer.

Working of Multi-Layer Perceptron


Let's look at how the multi-layer perceptron works, focusing on its key mechanisms: forward
propagation, the loss function, backpropagation, and optimization.

1. Forward Propagation

In forward propagation the data flows from the input layer to the output layer, passing
through any hidden layers. Each neuron in the hidden layers processes the input as follows:

1. Weighted Sum: The neuron computes the weighted sum of the inputs:

z = ∑_i w_i x_i + b

Where:

 x_i is the i-th input feature.

 w_i is the corresponding weight.

 b is the bias term.

2. Activation Function: The weighted sum z is passed through an activation function
to introduce non-linearity. Common activation functions include:

 Sigmoid: σ(z) = 1 / (1 + e^(−z))

 ReLU (Rectified Linear Unit): f(z) = max(0, z)

 Tanh (Hyperbolic Tangent): tanh(z) = 2 / (1 + e^(−2z)) − 1
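
As a quick illustration, here is a minimal NumPy sketch of the weighted sum and these three
activation functions. The input values, weights, and bias below are assumptions chosen for
illustration, not values from the original text.

import numpy as np

def weighted_sum(x, w, b):
    # z = sum_i w_i * x_i + b
    return np.dot(w, x) + b

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def tanh(z):
    return 2.0 / (1.0 + np.exp(-2.0 * z)) - 1.0  # equivalent to np.tanh(z)

x = np.array([0.5, -1.0, 2.0])   # three input features (assumed)
w = np.array([0.4, 0.3, -0.2])   # corresponding weights (assumed)
b = 0.1                          # bias term (assumed)

z = weighted_sum(x, w, b)
print(z, sigmoid(z), relu(z), tanh(z))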


Forward propagation is a fundamental process in neural networks where input data moves
through the network layer by layer until it reaches the output layer. It helps the network
make predictions or classifications.

Here's how it works:

1. Input Layer: The network receives raw data (e.g., an image or numerical values).

2. Weighted Sum Calculation: Each neuron in the next layer computes a weighted sum
of inputs: z = w_1x_1 + w_2x_2 + ... + w_nx_n + b, where the w are weights, the x are
inputs, and b is the bias.

3. Activation Function: The weighted sum is passed through an activation function
(such as ReLU or sigmoid) to introduce non-linearity.

4. Propagation Through Layers: This process continues for multiple hidden layers until
it reaches the output layer.

5. Output Layer: The final layer generates predictions, which could be class
probabilities (in classification problems) or numeric values (in regression problems).
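
To make the layer-by-layer flow concrete, here is a minimal sketch of forward propagation
through one hidden layer and one output layer. The layer sizes, random weights, and use of
softmax at the output are illustrative assumptions, not details from the original text.

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - np.max(z))        # shift for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
x = np.array([0.2, 0.7, 0.1])                     # input layer: 3 raw features (assumed)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)     # hidden layer: 4 neurons (assumed)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)     # output layer: 2 classes (assumed)

h = relu(W1 @ x + b1)          # weighted sum + activation in the hidden layer
y = softmax(W2 @ h + b2)       # output layer produces class probabilities
print(y)
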
Backpropagation in Neural Network
Back Propagation, also known as "Backward Propagation of Errors", is a method
used to train neural networks. Its goal is to reduce the difference between the
model’s predicted output and the actual output by adjusting the weights and biases
in the network.

It works iteratively to adjust the weights and biases so as to minimize the cost function. In
each epoch the model adapts these parameters by reducing the loss, following the error
gradient. It often uses optimization algorithms like gradient descent or stochastic
gradient descent. The algorithm computes the gradient using the chain rule from
calculus, allowing it to effectively navigate the complex layers of the neural network to
minimize the cost function.

Back Propagation plays a critical role in how neural networks improve over time. Here's why:

1. Efficient Weight Updates: It computes the gradient of the loss function with respect
to each weight using the chain rule, making it possible to update weights efficiently.

2. Scalability: The Back Propagation algorithm scales well to networks with multiple
layers and complex architectures, making deep learning feasible.

3. Automated Learning: With Back Propagation the learning process becomes
automated and the model can adjust itself to optimize its performance.

Working of Back Propagation Algorithm

The Back Propagation algorithm involves two main steps: the Forward Pass and the
Backward Pass.

1. Forward Pass

In the forward pass the input data is fed into the input layer. These inputs, combined
with their respective weights, are passed to the hidden layers. For example, in a
network with two hidden layers (h1 and h2), the output from h1 serves as the input
to h2. Before applying an activation function, a bias is added to the weighted inputs.

Each hidden layer computes the weighted sum (`a`) of its inputs and then applies an
activation function like ReLU (Rectified Linear Unit) to obtain the output (`o`). The
output is passed to the next layer, where an activation function such as softmax
converts the weighted outputs into probabilities for classification.
2. Backward Pass

In the backward pass the error (the difference between the predicted and actual output) is
propagated back through the network to adjust the weights and biases. One common
method for error calculation is the Mean Squared Error (MSE) given by:

MSE = (Predicted Output − Actual Output)²

Once the error is calculated, the network adjusts the weights using gradients, which are
computed with the chain rule. These gradients indicate how much each weight and bias
should be adjusted to minimize the error in the next iteration. The backward pass continues
layer by layer, ensuring that the network learns and improves its performance. The activation
function, through its derivative, plays a crucial role in computing these gradients during Back
Propagation.

Example of Back Propagation in Machine Learning

Let’s walk through an example of Back Propagation in machine learning. Assume the
neurons use the sigmoid activation function for the forward and backward pass. The target
output is 0.5 and the learning rate is 1.
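
The numerical walkthrough itself is not reproduced here, but the following minimal sketch
implements the same setup: a single sigmoid neuron, target output 0.5, learning rate 1, and
one gradient-descent update computed with the chain rule. The input values and initial
weights are assumptions chosen for illustration.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.6, 0.4])      # assumed inputs
w = np.array([0.3, 0.7])      # assumed initial weights
b = 0.0                       # assumed initial bias
target = 0.5                  # target output (from the example)
lr = 1.0                      # learning rate (from the example)

# Forward pass
z = np.dot(w, x) + b
y = sigmoid(z)

# Backward pass: error and gradients via the chain rule
loss = 0.5 * (y - target) ** 2         # squared-error loss
dL_dy = y - target                     # dL/dy
dy_dz = y * (1 - y)                    # derivative of the sigmoid
grad_w = dL_dy * dy_dz * x             # dL/dw
grad_b = dL_dy * dy_dz                 # dL/db

# Gradient-descent update of weights and bias
w -= lr * grad_w
b -= lr * grad_b
print("output:", y, "loss:", loss, "updated weights:", w)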

What are radial basis function neural networks?

Radial Basis Function (RBF) Neural Networks are used for function approximation tasks. They
are a special category of feed-forward neural networks comprising three layers. Due to
this distinct three-layer architecture and their universal approximation capabilities, they offer
faster learning speeds and efficient performance in classification and regression problems.

How Do RBF Networks Work?

RBF Networks are conceptually similar to K-Nearest Neighbor (k-NN) models, though their
implementation is distinct. The fundamental idea is that nearby items with similar predictor
variable values influence an item's predicted target value.

1. Input Vector: The network receives an n-dimensional input vector that needs
classification or regression.

2. RBF Neurons: Each neuron in the hidden layer represents a prototype vector from
the training set. The network computes the Euclidean distance between the input
vector and each neuron's center.

3. Activation Function: The Euclidean distance is transformed using a Radial Basis
Function (typically a Gaussian function) to compute the neuron’s activation value.
This value decreases exponentially as the distance increases.

4. Output Nodes: Each output node calculates a score based on a weighted sum of the
activation values from all RBF neurons. For classification, the category with the
highest score is chosen.

Step 2: Spread Parameter (σ): Determines how much influence each radial basis function
(RBF) neuron has.

1. Manual vs. Constant Setting: It can be adjusted individually for each neuron or kept
the same for all neurons.

2. Heuristic Method: A common way to set σ is to divide the largest distance
between neuron centers by the square root of twice the number of centers.

3. Purpose: This method ensures neurons cover the input space effectively, maintaining
a balanced influence.

Step 3: Training the Output Weights

 Linear Regression: Linear regression techniques, which are commonly used to
estimate the output-layer weights, aim to minimize the error between the predicted
output and the actual target values.

 Pseudo-Inverse Method: One popular technique for computing the weights is to
use the pseudo-inverse of the hidden-layer output matrix.
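
As an illustration of Steps 2 and 3, here is a minimal sketch of an RBF network: Gaussian
activations around chosen centers, the spread heuristic σ = d_max / sqrt(2K), and output
weights obtained with the pseudo-inverse. The toy 1D data and the choice of the training
points as centers are assumptions for illustration.

import numpy as np

# Toy 1D regression data (assumed for illustration)
X = np.linspace(0, 1, 20).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel()

centers = X.copy()                              # use the training points as prototype centers
d_max = np.max(np.abs(centers - centers.T))     # largest distance between centers
sigma = d_max / np.sqrt(2 * len(centers))       # heuristic spread parameter

def rbf_features(X, centers, sigma):
    # Gaussian activation: phi(x) = exp(-||x - c||^2 / (2 * sigma^2))
    d2 = (X - centers.T) ** 2                   # squared distances (1D case)
    return np.exp(-d2 / (2 * sigma ** 2))

Phi = rbf_features(X, centers, sigma)           # hidden-layer output matrix
W = np.linalg.pinv(Phi) @ y                     # output weights via the pseudo-inverse

y_pred = rbf_features(X, centers, sigma) @ W    # network predictions
print("training MSE:", np.mean((y_pred - y) ** 2))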

Advantages of RBF Networks

1. Universal Approximation: RBF Networks can approximate any continuous function
with arbitrary accuracy given enough neurons.

2. Faster Learning: The training process is generally faster compared to other neural
network architectures.

3. Simple Architecture: The straightforward three-layer architecture makes RBF
Networks easier to implement and understand.

Applications of RBF Networks

 Classification: RBF Networks are used in pattern recognition and classification tasks
such as speech recognition and image classification.

 Regression: These networks can model complex relationships in data for prediction
tasks.

 Function Approximation: RBF Networks are effective in approximating non-linear
functions.
Interpolation in Machine Learning
In machine learning, interpolation refers to the process of estimating unknown values that
fall between known data points.

This can be useful in various scenarios, such as filling in missing values in a dataset or
generating new data points to smooth out a curve.

Interpolation is an essential method for estimating values within a range of known data
points. It matters in tasks like regression and classification, where the objective is to predict
outcomes based on input features. By interpolating between known data points, ML
algorithms can produce well-informed predictions for unknown or intermediate values.

Interpolation Types
1. Linear Interpolation: Estimates values along a straight line between two data points.

2. Polynomial Interpolation: Uses a polynomial function to fit data points, allowing for
more flexibility in capturing nonlinear patterns.

3. Spline Interpolation: Connects data points smoothly using piecewise polynomials,
avoiding sudden changes.

4. Radial Basis Function Interpolation: Computes values based on distances between
data points, offering a more adaptable approach.

Interpolation in Linear Form


Linear interpolation is a straightforward but efficient technique for estimating values
between two known data points.

Given two data points (x1, y1) and (x2, y2), the value of y at any intermediate point x can be
approximated with:

y = y1 + (x − x1) · (y2 − y1) / (x2 − x1)
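
For example, the following sketch applies this formula directly and checks it against NumPy's
np.interp. The sample points and query value are assumptions for illustration.

import numpy as np

x1, y1 = 1.0, 3.0     # assumed known points
x2, y2 = 4.0, 9.0
x = 2.5               # intermediate point

y = y1 + (x - x1) * (y2 - y1) / (x2 - x1)         # linear interpolation formula
print(y)                                          # 6.0
print(np.interp(x, [x1, x2], [y1, y2]))           # same result with np.interp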

Implementation

 This code snippet illustrates linear interpolation using LinearNDInterpolator from
SciPy (a sketch is shown below).

 It randomly generates 10 data points in 2D space with corresponding values.

 The LinearNDInterpolator function constructs an interpolation function based on
these points. It then interpolates the value at a specified point and visualizes both
the data points and the interpolated point on a scatter plot.

 Finally, the interpolated value at the specified point is printed.
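
The code itself is not included in the source, so here is a minimal sketch matching that
description. The random data, the value function, and the choice of the mean of the points
as the query location are assumptions for illustration.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import LinearNDInterpolator

rng = np.random.default_rng(0)
points = rng.uniform(0, 10, (10, 2))        # 10 random data points in 2D space
values = np.sum(points, axis=1)             # corresponding values (assumed function)

interp = LinearNDInterpolator(points, values)

query = points.mean(axis=0)                 # query point inside the data (assumed)
value = interp(query)                       # interpolated value at the query point

plt.scatter(points[:, 0], points[:, 1], c=values, label='Data Points')
plt.scatter(*query, color='red', marker='x', s=100, label='Interpolated Point')
plt.title('Linear Interpolation with LinearNDInterpolator')
plt.legend()
plt.show()

print('Interpolated value:', value)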

Spline Interpolation
Definition: A technique where the interpolating function is made up of smaller
polynomial segments called splines.

1. Difference from Polynomial Interpolation: Instead of using a single polynomial for all
data points, spline interpolation breaks the data into segments, each with its own
polynomial.

2. Smoothness: Since each segment is connected smoothly, the function better
captures the local behavior of the data without abrupt changes.

3. Cubic Spline Interpolation: The most common type, where each segment is
represented by a cubic polynomial. It ensures smoothness by maintaining continuity
in the first and second derivatives.
Implementation

 This code demonstrates cubic spline interpolation using `CubicSpline` from SciPy. It
starts with a set of sample data points defined in arrays `x` and `y`.

 The `CubicSpline` function constructs a cubic spline interpolation function based on
these points.

 Then, it generates interpolated points along the x-axis and calculates corresponding
y-values.

 Finally, both the original data points and the interpolated curve are plotted using
matplotlib to visualize the interpolation result.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import CubicSpline

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])  # Generate some sample data points
y = np.array([5, 6, 9, 8, 7, 4, 6, 7, 8, 5])

cs = CubicSpline(x, y)  # Create a CubicSpline interpolation

x_interp = np.linspace(1, 10, 100)  # Points at which to evaluate the spline
y_interp = cs(x_interp)

plt.figure(figsize=(8, 6))
plt.plot(x, y, 'o', label='Data Points')
plt.plot(x_interp, y_interp, label='Cubic Spline Interpolation')
plt.title('Cubic Spline Interpolation')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.grid(True)
plt.show()
Radial Basis Function Interpolation
Radial Basis Function (RBF) interpolation is a method of interpolation that uses radial basis
functions to approximate the underlying data. Unlike polynomial interpolation, which fits a
single polynomial to the entire dataset, RBF interpolation uses a combination of radial basis
functions centered at each data point to construct the interpolating function.

Implementation

 This code demonstrates Radial Basis Function (RBF) interpolation
using `RBFInterpolator` from SciPy.

 It generates random data points in a 2D space and calculates corresponding y-values
based on a predefined function.

 A grid is then created for visualization purposes.

 The `RBFInterpolator` function constructs an interpolation function based on the
random data points.

 Finally, it plots the interpolated surface and a scatter plot of the original data points to
visualize the interpolation result.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import RBFInterpolator

# Generate data
rng = np.random.default_rng()
x_data = rng.uniform(-1, 1, (100, 2))
y_data = np.sum(x_data, axis=1) * np.exp(-6 * np.sum(x_data**2, axis=1))

# Interpolate
x_grid = np.mgrid[-1:1:50j, -1:1:50j]
x_flat = np.column_stack((x_grid[0].ravel(), x_grid[1].ravel()))
y_grid = RBFInterpolator(x_data, y_data)(x_flat).reshape(50, 50)

fig, ax = plt.subplots()
ax.pcolormesh(*x_grid, y_grid)
p = ax.scatter(*x_data.T, c=y_data, s=50, edgecolors='k')
fig.colorbar(p)
plt.title('RBF Interpolation with Random Data')
plt.xlabel('X1')
plt.ylabel('X2')
plt.show()
Applications Of Interpolation in Machine Learning
Interpolation in machine learning is used for estimating unknown values between known
data points. Key applications include:

 Image Processing: Resizing images by estimating pixel values.

 Computer Graphics: Creating smooth curves and animations.

 Numerical Analysis: Approximating function values for simulations.

 Signal Processing: Upsampling signals without changing frequency.

 Mathematical Modeling: Estimating unknown values in models.

 GIS: Predicting geographical features like elevation.

 Audio Processing: Resampling audio signals for modifications.


Polynomial Interpolation
Definition: A technique for estimating values by fitting a polynomial through given data
points.

1. Purpose: Helps approximate complex functions without a simple analytical form.

2. How It Works: Finds a polynomial that exactly passes through all known data points.

3. Common Methods:

o Lagrange Polynomial: Uses weighted sums of basis polynomials.

o Newton’s Divided Differences: Builds the polynomial iteratively using divided
differences.

4. Applications: Used in numerical analysis, curve fitting, and scientific computing.
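
For the Lagrange approach, here is a minimal sketch using scipy.interpolate.lagrange; the
sample points are assumptions for illustration.

import numpy as np
from scipy.interpolate import lagrange

x = np.array([0.0, 1.0, 2.0, 3.0])        # assumed data points
y = np.array([1.0, 2.0, 0.0, 5.0])

poly = lagrange(x, y)                      # polynomial passing exactly through all points
print(poly(1.5))                           # estimate at an intermediate point
print(np.allclose(poly(x), y))             # True: the polynomial reproduces the data exactly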

Implementation
 This code demonstrates polynomial interpolation using the interp1d function from
SciPy.

 It begins by generating sample data representing points along a sine curve. The
interp1d function is then applied with a cubic spline interpolation method to
approximate the curve between the data points.

 Finally, the original data points and the interpolated curve are visualized using
matplotlib, showcasing the effectiveness of polynomial interpolation in
approximating the underlying function from sparse data points.

 CODE
import numpy as np
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt

# Generate some sample data
x = np.linspace(0, 10, 10)
y = np.sin(x)

# Perform polynomial interpolation
poly_interp = interp1d(x, y, kind='cubic')

# Generate points for plotting the interpolated curve
x_interp = np.linspace(0, 10, 100)
y_interp = poly_interp(x_interp)

# Plot the original data and the interpolated curve
plt.scatter(x, y, label='Original Data')
plt.plot(x_interp, y_interp, color='red', label='Polynomial Interpolation')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Polynomial Interpolation with interp1d')
plt.legend()
plt.grid(True)
plt.show()
