UNIT II
DEEP NEURAL NETWORKS
Introduction to Neural Networks
• Deep neural networks (DNNs) are a type of artificial neural network
that are made up of many layers of artificial neurons.
• They are used to classify and recognize objects in data.
• A deep neural network (DNN) is an ANN with multiple hidden layers
between the input and output layers.
• Similar to shallow ANNs, DNNs can model complex non-linear
relationships.
Layers in Neural Network Architecture
• Input Layer: This is where the network receives its input data. Each
input neuron in the layer corresponds to a feature in the input data.
• Hidden Layers: These layers perform most of the computational heavy
lifting. A neural network can have one or multiple hidden layers. Each
layer consists of units (neurons) that transform the inputs into
something that the output layer can use.
• Output Layer: The final layer produces the output of the model. The
format of these outputs varies depending on the specific task (e.g.,
classification, regression).
Working of Neural Networks
• Forward Propagation
When data is input into the network, it passes through the network in the
forward direction, from the input layer through the hidden layers to the output
layer. This process is known as forward propagation. Here’s what happens
during this phase:
1. Linear Transformation: Each neuron in a layer receives inputs, which are multiplied by the weights associated with the connections. These products are summed together, and a bias is added to the sum. This can be represented mathematically as: z = w1·x1 + w2·x2 + … + wn·xn + b, where w represents the weights, x represents the inputs, and b is the bias.
2. Activation: The result of the linear transformation (denoted as z) is then passed through an activation function. The activation function is crucial because it introduces non-linearity into the system, enabling the network to learn more complex patterns. Popular activation functions include ReLU, sigmoid, and tanh (see the sketch below).
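The two steps above can be illustrated with a minimal NumPy sketch (the layer size, weights, and the choice of ReLU below are illustrative assumptions, not values taken from these notes):

import numpy as np

def relu(z):
    # ReLU activation: max(0, z), applied element-wise
    return np.maximum(0, z)

# Illustrative example: 3 input features feeding a layer of 2 neurons
x = np.array([0.5, -1.2, 3.0])            # inputs x1..xn
W = np.array([[0.2, -0.4, 0.1],           # one row of weights per neuron
              [0.7,  0.3, -0.5]])
b = np.array([0.1, -0.2])                 # one bias per neuron

z = W @ x + b     # linear transformation: z = w1*x1 + ... + wn*xn + b
a = relu(z)       # activation introduces non-linearity
print(z, a)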
Types of Activation Functions in Deep Learning
• Linear Activation Function
• The linear activation function resembles a straight line defined by y = x. No matter how many layers the neural network contains, if they all use linear activation functions, the output is just a linear combination of the input (a short demonstration follows below).
• The range of the output spans (−∞, +∞).
• The linear activation function is typically used in only one place: the output layer (e.g. for regression).
• Using linear activation across all layers limits the network’s ability to learn complex patterns.
• Linear activation functions are useful for specific tasks but must be combined
with non-linear functions to enhance the neural network’s learning and
predictive capabilities.
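One quick way to see why stacking purely linear layers adds no modelling power is to compose two linear layers and check that the result equals a single linear layer (a small sketch; the weights below are made up):

import numpy as np

W1 = np.array([[1.0, 2.0], [0.0, -1.0]])   # first layer with linear activation
W2 = np.array([[0.5, 0.5], [2.0, 1.0]])    # second layer with linear activation
x = np.array([3.0, -2.0])

y_stacked = W2 @ (W1 @ x)        # two stacked linear layers...
y_single = (W2 @ W1) @ x         # ...collapse into one linear layer (W2 @ W1)

print(np.allclose(y_stacked, y_single))    # True: no extra expressive power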
Non-Linear Activation Functions
• Sigmoid Function
• The Sigmoid Activation Function is characterized by an ‘S’ shape. It is mathematically defined as A = 1 / (1 + e^(−x)). This formula ensures a smooth and continuous output that is essential for gradient-based optimization methods.
• It allows neural networks to handle and model complex patterns that linear
equations cannot.
• The output ranges between 0 and 1, hence useful for binary classification.
• The function exhibits a steep gradient when x values are between -2 and 2.
This sensitivity means that small changes in input x can cause significant
changes in output y, which is critical during the training process.
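A direct implementation of the sigmoid (a minimal sketch; the sample inputs are arbitrary):

import numpy as np

def sigmoid(x):
    # A = 1 / (1 + e^(-x)); output always lies in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))   # ~[0.119, 0.5, 0.881]; steepest change around x = 0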
Tanh Activation Function
• The tanh function, or hyperbolic tangent function, is a scaled and shifted version of the sigmoid, stretching its output across the y-axis. It is defined as:
• f(x) = tanh(x) = 2 / (1 + e^(−2x)) − 1
• Alternatively, it can be expressed using the sigmoid function:
• tanh(x) = 2 × sigmoid(2x) − 1
• Value Range: Outputs values from -1 to +1.
• Non-linear: Enables modeling of complex data patterns.
• Use in Hidden Layers: Commonly used in hidden layers due to its
zero-centered output, facilitating easier learning for subsequent
layers.
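The tanh definition and its sigmoid-based form can be checked numerically (a small sketch; the input values are arbitrary):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh_via_sigmoid(x):
    # tanh(x) = 2 * sigmoid(2x) - 1
    return 2.0 * sigmoid(2.0 * x) - 1.0

x = np.array([-1.5, 0.0, 1.5])
print(np.tanh(x))             # zero-centred output in (-1, 1)
print(tanh_via_sigmoid(x))    # matches np.tanh(x)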
ReLU (Rectified Linear Unit) Function
• ReLU activation is defined by A(x) = max(0, x): if the input x is positive, ReLU returns x; if the input is negative, it returns 0.
• Value Range: [0, ∞), meaning the function only outputs non-negative values.
• Nature: It is a non-linear activation function, allowing neural networks to learn complex patterns and making backpropagation more efficient.
• Advantage over other activations: ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. At any given time only a subset of neurons is activated, which makes the network sparse and therefore efficient to compute.
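ReLU itself is a one-line function; the sketch below also shows the sparsity mentioned above, since negative inputs map to exactly zero (the sample inputs are arbitrary):

import numpy as np

def relu(x):
    # A(x) = max(0, x): passes positive values through, zeroes out negatives
    return np.maximum(0, x)

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(relu(x))                      # [0.  0.  0.  0.5 3. ]
print(np.count_nonzero(relu(x)))    # only 2 of 5 neurons 'fire' -> sparse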
Softmax Function
• Softmax function is designed to handle multi-class classification
problems. It transforms raw output scores from a neural network into
probabilities. It works by squashing the output values of each class
into the range of 0 to 1, while ensuring that the sum of all
probabilities equals 1.
• Softmax is a non-linear activation function.
• The Softmax function ensures that each class is assigned a probability,
helping to identify which class the input belongs to.
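A small softmax sketch (subtracting the maximum score before exponentiating is a standard numerical-stability trick assumed here, not something stated in these notes):

import numpy as np

def softmax(scores):
    # Shift by the max for numerical stability, then normalise so the probabilities sum to 1
    exp_scores = np.exp(scores - np.max(scores))
    return exp_scores / np.sum(exp_scores)

logits = np.array([2.0, 1.0, 0.1])   # raw class scores from the network
probs = softmax(logits)
print(probs, probs.sum())            # ~[0.659 0.242 0.099], sums to 1.0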
SoftPlus Function
• The Softplus function is defined mathematically as A(x) = log(1 + e^x). This equation ensures that the output is always positive and differentiable at all points, which is an advantage over the traditional ReLU function.
• Nature: The Softplus function is non-linear.
• Range: The function outputs values in the range (0, ∞), similar to ReLU, but without the hard zero threshold that ReLU has.
• Smoothness: Softplus is a smooth, continuous function, meaning it
avoids the sharp discontinuities of ReLU, which can sometimes lead to
problems during optimization.
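Softplus can be compared directly with ReLU to see the smooth versus hard threshold (a small sketch with arbitrary inputs):

import numpy as np

def softplus(x):
    # A(x) = log(1 + e^x): smooth, always positive, differentiable everywhere
    return np.log1p(np.exp(x))

x = np.array([-2.0, 0.0, 2.0])
print(softplus(x))          # ~[0.127, 0.693, 2.127]; no hard zero threshold
print(np.maximum(0, x))     # [0. 0. 2.]; ReLU's hard threshold, for contrast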
Impact of Activation Functions on Model Performance
• The choice of activation function has a direct impact on the performance
of a neural network in several ways:
• Convergence Speed: Functions like ReLU allow faster training by avoiding
the vanishing gradient problem, while Sigmoid and Tanh can slow down
convergence in deep networks.
• Gradient Flow: Activation functions like ReLU ensure better gradient
flow, helping deeper layers learn effectively. In contrast, Sigmoid can
lead to small gradients, hindering learning in deep layers.
• Model Complexity: Activation functions like Softmax allow the model to handle complex multi-class problems, whereas simpler functions like ReLU or Leaky ReLU are typically used in the hidden layers.
Feedforward neural network
• Feedforward neural networks are artificial neural networks in which the connections between nodes do not form loops. This type of network is also known as a multi-layer neural network, as information is passed only in the forward direction.
• A feedforward neural network (FNN) is a type of artificial neural
network that processes information in one direction. Examples of
FNNs include networks used for image classification and object
detection.
Example 1: Image classification
• Input: An image with pixel values
• Output: A prediction of the image's class label, such as "cat" or "dog"
• Layers: The network has an input layer, one or more hidden layers,
and an output layer
• Activation functions: The hidden layers apply nonlinear activation functions, such as the rectified linear unit (ReLU) or sigmoid, to the pixel values received from the input layer (see the sketch below)
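A bare-bones forward pass for such a classifier could look like the sketch below (the 8×8 input size, the 16-unit hidden layer, the random weights, and the class names "cat"/"dog" are all illustrative assumptions; a real classifier would be trained rather than randomly initialised):

import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

x = rng.random(64)                                 # flattened 8x8 "image" -> 64 pixel features

W1, b1 = rng.normal(size=(16, 64)), np.zeros(16)   # hidden layer: 16 units
W2, b2 = rng.normal(size=(2, 16)), np.zeros(2)     # output layer: 2 classes

h = relu(W1 @ x + b1)          # hidden layer: linear transform + ReLU
probs = softmax(W2 @ h + b2)   # output layer: class probabilities

print(probs, ["cat", "dog"][int(np.argmax(probs))])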
Example 2: Object detection
• Input: Multiple images of objects, such as starfish or sea urchins
• Output: A prediction of whether the input image contains a starfish or
sea urchin
• Training: The network is trained on images of starfish and sea urchins,
with each object associated with a set of visual features