DL Unit 3 Upto Mid 1
1. Anatomy of a Neural Network
ANN stands for Artificial Neural Network. It is a computational model based on the structure and functions of biological neural networks. The structure of an ANN is shaped by the flow of information through it; hence, the network changes based on its inputs and outputs.
An ANN can be considered nonlinear statistical data modelling, in which the complex relationships between inputs and outputs are defined and different patterns are found. An ANN is also simply called a neural network.
The idea behind ANNs is to mimic the way the human brain works by making the right connections, using silicon and wires in place of living neurons and dendrites. The human brain is composed of about 86 billion nerve cells (neurons), each connected to thousands of other cells by axons. Inputs from the sensory organs are accepted by the dendrites, creating electric impulses that travel through the neural network; to handle different issues, a neuron sends a message to another neuron.
Similarly, ANNs are composed of multiple nodes that imitate the biological neurons of the human brain. The neurons are connected by links and interact with each other. The nodes take input data and perform simple operations on it, and the results of these operations are passed on to other neurons. The output at each node is called its activation or node value.
Each link is associated with a weight. ANNs are capable of learning, which takes place by altering the weight values. The following illustration shows a simple ANN:
A neural network is made up of vertically stacked components called layers. Each dotted line in the image represents a layer. There are three types of layers in a NN:
Input Layer– First is the input layer. This layer will accept the data and pass it to the rest of the
network.
Hidden Layer– The second type of layer is called the hidden layer. A neural network has one or more hidden layers; in the case above, the number is 1. The hidden layers are the ones actually responsible for the excellent performance and complexity of neural networks. They perform multiple functions at the same time, such as data transformation and automatic feature creation.
Output layer– The last type of layer is the output layer. The output layer holds the result or the output
of the problem. Raw images get passed to the input layer and we receive output in the output layer.
ANN Working
The input nodes take in information in numerical form. The information represents an activation value, where each node is given a number: the higher the number, the greater the activation. Based on the weights and the activation function, the activation value passes to the next node.
Each node calculates the weighted sum of its inputs and applies the transfer (activation) function to that sum. From the result, the neuron concludes whether or not it needs to forward the signal. The ANN decides which signals are propagated by adjusting the weights.
The activation runs through the network until it reaches an output node. The output layer presents the information in an understandable way.
The network uses a cost function to compare the output with the expected output. The cost function refers to the difference between the actual value and the predicted value. The lower the cost function, the closer the output is to the desired one.
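To make the weighted sum, activation, and cost concrete, here is a minimal sketch of a single neuron in NumPy; the sigmoid activation, the particular weights, and the squared-error cost are illustrative assumptions, not part of the notes:

    import numpy as np

    def sigmoid(z):                       # transfer (activation) function
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.5, 0.1, 0.4])         # activation values from the previous layer
    w = np.array([0.2, 0.8, -0.5])        # link weights (assumed)
    b = 0.1                               # bias

    activation = sigmoid(np.dot(w, x) + b)  # weighted sum, then activation

    expected = 1.0
    cost = (activation - expected) ** 2     # difference between actual and predicted
    print(activation, cost)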
There are two processes for minimizing the cost function.
1) Back Propagation
Back propagation is the core of neural network training; it is the prime mechanism by which neural networks learn. Data enters the input layer and propagates through the network to give the output.
After that, the cost function compares the output with the desired output. If the value of the cost function is high, the information goes back, and the neural network starts learning to reduce the cost function by adjusting the weights.
Proper adjustment of the weights lowers the error rate and makes the model more reliable.
2) Forward Propagation
The information enters the input layer and moves forward through the network to produce the output value. The user compares this value with the expected results.
The next step is calculating the errors and propagating the information backward. This permits the user to train the neural network and update the weights.
Due to the structured algorithm, the user can adjust the weights simultaneously, which helps identify which weights of the neural network are responsible for the error.
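As a rough, self-contained sketch of one forward pass followed by one backward weight update for the single neuron above (the learning rate, sigmoid activation, and squared-error cost are assumptions for illustration):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.5, 0.1, 0.4])        # inputs
    w = np.array([0.2, 0.8, -0.5])       # weights (assumed)
    b, expected, lr = 0.1, 1.0, 0.1      # bias, target, learning rate (assumed)

    a = sigmoid(np.dot(w, x) + b)        # forward propagation
    d = 2 * (a - expected) * a * (1 - a) # dCost/dz via the chain rule
    w -= lr * d * x                      # back propagation: adjust the weights...
    b -= lr * d                          # ...and bias to reduce the cost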
Types of Artificial Neural Network
1) Feedforward Neural Network
This network contains an input, a hidden, and an output layer. Signals can move in only one direction. Input data passes to the hidden layer, where the mathematical calculations are performed. Each processing element computes based on the weighted sum of its inputs. The output of the previous layer becomes the input of the following layer; this continues through all the layers and determines the output.
Eg: Data mining
2) Feedback Neural Network
This network has feedback paths, which means signals can travel in both directions using loops. Neurons can have all possible connections. Due to the loops, the network becomes a dynamic system that changes continuously until it reaches an equilibrium state.
Eg: Recurrent neural network
ANN Learning Techniques
1) Supervised Learning
In this learning, the user trains the model using labelled data, which means some data is already marked with the correct answers.
Supervised learning can be compared to learning that takes place in the presence of a supervisor.
2) Unsupervised learning
In this learning, the model does not need supervision; it usually deals with unlabelled data.
The user permits the model to work on its own to classify the data. It sorts the data according to similarities and patterns without any prior training on the data.
Artificial Neural Network Applications
1) Text classification and categorization
It is an essential part of many applications like web searching, information filtering, and
language identification.
2) Medical
It can be used for detecting cancer cells and analyzing MRI images to give detailed results.
3) Paraphrase detection
A question answering system needs to determine whether two sentences have the same meaning or not. Artificial neural networks are very helpful in paraphrase detection.
4) Forecast
It can be used in every field of business decision-making, such as finance and the stock market, and in economic and monetary policy.
5) Image processing
It can be used for satellite imagery processing for agricultural and defense purposes.
ANN Advantages
1) It has parallel processing ability, with the numerical strength to perform more than one task at the same time.
2) Failure of one element of the network does not affect the working of the whole system.
This characteristic makes it fault-tolerant.
3) A neural network learns from experience and does not need reprogramming.
Disadvantages of ANN
1) Its black-box nature is the most prominent disadvantage of an ANN: the network does not give a proper explanation of how it determines its output, which reduces trust in the network.
2) The duration of the development of the network is unknown.
3) There is no assurance of a proper network structure; there is no fixed rule for determining the structure.
2. Introduction to Keras:
Keras
TensorFlow
Theano and CNTK
Setting up Deep Learning Workstation
Introduction to Keras
Keras is a deep learning framework for Python that provides a convenient way to define and
train almost any kind of deep learning model.
Keras is a high-level neural networks API, written in Python, which is capable of running on top of TensorFlow, Theano, and CNTK.
It was developed to enable fast experimentation.
With Keras, the entire process of creating a neural network's structure, as well as training and tracking it, becomes exceedingly straightforward.
Keras
Keras is a high-level API that works with the backends TensorFlow, Theano, and CNTK.
It includes a good and user-friendly API for implementing neural network experiments.
It is also capable of running on both CPUs and GPUs. Keras comes with 10 different neural network modelling and training API modules, described below.
1. Modularity
Keras is modular. It considers a model in the form of a graph or a sequence. Keras allows you to save the model you are working on: it provides a save() method to save the current model, so you can use the model again in the future.
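A brief sketch of saving and reloading (the file name my_model.h5 is an assumption, and model stands for any Keras model you have built):

    from keras.models import load_model

    model.save('my_model.h5')            # save() writes the current model to disk
    model = load_model('my_model.h5')    # reload it later to use the model again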
2. Large Dataset
Keras contains a number of large pre-defined datasets. It provides you with a variety of datasets that you can use by directly importing and loading them.
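For example, one of the bundled datasets can be loaded directly (MNIST is used here purely as an illustration):

    from keras.datasets import mnist

    # downloaded on first use, then loaded from the local cache
    (x_train, y_train), (x_test, y_test) = mnist.load_data()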
3. Train from NumPy Data
Keras uses NumPy arrays to train and evaluate the model. It makes use of the fit() method, which fits the model to the training data; this training process may take some time. Among the arguments of fit() are batch_size, validation_data, and epochs.
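A minimal sketch of fitting on NumPy arrays (the random data, layer sizes, and argument values are assumptions):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    x = np.random.random((100, 8))            # 100 samples, 8 features
    y = np.random.randint(2, size=(100, 1))   # binary labels

    model = Sequential()
    model.add(Dense(4, activation='relu', input_shape=(8,)))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])

    # batch_size, validation_data, and epochs are the arguments mentioned above
    model.fit(x, y, batch_size=32, epochs=10, validation_data=(x, y))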
4. Evaluation and Prediction
Keras has evaluate() and predict() methods, which can work with NumPy datasets. After testing on the data, evaluate() is used to assess our models, and predict() produces outputs for new data.
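Continuing the model, x, and y from the sketch above, a minimal use of these methods:

    loss, accuracy = model.evaluate(x, y)   # test the model on NumPy data
    predictions = model.predict(x)          # predicted outputs, one per sample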
5. Pre-trained Models in Keras
Keras contains a number of pre-trained models. These models can be imported from keras.applications.
These models are useful for feature extraction and fine-tuning. keras.applications is a module that contains pre-trained weights for image-classification models like VGG16, VGG19, Xception, etc.
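A brief sketch of importing a pre-trained model for feature extraction (the input size and the choice to freeze the base are illustrative):

    from keras.applications import VGG16

    # ImageNet weights; include_top=False keeps only the convolutional base
    base = VGG16(weights='imagenet', include_top=False,
                 input_shape=(224, 224, 3))
    base.trainable = False   # freeze the base when fine-tuning a classifier on top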
6. Encoding in Keras
Keras allows you to encode features. There is a one_hot() function in Keras that enables encoding: it helps you encode integers in one step and also enables you to tokenize the data. This function filters out white space, converts the text to lower case, and filters out punctuation.
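A small sketch of one_hot() (the sample text and vocabulary size are assumptions; the function hashes words to integers, so the exact values will vary):

    from keras.preprocessing.text import one_hot

    text = 'The quick brown Fox.'
    encoded = one_hot(text, 50)   # lower-cases, strips punctuation, splits on spaces
    print(encoded)                # e.g. [12, 7, 33, 41] - one integer per word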
7. Layers in Keras
There are numerous layers and parameters in Keras. All Keras layers have a number of methods. These layers are useful for constructing, training, and configuring the data; the Dense layer, for example, is useful for implementing fully connected operations.
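A tiny sketch of configuring layers through their constructor parameters (the values are arbitrary):

    from keras.layers import Dense, Dropout

    dense = Dense(64, activation='relu')   # units and activation are parameters
    drop = Dropout(0.5)                    # dropout rate, used for regularization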
8. You can Obtain the Output of an Intermediate Layer
Keras is a very easy library. It enables you to obtain the output of an intermediate layer: to do so, you can simply create a new model whose output is that layer's output.
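A minimal sketch of this pattern (model and x are assumed from the earlier sketches, and the layer index 0 is arbitrary):

    from keras.models import Model

    # a new model mapping the original input to an intermediate layer's output
    intermediate = Model(inputs=model.input, outputs=model.layers[0].output)
    intermediate_output = intermediate.predict(x)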
9. Keras is a Python-Native Library
Keras is a complete Python library: it uses the well-known concepts of Python and is written entirely in the Python language. As Keras is Python-oriented, it provides you with a user-friendly environment.
10. Pre-processing of Data
Keras provides you with several functions for the preprocessing of data. ImageDataGenerator is one such utility. You can import it with:
from keras.preprocessing.image import ImageDataGenerator
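A short sketch of using it (the augmentation values and the x_images/y_labels arrays are assumptions):

    from keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=1.0 / 255,    # normalize pixel values
                                 rotation_range=20,    # random rotations
                                 horizontal_flip=True)
    # flow() yields augmented batches from in-memory arrays during training
    batches = datagen.flow(x_images, y_labels, batch_size=32)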
TensorFlow
TensorFlow came onto the market on 9th November 2015 under the Apache License 2.0.
It is built in such a way that it can easily run on multiple CPUs and GPUs as well as on mobile operating systems.
It has wrappers in distinct languages such as Java, C++, and Python.
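As a small sketch of running a computation in TensorFlow (TensorFlow 2.x eager execution is assumed):

    import tensorflow as tf

    a = tf.constant([[1.0, 2.0]])    # 1x2 tensor
    b = tf.constant([[3.0], [4.0]])  # 2x1 tensor
    c = tf.matmul(a, b)              # a node in the computation graph
    print(c.numpy())                 # [[11.]]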
Advantages of TensorFlow:
TensorFlow has a better graph representation for a given dataset than any other top platform out there.
TensorFlow supports and uses many backends, such as GPUs and ASICs.
When it comes to community support, TensorFlow has the best.
TensorFlow also helps in debugging sub-parts of the graph.
TensorFlow has shown better performance when compared with other platforms.
It is easy to extend, as it gives the freedom to add custom blocks to build on new ideas.
Disadvantages of TensorFlow:
TensorFlow was not specifically designed for the Windows operating system; it was designed for other operating systems such as Linux, but it can be installed on Windows with the help of the Python package installer (pip).
The speed of TensorFlow is low when compared to other platforms of the same type.
For a better understanding of TensorFlow, the user must know the fundamentals of calculus.
Differences between TensorFlow and Keras:
1. TensorFlow is used for large datasets and high-performance models, while Keras is usually used for small datasets.
2. TensorFlow has a complex architecture and is not easy to use, while Keras has a simple architecture and is easy to use.
3. TensorFlow was developed by the Google Brain team, while Keras was developed by François Chollet while he was working on part of the research effort of project ONEIROS.
Theano
Theano was developed at the University of Montreal, Quebec, Canada, by the MILA group.
It is an open-source Python library that is widely used for performing mathematical operations on multi-dimensional arrays, incorporating SciPy and NumPy.
It utilizes GPUs for faster computation and efficiently computes gradients by building symbolic graphs automatically.
It has turned out to be very suitable for numerically unstable expressions, as it can detect them and compute them with more stable algorithms.
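A minimal sketch of Theano's symbolic style: define a symbolic expression, compile it into a function, and let Theano build the gradient automatically:

    import theano
    import theano.tensor as T

    x = T.dscalar('x')                       # symbolic scalar
    y = x ** 2                               # symbolic expression
    f = theano.function([x], y)              # compile the symbolic graph
    df = theano.function([x], T.grad(y, x))  # gradient derived symbolically

    print(f(3.0))    # 9.0
    print(df(3.0))   # 6.0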
Advantages of Theano:
Packages like Keras, Lasagne, and Blocks are built on top of Theano.
Raw Theano is a low-level product; having high-level modules like Keras, Blocks, and Lasagne on top makes it more usable.
Drawbacks of Theano:
On AWS, it can be complex to set up.
It can run on only a single GPU.
It requires long compile times for vast and complex models.
Error notices are cryptic, which makes debugging harder.
Table of Difference between TensorFlow and Theano
Execution Speed: TensorFlow's execution speed is slow when compared to Theano, but in the case of tasks that require multiple GPUs, TensorFlow is faster. Theano performs tasks way faster than TensorFlow; mainly, tasks that require a single GPU run faster in Theano.
CNTK
Microsoft Cognitive Toolkit is an open-source deep learning framework.
It consists of all the basic building blocks required to form a neural network.
Models are trained using C++ or Python, but C# or Java can be used to load a model for making predictions.
Microsoft Cognitive Toolkit (CNTK), whose capabilities have since been folded into the Microsoft Azure Machine Learning ecosystem, is an open-source deep learning framework developed by Microsoft. It is designed to enable efficient and scalable training of deep neural networks for a wide range of machine learning tasks. While it was initially developed by Microsoft Research, it has been integrated into the Azure Machine Learning ecosystem to provide a comprehensive set of tools for building, training, and deploying machine learning models.
Here are some key features and aspects of the Microsoft Cognitive Toolkit (CNTK):
Deep Learning Framework: CNTK is primarily used for deep learning tasks, particularly deep neural
networks. It supports a variety of neural network architectures, including feedforward networks,
convolutional neural networks (CNNs), recurrent neural networks (RNNs), and more.
Performance and Scalability: CNTK is known for its high-performance computing capabilities. It is
optimized for training deep neural networks on multi-GPU setups and distributed computing clusters.
This makes it suitable for handling large-scale machine learning tasks efficiently.
Flexibility: CNTK offers a flexible and extensible architecture, allowing researchers and developers to
customize and experiment with their neural network models easily. It supports both symbolic and
imperative programming paradigms.
Supported Languages: While CNTK is primarily used with Python, it also provides APIs for C++ and
C#. This makes it accessible to a broader audience of developers.
Integration with Azure: As part of the Microsoft Azure Machine Learning ecosystem, CNTK
seamlessly integrates with Azure services, making it easier to deploy models in the cloud and manage
machine learning pipelines.
Pre-trained Models: CNTK provides pre-trained models for various tasks, such as image
classification, speech recognition, and natural language processing. These models can be used as a
starting point for custom projects or fine-tuned for specific tasks.
Cross-Platform Compatibility: While CNTK originated as a Microsoft product, it's available on
multiple platforms, including Windows, Linux, and macOS. This cross-platform compatibility makes it
accessible to a wider user base.
Community and Support: Although CNTK has transitioned to being a part of Azure Machine
Learning, there is still an active user community and documentation available. Users can find resources
and support on the Azure Machine Learning website and GitHub repository.
Learning Curve: While CNTK provides powerful capabilities, it may have a steeper learning curve
compared to some other deep learning frameworks like TensorFlow and PyTorch. However, for users
who require high-performance computing and scalability, it can be a valuable choice.
Microsoft Cognitive Toolkit (CNTK) Components
1. Model Definition: In CNTK, you define your deep neural network model using a high-level API in
your chosen programming language (typically Python). This includes specifying the layers,
connections, and parameters of your neural network.
2. Data Input: Your data, which could be images, text, or any other structured or unstructured data, is
preprocessed and fed into the model. CNTK allows you to create data pipelines for efficient data
loading and preprocessing.
3. Training: The training process involves optimizing the model's parameters (weights and biases) using
a training dataset. This is typically done through stochastic gradient descent (SGD) or variations of it,
like mini-batch gradient descent. During training, you compute the gradients of the model's loss with
respect to its parameters and update the parameters to minimize the loss.
4. Model Evaluation: After training, you evaluate the model's performance using a separate validation
dataset. This helps you assess how well the model generalizes to unseen data.
5. Inference: Once the model is trained and evaluated, you can use it for making predictions or
classifications on new, unseen data. This is called inference. CNTK allows you to deploy your trained
models for real-world applications.
6. Hardware Acceleration: CNTK supports various hardware configurations, including multi-GPU
setups and distributed computing clusters. This hardware acceleration is crucial for speeding up
training, especially for large models and datasets.
7. Integration with Azure: CNTK integrates with Microsoft Azure services for cloud-based machine
learning tasks, making it easier to deploy and manage models in a cloud environment.
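As a rough sketch of these steps in CNTK's Python API (the layer sizes, learning rate, and random data are illustrative assumptions, and the learner/schedule calls follow the CNTK 2.x tutorials, which changed slightly between versions):

    import numpy as np
    import cntk as C

    # 1-2. model definition and data input
    x = C.input_variable(4)                  # 4 input features
    y = C.input_variable(2)                  # 2 classes, one-hot encoded
    z = C.layers.Dense(2)(x)                 # a single dense layer

    # 3. training: loss, metric, SGD learner, trainer
    loss = C.cross_entropy_with_softmax(z, y)
    error = C.classification_error(z, y)
    lr = C.learning_rate_schedule(0.1, C.UnitType.minibatch)
    trainer = C.Trainer(z, (loss, error), [C.sgd(z.parameters, lr)])

    features = np.random.randn(32, 4).astype(np.float32)
    labels = np.eye(2, dtype=np.float32)[np.random.randint(2, size=32)]
    trainer.train_minibatch({x: features, y: labels})

    # 4-5. evaluation and inference
    print(trainer.previous_minibatch_loss_average)
    print(C.softmax(z).eval({x: features[:1]}))   # prediction on new data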
Steps to Build a Model in Keras
1. Define a network: In this step, you define the different layers of the model and the connections between them. Keras has two main types of models: Sequential and Functional. You choose which type of model you want and then define the dataflow between the layers.
2. Compile a network: To compile code means to convert it into a form suitable for the machine to understand. In Keras, the model.compile() method performs this function. To compile the model, we define the loss function, which calculates the losses in our model; the optimizer, which reduces the loss; and the metrics, which are used to find the accuracy of our model.
3. Fit the network: Using this, we fit our model to our data after compiling. This is used to train the
model on our data.
4. Evaluate the network: After fitting our model, we need to evaluate the error in our model.
5. Make Predictions: We use model.predict() to make predictions using our model on new data.
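Putting the five steps together, a minimal end-to-end sketch (the layer sizes, random data, and training settings are assumptions for illustration):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    # 1. define a network
    model = Sequential()
    model.add(Dense(16, activation='relu', input_shape=(8,)))
    model.add(Dense(1, activation='sigmoid'))

    # 2. compile the network: loss, optimizer, and metrics
    model.compile(loss='binary_crossentropy', optimizer='adam',
                  metrics=['accuracy'])

    # 3. fit the network on (random, illustrative) training data
    x = np.random.random((200, 8))
    y = np.random.randint(2, size=(200, 1))
    model.fit(x, y, epochs=5, batch_size=32)

    # 4. evaluate the network
    loss, acc = model.evaluate(x, y)

    # 5. make predictions on new data
    preds = model.predict(np.random.random((3, 8)))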
Keras Models
Keras is a high-level library for deep learning, built on top of Theano and TensorFlow. It is written in Python and provides a clean and convenient way to create a range of deep learning models. Keras has become one of the most used high-level neural network APIs when it comes to developing and testing neural networks.
Creating layers for neural networks as well as setting up complex architectures are now a breeze due to
the Keras high-level API.
A Keras model is made up of a sequence or a standalone graph. There are several fully configurable
modules that can be combined to create new models.
Some of these configurable modules that you can plug together are neural layers, cost functions,
optimizers, initialization schemes, dropout, loss, activation functions, and regularization schemes. One
of the main advantages that come with modularity is that you can easily add new features as separate
modules.
As a result, Keras is very flexible and well-suited for innovative research. There are two ways you can
develop a Keras model: sequential and functional.
Sequential API Mode
The Sequential API model is the simplest model: it comprises a linear stack of layers that allows you to configure models layer-by-layer for most problems.
The Sequential model is very simple to use; however, it is limited in its topology.
The limitation comes from the fact that you cannot configure models with shared layers or with multiple inputs or outputs.
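A minimal sketch of the Sequential style, stacking layers one after another (the sizes are assumed):

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    model.add(Dense(32, activation='relu', input_shape=(10,)))  # layer-by-layer
    model.add(Dense(1, activation='sigmoid'))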
Functional API
Alternatively, the Functional API is ideal for creating complex models that require extended flexibility.
It allows you to define models in which layers connect to more than just the previous and next layers.
Models are defined by creating instances of layers and connecting them directly to each other in pairs; in fact, with this model you can connect a layer to any other layer.
With this model, creating complex networks such as Siamese networks, residual networks, multi-input/multi-output models, directed acyclic graphs (DAGs), and models with shared layers becomes possible.
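A minimal sketch of the Functional style: layers are instances connected in pairs, so the graph can branch, merge, or take multiple inputs (the shapes are assumed):

    from keras.models import Model
    from keras.layers import Input, Dense, concatenate

    input_a = Input(shape=(10,))
    input_b = Input(shape=(4,))    # a second input, which Sequential cannot express
    hidden = Dense(16, activation='relu')(input_a)
    merged = concatenate([hidden, input_b])
    output = Dense(1, activation='sigmoid')(merged)

    model = Model(inputs=[input_a, input_b], outputs=output)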
Applications of Keras
Keras is used for creating deep models which can be productized on smartphones.
Keras is also used for distributed training of deep learning models.
Keras is used by companies such as Netflix, Yelp, Uber, etc.
Keras is also extensively used in deep learning competitions to create and deploy working models quickly.
Advantages of Keras
o It is very easy to understand, and it enables faster deployment of network models.
o It has huge community support in the market as most of the AI companies are keen on using it.
o It supports multiple backends, which means you can use any one of TensorFlow, CNTK, and Theano as the backend, according to your requirement.
o Since it has an easy deployment, it also holds support for cross-platform. Following are the devices on
which Keras can be deployed:
1. iOS with CoreML
2. Android with TensorFlow Android
3. Web browser with .js support
4. Cloud engine
5. Raspberry Pi
o It supports data parallelism, which means Keras can be trained on multiple GPUs at once, speeding up training time and processing a huge amount of data.
Disadvantages of Keras
o The only disadvantage is that Keras has its own pre-configured layers; if you want to create an abstract custom layer, it won't let you, because it cannot handle low-level APIs.
o It only supports a high-level API running on top of a backend engine (TensorFlow, Theano, or CNTK).
Setting up Deep Learning Workstation
Setting up a deep learning workstation is a crucial step in your journey to working on
deep learning projects.
A well-configured workstation can significantly accelerate your model training and
experimentation. Here's a step-by-step guide to setting up a deep learning workstation:
1. Hardware Selection:
Before setting up your workstation, you need to select the appropriate hardware
components based on your budget and requirements. Common components include:
GPU: Invest in a powerful GPU (Graphics Processing Unit) as it significantly speeds up
training deep learning models.
NVIDIA GPUs are the most commonly used for deep learning, and models like the
NVIDIA GeForce RTX 30 series or NVIDIA A100 are popular choices.
CPU: A multi-core CPU with high clock speeds will complement your GPU for tasks
that involve data preprocessing and post-processing.
RAM: Deep learning models often require a lot of memory (16 GB to 64 GB or more) to
handle large datasets and complex models.
Storage: High-speed SSDs are essential for storing datasets, code, and model
checkpoints. Consider adding a larger HDD for long-term storage.
2. Operating System:
Choose a compatible operating system for deep learning. Ubuntu Linux is a popular
choice because it has good support for deep learning libraries and frameworks like
TensorFlow and PyTorch. You can also use Windows Subsystem for Linux (WSL) for
Windows-based workstations.
3. Software Installation:
Install the necessary software and libraries for deep learning. Here are some essential
components:
NVIDIA Drivers: Install the appropriate NVIDIA GPU drivers for your GPU model.
CUDA Toolkit: CUDA is a parallel computing platform that enables GPU acceleration
for deep learning. Install the CUDA Toolkit compatible with your GPU.
cuDNN: This is a GPU-accelerated library for deep neural networks. Install the cuDNN
library compatible with your CUDA version.
Deep Learning Frameworks: Install popular deep learning frameworks like TensorFlow,
PyTorch, and Keras using pip or conda.
Python: Deep learning is typically done using Python. Install Python and relevant
packages using a package manager like Anaconda or Miniconda.
Jupyter Notebooks: Jupyter notebooks are great for interactive coding and visualization.
Install Jupyter using pip or conda.
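After installation, a quick sanity check that the frameworks can actually see the GPU is worthwhile; a small sketch (it assumes TensorFlow 2.x and PyTorch are installed):

    import tensorflow as tf
    print(tf.config.list_physical_devices('GPU'))   # lists visible GPUs, [] if none

    import torch
    print(torch.cuda.is_available())                # True if CUDA is usable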
4. Development Environment:
Set up your preferred integrated development environment (IDE) or code editor. Some
popular options include PyCharm, Visual Studio Code, and Jupyter Notebooks.
5. Data Management:
Organize and store your datasets efficiently. Make use of high-capacity storage drives
and consider setting up a dedicated data pipeline if working with large datasets.
6. Version Control:
Use version control systems like Git and platforms like GitHub or GitLab to keep track
of your code changes and collaborate with others.
7. Training Environment:
Ensure your deep learning workstation is in a well-ventilated area to prevent overheating
during intensive training sessions. Proper cooling is essential.
8. Framework-Specific Setup:
Depending on the deep learning framework you're using (TensorFlow, PyTorch, etc.),
you may need to configure specific settings and environment variables.
9. Tutorials and Documentation:
Familiarize yourself with the documentation and tutorials provided by the deep learning
frameworks and libraries you're using. This will help you get started with your projects.
10. Experiment and Learn:
Finally, start experimenting with deep learning projects, taking advantage of the
powerful hardware and software setup you've created. Continuously learn and improve
your skills by working on a variety of projects and staying updated with the latest
developments in the field.
Configuration of the Workstation:
Processor – Intel Xeon E5-2630 v4 – 10-core processor, 2.2 GHz with Turbo Boost up to 3.1 GHz, 25 MB cache