
Unit V:

Neural Networks and Deep Learning

Introduction to Artificial Neural Networks with Keras, Implementing MLPs
with Keras, Installing TensorFlow 2, Loading and Preprocessing Data with
TensorFlow.
Introduction to ANNs

• Artificial Neural Networks (ANNs) are algorithms inspired by how the brain
works; they are used to model complicated patterns and to make predictions.
The ANN is a deep learning method that arose from the concept of the human
brain's biological neural networks.

• ANNs are at the very core of Deep Learning, being used in


» Google Images
» Apple’s Siri
» YouTube
» DeepMind’s AlphaGo
From Biological to Artificial Neurons
• ANNs were first introduced back in 1943 by the
neurophysiologist Warren McCulloch and the mathematician
Walter Pitts.
• The early successes of ANNs led to the widespread belief that
we would soon be conversing with truly intelligent machines.
• Sadly, this promise went unfulfilled, triggering the first AI
winter in the 1970s.
• In the mid-1980s, new architectures such as Multilayer Perceptrons
(MLPs) and better training techniques such as the
backpropagation algorithm revived interest in
connectionism (i.e., the study of neural networks).
• However, by the 1990s, other powerful Machine Learning
techniques such as Support Vector Machines and Random
Forests overtook ANNs.
From Biological to Artificial Neurons (cont.)
• Since the early 2010s, with the success of deep learning in computer
vision, there has been a huge wave of interest in ANNs.
• Reasons for this AI spring:
» There is now a huge quantity of data available to train
neural networks
 ANNs frequently outperform other ML techniques on very
large and complex problems.
» The tremendous increase in computing power since the 1990s
now makes it possible to train large neural networks in a
reasonable amount of time.
 The availability of Powerful GPU cards and cloud
computing platforms.
» The training algorithms have been improved.
 We can train very large networks now.
A Biological Neuron
Multiple layers in a biological neural
network (human cortex)
Logical Computations with Neurons
• McCulloch and Pitts proposed a very simple model of the
biological neuron, which later became known as an artificial
neuron: it has one or more binary (on/off) inputs and one
binary output.
» Even with such a simplified model it is possible to build a network
of artificial neurons that computes any logical proposition you want.

• These networks can be combined to compute complex logical expressions.
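As a minimal illustration (not from the slides), here is a Python sketch of such threshold neurons with binary inputs; a neuron fires when enough of its inputs are on:

import numpy as np

def mp_neuron(inputs, threshold):
    # McCulloch-Pitts-style unit: binary inputs, binary output;
    # it fires when the number of active inputs reaches the threshold.
    return int(np.sum(inputs) >= threshold)

a, b = 1, 0
print(mp_neuron([a, b], threshold=2))   # A AND B -> 0
print(mp_neuron([a, b], threshold=1))   # A OR B  -> 1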
The Perceptron
• The Perceptron is one of the simplest ANN architectures,
invented in 1957 by Frank Rosenblatt.
» Based on a slightly different artificial neuron called a
threshold logic unit (TLU), or sometimes a linear
threshold unit (LTU).
 Computes a weighted sum of its inputs, then applies a step
function
• Architecture of a Perceptron with two input neurons, one bias
neuron, and three output neurons:
Single-layer perceptron
A single-layer perceptron is a simple neural network that contains
only one layer of weights.
It computes the sum of the input vector, with each input value multiplied by
its corresponding weight, and passes this weighted sum to an activation
function, whose result is the displayed output.

The perceptron consists of 4 parts.


1.Input values or One input layer
2.Weights and Bias
3.Net sum
4.Activation Function
A single-layer perceptron has just two layers: input and output. It has only
a single layer of weights, hence the name single-layer perceptron. Unlike the
multilayer perceptron, it contains no hidden layers.

Input nodes are fully connected to one or more nodes in the next
layer. A node in the next layer takes a weighted sum of all its inputs.
Multi-layer perceptron
A multilayer perceptron is a type of feed-forward artificial neural
network that generates a set of outputs from a set of inputs.
An MLP is a neural network connecting multiple layers in a directed
graph, which means that the signal path through the nodes only goes
one way.
The MLP network consists of input, output, and hidden layers.
Each hidden layer consists of numerous perceptrons, which are called
hidden units.
How is a Perceptron trained?

• The Perceptron training algorithm proposed by Rosenblatt was


largely inspired by Hebb’s rule.
» when a biological neuron triggers another neuron often, the connection
between these two neurons grows stronger.
» “Cells that fire together, wire together”
• Perceptron learning rule:
» For every output neuron that produced a wrong prediction, it reinforces
the connection weights from the inputs that would have contributed to
the correct prediction.

• Perceptron convergence theorem:


» if the training instances are linearly separable, Rosenblatt demonstrated
that this algorithm would converge to a solution.
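The standard form of this update rule (reconstructed here from the usual definition, not from the slide image) can be written as:

w_{i,j}^{(\text{next step})} = w_{i,j} + \eta \, (y_j - \hat{y}_j) \, x_i

where w_{i,j} is the weight between the i-th input and the j-th output neuron, x_i is the i-th input value, ŷ_j is the neuron's output, y_j is the target output, and η is the learning rate.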
Perceptron in Scikit-Learn
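The slide's code is not reproduced above; a minimal sketch using Scikit-Learn's Perceptron class (here on two iris features, detecting Iris setosa, which is an illustrative choice) looks like this:

from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron

iris = load_iris()
X = iris.data[:, (2, 3)]              # petal length, petal width
y = (iris.target == 0).astype(int)    # is it an Iris setosa?

per_clf = Perceptron()
per_clf.fit(X, y)

y_pred = per_clf.predict([[2, 0.5]])  # predict for one new flower
print(y_pred)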
Limitations of Perceptrons

• In their 1969 monograph Perceptrons, Marvin Minsky and


Seymour Papert highlighted a number of serious weaknesses of
Perceptrons
» Exclusive OR (XOR) classification problem
• But some of the limitations can be eliminated by Multilayer
Perceptron (MLP)
Architecture of a Multilayer Perceptron
Backpropagation Algorithm
• The field of Deep Learning studies deep neural networks
(DNNs)---ANNs containing a deep stack of hidden layers---and more
generally models containing deep stacks of computations.
• For many years researchers struggled to find a way to train MLPs,
without success.
• In 1986, David Rumelhart, Geoffrey Hinton, and Ronald
Williams introduced the backpropagation training
algorithm.
» Two passes through the network (one forward,
one backward)
» Compute the gradient of the network’s error with regard
to every single model parameter.
» Adjust the parameters by using gradient descent.
• The algorithm handles one mini-batch at a time (e.g., 32
instances in the training set).
• It goes through the full training set multiple times. Each pass is
called an epoch.
• Forward pass: Each mini-batch is passed from the network’s
input layer to the output layer through the hidden layers. All
intermediate results are preserved.
• Measures the network’s output error by a loss function that
compares the desired output and the actual output of the
network.
• Computes how much each output connection contributed to the
error by the chain rule.
• Reverse pass: Measures how much of these error
contributions came from each connection in the layer below,
again using the chain rule, working backward until the
algorithm reaches the input layer.
• Performs a Gradient Descent step to tweak all the connection
weights in the network, using the error gradients it just
computed.
Activation Functions
• The backpropagation algorithm cannot be used with the step function,
which provides no gradient for Gradient Descent to work with.
• Logistic (sigmoid) function: σ(z) = 1 / (1 + exp(–z)).
• Hyperbolic tangent function: tanh(z) = 2σ(2z) – 1
• Rectified Linear Unit function: ReLU(z) = max(0, z)
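A small NumPy sketch (not from the slides) of the three activation functions listed above:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def tanh(z):
    return 2 * sigmoid(2 * z) - 1      # equivalent to np.tanh(z)

def relu(z):
    return np.maximum(0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z), tanh(z), relu(z), sep="\n")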
Output Neurons for Regression MLPs
• For regression tasks, an ANN predicts a single value only.
Thus, one output neuron is sufficient.
• For multivariate regression, there is one output neuron per
output dimension.
• In general, output neurons use no activation function, so they are free to
output any range of values. If the output must be positive or bounded, use:
» ReLU function: ReLU(z) = max(0, z)
» Softplus function: softplus(z) = log(1 + exp(z))
» logistic function or hyperbolic tangent, with scaling factors, for a bounded range.
• Loss functions:
» Mean Squared Error
» Mean Absolute Error
» Huber Loss
Typical regression MLP architecture
Output Neurons for Classification MLPs
• For binary classification tasks, there is one output neuron with
the logistic activation function
» The output can be interpreted as the estimated probability of the
positive class.
• For multilabel binary classification tasks, you need multiple
output neurons.
• For multiclass classification tasks, there is one output neuron
per class, and the softmax activation function should be used
for the whole output layer.
Output Neurons for Classification MLPs (cont.)
• The predicted class is:

• Loss function: cross-entropy loss (a.k.a. the log loss):
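The slide's equations are not reproduced above; in standard form (a reconstruction from the usual definitions, written in LaTeX), the predicted class is the one with the highest softmax probability, and the cross-entropy loss over m instances and K classes is:

\hat{y} = \underset{k}{\operatorname{argmax}}\ \hat{p}_k
\qquad
J(\boldsymbol{\Theta}) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K} y_k^{(i)} \log\!\left(\hat{p}_k^{(i)}\right)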
A modern MLP for classification
Typical classification MLP architecture
Implementing MLPs with Keras
• Keras is a high-level Deep Learning API that
allows you to easily build, train, evaluate, and
execute all sorts of neural networks.
» https://keras.io
» Computation backend: TensorFlow, Microsoft
Cognitive Toolkit (CNTK), Theano, Apache
MXNet, Apple’s Core ML, JavaScript or
TypeScript, and PlaidML.
• tf.keras: Extended Keras implementation based
on TensorFlow with TensorFlow-specific
features.
• PyTorch is also quite popular.
Multibackend Keras vs. tf.keras
Installing TensorFlow 2
• If you are using Google Colab only, you can skip this step.
• If you plan to run your code on your own computer, please install Jupyter,
Scikit-Learn, etc.
• Activate the virtual environment and then use pip to install TensorFlow 2:

• Open a Python shell or a Jupyter notebook and print the versions of
TensorFlow and tf.keras:
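A sketch of these steps (the shell command is shown as a comment; exact version numbers will vary):

# In the activated virtual environment:
#   $ pip install -U tensorflow        # or tensorflow-gpu for GPU support
import tensorflow as tf
from tensorflow import keras

print(tf.__version__)       # e.g. "2.x.x"
print(keras.__version__)    # the tf.keras version string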
Fashion MNIST Dataset
• 70,000 grayscale images of 28 × 28 pixels each, with 10 classes

• Drop-in replacement of MNIST in Chapter 2.


» But the images represent fashion items rather than handwritten
digits.
» More challenging than MNIST: a simple linear model reaches about 92%
accuracy on MNIST, but only about 83% on Fashion MNIST.
Using Keras to Load the Dataset
• Keras provides some utility functions to fetch and load common datasets.

• Loading data from Keras is different from Scikit-Learn:


» Every image is represented as a 28 × 28 array rather than a 1D array of
size 784.

» The pixel intensities are represented as integers (from 0 to 255) rather than floats (from
0.0 to 255.0)
• Since we are going to train the neural network using Gradient Descent, we
must scale the input features down to the 0–1 range by dividing them by
255.0:
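A sketch of loading and scaling the data with tf.keras; holding out the last 5,000 training images as a validation set is an assumption that matches common practice:

from tensorflow import keras

fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()

# Scale pixel intensities to the 0-1 range and carve out a validation set.
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
X_test = X_test / 255.0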
Naming the Labels
• Unlike MNIST, Fashion MNIST needs a list of class names so that we
know what each label (and hence each image) represents:

• For example, the first image in the training set represents a coat:
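Continuing the loading sketch above, the class names can be stored in a plain list indexed by label:

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

print(class_names[y_train[0]])   # e.g. "Coat" for the first training image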
Creating the model using the Sequential API

• The first method for building a neural network in tf.keras is the use of
Sequential API.
» Only for neural networks that are composed of a single stack of layers connected
sequentially.
• The tf.keras code for building a classification MLP with two
hidden layers:

» The Flatten layer converts each input image into a 1D array.


» Each Dense layer manages its own weight matrix, containing all the connection weights
between the neurons and their inputs, as well as the bias terms.
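A sketch of such a model; the hidden layer sizes (300 and 100 neurons) are illustrative values:

from tensorflow import keras

model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))    # 28 x 28 image -> 784-element 1D array
model.add(keras.layers.Dense(300, activation="relu"))
model.add(keras.layers.Dense(100, activation="relu"))
model.add(keras.layers.Dense(10, activation="softmax"))  # one output neuron per class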
Creating the model using the Sequential API
(cont.)
• Alternatively, you can add the layers when the Sequential model is created.

• The model’s summary() method displays the information of the model’s layers:
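The equivalent construction, passing the list of layers to the constructor, followed by summary():

model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(300, activation="relu"),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])

model.summary()   # prints each layer's name, output shape, and parameter count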
Accessing the Information of a Model
• Directly get a model’s list of layers:

• All the parameters of a layer can be accessed using its get_weights() and set_weights() methods:
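A sketch of inspecting the layers and one layer's parameters for the model built above:

print(model.layers)                     # list of Layer objects
hidden1 = model.layers[1]               # the first Dense layer
weights, biases = hidden1.get_weights()
print(weights.shape, biases.shape)      # (784, 300) and (300,) for this model
hidden1.set_weights([weights, biases])  # parameters can be replaced the same way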
Compiling the Model
• Before training the model, you must compile the model:

• Use the "sparse_categorical_cross entropy" loss when we have


sparse labels (i.e., for each instance, there is just a tar- get class
index, from 0 to 9 in this case) and the classes are exclusive.
» Otherwise, the "categorical_crossentropy" loss if one-hot vectors is used
(i.e., [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.] represents class 3).
» Otherwise, use the "binary_crossentropy" loss if the "sigmoid" (i.e.,
logistic) activation function in the output layer is used for binary
classification tasks.
• Use “sgd” for Stochastic Gradient Descent (i.e., reverse-mode
autodiff plus Gradient Descent)
• Use “accuracy” because our model is a classifier.
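Putting these choices together, the compile step looks like this:

model.compile(loss="sparse_categorical_crossentropy",
              optimizer="sgd",
              metrics=["accuracy"])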
Training the Model
• After compiling the model, call fit() to train the model with the training and
validation datasets.

• You should check whether overfitting occurs (i.e., accuracy >> val_accuracy)
• Consider passing the class_weight argument if the training set is skewed.
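A sketch of the training call; 30 epochs is an illustrative value:

history = model.fit(X_train, y_train, epochs=30,
                    validation_data=(X_valid, y_valid))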
Drawing the Learning Curves
• fit() returns a History object, which contains:
» The training parameters (history.params)
» The list of epochs it went through (history.epoch)
» The loss and extra metrics at the end of each epoch on the training set
and on the validation set (history.history).
• You can draw the learning curves using matplotlib:
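A minimal sketch using pandas and matplotlib on the History object returned by fit():

import pandas as pd
import matplotlib.pyplot as plt

pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1)   # losses and accuracies for this task fit in [0, 1]
plt.show()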
Drawing the Learning Curves (cont.)
• The learning curve shows the mean training loss and accuracy
measured over each epoch, and the mean validation loss and
accuracy measured at the end of each epoch:

• When reporting the learning curves, you should shift the training
curve in the above graph by half an epoch to the left.
Continue the Training
• If the model has not converged yet, call fit() again to continue the
training.
• If you are not satisfied with the performance of your model, you
should go back and tune the hyperparameters.
» Tune the learning rate
» Try another optimizer
» Adjust the number of layers, the number of neurons per layer, and the
types of activation functions to use for each hidden layer
» Change the batch size
• Finally, estimate the generalization error using the test set before
you deploy the model to production.

• Don’t tweak the hyperparameters to improve the accuracy on the test set.
Using the Model to Make Predictions
• After training the model, you can use the model’s predict() method to
make predictions on new instances:

• If you want to know the class with the highest estimated probability only,
use the predict_classes() method instead:

• They should be correct (otherwise, more training)
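Continuing the sketches above, here are both calls on the first three test images; note that predict_classes() has been removed in recent TensorFlow releases, so taking the argmax of predict() is shown as the portable equivalent:

import numpy as np

X_new = X_test[:3]
y_proba = model.predict(X_new)
print(y_proba.round(2))                 # one estimated probability per class

y_pred = np.argmax(y_proba, axis=1)     # class with the highest probability
print(np.array(class_names)[y_pred])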


California Housing with the Sequential API
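The slide's code is not reproduced above; a minimal regression-MLP sketch on the California Housing data (hyperparameters are illustrative) could look like this:

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras

housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)

# Scale the features before feeding them to Gradient Descent.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=X_train.shape[1:]),
    keras.layers.Dense(1)                 # single output neuron, no activation
])
model.compile(loss="mean_squared_error", optimizer="sgd")
history = model.fit(X_train, y_train, epochs=20,
                    validation_data=(X_valid, y_valid))
mse_test = model.evaluate(X_test, y_test)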
Building Complex Models Using the
Functional API
• You cannot use the Sequential API to build nonsequential
neural networks.
• For example, consider the Wide & Deep neural network:
» can learn both deep patterns (using the deep path) and simple rules
(through the short path)
Using the Functional API
• How about sending a subset of the features through the wide
path and a different subset (possibly overlapping) through the
deep path:
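A sketch of such a two-input Wide & Deep model built with the Functional API; the input sizes (5 and 6 features) assume the California Housing feature split used in the next sketch:

from tensorflow import keras

input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_A, hidden2])   # wide path joins the deep path here
output = keras.layers.Dense(1, name="output")(concat)
model = keras.models.Model(inputs=[input_A, input_B], outputs=[output])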
Using the Functional API (cont.)
• Compile, train, and evaluate the model, and then make
predictions:
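Continuing the sketch above, each input gets its own slice of the features (the 0-4 / 2-7 split is illustrative):

model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))

X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:]
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:]
X_test_A, X_test_B = X_test[:, :5], X_test[:, 2:]

history = model.fit([X_train_A, X_train_B], y_train, epochs=20,
                    validation_data=([X_valid_A, X_valid_B], y_valid))
mse_test = model.evaluate([X_test_A, X_test_B], y_test)
y_pred = model.predict([X_test_A[:3], X_test_B[:3]])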
Models with Multiple Outputs
• Reasons for having multiple outputs:
» The task may demand it.
» You have multiple independent tasks
based on the same data---multitask
classification.
» Add some auxiliary outputs for
regularization.
Models with Multiple Outputs (cont.)
• Each output will need its own loss function:

• Train the models with two datasets:

• Evaluate the outputs separately:

• Likewise, make predictions separately:
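A sketch extending the Wide & Deep model above with an auxiliary output used for regularization; the 0.9/0.1 loss weights are illustrative:

aux_output = keras.layers.Dense(1, name="aux_output")(hidden2)
model = keras.models.Model(inputs=[input_A, input_B],
                           outputs=[output, aux_output])

# One loss per output, with a weight on each loss.
model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1],
              optimizer=keras.optimizers.SGD(learning_rate=1e-3))

# Each output needs its own labels (here both outputs predict the same target).
history = model.fit([X_train_A, X_train_B], [y_train, y_train], epochs=20,
                    validation_data=([X_valid_A, X_valid_B], [y_valid, y_valid]))

total_loss, main_loss, aux_loss = model.evaluate([X_test_A, X_test_B],
                                                 [y_test, y_test])
y_pred_main, y_pred_aux = model.predict([X_test_A[:3], X_test_B[:3]])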


Using the Subclassing API to Build
Dynamic Models
• Both the Sequential API and the Functional API
are declarative
» Advantages:
 The model can easily be saved, cloned, and shared
 its structure can be displayed and analyzed
 the framework can infer shapes and check types, so errors
can be caught early
 It’s also fairly easy to debug, since the whole model is a
static graph of layers.
» Disadvantage:
 The models are static---cannot build models that involve
loops, varying shapes, conditional branching, and other
dynamic behaviors.
Using the Subclassing API to Build
Dynamic Models (cont.)
• The Subclassing API: subclass the Model class, create the layers you need in
the constructor, and use them to perform the computations you want in the
call() method.
» Advantage: Imperative programming style---you can use for loops, if statements,
and low-level TensorFlow operations in call()
» Disadvantage: Keras cannot inspect the model’s architecture and it is hard to
debug
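A sketch of the same Wide & Deep idea written with the Subclassing API:

from tensorflow import keras

class WideAndDeepModel(keras.models.Model):
    def __init__(self, units=30, activation="relu", **kwargs):
        super().__init__(**kwargs)
        self.hidden1 = keras.layers.Dense(units, activation=activation)
        self.hidden2 = keras.layers.Dense(units, activation=activation)
        self.main_output = keras.layers.Dense(1)
        self.aux_output = keras.layers.Dense(1)

    def call(self, inputs):
        # Imperative style: loops, conditionals, and low-level TF ops could go here.
        input_A, input_B = inputs
        hidden1 = self.hidden1(input_B)
        hidden2 = self.hidden2(hidden1)
        concat = keras.layers.concatenate([input_A, hidden2])
        return self.main_output(concat), self.aux_output(hidden2)

model = WideAndDeepModel()   # compile and fit as before, passing [X_A, X_B] inputs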
Saving and Restoring a Model
• When using the Sequential API or the Functional API, you can save a
trained Keras model:

• Keras will use the HDF5 format to save


» The model’s architecture (including every layer’s hyperparameters)
» The values of all the model parameters for every layer (e.g., connection
weights and biases)
» The optimizer (including its hyperparameters and any state it may
have)
» etc.
• To load the model:
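A minimal sketch (the filename is illustrative):

model.save("my_keras_model.h5")                       # HDF5 file: architecture, weights, optimizer state
model = keras.models.load_model("my_keras_model.h5")  # restore it later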
Using Callbacks to Save Intermediate
Models during Training
• Remember to save models at regular intervals during a long training
session to avoid losing everything if your computer crashes.
• The fit() method accepts a callbacks argument that lets you specify a list of
objects that Keras will call at the start and end of training, at the start and
end of each epoch, and even before and after processing each batch.

• If you use a validation set during training, you can set
save_best_only=True when creating the ModelCheckpoint to implement
early stopping:
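A sketch of the ModelCheckpoint callback; with save_best_only=True, reloading the saved file afterwards rolls you back to the best model seen on the validation set:

checkpoint_cb = keras.callbacks.ModelCheckpoint("my_keras_model.h5",
                                                save_best_only=True)
history = model.fit(X_train, y_train, epochs=10,
                    validation_data=(X_valid, y_valid),
                    callbacks=[checkpoint_cb])
model = keras.models.load_model("my_keras_model.h5")   # best model on the validation set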
Using Callbacks to Implement Early
Stopping and custom callbacks
• Another way to implement early stopping is to simply use the
EarlyStopping callback.

• If you need extra control, you can easily write your own
custom callbacks. For example,
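A sketch of the EarlyStopping callback plus a tiny custom callback that prints the validation/training loss ratio after each epoch (the custom class is an illustrative example):

early_stopping_cb = keras.callbacks.EarlyStopping(patience=10,
                                                  restore_best_weights=True)

class PrintValTrainRatioCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        print("\nval/train loss ratio: {:.2f}".format(logs["val_loss"] / logs["loss"]))

history = model.fit(X_train, y_train, epochs=100,
                    validation_data=(X_valid, y_valid),
                    callbacks=[checkpoint_cb, early_stopping_cb,
                               PrintValTrainRatioCallback()])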
Using TensorBoard for Visualization
• TensorBoard is a great interactive visualization
tool that you can use to
» view the learning curves during training
» compare learning curves between multiple runs
» visualize the computation graph
» analyze training statistics
» view images generated by your model
» visualize complex multidimensional data projected
down to 3D and automatically clustered for you
» etc.
Visualizing Learning Curves with TensorBoard
Using TensorBoard
• To use TensorBoard, you must modify your program so that it outputs the
data you want to visualize to special binary log files called event files.
• Each binary data record is called a summary.
• The TensorBoard server will monitor the log directory, and it will
automatically pick up the changes and update the visualizations.
• In general, you want to point the TensorBoard server to a root log
directory and configure your program so that it writes to a different
subdirectory every time it runs.
• Define the root log directory for TensorBoard logs
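A sketch of such a setup; the directory names and the timestamped run-id format are assumptions:

import os
import time

root_logdir = os.path.join(os.curdir, "my_logs")

def get_run_logdir():
    # One subdirectory per run, named after the current date and time.
    run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
    return os.path.join(root_logdir, run_id)

run_logdir = get_run_logdir()   # e.g. ./my_logs/run_2024_01_01-12_00_00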
Using TensorBoard (cont.)
• Keras provides the TensorBoard() callback:

• The callback automatically creates the log directory, generates
event files, and writes summaries to them during training.
• The directory structure:
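A sketch of training with the TensorBoard callback pointed at the run directory defined above; the callback then writes train/ and validation/ event files under that directory:

tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)
history = model.fit(X_train, y_train, epochs=30,
                    validation_data=(X_valid, y_valid),
                    callbacks=[tensorboard_cb])
# my_logs/
#   run_.../train/events.out.tfevents...        (layout is indicative, not exact)
#   run_.../validation/events.out.tfevents...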
Using TensorBoard (cont.)
• Start the TensorBoard server by running a command in a
terminal:

• Once the server is up, you can open a web browser and go to
http://localhost:6006
• To use TensorBoard directly within Jupyter:
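The corresponding terminal command and Jupyter magics, shown here as comments since they are not Python statements:

#   $ tensorboard --logdir=./my_logs --port=6006    # then browse to http://localhost:6006

#   %load_ext tensorboard                           # inside a Jupyter/Colab cell
#   %tensorboard --logdir=./my_logs --port=6006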
Using TensorBoard (cont.)
• TensorFlow offers a lower-level API in the tf.summary package.
» E.g., you can create a SummaryWriter using the create_file_writer() function
and use this writer as a context to log scalars, histograms, images, audio,
and text, all of which can then be visualized using TensorBoard.
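A sketch of this lower-level API, logging a scalar and a histogram (the logged values are arbitrary illustrations):

import numpy as np
import tensorflow as tf

test_logdir = get_run_logdir()
writer = tf.summary.create_file_writer(test_logdir)
with writer.as_default():
    for step in range(1, 1001):
        tf.summary.scalar("my_scalar", np.sin(step / 10), step=step)
        data = (np.random.randn(100) + 2) * step / 100     # gradually drifting data
        tf.summary.histogram("my_hist", data, buckets=50, step=step)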
Number of Hidden Layers
• Theoretically, you can use a shallow neural network to model even the
most complex functions, provided it has enough neurons.
• But deep networks have a much higher parameter efficiency than shallow
ones for complex problems.
» Real-world data is often structured in such a hierarchical way, and deep neural networks
automatically take advantage of this fact.
• Not only does this hierarchical architecture help DNNs converge faster to a
good solution, but it also improves their ability to generalize to new
datasets (i.e., transfer learning)
• Very complex tasks, such as large image classification or speech
recognition, typically require networks with hundreds of layers and they
need a huge amount of training data.
» It is more common to reuse parts of a pretrained state-of-the-art network that performs
these tasks.
Number of Neurons per Hidden Layer
• The number of neurons in the input and output layers is
determined by the type of input and output your task requires.
» e.g., the MNIST task requires 28 × 28 = 784 input neurons and 10 output
neurons.
• As for the hidden layers, it used to be common to size them to form
a pyramid, with fewer and fewer neurons at each layer.
• You can try increasing the number of neurons gradually until the
network starts overfitting.
• The “stretch pants” approach: pick a model with more layers and
neurons than you actually need, then use early stopping and other
regularization techniques to prevent it from overfitting.
» Avoid bottleneck layers that could ruin your model.
• In general you will get more bang for your buck by increasing the
number of layers instead of the number of neurons per layer.
Tuning the Learning Rate
• Learning rate is arguably the most important hyperparameter.
• One way to find a good learning rate is to train the model for a few
hundred iterations, starting with a very low learning rate (e.g., 10⁻⁵)
and gradually increasing it up to a very large value (e.g., 10).
» This is done by multiplying the learning rate by a constant factor at each
iteration (e.g., by exp(log(10⁶)/500) to go from 10⁻⁵ to 10 in 500 iterations);
see the sketch at the end of this slide.
• If you plot the loss as a function of the learning rate (using a log
scale for the learning rate), you should see it dropping at first.
» But after a while, the learning rate will be too large, so the loss will shoot
back up
• The optimal learning rate will be a bit lower than the point at which
the loss starts to climb (typically about 10 times lower than the
turning point).
• You can then reinitialize your model and train it normally using this
good learning rate.
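A sketch (under the assumption of tf.keras 2.x) of a custom callback that grows the learning rate exponentially during a short run, as described earlier on this slide:

import numpy as np
import tensorflow.keras.backend as K
from tensorflow import keras

class ExponentialLearningRate(keras.callbacks.Callback):
    def __init__(self, factor):
        super().__init__()
        self.factor = factor
        self.rates, self.losses = [], []

    def on_batch_end(self, batch, logs=None):
        lr = K.get_value(self.model.optimizer.learning_rate)
        self.rates.append(lr)
        self.losses.append(logs["loss"])
        # Multiply the learning rate by a constant factor after every batch.
        K.set_value(self.model.optimizer.learning_rate, lr * self.factor)

# Factor chosen so the rate goes from 1e-5 to 10 in 500 iterations.
expon_lr = ExponentialLearningRate(factor=np.exp(np.log(1e6) / 500))
# Attach it to a short fit() run, then plot expon_lr.losses against expon_lr.rates.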
Tuning Optimizer, Batch Size, Activation
Functions, and Number of Iterations
• Choosing a better optimizer than plain old Mini-batch
Gradient Descent is quite important.
• The main benefit of using large batch sizes is that hardware
accelerators like GPUs can process them efficiently, so the
training algorithm will see more instances per second.
» But some researchers reported that large batch sizes often lead
to training instabilities, so the resulting model may not generalize
as well as a model trained with a small batch size.
• There are activation functions better than ReLU
• In most cases, the number of training iterations does not
actually need to be tweaked: just use early stopping
instead.
