UNIT - 4 ML
3. Iterative Refinement: The network refines its output by adjusting
weights and biases, gradually improving its performance on diverse
tasks.
In an adaptive learning environment:
The neural network is exposed to a simulated scenario or dataset.
Parameters such as weights and biases are updated in response to
new data or conditions.
With each adjustment, the network’s response evolves, allowing it
to adapt effectively to different tasks or environments.
Working of Neural Networks
Forward Propagation
When data is input into the network, it passes through the network in
the forward direction, from the input layer through the hidden layers
to the output layer. This process is known as forward propagation.
Here’s what happens during this phase:
1. Linear Transformation: Each neuron in a layer receives inputs,
which are multiplied by the weights associated with the
connections. These products are summed together, and a bias is
added to the sum. This can be represented mathematically as:
z = w1x1 + w2x2 + … + wnxn + b
where w represents the weights, x represents the inputs, and b is the bias.
2. Activation: The result of the linear transformation (denoted as z)
is then passed through an activation function. The activation
function is crucial because it introduces non-linearity into the
system, enabling the network to learn more complex patterns.
Popular activation functions include ReLU, sigmoid, and tanh.
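As a minimal sketch of these two steps for a single neuron (the weights, inputs, and bias below are made-up values, not from the text):

import numpy as np

def relu(z):
    return np.maximum(0, z)

x = np.array([0.5, 0.1, 0.4])    # inputs x1..xn
w = np.array([0.2, -0.3, 0.7])   # weights w1..wn
b = 0.1                          # bias

z = np.dot(w, x) + b             # linear transformation: w1*x1 + ... + wn*xn + b
a = relu(z)                      # activation introduces non-linearity
print(z, a)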
Backpropagation
After forward propagation, the network evaluates its performance
using a loss function, which measures the difference between the
actual output and the predicted output. The goal of training is to
minimize this loss. This is where backpropagation comes into play:
1. Loss Calculation: The network calculates the loss, which provides
a measure of error in the predictions. The loss function could vary;
common choices are mean squared error for regression tasks or
cross-entropy loss for classification.
2. Gradient Calculation: The network computes the gradients of the
loss function with respect to each weight and bias in the network.
This involves applying the chain rule of calculus to find out how
much each part of the output error can be attributed to each weight
and bias.
3. Weight Update: Once the gradients are calculated, the weights
and biases are updated using an optimization algorithm like
stochastic gradient descent (SGD). The weights are adjusted in the
opposite direction of the gradient to minimize the loss. The size of
the step taken in each update is determined by the learning rate.
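For instance, a single SGD-style update can be sketched as follows (the gradient values are placeholders; in practice they come from backpropagation):

import numpy as np

learning_rate = 0.01
w = np.array([0.2, -0.3, 0.7])           # current weights
b = 0.1                                  # current bias
grad_w = np.array([0.05, -0.02, 0.10])   # dLoss/dw from backpropagation (illustrative)
grad_b = 0.04                            # dLoss/db (illustrative)

w = w - learning_rate * grad_w           # step in the opposite direction of the gradient
b = b - learning_rate * grad_b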
Iteration
This process of forward propagation, loss calculation,
backpropagation, and weight update is repeated for many iterations
over the dataset. Over time, this iterative process reduces the loss, and
the network’s predictions become more accurate.
Through these steps, neural networks can adapt their parameters to
better approximate the relationships in the data, thereby improving
their performance on tasks such as classification, regression, or any
other predictive modeling.
Example of Email Classification
Let’s consider a record of an email dataset:
Email ID | Email Content | Sender | Subject Line | Label
1 | "Get free gift cards now!" | spam@example.com | "Exclusive Offer" | 1
To classify this email, we will create a feature vector based on the
analysis of keywords such as “free,” “win,” and “offer.”
The feature vector of the record can be presented as:
“free”: Present (1)
“win”: Absent (0)
“offer”: Present (1)
Email ID | Email Content | Sender | Subject Line | Feature Vector | Label
1 | "Get free gift cards now!" | spam@example.com | "Exclusive Offer" | [1, 0, 1] | 1
Now, let’s delve into the working:
1. Input Layer: The input layer contains 3 nodes, one indicating the
presence of each keyword.
2. Hidden Layer
The input data is passed through one or more hidden layers.
Each neuron in the hidden layer performs the following operations:
1. Weighted Sum: Each input is multiplied by a corresponding
weight assigned to the connection. For example, if the weights
from the input layer to the hidden layer neurons are as follows:
o Weights for Neuron H1: [0.5, -0.2, 0.3]
o Weights for Neuron H2: [0.4, 0.1, -0.5]
2. Calculate Weighted Input:
o For Neuron H1: (1 × 0.5) + (0 × −0.2) + (1 × 0.3) = 0.5 + 0 + 0.3 = 0.8
o For Neuron H2: (1 × 0.4) + (0 × 0.1) + (1 × −0.5) = 0.4 + 0 − 0.5 = −0.1
3. Activation Function: The result is passed through an activation
function (e.g., ReLU or sigmoid) to introduce non-linearity.
o For H1, applying ReLU: ReLU(0.8) = 0.8
o For H2, applying ReLU: ReLU(−0.1) = 0
3. Output Layer
The activated outputs from the hidden layer are passed to the
output neuron.
The output neuron receives the values from the hidden layer
neurons and computes the final prediction using weights:
o Suppose the output weights from hidden layer to output
neuron are [0.7, 0.2].
o Calculation: Input = (0.8 × 0.7) + (0 × 0.2) = 0.56 + 0 = 0.56
o Final Activation: The output is passed through a sigmoid
activation function to obtain a probability: σ(0.56) ≈ 0.636
4. Final Classification
The output value of approximately 0.636 indicates the probability
of the email being spam.
Since this value is greater than 0.5, the neural network classifies
the email as spam (1).
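The worked example above can be verified with a few lines of NumPy (weights and inputs taken from the text):

import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([1, 0, 1])                   # feature vector: "free", "win", "offer"
W_hidden = np.array([[0.5, -0.2, 0.3],    # weights for neuron H1
                     [0.4,  0.1, -0.5]])  # weights for neuron H2
w_out = np.array([0.7, 0.2])              # hidden-to-output weights

h = relu(W_hidden @ x)                    # [0.8, 0.0]
p_spam = sigmoid(w_out @ h)               # sigmoid(0.56) ≈ 0.636
label = int(p_spam > 0.5)                 # 1 -> spam
print(p_spam, label)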
Long Short-Term Memory (LSTM): LSTM is a type of RNN
that is designed to overcome the vanishing gradient problem in
training RNNs. It uses memory cells and gates to selectively read,
write, and erase information.
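A minimal Keras sketch of an LSTM layer is shown below (the sequence length of 10 and 8 features per time step are illustrative, not from the text):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 8)))    # memory cells with input/forget/output gates
model.add(Dense(1, activation='sigmoid'))   # e.g., a binary prediction from the sequence
model.compile(loss='binary_crossentropy', optimizer='adam')
model.summary()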
Implementation of Neural Network using TensorFlow
Here, we implement a simple feedforward neural network that trains on
a sample dataset and makes predictions using the following steps:
Step 1: Import Necessary Libraries
Import necessary libraries, primarily TensorFlow and Keras, along
with other required packages such as NumPy and Pandas for data
handling.
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
Step 2: Create and Load Dataset
Create or load a dataset. Convert the data into a format suitable for
training (usually NumPy arrays).
Define features (X) and labels (y).
data = {
'feature1': [0.1, 0.2, 0.3, 0.4, 0.5],
'feature2': [0.5, 0.4, 0.3, 0.2, 0.1],
'label': [0, 0, 1, 1, 1]
}
df = pd.DataFrame(data)
X = df[['feature1', 'feature2']].values
y = df['label'].values
Step 3: Create a Neural Network
Instantiate a Sequential model and add layers. The input layer and
hidden layers are typically created using Dense layers, specifying the
number of neurons and activation functions.
model = Sequential()
model.add(Dense(8, input_dim=2, activation='relu')) # Hidden layer
model.add(Dense(1, activation='sigmoid')) # Output layer
Step 4: Compile the Model
Compile the model by specifying the loss function, optimizer, and
metrics to evaluate during training.
model.compile(loss='binary_crossentropy', optimizer='adam',
metrics=['accuracy'])
Step 5: Train the Model
Fit the model on the training data, specifying the number of epochs
and batch size. This step trains the neural network to learn from the
input data.
model.fit(X, y, epochs=100, batch_size=1, verbose=1)
Step 6: Make Predictions
Use the trained model to make predictions on new data. Process the
output to interpret the predictions (e.g., convert probabilities to binary
outcomes).
test_data = np.array([[0.2, 0.4]])
prediction = model.predict(test_data)
predicted_label = (prediction > 0.5).astype(int)
Complete Code
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Create the sample dataset and convert to DataFrame
data = {'feature1': [0.1, 0.2, 0.3, 0.4, 0.5],
        'feature2': [0.5, 0.4, 0.3, 0.2, 0.1],
        'label': [0, 0, 1, 1, 1]}
df = pd.DataFrame(data)
X = df[['feature1', 'feature2']].values
y = df['label'].values

# Build the model: 2 input features, 8 neurons in the hidden layer, 1 output neuron
model = Sequential()
model.add(Dense(8, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile and train
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=100, batch_size=1, verbose=0)

# Make a prediction and convert the probability to a binary output
test_data = np.array([[0.2, 0.4]])
prediction = model.predict(test_data)
predicted_label = (prediction > 0.5).astype(int)
print(f"Predicted label: {predicted_label[0][0]}")
Output:
Predicted label: 1
Advantages of Neural Networks
Neural networks are widely used in many different applications
because of their many benefits:
Adaptability: Neural networks are useful for activities where the
link between inputs and outputs is complex or not well defined
because they can adapt to new situations and learn from data.
Pattern Recognition: Their proficiency in pattern recognition
makes them effective in tasks such as audio and image
recognition, natural language processing, and other problems
involving intricate data patterns.
Parallel Processing: Because neural networks are capable of
parallel processing by nature, they can process numerous jobs at
once, which speeds up and improves the efficiency of
computations.
Non-Linearity: Neural networks are able to model and
comprehend complicated relationships in data by virtue of the non-
linear activation functions found in neurons, which overcome the
drawbacks of linear models.
Disadvantages of Neural Networks
Neural networks, while powerful, are not without drawbacks and
difficulties:
Computational Intensity: Training large neural networks can be a
slow, computationally demanding process that requires a lot of
computing power.
Black box Nature: As “black box” models, neural networks pose a
problem in important applications since it is difficult to understand
how they make decisions.
Overfitting: Overfitting is a phenomenon in which neural
networks memorize the training data rather than learning general
patterns in it. Although regularization approaches help to alleviate
this, the problem still exists.
Need for Large datasets: For efficient training, neural networks
frequently need sizable, labeled datasets; otherwise, their
performance may suffer from incomplete or skewed data.
Applications of Neural Networks
Neural networks have numerous applications across various fields:
1. Image and Video Recognition: CNNs are extensively used in
applications such as facial recognition, autonomous driving, and
medical image analysis.
2. Natural Language Processing (NLP): RNNs and transformers
power language translation, chatbots, and sentiment analysis.
3. Finance: Predicting stock prices, fraud detection, and risk
management.
4. Healthcare: Neural networks assist in diagnosing diseases,
analyzing medical images, and personalizing treatment plans.
5. Gaming and Autonomous Systems: Neural networks enable real-
time decision-making, enhancing user experience in video games
and enabling autonomous systems like self-driving cars.
2. Model Representation I
Neuron in the brain
o Many neurons in our brain
o Dendrite: receive input
o Axon: produce output
When it sends a message through the axon to another neuron,
it sends it to the other neuron's dendrite
Neuron model: logistic unit
o Yellow circle: body of neuron
o Input wires: dendrite
o Output wire: axon
Neural Network
o 3 Layers
Layer 1: input layer
Layer 2: hidden layer
Unable to observe its values
Anything other than the input or output layer
Layer 3: output layer
Significance of the subscripts:
j (the first of the two subscript numbers) ranges from 1 to the number of units in layer l+1
i (the second of the two subscript numbers) ranges from 0 to the number of units in layer l
l is the layer you're moving FROM
Notation
Features
o A neural network learns its own features
The features a’s are learned from x’s
It learns its own features to feed into logistic
regression
Better hypothesis than if we were constrained with
just x1, x2, x3
We can have whatever features we want to feed to
the final logistic regression function
Implementation in Octave for a2
a2 = sigmoid(Theta1 * x);
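An equivalent NumPy sketch (the shapes and random Theta1 values are illustrative; x includes the bias unit x0 = 1):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([1.0, 0.5, 0.2, 0.9])   # input vector with bias unit x0 = 1 prepended
Theta1 = np.random.randn(3, 4)       # weights from the input layer to 3 hidden units

a2 = sigmoid(Theta1 @ x)             # hidden-layer activations, as in the Octave line above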
1. Overfitting
When a neural network model learns the training data too well,
including the noise and outliers, it may perform poorly on new,
unseen data. This can be mitigated with techniques like dropout,
regularization, and cross-validation.
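For example, dropout and L2 regularization can be added to the earlier Keras model like this (layer sizes, rates, and the validation split are illustrative):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.regularizers import l2

model = Sequential()
model.add(Dense(32, input_dim=2, activation='relu',
                kernel_regularizer=l2(0.01)))   # L2 weight regularization
model.add(Dropout(0.5))                          # randomly drop units during training
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# model.fit(X, y, validation_split=0.2, epochs=50)  # hold-out validation to watch for overfitting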
2. Underfitting
When a model is too simple to capture the underlying patterns in the
training data, it performs poorly on both the training data and new data.
Increasing model capacity, adding features, or training for longer can help.
3. Vanishing/Exploding Gradients
In deep networks, gradients can become extremely small (vanishing) or
extremely large (exploding) as they are propagated backwards, which slows
down or destabilizes training. Careful weight initialization, ReLU-family
activations, gradient clipping, and architectures such as LSTMs help mitigate this.
5. Computational Requirements
Training deep neural networks can be computationally expensive and
time-consuming, requiring powerful hardware like GPUs. Optimizing
algorithms, model architecture, and using distributed computing can
mitigate this issue.
6. Hyperparameter Tuning
Finding the right hyperparameters (like learning rate, batch size, etc.)
is crucial for performance but can be a tedious and complex task. Grid
search, random search, and more advanced methods like Bayesian
optimization can help in tuning hyperparameters.
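A simple manual grid search over the learning rate and batch size might look like this (it assumes the X and y arrays from the earlier example; the candidate value ranges are illustrative):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

best = None
for lr in [0.1, 0.01, 0.001]:              # candidate learning rates
    for batch_size in [1, 4]:              # candidate batch sizes
        model = Sequential([
            Dense(8, input_dim=2, activation='relu'),
            Dense(1, activation='sigmoid')
        ])
        model.compile(loss='binary_crossentropy',
                      optimizer=Adam(learning_rate=lr),
                      metrics=['accuracy'])
        history = model.fit(X, y, epochs=50, batch_size=batch_size, verbose=0)
        acc = history.history['accuracy'][-1]
        if best is None or acc > best[0]:
            best = (acc, lr, batch_size)
print("Best (accuracy, learning rate, batch size):", best)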
7. Interpretability
Neural networks often behave as "black boxes," which makes it hard to
explain why a particular prediction was made. Using simpler models and
using tools like SHAP (SHapley Additive exPlanations) can help in
understanding model predictions.
8. Class Imbalance
When the dataset has classes that are not equally represented, the
model might become biased towards the majority class. Techniques
like resampling, class weighting, and anomaly detection can help in
handling imbalanced datasets.
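In Keras, one simple mitigation is to pass class weights to fit() so that errors on the minority class are penalized more heavily (the weights below are illustrative and assume the compiled binary model and X, y arrays from the earlier example):

class_weight = {0: 1.0, 1: 5.0}   # mistakes on the rare class 1 count five times more
model.fit(X, y, epochs=100, batch_size=1, class_weight=class_weight, verbose=0)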
This entire procedure is known as Gradient Descent, which is also
known as steepest descent. The main objective of using a gradient
descent algorithm is to minimize the cost function through iteration. To
achieve this goal, it performs two steps iteratively:
o Compute the gradient (slope), i.e. the first-order derivative of the cost function at the current point.
o Move a step in the direction opposite to the gradient, since that is the direction of steepest descent; the step size is controlled by the learning rate.
What is Cost-function?
A cost function measures the difference, or error, between the actual and
expected values at the current position. The slight difference between a loss
function and a cost function concerns the error within the training
of machine learning models: a loss function refers to the error of one
training example, while a cost function calculates the average error
across an entire training set.
Hypothesis:
Y = mX + c
Parameters: m and c
Where 'm' represents the slope of the line, and 'c' represents the
intercept on the y-axis.
The starting point (shown in the above figure) is used to evaluate the
performance, as it is considered just an arbitrary point. At this starting
point, we derive the first derivative or slope and then use a tangent
line to calculate the steepness of this slope. Further, this slope will
inform the updates to the parameters (weights and bias).
The slope is steeper at the starting or arbitrary point, but whenever new
parameters are generated the steepness gradually reduces, until the curve
approaches the lowest point, which is called the point of convergence.
The main objective of gradient descent is to minimize the cost function
or the error between expected and actual. To minimize the cost
function, two data points are required:
Learning Rate:
It is defined as the step size taken to reach the minimum or lowest point.
This is typically a small value that is evaluated and updated based on
the behavior of the cost function. If the learning rate is high, the
algorithm takes larger steps but risks overshooting the minimum. A low
learning rate, on the other hand, means smaller step sizes, which
compromises overall efficiency but gives the advantage of more
precision.
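A minimal sketch of gradient descent for the hypothesis Y = mX + c with a mean-squared-error cost (the data and learning rate below are made up):

import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([3.0, 5.0, 7.0, 9.0])      # true relationship: Y = 2X + 1

m, c = 0.0, 0.0
learning_rate = 0.05

for _ in range(1000):
    Y_pred = m * X + c
    error = Y_pred - Y
    grad_m = (2 / len(X)) * np.sum(error * X)   # d(cost)/dm
    grad_c = (2 / len(X)) * np.sum(error)       # d(cost)/dc
    m -= learning_rate * grad_m                 # step against the gradient
    c -= learning_rate * grad_c

print(m, c)   # approaches m ≈ 2, c ≈ 1 at the point of convergence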
Types of Gradient Descent
Batch gradient descent (BGD) computes the error for each point in
the training set and updates the model only after evaluating all training
examples. One full pass over the training set is known as a training epoch.
In simple words, it is a greedy approach where we have to sum over all
examples for each update.
For convex problems, gradient descent can find the global minimum
easily, while for non-convex problems it is sometimes difficult to find
the global minimum, which is where the machine learning model achieves
the best results.
Whenever the slope of the cost function is at zero or just close to zero,
the model stops learning further. Apart from the global minimum, there
are other scenarios that can produce this slope, namely saddle points and
local minima. A local minimum has a shape similar to the global
minimum, where the slope of the cost function increases on both sides
of the current point.
In contrast, at saddle points the negative gradient only occurs on one
side of the point, which reaches a local maximum on one side and a
local minimum on the other side. The name saddle point is taken
from that of a horse's saddle.
The name local minimum is used because the value of the loss function is
minimum at that point in a local region. In contrast, the name
global minimum is used because the value of the loss function is
minimum there globally, across the entire domain of the loss function.
In a deep neural network, if the model is trained with gradient descent
and backpropagation, two further issues can occur besides local
minima and saddle points.
Vanishing Gradients: the gradients become smaller and smaller as they are
propagated backwards through the layers, so the weights of the earlier layers
barely change and learning slows down or stalls.
Exploding Gradients: the gradients grow very large as they are propagated
backwards, causing unstable weight updates and, in extreme cases, numeric overflow.
MLP networks are used in a supervised learning setting. The typical learning algorithm for
MLP networks is the backpropagation algorithm.
A multilayer perceptron (MLP) is a feedforward artificial neural network that generates a
set of outputs from a set of inputs. An MLP is characterized by several layers of
nodes connected as a directed graph between the input and output layers. MLP uses
backpropagation for training the network. MLP is a deep learning method.
o Build graphs and run sessions [do all the set-up and then execute a session to
evaluate tensors and run operations].
o Create our code and run it on the fly.
For this first part, we will use the interactive session that is more suitable for an
environment like a Jupyter notebook.
sess = tf.InteractiveSession()
Creating placeholders
It's a best practice to create placeholders before variable assignments when using
TensorFlow. Here we'll create placeholders for the inputs ("Xs") and outputs ("Ys").
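In TensorFlow 1.x style this looks roughly as follows (the feature and label shapes are illustrative; under TensorFlow 2.x these calls live under tf.compat.v1 with eager execution disabled):

import tensorflow as tf

sess = tf.InteractiveSession()

X = tf.placeholder(tf.float32, shape=[None, 2])   # inputs ("Xs"); 2 features per example
Y = tf.placeholder(tf.float32, shape=[None, 1])   # outputs ("Ys"); one label per example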
3. Computing Outputs
To find the outputs of y3, y4 and y5:
At node h1,
a1 = (w1,1 × x1) + (w2,1 × x2) = (0.2 × 0.35) + (0.2 × 0.7) = 0.21
Once we have calculated the a1 value, we can proceed to find the
y3 value:
yj = F(aj) = 1 / (1 + e^(−aj))
y3 = F(0.21) = 1 / (1 + e^(−0.21))
y3 = 0.56
Similarly, find the values of y4 at h2 and y5 at O3:
a2 = (w1,2 × x1) + (w2,2 × x2) = (0.3 × 0.35) + (0.3 × 0.7) = 0.315
y4 = F(0.315) = 1 / (1 + e^(−0.315))
a3 = (w1,3 × y3) + (w2,3 × y4) = (0.3 × 0.57) + (0.9 × 0.59) = 0.702
y5 = F(0.702) = 1 / (1 + e^(−0.702)) = 0.67
4. Error Calculation
Note that, our actual output is 0.5 but we obtained 0.67.
To calculate the error, we can use the below formula:
Errorj = ytarget − y5
Error = 0.5 − 0.67 = −0.17
Using this error value, we will be backpropagating.
Backpropagation
1. Calculating Gradients
The change in each weight is calculated as:
Δwij = η × δj × Oi
Where:
δj is the error term of the unit j at the receiving end of the weight,
Oi is the output of the unit i at the sending end, and
η is the learning rate.
2. Output Unit Error
For O3:
δ5 = y5(1 − y5)(ytarget − y5)
   = 0.67(1 − 0.67)(−0.17) = −0.0376
3. Hidden Unit Error
For h1:
δ3 = y3(1 − y3)(w1,3 × δ5)
   = 0.56(1 − 0.56)(0.3 × −0.0376) = −0.0027
For h2:
δ4 = y4(1 − y4)(w2,3 × δ5)
   = 0.59(1 − 0.59)(0.9 × −0.0376) = −0.0819
4. Weight Updates
For the weights from hidden to output layer:
Δw2,3 = 1 × (−0.0376) × 0.59 = −0.022184
New weight:
w2,3(new) = −0.22184 + 0.9 = 0.67816
For weights from input to hidden layer:
Δw1,1 = 1 × (−0.0027) × 0.35 = 0.000945
New weight:
w1,1(new) = 0.000945 + 0.2 = 0.200945
Similarly, other weights are updated:
w1,2(new) = 0.271335
w1,3(new) = 0.08567
w2,1(new) = 0.29811
w2,2(new) = 0.24267
The updated weights are illustrated below,
Recomputing the forward pass with the updated weights gives y5 = 0.61.
Since y5 = 0.61 is still not the target output, the process of calculating
the error and backpropagating continues until the desired output is
reached.
This process demonstrates how backpropagation iteratively updates
weights by minimizing errors until the network accurately predicts the
output.
Error = ytarget − y5
      = 0.5 − 0.61 = −0.11
This process continues until the desired output is produced by
the neural network.
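The forward pass and error terms from this example can be reproduced in a few lines of Python (weights, inputs, and target taken from the text; small differences from the quoted figures are due to rounding):

import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

x1, x2, y_target = 0.35, 0.7, 0.5
w11, w21 = 0.2, 0.2        # input -> h1
w12, w22 = 0.3, 0.3        # input -> h2
w13, w23 = 0.3, 0.9        # hidden -> output

y3 = sigmoid(w11 * x1 + w21 * x2)          # ≈ 0.55
y4 = sigmoid(w12 * x1 + w22 * x2)          # ≈ 0.58
y5 = sigmoid(w13 * y3 + w23 * y4)          # ≈ 0.67

delta5 = y5 * (1 - y5) * (y_target - y5)   # output-unit error term
delta3 = y3 * (1 - y3) * (w13 * delta5)    # hidden-unit error terms
delta4 = y4 * (1 - y4) * (w23 * delta5)
print(round(y5, 3), round(delta5, 4), round(delta3, 4), round(delta4, 4))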
8. Face Recognition
Face recognition using Artificial Intelligence(AI) is a computer
vision technology that is used to identify a person or object from an image or
video. It uses a combination of techniques including deep learning, computer
vision algorithms, and Image processing. These technologies are used to
enable a system to detect, recognize, and verify faces in digital images or
videos.
The technology has become increasingly popular in a wide variety of
applications such as unlocking a smartphone, unlocking doors, passport
authentication, security systems, medical applications, and so on. There are
even models that can detect emotions from facial expressions.
Difference between Face recognition & Face detection
Face recognition is the process of identifying a person from an image or video
feed and face detection is the process of detecting a face in an image or video
feed. In the case of Face recognition, someone’s face is recognized and
differentiated based on their facial features. It involves more advanced
processing techniques to identify a person’s identity based on feature point
extraction, and comparison algorithms. and can be used for applications such
as automated attendance systems or security checks. Face detection, in contrast,
is a much simpler process and can be used for applications such as image tagging
or altering the angle of a photo based on the face detected. It is the initial step
in the face recognition process and simply identifies that a face is present in an
image or video feed.
Every Machine Learning algorithm takes a dataset as input and learns from that
data; in other words, the algorithm learns from the provided input and
output data. It identifies the patterns in the data and produces the desired
output. For instance, to identify whose face is present in a given image,
multiple things can be looked at as a pattern:
Height/width of the face.
Height and width may not be reliable since the image could be rescaled to a
smaller face or grid. However, even after rescaling, what remains
unchanged are the ratios – the ratio of the height of the face to the width of
the face won’t change.
Color of the face.
Width of other parts of the face like lips, nose, etc.
There is a pattern involved – different faces have different dimensions like the
ones above, and similar faces have similar dimensions. Machine Learning
algorithms only understand numbers, which makes this quite challenging. This numerical
representation of a "face" (or an element in the training set) is termed a
feature vector. A feature vector comprises various numbers in a specific
order.
As a simple example, we can map a “face” into a feature vector which can
comprise various features like:
Height of face (cm)
Width of the face (cm)
Average color of face (R, G, B)
Width of lips (cm)
Height of nose (cm)
Essentially, given an image, we can convert it into a feature vector like:

Height of face (cm) | Width of face (cm) | Average color of face (RGB) | Width of lips (cm) | Height of nose (cm)
23.1 | 15.8 | (255, 224, 189) | 5.2 | 4.4

So, the image is now a vector that could be represented as (23.1, 15.8, 255,
224, 189, 5.2, 4.4). There could be countless other features that could be
derived from the image, for instance, hair color, facial hair, spectacles, etc.
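In code, such a feature vector is simply an array of numbers, and similar faces give nearby vectors (the first vector uses the values from the table above; the second face is made up):

import numpy as np

face = np.array([23.1, 15.8, 255, 224, 189, 5.2, 4.4])        # feature vector from the table
other_face = np.array([23.0, 15.9, 250, 220, 185, 5.1, 4.3])  # a hypothetical similar face
distance = np.linalg.norm(face - other_face)                  # small distance -> similar faces
print(distance)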
Machine Learning does two major functions in face recognition technology.
These are given below:
1. Deriving the feature vector: it is difficult to manually list down all of the
features because there are just so many. A Machine Learning algorithm can
intelligently derive many such features automatically. For instance, a complex
feature could be the ratio of the height of the nose to the width of the
forehead.
2. Matching algorithms: Once the feature vectors have been obtained, a
Machine Learning algorithm needs to match a new image with the set of
feature vectors present in the corpus.
Face Recognition Operations
The technology system may vary when it comes to facial recognition.
Different software applies different methods and means to achieve face
recognition. The stepwise method is as follows:
Face Detection: To begin with, the camera detects and recognizes a
face. The face is best detected when the person is looking directly at
the camera, as this makes facial recognition easier. With advancements
in technology, this has improved so that a face can be detected even
with slight variations in pose away from the camera.
Face Analysis: Then the photo of the face is captured and analyzed. Most
facial recognition relies on 2D images rather than 3D because it is more
convenient to match to the database. Facial recognition software will
analyze the distance between your eyes or the shape of your cheekbones.
Image to Data Conversion: Now it is converted to a mathematical formula
and these facial features become numbers. This numerical code is known as
a face print. The way every person has a unique fingerprint, in the same
way, they have unique face prints.
Match Finding: Then the code is compared against a database of other face
prints. This database has photos with identification that can be compared.
The technology then identifies a match for your exact features in the
provided database. It returns with the match and attached information such
as name and address or it depends on the information saved in the database
of an individual.
Implementations
Steps:
Import the necessary packages
Load the known face images and make the face embedding of known image
Launch the live camera
Record the images from the live camera frame by frame
Perform face detection using the face_recognition face_locations
function
Make the rectangle around the faces
Make the face encoding for the faces captured by the camera
if the faces are matched then plot the person image else continue
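A sketch of these steps using the face_recognition and OpenCV packages is given below (the file name "known_person.jpg", camera index 0, and the window handling are illustrative):

import cv2
import face_recognition

# Load the known face image and compute its embedding
known_image = face_recognition.load_image_file("known_person.jpg")
known_encoding = face_recognition.face_encodings(known_image)[0]

video = cv2.VideoCapture(0)                      # launch the live camera
while True:
    ret, frame = video.read()                    # record images frame by frame
    if not ret:
        break
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # face_recognition expects RGB
    locations = face_recognition.face_locations(rgb_frame)
    encodings = face_recognition.face_encodings(rgb_frame, locations)
    for (top, right, bottom, left), encoding in zip(locations, encodings):
        match = face_recognition.compare_faces([known_encoding], encoding)[0]
        color = (0, 255, 0) if match else (0, 0, 255)
        cv2.rectangle(frame, (left, top), (right, bottom), color, 2)   # rectangle around the face
    cv2.imshow("Face recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

video.release()
cv2.destroyAllWindows()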
The model accuracy can be further improved using deep learning and
other methods.
Face Recognition Softwares
Many renowned companies are constantly innovating and improving to
develop face recognition software that is foolproof and dependable. Some
prominent software packages are discussed below:
a. Deep Vision AI
Deep Vision AI is a front-runner company excelling in facial recognition
software. The company owns the proprietorship of advanced computer vision
technology that can understand images and videos automatically. It then turns
the visual content into real-time analytics and provides very valuable
insights.
Deep Vision AI provides a plug-and-play platform to its users worldwide. The
users are given real-time alerts and faster responses based upon the analysis of
camera streams through various AI-based modules. The product offers a
highly accurate rate of identification of individuals on a watch list by
continuous monitoring of target zones. The software is so flexible that it
can be connected to any existing camera system or can be deployed through
the cloud.
At present, Deep Vision AI offers the best performance solution in the market
supporting real-time processing at +15 streams per GPU.
The software aids business intelligence gathering by providing real-time data on
customers and their frequency of visits, and by enhancing security and safety.
Further, the output from the software can provide attributes like count, age,
gender, etc that can enhance the understanding of consumer behavior,
changing preferences, shifts with time, and conditions that can guide future
marketing efforts and strategies. The users also combine the face recognition
capabilities with other AI-based features of Deep Vision AI like vehicle
recognition to get more correlated data of the consumers.
The company complies with international data protection laws and applies
significant measures for a transparent and secure process of the data generated
by its customers. Data privacy and ethics are taken care of.
The potential markets include cities, public venues, public transportation,
educational institutes, large retailers, etc. Deep Vision AI is a certified partner
for NVIDIA’s Metropolis, Dell Digital Cities, Amazon AWS, Microsoft, Red
Hat, and others.
b. SenseTime
SenseTime is a leading platform developer that has dedicated efforts to
create solutions using the innovations in AI and big data analysis. The
technology offered by SenseTime is multifunctional. The aspects of this
technology are expanding and include the capabilities of facial recognition,
image recognition, intelligent video analytics, autonomous driving, and
medical image recognition. SenseTime software includes different subparts
namely, SensePortrait-S, SensePortrait-D, and SenseFace.
SensePortrait-S is a Static Face Recognition Server. It includes the
functionality of face detection from an image source, extraction of features,
extraction, and analysis of attributes, and target retrieval from a vast facial
image database
SensePortrait-D is a Dynamic Face Recognition Server. The capabilities
included are face detection, face tracking, extraction of features, and
comparison and analysis of data from multiple surveillance video
streams.
SenseFace is a Face Recognition Surveillance Platform. This utility is a
Face Recognition technology that uses a deep learning algorithm.
SenseFace is very efficient in integrated solutions to intelligent video
analysis. It can be extensively used for target surveillance, analysis of the
trajectory of a person, management of population and the associated data
analysis, etc
SenseTime has provided its services to many companies and government
agencies including Honda, Qualcomm, China Mobile, UnionPay, Huawei,
Xiaomi, OPPO, Vivo, and Weibo.
c. Amazon Rekognition
Amazon provides a cloud-based software solution, Amazon Rekognition, a
computer vision platform offered as a service. This solution allows an easy method to add
image and video analysis to various applications. It uses a highly scalable and
proven deep learning technology. The user is not required to have any machine
learning expertise to use this software. The platform can be utilized to identify
objects, text, people, activities, and scenes in images and videos. It can also
detect any inappropriate content. The user gets a highly accurate facial
analysis and facial search capabilities. Hence, the software can be easily used
for verification, counting of people, and public safety by detection, analysis,
and comparison of faces.
Organizations can use Amazon Rekognition Custom Labels to generate data
about specific objects and scenes available in images according to their
business needs. For example, a model may be easily built to classify specific
machine parts on the assembly line or to detect unhealthy plants. The user
simply provides the images of objects or scenes he wants to identify, and the
service handles the rest.
d. FaceFirst
The FaceFirst software ensures the safety of communities, secure transactions,
and great customer experiences. FaceFirst is secure, accurate, private, fast, and
scalable software. Plug-and-play solutions are also included for physical
security, authentication of identity, access control, and visitor analytics. It can
be easily integrated into any system. This computer vision platform has been
used for face recognition and automated video analytics by many
organizations to prevent crime and improve customer engagement.
As a leading provider of effective facial recognition systems, it benefits
retail, transportation, event security, casinos, and other industries and public
spaces. FaceFirst ensures the integration of artificial intelligence with existing
surveillance systems to prevent theft, fraud, and violence.
e. Trueface
TrueFace is a leading computer vision model that helps people understand
their camera data and convert the data into actionable information. TrueFace is
an on-premise computer vision solution that enhances data security and
performance speeds. The platform-based solutions are specifically trained as
per the requirements of individual deployment and operate effectively in a
variety of ecosystems. The software places the utmost priority on the diversity
of training data. It ensures equivalent performance for all users irrespective of
their widely different requirements.
Trueface has developed a suite consisting of SDKs and a dockerized container
solution based on the capabilities of machine learning and artificial
intelligence. The suite can convert the camera data into actionable intelligence.
It can help organizations to create a safer and smarter environment for their
employees, customers, and guests using facial recognition, weapon detection,
and age verification technologies.
f. Face++
Face++ is an open platform enabled by the Chinese company Megvii. It
offers computer vision technologies. It allows users to easily integrate deep
learning-based image analysis recognition technologies into their
applications.
Face++ uses AI and machine vision in amazing ways to detect and analyze
faces, and accurately confirm a person’s identity. Face++ is also developer-
friendly: being an open platform, any developer can create apps
using its algorithms. This feature has resulted in making Face++ the most
extensive facial recognition platform in the world, with 300,000 developers
from 150 countries using it.
The most significant usage of Face++ has been its integration into
Alibaba’s City Brain platform. This has allowed the analysis of the CCTV
network in cities to optimize traffic flows and direct the attention of medics
and police by observing incidents.
g. Kairos
Kairos is a state-of-the-art and ethical face recognition solution available to
developers and businesses across the globe. Kairos can be used for Face
Recognition via Kairos cloud API, or the user can host Kairos on their
servers. The utility can be used for control of data, security, and privacy.
Organizations can ensure a safer and better accessibility experience for
their customers.
Kairos Face Recognition On-Premises has the added advantage of
controlling data privacy and security, keeping critical data in-house and
safe from any potential third parties/hackers. The speed of face recognition-
enabled products is highly enhanced because it does not come across the
issue of delay and other risks associated with public cloud deployment.
Kairos has an ultra-scalable architecture, such that a search over 10 million
faces can be done in approximately the same time as a search over 1 face. It is
being accepted by the market with open arms.
h. Cognitec
Cognitec’s FaceVACS Engine enables users to develop new applications for
face recognition. The engine is very versatile as it allows a clear and logical
API for easy integration in other software programs. Cognitec allows the use
of the FaceVACS Engine through customized software development kits. The
platform can be easily tailored through a set of functions and modules specific
to each use case and computing platform. The capabilities of this software
include image quality checks, secure document issuance, and access control by
accurate verification.
The distinct features include:
A very powerful face localization and face tracking
Efficient algorithms for enrollment, verification, and identification
Accurate checking of age, gender, exposure, pose deviation, glasses,
closed eyes, uniform lighting, unnatural color, and image and face
geometry
Fulfills the requirements of ePassports by providing ISO 19794-5 full-frontal
image type checks and formatting
Utilization of Face Recognition
While facial recognition may seem futuristic, it’s currently being used in a
variety of ways. Here are some surprising applications of this technology.
Genetic Disorder Identification:
There are healthcare apps such as Face2Gene and software like Deep Gestalt
that use facial recognition to detect genetic disorders. The face is
analyzed and matched against an existing database of disorders.
Airline Industry:
Some airlines use facial recognition to identify passengers. This face scanning
helps save time and prevents the hassle of keeping track of a ticket.
Hospital Security:
Facial recognition can be used in hospitals to keep a record of patients,
which is far better than keeping paper records of names and addresses. It is
easy for the staff to use such an app to recognize a patient and retrieve their
details within seconds. It can also be used for security purposes, for example
to verify whether a person is genuine or actually a patient.
Detection of emotions and sentiments:
Real-time emotion detection is yet another valuable application of face
recognition in healthcare. It can be used to detect emotions that patients
exhibit during their stay in the hospital and analyze the data to determine how
they are feeling. The results of the analysis may help to identify if patients
need more attention in case they’re in pain or sad.
Problems and Challenges
Face recognition technology is facing several challenges. The common
problems and challenges that a face recognition system can have while
detecting and recognizing faces are discussed in the following paragraphs.
Pose: A Face Recognition System can tolerate cases with small rotation
angles, but detection becomes difficult if the angle is large, and if the
database does not contain all angles of the face this can pose a
problem.
Expressions: Because of emotions, the human mood varies and results in
different facial expressions. With these facial expressions, the machine could
make mistakes in finding the correct person’s identity.
Aging: With time and age the face changes; it is unique but does not remain
rigid, because of which it may be difficult to identify a person who is now,
say, 60 years old.
Occlusion: Occlusion means blockage. This is due to the presence of
various occluding objects such as glasses, beard, mustache, etc. on the face,
and when an image is captured, the face lacks some parts. Such a problem
can severely affect the classification process of the recognition system.
Illumination: Illumination means light variations. Illumination changes can
vary the overall magnitude of light intensity reflected from an object, as
well as the pattern of shading and shadows visible in an image. The
problem of face recognition over changes in illumination is widely
recognized to be difficult for both humans and algorithms. The difficulties posed
by varying illumination conditions remain a challenge for automatic face recognition
systems.
Identifying similar faces: Different persons may have a similar appearance,
which sometimes makes it nearly impossible to distinguish between them.
Disadvantages of Face Recognition
1. The danger of automated blanket surveillance
2. Lack of clear legal or regulatory framework
3. Violation of the principles of necessity and proportionality
4. Violation of the right to privacy
5. Effect on democratic political culture
9. Neural Network Learning: Neural Network Representation, Problems for Neural Network
Learning, Perceptrons and Gradient Descent, Multilayer Networks and the Backpropagation Algorithm,
Illustrative Example of the Backpropagation Algorithm - Face Recognition, Advanced Topics in ANN.
ADVANCED TOPICS IN ANN IN MACHINE LEARNING
Here are some advanced topics in Artificial Neural Networks (ANNs) and
machine learning:
1. Deep Learning
2. Kernel Methods
3. Probabilistic Graphical Models
Bayesian Networks: Representing relationships between variables using
graph structures.
Undirected Models: Also known as Markov Random Fields.
Bayesian Learning and Structure Learning: Inference on graphical
models.
4. Reinforcement Learning
5. Generative Models
These topics offer a lot of depth and potential for groundbreaking applications
in various domains such as image processing, natural language understanding,
and autonomous systems.