UNIT 2 Merged

The document discusses various subfields of artificial intelligence, focusing on machine learning, including supervised, unsupervised, and reinforcement learning. It details the supervised learning process, including data collection, preprocessing, model training, evaluation, and deployment, while highlighting challenges such as data quality and overfitting. Additionally, it covers key components and techniques in supervised learning, such as classification and regression methods.

Uploaded by

sathiyavathyp

UNIT II AI SUBFIELDS AND TECHNOLOGIES 9

Machine Learning - Neural Network and Deep Learning - Natural Language Processing (NLP) and
Computer Vision. Case study: Smart speaker. Case study: Self-driving car.

INTRODUCTION:
“We are entering a new world. The technologies of machine learning, speech recognition, and natural
language understanding are reaching a nexus of capability. The end result is that we'll soon have
artificially intelligent assistants to help us in every aspect of our lives.” - Amy Stapleton, Opus
Research

The world is currently going through an unmatched transition in a fast-evolving digital ecosystem,
powered by technologies that were once the stuff of science fiction. The capacity to integrate
this flood of information and draw useful insights has become critical in this age of information, when
data pours constantly from innumerable sources. At the centre of this transformation are four
interrelated artificial intelligence pillars: machine learning, neural networks and deep learning,
natural language processing, and computer vision.

MACHINE LEARNING

“Most of human and animal learning is unsupervised learning. If intelligence was a cake,
unsupervised learning would be the cake, supervised learning would be the icing on the cake, and
reinforcement learning would be the cherry on the cake. We know how to make the icing and the
cherry, but we don’t know how to make the cake. We need to solve the unsupervised learning problem
before we can even think of getting to true AI.” - Yann LeCun, VP, Facebook

Machine learning is a subfield of artificial intelligence (AI) that focuses on the development of
algorithms and statistical models that enable computer systems to improve their performance on a
specific task through learning from data, without being explicitly programmed. It is characterized by
the use of statistical techniques to enable computers to identify patterns, make decisions, and improve
their performance over time based on the data they encounter. In the age of limitless data and computational
expertise, machine learning is a symbol of creativity and transformation. It has influenced every part of
our lives, from personalised counsel to cutting-edge medical diagnostics, from autonomous systems to
predictive analytics. Note that, in the context of machine learning and data science, data refers to the
raw information, facts, or observations collected or generated for analysis, modelling, and decision-making
purposes.

The three primary paradigms of machine learning (Supervised Learning, Unsupervised Learning, and
Reinforcement Learning) will be the focus of this section.

 Supervised Machine Learning :

Supervised machine learning is a type of machine learning in which an algorithm learns from
labelled training data and makes predictions or decisions based on that learning. In supervised learning,
the algorithm is provided with a dataset that includes both input features and corresponding correct
output labels. The primary goal of supervised learning is to learn a mapping from input data to output
data, allowing the algorithm to make predictions or classifications for new, unseen data.

The term "supervised" implies that there is a supervisor, typically a human or a predefined
algorithm, who guides the model during the learning process. There are two primary types of supervised
learning: classification and regression.
 Key Components of Supervised Learning :

 Dataset: A dataset is partitioned into two parts:

 Training Data: This is the labelled data used to train the machine learning model. It consists of
input features (independent variables) and the corresponding target labels (dependent variables).
The model learns patterns and relationships between features and labels from this data.
 Testing Data: This is a subset of the dataset that is not used for training. It is used to measure the
model's performance and its capacity to generalise to new, previously unseen data.
 Features: Features, also known as input variables or independent variables, are characteristics or
attributes of the data that the model uses to make predictions. Features can be numerical or
categorical.
 Labels: Labels, also known as target or dependent variables, represent the outcomes or predictions
that the model is trained to produce. In classification tasks, labels indicate class memberships (e.g.,
pass or fail), while in regression tasks, labels represent continuous values (e.g., predicting house
prices).
 Model: The machine learning model is the algorithm or mathematical function that learns the
relationship between the input features and the target labels during the training process.
 Loss Function: The loss function (or cost function or objective function) measures how well the
model's predictions match the true labels in the training data. The goal during training is to
minimize this loss, typically by adjusting the model's parameters.
 Optimization Algorithm: An optimization algorithm is used to update the model's parameters
iteratively to minimize the loss function. Gradient descent is a commonly used optimization
algorithm for this purpose.
 Hyperparameters: Hyperparameters are configuration settings that are not learned from the
data but are set before training. Examples include learning rates, regularization strength, and the
architecture of the model itself.
 Training Process: The training process involves feeding the training data into the model,
computing predictions, calculating the loss, and updating the model's parameters using the
optimization algorithm. This process iterates until the model's performance converges or reaches a
predefined stopping criterion.
 Validation: Validation is the process of assessing the model's performance on a separate validation
dataset during training. It helps monitor for overfitting and guides the adjustment of
hyperparameters.
 Testing and Evaluation: After training and validation, the model is tested on a holdout testing
dataset to evaluate its generalization performance. Various evaluation metrics, such as accuracy,
precision, recall, F1-score, and mean squared error, are used to assess how well the model
performs on unseen data.
 Deployment: Once a model has been trained and tested, it can be deployed in real-world
applications to make predictions or automate decision-making.
 Monitoring and Maintenance: After deployment, models may require monitoring and periodic
updates to ensure they continue to perform well as data distributions change or new data becomes
available.
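
The way the model, loss function, optimization algorithm, hyperparameters, and stopping criterion listed above fit together can be sketched in a few lines of Python. This is only a minimal illustration, assuming a one-feature linear model, mean squared error as the loss, and plain batch gradient descent. The function names `mse_loss` and `train_linear_model`, the toy data, and the hyperparameter values are hypothetical choices made for this example and do not come from the notes.

```python
def mse_loss(w, b, xs, ys):
    """Loss function: mean squared error between predictions and true labels."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_linear_model(xs, ys, learning_rate=0.05, max_epochs=5000, tolerance=1e-12):
    """Fit y = w*x + b with batch gradient descent (illustrative sketch only)."""
    w, b = 0.0, 0.0                      # model parameters, learned from the data
    n = len(xs)
    previous_loss = mse_loss(w, b, xs, ys)
    for _ in range(max_epochs):          # hyperparameter: maximum number of passes
        # Gradients of the MSE loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Optimization step: move the parameters against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        current_loss = mse_loss(w, b, xs, ys)
        # Stopping criterion: the loss has (almost) stopped improving.
        if abs(previous_loss - current_loss) < tolerance:
            break
        previous_loss = current_loss
    return w, b


# Hypothetical toy data generated from y = 2x + 1 (features and labels).
features = [0.0, 1.0, 2.0, 3.0, 4.0]
labels = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear_model(features, labels)
print(round(w, 2), round(b, 2))
```

The learning rate and the tolerance are hyperparameters in the sense described above: they are set before training, and a learning rate that is too large for the data would make the loop diverge instead of converge.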

 Classification :
Classification is a type of supervised learning where the goal is to assign input to one of several
predefined categories or classes. A classification technique provides discrete, categorical labels or
classes as output. It can be further categorized based on the specific techniques and algorithms used:
 Binary Classification: In binary classification, there are only two possible classes. The
algorithm predicts which of these two classes an input belongs to.
Example: Classifying emails as either spam or not spam.
 Multi-Class Classification: In multi-class classification, there are more than two possible
classes. The algorithm predicts which class an input belongs to out of multiple options.
Example: Handwriting recognition (recognizing digits 0-9); identifying objects in images and
labelling them (e.g., recognizing cats, dogs, cars, etc.).
 Multi-Label Classification: In multi-label classification, an input can belong to multiple classes
simultaneously. The algorithm predicts a set of labels for each input.
Example: Movie genre classification (a movie can belong to multiple genres simultaneously, such
as action and comedy).
Classification algorithms are used to categorize data into predefined classes
or categories based on the features or attributes of the data.
Here are some common classification algorithms:
 Decision Trees: Decision trees are tree-like structures where each internal node represents a
feature or attribute, and each leaf node represents a class. Random Forests and Gradient Boosted
Trees are ensemble methods that combine multiple decision trees to improve accuracy.

 Logistic Regression:
 Logistic regression is a simple and widely used binary classification algorithm that models
the probability of the input data belonging to a particular class.

 Naive Bayes:
 Naive Bayes classifiers are based on Bayes' theorem and assume that features are
conditionally independent given the class. They are particularly useful for text
classification tasks like spam detection and sentiment analysis.

 Support Vector Machine:


 SVM is a binary classification technique that finds a hyperplane dividing the data points into
two groups. It can handle high-dimensional data and, with the help of kernel functions, remains
effective when the data cannot be separated linearly.

 K-Nearest Neighbours (K-NN):


 K-NN is a simple and intuitive classification algorithm. It operates by finding the K
training data points closest to a new, unlabelled data point in a multidimensional feature
space, using a distance metric like Euclidean distance. Once the closest neighbours have
been determined, K-NN labels the new data point's class based on the dominant class
among its neighbours. This algorithm is valued for its simplicity and adaptability, making
it suitable for both binary and multi-class classification tasks. However, K-NN's
performance can be sensitive to the choice of the number of neighbours (K) and may be
computationally expensive for large datasets or high-dimensional feature spaces.

Numerical Illustration 1:
We have a dataset of points in a two-dimensional space, and we want to classify a new data point
into one of two classes based on its proximity to its k (= 3, say) nearest neighbours. Here's the dataset:

Now, we want to classify a new data point (3, 4) into either Class A or Class B using the k-NN algorithm.
Step 1- Compute Distances: Calculate the Euclidean distance between the new data point (3, 4) and all
the points in the dataset:

Step 2- Select k Nearest Neighbours: Choose the k nearest neighbours with the smallest distances. In
this case, k = 3, so the three nearest neighbours are (2, 3), (2, 5), and (4, 3).

Step 3- Majority Voting: Determine the class of the new data point based on the classes of its k nearest
neighbours. In this case, two of the nearest neighbours belong to Class A, and one belongs to Class B.
Since Class A has the majority of the nearest neighbours, we classify the new data point (3, 4) as Class
A using majority voting.
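
The three steps of this illustration can be sketched in Python as follows. The original table of points is not reproduced in these notes, so the dataset below is hypothetical: it was chosen only so that (2, 3), (2, 5), and (4, 3) come out as the three nearest neighbours of (3, 4), with two of them in Class A and one in Class B as stated in Step 3. Which of those three points belongs to which class is an assumption, as is the function name `knn_classify`.

```python
import math
from collections import Counter


def knn_classify(training_points, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 1: Euclidean distance from the query to every labelled point.
    distances = [
        (math.dist(point, query), label) for point, label in training_points
    ]
    # Step 2: keep the k points with the smallest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: majority vote over the neighbours' class labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical dataset (the table from the notes is not available).
dataset = [
    ((1, 1), "A"),
    ((2, 3), "A"),
    ((2, 5), "A"),
    ((4, 3), "B"),
    ((6, 2), "B"),
    ((7, 8), "B"),
]

print(knn_classify(dataset, (3, 4), k=3))  # prints A
```

With an even k or more than two classes, a vote can end in a tie. This sketch simply keeps whichever label `Counter.most_common` returns first, which is a simplification rather than a recommended tie-breaking rule.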

The choice of classification algorithm depends on the nature of the data, the problem we
are trying to solve, and other factors such as interpretability, computational resources, and
available data. It is often recommended to experiment with multiple algorithms to determine which
one works best for a specific task.

Step 2- Select k Nearest Neighbours: Choose the k = 3 nearest neighbours with the smallest
distances.
In this case, the three nearest neighbours are Joy, Agniv, and Aishi.
Step 3- Majority Voting: Determine the class of the new data point based on the classes of its k
nearest neighbours. In this case, two of the nearest neighbours (Joy, Agniv) belong to Class Cricket, and
one (Aishi) belongs to Class Football. Since Class Cricket has the majority of the nearest neighbours,
we classify the sports class of Riju as Cricket.

 Regression
Regression is a type of supervised learning where the algorithm predicts a continuous numerical
value or quantity (output) based on input data. These techniques model the relationship between the
features and the target variable as a mathematical function. Here are some common regression
techniques:
 Linear Regression: Linear regression is one of the simplest and most widely used regression
techniques. It models the relationship between the input features and the target variable as a linear
equation.
There are two main types:
 Simple Linear Regression: When there is only one input feature.
 Multiple Linear Regression: When there are multiple input features.

Example 1: Predicting house prices based on rate per square foot.

Example 2: Companies frequently employ linear regression to analyse the relationship between revenue
and advertising spending. For instance, they may run a simple linear regression model with revenue serving
as the response variable and advertising expenditure serving as the predictor variable.
Numerical Illustration 1: Let us apply the linear regression technique to predict the ‘6th week sales’ from
the following dataset (four weeks’ sales data).
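
Because the sales table for this illustration is not reproduced in these notes, the following sketch uses hypothetical weekly sales figures to show how such a forecast could be computed. It assumes ordinary least squares for simple linear regression, where the slope is the sum of (x - mean_x)(y - mean_y) divided by the sum of (x - mean_x)^2, and the intercept is mean_y - slope * mean_x. The function name and the numbers are illustrative only.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: sales (arbitrary units) for weeks 1 to 4.
weeks = [1, 2, 3, 4]
sales = [12, 15, 17, 20]

slope, intercept = fit_simple_linear_regression(weeks, sales)
week6_forecast = slope * 6 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, week 6 forecast={week6_forecast:.2f}")
```

For these made-up numbers the fitted line is roughly y = 2.6x + 9.5, so the forecast for week 6 is about 25.1. Extrapolating two weeks beyond only four observations is, of course, a rough estimate at best.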

 Polynomial Regression: Polynomial regression extends linear regression by allowing for
polynomial terms of higher degrees. It can capture more complex relationships between
the features and the target variable.
Example: Predicting a person's age based on height.
 Random Forest Regression: Random forest regression is an ensemble technique that combines
multiple decision trees to reduce overfitting and improve accuracy. It is robust and handles
high-dimensional data well.
The choice of regression technique depends on the nature of the data, the
assumptions about the relationship between features and the target variable, and considerations like
model interpretability, computational resources, and the need for regularization. It's often a good
practice to try multiple regression techniques and evaluate their performance to select the most
suitable one for a given problem.

 The Supervised Learning Workflow:


The supervised learning workflow is a fundamental machine learning method in which an algorithm
learns a mapping from input data to target labels using a labelled dataset. It consists of several
important steps. The usual supervised learning workflow is as follows:

 Data Collection: Gather a dataset that consists of input data (features) and their corresponding
target labels.
Ensure that the data is representative of the problem we want to solve and is of high quality (clean,
correctly labelled, and relevant).
 Data Preprocessing:
Clean the data by handling missing values, outliers (data points that deviate
significantly from the majority of the data in a dataset), and errors in the dataset. Normalize or standardize
features if necessary, to ensure that they have similar scales. Split the dataset into two or more subsets,
typically a training set and a testing set, to evaluate the model's performance later.

 Feature Engineering: Select and transform features to improve the model's performance. This
may include:
Feature selection: Choosing the most relevant features.
Feature extraction: Creating new features from existing ones.
Feature scaling or encoding: Making sure the features are suitable for the chosen algorithm.

 Model Selection: Choose an appropriate machine learning algorithm (e.g., decision trees, support
vector machines, k-NN) that is well-suited for your problem and dataset.
Consider the trade-offs between various algorithms in terms of accuracy, interpretability, and
computational complexity.
 Model Training: Use the training dataset to train the selected model. During training, the model
learns the relationship between the input features and the target labels. The model adjusts its
internal parameters to minimize the difference between its predictions and the true labels in the
training data.

 Model Evaluation: Assess the model's performance using the testing dataset. Common evaluation
metrics include accuracy, precision, recall, F1-score, etc., depending on the nature of the problem
(classification, regression, etc.). Analyse the model's performance to understand its strengths
and weaknesses.

 Hyperparameter Tuning: Fine-tune the model's hyperparameters (e.g., learning rate,
regularization strength) to optimize its performance. This can be done using techniques like grid
search, random search, or Bayesian optimization.

 Model Deployment: Once you are satisfied with the model's performance, deploy it to make
predictions on new, unseen data in a real-world setting. Integrating the model into a software
programme or system may be required for deployment.

 Monitoring and Maintenance: Continuously monitor the model's performance in a production
environment. Re-evaluate and retrain the model periodically to account for changing data
distributions and ensure it remains accurate and effective.

 Documentation and Reporting:
Maintain documentation that records all the steps taken during the supervised learning workflow,
including data sources, preprocessing steps, model selection, and evaluation results. Communicate
the findings and results to stakeholders clearly and understandably.

The supervised learning workflow is iterative, and you may need to revisit and refine various steps
to improve the model's performance or adapt to changing requirements.
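
As a small illustration of the Model Evaluation step in this workflow, the sketch below computes accuracy, precision, recall, and F1-score for a binary classifier from its predictions on a test set. The function name `classification_metrics` and the label lists are hypothetical, and returning 0.0 when a denominator is zero is one possible convention, not the only one.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = len(pairs) - tp - fp - fn                                    # true negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test-set labels and model predictions (1 = positive class).
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(true_labels, predicted)
print(metrics)
```

In this made-up example there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics come out at 0.75. On imbalanced data the four values usually differ, which is why accuracy alone can be misleading.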

 Challenges of Supervised Learning:


Supervised learning is a popular and widely used branch of machine learning, but it comes with
its own set of challenges and limitations. Here are some key challenges specific to supervised learning:

 Data Availability and Quality: Supervised learning is strongly dependent on labelled data,
which may be costly and time-consuming to gather. Obtaining a sufficient amount of high-quality
labelled data can be difficult.

 Imbalanced Data: Imbalanced datasets, in which one class has a disproportionately higher
number of occurrences than another, may lead to biased models that underperform on minority
classes. To address this problem, methods such as resampling and the use of appropriate
assessment criteria are required.

 Overfitting: Overfitting is a common problem in supervised machine learning, where a model
learns to perform extremely well on the training data but fails to generalize its predictions to
unseen or new data. It occurs when a model captures noise or random fluctuations in the training
data rather than the underlying patterns or relationships. This typically happens when the model is
too complex for the amount of training data available, which makes it highly sensitive to small
variations in the input.
 Underfitting: Underfitting happens when a model is too simple to capture the underlying
relationships in the data. It results in poor performance on both the training and test datasets.
Choosing more complex models or feature engineering may be necessary.
 Generalization: Ensuring that a model generalizes well to unseen data is a central
challenge. This includes finding the right balance between bias and variance to achieve
good model performance on new examples.
 Feature Engineering: Selecting and engineering informative features is crucial for the
success of supervised learning models. Finding the right set of features can be challenging
and often requires domain expertise.
 Curse of Dimensionality: High-dimensional data can lead to increased model complexity
and the need for larger datasets. Dimensionality reduction techniques may be required to handle
high-dimensional feature spaces.
 Model Selection: Choosing the right algorithm or model architecture for a specific task
can be challenging. Different algorithms may perform better for different types of data and
problems.
 Interpretability: It is crucial to understand the reasoning behind a model's predictions,
especially when they have legal or moral consequences.
 Scalability: As the size of the dataset grows, training and inference times can become a
bottleneck. Distributed computing and parallel processing may be required to handle large
datasets efficiently.
 Labelling Cost and Consistency: Labelling data can be expensive and prone to human
errors. Ensuring labelling consistency and reliability is crucial for supervised learning.
 Model Maintenance: Models need to be regularly updated and maintained to account for
changing data distributions and evolving requirements. This requires ongoing monitoring
and retraining.
 Ethical Considerations: Supervised learning can raise ethical questions, particularly in
cases where the chosen model or the bias in the data can lead to discriminatory outcomes.
 Transfer Learning: It can be difficult to adapt models that have been trained in one
domain to function well in another, although methods like transfer learning are helping with
this.
Addressing these challenges effectively often requires a combination of technical expertise, domain
knowledge, data collection strategies, and careful model selection. Researchers and
practitioners in supervised learning continually work on developing methods and best practices to
tackle these issues.

 A Real Example of Supervised Learning

For Email Spam Classification, supervised learning is used to automatically classify incoming
emails as either spam or non-spam. Here's how this works:
 Data Collection: A dataset is compiled that comprises a huge number of emails as well as labels
indicating whether each email is spam or not. Human annotators carefully evaluate and classify the
emails to produce these labels.
 Data Preprocessing: The emails are preprocessed to extract relevant features. This preprocessing
may involve:
Tokenizing the text: Breaking the emails into words or tokens.
Removing stop words: Common words like "the", "and" and "is" are often removed since they don't
carry significant information.
Feature extraction: Converting the text data into numerical features, such as word frequencies or term
frequency-inverse document frequency (TF-IDF) values.
 Model Selection: A supervised learning algorithm is chosen to build the spam classification
model. Common choices include logistic regression, naive Bayes, SVM, random forest, k-NN, etc.
 Model Training: The chosen algorithm is trained on the preprocessed dataset, with the extracted
features serving as input and the matching labels (spam or non-spam) serving as output. During
training, the model learns to recognise patterns and correlations in data that differentiate spam emails
from real ones.
 Evaluation: The trained model's performance is tested using a different dataset. Common evaluation
metrics include accuracy, precision, recall, F1-score, and receiver operating characteristic (ROC)
curves.
 Model Deployment: When the model performs well on the evaluation dataset, it may be deployed
in a real-world email system. The deployed model processes incoming emails and provides a
likelihood or confidence score that the email is spam. Emails having a high spam likelihood are sent
to a spam folder or reported for user review, while emails with a low spam probability are delivered
to the inbox.
 Continuous Improvement: As fresh email data becomes available, the spam categorization model
may be continually updated and enhanced. Manual reviews and user comments can help the model
become more accurate and adapt to changing spammer strategies.
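
The preprocessing steps of this example (tokenizing, removing stop words, and converting text to TF-IDF features) can be sketched with the Python standard library alone. This is an illustrative sketch and not the pipeline of any particular email system: the stop-word list is deliberately tiny, the three emails are made up, and the weighting uses one common TF-IDF variant, tf * log(N / df), chosen here as an assumption.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "and", "is", "a", "to", "your", "at"}  # tiny illustrative list


def tokenize(text):
    """Lower-case the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical emails, used only to exercise the functions.
emails = [
    "Win a free prize now",
    "Meeting is at noon",
    "Claim your free prize",
]

vectors = tf_idf(emails)
for email, vector in zip(emails, vectors):
    print(email, "->", vector)
```

Terms that occur in several emails (here "free" and "prize") receive a lower weight than terms that are specific to one email, which is the intended effect of the inverse document frequency factor. The resulting dictionaries would then be turned into fixed-length feature vectors and passed to the chosen classifier.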

 Unsupervised Machine Learning


Unsupervised machine learning is a branch of machine learning where the algorithm is trained on
a dataset without explicit supervision or labelled target outcomes. Instead of predicting specific target
values, unsupervised learning algorithms focus on finding patterns, structures, or relationships within the
data itself. The primary goal is to explore and extract meaningful information, uncover hidden structures,
or reduce the dimensionality of the input data. This field encompasses a range of techniques, including
clustering, dimensionality reduction, density estimation, and anomaly detection, making it a versatile tool
for data exploration, knowledge discovery, and preprocessing tasks across various domains, from data
analysis to natural language processing and computer vision.

 Key Components of Unsupervised Learning :


Here are some key aspects, types, and techniques of unsupervised machine learning:

 Input Data: The fundamental element of unsupervised learning is raw data. This data can take many
forms, including numerical, textual, visual, or other data types, and it is used as input for unsupervised
learning methods.

 Clustering: Clustering is a fundamental approach in unsupervised machine learning
that involves grouping similar data points based on their characteristic features or patterns without
the need for fixed labels or goal outcomes. Clustering's primary goal is to find hidden structures
within datasets, making it an effective tool for data exploration and analysis. Various clustering
techniques, including K-Means, Hierarchical Clustering, and DBSCAN (Density-Based Spatial
Clustering of Applications with Noise), split the data into clusters, where data points within the same
cluster show high similarity while those in other clusters are dissimilar.

 Dimensionality Reduction: Dimensionality reduction techniques are used to reduce the number of
features in the dataset while preserving the most important information. Principal Component Analysis
(PCA) and t-SNE (t-Distributed Stochastic Neighbour Embedding) are popular methods for
dimensionality reduction.

 Anomaly Detection: Unsupervised learning can be used to detect anomalies or outliers within a dataset.
Anomalies are data points that deviate significantly from the norm and may indicate errors, fraud, or
unusual events. One-class SVM and Isolation Forest are examples of such techniques.

 Density Estimation: Density estimation aims to model the underlying
probability distribution of the data for understanding the data distribution and generating new data
samples. Kernel Density Estimation (KDE) and Gaussian Mixture Models (GMM) are common density
estimation methods.
 Representation Learning: Unsupervised learning can be used to learn meaningful data
representations. Models that can learn hierarchical data representations include autoencoders and deep
belief networks (DBNs).
 Evaluation Metrics: Metrics like the silhouette score or the Davies-Bouldin index are used to assess
the quality of clusters.

 Association Rule Mining: Association rule mining is a technique for discovering correlations and
patterns in transactional databases. It is often used in applications like market basket analysis to
discover relationships between items purchased together.

 Hyperparameter Tuning: Unsupervised learning algorithms often have hyperparameters that need to
be tuned. For example, in clustering algorithms like K-Means, the number of clusters (k) is a
hyperparameter that must be specified. Hyperparameter tuning helps optimize the performance of
unsupervised models.

 Visualization: Visualizing the results of unsupervised learning is essential for understanding the
discovered patterns and structures in the data.

 Preprocessing: Data preprocessing steps, such as data
scaling, normalization, and handling missing values, are important in unsupervised learning to ensure
that the algorithm can perform effectively and that the results are meaningful.

 Domain Knowledge: Domain knowledge can be valuable in interpreting the results of unsupervised
learning and understanding the significance of discovered patterns. It can help guide the analysis and
make the insights more actionable.
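
To make the association rule mining component above more concrete, the sketch below computes the two quantities most commonly used to describe a rule "A implies C" in market basket analysis: support (how often the items occur together) and confidence (how often C appears in baskets that contain A). The notes do not define these measures, so their use here is an assumption based on standard practice, and the baskets and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0  # convention chosen for this sketch when the antecedent never occurs
    joint_support = support(transactions, set(antecedent) | set(consequent))
    return joint_support / antecedent_support


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

print("support(bread, milk) =", support(baskets, {"bread", "milk"}))
print("confidence(bread -> milk) =", confidence(baskets, {"bread"}, {"milk"}))
```

In these made-up baskets, bread and milk appear together in 3 of 5 transactions (support 0.6), and 3 of the 4 baskets containing bread also contain milk (confidence 0.75). Full rule-mining algorithms search over many candidate itemsets; this sketch only evaluates a single given rule.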

 The Unsupervised Learning Workflow


The unsupervised learning workflow involves a series of steps for exploring, analysing, and
extracting valuable insights from unlabelled data. Here's a typical unsupervised learning workflow:

 Data Collection: Gather a dataset that contains raw, unlabelled data. This data can come from
various sources, such as sensors, text documents, images, or any domain-specific data.

 Data Preprocessing: Clean and preprocess the data to ensure it is suitable for analysis. This step
may involve handling missing values, removing outliers, and normalizing or scaling features as
needed.

 Feature Selection or Extraction: Identify the relevant features or attributes that will be used in
the analysis. In some cases, dimensionality reduction techniques, like Principal Component Analysis
(PCA), may be applied to reduce the number of features while preserving important information.

 Exploratory Data Analysis (EDA): Perform EDA to gain an initial understanding of the
data's distribution, relationships between variables, and potential patterns. Visualization tools and
techniques, such as scatter plots, histograms, and heatmaps, are commonly used.

 Clustering or Dimensionality Reduction: Depending on the goals of the analysis, choose the
appropriate unsupervised learning technique:
If the goal is to group similar data points, apply clustering algorithms like K-Means, Hierarchical
Clustering, or DBSCAN.
If the focus is on reducing dimensionality while retaining important information, use techniques like
PCA or t-SNE.
 Model Training: Train the selected unsupervised learning model on the preprocessed data. In the
case of clustering, the model learns to group data points into clusters based on similarity, while
dimensionality reduction models learn to project data into a lower-dimensional space.
 Evaluation (if applicable): Unlike supervised learning, there are typically no standard evaluation
metrics for unsupervised learning tasks since there are no target labels. Here, evaluation involves
assessing the quality and meaningfulness of the results manually or through domain-specific criteria.
 Interpretation and Analysis: Interpret the results to gain insights into the data. This may involve
identifying distinct clusters, understanding the relationships between data points, or discovering
important features.
 Visualization: Visualize the results to communicate findings effectively. Visualization techniques
can provide a clear representation of clusters, reduced-dimensional representations, or other
discovered patterns.
 Application and Decision Making: Apply the insights gained from the unsupervised learning results
to make informed decisions or inform further analyses. These decisions can range from refining business
strategies to optimizing processes or even creating supervised learning models based on the
discovered patterns.

 Documentation and Reporting: Document the entire unsupervised learning process, including data
preprocessing steps, chosen techniques, results, and insights. Report the findings to stakeholders
clearly and understandably.

The unsupervised learning workflow is highly iterative, and multiple
techniques may be applied and refined to uncover the most meaningful patterns or structures in the
data.
 A Clustering Algorithm (K-Means Algorithm):
K-Means is a popular clustering algorithm used in machine learning and data analysis. It is an
unsupervised learning algorithm that is used to partition a dataset into K distinct, non-overlapping
clusters. The goal of K-Means is to find clusters in the data, with the number of clusters (K) specified
by the user. Here's how the K-Means algorithm works:
 Initialization: Choose K initial cluster centroids randomly from the data points. These
centroids will act as the initial cluster centres.
 Assignment Step: For each data point in the dataset, calculate its distance to each of the K
cluster centroids. Assign the data point to the cluster whose centroid is closest to it. This step
effectively groups data points into clusters based on their proximity to the cluster centres.
 Update Step: After all data points have been assigned to clusters, calculate new cluster
centroids by finding the mean of all data points assigned to each cluster. These new centroids
become the updated cluster centres.
 Repeat Assignment and Update Steps: Continue the assignment and update steps iteratively
until one of the stopping criteria is met. Common stopping criteria include a maximum number
of iterations, convergence (when cluster assignments no longer change), or a predefined threshold
for centroid movement.
 Final Clustering: Once the algorithm converges or reaches the maximum number of iterations, the
data points are grouped into K clusters, and each data point belongs to the cluster with the nearest
centroid.
Key points to note about K-Means:
 K-Means is sensitive to the initial choice of cluster centroids. Different initializations can lead
to different final cluster assignments. To mitigate this, the algorithm is often run multiple times
with different initializations, and the best result (i.e., the one with the lowest overall
within-cluster variance) is selected.
 K-Means works well when clusters have similar shapes and densities.
 The choice of K is a critical decision in K-Means. There are various methods to select an
appropriate K, such as the elbow method or silhouette analysis.
 K-Means can be sensitive to outliers because it is based on the mean (average). Outliers can
significantly affect the positions of cluster centroids.
 K-Means is widely used for various applications, including image compression, customer
segmentation, anomaly detection, and more. It is a simple and efficient algorithm for discovering
patterns and grouping data points into clusters.

Numerical Illustration: To illustrate the K-Means clustering algorithm numerically, let us use a small
dataset {(2, 3), (2, 5), (3, 4), (5, 7), (6, 5), (9, 8)} with two-dimensional points and try to cluster
them into two groups (K=2).
Step 1 - Initialization: Randomly choose two initial centroids. Let's choose (2, 3) and (9, 8) as initial
centroids.
Step 2 - Assignment: Calculate the distance from each data point to both centroids and assign each point
to the nearest centroid. The points (2, 3), (2, 5), and (3, 4) are nearest to the first centroid, and
the points (5, 7), (6, 5), and (9, 8) are nearest to the second.
Step 3 - Update: Recompute each centroid as the mean of the points assigned to it. This gives
(2.33, 4.0) for Cluster 1 and (6.67, 6.67) for Cluster 2.
Step 4 - Repeat: Repeat the assignment and update steps until convergence. Let's check if the
assignments have changed. The assignments are the same as in the previous step, so there's no change
in cluster assignments. Since there is no change in assignments or centroids, the algorithm has
converged.
Step 5 - Final Clustering: Cluster 1: {(2, 3), (2, 5), (3, 4)}, Cluster 2: {(5, 7), (6, 5), (9, 8)}. This
is the final result of the K-Means clustering algorithm. It has clustered the data points into two groups
based on their proximity to the centroids. Cluster 1 contains points closer to the centroid (2.33, 4.0), and
Cluster 2 contains points closer to the centroid (6.67, 6.67).
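
The numerical illustration can be reproduced with a short program. The following is a minimal sketch in plain Python, not a production implementation. The function name kmeans is illustrative, and the starting centroids (2, 3) and (9, 8) are an assumption made for this example; the loop simply alternates the assignment and update steps until the assignments stop changing.

```python
import math

def kmeans(points, initial_centroids, max_iters=100):
    """Plain K-Means: alternate assignment and update until assignments stop changing."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for every point.
        new_assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids

points = [(2, 3), (2, 5), (3, 4), (5, 7), (6, 5), (9, 8)]
# Assumed starting centroids for this sketch.
assignments, centroids = kmeans(points, [(2, 3), (9, 8)])
print(assignments)
print([tuple(round(c, 2) for c in centroid) for centroid in centroids])
```

With these starting centroids the first three points end up in one cluster and the last three in the other, with centroids that round to (2.33, 4.0) and (6.67, 6.67), matching the final clustering above.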

 Challenges of Unsupervised Learning


While unsupervised learning has many applications and advantages, it also comes with several
challenges and limitations:
 Lack of Ground Truth: Unsupervised learning, in contrast to supervised learning, frequently
lacks the ground-truth labels needed to evaluate the model's performance. This makes evaluating
the quality of the learnt representations or clusters challenging.

 Evaluation Metrics: Determining evaluation metrics suitable for unsupervised learning tasks can
be challenging. Metrics such as inertia, silhouette score, or Davies-Bouldin index are commonly
used for clustering, but their effectiveness may vary depending on the dataset and problem.

 Difficulty in Interpretation: Unsupervised learning models, such as clustering algorithms
or autoencoders, can generate complicated, difficult-to-interpret representations. Understanding
the meaning or relevance of learned features can be non-trivial.

 Curse of Dimensionality: Unsupervised learning can be difficult with high-dimensional data.
High dimensionality can lead to sparsity (most feature values are empty or zero), higher processing
complexity, and difficulty visualising and understanding data.

 Clustering Ambiguity: Depending on the initialization or hyperparameters used, clustering
algorithms can generate a variety of results. The uncertainty in clustering solutions might make
selecting the "correct" clustering difficult.

 Scalability: Unsupervised learning algorithms may struggle with scalability when dealing with
large datasets. Training models on large data can be computationally intensive and time-consuming.

 Data Preprocessing: Preprocessing data for unsupervised learning often requires careful
consideration, including handling missing values, scaling features, and dealing with outliers.
Poor preprocessing can lead to suboptimal results.

 Determining the Number of Clusters: In clustering tasks, deciding the optimal number of clusters
can be challenging. Various methods, such as the elbow method or silhouette analysis, are used,
but none is guaranteed to identify the right number.

 Imbalanced Clusters: Clustering algorithms may produce clusters of significantly imbalanced
sizes, making it challenging to work with the results.

 Robustness to Noise: Noise refers to random variations, errors, or irrelevant information present
in a dataset. Unsupervised learning models may be sensitive to noisy data points or outliers.
Ensuring robustness in the presence of noise is essential for practical applications.

 Interactions between Clusters: Real-world data may contain complex structures where clusters
interact or overlap. Traditional clustering algorithms may struggle to capture such interactions.

 Transferability: Models learned through unsupervised learning may not transfer well to
different datasets or domains, requiring adaptation or fine-tuning.
Despite these challenges, unsupervised learning remains a valuable tool for discovering patterns and
structures in data, and advancements in the field continue to address some of these limitations.
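
One of the metrics named above, the silhouette score, can be computed by hand for small datasets. The sketch below is a plain-Python illustration; the function name silhouette_score and the two example groupings are assumptions made for this example. For each point, a is the mean distance to the other points in its own cluster, b is the smallest mean distance to the points of any other cluster, and the point's score is (b - a) / max(a, b). The overall score is the mean over all points and lies between -1 and 1.

```python
import math

def silhouette_score(clusters):
    """Mean silhouette coefficient for a list of clusters (each a list of points)."""
    scores = []
    for i, cluster in enumerate(clusters):
        for idx, p in enumerate(cluster):
            others = cluster[:idx] + cluster[idx + 1:]
            if not others:
                scores.append(0.0)  # common convention for single-point clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(p, q) for q in others) / len(others)
            # b: mean distance to the nearest other cluster
            b = min(
                sum(math.dist(p, q) for q in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# A compact grouping and a deliberately mixed-up grouping of the same six points.
good = [[(2, 3), (2, 5), (3, 4)], [(5, 7), (6, 5), (9, 8)]]
bad = [[(2, 3), (5, 7), (9, 8)], [(2, 5), (3, 4), (6, 5)]]
print(silhouette_score(good), silhouette_score(bad))
```

The compact grouping scores higher than the mixed-up one, which is how the metric is used to compare candidate clusterings when no labels are available.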

 A Real Example of Unsupervised Learning :


Imagine an e-commerce company that wants to better understand its customer base and tailor
marketing strategies to different customer groups. Unsupervised learning can be used to segment
customers into distinct groups based on their purchasing behaviour, preferences, and demographics.
 Data Collection: Collect data on customer behaviour and characteristics, such as purchase
history, demographics (age, gender, location),
 Data Preprocessing: Before applying unsupervised learning algorithms, the data is
cleaned and prepared, including handling missing values, scaling features, and encoding
categorical variables.
 Algorithm Selection: Common clustering algorithms like k-means clustering or
hierarchical clustering can be applied. These algorithms group customers based on
similarity in purchasing patterns.
 Interpretation: After applying the clustering algorithm, the company obtains clusters of
customers. Each cluster represents a group of customers who exhibit similar purchasing
behaviour. For example, one cluster might consist of tech-savvy, young customers who
frequently buy electronics, while another cluster could include older customers who prefer
fashion and home decor.
 Tailored Marketing Strategies: The business uses these customer segments to customize
marketing campaigns, product recommendations, and pricing strategies for each group.
For instance, they might send technology-related promotions to the tech-savvy cluster and
traditional product discounts to the older cluster.
Unsupervised learning, in this case, helps the e-commerce company extract valuable insights from
their data, ultimately leading to more effective marketing strategies and improved customer
relationships.

 Reinforcement Machine Learning :


Reinforcement Learning (RL) is a machine learning paradigm in which an agent learns to take
actions within an environment to maximize a cumulative reward signal by discovering optimal strategies
or policies through interactions and feedback from the environment. RL involves learning through trial
and error by interacting with an environment.

 Key Components of Reinforcement Learning:


Here are some key components and concepts in reinforcement learning:
 Agent: An agent is a learner or decision-maker that interacts with the environment. It
takes actions based on its current state and the policy it has learned.
 Environment: Environment represents the external system or world with which the agent
interacts. It is where the agent operates and receives feedback in the form of rewards and
state transitions.
 State (S): A state is a representation of the current situation or configuration of the
environment. It provides essential information to the agent about the environment's current
state.
 Action (A): An action is a decision or move that the agent can take at a particular state.
The set of all possible actions is called the action space.
 Policy (π): The policy is a strategy or a mapping that defines the agent's behaviour. It
specifies which action the agent should take in each state or state-action pair. Policies can
be deterministic or stochastic.
 Trajectory or Episode: A trajectory (or episode) is a sequence of states, actions, and
rewards that the agent experiences from the beginning to the end of an interaction with the
environment. It starts with an initial state and continues until a terminal state is reached.
 Reward (R): A reward is a numerical value that the environment provides to the agent
after each action. It quantifies how good or bad an action was in a particular state. The
agent's goal is to maximize the cumulative reward over time.
The term cumulative reward refers to the total sum of rewards that an agent receives over a
sequence of actions and interactions with its environment during a specific episode or trajectory.
For example, we may recall the snake game that used to be on our mobile phones. In the game, when
the snake eats a piece of food, it typically receives a positive reward, and if the snake collides with the
game's boundaries or itself, it typically receives a negative reward. The cumulative reward in this context
is the sum of all the rewards the agent receives throughout an episode, where an episode ends when the
game is over.
 Value Function (V or Q): The value function is a function that estimates the expected cumulative
reward the agent can achieve starting from a particular state (V) or a state-action pair (Q). It helps
the agent assess the desirability of different states or actions.
 Model (Optional): In some RL approaches, a model of the environment may be used. This model
can predict the next state and reward given the current state and action, aiding in planning and
learning.
 Exploration: In RL, trying new actions to learn more about the environment is known as
exploration.
 Exploitation: In RL, choosing actions that are currently believed to be the best is known as
exploitation.
 Discount Factor (γ): It is a parameter that determines the weight of future rewards
compared to immediate rewards. It helps in defining the agent's preference for short-term or
long-term rewards.
 Learning Algorithm: It is the method or approach used by the agent to update its policy or value
function based on its interactions with the environment. Common RL algorithms include Q-learning,
Deep Q-Networks (DQN), and various policy gradient methods.

These components together form the foundation of reinforcement learning. The specific
implementation and variations of these components can vary depending on the RL problem and approach
being used.
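
The cumulative reward and the discount factor can be combined in one small calculation. The sketch below is a plain-Python illustration; the function name discounted_return and the reward sequence are hypothetical and are not taken from any particular game. It computes r0 + γ·r1 + γ²·r2 + ... for one episode, so that γ = 1 gives the plain sum of rewards and smaller values of γ give less weight to later rewards.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step t is weighted by gamma ** t."""
    total = 0.0
    # Work backwards so each step is: reward now + gamma * (return from the next step).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Hypothetical snake-like episode: +1 for food, -1 for the final collision, 0 otherwise.
episode_rewards = [0, 1, 0, 0, 1, -1]

print(discounted_return(episode_rewards, gamma=1.0))  # plain cumulative reward
print(discounted_return(episode_rewards, gamma=0.5))  # later rewards count for less
```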

 Reinforcement learning workflow


A typical RL workflow is illustrated on the following page. The key steps include:
 Problem Formulation: Define the problem you want to solve with RL. This includes specifying the
environment, the agent's objectives, state and action spaces, and the reward function.

 Initialization: Initialize the agent's policy, value function, and any other necessary parameters.
 Interaction with the Environment: The agent interacts with the environment in episodes or
trajectories.
 In each episode, the agent starts from an initial state. At each time step of an episode:
 The agent observes the current state (S).
 It selects an action (A) based on its policy (π).
 The agent takes the selected action.
 The environment responds with a reward (R) and a new state (S').

 Policy Evaluation (Prediction): Update the value function (either the state-value function V or
the action-value function Q) to estimate the expected cumulative reward for each state or
state-action pair. Common methods include Dynamic Programming, Monte Carlo methods, and
Temporal Difference (TD) learning methods such as Q-learning.
 Policy Improvement (Control): Adjust the agent's policy (π) based on the estimated value
function to improve decision-making. Exploration strategies are often used to balance
exploration and exploitation. Common methods include Policy Gradient methods, Actor-Critic
methods, or Q-learning with epsilon-greedy exploration.
 Repeat: Continue the process of interaction with the environment, policy evaluation, and policy
improvement for many episodes or until convergence.
 Convergence and Evaluation: Monitor the agent's performance over episodes to check for
convergence and improvements in the cumulative reward. Use evaluation metrics to assess the
agent's performance, such as average return or success rate.
 Fine-Tuning and Hyperparameter Optimization: Adjust hyperparameters and experiment
with different settings to improve the agent's performance. This step may involve tuning learning
rates, discount factors, exploration strategies, and neural network architectures if applicable.
 Deployment (if applicable): Once the agent has learned an effective policy, it can be deployed
in a real-world application to make decisions in real-time.
 Monitoring and Maintenance: Continuously monitor the agent's performance in the real-world
application and make adjustments as needed.
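
The interaction, evaluation, and improvement steps above can be sketched with tabular Q-learning and epsilon-greedy exploration. The following is a minimal illustration in plain Python under stated assumptions: the environment is a hypothetical five-cell corridor in which the agent starts in cell 0 and receives a reward of 1 on reaching cell 4, and the values of ALPHA (learning rate), GAMMA (discount factor), and EPSILON (exploration rate) are arbitrary choices for this example.

```python
import random

N_STATES = 5                 # cells 0..4; cell 4 is the terminal goal state
ACTIONS = [-1, +1]           # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

def step(state, action):
    """Hypothetical environment: reward 1 on reaching the goal, 0 otherwise."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q-table: one row per state
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore with probability EPSILON, otherwise exploit.
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                best = max(q[state])
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move Q(S, A) towards R + gamma * max Q(S', .).
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal cells: the action with the highest Q-value.
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q_table[s][i])]
          for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy moves right in every non-terminal cell, which is the shortest route to the goal in this toy environment.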

 Challenges of Reinforcement Learning:


Reinforcement learning (RL) comes with several challenges and limitations:
o Sample Efficiency: RL algorithms often require a large number of interactions with the
environment to learn effective policies. This can be problematic in real-world scenarios
where each interaction is costly or time-consuming.
o Exploration vs. Exploitation: Balancing exploration and exploitation is a key challenge
in RL. Agents must explore to discover optimal strategies, but excessive exploration can
result in ineffective learning.
o High-Dimensional State Spaces: Many real-world problems have high-dimensional
state spaces, making exploration and learning computationally expensive or even infeasible.
o Delayed Rewards: In some environments, rewards may be sparse or delayed, making it
challenging for agents to associate their actions with future rewards.
o Non-Stationarity: Environments can change over time, and the underlying dynamics
may be non-stationary. RL agents need to adapt to these changes.
o Continuous Action Spaces: Handling continuous action spaces in RL can be
challenging. Techniques like actor-critic methods and policy gradients are used, but they
can be computationally expensive.
o Partial Observability: In many real-world scenarios, agents have limited or partial
information about the environment, making it harder to make optimal decisions.
o Credit Assignment: Determining which actions led to particular outcomes or rewards,
especially in long sequences of actions, is a complex problem known as credit
assignment.
o Safety and Ethics: RL agents may learn behaviours that are unsafe or ethically
problematic. Ensuring that RL agents act responsibly is a critical challenge.
o Transfer Learning: Generalizing knowledge learned in one environment to a different
but related environment (transfer learning) is a challenge in RL.
o Evaluation and Benchmarking: Establishing meaningful benchmarks for RL
algorithms and evaluating their performance is not always straightforward, particularly
for complex tasks.
o Stochastic Environments: Many real-world environments are inherently stochastic. RL
agents need to handle uncertainty effectively.
o Real-Time Decision Making: In applications like robotics or autonomous driving, RL
agents must make decisions in real-time, which imposes additional constraints and
challenges.
o Environmental Modelling: In some cases, RL agents may need to learn models of the
environment to make informed decisions, adding complexity to the learning process.
Researchers and practitioners in RL continue to develop techniques and approaches to
tackle these obstacles and advance the field.

 A Real Example of Reinforcement Learning


Consider an autonomous vehicle manufacturer working on developing self-driving cars.
Reinforcement learning can play a crucial role in training these vehicles to make intelligent driving
decisions.
 Scenario: The company aims to train an autonomous vehicle to navigate urban environments,
obey traffic rules, and make safe decisions while dealing with complex traffic scenarios.
 Reinforcement Learning Task: The primary task here is autonomous driving control, and
reinforcement learning is used to teach the vehicle how to make optimal driving decisions
in real-time.
 Data Collection: The company collects a vast amount of data from real-world driving
scenarios, including sensor data (e.g., LiDAR, cameras, GPS), road conditions, traffic data,
and human driving behaviour.
 Simulation: To augment the dataset and provide a safe environment for training, the
company builds a high-fidelity driving simulator. This simulator replicates urban driving
scenarios, including traffic, pedestrians, various weather conditions, and road obstacles.
 Agent: The autonomous vehicle serves as the RL agent.
 Environment: The urban driving environment, whether real-world streets or the simulated
environment, acts as the RL environment.
 Reward Function: The reward function is designed to encourage safe and efficient driving
behaviour. It may include rewards for staying in the correct lane, obeying traffic signals,
avoiding collisions, and following the speed limit. Penalties can be applied for violations
or accidents.
 Learning Algorithms: The company employs deep reinforcement learning algorithms, such as
Deep Q-Networks (DQN) or Trust Region Policy Optimization (TRPO), to train autonomous
vehicles.

 Exploration: Initially, the vehicle explores the driving environment, taking
random actions or following basic driving rules. This phase collects data to initialize the
RL model.

 Model Training: Using collected data from real-world driving and simulation, the RL agent
is trained to maximize its cumulative reward. The agent learns optimal
policies for various driving scenarios.

 Evaluation: The trained model's performance is evaluated in both simulated and real-
world scenarios. The model may go through iterations of training and evaluation to
improve performance.

 Safety and Real-World Deployment: The trained model is thoroughly tested to ensure
safety and reliability. Safety mechanisms, such as emergency braking systems, are
integrated into the vehicle to override the RL model if it makes potentially unsafe
decisions.
Here, reinforcement learning allows the autonomous vehicle to learn from interactions with the
environment, adapting its driving behaviour based on real-time feedback and optimizing its decision-
making process for safe and efficient urban driving.
 Comparison between Supervised, Unsupervised and Reinforcement learning

The following Table 2.1 summarizes the key differences and similarities between supervised
learning, unsupervised learning, and reinforcement learning.
NEURAL NETWORK AND DEEP LEARNING
“I have always been convinced that the only way to get artificial intelligence to work
is to do the computation in a way similar to the human brain. That is the goal I have been pursuing. We
are making progress, though we still have lots to learn about how the brain actually works.”—Geoffrey
Hinton

 Artificial Neural Network
Artificial Neural Network (ANN) is a computational model which is
based on the neural architecture of the human brain. It is made up of layers of networked nodes, or
neurons. These nodes process information, and during training, the network modifies the connection
strengths (weights) to learn from the data. This allows the network to identify patterns, anticipate
outcomes, and perform a variety of machine learning and artificial intelligence tasks such as pattern
recognition, classification, regression, and decision-making.
 Key Components of ANN
 Neurons: In an artificial neural network, the basic processing units are called neurons or
artificial neurons. These neurons are analogous to the neurons in the biological nervous
system.
 Layers: Neurons are organized into layers. The three main types of layers are the input layer,
hidden layer(s), and output layer. The input layer receives input data, the hidden layers process
this information, and the output layer produces the final result.
 Connections: Neurons in one layer are connected to neurons in the next layer. These connections
allow information to flow through the network.
 Weights: Each connection has an associated weight, representing the strength of that connection.
During training, these weights are adjusted to make the network learn from input-output pairs.
 Activation Function: Each neuron has an activation function, which determines the output of
the neuron given its input. Common activation functions include sigmoid, hyperbolic tangent
(tanh), and rectified linear unit (ReLU).
 Feedforward: Information flows through the network from the input layer to the output layer.
The input is processed layer by layer, and the final output is produced.
 Backpropagation: During the training phase, the network learns by adjusting the weights based
on the error between the predicted output and the actual output. The backpropagation algorithm
is commonly used for this purpose.
 Learning: ANNs learn from data through a training process. They generalize patterns from the
training data and can make predictions on new, unseen data.
 Training: Training involves presenting input data to the network, comparing the predicted
output to the actual output, and adjusting the weights to reduce the error.
 Error: It is the difference between the actual (target) output and the predicted output.
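
The training loop described above (present an input, compare the predicted output with the actual output, adjust the weights to reduce the error) can be sketched for a single neuron. This is a minimal plain-Python illustration, not a full backpropagation implementation: with only one neuron there are no hidden layers to propagate the error through. The task (the logical AND function), the learning rate lr, and the number of epochs are assumptions chosen for this example. The update used is gradient descent on the cross-entropy loss of a sigmoid neuron, which reduces to weight += lr * error * input.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, inputs):
    """Feedforward pass of one neuron: weighted sum followed by the activation."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_neuron(samples, epochs=2000, lr=0.5):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # Error: difference between the actual (target) and predicted output.
            error = target - predict(weights, bias, inputs)
            # Adjust each weight in the direction that reduces the error.
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

# Input-output pairs for the logical AND function (an illustrative task).
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_neuron(and_samples)
predictions = [round(predict(weights, bias, x)) for x, _ in and_samples]
print(predictions)
```

After training, rounding the neuron's outputs reproduces the AND targets, showing how repeated small weight adjustments reduce the error.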

 Structure of an ANN

Figure 2.5 depicts a simple structure of an artificial neuron. Here, X1, X2, ..., Xn are nodes or input
units, and the weights are represented by w1, w2, ..., wn. Y is the output unit. Yout is the signal
transmitted by Y. It is also known as the activation of Y.
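
The computation performed by such a neuron can be written in a few lines. The sketch below is a plain-Python illustration; the function name neuron_output, the bias term, the choice of the sigmoid as the activation function, and the example inputs and weights are all assumptions made for this example. The net input is the weighted sum of the inputs, and the output signal is the activation function applied to it.

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    """Compute the output signal (activation) of a single artificial neuron."""
    # Net input: x1*w1 + x2*w2 + ... + xn*wn (+ bias)
    net_input = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation: here the sigmoid, which squashes the net input into (0, 1).
    return 1.0 / (1.0 + math.exp(-net_input))

x = [1.0, 0.5]       # illustrative input values
w = [0.4, -0.8]      # illustrative connection weights
y_out = neuron_output(x, w)
print(y_out)
```

For these values the weighted inputs cancel out, so the net input is 0 and the sigmoid returns 0.5.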
 Properties of Artificial Neural Network (ANN)

 Parallel Processing: ANNs can process multiple inputs simultaneously, making them well-suited
for tasks that involve large datasets or require real-time processing, such as image and speech
recognition.
 Adaptability: ANNs can adapt and learn from data. Through training, ANNs adjust their internal
parameters (weights and biases) to minimize errors and improve their ability to make accurate
predictions. This adaptability allows neural networks to generalize from the training data to new,
unseen data.
 Robustness: ANNs can often handle noisy and incomplete data, making them suitable for real-
world applications where data may not be perfect.
 Scalability: ANNs can be designed with varying numbers of layers and neurons to accommodate
different complexities of tasks. This scalability allows for a wide range of applications, from
simple to highly complex tasks.
 Transfer Learning: Pre-trained neural network models can be used as a starting point for new
tasks. This technique, known as transfer learning, reduces the time and data required to train
models for specific applications.
 Generalization: Well-trained ANNs can generalize from the training data to make accurate
predictions on new, unseen data. This generalization is a key property that allows neural
networks to be effective in a wide range of applications.
 Multimodal Processing: ANNs can handle multiple types of data simultaneously, such as text,
images, and audio. This makes them suitable for tasks that involve multiple modalities, like
video analysis or autonomous driving.
 Fault Tolerance: Some neural network architectures, especially recurrent networks and
self-organizing maps, exhibit a degree of fault tolerance, where they can continue to function
even if some parts of the network are damaged or missing.

 Biological neural networks vs. artificial neural networks:

 Biological neural networks and artificial neural networks share a fundamental connection, as the
latter is inspired by the structure and functioning of the former. Let's explore the relationship
between biological neural networks (BNNs) and artificial neural networks (ANNs):

 While ANNs are inspired by BNNs, it's essential to note that they are simplified models, and the
level of abstraction varies. Also, they are not a perfect replica of the complexity and functionality
of biological neural networks.
 Building Blocks of ANN
Artificial neural networks (ANNs) rely on three fundamental building blocks:
Network Topology, Adjustments of weights (learning), and Activation functions.

 Network Topology:
 Multi-Layer Network (Multi-layer Perceptron):
Multi-layer neural networks, also known as multi-layer perceptrons (MLPs), are a type
of artificial neural network with layers of neurons in between the input and the output layers of neurons.
These networks are capable of learning complex, non-linear relationships in data and have become a
fundamental building block in modern machine learning and deep learning.
Here are the key characteristics and components of a multi-layer neural network.

 Input Layer: The input layer is the first layer of the network, where each neuron corresponds to
a feature or input variable. The number of neurons in the input layer is determined by the
dimensionality of the input data.

 Hidden Layers: Between the input and output layers, multi-layer neural networks can have one
or more hidden layers. These hidden layers are responsible for feature transformation and
abstraction. Each neuron in a hidden layer computes a weighted sum of its inputs and applies an
activation function to produce an output.

 Weights and Biases: Each connection between neurons in adjacent layers has an associated
weight, and each neuron has a bias term. These weights and biases are learned during the training
process, where the network adjusts them to minimize the prediction error.

 Activation Functions: Activation functions are applied to the output of neurons in hidden layers
and, optionally, in the output layer.

 Output Layer: The output layer is the final layer of the network. The number of neurons in this
layer depends on the specific task.
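
A forward pass through such a network can be sketched as follows. This is a plain-Python illustration with one hidden layer. The weights and biases are hand-picked assumptions chosen so that the network computes the XOR function; in practice they would be learned during training. The example shows how a hidden layer with a non-linear activation (ReLU here) lets the network represent a relationship that no single linear neuron can.

```python
def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def dense_layer(inputs, weights, biases, activation):
    """Each neuron computes a weighted sum of its inputs plus a bias, then activates."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Hand-picked (not learned) parameters: 2 inputs -> 2 hidden neurons -> 1 output.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]

def forward(x):
    hidden = dense_layer(x, HIDDEN_WEIGHTS, HIDDEN_BIASES, relu)           # hidden layer
    return dense_layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, identity)[0]  # output layer

xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [forward(x) for x in xor_inputs]
print(outputs)
```

The outputs are 0, 1, 1, 0 for the four input pairs, which is the XOR pattern.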
 Activation functions:
The term "activation" refers to the output of a processing unit. The activation function of a processing
unit is defined as the function that maps the net input value to the output signal value, hence causing
activation. There are several activation functions used in neural networks. Some of the most common ones
are listed and discussed subsequently.
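
Three widely used activation functions (sigmoid, hyperbolic tangent, and ReLU) can be written directly from their definitions. The sketch below is a plain-Python illustration; the sample net input values are arbitrary.

```python
import math

def sigmoid(z):
    """Maps any net input to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Maps any net input to the range (-1, 1)."""
    return math.tanh(z)

def relu(z):
    """Passes positive net inputs through unchanged and outputs 0 otherwise."""
    return max(0.0, z)

for net_input in (-2.0, 0.0, 2.0):
    print(net_input, round(sigmoid(net_input), 3), round(tanh(net_input), 3), relu(net_input))
```

At a net input of 0 the sigmoid returns 0.5 while tanh and ReLU return 0, and for negative net inputs only ReLU outputs exactly 0.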
 Types of Artificial Neural Network
There are various types of neural networks, each designed for specific tasks and problem
domains, Here are some of the most commonly used types of neural networks:
 Feedforward Neural Network (FNN): Also known as multilayer perceptrons (MLPs), these are the
simplest type of neural network. They consist of an input layer, one or more hidden layers, and an
output layer. FNNs are used for tasks like classification and regression.
 Convolutional Neural Network (CNN): Designed for processing grid-like data such as images
and video. CNNs use convolutional layers to automatically learn features from input data and are
widely used in image classification, object detection, and image generation tasks.
 Recurrent Neural Network (RNN): Suitable for sequence data, RNNs have connections that
loop back on themselves, allowing them to maintain a form of memory. RNNs are used in natural
language processing, speech recognition, and time series prediction.
 Self-Organizing Map (SOM): SOMs are a type of unsupervised neural network used for
clustering and visualization. They reduce the dimensionality of data while preserving topological
relationships.
 Hopfield Network: Hopfield networks are recurrent neural networks used for associative memory
and pattern recognition. They are often used in optimization problems and content-addressable
memory.

 Applications of Artificial Neural Networks:

 Speech Recognition: Artificial neural networks (ANNs) play a major role in speech recognition.
Statistical techniques like Hidden Markov Models were employed in earlier voice recognition
models. Since the advent of deep learning, neural networks of various types have become the
dominant approach for achieving accurate recognition.
 Recognition of Handwritten Characters: Handwritten characters are recognized by ANNs.
Neural networks have been trained to identify handwritten characters, which can be either letters
or numbers.
 Natural Language Processing (NLP): Recurrent Neural Networks (RNNs) and
Transformers are used for tasks such as language translation, sentiment analysis, text generation,
and chatbots.
 Healthcare and Medical Diagnosis: ANNs are employed in medical imaging for tasks like
diagnosing diseases from X-rays and MRIs, predicting patient outcomes, and drug discovery.
 Autonomous Vehicles: Neural networks, including deep reinforcement learning, enable self-
driving cars to perceive and navigate their surroundings.
 Astronomy: ANNs are used for tasks like classifying celestial objects, finding exoplanets, and
analysing large-scale astronomical data.
 Deep Learning: Deep learning has changed how we interact with technology in recent years, from
voice assistants that understand our natural language to autonomous cars that navigate tough
scenarios. Neural networks, which are mathematical models that mirror the neurons in our brains,
enable these breakthroughs. These networks let computers learn from data and adapt in ways
previously considered to be exclusive to human brains.
Deep learning is a branch of machine learning in which multi-layer neural networks
automatically learn from data and improve their performance. The term deep in deep learning refers to
the number of layers used to transform the data. An ANN having several layers between the input and
output layers is called a Deep Neural Network (DNN). Deep learning algorithms use ANNs to mimic human
thought processes and learning in order to learn and become more efficient. Traditionally, a shallow
neural network (SNN) is one with one or two hidden layers. Thus, a deep neural network (DNN) is one with
more than two hidden layers.
 Types of Deep Learning Models: Convolutional Neural Networks (CNNs): Convolutional
Neural Networks (CNNs) are an extension of artificial neural networks (ANNs). They are
mainly used for feature extraction from input datasets. A convolutional neural network is made up
of several layers, including the convolutional layer, ReLU layer, pooling layer and fully connected
layers. Uses of these layers are described in Table 2.10.
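The four layer types named above can be illustrated with a minimal pure-Python sketch. It is not a trained network: the 5x5 image, the 2x2 edge kernel, and the fully connected weights and bias are hypothetical values chosen only to show how data flows from one layer to the next.

```python
def conv2d(image, kernel):
    """Convolutional layer: valid 2D cross-correlation of image with kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + j][x + i] * kernel[j][i]
                for j in range(kh) for i in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def relu(feature_map):
    """ReLU layer: replace every negative activation with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Pooling layer: keep the largest value in each size x size block."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [max(feature_map[y + j][x + i]
             for j in range(size) for i in range(size))
         for x in cols]
        for y in rows
    ]


def fully_connected(feature_map, weights, bias):
    """Fully connected layer: weighted sum of the flattened feature map."""
    flat = [v for row in feature_map for v in row]
    return sum(w * v for w, v in zip(weights, flat)) + bias


# Hypothetical input: a bright vertical stripe on a dark background.
image = [[0, 0, 1, 1, 0] for _ in range(5)]
# Hypothetical kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

conv_out = conv2d(image, kernel)   # rising edge gives +2, falling edge gives -2
activated = relu(conv_out)         # the negative (falling-edge) response is zeroed
pooled = max_pool(activated)       # 4x4 feature map reduced to 2x2
score = fully_connected(pooled, weights=[0.5, 0.5, 0.5, 0.5], bias=-1.0)
print(conv_out[0], activated[0], pooled, score)
```

In a real CNN the kernel values and the fully connected weights are learned from training data rather than written by hand.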

NATURAL LANGUAGE PROCESSING (NLP) AND COMPUTER VISION

Natural Language Processing (NLP) and Computer Vision stand at the forefront of
cutting-edge technologies, revolutionizing the way machines interact with human language and visual
information. NLP is a subfield of artificial intelligence (AI) that focuses on enabling computers to
understand, interpret, and generate human language in a way that is both meaningful and contextually
relevant. This includes tasks such as language translation, sentiment analysis, and speech recognition.
On the other hand, Computer Vision empowers machines to interpret and make sense of the visual world
by extracting meaningful information from images or videos. It encompasses image recognition, object
detection, and facial recognition, among other applications. Together, NLP and Computer Vision form
a powerful duo, bridging the gap between human communication and computational understanding,
paving the way for innovative applications across various industries, from healthcare to finance and
beyond.

 Natural Language Processing (NLP)
Natural language processing (NLP) is the study of constructing machines that can manipulate written,
spoken, and organised human language, or data that mimics human language. It evolved from
computational linguistics, which studies language concepts using computer science.

 Stages of NLP
 Lexical analysis:


It is a fundamental stage in Natural Language Processing (NLP) that involves breaking down a
sequence of characters (usually a text) into meaningful units called tokens. These tokens serve
as the basic building blocks for subsequent stages of language processing.
 Syntactic analysis: Syntactic analysis, also known as parsing, is a crucial phase of natural language
processing (NLP) that includes evaluating the grammatical structure of sentences to comprehend
word connections. This technique helps in determining the syntax, or word order of a phrase, which
is necessary for interpreting the meaning of the text.
 Semantic analysis: Semantics describes the meaning of words, phrases, sentences, and
paragraphs. It involves understanding the meaning of words, phrases, and sentences in a certain
context. In contrast to syntactic analysis, which focuses on grammatical structure, semantic
analysis goes into content interpretation and attempts to capture the intended meaning.
 Discourse integration: A discourse is an exchange of ideas between two or more people.
Discourse integration deciphers unclear language by examining earlier words and phrases. It refers
to the process of understanding and connecting multiple sentences or utterances in a coherent and
cohesive manner. It goes beyond the analysis of individual sentences and focuses on the
relationships and connections between them to derive a more comprehensive understanding of a
text or conversation.
 Pragmatic analysis: The goal of pragmatic language analysis is to obtain the intended meaning
rather than the literal meaning.

 Steps of NLP
 Segmentation: Deconstruct the entire document into its individual sentences. To accomplish this,
divide the text at punctuation marks, such as commas and full stops, as illustrated below.
 Input (Text): The Sky is clear, the stars are twinkling at night
 Output (Text 1): The Sky is clear
 Output (Text 2): The stars are twinkling at night
 Tokenization: Breaking down a text into individual words or tokens.
 Input (Text): The sun is shining in the sky
 Output (Tokens): The, sun, is, shining, in, the, sky
 Removing Stop Words: Eliminating common words (e.g., "and," "the") that don't contribute much
to the meaning.
 Input (Text): Joy is going to the college
 Output (Tokens): Joy, going, college
 Stemming or Lemmatization: Stemming means reducing a word to its base form by removing
prefixes and suffixes.
 Input (List of words): Affection, Affects, Affecting, Affected
 Output (Root word): Affect
 Lemmatization means combining several inflections of a word into a single base word known as a
lemma.
 Input (List of words): doing, done, did
 Output (Lemma): do
 Parts of Speech (POS) tagging: Identify parts of speech for different tokens.
 Input (Sentence): The cat killed the fish.
 Output (Parts of speech): Definite article, noun, verb, definite article, noun.
 Named entity recognition: Classify named entities mentioned in the text into categories such
as “People”, “Locations”, “Organizations” and so on.
 Input (Text): Google CEO Sundar Pichai resides in New York.
 Output (Named entities): Google (Organization), Sundar Pichai (Person), New York (Location)
 Chunking: The process by which we take discrete bits of information and combine them into
larger groups.
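The first four steps above can be sketched in plain Python using only the standard library. This is a minimal illustration, not a production pipeline: the stop-word set and the suffix list are small hypothetical samples, and the toy stemmer simply strips one suffix, which is far cruder than the stemmers found in NLP libraries.

```python
import re

# Tiny illustrative stop-word set (an assumption; real lists are much longer).
STOP_WORDS = {"is", "to", "the", "a", "an", "and", "in", "at", "are"}

# Suffixes removed by the toy stemmer (hypothetical and far from complete).
SUFFIXES = ("ion", "ing", "ed", "s")


def segment(text):
    """Segmentation: split the text at commas and full stops."""
    parts = re.split(r"[,.]", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(text):
    """Tokenization: break the text into word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z']+", text)


def remove_stop_words(tokens):
    """Stop-word removal: drop common words that carry little meaning."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def stem(word):
    """Toy stemming: strip one known suffix if at least 3 letters remain."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


print(segment("The Sky is clear, the stars are twinkling at night"))
print(tokenize("The sun is shining in the sky"))
print(remove_stop_words(tokenize("Joy is going to the college")))
print({stem(w) for w in ["Affection", "Affects", "Affecting", "Affected"]})
```

Lemmatization, POS tagging and named entity recognition need dictionaries or trained models, so they are normally done with an NLP library rather than hand-written rules like these.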

 Applications of NLP
NLP applications are diverse and can be found in various fields, including:
 Search Engines: Improving search results by understanding user queries.
 Virtual Assistants: Enhancing the capabilities of virtual assistants like Siri, Alexa, or Google
Assistant.
 Text-based Chatbots: Enabling automated responses in customer support and other
applications.
 Language Translation Services: Facilitating communication across different languages.
 Sentiment Analysis in Social Media: Analysing social media content to understand public opinion.
 Healthcare: Extracting information from medical texts and assisting in clinical decision-
making.
 Finance: Analysing financial reports and news to make informed investment decisions.

 Computer Vision
Computer vision is a multidisciplinary field that enables machines to interpret and
make decisions based on visual data. Leveraging techniques from computer science, mathematics, and
machine learning, computer vision aims to replicate human vision and perception in machines. If AI
enables computers to think, computer vision enables them to see, observe and understand. It aims to
automate visual tasks that are capable of being performed by the human visual system, like object
detection, scene interpretation, and image recognition. Deep learning and machine learning algorithms
are frequently used in computer vision, which is a broad field of approaches and methodologies.
 Basic Concepts in Computer Vision
The multidisciplinary field of Computer Vision relies on several fundamental concepts to process and
understand visual data. These concepts form the building blocks for various computer vision tasks,
ranging from image processing to advanced scene understanding. In this section, we explore the essential
concepts that underpin the functioning of computer vision systems.
 Image Representation: At the heart of computer vision lies the concept of image representation.
An image is a two-dimensional array of pixels, where each pixel represents the smallest unit of
visual information. The properties of these pixels, such as color intensity, form the basis for
image representation. Common color spaces include RGB (Red, Green, Blue) and grayscale,
with each pixel's value indicating its color or intensity.
 Image Processing Techniques: Image processing involves manipulating images to enhance
their quality, extract useful information, or prepare them for further analysis. Common image
processing techniques include:
 Filtering: Applying filters, such as blurring or sharpening, to highlight or suppress certain image
features.
 Thresholding: Segmenting an image by setting a threshold to separate regions of interest based
on pixel intensity.
 Morphological Operations: Modifying the structure of image components using operations like
dilation and erosion.
 Feature Extraction: Feature extraction is a critical step in computer vision that involves
identifying and isolating meaningful information from raw data. Features can be edges, corners,
textures, or any distinctive pattern that aids in subsequent analysis. Techniques like Harris corner
detection and Scale-Invariant Feature Transform (SIFT) are commonly used for feature
extraction.
 Image Recognition and Classification: Image recognition is the task of assigning labels or
categories to images based on their content. Classification, a subset of image recognition,
involves categorizing an image into predefined classes or groups. Traditional computer vision
methods often used handcrafted features and machine learning algorithms, while modern
approaches leverage deep learning models, particularly Convolutional Neural Networks
(CNNs), for image classification tasks.
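The image representation and image processing concepts above can be tried out on a tiny example. The sketch below stores a grayscale image as a two-dimensional list of pixel intensities and applies a mean filter, thresholding, dilation and erosion. The 5x5 image, the threshold of 128, and the 3x3 neighbourhood are illustrative assumptions; real systems would use an image library and much larger arrays.

```python
# A 5x5 grayscale image: each value is a pixel intensity, 0 (black) to 255 (white).
image = [
    [10,  10,  10,  10, 10],
    [10, 200, 200, 200, 10],
    [10, 200, 255, 200, 10],
    [10, 200, 200, 200, 10],
    [10,  10,  10,  10, 10],
]


def neighbourhood(img, y, x):
    """Values in the 3x3 window around (y, x), clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - 1), min(h, y + 2))
            for i in range(max(0, x - 1), min(w, x + 2))]


def apply_window(img, func):
    """Apply func to the 3x3 neighbourhood of every pixel."""
    return [[func(neighbourhood(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def box_blur(img):
    """Filtering: replace each pixel with the integer mean of its neighbourhood."""
    return apply_window(img, lambda values: sum(values) // len(values))


def threshold(img, t):
    """Thresholding: 1 where the intensity exceeds t, otherwise 0."""
    return [[1 if p > t else 0 for p in row] for row in img]


def dilate(mask):
    """Morphological dilation: a pixel becomes 1 if any neighbour is 1."""
    return apply_window(mask, max)


def erode(mask):
    """Morphological erosion: a pixel stays 1 only if all neighbours are 1."""
    return apply_window(mask, min)


mask = threshold(image, 128)   # separates the bright square from the background
print(mask)
print(box_blur(image)[2][2])   # smoothed value of the brightest pixel
print(dilate(mask))            # the square grows by one pixel on every side
print(erode(mask))             # the square shrinks to its centre pixel
```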

 Achievements and Impact of Computer Vision


Computer vision technology has achieved remarkable milestones and has a profound impact on
various industries and aspects of daily life, and the continuous advancements in algorithms, models, and
hardware contribute to its widespread adoption. The key achievements and impacts of computer vision
include:
 Image and Video Analysis:
 Content Understanding: Computer vision algorithms accurately classify and recognize objects,
scenes, and actions in images and videos. This capability is employed in applications like
content moderation, where inappropriate content is automatically identified and filtered.
 Video Surveillance: Computer vision plays a crucial role in surveillance systems, automating
the monitoring of public spaces, identifying anomalies, and enhancing security.
 Autonomous Vehicles:
 Object Detection and Tracking: Computer vision enables vehicles to detect and track objects
in their surroundings, a fundamental capability for autonomous driving. This includes
recognizing pedestrians, other vehicles, and obstacles in real time.
 Lane Detection: Vision systems help vehicles stay within lanes by identifying lane markings,
contributing to the safety and precision of autonomous navigation.
 Healthcare:
 Medical Imaging: Computer vision is extensively used in medical image analysis, aiding in the
diagnosis and treatment of various conditions. It plays a vital role in detecting abnormalities
in X-rays, MRIs, CT scans, and other medical images.
 Disease Diagnosis: Vision systems contribute to the identification of diseases and conditions by
analyzing visual symptoms. For example, retinal scans can be used for early detection of diabetic
retinopathy.
 Augmented Reality (AR) and Virtual Reality (VR):
 AR Applications: Computer vision is integral to AR, overlaying digital information onto the
real-world environment. This is employed in gaming, navigation, and education.
 VR Immersion: In VR, computer vision enhances the sense of immersion by tracking the user's
movements and accordingly adapting the virtual environment.
 Robotics:
 Object Manipulation: Robots equipped with computer vision systems can perceive and
manipulate objects in their environment. This is crucial for tasks such as pick-and-place
operations in manufacturing.
 Autonomous Navigation: Robots use vision systems as inputs to navigate through
environments, avoiding obstacles and making decisions.
 Retail and E-Commerce: Product Recognition: Computer vision is used for automatic product
recognition, enabling features like visual search. Users can take a picture of an item, and the
system identifies and finds similar products for purchase.
 Inventory Management: Vision systems assist in inventory tracking by automating the
counting and monitoring of stock levels.
 Accessibility:
 Assistive Technologies: Computer vision contributes to the development of assistive
technologies for individuals with visual impairments. Applications include object recognition,
scene description, and text-to-speech conversion.
 Entertainment and Creative Industries:
 Special Effects: Computer vision is extensively used in the entertainment industry for creating
realistic visual effects in movies and games.
 Gesture Recognition: Vision systems enable gesture recognition, allowing users to interact with
devices and applications through natural movements.
 Agricultural Technology: Crop Monitoring: Computer vision is employed in precision
agriculture for tasks such as monitoring crop health, detecting diseases, and optimizing yield
through automated analysis of satellite and drone imagery.
 Research and Exploration:
 Space Exploration: Computer vision aids in the analysis of images and data from space
missions. It is used for terrain mapping, object recognition, and exploration planning.

 Advantages of Computer Vision


 Automation: Computer vision enables the automation of tasks that traditionally required human
visual perception, leading to increased efficiency and productivity.
 Accuracy: When properly trained and configured, computer vision systems can achieve high
levels of accuracy in tasks such as image recognition, object detection, and classification.
 Speed: Computer vision algorithms can process large amounts of visual data at high speeds,
making them suitable for real-time applications in various domains.
 Objectivity: Unlike human observers, computer vision systems are not influenced by emotions
or biases, providing more objective and consistent results.
 Cost Savings: Automation through computer vision can lead to cost savings by reducing the
need for manual labour in tasks like quality control, inspection, and monitoring.
 24/7 Operation: Computer vision systems can operate continuously, allowing for round-the-
clock monitoring and analysis, which may not be feasible for human operators
 Data Analysis: Computer vision facilitates the extraction of valuable insights from visual data,
contributing to data-driven decision-making in various industries.

 Disadvantages of Computer Vision


 Complexity: Developing and implementing computer vision systems can be complex and may
require expertise in computer science, specialized mathematics, and machine learning.
 Data Dependency: The performance of computer vision algorithms is highly dependent on the
quality and quantity of training data. Biased or insufficient data can lead to inaccurate results.
 Limited Generalization: Some computer vision systems may struggle to generalize well to new
or unseen scenarios, especially if they were trained on a narrow dataset.
 High Computing Power: Training and running sophisticated computer vision models may
require substantial computing resources, which can be costly.
 Ethical Concerns: Issues related to privacy, surveillance, and potential misuse of computer
vision technologies raise ethical concerns and require careful consideration.
 Interpretability: Deep learning models used in computer vision are often considered "black
boxes," making it challenging to interpret their decision-making processes.
 Vulnerability to Adversarial Attacks: Some computer vision systems, especially those based
on deep learning, are susceptible to adversarial attacks, where carefully crafted inputs can
mislead the system.
 Integration Challenges: Integrating computer vision systems into existing
workflows and technologies can be challenging, especially in industries with established
practices.

 Real Example: Tesla Autopilot


Object detection in autonomous vehicles is a critical application of computer vision that involves
identifying and locating various objects in the surroundings, such as pedestrians, other vehicles, road
signs, and obstacles. Tesla's Autopilot is a well-known and widely used example of object
detection in autonomous vehicles.
Autopilot utilizes a combination of sensors, including cameras, radar, and ultrasonic sensors, to perceive
and interpret the environment in real-time.
 Object Recognition: Tesla's cameras capture images of the surrounding environment,
and computer vision algorithms process these images to recognize and classify objects. This
includes identifying other vehicles, pedestrians, cyclists, and static obstacles.
 Lane Detection: Computer vision algorithms analyze the road markings and detect lanes,
allowing the vehicle to stay within its lane and navigate safely.
 Traffic Sign Recognition: Vision systems in Tesla vehicles can recognize and interpret traffic
signs, including speed limits, stop signs, and traffic signals. This information is used to adjust
the vehicle's speed and respond to traffic rules.
 Obstacle Avoidance: Autopilot employs computer vision to identify and track potential
obstacles in the vehicle's path. The system can take corrective actions, such as steering or
braking, to avoid collisions.

 How it Works.
 Data Collection: Tesla vehicles are equipped with multiple cameras that continuously capture
images of the surroundings.
 Image Processing: Computer vision algorithms process these images to extract relevant
information, such as the presence of objects, their locations, and their characteristics.
 Object Detection: Through deep learning and convolutional neural networks (CNNs),
the system identifies and classifies objects in the environment, for example distinguishing
between a car, a pedestrian, or a cyclist.
 Decision Making: The processed information is used by the vehicle's control system to make
real-time decisions, such as adjusting speed, steering, and braking.
 Feedback Loop: The system continuously receives feedback from sensors, updates its
understanding of the environment, and adapts its actions accordingly.
This example showcases how computer vision, through object detection, is a key enabler of
autonomous vehicles, making driving safer and more efficient, and eventually paving the way for
fully autonomous transportation systems.
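The decision-making and feedback steps described above can be pictured with a toy control loop. This is only an illustrative sketch and does not describe how Tesla's Autopilot is implemented: the `Detection` record, the `decide` function, the one-second reaction time, and the 6 m/s² deceleration are all hypothetical. The rule compares the distance to the nearest object in the vehicle's path with the stopping distance, taken as reaction distance plus braking distance v²/(2a).

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One object reported by the (hypothetical) object detection stage."""
    label: str          # e.g. "car", "pedestrian", "cyclist"
    distance_m: float   # estimated distance ahead of the vehicle, in metres
    in_path: bool       # whether the object lies in the vehicle's lane


def decide(detections, speed_mps, reaction_time_s=1.0, max_decel_mps2=6.0):
    """Return 'brake', 'slow' or 'cruise' for the current camera frame."""
    distances = [d.distance_m for d in detections if d.in_path]
    if not distances:
        return "cruise"
    nearest = min(distances)
    # Stopping distance = distance covered while reacting + braking distance.
    stopping = speed_mps * reaction_time_s + speed_mps ** 2 / (2 * max_decel_mps2)
    if nearest <= stopping:
        return "brake"
    if nearest <= 2 * stopping:
        return "slow"
    return "cruise"


# Feedback loop: every new frame yields fresh detections and a fresh decision.
frames = [
    [Detection("car", 200.0, True)],
    [Detection("pedestrian", 80.0, True), Detection("cyclist", 15.0, False)],
    [Detection("pedestrian", 40.0, True)],
]
actions = [decide(frame, speed_mps=20.0) for frame in frames]
print(actions)
```

At 20 m/s the stopping distance under these assumptions is about 53 m, so the pedestrian at 80 m triggers slowing and the pedestrian at 40 m triggers braking.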

CASE STUDY: SMART SPEAKER


 Introduction:
Alexa is an artificial-intelligence-based voice assistant developed by Amazon and
first installed in the Amazon Echo smart speaker. Alexa devices have a minimum of four built-in
microphones, which listen for and detect spoken commands; the commands are sent to the Amazon
cloud, where they are further assessed and processed. By 2019 Amazon had developed partnerships with
third parties to install Alexa as the voice assistant in over 28,000 devices, creating the Alexa
ecosystem. It has transformed the lives of millions by becoming part of their daily routine. It has a
wide spectrum of uses, from listening to music to home automation, and can be used by any age group.
Like everything, life with artificial intelligence has its pros and cons: Alexa has made life easier by
getting jobs done with a single voice command, but it has also exposed us to multiple risks related to
our privacy.

 Case background
Amazon launched the Echo smart speaker in 2014, followed by the Echo Dot, Echo Plus and Echo
Dot Kids Edition, and created the Alexa ecosystem in association with third parties. Alexa became one
of the fastest-growing technical innovations, as the Echo Dot enabled customers to ask questions, make
calls, check the news, play music and control their homes through smart home devices using artificial
intelligence technology, Wi-Fi and Bluetooth. It also provided a children's content platform called
FreeTime Unlimited, which gave kids access to educational content, games, audiobooks, movies and TV
shows. This indicates that Alexa brought a technological revolution and affected the lives of millions.
Alexa was introduced as part of Amazon's mission to offer products that provide ease and comfort in
customers' daily lives, but in several respects the opposite happened. Alexa could not respond to highly
complex language, as it was initially trained on American English. It also ran into multiple issues,
such as endangering the privacy and security of user data and affecting the cognitive growth of
children who use it.

 Key issues raised in the case study


 Data custodianship:
o This was a major issue raised in the case study: several incidents showed that the device
can be controlled by unauthorized parties within hearing range of the speaker. If such an
incident results in a money transfer or an order for valuable goods, it is unclear who
would be responsible for the resulting loss.
 Cyber security:
o Hackers can access personal data or listen to private conversations using a technique
called voice squatting, which can turn Alexa into a silent spy.
 Customer privacy and data security:
o A Bloomberg report in 2019 disclosed that Amazon employees listened to as many as 1,000
audio clips per nine-hour shift and uploaded the data to the AI system in order to improve
speech recognition and the response to commands. This is neither an ethical nor a
customer-friendly approach. It raises the question of whether the data is safe with the
company and whether the company would refrain from sharing it with third parties even
when profits are involved.
The next section analyses these issues using the PEST model.

 Analysis-PEST model

 Economic factors
o Voice shopping was expected to have a bigger impact on the consumer journey, with
consumers preferring to search for products through an Alexa device rather than a Google
search. This has not happened: only 2% of people used an Alexa device for shopping,
contributing a tiny fraction of the company's sales revenue. Digital personal assistants
can alter purchasing behaviour by taking away the consumer's independence. In some cases
pets were found to have ordered food online via Alexa, which suggests that the device
can also be misused by thieves and children.
 Social factors
o Alexa affects the cognitive growth of children. Learning happens when a child is
exposed to a challenge by a parent, peer or mentor. With Alexa, parental guidance and
correction are missing, so a child who uses offensive language goes uncorrected, which
reinforces the negative behaviour. Children need human interaction, guidance and
correction for their cognitive development and for reinforcing positive behaviour.
Excessive interaction with the virtual assistant reduces human interaction; it hinders the
development of creative and critical thinking and reduces patience among children.
Sharing children's information with selected affiliated businesses raised concerns
about children's privacy. Alexa could be misused by hackers, as it has the ability to
listen, record and transmit data. Hackers could gain access to personal financial
information and misuse it. Alexa has also changed behavioural patterns in society: the
frequency of listening to music increased by 63%, 38% of users check the weather forecast
multiple times a day, 56% play games multiple times a day, and time spent watching TV has
been displaced by listening to audio such as podcasts via smart speakers.
 Technological factor:
o Alexa was primarily trained on American-accented English, so non-native speakers
experienced significant difficulty communicating with it. In 2018 it was able to
understand and answer only 82% of 5,000 questions. Alexa failed to provide a real-time
location during an emergency call made for home security. Alexa-enabled gadgets are at
risk of being hacked: because Alexa is not trained to respond to only one person's voice,
a manipulated voice can be used to misuse it.
 Solution:
Alexa had difficulty understanding non-native speakers and speakers with accents of English
other than American. Because it is a product used globally, NLP needs to be extended to different
languages and accents so that understanding and responses improve. Alexa's speech also failed to
convey the correct emotion, so the team needs to work on the text-to-speech system to achieve a
natural sound. Alexa has fallen short on privacy and security, data custodianship, attacks by
hackers, voice misinterpretation and voice squatting, so a suitable risk management framework needs
to be developed. There should be an end-to-end encryption policy under which customer data is not
revealed to third parties. The concern of educators and parents about the cognitive development of
children who use Alexa is serious. Amazon should strategically design the Echo Dot Kids Edition to
address this concern: it should set a limit on screen viewing and help children understand the
difference between technology and a real person and interact with more people. Parents can restrict
children's access to promote their healthy growth.
CASE STUDY: SELF-DRIVING CAR
 Introduction:
The advent of driverless and automated vehicle technologies offers enormous
opportunities. It will make driving easier, improve road safety, reduce emissions, and ease
congestion. It will also enable drivers to choose to do other things while riding in a self-driving car.
Ultimately access to fully automated vehicles will also improve mobility for those unable or
unwilling to take the wheel, enhancing their quality of life. As a result, driverless vehicles could
provide significant economic, environmental and social benefits.

A self-driving or driverless car is a vehicle that is capable of sensing its environment and
navigating without human inputs. For a long time, engineers have worked on creating safer cars to
reduce the number of road accidents. They have focused a lot of energy on the most unreliable link
in the entire driving process – Humans.

Despite the increasing sophistication of modern vehicles, and greater application of driver-
assistance technologies, the driver still has to concentrate on driving 100% of the time. Highly and
fully automated vehicles will change this. For the first time since the invention of motor vehicles,
the ‘driver’ will be able to choose whether they want to be in control, or to hand the task of
driving over to the vehicle itself. This represents a major opportunity – allowing drivers to safely
use the journey time however they wish, from reading a book, to surfing the web, watching a
film or just chatting face to face with other passengers.

Human error is a factor in over 90% of collisions. Humans failing to look properly,
misjudging other road users’ movements, being distracted, careless or in too much of a hurry are
the most common causes of collisions on our roads. Automated vehicles will not make these
mistakes. They use a range of sensors which will constantly monitor their surroundings. We have
come to rely on many technologies that assist the driver of a vehicle, including Anti-lock Braking
Systems (ABS), cruise control or parking sensors. As these technologies evolve, they are reaching
the point where a vehicle is capable of operating for periods of time with reduced, or in some
instances without, driver input. By communicating with their environment and other vehicles,
automated and driverless vehicles offer the promise of better use of road space, reducing
congestion and providing more consistent journey times, through the use of “connected vehicle”
technologies. “Connected vehicles” would communicate with each other and their surroundings to
identify the optimum route. Vehicles could also communicate with roadside infrastructure such as
traffic lights and use this information to minimize fuel consumption and emissions.

As wonderful and brilliant as the human brain is, it has been observed to get distracted quite
frequently, an unavoidable problem given the amount of information the brain gains and
processes. Hence, to assist the driver and to account for the possible distractions of the brain,
engineers have come up with a variety of driver assist systems. Over time they have decided to
replace the driver entirely, trusting sensors instead of sense organs and microcontrollers
instead of the brain. Below is an image capturing and highlighting this change.

Estimated congestion and parking cost reductions, energy savings and emission reductions
are also uncertain due to interactive effects.

A capture from “Driverless Cars: Optional by 2024, Mandatory by 2044”,


http://spectrum.ieee.org/transportation/advanced-cars/driverless-cars-optional-by-2024
mandatory-by-2044#WhatCouldGoWrong

The above figure shows how driver assist systems like cruise control, parking assist,
steering correction etc. have given way to completely autonomous cars.
 Overview of Self driving car:

Sensors act as the sense organs of the car, capturing hundreds of images per second;
these images are fed into a system that converts them into coded data. Based on the parameters
in that data, the microcontroller makes decisions and actuates the steering, brake, accelerator,
etc. (The image below shows how the car views the world around it.)

Cameras need to be at least mono-vision cameras, which means they have one source of
vision. Mono-vision cameras are very simple devices and the video feed is usually used for
understanding basic surroundings—typically fixed infrastructure like lane markings, speed limit
signs, etc. The hardware itself is simple and cheap. Automotive mono-vision cameras are less
sophisticated and have lower pixel density than cameras on smartphones. However, the challenge
is on the software side, which involves fast image processing to recognize common roadside
infrastructure from a simple black and white relatively low-resolution image. The next stage up is
stereovision cameras, which use two video sources, similar to human eyesight. This incorporates
depth perception and can help the car better understand the relative position of moving traffic and
potential obstacles. Apart from object detection, the cameras can be used for various other
applications, including reading speed limit signs, headlight high beam de-activation in case of an
approaching vehicle, light sensing, etc.
A capture from “special report driverless car infrastructure”
http://www.protradertoday.com/report/driverless-car-infrastructure/1527
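The depth perception that stereovision cameras provide can be illustrated with the standard pinhole stereo relation, depth = focal length × baseline / disparity, where disparity is the horizontal shift of the same object between the left and right images. The relation is not stated in the text above; it is a textbook model assumed here, and the focal length, baseline and disparity values are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Estimate distance to an object seen by two horizontally offset cameras.

    focal_length_px: camera focal length expressed in pixels
    baseline_m:      distance between the two camera centres, in metres
    disparity_px:    horizontal shift of the object between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a visible object")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera pair: 700-pixel focal length, cameras 0.5 m apart.
near = depth_from_disparity(700.0, 0.5, 35.0)   # large shift: object is close
far = depth_from_disparity(700.0, 0.5, 7.0)     # small shift: object is far away
print(near, far)
```

The smaller the disparity, the farther the object, which is why distance estimates from stereo cameras become less precise at long range.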

In addition to visual confirmation of its surroundings, the car also collects sensory images
using radar systems. There are two typical types of radar systems— short-range and long-range,
which are usually mutually exclusive. Short-range radar, as the name indicates, "feels" around the
car's immediate surroundings, especially at low speeds, while long-range radar is used at high
speeds and over relatively long distances. It is the combination of long distance radar plus
algorithm-based processing of images from stereovision cameras that gives the autonomous car
the capability of knowing, with a reasonably high degree of accuracy, exactly what is in front of it
and how the positions and profiles of external objects are changing at all times.

LIDAR (Light Detection and Ranging) applies the ranging principle of radar to reflected laser
light to create a 3D profile of the surroundings of the car. LIDAR is extensively used today in
marine, archeological, and mapping applications.
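How a laser return becomes one point of a 3D profile can be sketched with the time-of-flight relation: the pulse travels to the object and back, so range = speed of light × round-trip time / 2. The function names and the sample pulse timing below are hypothetical, and the angle convention (azimuth measured in the horizontal plane, elevation measured up from it) is an assumption made for this illustration.

```python
import math

SPEED_OF_LIGHT_MPS = 299_792_458.0  # metres per second


def range_from_time_of_flight(round_trip_s):
    """Distance to the reflecting object from the pulse's round-trip time."""
    return SPEED_OF_LIGHT_MPS * round_trip_s / 2.0


def to_cartesian(range_m, azimuth_deg, elevation_deg):
    """Convert one range/angle measurement into an (x, y, z) point."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = range_m * math.cos(el) * math.cos(az)
    y = range_m * math.cos(el) * math.sin(az)
    z = range_m * math.sin(el)
    return (x, y, z)


# A pulse that returns after 200 nanoseconds hit something about 30 m away.
distance = range_from_time_of_flight(200e-9)
point = to_cartesian(distance, azimuth_deg=90.0, elevation_deg=0.0)
print(round(distance, 3), point)
```

Repeating this for many pulses fired at different angles yields the cloud of points that forms the 3D profile.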

While the cameras, radar, and LIDAR are used for obstacle and environment monitoring,
sensors are used extensively to understand what is happening with the car itself. In addition to
navigating the roads, the autonomous car also needs to monitor itself to know that it is not traveling
over the speed limit or if something is wrong with the car and it has to pull over. Sensors of all
kinds are already extensively used in cars, including acceleration sensors, pressure sensors, light
sensors, etc. We expect a meaningful step up in sensor content in the car, especially in the active
safety and human-machine interface (HMI) areas.

Autonomous cars will need reliable, high-speed two-way data communications equipment
for navigation, V2V/V2X communication, and content reception. This will include antennas, 4G
receivers, and GPS receivers. Autonomous cars will also likely need to have sophisticated event
data recorders or black boxes, similar to planes, given the high level of automation, in the event of
an accident or failure.

Self-driven cars are an ideal system to study from a system architecture standpoint as they
are in the very initial stages; the concepts are still evolving and the stakeholders are not yet decided
on whether to support or to oppose them.
An example of a Google car’s internal map at an intersection, tweeted by Idealab founder
Bill Gross http://robohub.org/how-do-self-driving-cars-work/

 Autonomous Vehicle Equipment and Service Requirements

 Automatic transmissions.
 Diverse and redundant sensors (optical, infrared, radar, ultrasonic and laser) capable of
operating in diverse conditions (rain, snow, unpaved roads, tunnels, etc.).
 Wireless networks. Short range systems for vehicle-to-vehicle communications, and
long-range systems to access maps, software upgrades, road condition reports, and
emergency messages.
 Navigation, including GPS systems and special maps.
 Automated controls (steering, braking, signals, etc.)
 Servers, software and power supplies with high reliability standards.
 Additional testing, maintenance and repair costs for critical components such as sensors
and controls.

The convergence of sensor-based safety systems and connected vehicle technology will have
far-reaching implications as the technology matures and becomes pervasive. Crash elimination,
reduced need for new infrastructure, data challenges, new models for vehicle ownership, travel
time dependability, productivity improvements, and improved energy efficiency are a number of
the major implications.
 Advantages and disadvantages

• Advantages

 Reduced driver stress. Reduce the stress of driving and allow motorists to rest and work
while traveling.
 Reduced driver costs. Reduce costs of paid drivers for taxis and commercial transport.
 Mobility for non-drivers. Provide independent mobility for non-drivers, and therefore
reduce the need for motorists to chauffeur non-drivers, and to subsidize public transit.
 Increased safety. May reduce many common accident risks and therefore crash costs and
insurance premiums. May reduce high-risk driving, such as when impaired.
 Increased road capacity, reduced costs. May allow platooning (vehicle groups traveling
close together), narrower lanes, and reduced intersection stops, reducing congestion and
roadway costs.
 More efficient parking, reduced costs. Can drop off passengers and find a parking space,
increasing motorist convenience and reducing total parking costs.
 Increase fuel efficiency and reduce pollution. May increase fuel efficiency and reduce
pollution emissions.
 Supports shared vehicles. Could facilitate car-sharing (vehicle rental services that
substitute for personal vehicle ownership), which can provide various savings.

• Disadvantages

 Increases costs. Requires additional vehicle equipment, services and maintenance, and
possibly roadway infrastructure.
 Additional risks. May introduce new risks, such as system failures, be less safe under
certain conditions, and encourage road users to take additional risks (offsetting behavior).
 Security and Privacy concerns. May be used for criminal and terrorist activities (such as
bomb delivery), may be vulnerable to information abuse (hacking), and features such as GPS
tracking and data sharing may raise privacy concerns.
 Induced vehicle travel and increased external costs. By increasing travel convenience and
affordability, autonomous vehicles may induce additional vehicle travel, increasing
external costs of parking, crashes and pollution.
 Social equity concerns. May have unfair impacts, for example, by reducing other modes’
convenience and safety.
 Reduced employment and business activity. Jobs for drivers should decline, and there may
be less demand for vehicle repairs due to reduced crash rates.
 Misplaced planning emphasis. Focusing on autonomous vehicle solutions may discourage
communities from implementing conventional but cost-effective transport projects such as
pedestrian and transit improvements, pricing reforms and other demand management
strategies.

Stakeholder Analysis:
The term “stakeholder” has gained broader usage since R. Edward Freeman’s 1984
publication of Strategic Management: A Stakeholder Approach, which firmly established the term
and the importance of stakeholder management as an active task. However, over the years it has
come to mean any and all parties touched by the system, with the net result that “managing
stakeholders” is often interpreted as a downstream public relations activity rather than an
upstream process of identifying and serving potential customers.

Beneficiaries are those who benefit from your actions. Your architecture produces an outcome
or output that addresses their needs. It is beneficiaries whom we must examine in order to list the
needs of the system.

Stakeholders are those who have a stake in your product or enterprise. They have an
outcome or output that addresses your needs, which makes them important to the company. This is
much closer to the original concept of stockholders, who supply cash (which the firm needs) in
return for a stake in the firm, notably a share in the firm’s profits.

At the center of the diagram we have Beneficial Stakeholders, who both receive valued
outputs from us and provide valued inputs to us. Beneficiaries who are not stakeholders we call
Charitable Beneficiaries in that the firm receives no return (however indirect) from the outputs
provided to them. Stakeholders who aren’t beneficiaries are Problem Stakeholders in the sense that
you need something from them, but there is nothing they need that you can provide in return.
Figure: stakeholders and beneficiaries

There are many beneficiaries and stakeholders of a product/system, both internal and
external to the organization. Internal beneficiaries and stakeholders include technology developers,
design teams, implementation, operations, sales, service, management/strategy, marketing, and so
on. External beneficiaries and stakeholders include regulators, customers, operators, suppliers,
investors, and potentially competitors.

The main stakeholders in a self-driving car project are:

 Customer (driver/Owner)
 Regulators/NGOs/Government
 Suppliers
 Investors/ Shareholders
 Oil/Electrical companies
 People
 Employees
 Marketing
 Insurance companies
 Competitors
 Infrastructure investment
 Customer (driver/Owner)

 The consumer who purchases and “drives” the car. This stakeholder requires a good price,
efficiency (gas, time), good design (good model, exterior and interior design), a
comfortable driving/travelling experience, baggage space, safety, and ease of use.

 Regulators/NGOs/Government

 Organizations that make rules or regulations that the car must satisfy, including
technological standards (security of the network and of the control system, and what
provision exists if some parts fail while on the road), gas emission regulations, and
speed limits. Political support and assurance that the team does not ignore moral
requirements are the major needs with respect to the system.

 Suppliers

 Companies that supply different structural parts of the car such as sensors, tires, etc.
 Revenue and potential technological innovation are the major needs with respect to the system.

 Investors/ Shareholders

 Companies or individuals who invest money and/or technology in the project.
High short-term revenue and a clear understanding of potential project
benefits are the major needs with respect to the system.

 Oil/Electrical companies

 Oil and electricity providers for the system. Their major need is revenue from the use of
their products, earned both directly from the project and from consumers.
 People

 People in general who are exposed to or affected by car operation, including drivers and
pedestrians. Low noise, no emissions, size, security/safety, and a car that obeys traffic
laws are the major needs with respect to the system.
 Employees
People who will work on the project development, including engineers, code
writers, managers, and more. Project success, fulfilling work, and learning new
techniques are the major needs with respect to the system.

 Marketing

Advertise product, ensure proper media attitude towards vehicle. Marketers want money,
high sales, and low levels of controversy from the project.

 Insurance companies

 Companies in charge of insuring the driver - High security index (to adjust
insurance prices) and revenue are the major needs with respect to the system.

 Competitors

 Companies working on similar product development, such as Tesla, or offering
alternatives to autonomous driving (Uber). They want project failure (if our
project fails they succeed), though project success could result in a sales boost for them
as the concept reaches more people.

 Infrastructure investment

 New electric stations, better roads, etc. This creates employment, and a government that
builds infrastructure earns votes, taxes, and payment for the electricity consumed.

 Needs are a product/system attribute in that you build a system to meet needs.
Needs exist in the mind (or heart) of the beneficiary, and they are often expressed
in fuzzy or general (ambiguous) terms. Needs can be unexpressed or even
unrecognized by the beneficiary. Needs are primarily outside of the producing
enterprise; they are owned by a beneficiary.

A central tenet of stakeholder theory is that stakeholder value results from an exchange. We identify
the needs of the beneficiaries so that we can architect our system to satisfy these needs. This is the first
half of the exchange: our outputs or outcomes satisfy the beneficiary’s needs. The other half of this
exchange is that stakeholders have outputs or outcomes that meet our input needs.
Figure Value delivery as an exchange

When identifying needs, we check for completeness by examining whether beneficiaries
produce outputs that we need, and whether stakeholders need inputs from us. If we take the existing
list of stakeholders and their needs, we can observe that they are interrelated.

Figure: stakeholder value network


The structure of the network generated by the team is a meshed network. The meshed
stakeholder network for the Google car project contains a central stakeholder, a set of bilateral
interactions, and indirect value delivery (longer value loops). A meshed network allows for value
flows between any two stakeholders, rather than only between the project and a stakeholder.

The value loops in a hub network exist only between the central project and one of the
other stakeholders. In comparison, the value loops in a meshed network could show an indirect
exchange of needs. In general, a loop or value loop is a value chain that begins and ends with the
same stakeholder. These loops are described below, and can flow through many stakeholders
before the loop returns to the project.
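
The idea of a value loop can be sketched in a few lines of Python. The stakeholder names and arrows below are hypothetical placeholders, not the flows of the actual stakeholder value network; the sketch only illustrates how loops that begin and end at the project can be listed in a meshed network.

```python
from typing import Dict, List

# Hypothetical value flows: each key delivers value to the listed stakeholders.
FLOWS: Dict[str, List[str]] = {
    "Project": ["Customer", "Suppliers"],
    "Customer": ["Project", "Insurance"],
    "Insurance": ["Investors"],
    "Investors": ["Project"],
    "Suppliers": ["Project"],
}


def value_loops(flows: Dict[str, List[str]], start: str = "Project") -> List[List[str]]:
    """Return every simple loop that begins and ends at `start`."""
    loops: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        for nxt in flows.get(node, []):
            if nxt == start:
                loops.append(path + [start])   # loop closed back at the project
            elif nxt not in path:
                walk(nxt, path + [nxt])        # keep following the value chain

    walk(start, [start])
    return loops


if __name__ == "__main__":
    for loop in value_loops(FLOWS):
        print(" -> ".join(loop))
```

In a hub network every loop would have the form Project -> X -> Project; the longer loop through Customer, Insurance, and Investors is the kind of indirect exchange that only a meshed network can show.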

Supply Importance scores are indicated by the ellipse shapes with either low, medium, or
high written inside of them. Benefit scores are indicated by the color of the arrow connecting two
elements on the SVN. Value Flow Score is indicated by the decimal number written next to each
arrow.

Supply Scores

Most supply scores are simple, but explanations for the more complicated scores follow.
The market dominance value paths between competitors and the Google self-driving car have low
supply scores because, while there is plenty of competition in the car industry, there is very little
competition specifically in the autonomous vehicle market, which is the market that we care about.
There is a medium supply score for fuel or electricity to the project because the relative lack of
electricity balances out the surplus of available gasoline. There is a high amount of cash available
to be invested by investors but low amounts of cash to be paid out by the project, at least initially.
The supply of jobs created by this project is low because the project would be led mostly by Google
employees and would only need to hire employees to manufacture vehicles; because there
would be a low supply of self-driving cars initially, not many new jobs are available.

Benefit Scores

The self-driving car will need to navigate itself more accurately than current GPS
technology allows. Therefore the GPS technology incorporated into the vehicle will need to be
top of the line, and potentially better than what is available. This technology is a
must-have. The project will never get off the ground without shareholder support and money, so that
value flow is very important. The project should hire community members to manufacture
vehicles, but it is not absolutely necessary; thus, this is a should-have value flow. Finally, customers
must have both insurance and some way to propel the vehicle in order to drive it. Therefore, the
insurance-to-customer and gas/electric company-to-customer flows are must-haves.

In order to decide which stakeholder is most important, we will total the value loop score
of each loop involving a given stakeholder and divide that value by the total sum of all value loop
scores. That sum of all value loop scores is approximately 1.263424.
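
The ranking rule above can be sketched as follows. The loops and flow scores are hypothetical, and the sketch assumes that a value loop score is the product of the value flow scores along the loop, which is a common convention but is not stated explicitly in these notes.

```python
from collections import defaultdict
from math import prod
from typing import Dict, List, Tuple

# Hypothetical loops: (stakeholders visited, value flow score of each arrow).
LOOPS: List[Tuple[List[str], List[float]]] = [
    (["Customer"], [0.8, 0.9]),
    (["Customer", "Insurance"], [0.8, 0.5, 0.6]),
    (["Investors"], [0.7, 0.9]),
]


def stakeholder_importance(loops: List[Tuple[List[str], List[float]]]) -> Dict[str, float]:
    """Sum the scores of all loops a stakeholder appears in, divided by the grand total."""
    # Assumption: loop score = product of the flow scores along the loop.
    loop_scores = [prod(flow_scores) for _, flow_scores in loops]
    total = sum(loop_scores)
    per_stakeholder: Dict[str, float] = defaultdict(float)
    for (members, _), score in zip(loops, loop_scores):
        for member in members:
            per_stakeholder[member] += score
    return {name: score / total for name, score in per_stakeholder.items()}


if __name__ == "__main__":
    ranking = sorted(stakeholder_importance(LOOPS).items(), key=lambda kv: -kv[1])
    for name, share in ranking:
        print(f"{name}: {share:.3f}")
```

A stakeholder that sits on many loops, or on loops with high scores, ends up with a large share, which is how a stakeholder with little direct contact with the project can still rank highly.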

Table 3: Sum of value loop scores with considering stakeholders

As expected, customers are the most important stakeholders: they are the consumers of
the product, and if they do not approve of the project and buy vehicles, the project will not make
a profit, making everything else irrelevant. The next most important stakeholder, surprisingly, is
oil and electric companies. This is because these companies are involved in so many
loops, particularly loops involving customers. Here we should combine the quantitative results
with common sense: oil and electric companies are very important, but their role is more of a
minimum requirement that the car can be propelled relatively cheaply and efficiently. This does
explain, however, why oil companies have such a firm hand in the automobile industry: expensive
gas would mean that the vast majority of customers could not afford to drive a car. Investors rank
next. Without an initial investment and subsequent funding and support, the project could not
possibly get off the ground. While customers and sales ensure that the project is supported late
in its lifetime, investors ensure that it is supported early on.
Finally, the last two most important stakeholders are suppliers and GPS companies.

For any product/system there will be many stakeholders, each potentially with many needs.
It is hardly ever possible to meet all the goals of all the stakeholders, but there should be a well-
defined subset of all of these needs that will be represented by the goals adopted by the enterprise.

 Trade space exploration

 Formulation
To enumerate the architectures for a self-driving vehicle, the team came up with 8 important
decisions and their sub-options. These architectural decisions relate to selecting the goals or
functions to be performed by the system, function-to-form mapping, specialization of function,
characterization of form, and system decomposition. They have a large impact on the architecture
and its ability to satisfy the most important system goals and stakeholders’ needs.

 Size of the cars


“Size of the car” describes how big the Google car should be to provide enough capacity to satisfy
the stakeholder’s (customer’s) need. Cars of various sizes have different target customers. The
size of the vehicle is one of the most important decisions, influencing design metrics such as
vehicle structure, load capacity, engine choice, aerodynamics, and cost. There are four alternative
options.
1. Two-seat sports car
2. Four-seat sedan
3. Minivan
4. Bus
The canonical type for this decision is standard form: we can select only one size of car to develop
in a single architecture.
 Type of motors
“Type of motor” determines the type of the fuel, weight of the vehicle, and the powertrain of the
vehicle. This decision determines the power, fuel, safety risk, and cost of the vehicle. There are
six possible candidates for the engine choice.
1. gasoline combustion engine
2. diesel combustion engine
3. compressed natural gas engine
4. electric motor
5. hydrogen engine
6. hybrid-engine system consuming gasoline and electricity.
This is a standard-form decision.

 Type of sensors
The sensors detect the environment and provide information for the on-vehicle computer to
make driving decisions. The sensors determine the field of view (FOV) of the self-driving car. The
type of sensor also determines the accuracy of the sensed environmental data. Different sensors are
expected to be combined to provide the most complete environmental information because each
type of sensor has its own specialty in detecting the environment. In other words, the performance,
cost, and operational safety of the autonomous vehicle all rely on this decision. There are five
popular types of sensors used on self-driving vehicles.
1. Light Detection and Ranging sensors (LIDAR system)
2. laser sensor
3. ultrasonic sensor
4. camera (visual) sensor
5. radar sensor.
Since the designers can combine different sensors, this decision is of a down-selecting canonical
type.

 Choices of computers
The computing unit that makes the Google car drive autonomously can be regarded as the brain
or computing/command center of the self-driving car. The driving algorithm determines the driving
commands. The choice of computers and their embedded algorithms is the main element that
enables autonomous driving. There are four potential candidates for the computers used
on a self-driving car.
1. commercial computer
2. customized computer
3. a system combining microcontrollers and other control units
4. online (cloud) computer.
The team assumes all the computers or computing systems perform the same functions, which are
receiving the data and computing driving decisions. Thus, this is a down-selecting decision.

 Choice of methods to actuate the steering wheel, brake, throttle, etc.

The methods for steering and for other actuation purposes should be specified for a self-driving
car. While the vehicle is driving, the command inputs are determined by the on-board computer.
Passengers can change their initial inputs, such as the destination. How these
command inputs actuate the end effectors of the self-driving car is important in determining the
performance (response time, robustness, stability, and operation risk) and cost. There are two
possible ways.
1. The computer has direct access to the engine and wheels.
2. The computer controls the throttle, brake, and steering wheel to implement driving
decisions.
The first method could allow the car to get rid of its steering wheel, throttle, and brake pedal. In
other words, the passenger will not have access to control the vehicle directly. This is a
standard-form decision.

 Communication methods
The communication methods are the ways a self-driving car communicates with different targets
such as the passenger, satellites, the support team, and other self-driving cars. The car's
performance in receiving and sending information is highly correlated with this decision. The
communication methods also affect the user experience. There are four options for
communication methods:
1. Voice and vision (through a screen)
2. Short-range communication with other Google cars
3. Middle-range communication with local public services and the technical support team
4. Long-range communication via satellite to obtain information on traffic, weather, maps, etc.
This is also a down-selecting decision.

 Marketing Strategy (a separate decision, which is not considered in the enumeration
and tradespace analysis)
This decision determines how to market the Google car as a commercial product. This is important
because most of the stakeholders (except the problem ones) expect the autonomous project to
create profits. The market is therefore an important element in improving the profit of any product,
and this decision will affect the profit in the system metrics. There are five down-selecting
options.
1. Sell to private owner
2. Sell to government
3. U.S. Market
4. Global sale
5. Direct sale or through dealers

The constraints in these architectural decisions are:


1. If the bus size vehicle is selected, then the architecture must contain either a compressed
natural gas engine or a diesel combustion engine. This is because the larger vehicle requires
an engine that provides high enough power to move it without overloading.
2. If the sports car size vehicle is selected, the architecture cannot contain either a hydrogen
engine or a compressed natural gas engine.
3. If the sports car size vehicle is selected, the architecture cannot use a cloud computer. The
team assumes that one cloud computer controls multiple self-driving cars (a fleet of them)
and runs the driving algorithm for common vehicles, which optimizes the comfort and
overall performance of the self-driving car system. A sports car is special and requires
extra computing effort to achieve its high performance. Including a
special unit in a set of common units (common self-driving cars) is inefficient for a central
computing system. Thus, the sports car would have its own computing device on board.
4. If the architecture generated has no sensors present, then the architecture is infeasible. This
is because the vehicle cannot possibly be autonomous if it has no sensing capability. In an
analogy to the human body, here we would have no eyes or ears and thus would have no
way of receiving road data inputs.
5. If the architecture generated has no computer present, then the architecture is infeasible.
This is because the vehicle cannot be autonomous without a computer to do analysis. In an
analogy to the human body, here we would have no brain, and thus the input data could not
be processed into the necessary outputs.
6. A generated architecture must contain voice and vision at a minimum for communication
– this is the technology that allows the user to input a destination and receive information
from the vehicle, and without it an architecture is infeasible.

b. Enumeration
Size Calculation – Full and with Constraints

The total size of the architecture space will be computed in two ways – unconstrained, and
constrained. This will be especially useful for determining the effects of the enforced constraints.
Two types of decisions are present within our architecture description – standard form and down
selecting. The number of possibilities that each decision can take must be determined before the
number of architectures within the entire architecture space can be calculated. The following
equations provide the process needed to determine the number of possibilities for a standard form
and down selecting decision respectively. Note that the standard form rule below will be used to
find the size of the total architecture space once the size of down selecting decisions is found.
Essentially the down selecting decisions are translated into standard form such that the below rule
can be applied.

\[ \text{Size of Standard Form Decision} = \prod_{i=1}^{N} (\text{Number of Options})_i \]

\[ \text{Size of Down Selecting Decision} = 2^{N}, \quad \text{where } N = \text{Number of Options} \]


Using the two above equations, the size of the unconstrained architecture space is
calculated – the number found is confirmed using the full factorial enumeration code described
below.

\[ \text{Unconstrained Architecture Space Size} = (4 \cdot 6 \cdot 2) \cdot 2^{5} \cdot 2^{4} \cdot 2^{4} = 393216 \]
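
The arithmetic can be checked with a minimal Python sketch. The dictionary keys are illustrative labels for the six decisions that enter the enumeration (the marketing decision is excluded, as stated earlier); the option counts are the ones listed in the formulation.

```python
from math import prod

# Standard-form decisions: exactly one option is chosen.
STANDARD_FORM = {"size": 4, "motor": 6, "actuation": 2}
# Down-selecting decisions: any subset of the options may be chosen.
DOWN_SELECTING = {"sensors": 5, "computers": 4, "communication": 4}


def unconstrained_size(standard: dict, down: dict) -> int:
    """Product of option counts, with each down-selecting decision contributing 2^N."""
    return prod(standard.values()) * prod(2 ** n for n in down.values())


if __name__ == "__main__":
    print(unconstrained_size(STANDARD_FORM, DOWN_SELECTING))  # 393216
```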

Next the effect of the constraints must be considered within the enumeration in order to
determine the size of the constrained architecture space. However, the number of architectures to
eliminate from the architecture space for each constraint depends on the order that constraints are
enforced, and in general is not a simple problem. Thus, the output from the full factorial
enumeration code will be used to determine the size of the constrained architecture space. The
number of feasible architectures given our decisions and constraints is 118049.
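
The original enumeration code is not reproduced in these notes. The following is a minimal sketch of a full factorial enumeration, assuming a literal reading of the six constraints listed in the formulation; the option labels and function names are illustrative.

```python
from itertools import product

SIZES = ["sports", "sedan", "minivan", "bus"]
MOTORS = ["gasoline", "diesel", "cng", "electric", "hydrogen", "hybrid"]
ACTUATION = ["direct", "via_controls"]
SENSORS = ["lidar", "laser", "ultrasonic", "camera", "radar"]
COMPUTERS = ["commercial", "customized", "microcontrollers", "cloud"]
COMMS = ["voice_vision", "short", "middle", "long"]


def subsets(options):
    """All 2^N subsets of a down-selecting decision, as frozensets."""
    return [frozenset(o for i, o in enumerate(options) if (mask >> i) & 1)
            for mask in range(2 ** len(options))]


def feasible(size, motor, sensors, computers, comms):
    """Apply constraints 1-6 exactly as worded in the formulation (an assumption)."""
    if size == "bus" and motor not in ("cng", "diesel"):
        return False                      # constraint 1
    if size == "sports" and motor in ("hydrogen", "cng"):
        return False                      # constraint 2
    if size == "sports" and "cloud" in computers:
        return False                      # constraint 3
    if not sensors or not computers:
        return False                      # constraints 4 and 5
    return "voice_vision" in comms        # constraint 6


def enumerate_space():
    """Return (unconstrained size, number of feasible architectures)."""
    total = kept = 0
    for size, motor, _act, sens, comp, comm in product(
            SIZES, MOTORS, ACTUATION,
            subsets(SENSORS), subsets(COMPUTERS), subsets(COMMS)):
        total += 1
        kept += feasible(size, motor, sens, comp, comm)
    return total, kept


if __name__ == "__main__":
    print(enumerate_space())
```

With this literal encoding the sketch counts 393216 architectures in total and 118048 feasible ones, one fewer than the figure of 118049 reported above, so the original code presumably encoded one of the constraints slightly differently.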

Conclusion:

For future work, we think it could be interesting to include the marketing decision into our
evaluation functions. This decision is quite uncoupled from the rest of decisions and this is the
reason why we kept it out from our architecture analysis. However, we think that the marketing
strategy to promote our product would have a lot of impact on the project benefit.

The current architectural decisions focus on how to make a vehicle drive itself. However,
other decisions related to traditional vehicles are necessary. The power train design, chassis design,
and shape and strength of the hull should have all been identified in a complete and successful
vehicle architecture. Moreover, there should be more about the analysis on potential customers.
The designer and engineer should know how much a customer can afford to upgrade his/her
vehicle to fully autonomous and what are the most important features besides self-driving for a
customer.

Another thing that could be interesting to study in the future is the comparison between the
two Pareto fronts obtained during our tradespace analysis. In this report, we reduced our decision
space in order to be able to do full-factorial enumeration and, therefore, we obtained the true Pareto front.
In the previous homework assignments, however, we were not able to enumerate all possible architectures
because the decision space was too big, so we compared the Pareto fronts obtained using
random/deterministic sampling and using optimization only by checking whether they had any
architectures in common. In the future, it would be useful to compare them using the
Hypervolume Indicator (Hv), which is interesting because it captures both the proximity of the
approximated Pareto front to the true Pareto front and the distribution of the approximated Pareto
front over the objective space. This indicator would have told us which Pareto front was actually better.
Generally, however, the GA is expected to perform better than the random/deterministic enumeration,
since with this last approach it is easier to miss “good” unexplored areas of the decision space.
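
As a rough illustration of how such a comparison could work, the sketch below filters a Pareto front and computes the hypervolume for two objectives. It assumes both objectives are minimized (for example cost and risk) and uses a hypothetical reference point and made-up data; it is not the project's own evaluation code.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (both objectives minimized)."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(front: List[Point], ref: Point) -> float:
    """Area dominated by `front` and bounded by the reference point `ref`."""
    area, prev_y = 0.0, ref[1]
    for x, y in sorted(front):            # sweep from left to right
        if x < ref[0] and y < prev_y:
            area += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return area


if __name__ == "__main__":
    ref = (5.0, 5.0)                                    # hypothetical worst case
    front_a = pareto_front([(1, 4), (2, 2), (4, 1), (3, 3), (2, 4)])
    front_b = pareto_front([(2, 4), (3, 3)])
    print(hypervolume_2d(front_a, ref), hypervolume_2d(front_b, ref))
```

A larger hypervolume means the front lies closer to the ideal corner and covers the objective space more evenly, so under these assumptions the first front would be judged better.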

Our selected architecture is again a fully equipped combustion sedan with direct control of the
actuators. If in the future we include contamination models for the different types of
engines, our architectures would no longer be biased towards the combustion engine, which is the
cheapest but also the least efficient in terms of contamination. We think it would be interesting to
include the contamination level in the performance metrics. Finally, we would like to improve the
performance metrics used in future iterations of this project; it was difficult to create metrics
from scratch for a brand new technology.
