Unit 3

Uploaded by

prasanar2021aids

DEEP LEARNING MODELS AND AI ANALYST

18AIC402J

UNIT 3 – COMPUTER VISION WITH WATSON STUDIO

WATSON KNOWLEDGE STUDIO

 IBM Watson Knowledge Studio (WKS) is a cloud-based application designed
for training custom machine learning models that can understand the nuances of
domain-specific languages.

 By enabling the creation of domain-specific models, it helps businesses leverage
their unique expertise and data to gain a competitive advantage.

Features of WKS

 Custom Text Analytics: Watson Knowledge Studio allows users to create custom
machine learning models for natural language processing.

 Domain-Specific Models: Users can define and train their own domain-specific
models, also known as "annotator models," to identify specific entities and
relationships within text data.

 Annotation: The platform provides tools for users to annotate and label text data,
specifying the entities and relationships they want the model to recognize. This
annotation process is used to train the machine learning model.

 Collaboration: Watson Knowledge Studio supports collaboration among teams,
allowing multiple users to work on creating and refining annotation models
simultaneously.

WATSON KNOWLEDGE CATALOG

 IBM Watson Knowledge Catalog (WKC) is a cloud-based data cataloging and
governance tool designed to help organizations manage, organize, and analyze
their data assets.

 It provides a central repository for discovering, cataloging, and governing data, as
well as facilitating collaboration among data users.

Features of WKC

 Data Discovery: Provides a centralized location to discover and access data assets
within an organization.

 Metadata Management: Users can add and manage metadata to describe data
assets, making it easier to understand and use them.

 Data Governance: Helps enforce data governance policies, ensuring data security
and compliance.

 Collaboration: Teams can collaborate on data projects, share insights, and work
on data assets together.

 Data Integration: Integration with other IBM Watson services and third-party
tools allows for advanced analytics and AI applications.

WATSON DISCOVERY

 IBM Watson Discovery is an AI-powered search and content analysis engine
designed to uncover hidden insights from unstructured data.

 It uses natural language processing (NLP) and machine learning to analyze and
extract meaningful information from a variety of data sources, such as documents,
emails, webpages, and databases.
Features of Watson Discovery

 Content Ingestion: Watson Discovery can ingest a wide range of unstructured
data types, including text documents, images, and web pages.

 Document Enrichment: The platform enriches documents by extracting entities
(e.g., names, dates, locations), concepts, sentiment, and relationships from the text.

 Search and Query: Users can perform powerful searches and queries across large
volumes of unstructured data. Watson Discovery uses NLP to understand the
context of the search and return relevant results.

 Customization: Organizations can train Watson Discovery to understand
domain-specific terminology and concepts through machine learning models, which can be
fine-tuned to improve relevance.

WATSON AUTOAI

 Watson AutoAI is a component of IBM Watson Studio designed to simplify and
accelerate the process of building machine learning models.

 IBM Watson AutoAI automates the end-to-end process of building machine
learning models by handling tasks like data preparation, model selection, and
hyperparameter optimization.

Features of Watson AutoAI

 Automated Model Selection: Automatically selects the most appropriate machine
learning algorithms and techniques for your dataset and prediction task.

 Feature Engineering: Automatically generates and selects relevant features from
your dataset to improve model accuracy.

 Hyperparameter Optimization: Tunes hyperparameters for selected machine
learning models to optimize their performance, improving accuracy and efficiency.

WATSON OPENSCALE

 IBM Watson OpenScale tracks and measures outcomes from your AI models,
ensuring that they remain fair, explainable, and compliant regardless of where
your models were built or deployed.

 Watson OpenScale also detects and helps correct drift in model accuracy when in
production.

Key Features

 Model Monitoring: Continuously monitors deployed AI models, tracking
performance, accuracy, drift, and other metrics.

 Data Privacy and Compliance: Ensures data privacy and compliance with
regulations like GDPR by providing auditing and governance features.

 Model Deployment and Versioning: Supports deployment across various
environments (cloud and on-premises), with versioning to track changes and roll
back if necessary.

WATSON API

 IBM Watson offers a suite of powerful APIs that enable developers to integrate
AI and machine learning capabilities into their applications.

 These APIs are part of the IBM Cloud platform and allow building intelligent
applications that can understand, reason, and learn from data.

Common Watson APIs

 IBM Watson Natural Language Understanding API


 IBM Watson Language Translator API

 IBM Watson Speech to Text API

 IBM Watson Text to Speech API

 IBM Watson Assistant API

 IBM Watson Visual Recognition API

 IBM Watson Discovery API

 IBM Watson Personality Insights API

 IBM Watson Knowledge Studio API

 IBM Watson Tone Analyzer API


COMPUTER VISION

 Computer vision in AI refers to the field of study and technology that enables
computers to understand and interpret visual information from images or
videos.

 It involves the development of algorithms and techniques that allow computers to
analyze and process visual data to extract meaningful insights and make decisions
based on that data.

 It involves various tasks, including image recognition, object detection, image
segmentation, facial recognition, scene understanding, and more.

Image processing overview

 Image processing refers to the manipulation and analysis of digital images using
various techniques and algorithms.

 It involves transforming raw images into a more meaningful and visually
appealing form, extracting relevant information, and making decisions based on
that information.

COMPUTER VISION TASKS:

1. Image Classification and Tagging:

 The model looks at an image and classifies it (a dog, an apple, a person's face).

 More precisely, it predicts which class a given image belongs to.

 For example, a social media company might use it to automatically identify and
segregate objectionable images uploaded by users.

2. Object Detection and Localization

 Object detection uses image classification to identify a certain class of object and
then locates and counts each appearance of it in an image or video.

 Examples include detecting damage on an assembly line or identifying machinery
that requires maintenance.

3. Object Tracking

Object tracking follows or tracks an object once it is detected.

This task is often executed with images captured in sequence or real-time video feeds.

Autonomous vehicles, for example, must not only classify and detect objects such as
pedestrians, other cars, and road infrastructure but also track them in motion to avoid
collisions and obey traffic laws.

4. Content-Based Image Retrieval

Content-based image retrieval aims to find images in a large-scale dataset that are
similar to a query image.

It is also known as query by image content (QBIC) and content-based visual information
retrieval (CBVIR).

This technique can be used in digital asset management systems and can increase the
accuracy of search and retrieval.
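
The retrieval idea above can be sketched in a few lines of plain Python. This is only an illustration: the file names and three-number feature vectors are made up, and a real system would obtain much longer feature vectors from a trained model. Cosine similarity is one common way to compare vectors; it is an assumption here, not something the notes prescribe.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical database: image name -> feature vector (made-up values).
database = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "sunset.jpg": [0.8, 0.2, 0.1],
}

# Hypothetical feature vector of the query image.
query = [1.0, 0.0, 0.0]

# Rank database images from most to least similar to the query.
ranked = sorted(
    database,
    key=lambda name: cosine_similarity(query, database[name]),
    reverse=True,
)
print(ranked)
```

The first entries of `ranked` are the images whose features point in nearly the same direction as the query's features.
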
TYPES OF COMPUTER VISION

Computer vision is divided into three basic categories.

1. Low-level Vision: It aims to extract elementary visual information and enhance
the quality of the image data, preparing it for higher-level analysis.

2. Intermediate-level Vision: It aims to extract more meaningful features and
structures from the visual data. Tasks in this level include object recognition,
object tracking, image segmentation, motion estimation, and optical flow analysis.

3. High-level Vision: It represents the most complex level of visual understanding.
High-level vision tasks include scene understanding, object categorization, image
captioning, activity recognition, visual reasoning, and inference.

HOW DOES COMPUTER VISION WORK?

 Computer vision needs lots of data. It runs analyses of the data over and over until it
discerns distinctions and ultimately recognizes images.

 Example: To train a computer to recognize automobile tires, it needs to be fed vast
quantities of tire images and tire-related items to learn the differences and recognize a
tire, especially one with no defects.

 Two essential technologies are used to accomplish this:

1. A type of machine learning called deep learning

2. A convolutional neural network (CNN)

1) Convolutional neural networks (CNNs)

 Convolutional neural networks (CNNs) are specifically designed for computer
vision tasks.

 CNNs are able to learn to recognize patterns in images by scanning them for
features such as edges, shapes, and textures.

 This makes them well-suited for tasks such as object detection, image
classification, and facial recognition.
HOW A CNN WORKS

How a computer reads an image

 A computer represents an image as a number at each pixel.

 In this example, a black pixel has the value 1 and a white pixel has the value -1.

 A CNN compares images piece by piece. The pieces that it looks for are called
features.

 By finding rough feature matches in roughly the same positions in two images, a CNN
gets much better at seeing similarity than whole-image matching schemes.

How Do Convolutional Neural Networks Work?

 Convolutional neural networks are distinguished from other neural networks by their
superior performance with image, speech, or audio signal inputs. A trained CNN is often
reused for a new task through transfer learning, but that is a training strategy rather
than the way the network itself operates. CNNs have three main types of layers, which are:

 Convolutional layer

 Pooling layer

 Fully-connected (FC) layer

1. CONVOLUTIONAL LAYER

• The convolutional layer is the core building block of a CNN, and it is where most
of the computation occurs.

• It requires a few components: input data, a filter, and a feature map.

• Example: Let's assume that the input is a color image, which is made up of a
3D matrix of pixels.

• This means that the input has three dimensions (height, width, and depth), where the
depth corresponds to the RGB channels of the image.

• We also have a feature detector, also known as a kernel or a filter, which moves
across the receptive fields of the image, checking whether the feature is present.

• This process is known as convolution.

 The feature detector is a two-dimensional (2-D) array of weights, which represents
part of the image.

 While filters can vary in size, the filter is typically a 3x3 matrix.

 The filter is applied to an area of the image, and a dot product is calculated
between the input pixels and the filter.

 This dot product is then fed into an output array.

 The filter then shifts by the stride, and the process repeats until the filter has swept
across the entire image.

 The final output from the series of dot products from the input and the filter is
known as a feature map, activation map, or a convolved feature.

 Each output value in the feature map does not have to connect to every pixel value in
the input image. It only needs to connect to the receptive field where the filter is applied.

 For this reason, convolutional (and pooling) layers are commonly referred to as
"partially connected" layers.

 The weights in the feature detector remain fixed as it moves across the image,
which is known as parameter sharing.

 Some parameters, like the weight values, are adjusted during training through the
process of backpropagation and gradient descent.
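
The sliding dot product described above can be sketched in plain Python. This is a minimal illustration and not how a deep learning library implements it: it handles one channel, a square filter, and no padding. The 5x5 input and the vertical-edge filter are made-up values. As in most deep learning frameworks, the filter is applied without flipping (strictly speaking, a cross-correlation).

```python
def convolve2d(image, kernel, stride=1):
    """Slide a square kernel over a 2-D image (no padding) and return the feature map."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch under it.
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map

# Hypothetical 5x5 input: bright on the left (1), dark on the right (0).
image = [[1, 1, 0, 0, 0] for _ in range(5)]

# Hypothetical 3x3 filter that responds to a bright-to-dark vertical edge.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)  # each row is [3, 3, 0]
```

The same nine weights are reused at every position, which is the parameter sharing mentioned above. The feature map is large where the patch contains the edge and zero where the patch is uniform.
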

Three hyperparameters affect the volume size of the output and need to
be set before the training of the neural network begins. These include:

1. The number of filters affects the depth of the output. For example, three distinct
filters yield three feature maps, giving an output depth of three.

2. Stride is the distance, or number of pixels, that the kernel moves over the input matrix.
While stride values of two or greater are rare, a larger stride yields a smaller output.

3. Zero-padding is usually used when the filters do not fit the input image. This sets all
elements that fall outside of the input matrix to zero, producing a larger or equally sized
output. There are three types of padding:

 Valid padding: This is also known as no padding. In this case, the
last convolution is dropped if dimensions do not align.

 Same padding: This padding ensures that the output layer has the
same size as the input layer.

 Full padding: This type of padding increases the size of the output
by adding zeros to the border of the input.
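
How stride and padding change the output size can be checked with the standard formula, output = (input - filter + 2 * padding) / stride + 1, rounded down. The sketch below applies it along one dimension. The 32-pixel input and 3-pixel filter are arbitrary example values, and the function name is made up for illustration.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Output length along one dimension of a convolution."""
    return (input_size - filter_size + 2 * padding) // stride + 1

# Example: 32-pixel-wide input, 3-pixel-wide filter.
valid = conv_output_size(32, 3, stride=1, padding=0)    # no padding
same = conv_output_size(32, 3, stride=1, padding=1)     # pad by (filter - 1) / 2
full = conv_output_size(32, 3, stride=1, padding=2)     # pad by filter - 1
strided = conv_output_size(32, 3, stride=2, padding=0)  # larger stride, smaller output

print(valid, same, full, strided)  # 30 32 34 15
```

The floor division is what "the last convolution is dropped" means for valid padding: positions where the filter would hang over the edge are not computed.
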
BATCH NORMALIZATION

 Batch normalization is a technique for training very deep neural networks that
normalizes the inputs to a layer for every mini-batch.

 This stabilizes the learning process and can drastically decrease the
number of training epochs required to train deep neural networks.
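
As a rough sketch of the normalization step, the function below standardizes one feature across a mini-batch and then applies a scale (gamma) and shift (beta). In a real network gamma and beta are learned and running statistics are kept for inference. Here they are fixed defaults, and the four activation values are made up.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift it."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(variance + eps) + beta for v in values]

# Hypothetical activations of one neuron for a mini-batch of four samples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
print(normalized)
```

After normalization the values have a mean of about 0 and a variance of about 1, whatever the scale of the original activations.
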
2. POOLING LAYER

 Pooling layers, also known as downsampling layers, perform dimensionality reduction,
reducing the number of parameters in the input.

 There are two main types of pooling:

 Max pooling: As the filter moves across the input, it selects the pixel with
the maximum value to send to the output array.

 Average pooling: As the filter moves across the input, it calculates the
average value within the receptive field to send to the output array.

 While a lot of information is lost in the pooling layer, it also has a number of
benefits for the CNN.

 Pooling layers help to reduce complexity, improve efficiency, and limit the risk of
overfitting.
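
The two pooling types can be compared on a small example. This plain-Python sketch assumes non-overlapping square windows (stride equal to the window size), which is a common choice but not the only one. The 4x4 feature map is made up.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Apply non-overlapping max or average pooling with a size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled

# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]

max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
print(max_pooled)  # [[6, 4], [8, 9]]
print(avg_pooled)  # [[3.75, 2.25], [4.5, 4.0]]
```

Both outputs are 2x2, a quarter of the original values, which is the dimensionality reduction described above.
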

3. FULLY-CONNECTED LAYER

 The fully-connected layer acts as a classifier on top of the extracted features. It
behaves much like a linear classifier such as logistic regression, which is why it is used
for the final decision.

 The convolution and pooling layers extract features from the image, so they
effectively do the "preprocessing" of the data for this layer.

 This layer performs the task of classification based on the features extracted through the
previous layers and their different filters. Its function is to classify inputs appropriately,
producing a probability from 0 to 1.
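
A fully-connected layer followed by softmax can be sketched as below. The feature values, weights, and biases are made up for illustration. In a trained network the weights would come from backpropagation, and softmax is assumed here as the output activation because it is the usual choice for producing class probabilities.

```python
import math

def dense(features, weights, biases):
    """Fully-connected layer: every output is a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical flattened features from the convolution and pooling layers.
features = [0.5, 1.0, -0.5]

# Hypothetical weights: one row per output class (two classes here).
weights = [
    [1.0, 2.0, 0.0],
    [0.0, -1.0, 1.0],
]
biases = [0.0, 0.5]

scores = dense(features, weights, biases)
probabilities = softmax(scores)
print(scores)
print(probabilities)
```

Each probability lies between 0 and 1, and the class with the highest score receives the highest probability.
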

------------------------------------------------------------------------------------------------

CV FOR THE ENTERPRISE - AGRICULTURE

CASE STUDY: Convolutional Neural Networks in Detection of Plant Leaf Diseases

Problem Statement:

How can we create a neural network that classifies leaves as belonging to diseased or
non-diseased crops?

STEP 1: Each leaf image is broken into pixels according to the dimensions of the image.

For example, if the image is composed of 30 by 30 pixels, the total number of
pixels is 900.

Each pixel of the leaf image is then passed to the input layer of the neural network.

STEP 2:

 Once the input layer is determined, weights are assigned.

 All inputs are then multiplied by their respective weights and summed.

 A numerical value called the bias is then added at each perceptron.

STEP 3:

 The result is passed through an activation function, which determines the output.
This is also known as a transformation function.

 If that output exceeds a given threshold, it "fires" (or activates) the node, passing
data to the next layer in the network.

STEP 4:

 At the output layer, a probability is derived that decides whether the data belongs to
class A or class B.

 This process of passing data from one layer to the next defines this neural
network as a feedforward network.

 Now let's assume a case where the predicted output is wrong. In such a situation, we
train the neural network by using the backpropagation method.
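
Steps 1 to 4 can be traced for a single neuron with the sketch below. It is deliberately simplified: the "leaf image" is random numbers standing in for real pixel values, the weights are untrained, and one sigmoid neuron replaces the multi-layer network of the case study. The class names and the 0.5 threshold are assumptions for illustration.

```python
import math
import random

def sigmoid(z):
    """Activation function that squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

random.seed(0)

# STEP 1: a stand-in 30x30 "leaf image" flattened into 900 input values.
image = [[random.random() for _ in range(30)] for _ in range(30)]
inputs = [pixel for row in image for pixel in row]

# STEP 2: one weight per input and a bias (random and untrained here).
weights = [random.uniform(-0.05, 0.05) for _ in inputs]
bias = 0.0

# STEP 3: weighted sum plus bias, then the activation function.
probability = neuron(inputs, weights, bias)

# STEP 4: the threshold decides between the two classes.
label = "diseased" if probability >= 0.5 else "non-diseased"
print(len(inputs), probability, label)
```

With untrained weights the prediction is meaningless. Backpropagation would adjust `weights` and `bias` to reduce the error on labeled leaf images.
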

Python in Computer Vision

 Python is one of the most popular programming languages for computer vision.

 Here are some key Python libraries and frameworks commonly used in computer
vision:

1. OpenCV

2. scikit-image

3. TensorFlow

4. Keras

5. Pillow

OpenCV Package

 OpenCV (Open Source Computer Vision Library) is a popular open-source library
for computer vision and image processing tasks.

 It provides a comprehensive set of tools, algorithms, and functions that enable
developers to build applications for tasks like image and video analysis, object
detection and tracking, facial recognition, augmented reality, and more.

Package Installation: !pip install opencv-python

 After running the installation command, you can import OpenCV:

import cv2

Identify The Images:

There are two common ways to represent images:

1. Grayscale

• Grayscale images contain only shades of gray, ranging from black to white. Each pixel
stores a single intensity value, with black treated as the weakest intensity
and white as the strongest intensity.

2. RGB

• An RGB image stores a red, a green, and a blue value for each pixel, which together make
up the pixel's color. The computer retrieves those values from each pixel and puts the results
in an array to be interpreted. Note that OpenCV stores the channels in blue, green, red
(BGR) order by default.
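
The difference between the two representations can be seen by converting a tiny RGB image to grayscale by hand. The 2x2 image is made up. The weights 0.299, 0.587, and 0.114 are the standard luminosity weights used for this kind of conversion, and the helper name is only for illustration.

```python
def rgb_to_gray(pixel):
    """Collapse one (R, G, B) pixel into a single intensity from 0 to 255."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

# Hypothetical 2x2 RGB image: red, green, blue, and white pixels.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

gray_image = [[rgb_to_gray(pixel) for pixel in row] for row in rgb_image]
print(gray_image)  # [[76, 150], [29, 255]]
```

The RGB image holds three numbers per pixel and the grayscale image holds one, which is why a color image array has a third dimension of size 3.
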

Read & Display Images

To read an image, you can use the cv2.imread() function.

import cv2

img = cv2.imread(r'path/image.jpg') # returns None if the file cannot be read

To display an image using OpenCV, you can use the cv2.imshow() function.

cv2.imshow('Elon Musk', img) # window name, image array

cv2.waitKey(0) # Keep the window open and pause the program until a key is pressed.

cv2.destroyAllWindows() # Close the window, ensuring a clean termination of the program.

print('Image dimensions:', img.shape) # Print the dimensions (shape) of the image array

Example:

import cv2

img = cv2.imread(r'C:/Users/ibmtr/OneDrive/Desktop/Elon.jpg')

print('Image dimensions:', img.shape)

cv2.imshow('image',img)

cv2.waitKey(0)

cv2.destroyAllWindows()
OUTPUT:

Image dimensions: (1080, 1920, 3)

[Note: 1080 - height of the image

1920 - width of the image

3 - number of color channels in the image (blue, green, and red, in OpenCV's default BGR order)]

OpenCV - Image Processing Operations

1. Reading and displaying images: Load and display images using functions like
cv2.imread() and cv2.imshow().

2. Image resizing: Resize images using functions such as cv2.resize() to adjust the
dimensions of the image.

3. Image cropping: Extract a region of interest (ROI) from an image using NumPy array
slicing, for example img[y1:y2, x1:x2]. OpenCV has no cv2.crop() function.

4. Image rotation: Rotate images using functions like cv2.getRotationMatrix2D()
and cv2.warpAffine() to achieve desired orientations.

5. Image flipping: Flip images horizontally or vertically using functions like
cv2.flip().

6. Image filtering and smoothing: Apply various filters and smoothing techniques
such as blurring, Gaussian smoothing, median filtering, etc., using functions like
cv2.blur(), cv2.GaussianBlur(), cv2.medianBlur(), etc.

7. Image edge detection: Detect edges in images with the cv2.Canny() function.

8. Image blending: Blend two or more images together using functions like
cv2.addWeighted().

9. Image color space conversion: Convert images between different color spaces,
such as BGR to grayscale or BGR to HSV, using functions like cv2.cvtColor().

10. Image drawing: Draw shapes, lines, circles, rectangles, text, etc., on images using
functions like cv2.line(), cv2.circle(), cv2.rectangle(), cv2.putText(), etc.

PILLOW PACKAGE

 Pillow is a powerful and widely used Python library for image processing and
manipulation. It is imported under the name PIL, after the original Python Imaging Library.

 It provides a range of functions and classes to perform various operations on
images, including loading, saving, resizing, cropping, enhancing, filtering, and
transforming images.

 Pillow is an updated fork of the original PIL library, which adds support for
Python 3.x and includes several enhancements and bug fixes.

Read & Display Images

!pip install pillow # Package installation

from PIL import Image # Image module from the PIL package

image = Image.open(r'C:\Users\ibmtr\OneDrive\Desktop\Elon.jpg') # Read the image

image.show() # Display the image

image.save(r'path/output.jpg') # Save the output image; save() needs a file path (placeholder shown)


Attributes of Image Module

 Image.format: This attribute holds the file format of the image file, such as 'JPEG',
'BMP', or 'PNG'.

image.format

O/p: 'JPEG'

 Image.mode: It is used to get the pixel format used by the image. Typical values
are “1”, “L”, “RGB” or “CMYK”.

image.mode

O/p: 'RGB'

 Image.size: It holds a tuple consisting of the width and height of the image, in pixels.

image.size

O/p: (1280, 721)

 Image.width: It returns only the width of the image.

image.width

O/p:1280

 Image.height: It returns only the height of the image.

image.height

O/p: 721

 Image.info: It returns a dictionary holding data associated with the image.

Pillow - Image Processing Operations

1. Image opening and saving: Open and save images in various formats such as
JPEG, PNG, BMP, GIF, TIFF, etc., using Image.open() and Image.save().

2. Image resizing: Resize images to specific dimensions or scale them
proportionally using Image.resize().

3. Image cropping: Crop images to extract specific regions of interest using
Image.crop().

4. Image rotation and flipping: Rotate images clockwise or counterclockwise and
flip them horizontally or vertically using Image.rotate() and Image.transpose().

5. Image filtering and enhancing: Apply various filters and enhance image quality
using functions like ImageFilter.BLUR, ImageFilter.SHARPEN, and
ImageEnhance.Contrast.

6. Image manipulation and composition: Combine, paste, or blend images together
using functions like Image.alpha_composite(), Image.paste(), and Image.blend().

7. Image color adjustments: Adjust image brightness, contrast, saturation, and color
balance using functions like ImageEnhance.Brightness, ImageEnhance.Contrast,
ImageEnhance.Color, and ImageOps.colorize().

8. Image text and annotation: Draw text, shapes, and other annotations on images
using the methods of the object returned by ImageDraw.Draw(image), such as
text(), rectangle(), and line().

9. Image statistics and analysis: Calculate image histograms, compute average
color values, identify dominant colors, and perform other statistical analysis using
functions like Image.histogram() and ImageStat.Stat().
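
A few of the Pillow operations listed above are shown together in the sketch below. It creates an image in memory with Image.new() rather than opening a file, so no path is needed. The colors, sizes, and enhancement factor are arbitrary example values. It assumes the pillow package is installed.

```python
from PIL import Image, ImageDraw, ImageEnhance, ImageFilter, ImageStat

# In-memory 200x100 RGB image with a white box drawn on a dark blue background.
image = Image.new("RGB", (200, 100), color=(30, 60, 90))
draw = ImageDraw.Draw(image)
draw.rectangle([50, 25, 150, 75], fill=(255, 255, 255))

resized = image.resize((100, 50))                        # size is (width, height)
cropped = image.crop((50, 25, 150, 75))                  # box is (left, upper, right, lower)
rotated = image.rotate(90, expand=True)                  # expand grows the canvas to fit
flipped = image.transpose(Image.FLIP_LEFT_RIGHT)         # mirror horizontally
blurred = image.filter(ImageFilter.BLUR)                 # predefined blur filter
brighter = ImageEnhance.Brightness(image).enhance(1.5)   # factors above 1.0 brighten
gray = image.convert("L")                                # "L" is 8-bit grayscale
mean_per_band = ImageStat.Stat(image).mean               # average R, G, B values

print(image.size, resized.size, cropped.size, rotated.size, gray.mode)
print(mean_per_band)
```

Pillow reports sizes as (width, height), the reverse of the (height, width) order in an OpenCV array's shape.
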
