Pattern Recognition and Machine
Learning
Dr Suresh Sundaram
sureshsundaram@iitg.ernet.in
Let's get started
• Person identification systems -> biometrics, Aadhaar
Human Perception
• How did we learn the letters of the English alphabet?
We trained ourselves to recognize the letters, so that, given a new letter, we use our memory/intelligence to recognize it.
Machine Perception
• How about providing machines with similar capabilities to recognize letters?
• The field of pattern recognition does exactly that.
Idea
• Build a machine that can recognize patterns:
– Speech recognition
– Fingerprint identification
– OCR (Optical Character Recognition)
– DNA sequence identification
A basic PR framework
• Training samples
• Testing samples
• An algorithm for recognizing an unknown test
sample
• Samples are labeled (supervised learning)
Typical supervised PR problem
• Letters of the English alphabet – 26 in number (upper case)
• # of letters/classes to recognize – 26
• Collect samples of each of the 26 letters and train using an algorithm.
• Once trained, test the system using an unknown test sample/letter.
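As a concrete illustration, here is a minimal sketch of this train-then-test workflow using scikit-learn. The data here is random stand-in data (an assumption for the sketch); real letter images would replace it.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: 26 classes ('A'..'Z') with 20 flattened 8x8 "images" each.
# Random pixels are used here only so the sketch runs end to end.
rng = np.random.default_rng(0)
X = rng.random((26 * 20, 64))
y = np.repeat([chr(ord("A") + i) for i in range(26)], 20)

# Hold out unknown test samples, train on the rest, then evaluate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)           # training phase
print(clf.score(X_test, y_test))    # near chance (1/26) on random pixels
```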
Basics
So what's a pattern?
A pattern is an entity, vaguely defined, that
could be given a name, e.g.,
• fingerprint image,
• handwritten word,
• human face,
• speech signal,
• DNA sequence
• letter of an alphabet
Handwriting Recognition
[Figure: a machine-printed document vs. an input handwritten document]
Handwriting recognition
Face recognition
Fingerprint recognition
Other Applications
• Object classification
• Signature verification (genuine vs. forgery)
• Iris recognition
• Writer adaptation
• Speaker recognition
• Bioinformatics (gene classification)
• Communication System Design
• Medical Image processing
Pattern Recognition Algorithms
• A bag of algorithms that can be used to provide some intelligence to a machine.
• These algorithms have a solid probabilistic framework.
• The algorithms work on certain characteristics defining a class, referred to as 'features'.
What is a feature?
• Features across classes need to be discriminative for better classification performance.
Pattern 'l' vs. pattern 'i'
• The presence of a dot in 'i' can distinguish 'i' from 'l', and is therefore a feature.
• Feature values can be discrete or continuous (floating point) in nature.
• In practice, a single feature may not suffice for discrimination.
Feature selection
In practice, a single feature may not suffice for discrimination.
• A possible solution is to look for many candidate features and select a subset (possibly with feature selection algorithms). The goal is to improve the recognition performance on unseen test data.
• The selected features can be represented with a vector, called a 'feature vector'.
Dimension of a feature vector
• If we select d features, we can represent them with a d-dimensional feature vector.
• The pixels of an image of size M × N can be represented with an MN × 1-dimensional feature vector.
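A short sketch of this flattening step (the 28 × 28 size is just an assumed example):

```python
import numpy as np

M, N = 28, 28                    # assumed image size
image = np.random.rand(M, N)     # stand-in for a real grayscale image

# Flatten the M x N pixel grid into an MN x 1 feature vector.
feature_vector = image.reshape(M * N, 1)
print(feature_vector.shape)      # (784, 1)
```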
Feature selection
• Domain knowledge helps in extracting features.
• Feature discriminability measures, such as Fisher scores, are available to measure the effectiveness of features.
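As a sketch, one common form of the Fisher score for a single feature and two classes is the squared difference of the class means divided by the sum of the class variances; the toy numbers below are assumptions for illustration.

```python
import numpy as np

def fisher_score(x_class1, x_class2):
    """Fisher score of one feature for two classes:
    (difference of class means)^2 / (sum of class variances)."""
    m1, m2 = x_class1.mean(), x_class2.mean()
    v1, v2 = x_class1.var(), x_class2.var()
    return (m1 - m2) ** 2 / (v1 + v2)

# A feature whose class distributions are well separated scores
# higher than one whose class distributions overlap.
good = fisher_score(np.array([5.0, 5.2, 4.9]), np.array([1.0, 1.1, 0.9]))
poor = fisher_score(np.array([5.0, 1.0, 3.0]), np.array([4.8, 1.2, 3.1]))
print(good > poor)  # True
```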
List of features used in literature
• Pixels in an image
• Edge based features in an image
• Transformed coefficients
– DFT (shape description)
– DCT (compression)
– Wavelets (palm print recognition)
– KLT/PCA (face recognition)
– Gabor (texture classification, script identification)
– MFCCs (speech systems)
Features
• Features should be discriminative.
• Features are application-specific: there is no universal feature for all pattern recognition problems (Ugly Duckling Theorem).
• Features should be robust to translation, rotation, occlusion, and scaling.
Features
• Continuous, real valued
• Discrete
• Binary
• Mixed
Curse of dimensionality
• If limited data is available, too many features may degrade performance. We need a large enough number of training samples for good generalization, i.e., to beat the `curse of dimensionality'!
• This motivates techniques such as PCA to pick the `relevant' features.
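A minimal sketch of such dimensionality reduction with scikit-learn's PCA; the data is a random stand-in and the choice of 20 components is an arbitrary assumption.

```python
import numpy as np
from sklearn.decomposition import PCA

# 100 training samples with 784 raw features: few samples, many dimensions.
X = np.random.rand(100, 784)

# Project onto the directions of largest variance to reduce
# dimensionality before classification.
pca = PCA(n_components=20)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)  # (100, 20)
```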
Basic Pattern Recognition
• “Sorting incoming Fish on a conveyor
according to species using optical sensing”
[Figure: the two species to be sorted: sea bass and salmon]
• Problem Analysis
– Set up a camera and take some sample images to extract
features
• Length
• Lightness
• Width
• Number and shape of fins
• Position of the mouth, etc…
• This is the set of all suggested features to explore for use in our
classifier!
• Preprocessing
– Use a segmentation operation to isolate individual fish from
one another and from the background
• Information from a single fish is sent to a feature
extractor whose purpose is to reduce the data by
measuring certain features
• The features are passed to a classifier
• Classification
– Select the length of the fish as a possible feature for discrimination.
The length alone is a poor feature!
Next, select the lightness as a possible feature.
• Adopt the lightness and add the width of the
fish as a new feature
Fish feature vector: x^T = [x1, x2], where x1 = lightness and x2 = width.
• We might add other features that are not correlated with the ones we already have. A precaution should be taken not to reduce the performance by adding such "noisy features".
• Ideally, the best decision boundary should be the one which provides optimal generalization performance.
Prefer simple models to complicated ones: Occam's razor
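The sketch below illustrates this on hypothetical lightness/width data: a simple linear model in the spirit of Occam's razor. The cluster locations and spreads are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical 2-D feature vectors x = [lightness, width] per species.
rng = np.random.default_rng(0)
salmon  = rng.normal(loc=[3.0, 4.0], scale=0.5, size=(50, 2))
seabass = rng.normal(loc=[6.0, 7.0], scale=0.5, size=(50, 2))
X = np.vstack([salmon, seabass])
y = np.array([0] * 50 + [1] * 50)   # 0 = salmon, 1 = sea bass

# A linear decision boundary in the (lightness, width) plane.
clf = LogisticRegression().fit(X, y)
print(clf.predict([[3.2, 4.1], [5.8, 6.9]]))  # expect [0 1]
```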
• Sensing
– Use of a transducer (camera or microphone)
• Segmentation and grouping
– Patterns should be well separated and should not
overlap
• Feature extraction
– Discriminative features
– Invariant features with respect to translation, rotation and
scale.
• Classification
– Use a feature vector provided by a feature extractor to
assign the object to a category
• Post Processing
– Exploit context (input-dependent information other than the target pattern itself) to improve performance
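The stages above chain together as sketched below; every function body here is a trivial stand-in (an assumption, not from the slides) so the sketch actually runs.

```python
# Minimal sketch of the pipeline: sense -> segment -> extract -> classify
# -> postprocess. Each stage is a toy placeholder for the real operation.
def sense():              return "  A   B  "          # transducer output
def segment(signal):      return signal.split()        # isolate patterns
def extract_features(p):  return [len(p), ord(p[0])]   # toy feature vector
def classify(f):          return chr(f[1])             # toy classifier
def postprocess(labels):  return "".join(labels)       # exploit context

signal = sense()
labels = [classify(extract_features(p)) for p in segment(signal)]
print(postprocess(labels))   # "AB"
```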
The Design Cycle
• Data collection
• Feature Choice
• Model Choice
• Training
• Evaluation
• Computational Complexity
• Data Collection
– How do we know when we have collected an
adequately large and representative set of
examples for training and testing the system?
• Feature Choice
– Depends on the characteristics of the problem domain. Features should be simple to extract, invariant to irrelevant transformations, and insensitive to noise.
• Model Choice
– If we are unsatisfied with the performance of our fish classifier, we may want to jump to another class of model.
• Training
– Use data to determine the classifier. There are many different procedures for training classifiers and for choosing models.
• Evaluation
– Measure the error rate (or performance) and switch from one set of features to another.
• Computational Complexity
– What is the trade-off between computational
ease and performance?
– How does an algorithm scale as a function of the number of features, patterns, or categories?
Learning paradigms
• Supervised learning
– A teacher provides a category label or cost for
each pattern in the training set
• Unsupervised learning
– The system forms clusters or “natural groupings”
of the input patterns
Unsupervised Learning
• The system forms clusters or “natural groupings” of
the input patterns….
• Clustering is often called an unsupervised
learning task as no class values denoting an a priori
grouping of the data instances are given
[Figure: segmentation of an image into k clusters by a popular iterative algorithm, the k-means algorithm: original image vs. segmented image using k-means clustering (k = 3)]
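A sketch of such k-means segmentation using scikit-learn; the image here is a random stand-in (an assumption), whereas in practice a real image would be loaded.

```python
import numpy as np
from sklearn.cluster import KMeans

# Segment a color image into k = 3 clusters by clustering pixel colors.
H, W = 120, 160
image = np.random.rand(H, W, 3)    # stand-in H x W x 3 image

pixels = image.reshape(-1, 3)                     # one RGB sample per pixel
labels = KMeans(n_clusters=3, n_init=10).fit_predict(pixels)
segmented = labels.reshape(H, W)                  # cluster index per pixel
print(np.unique(segmented))                       # [0 1 2]
```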
Reinforcement learning
• Reinforcement learning is an area of machine
learning inspired by behaviorist psychology,
concerned with how software agents ought to
take actions in an environment so as to
maximize some notion of cumulative reward.
Semi-supervised learning
• Semi-supervised learning is a class of supervised
learning tasks and techniques that also make use
of unlabeled data for training - typically a small
amount of labeled data with a large amount of
unlabeled data.
• It falls between unsupervised learning (without
any labeled training data) and supervised
learning (with completely labeled training data).
Regression
Similar to the problem of fitting a curve to a set of points.
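For instance, a least-squares line fit is the simplest such problem; the toy points below are assumed for illustration.

```python
import numpy as np

# Least-squares line fit as a minimal regression example.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])       # roughly y = 2x + 1

slope, intercept = np.polyfit(x, y, deg=1)     # fit a degree-1 polynomial
print(round(slope, 2), round(intercept, 2))    # close to 2 and 1
```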
Classifier
[Figure: division of the feature space into distinct regions by decision surfaces]
Empirical Risk Minimization
• Every classifier/regressor performs what is called 'empirical risk minimization'.
• Learning pertains to coming up with an architecture that can minimize a risk/loss function defined on the training (empirical) data.
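In symbols (a standard formulation, not taken verbatim from these slides), with N training pairs (x_i, y_i) and a loss function L:

```latex
\[
  R_{\mathrm{emp}}(f) = \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i, f(x_i)\bigr),
  \qquad
  f^{*} = \arg\min_{f \in \mathcal{F}} R_{\mathrm{emp}}(f)
\]
```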
No-free-lunch theorem
• There ain't no such thing as a free lunch: it is impossible to get something for nothing!
• In view of the no-free-lunch theorem, one cannot hope for a classifier that would perform best on all possible problems that one could imagine.
Classifier taxonomy
• Generative classifiers
• Discriminative classifiers
• Types of generative classifier
[a] Parametric
[b] Non-parametric
Generative classifier
• Samples of training data of a class are assumed to come from a probability density function (the class-conditional pdf).
• If the form of the pdf is assumed, such as uniform, Gaussian, Rayleigh, etc., one can estimate the parameters of the distribution.
• This is a parametric classifier.
Class-conditional density: the pdf that would be built from unlimited samples of a given pattern/class.
[Figure: two pdfs corresponding to two classes, w1 and w2, constructed over the feature x ('brightness')]
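A minimal parametric sketch under a Gaussian assumption; the toy brightness samples for w1 and w2 are assumed, and equal class priors are assumed.

```python
import numpy as np
from scipy.stats import norm

# Assume each class-conditional pdf of the feature x ('brightness')
# is Gaussian, and estimate its parameters from training samples.
x_w1 = np.array([2.1, 2.4, 1.9, 2.2])   # assumed samples, class w1
x_w2 = np.array([4.0, 4.3, 3.8, 4.1])   # assumed samples, class w2

mu1, s1 = x_w1.mean(), x_w1.std(ddof=1)
mu2, s2 = x_w2.mean(), x_w2.std(ddof=1)

# Classify a new sample by the larger class-conditional density
# (equal priors assumed).
x_new = 3.0
label = "w1" if norm.pdf(x_new, mu1, s1) > norm.pdf(x_new, mu2, s2) else "w2"
print(label)
```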
Generative classifier
• One can instead use the training data directly to build the pdf: the non-parametric approach.
• Discriminative classifier: no such assumption of the data being drawn from an underlying pdf. It models the decision boundary directly, using adaptive gradient-descent techniques.
Discriminative Classifier
• Start with initial weights that define the decision surface.
• Update the weights based on some optimization criterion.
• There is no need to model the distribution of samples of a given class: the class-conditional density concept is not required!
• Neural nets (such as the MLP and the single-layer perceptron) and SVMs fall in the category of discriminative classifiers.
Discriminative classifier
Linearly separable data
• Separating line: w1x1 + w2x2 + b = 0
• The region with w1x1 + w2x2 + b > 0 lies on one side of the line; the region with w1x1 + w2x2 + b < 0 on the other.
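A tiny perceptron sketch of this idea (the cluster data and learning rate are assumptions): start from initial weights and nudge the line toward each misclassified point.

```python
import numpy as np

# Two well-separated clusters labeled +1 and -1 (assumed toy data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([1, 1], 0.3, (20, 2)),
               rng.normal([-1, -1], 0.3, (20, 2))])
y = np.array([1] * 20 + [-1] * 20)

w, b, lr = np.zeros(2), 0.0, 0.1          # initial decision surface
for _ in range(50):                        # passes over the training data
    for xi, yi in zip(X, y):
        if yi * (np.dot(w, xi) + b) <= 0:  # misclassified point
            w += lr * yi * xi              # nudge the boundary toward it
            b += lr * yi

print((np.sign(X @ w + b) == y).all())     # True once the data is separated
```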
Non-linearly separable data
Cover's Theorem
• The theorem states that given a set of training
data that is not linearly separable, one can
transform it into a training set that is linearly
separable by mapping it into a possibly
higher-dimensional space via some non-linear
transformation.
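A sketch of the classic illustration: 2-D data labeled by radius is not linearly separable, but appending the non-linear coordinate z = x1² + x2² makes it separable by a plane (the 0.5 threshold is an assumed choice).

```python
import numpy as np

# Circular data: the class depends on the radius, so no line in 2-D
# separates the two classes.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 0.5).astype(int)

# Non-linear map to 3-D: append z = x1^2 + x2^2. In this space the two
# classes are separated by the plane z = 0.5.
Z = np.column_stack([X, X[:, 0] ** 2 + X[:, 1] ** 2])
print(np.all((Z[:, 2] > 0.5) == y.astype(bool)))  # True: linearly separable
```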
Cover’s Theorem
The samples of the original data are in 2D. After a non-linear transformation, the data becomes linearly separable in three dimensions, as shown in (b).
Evaluation Metric
Consider a scenario wherein a patient is screened for a disease.
Yes: healthy
No: diseased
TP: true positive
FN: false negative
TN: true negative
FP: false positive
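A minimal sketch of counting these four outcomes, with assumed toy labels (1 = positive test outcome, 0 = negative).

```python
import numpy as np

def screening_metrics(y_true, y_pred):
    """Counts of TP, FN, TN, FP for a binary screening test."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    return tp, fn, tn, fp

# Toy labels (assumed) for ten screened patients.
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1, 0, 0])
print(screening_metrics(y_true, y_pred))  # (3, 1, 5, 1)
```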