PR Unit 1 ....

Uploaded by

RITIK UPADHYAY


Pattern Recognition | Introduction UNIT 1

Patterns are everywhere in the digital world. A pattern can either be seen physically or be
observed mathematically by applying algorithms.

Example: The colours on clothes, speech patterns, etc. In computer science, a pattern is
represented as a vector of feature values.

What is Pattern Recognition?

Pattern recognition is the process of recognizing patterns by using machine learning algorithms.
It can be defined as the classification of data based on knowledge already gained or on
statistical information extracted from patterns and/or their representation. One of the
important aspects of pattern recognition is its application potential.

Examples: Speech recognition, speaker identification, multimedia document recognition (MDR),
automatic medical diagnosis.

In a typical pattern recognition application, the raw data is processed and converted into a form
that is amenable for a machine to use. Pattern recognition involves classification and clustering
of patterns.

• In classification, an appropriate class label is assigned to a pattern based on an abstraction
that is generated using a set of training patterns or domain knowledge. Classification is used
in supervised learning.
• Clustering generates a partition of the data which helps decision making, the specific
decision-making activity of interest to us. Clustering is used in unsupervised learning.

Features may be represented as continuous, discrete or discrete binary variables. A feature is a
function of one or more measurements, computed so that it quantifies some significant
characteristic of the object.

Example: Consider a face; the eyes, ears, nose, etc. are features of the face.
A set of features taken together forms the feature vector.

Example: In the above example of a face, if all the features (eyes, ears, nose, etc.) are taken
together, then the sequence is a feature vector ([eyes, ears, nose]). A feature vector is a sequence
of features represented as a d-dimensional column vector. In the case of speech, MFCCs
(Mel-Frequency Cepstral Coefficients) are spectral features of the speech. The sequence of the
first 13 coefficients forms a feature vector.

Feature Vector:

The collection of observations is also known as a feature vector. A feature is a distinctive
characteristic of an object that sets it apart from similar objects. A feature vector is the
combination of n features in an n-dimensional column vector. Different classes may have
different feature values, while patterns of the same class are expected to have similar feature
values.

Pattern recognition possesses the following features:

• A pattern recognition system should recognise familiar patterns quickly and accurately
• Recognise and classify unfamiliar objects
• Accurately recognise shapes and objects from different angles
• Identify patterns and objects even when partly hidden
• Recognise patterns quickly, with ease, and with automaticity.

Training and Learning in Pattern Recognition

Learning is a phenomenon through which a system gets trained and becomes adaptable, so that it
gives accurate results. Learning is the most important phase, as how well the system performs
on the data provided to it depends on which algorithms are used on the data. The entire
dataset is divided into two parts: one used for training the model, i.e. the training set, and
the other used for testing the model after training, i.e. the testing set.

Training set:

The training set is used to build a model. It consists of the set of samples (for example, images)
which are used to train the system. The training rules and algorithms used give relevant
information on how to associate input data with output decisions. The system is trained by applying
these algorithms to the dataset; all the relevant information is extracted from the data and
results are obtained. Generally, 80% of the dataset is used as training data.

Testing set:

Testing data is used to test the system. It is the set of data used to verify whether the
system produces the correct output after being trained. Generally, 20% of the dataset is used
for testing. Testing data is used to measure the accuracy of the system. Example: if a system
that identifies which category a particular flower belongs to identifies seven out of ten
flowers correctly and the rest wrongly, then its accuracy is 70%.
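The 80/20 split and the accuracy calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names `train_test_split` and `accuracy`, the fixed random seed and the flower labels are assumptions made for the example, not part of the notes.

```python
import random

def train_test_split(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples and split them into a training and a testing set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# 50 hypothetical samples -> 40 for training, 10 for testing.
train, test = train_test_split(range(50))
print(len(train), len(test))  # 40 10

# Seven of ten flowers identified correctly, as in the example above.
actual = ["rose"] * 10
predicted = ["rose"] * 7 + ["lily"] * 3
print(accuracy(predicted, actual))  # 0.7
```

Shuffling before the split is a common precaution so that the testing set is not biased by the order in which the data was collected.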

Real-time Examples and Explanations:

A pattern is a physical object or an abstract notion. When talking about the classes of animals, a
description of an animal would be a pattern. When talking about various types of balls, a
description of a ball is a pattern. When balls are considered as patterns, the classes could be
football, cricket ball, table tennis ball, etc. Given a new pattern, the class of the pattern is to be
determined. The choice of attributes and the representation of patterns is a very important step in
pattern classification. A good representation is one which makes use of discriminating attributes
and also reduces the computational burden in pattern classification.

An obvious representation of a pattern is a vector. Each element of the vector can represent
one attribute of the pattern. The first element of the vector will contain the value of the first
attribute for the pattern being considered.
Example: While representing spherical objects, (25, 1) may represent a spherical object
with 25 units of weight and 1 unit of diameter. The class label can form a part of the vector. If
spherical objects belong to class 1, the vector would be (25, 1, 1), where the first element
represents the weight of the object, the second element the diameter of the object and the third
element the class of the object.
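As a small illustration of this representation, the sketch below stores each pattern as a tuple whose last element is the class label and separates the two again. The helper name `split_pattern` and the extra sample patterns are hypothetical.

```python
# Each pattern is stored as (weight, diameter, class_label), as in the text.
patterns = [
    (25, 1, 1),   # the spherical object from the example, class 1
    (30, 2, 1),   # hypothetical extra sample of class 1
    (5, 7, 2),    # hypothetical sample of a second class
]

def split_pattern(pattern):
    """Separate the feature vector from the trailing class label."""
    *features, label = pattern
    return tuple(features), label

features, label = split_pattern(patterns[0])
print(features, label)  # (25, 1) 1
```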

Advantages:

• Pattern recognition solves classification problems.
• Pattern recognition solves the problem of fake biometric detection.
• It is useful for cloth pattern recognition for visually impaired people.
• It helps in speaker diarization.
• We can recognise a particular object from different angles.

Disadvantages:

• The syntactic pattern recognition approach is complex to implement and is a very slow process.
• Sometimes a larger dataset is required to get better accuracy.
• It cannot explain why a particular object is recognized.
Example: my face vs my friend’s face.

Applications:

• Image processing, segmentation and analysis

Pattern recognition is used to give machines the human-like recognition intelligence that is
required in image processing.

• Computer vision

Pattern recognition is used to extract meaningful features from given image/video samples
and is used in computer vision for various applications such as biological and biomedical
imaging.

• Seismic analysis

The pattern recognition approach is used for the discovery, imaging and interpretation of
temporal patterns in seismic array recordings. Statistical pattern recognition is implemented
and used in different types of seismic analysis models.

• Radar signal classification/analysis

Pattern recognition and signal processing methods are used in various applications of radar
signal classification, such as AP mine detection and identification.

• Speech recognition

The greatest success in speech recognition has been obtained using pattern recognition
paradigms. It is used in various speech recognition algorithms which try to avoid the
problems of using a phoneme level of description and treat larger units such as words as
patterns.

• Fingerprint identification

Fingerprint recognition is a dominant technology in the biometric market. A number of
recognition methods have been used to perform fingerprint matching, of which
pattern recognition approaches are widely used.

• Machine Vision:

A machine vision system captures images via a camera and analyzes them to produce
descriptions of the imaged objects. For example, during inspection in the manufacturing industry,
when the manufactured objects pass in front of the camera, the images have to be
analyzed online.

• Computer Aided Diagnosis (CAD):

CAD assists doctors in making diagnostic decisions. Computer-assisted diagnosis has
been applied to medical data such as X-rays, ECGs, ultrasound images, etc.

• Speech Recognition:

This process recognizes spoken information. The software is built around a pattern
recognition system which recognizes the spoken text and translates it into ASCII characters
which are shown on the screen. The identity of the speaker can also be determined.

• Character Recognition:

This application recognizes both letters and numbers. The optically scanned image is
provided as input and alphanumeric characters are generated as output. Its major application
is in automation and information handling. It is also used in page readers, zip code readers,
license plate readers, etc.

• Manufacturing:

Here 3-D images (obtained with structured light, laser, stereo, etc.) are provided as input and,
as a result, the objects can be identified.

• Fingerprint Identification:

Here the input image is obtained from fingerprint sensors; with this technique the various
fingerprint classes are obtained and the owner of the fingerprint can be identified.

• Industrial Automation:

Here the intensity or range image of the product is provided as input, and the product is
identified as defective or non-defective.

Pattern Recognition | Phases and Activities

Process / steps / components in a Pattern Recognition System

A pattern recognition system can be divided into components, each of which corresponds to a
phase:

• Phase 1: Converts images, sounds or other inputs into signal data.
• Phase 2: Isolates the sensed objects from the background.
• Phase 3: Measures object properties that are useful for classification.
• Phase 4: Assigns the sensed object to a category.
• Phase 5: Takes other considerations into account to decide on the appropriate action.

The problems solved by these phases are as follows:

1. Sensing: It deals with problems that arise in the input, such as its bandwidth, resolution,
sensitivity, distortion, signal-to-noise ratio, latency, etc.

2. Segmentation and Grouping: One of the deepest problems in pattern recognition; it deals with
recognizing or grouping together the various parts of an object.

3. Feature Extraction: It deals with the characterization of an object so that it can be
recognized easily by measurements. Objects whose feature values are very similar are
considered to be in the same category, while objects whose feature values are very different
are placed in different categories.

4. Classification: It deals with assigning objects to their categories by using the
feature vector provided by the feature extractor, which determines the values of all of the
features for a particular input.

5. Post Processing: It deals with deciding on an action by using the output of the
classifier. The action is chosen so as to minimize the total expected cost; minimum-error-rate
classification is one such criterion.
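The five phases can be pictured as a chain of functions, each consuming the output of the previous one. The sketch below is a toy illustration only: the one-dimensional "signal", the thresholds, the category names and the actions are all assumptions chosen to keep the example short, and it assumes at least one sample survives segmentation.

```python
def sense(raw):
    """Phase 1: convert the raw input into numeric signal data."""
    return [float(v) for v in raw]

def segment(signal, threshold=0.5):
    """Phase 2: keep only samples that stand out from the background."""
    return [v for v in signal if v > threshold]

def extract_features(obj):
    """Phase 3: measure properties useful for classification."""
    return (len(obj), sum(obj) / len(obj))  # (size, mean intensity)

def classify(features, size_limit=3):
    """Phase 4: assign the object to a category."""
    size, _mean = features
    return "large" if size > size_limit else "small"

def post_process(category):
    """Phase 5: decide on an action from the classifier output."""
    return "reject" if category == "large" else "accept"

raw = ["0.1", "0.9", "0.8", "0.0", "0.7", "0.2"]
features = extract_features(segment(sense(raw)))
category = classify(features)
print(features[0], category, post_process(category))  # 3 small accept
```

Keeping each phase as a separate function mirrors the component view of the system: any one stage can be replaced (for example, a better classifier) without touching the others.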

Design principles / Activities for designing Pattern Recognition Systems

The following sequence of activities is used when designing a pattern recognition
system:

• Data Collection
• Feature Choice
• Model Choice
• Training
• Evaluation

Approaches of Pattern Recognition

In a pattern recognition system, two basic approaches are used for recognizing a pattern or
structure, and each can be implemented with different techniques. These are:
• Statistical Approach and
• Structural Approach

Statistical Approach:

Statistical methods are mathematical formulas, models, and techniques that are used in the
statistical analysis of raw research data. The application of statistical methods extracts
information from research data and provides different ways to assess the robustness of research
outputs.
Two main statistical methods are used:

1. Descriptive Statistics: It summarizes data from a sample using indexes such as the mean or
standard deviation.
2. Inferential Statistics: It draws conclusions from data that are subject to random variation.
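The two descriptive indexes named above can be computed with Python's standard `statistics` module. The sample values are hypothetical, and the population form of the standard deviation (`pstdev`) is used here purely as an illustrative choice.

```python
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

mean = statistics.mean(sample)
std = statistics.pstdev(sample)  # population standard deviation

# For this sample the mean is 5 and the standard deviation is 2.
print(mean, std)
```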

Structural Approach:
The Structural Approach is a technique wherein the learner masters the pattern of a sentence.
Structures are the different arrangements of words in one accepted style or another.
Types of structures:
• Sentence Patterns
• Phrase Patterns
• Formulas
• Idioms

Difference Between Statistical Approach and Structural Approach:

SR. NO.  STATISTICAL APPROACH              STRUCTURAL APPROACH
1        Statistical decision theory.      Human perception and cognition.
2        Quantitative features.            Morphological primitives.
3        Fixed number of features.         Variable number of primitives.
4        Ignores feature relationships.    Captures primitive relationships.
5        Semantics from feature position.  Semantics from primitive encoding.
6        Statistical classifiers.          Syntactic grammars.

Supervised learning
Supervised learning, as the name indicates, involves the presence of a supervisor acting as a
teacher. Basically, supervised learning is learning in which we teach or train the machine using
data which is well labeled, meaning that the data is already tagged with the correct answer. After
that, the machine is provided with a new set of examples (data), so that the supervised learning
algorithm, having analysed the training data (set of training examples), produces a correct
outcome for the new data.
For instance, suppose you are given a basket filled with different kinds of fruits. The first
step is to train the machine with all the different fruits one by one, like this:

• If the shape of the object is rounded with a depression at the top and its colour is red, then
it will be labelled as Apple.
• If the shape of the object is a long curving cylinder and its colour is green-yellow, then it
will be labelled as Banana.

Now suppose that, after training, you are given a new, separate fruit from the basket, say a
banana, and asked to identify it.

Since the machine has already learned from the previous data, this time it has to use that
knowledge wisely. It will first classify the fruit by its shape and colour, confirm the fruit name
as BANANA and put it in the Banana category. Thus the machine learns from the training
data (basket containing fruits) and then applies the knowledge to the test data (new fruit).
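The fruit example can be mimicked with a very small classifier that labels a new sample with the label of the most similar training example. This is a sketch under assumptions: the attribute names `shape` and `colour`, the dictionary layout and the "count matching attributes" similarity are illustrative choices, not a method prescribed by the notes.

```python
# Labelled training data: (shape, colour) -> fruit name, from the rules above.
training_data = [
    ({"shape": "rounded with depression at top", "colour": "red"}, "Apple"),
    ({"shape": "long curving cylinder", "colour": "green-yellow"}, "Banana"),
]

def classify(sample):
    """Return the label of the training example sharing the most attributes."""
    def matches(example):
        features, _label = example
        return sum(features[key] == sample.get(key) for key in features)
    _features, label = max(training_data, key=matches)
    return label

new_fruit = {"shape": "long curving cylinder", "colour": "green-yellow"}
print(classify(new_fruit))  # Banana
```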

Supervised learning is classified into two categories of algorithms:

• Classification: A classification problem is one where the output variable is a category, such as
“red” or “blue”, or “disease” and “no disease”.
• Regression: A regression problem is one where the output variable is a real value, such as
“dollars” or “weight”.
Unsupervised learning

Unsupervised learning is the training of a machine using information that is neither classified nor
labeled, allowing the algorithm to act on that information without guidance. Here the task of the
machine is to group unsorted information according to similarities, patterns and differences
without any prior training on the data.
Unlike supervised learning, no teacher is provided, which means no training is given to the
machine. Therefore the machine has to find the hidden structure in unlabeled data by itself.
For instance, suppose it is given images containing both dogs and cats which it has never seen
before.

The machine has no idea about the features of dogs and cats, so it cannot label them as dogs
and cats. But it can categorize them according to their similarities, patterns, and differences,
i.e., it can split the pictures into two parts. The first part may contain all the pictures
with dogs in them and the second part may contain all the pictures with cats in them. Nothing
was learned beforehand, which means there is no training data or examples.
Unsupervised learning is classified into two categories of algorithms:
• Clustering: A clustering problem is one where you want to discover the inherent groupings in
the data, such as grouping customers by purchasing behavior.
• Association: An association rule learning problem is one where you want to discover rules that
describe large portions of your data, such as people that buy X also tend to buy Y.
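A rule such as "people that buy X also tend to buy Y" is usually judged by its support and confidence. The sketch below computes both on a handful of hypothetical baskets; the item names and the helper functions `support` and `confidence` are assumptions made for the example.

```python
# Hypothetical shopping baskets; item names are placeholders.
baskets = [
    {"X", "Y"},
    {"X", "Y", "Z"},
    {"X"},
    {"Y", "Z"},
]

def support(itemset):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"X", "Y"}))       # 0.5: two of the four baskets hold both items
print(confidence({"X"}, {"Y"}))  # about 0.67: two of the three X-baskets also hold Y
```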
Introduction to Clustering

Clustering is basically a type of unsupervised learning method. An unsupervised learning method
is a method in which we draw inferences from datasets consisting of input data without labelled
responses. Generally, it is used as a process to find meaningful structure, explanatory underlying
processes, generative features, and groupings inherent in a set of examples.
Clustering is the task of dividing the population or data points into a number of groups such that
data points in the same group are more similar to other data points in that group and
dissimilar to the data points in other groups. It is basically a collection of objects grouped on
the basis of the similarity and dissimilarity between them.
For example, the data points in the graph below that are clustered together can be classified into
one single group. We can distinguish the clusters, and we can identify that there are 3 clusters in
the picture below.

Clusters need not be spherical. For example:

DBSCAN: Density-based Spatial Clustering of Applications with Noise

These data points are clustered by using the basic concept that a data point lies within the given
constraint from the cluster centre. Various distance methods and techniques are used for the
calculation of the outliers.

Why Clustering?

Clustering is important because it determines the intrinsic grouping among the unlabeled
data present. There is no single criterion for a good clustering. It depends on the user and on
which criteria satisfy their need. For instance, we could be interested in finding
representatives for homogeneous groups (data reduction), in finding “natural clusters” and
describing their unknown properties (“natural” data types), in finding useful and suitable
groupings (“useful” data classes) or in finding unusual data objects (outlier detection). A
clustering algorithm must make some assumptions about what constitutes the similarity of points,
and each assumption produces different and equally valid clusters.
Clustering Methods:

• Density-Based Methods: These methods consider clusters to be dense regions that have
some similarity and differ from the less dense regions of the space. These methods have
good accuracy and the ability to merge two clusters. Examples: DBSCAN (Density-Based Spatial
Clustering of Applications with Noise), OPTICS (Ordering Points to Identify Clustering
Structure), etc.

• Hierarchical Based Methods: The clusters formed in this method form a tree-type
structure based on the hierarchy. New clusters are formed using the previously formed ones.
These methods are divided into two categories:
• Agglomerative (bottom-up approach)
• Divisive (top-down approach)
Examples: CURE (Clustering Using Representatives), BIRCH (Balanced Iterative Reducing
Clustering and using Hierarchies), etc.

• Partitioning Methods: These methods partition the objects into k clusters, and each
partition forms one cluster. This method is used to optimize an objective criterion similarity
function, for example when distance is the major parameter. Examples: K-means, CLARANS
(Clustering Large Applications based upon Randomized Search), etc.
• Grid-based Methods: In this method the data space is divided into a finite number of
cells that form a grid-like structure. All the clustering operations done on these grids are fast
and independent of the number of data objects. Examples: STING (Statistical Information
Grid), WaveCluster, CLIQUE (CLustering In QUEst), etc.

Clustering Algorithms :

K-means clustering algorithm: It is one of the simplest unsupervised learning algorithms that
solve the clustering problem. The K-means algorithm partitions n observations into k clusters,
where each observation belongs to the cluster with the nearest mean, which serves as a prototype
of the cluster.
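A minimal sketch of this idea follows, using the standard iterative scheme (often called Lloyd's algorithm) on one-dimensional points. The data, the starting centres and the fixed iteration count are assumptions chosen so the example stays short and deterministic; practical implementations work on d-dimensional vectors and stop when the assignments no longer change.

```python
def kmeans(points, centres, iterations=10):
    """Lloyd's algorithm on 1-D points, starting from the given centres."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Update step: each centre moves to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centres, clusters = kmeans(points, centres=[0.0, 5.0])
print(centres)   # [2.0, 11.0]
print(clusters)  # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```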

Applications of Clustering in different fields

• Marketing: It can be used to characterize and discover customer segments for marketing
purposes.
• Biology: It can be used for classification among different species of plants and animals.
• Libraries: It is used for clustering different books on the basis of topics and information.
• Insurance: It is used to understand customers and their policies and to identify
fraud.
• City Planning: It is used to make groups of houses and to study their values based on their
geographical locations and other factors present.
• Earthquake studies: By learning the earthquake-affected areas, we can determine the dangerous
zones.
Pattern Recognition approaches

Patterns generated from the raw data depend on the nature of the data. Patterns may
be generated based on the statistical features of the data. In some situations, the underlying
structure of the data decides the type of the pattern generated. In some other instances,
neither of the two situations exists. In such scenarios a system is developed and trained for
the desired responses. Thus, for a given problem one or more of these different approaches may
be used to obtain the solution. Hence, to obtain the desired attributes for a pattern recognition
system, there are many different mathematical techniques. The four best-known approaches
to pattern recognition are:
1. Template matching
2. Statistical classification
3. Syntactic matching
4. Neural networks
In template matching, a stored prototype (template) of the pattern is compared
against the pattern to be recognized. In the statistical approach, the patterns are described as
random variables, from which class densities can be inferred. Classification is done based on
the statistical modeling of the data. In the syntactic approach, a pattern is seen as being
composed of simple sub-patterns which are themselves built from yet simpler sub-patterns,
the simplest being the primitives. Interrelationships between these primitive patterns are
used to represent a more complex pattern. The neural network approach to pattern
recognition is strongly related to the statistical methods, since neural networks can be regarded
as parametric models with their own learning scheme.
The models proposed need not be independent, and sometimes the same pattern
recognition method exists with different interpretations. A hybrid system may be built
involving multiple models. The comparison of the different approaches is summarized in
Table 1.1.
Table 1.1: Pattern Recognition Models

Approach                 Representation             Recognition Function           Typical Criterion
Template matching        Samples, pixels, curves    Correlation, distance measure  Classification error
Statistical              Features                   Discriminant function          Classification error
Syntactic or structural  Primitives                 Rules, grammar                 Acceptance error
Neural networks          Samples, pixels, features  Network function               Mean square error

Template matching
One of the simplest and earliest approaches to pattern recognition is based on
template matching. Matching is carried out to determine the similarity between two entities
of the same type, such as points, curves, or shapes. In template matching, a template or a
prototype of the pattern to be recognized is available. The pattern to be recognized is
matched against the stored template while taking into account all allowable operations such
as translation, rotation and scale changes. The similarity measure, often a correlation, may be
optimized based on the available training set. Often, the template itself is learned from the
training set. Template matching is computationally demanding, but present-day computers, with
their faster processors and higher computation power, have made this approach more
feasible. Rigid template matching, even though effective in some application domains, has
a number of disadvantages. For example, it would fail if the patterns are distorted due to the
imaging process, viewpoint change, or large intra-class variations among the patterns. When
the deformation cannot be easily explained or modeled directly, deformable template models
or rubber sheet deformations can be used to match the patterns.
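The matching step can be sketched for the simplest case: a one-dimensional signal, a rigid template, and translation as the only allowable operation. Normalised correlation is used as the similarity measure. The signal, the template and the function names `correlation` and `best_match` are assumptions made for the example; rotation and scale changes are not handled.

```python
import math

def correlation(a, b):
    """Normalised correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def best_match(signal, template):
    """Slide the template over the signal (translation only) and
    return the offset with the highest correlation."""
    offsets = range(len(signal) - len(template) + 1)
    return max(offsets,
               key=lambda k: correlation(signal[k:k + len(template)], template))

signal = [0, 0, 1, 3, 1, 0, 2, 2, 0]
template = [1, 3, 1]
print(best_match(signal, template))  # 2
```

Because every offset is tried in turn, the cost grows with the size of the signal and of the template, which is one reason the approach is described as computationally demanding.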
Statistical Pattern Recognition
The statistical pattern recognition approach assumes a statistical basis for the classification
of data. It describes the properties of the pattern to be recognized by random parameters.
The main goal of statistical pattern classification is to find the category or
class to which a given sample belongs. Statistical methodologies such as statistical hypothesis
testing, correlation and Bayes classification are used for implementing this method. The
effectiveness of the representation is determined by how well patterns from different classes are
separated.
To measure how close a given sample is to one of the classes, statistical
pattern recognition uses the probability of error. The Bayesian classifier is a natural choice in
applying statistical methods to pattern recognition. However, its implementation is often
difficult due to the complexity of the problems, especially when the dimensionality of the
system is high. One can also consider a simpler solution such as a parametric classifier based
on assumed mathematical forms such as linear, quadratic or piecewise. Initially a parametric
form of the decision boundary is specified; then the best decision boundary of the specified
form is found based on the classification of training samples. Another important issue
in statistical pattern recognition is the estimation of the values of the parameters,
since they are not given in practice. In these systems it is always important to understand
how the number of samples affects the classifier design and performance.
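As a hedged illustration of a Bayesian classifier with an assumed parametric form, the sketch below models each class with a one-dimensional Gaussian density, estimates the priors, means and standard deviations from training samples, and assigns a new sample to the class with the largest prior times likelihood. The Gaussian assumption, the training values and the function names are choices made for this example only.

```python
import math
import statistics

def fit(samples_by_class):
    """Estimate prior, mean and standard deviation for each class."""
    total = sum(len(xs) for xs in samples_by_class.values())
    return {
        label: (len(xs) / total, statistics.mean(xs), statistics.stdev(xs))
        for label, xs in samples_by_class.items()
    }

def gaussian(x, mean, std):
    """Gaussian probability density, the assumed class-conditional form."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def classify(x, model):
    """Bayes decision rule: pick the class with the largest prior * likelihood."""
    return max(model,
               key=lambda label: model[label][0] * gaussian(x, *model[label][1:]))

# Hypothetical 1-D training samples (e.g. object weights) for two classes.
training = {
    "class 1": [24.0, 25.0, 26.0, 25.5],
    "class 2": [4.0, 5.0, 6.0, 5.5],
}
model = fit(training)
print(classify(23.0, model))  # class 1
print(classify(7.0, model))   # class 2
```

The parameters are estimated from only four samples per class here; with so few samples the estimates are unreliable, which is the point made above about sample size and classifier performance.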
Syntactic Pattern Recognition
In many situations there exist interrelationships or interconnections between the
features associated with a pattern. In such circumstances it is appropriate to assume a
hierarchical relationship where a pattern is viewed as consisting of simple sub-patterns
which are themselves built from yet simpler sub-patterns. This is the basis of syntactic pattern
recognition. In this method symbolic data structures such as arrays, strings, trees, or graphs
are used for pattern representation. These data structures define the relations between
fundamental pattern components and allow the representation of hierarchical models. Thus
complex patterns can be represented from simpler ones. The recognition of an unknown
pattern is accomplished by comparing its symbolic representation with a number of
predefined objects. This comparison helps to compute the similarity between
the unknown input and the known patterns.
The simplest symbolic data structures used for the representation of patterns are
words of symbols, or strings. The individual symbols in a string usually
represent atomic components of the pattern. Strings are, however, one-dimensional in
nature, while many patterns are inherently two- or higher-dimensional. One of the most used and
most powerful symbolic structures for higher-dimensional data representation is the graph. A graph
is composed of a set of nodes and a set of edges, in which the nodes represent simpler
sub-patterns and the edges the relations between those sub-patterns. These relations may be
spatial, temporal or of any other type, depending on the problem. An important subclass of
graphs is the tree. A tree has three different classes of nodes: root, interior and leaf.
Trees are intermediate between strings and graphs. They are interesting for pattern
recognition applications since they are more powerful than strings as a representation of the
object and computationally less expensive than graphs. Another form of symbolic
representation is the array, which is a special type of graph whose nodes and edges are
arranged in a regular form. This type of data structure is very useful for low-level pattern
representation.
Structural pattern recognition is attractive because, in addition to classification, it provides
a description of how the given pattern is constructed from the primitives. This
method is useful in situations where the patterns have a definite structure which can be
captured in terms of a set of rules. However, due to parsing difficulties the implementation of
a syntactic approach is limited. It is very difficult to use this method for the segmentation of
noisy patterns, and another problem is the inference of the grammar from training data. Powerful
pattern recognition capabilities can be achieved by combining the syntactic and statistical
pattern recognition techniques [Fu 1986].
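To make the string representation concrete, the sketch below describes the boundary of a shape as a chain of four hypothetical primitives, `r`, `d`, `l` and `u` (unit moves right, down, left and up), and accepts a string as a rectangle only if it has the form r^n d^m l^n u^m. The primitive alphabet and the rule are assumptions invented for this illustration; real syntactic recognisers use grammars and parsers tailored to the application.

```python
from itertools import groupby

def is_rectangle(chain):
    """Accept strings of the form r^n d^m l^n u^m with n, m >= 1,
    where r, d, l, u are unit moves right, down, left and up."""
    # Collapse the string into runs, e.g. "rrd" -> [("r", 2), ("d", 1)].
    runs = [(symbol, len(list(group))) for symbol, group in groupby(chain)]
    if [symbol for symbol, _ in runs] != ["r", "d", "l", "u"]:
        return False
    (_, right), (_, down), (_, left), (_, up) = runs
    # Opposite sides of a rectangle must have the same length.
    return right == left and down == up

print(is_rectangle("rrrddllluu"))  # True
print(is_rectangle("rrrddlluu"))   # False
```

Besides the accept/reject decision, the list of runs is itself a description of how the pattern is built from primitives, which is the extra information the structural approach provides.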
Neural Network
Neural computing is based on the way in which biological neural systems store and
manipulate information. It can be viewed as a parallel computing environment consisting of
an interconnection of a large number of simple processors. Neural networks have been successfully
applied in many pattern recognition and machine learning tasks. The structure of
a neural network is drawn from analogies with biological neural systems. Many learning algorithms
have been developed for neural networks. In these
algorithms, a set of rules defines the evolution process undertaken by the synaptic
connections of the network, thus allowing it to learn how to perform specified tasks.
Neural network models can be viewed as weighted directed graphs in which the nodes
are artificial neurons and the directed edges are connections between neuron outputs and
neuron inputs. Neural networks have the ability to learn complex nonlinear input-output
relationships, use sequential training procedures, and adapt themselves to the data.
Different types of neural networks are used for pattern classification. Among them, the
feed-forward network and the Kohonen network are commonly used. The learning process
involves updating the network architecture and connection weights so that the network can
efficiently perform a specific classification/clustering task. Neural network models
are gaining popularity because of their ability to solve pattern recognition problems,
their seemingly low dependence on domain-specific knowledge, and the availability of
efficient learning algorithms for practitioners to use. Neural networks are also useful for
implementing nonlinear algorithms for feature extraction and classification. In addition,
existing feature extraction and classification algorithms can also be mapped onto neural
network architectures for efficient implementation. In spite of the seemingly different
underlying principles, most of the well-known neural network models are implicitly
equivalent or similar to classical statistical pattern recognition methods.
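The idea of learning by updating connection weights can be illustrated with the smallest possible network: a single artificial neuron trained with the classic perceptron rule. This is a sketch only; the logical AND training set, the learning rate, the epoch count and the function names are assumptions for the example, and a single neuron can only learn linearly separable problems.

```python
def train_perceptron(samples, epochs=20, rate=1.0):
    """Learn weights and a bias for a single artificial neuron."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = predict(inputs, weights, bias)
            error = target - output
            # Perceptron rule: adjust each connection weight by the error.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

def predict(inputs, weights, bias):
    """Step activation applied to the weighted sum of the inputs."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

# Logical AND as a tiny, linearly separable training set.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(samples)
print([predict(x, weights, bias) for x, _ in samples])  # [0, 0, 0, 1]
```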
OR

PATTERN APPROACH /PATTERN MODEL

Statistical patternrecognition Method

Statistical pattern recognition has been used successfully to design a number of

commercial recognition systems.

In statistical pattern recognition, a pattern is represented by a set of d features, or attributes,

viewed as a d- dimensional feature vector.

Well-known concepts from statistical decision theory are utilized to establish decision
boundaries between pattern classes.

The recognition system is operated in two modes: training (learning) and classification
(testing) (see Fig. 1).

The role of the preprocessing module is to segment the pattern of interest from the
background, remove noise, normalize the pattern, and perform any other operation that
contributes to defining a compact representation of the pattern.

In the training mode, the feature extraction/selection module finds the appropriate
features for representing the input patterns and the classifier is trained to partition
the feature space.

The feedback path allows a designer to optimize the preprocessing and feature
extraction/selection strategies. In the classification mode, the trained classifier assigns the
input pattern to one of the pattern classes under consideration based on the measured
features.

In these systems it is always important to understand how the number of samples affects
the classifier design and performance.
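The two modes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classifier is a simple nearest-class-mean rule (one of many possible statistical classifiers), and the function names `train` and `classify` as well as the sample data are hypothetical.

```python
import math

def train(samples, labels):
    """Training mode: estimate one mean feature vector per class."""
    sums, counts = {}, {}
    for x, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(model, x):
    """Classification mode: assign x to the class whose mean is nearest."""
    return min(model, key=lambda label: math.dist(model[label], x))

# Hypothetical training set: d = 2 features per pattern, two classes.
X_train = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [4.0, 5.0], [4.2, 5.1], [3.8, 4.9]]
y_train = ["A", "A", "A", "B", "B", "B"]

model = train(X_train, y_train)
print(classify(model, [1.1, 2.1]))
print(classify(model, [4.1, 5.0]))
```

With only three samples per class the estimated means are rough, which is one concrete way the number of training samples affects classifier quality.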

STRUCTURAL PATTERN RECOGNITION METHOD

In many situations there exist interrelationships or interconnections between the features
associated with a pattern.
In such circumstances it is appropriate to assume a hierarchical relationship in which a
pattern is viewed as consisting of simple sub-patterns, which are themselves built from
yet simpler sub-patterns.
This is the basis of syntactic pattern recognition. In this method symbolic data structures
such as arrays, strings, trees, or graphs are used for pattern representation.
These data structures define the relations between fundamental pattern components and
allow the representation of hierarchical models. Thus complex patterns can be represented
in terms of simpler ones.
The recognition of an unknown pattern is accomplished by comparing its symbolic
representation with a number of predefined objects. This comparison yields a
similarity measure between the unknown input and the known patterns.
Structural pattern recognition is attractive because, in addition to classification, it provides
a description of how the given pattern is constructed from the primitives.
This method is useful in situations where the patterns have a definite structure that can
be captured in terms of a set of rules.
However, due to parsing difficulties, the implementation of a syntactic approach is limited.
It is very difficult to use this method for segmentation of noisy patterns, and another
problem is the inference of the grammar from training data.
Powerful pattern recognition capabilities can be achieved by combining the syntactic and
statistical pattern recognition techniques.
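As a rough illustration of comparing symbolic representations, the sketch below encodes shapes as strings of direction primitives and measures similarity with the Levenshtein edit distance. Both the primitive alphabet (r, u, l, d) and the prototype strings are assumptions made for this example; edit distance is only one possible similarity measure and is not a full grammar-based parser.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings of primitives."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca by cb
        prev = curr
    return prev[-1]

# Hypothetical prototypes: r = right, u = up, l = left, d = down.
PROTOTYPES = {
    "square": "ruld",
    "tall_rectangle": "ruuldd",
    "wide_rectangle": "rrulld",
}

def recognize(unknown):
    """Return (best_name, distance) for the closest predefined pattern."""
    return min(((name, edit_distance(unknown, proto))
                for name, proto in PROTOTYPES.items()),
               key=lambda pair: pair[1])

# A wide rectangle whose last primitive was lost to noise.
print(recognize("rrull"))
```

The returned distance also says how far the input is from the prototype, which is the kind of structural description that a purely statistical label does not give.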
Neural pattern recognition method

Neural computing is based on the way in which biological neural systems store and
manipulate information. It can be viewed as a parallel computing environment consisting of
an interconnection of a large number of simple processors.

Neural networks have been successfully applied to many tasks in pattern recognition and
machine learning systems. The structure of a neural network is drawn from analogies with
biological neural systems.

Many algorithms have been developed for neural network learning. In these algorithms, a set
of rules defines the evolution process undertaken by the synaptic connections of the network,
thus allowing it to learn how to perform specified tasks.

Designing a neural network that uses the error back-propagation algorithm is not only a
science but also an experimental exercise. Many of the factors involved in designing a network
are settled by the researcher's experience; however, by attending to some of these matters we can
lead the back-propagation algorithm to better performance.

Neural network models use a network of weighted directed graphs in which the nodes are
artificial neurons and directed edges are connections between neuron outputs and neuron
inputs.
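The weighted-directed-graph view can be made concrete with a small forward pass. The sketch below is an assumption-laden illustration: the weights are hand-picked (not learned by back-propagation), the sigmoid activation is one common choice, and the helper names are hypothetical. The network computes XOR, a simple nonlinear input-output relationship.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, layers):
    """Propagate input x through a list of (weights, biases) layers."""
    activation = x
    for weights, biases in layers:
        # Each row of weights holds the incoming edge weights of one neuron.
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation

# Hand-picked weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[20.0, 20.0], [-20.0, -20.0]], [-10.0, 30.0]),  # hidden neurons act like OR and NAND
    ([[20.0, 20.0]], [-30.0]),                         # output neuron acts like AND
]

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, round(forward(x, layers)[0]))
```

In a trained network the same forward computation is used, but the weights and biases come from a learning algorithm rather than from manual design.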

Neural networks have the ability to learn complex nonlinear input-output relationships,
use sequential training procedures, and adapt themselves to the data. Different types of neural
networks are used for pattern classification. Among them, the feed-forward network and
the Kohonen network are commonly used.

The learning process involves updating network architecture and connection weights so that
a network can efficiently perform a specific classification/clustering task. Neural network
models are gaining popularity because of their ability to solve pattern recognition problems,
their seemingly low dependence on domain-specific knowledge, and the availability of
efficient learning algorithms for practitioners to use.

Neural networks are also useful for implementing nonlinear algorithms for feature extraction
and classification. In addition, existing feature extraction and classification algorithms can
also be mapped on neural network architectures for efficient implementation. In spite of the
seemingly different underlying principles, most of the well-known neural network models are
implicitly equivalent or similar to classical statistical pattern recognition methods.

The model of a network comprises analog cells that resemble neurons. The figure shows an instance of these
cells as used in a network.
This multilayer hierarchical network is made of many layers of cells. In this network there are
forward and backward links between cells. When this network is used for pattern recognition,
forward signals handle the process of recognizing patterns, whereas backward
signals handle the process of separating patterns and recall.

We can train this network to recognize any set of patterns. Even when a pattern contains extra
stimuli or is missing parts, the model can still recognize it. Complete recall is not required for
the network to recognize manipulated shapes or shapes that have changed in size, or to restore the
imperfect parts to their original form.

Fig 3 Neuron Model of Pattern Recognition

Template matching

One of the simplest and earliest approaches to pattern recognition is based on template
matching. Matching is carried out to determine the similarity between two entities such as
points, curves, or shapes of the same type.

In template matching, a template or a prototype of the pattern to be recognized is available.

The pattern to be recognized is matched against the stored template while taking into account
all allowable operations such as translation, rotation and scale changes.
The similarity measure, often a correlation, may be optimized based on the available training
set. Often, the template itself is learned from the training set. Template matching is
computationally demanding.

Present-day computers, with the higher computational power of their faster processors, have
made this approach more feasible. Rigid template matching, even though effective in some
application domains, has a number of disadvantages. For example, it would fail if the
patterns are distorted due to the imaging process, viewpoint change, or large intra-class
variations among the patterns. When the deformation cannot be easily explained or modeled
directly, deformable template models or rubber sheet deformations can be used to match the
patterns.
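A minimal sketch of rigid template matching follows, using normalized cross-correlation as the similarity measure. It is a simplification under stated assumptions: only translation is searched (rotation and scale changes are not handled), the image and template are tiny hypothetical arrays, and the function names are made up for this example.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized 2-D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mt) ** 2 for b in t))
    return num / den if den else 0.0  # flat patches carry no shape information

def match_template(image, template):
    """Slide the template over the image; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            best = max(best, (ncc(patch, template), (r, c)))
    return best

template = [[0, 9],
            [9, 0]]
image = [[0, 0, 0, 0],
         [0, 0, 9, 0],
         [0, 9, 0, 0],
         [0, 0, 0, 0]]

score, position = match_template(image, template)
print(score, position)
```

The nested loops visit every placement of the template, which is why the approach is computationally demanding on large images.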

Decision region and Decision Boundary

•The goal of pattern recognition is to reach an optimal decision rule to categorize the
incoming data into their respective categories.

•The decision boundary separates points belonging to one class from points of the other classes.

•The decision boundary partitions the feature space into decision regions.

•The nature of the decision boundary is decided by the discriminant function which is used for
decision. It is a function of the feature vector.

In general, a pattern classifier carves up (or tessellates or partitions) the feature space into
volumes called decision regions. All feature vectors in a decision region are assigned to the
same category. The decision regions are often simply connected, but they can be multiply
connected as well, consisting of two or more non-touching regions.

The decision regions are separated by surfaces called the decision boundaries. These
separating surfaces represent points where there are ties between two or more categories.

For a minimum-distance classifier, the decision boundaries are the points that are equally
distant from two or more of the templates. With a Euclidean metric, the decision boundary
between Region i and Region j is on the line or plane that is the perpendicular bisector of the
line from m_i to m_j. Analytically, these linear boundaries are a consequence of the fact that the
discriminant functions are linear. (With the Mahalanobis metric, when each class has its own
covariance matrix, the decision boundaries are quadratic surfaces, such as ellipsoids,
paraboloids or hyperboloids.)
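The link between the minimum-distance rule and linear discriminant functions can be checked numerically. Expanding ||x - m||^2 = ||x||^2 - 2 m.x + ||m||^2 shows that minimizing the distance to m is the same as maximizing g(x) = m.x - 0.5 ||m||^2, which is linear in x. The sketch below assumes two hypothetical templates and uses made-up helper names.

```python
import math

def nearest_template(x, templates):
    """Index of the template closest to x in Euclidean distance."""
    return min(range(len(templates)), key=lambda i: math.dist(x, templates[i]))

def linear_discriminant(x, m):
    """g(x) = m.x - 0.5 * ||m||^2, a linear function of x."""
    return sum(mi * xi for mi, xi in zip(m, x)) - 0.5 * sum(mi * mi for mi in m)

def best_discriminant(x, templates):
    """Index of the template with the largest discriminant value."""
    return max(range(len(templates)), key=lambda i: linear_discriminant(x, templates[i]))

templates = [(0.0, 0.0), (4.0, 2.0)]  # hypothetical m_1 and m_2

# (0, 5) is the midpoint (2, 1) shifted by (-2, 4), which is orthogonal to
# m_2 - m_1 = (4, 2), so it lies on the perpendicular bisector.
on_bisector = (0.0, 5.0)
print(math.dist(on_bisector, templates[0]), math.dist(on_bisector, templates[1]))

for x in [(1.0, 0.0), (3.0, 3.0), (-2.0, 1.0), (5.0, -1.0)]:
    print(x, nearest_template(x, templates), best_discriminant(x, templates))
```

Both rules pick the same template for every test point, and the bisector point is equally distant from the two templates, as the text states.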

Decision boundary

Figure 3.2 graphically defines the input space, decision regions, decision boundaries, and
transition regions for a two-dimensional classification problem. To define the input space, we
use a simple two-component input pattern, the input vector x = [x1, x2]. The output vector y
represents one of three possible classes, i.e., class I, class II, or class III. Note that for every point
within the input space, there must be one and only one class specified. This example has only
three possible output vectors for training the network: y = [1,0,0], [0,1,0], or [0,0,1]. The
two-dimensional input space in Figure 3.2 is constrained by feasible operating limits of the
input variables xi (i = 1 to n, here n = 2), that is, (1) x1,min < x1 < x1,max and (2) x2,min < x2 <
x2,max. The possible output classes are mapped within this two-dimensional space.

1. Decision region: a specific region within the input space which corresponds to a unique
output class. All points within this region map to one and only one output class. Note that the
input space can have multiple decision regions corresponding to multiple output
classes. Figure 3.2 has three decision regions, one for each output class.

2. Decision boundary: the boundary is the intersection of two different decision regions.
In Figure 3.2, the decision boundary between classes I and II would have an output vector of
y = [0.5,0.5,0]. This example has three decision boundaries: (i) between class I and class II,
(ii) between class I and class III, and (iii) between class II and class III.

3. Transition region: this area is the buffer between two different decision regions. Here,
we can make only fuzzy inferences about the classification because the predicted output
vector is not one of [1,0,0], [0,1,0], or [0,0,1]. For example, the transition region between
class I and class II begins where the class I output response starts to decrease from 1 and
ends where it reaches 0. Similarly, in the transition region, the class II output response
increases from 0 to 1.
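The three definitions above can be turned into a small helper that interprets a network's output vector. This is only a sketch: the tolerance `tol` is a hypothetical threshold chosen for the example, and the function name is not taken from any library.

```python
CLASSES = ["class I", "class II", "class III"]

def interpret_output(y, tol=0.05):
    """Label an output vector as a decision region, boundary, or transition region."""
    ranked = sorted(range(len(y)), key=lambda i: y[i], reverse=True)
    top, second = ranked[0], ranked[1]
    if y[top] >= 1.0 - tol:
        # Close to a one-hot vector: the point lies inside a decision region.
        return "decision region: " + CLASSES[top]
    if abs(y[top] - y[second]) <= tol:
        # Two responses tie: the point lies on the boundary between them.
        a, b = sorted((top, second))
        return "decision boundary: " + CLASSES[a] + " / " + CLASSES[b]
    # Otherwise only a fuzzy inference is possible.
    return "transition region, leaning towards " + CLASSES[top]

print(interpret_output([1.0, 0.0, 0.0]))
print(interpret_output([0.5, 0.5, 0.0]))
print(interpret_output([0.8, 0.2, 0.0]))
```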

In a statistical-classification problem with two classes, a decision boundary or decision
surface is a hypersurface that partitions the underlying vector space into two sets, one for
each class. The classifier will classify all the points on one side of the decision boundary as
belonging to one class and all those on the other side as belonging to the other class. A
decision boundary is the region of a problem space in which the output label of a classifier is
ambiguous.[1]

If the decision surface is a hyperplane, then the classification problem is linear, and the
classes are linearly separable.

Decision boundaries are not always clear cut. That is, the transition from one class in the
feature space to another is not discontinuous, but gradual. This effect is common in fuzzy
logic based classification algorithms, where membership in one class or another is
ambiguous.

Hyperplanes and Hypersurfaces

•For the two-category case, a positive value of the discriminant function decides class 1 and a
negative value decides the other.

•If the number of dimensions is three, then the decision boundary will be a plane or a 3-D
surface. The decision regions become semi-infinite volumes.

•If the number of dimensions increases to more than three, then the decision boundary
becomes a hyperplane or a hypersurface. The decision regions become semi-infinite
hyperspaces.
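For the two-category case, the sign rule can be sketched with a linear discriminant g(x) = w.x + b, whose zero set g(x) = 0 is a hyperplane. The weight vector `w`, the offset `b`, and the function names below are hypothetical values chosen for illustration in a 4-dimensional feature space.

```python
def discriminant(x, w, b):
    """Linear discriminant g(x) = w.x + b; g(x) = 0 is the separating hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def decide(x, w, b):
    """Class 1 for positive g, class 2 for negative g, None on the boundary."""
    g = discriminant(x, w, b)
    if g > 0:
        return 1
    if g < 0:
        return 2
    return None

# Hypothetical hyperplane in a 4-dimensional feature space.
w = [1.0, -2.0, 0.5, 1.0]
b = -1.0

print(decide([3.0, 0.0, 0.0, 0.0], w, b))  # g = 2
print(decide([0.0, 1.0, 0.0, 0.0], w, b))  # g = -3
print(decide([1.0, 0.0, 0.0, 0.0], w, b))  # g = 0, on the hyperplane
```

The same code works unchanged for any number of dimensions, because only the lengths of `w` and `x` change.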

Learning

•The classifier to be designed is built using input samples which are a mixture of all the
classes.

•The classifier learns how to discriminate between samples of different classes.

•If the learning is offline (i.e. the supervised method), the classifier is first given a set of
training samples, the optimal decision boundary is found, and then the classification is done.

•If the learning is online, there is no teacher and there are no training samples (unsupervised). The
input samples are the test samples themselves. The classifier learns and classifies at the same time.
Straight line decision boundary

Features

We might add other features that are not highly correlated with the ones we already have. Be
sure not to reduce the performance by adding “noisy features”. Ideally, you might think the
best decision boundary is the one that provides optimal performance on the training data (see
the following figure).

Is this a good decision boundary?

Our satisfaction is premature because the central aim of designing a classifier is to correctly
classify new (test) input.
Decision Boundary Choice

Better decision boundary

LEARNING AND ADAPTATION

Supervised learning: A teacher provides a category label for each pattern in the training set.

Unsupervised learning: The system forms clusters or “natural groupings” of the unlabeled
input patterns.

METRIC SPACE

A point-set S is a metric space if there is a distance function d, which takes ordered pairs (s,t)
of elements of S and returns a distance that satisfies the following conditions:

For each pair s, t in S, d(s,t) > 0 if s and t are distinct points and d(s,t) = 0 if s and t are
identical.

For each pair s, t in S, the distance from s to t is equal to the distance from t to s: d(s,t) = d(t,s).

For each triple s, t, u in S, the sum of the distances from s to t and from t to u is always at least
as large as the distance from s to u: d(s,t) + d(t,u) ≥ d(s,u).
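The three conditions can be tested mechanically on a finite sample of points. The sketch below is an illustration with hypothetical names and data; passing the check on a sample does not prove that a function is a metric, but a single failure is enough to show that it is not. Squared Euclidean distance is used as a counterexample because it breaks the triangle inequality.

```python
import math
from itertools import product

def is_metric(points, d, eps=1e-12):
    """Check the three metric-space conditions on a finite set of points."""
    for s, t in product(points, repeat=2):
        if s == t and abs(d(s, t)) > eps:
            return False  # identical points must be at distance 0
        if s != t and d(s, t) <= 0:
            return False  # distinct points must be a positive distance apart
        if abs(d(s, t) - d(t, s)) > eps:
            return False  # symmetry
    for s, t, u in product(points, repeat=3):
        if d(s, t) + d(t, u) < d(s, u) - eps:
            return False  # triangle inequality
    return True

def squared_euclidean(s, t):
    return sum((a - b) ** 2 for a, b in zip(s, t))

points = [(0, 0), (3, 4), (6, 0), (1, 1)]  # hypothetical sample points

print(is_metric(points, math.dist))
print(is_metric(points, squared_euclidean))
```

For the squared distance, going from (0, 0) to (3, 4) via (1, 1) costs 2 + 13 = 15, which is less than the direct value 25, so the third condition fails.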
DISTANCE

Distance measures play an important role in PR.

They provide the foundation for many popular and effective algorithms like k-nearest neighbours for
supervised learning and k-means clustering for unsupervised learning.

Different distance measures must be chosen and used depending on the types of the data. As such, it is
important to know how to implement and calculate a range of different popular distance measures and the
intuitions for the resulting scores.

A distance measure is an objective score that summarizes the relative difference between two objects in a
problem domain.

Most commonly, the two objects are rows of data that describe a subject (such as a person, car, or house),
or an event (such as a purchase, a claim, or a diagnosis).

Perhaps the most likely way you will encounter distance measures is when you are using a specific PR
algorithm that uses distance measures at its core. The most famous algorithm of this type is the k-nearest
neighbours algorithm, or KNN for short.
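To show where the distance measure sits inside such an algorithm, here is a minimal k-nearest-neighbours sketch. The training rows, the value k = 3, and the function name are assumptions made for the example; the `distance` argument can be swapped for any of the measures discussed in this section.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3, distance=math.dist):
    """Classify query by majority vote among its k nearest training rows."""
    neighbours = sorted(train, key=lambda row: distance(row[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical labelled rows: (feature vector, class label).
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
         ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B")]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.4, 6.1)))
```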

Perhaps four of the most commonly used distance measures in PR are as follows:

• Hamming Distance
• Euclidean Distance
• Manhattan Distance
• Minkowski Distance

Hamming Distance
Hamming distance calculates the distance between two binary vectors, also referred to as binary strings or
bitstrings for short.
You are most likely going to encounter bitstrings when you one-hot encode categorical columns of data.
For example, if a column had the categories ‘red,’ ‘green,’ and ‘blue,’ you might one-hot encode each
example as a bitstring with one bit for each category.
• red = [1, 0, 0]
• green = [0, 1, 0]
• blue = [0, 0, 1]
The distance between red and green could be calculated as the sum or the average number of bit
differences between the two bitstrings. This is the Hamming distance.
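Both variants mentioned above (sum and average of bit differences) can be sketched in one short function. The function name and the `average` flag are hypothetical choices for this example.

```python
def hamming_distance(a, b, average=False):
    """Number (or fraction, if average=True) of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("bitstrings must have the same length")
    differences = sum(x != y for x, y in zip(a, b))
    return differences / len(a) if average else differences

red = [1, 0, 0]
green = [0, 1, 0]
blue = [0, 0, 1]

print(hamming_distance(red, green))                # 2 differing bits
print(hamming_distance(red, green, average=True))  # 2 of 3 positions differ
```

Any two distinct one-hot vectors differ in exactly two positions, so the Hamming distance between different categories is always 2 in this encoding.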

Euclidean Distance
Euclidean distance calculates the distance between two real-valued vectors.
You are most likely to use Euclidean distance when calculating the distance between two rows of data that
have numerical values, such as floating point or integer values.
If columns have values with differing scales, it is common to normalize or standardize the numerical
values across all columns prior to calculating the Euclidean distance. Otherwise, columns that have large
values will dominate the distance measure.
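The effect of differing scales can be seen in a small sketch. The rows (age, income) are hypothetical, and min-max normalization is used here as one common way to rescale columns; standardization to zero mean and unit variance would be another.

```python
import math

def euclidean_distance(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_max_normalize(rows):
    """Rescale every column to [0, 1] so that no column dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]  # guard constant columns
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)]
            for row in rows]

# Hypothetical rows: (age in years, income in currency units).
rows = [[25, 30000], [30, 30500], [60, 30200]]
scaled = min_max_normalize(rows)

# Raw distances are dominated by the income column.
print(euclidean_distance(rows[0], rows[1]), euclidean_distance(rows[0], rows[2]))
# After rescaling, the large age gap of the third row matters as well.
print(euclidean_distance(scaled[0], scaled[1]), euclidean_distance(scaled[0], scaled[2]))
```

On the raw values the third row looks closer to the first than the second row does; after normalization the order is reversed.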

Manhattan Distance (Taxicab or City Block Distance)

The Manhattan distance, also called the Taxicab distance or the City Block distance, calculates the distance
between two real-valued vectors.
It is perhaps more useful for vectors that describe objects on a uniform grid, like a chessboard or city
blocks. The taxicab name for the measure refers to the intuition for what the measure calculates: the
shortest path that a taxicab would take between city blocks (coordinates on the grid).

It might make sense to calculate Manhattan distance instead of Euclidean distance for two vectors in an
integer feature space.
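As a short sketch, the Manhattan distance is the sum of absolute coordinate differences. The function name and the grid coordinates below are hypothetical.

```python
def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Two hypothetical street corners on an integer grid.
start, end = (1, 1), (4, 5)
print(manhattan_distance(start, end))  # 3 blocks east + 4 blocks north
```

For the same two points the Euclidean distance is 5, the straight-line path that a taxicab restricted to the grid cannot take.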

Minkowski Distance
Minkowski distance calculates the distance between two real-valued vectors.
It is a generalization of the Euclidean and Manhattan distance measures and adds a parameter, called the
“order” or “p”, that allows different distance measures to be calculated.
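The role of the order p can be sketched as follows: the distance is the p-th root of the sum of the absolute coordinate differences raised to the power p, so p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance. The function name and the check that p is at least 1 (needed for the triangle inequality to hold) are choices made for this example.

```python
def minkowski_distance(a, b, p=2):
    """(sum |x_i - y_i|^p)^(1/p); p = 1 is Manhattan, p = 2 is Euclidean."""
    if p < 1:
        raise ValueError("p must be at least 1 for the result to be a metric")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

a, b = (0, 0), (3, 4)
print(minkowski_distance(a, b, p=1))  # Manhattan: 3 + 4
print(minkowski_distance(a, b, p=2))  # Euclidean: sqrt(9 + 16)
print(minkowski_distance(a, b, p=3))  # an intermediate order
```

Larger values of p give more weight to the largest coordinate difference, so the result shrinks from 7 towards 4 as p grows.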
