Digit Main

Uploaded by

KRISH

ABSTRACT

Nowadays, more and more people use images to represent and transmit information, and
extracting important information from images has become increasingly common.
Image recognition is an important research area because of its wide range of applications. In
the relatively young field of computer pattern recognition, one of the challenging tasks is the
accurate automated recognition of human handwriting. Optical Character Recognition (OCR) is a
subfield of Image Processing concerned with extracting text from images or scanned
documents. In this project, we have chosen to focus on recognizing handwritten digits
available in the MNIST database. The challenge in this project is to use basic Image
Correlation techniques, also known as Matrix Matching, to maximize the accuracy of
the handwritten digit recognizer without resorting to sophisticated techniques such as
machine learning.
Key Words: Image Processing, Optical Character Recognition, Handwritten Digits, Image
Correlation, Matrix Matching, Machine Learning.
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER 1
INTRODUCTION

The human brain processes and analyses images with ease. When the eye sees an image, the
brain readily breaks it apart and recognizes its various aspects.
This process happens automatically: the brain not only analyses the image but also compares
its characteristics with what it already knows in order to recognize the elements it
contains. Image Processing is the field of computer science that tries to give machines the
same ability. It is concerned with analysing images to extract useful information from them.
An image is converted into a digital form readable by computers, and algorithms are applied
to it to produce either a better-quality image or a set of characteristics from which
information can be extracted. Image processing is used in many areas today, and a great deal
of software has been built on it. Self-driving cars are appearing that can detect other cars
and pedestrians to avoid accidents. Some social media applications, such as Facebook,
perform facial recognition using this technique. Other software uses it to identify
characters in images, which is the concept of Optical Character Recognition (OCR) that we
explore in this project.
OCR is a narrow subfield of image processing. It is meant to read an image containing one or
more characters, or a scanned text of typed or handwritten characters, and recognize them.
Much research has been done in this area to develop techniques with high accuracy. Among the
most widely used and best-performing algorithms are machine learning algorithms such as
Neural Networks and Support Vector Machines. One application of OCR is recognizing
handwritten characters, and we focus on building a mechanism that recognizes handwritten
digits. We read images of handwritten digits obtained from the MNIST database and try to
identify which digit each image represents. To do this we use basic Image Correlation
techniques, also known as Matrix Matching. This method is based on matrix manipulation,
since it reads images as matrices in which each element is a pixel.
1.2 OBJECTIVES
• To provide an easy user interface to input the object image.
• User should be able to upload the image.
• System should be able to upload the image.
• System should be able to preprocess the given input to suppress the background.
• System should be able to detect digit regions present in the image.
• System should retrieve the digits present in the image and display them to the user.

1.3 SCOPE
• Improve the human-computer interface for computer-illiterate people by providing various
computing services on inputs.
• Can be implemented on smartphones and tablets as a virtual keyboard.
• The system can create a paperless environment by digitizing handwritten characters.
CHAPTER 2
LITERATURE SURVEY

Handwriting recognition: the last frontiers. Proceedings of the 15th International Conference
on Pattern Recognition (ICPR-2000), Barcelona, Vol. 4, pp. 1-10.
The last frontiers of handwriting recognition are considered to have started in the last decade
of the second millennium. This paper summarizes the nature of the problem of handwriting
recognition, the state of the art of handwriting recognition at the turn of the new millennium,
the results of CENPARMI researchers in automatic recognition of handwritten digits,
touching numerals, cursive scripts, and dates formed by a mixture of the former 3 categories.
Wherever possible, comparable results have been tabulated according to techniques used,
databases, and performance. Aspects related to human generation and perception of
handwriting are discussed. The extraction and usage of human knowledge, and their
incorporation into handwriting recognition systems are presented. Challenges, aims, trends,
efforts and possible rewards, and suggestions for future investigations are also included.

Central Research Laboratory, Performance evaluation of pattern classifiers for
handwritten character recognition. International Journal on Document Analysis and
Recognition, Tokyo 185-8601, Japan.
This paper describes a performance evaluation study in which some efficient classifiers are
tested in handwritten digit recognition. The evaluated classifiers include a statistical classifier
(modified quadratic discriminant function, MQDF), three neural classifiers, and an LVQ
(learning vector quantization) classifier. They are efficient in that high accuracies can be
achieved at moderate memory space and computation cost. The performance is measured in
terms of classification accuracy, sensitivity to training sample size, ambiguity rejection, and
outlier resistance. The outlier resistance of neural classifiers is enhanced by training with
synthesized outlier data. The classifiers are tested on a large data set extracted from NIST
SD19. As results, the test accuracies of the evaluated classifiers are comparable to or higher
than those of the nearest neighbour (1-NN) rule and regularized discriminant analysis (RDA).
It is shown that neural classifiers are more susceptible to small sample size than MQDF,
although they yield higher accuracies on large sample size. As a neural classifier, the
polynomial classifier (PC) gives the highest accuracy and performs best in ambiguity
rejection. On the other hand, MQDF is superior in outlier rejection even though it is not
trained with outlier data. The results indicate that pattern classifiers have complementary
advantages and they should be appropriately combined to achieve higher performance.
A Shallow Convolutional Neural Network for Accurate Handwritten Digits
Classification. 13th International Conference, PRIP, Minsk, Belarus, pp. 77-85.
At present the deep neural network is the hottest topic in the domain of machine learning and
can accomplish a deep hierarchical representation of the input data. Thanks to their deep
architecture, large convolutional neural networks can reach very small test error rates,
below 0.4%, on the MNIST database. This work shows that high accuracy can be achieved
using a reduced, shallow convolutional neural network without adding distortions to the
digits. The main contribution of the paper is to show how a simplified convolutional neural
network can obtain a test error rate of 0.71% on the MNIST handwritten digit benchmark, which
reduces the computational resources needed to model a convolutional neural network.

Handwritten Digit String Recognition using Convolutional Neural Network. 2018 24th
International Conference on Pattern Recognition (ICPR).
String recognition is one of the most important tasks in computer vision applications.
Recently the combinations of convolutional neural network (CNN) and recurrent neural
network (RNN) have been widely applied to deal with the issue of string recognition.
However, RNNs are not only hard to train but also time-consuming. In this paper, we propose
a new architecture which is based on CNN only, and apply it to handwritten digit string
recognition (HDSR). This network is composed of three parts from bottom to top: feature
extraction layers, feature dimension transposition layers and an output layer. Motivated by the
superior performance of DenseNet, we utilize dense blocks to conduct feature extraction. At the
top of the network, a CTC (connectionist temporal classification) output layer is used to
calculate the loss and decode the feature sequence, while some feature dimension
transposition layers are applied to connect feature extraction and output layer. The
experiments have demonstrated that, compared to other methods, the proposed method
obtains significant improvements on ORAND-CAR-A and ORAND-CAR-B datasets with
recognition rates 92.2% and 94.02%.
Improved handwritten digit recognition using convolutional neural networks (CNN).
Sensors, 20(12), 3344.
Traditional systems of handwriting recognition have relied on handcrafted features and a
large amount of prior knowledge. Training an Optical character recognition (OCR) system
based on these prerequisites is a challenging task. Research in the handwriting recognition
field is focused around deep learning techniques and has achieved breakthrough performance
in the last few years. Still, the rapid growth in the amount of handwritten data and the
availability of massive processing power demands improvement in recognition accuracy and
deserves further investigation. Convolutional neural networks (CNNs) are very effective in
perceiving the structure of handwritten characters/words in ways that help in automatic
extraction of distinct features and make CNN the most suitable approach for solving
handwriting recognition problems. Our aim in the proposed work is to explore the various
design options like number of layers, stride size, receptive field, kernel size, padding and
dilation for CNN-based handwritten digit recognition. In addition, we aim to evaluate various
SGD optimization algorithms in improving the performance of handwritten digit recognition.
A network's recognition accuracy increases by incorporating ensemble architecture. Here, our
objective is to achieve comparable accuracy by using a pure CNN architecture without
ensemble architecture, as ensemble architectures introduce increased computational cost and
high testing complexity. Thus, a CNN architecture is proposed in order to achieve accuracy
even better than that of ensemble architectures, along with reduced operational complexity
and cost. Moreover, we also present an appropriate combination of learning parameters in
designing a CNN that leads us to reach a new absolute record in classifying MNIST
handwritten digits. We carried out extensive experiments and achieved a recognition accuracy
of 99.87% on the MNIST dataset.
CHAPTER 3
DESIGN AND IMPLEMENTATION

3.1 Image Processing

Image processing is a very wide field within computer science which deals mainly with
analysing images and extracting information from them. The image to be processed
is imported and then analysed using some computations, which result either in an
image of better quality or in some of the characteristics of the image, depending on the
purpose of the analysis. The field has several subfields, including Optical Character
Recognition, which is the main concern of this project.
There are two ways to provide input to the system. The user can either upload an image of
the digit to be detected or use data from the MNIST dataset. The input images are
pre-processed. The accuracy of the digits recognized by the different classifiers is then
compared, and the results are displayed along with the accuracy.
3.2 Optical Character Recognition (OCR)
It is easy for the naked eye to recognize a character when spotted in any document; however,
computers cannot identify the characters from an image or scanned document. In order to
make this possible, a lot of research has been done, which resulted in the development of
several algorithms that made this possible. One of the fields that specialize in character
recognition under the light of Image Processing is Optical Character Recognition (OCR). In
Optical Character Recognition, a scanned document or an image is read and segmented in
order to be able to decipher the characters it contains. The images are taken and are pre-
processed so as to get rid of the noise and have unified colours and shades, then the
characters are segmented and recognized one by one, to finally end up with a file containing
encoded text containing these characters, which can be easily read by computers. Optical
Character Recognition dates back to the early 1900s, as it was developed in the United States
in some reading aids for the blind. In 1914, Emanuel Goldberg was able to implement a
machine able to convert characters into "standard telegraph code". In the 1950s, David
Shepard, who was at that time an engineer at the Department of Defence, developed a
machine that he named Gismo, which is able to read characters and translate them into
machine language. In 1974, Ray Kurzweil decided to develop a machine that would read text
for blind and visually impaired people under his company, Kurzweil Computer Products.
In 1996, the United States Postal Service developed a mechanism, HWAI, which recognizes
handwritten mail addresses. Nowadays, many software programs use OCR in a variety of
applications.

3.3 Methods Used in OCR

A lot of research has been done in the field of OCR, and is still being done, which has
resulted in the development of several algorithms that enable computers to recognize
characters from images or scanned texts. Many of these techniques have attained very high
efficiency and a low error rate. However, these algorithms are still being investigated and
improved for better performance.
3.3.1 Machine Learning
Machine learning is a field that concerns making programs learn and know how to behave in
different situations using data. One of its applications is Optical Character Recognition.

3.3.2 Artificial Neural Network

An Artificial Neural Network (ANN) is a system that mimics the human's biological neural
network in the brain. It is an algorithm used for machine learning, which means it uses data to
learn how to respond to different inputs. The ANN can be seen as a box, which takes one or
more inputs and gives one output. Inside the box, there exist several interconnected nodes.
The input is fed into the program, which goes through the several layers and nodes of the
ANN and gives an output using a transfer function.
Artificial Neural Networks are used for OCR and have achieved very high accuracy rates. In
this case, the ANN would "recognize a character based on its topological features such as
shape, symmetry, closed or open areas, and number of pixels". The high accuracy of this kind
of algorithm is mainly due to its ability to learn from the training set, which would
contain characters with similar features.
Some Neural Networks have demonstrated very high performance. An implementation of the ANN
by Simard, Steinkraus, and Platt reduced the error rate of recognizing handwritten
digits from the MNIST dataset to as low as 0.7%.
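As a rough illustration of the layered computation described above, the following minimal sketch passes an input through one hidden layer and one output node in plain Python. The weights, biases, network size, and the choice of a sigmoid transfer function are arbitrary values picked for illustration; none of them are taken from this project or from the cited work.

```python
import math


def sigmoid(z):
    """Transfer function that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute the outputs of one fully connected layer.

    weights[j][i] is the connection from input i to node j.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs


# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output node.
hidden_weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

x = [1.0, 0.0, 1.0]
hidden = layer_forward(x, hidden_weights, hidden_biases)
output = layer_forward(hidden, output_weights, output_biases)
print(output)
```

In a trained network the weights would be learned from the training set rather than written by hand.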
3.3.3 Support Vector Machine
Support Vector Machine (SVM) is also a machine learning algorithm. SVMs are known as
high-performance pattern classifiers. While Neural Networks aim
at minimizing the training error, the goal of SVMs is to minimize the "upper bound of the
generalization error". The learning algorithm in this technique is based on classification and
regression analysis.

This kind of classifier has been used in the recognition of very complex characters, such as
those of the Khmer language, and has shown very high performance.

3.3.4 Image Correlation

Image Correlation is a technique used to recognize characters from images. This
approach, also referred to as Matrix Matching, uses mathematical computations in order to
analyse the images. By using this technique, the images are read as matrices, where each
element represents a pixel, which makes it easier to manipulate them using mathematical
approaches. The image to be identified is loaded as a matrix and compared to the images in
the reference set. The test image is overlapped with each image in the reference set to be able
to see how it matches with each one of them so as to tell which one represents it the most.
The decision can be made by seeing the pixels that match and the ones left out from either
one of the two images. This technique has many challenges and limitations, as it only
overlaps the images and tries to see how much they look alike. By using this method,
problems arise when having characters of different sizes, or when one of them is rotated by a
certain angle.
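The overlap comparison described above can be sketched as follows. This is a minimal illustration and not the project's actual implementation: it assumes binary images of equal size, scores a pair by simply counting the pixels on which they agree, and uses hypothetical names (`match_score`, `recognize`) and toy 3x3 data.

```python
def match_score(test_image, reference_image):
    """Count the pixels on which two equally sized binary images agree."""
    return sum(
        1
        for test_row, ref_row in zip(test_image, reference_image)
        for t, r in zip(test_row, ref_row)
        if t == r
    )


def recognize(test_image, reference_set):
    """Return the label of the reference image that best overlaps the test image.

    reference_set maps a label to a list of binary images for that label.
    """
    best_label, best_score = None, -1
    for label, images in reference_set.items():
        for image in images:
            score = match_score(test_image, image)
            if score > best_score:
                best_label, best_score = label, score
    return best_label


# Toy 3x3 "images" (hypothetical data, far smaller than 28x28 digits).
reference_set = {
    "vertical bar": [[[0, 1, 0], [0, 1, 0], [0, 1, 0]]],
    "horizontal bar": [[[0, 0, 0], [1, 1, 1], [0, 0, 0]]],
}
test_image = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
print(recognize(test_image, reference_set))
```

Because the score depends on pixel positions, a shifted, resized, or rotated character lowers it even when the shape is the same, which is the limitation noted above.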
3.3.5 Feature Extraction
Feature extraction is a technique based on pattern recognition. The main idea of feature
extraction is to analyse the images and derive from them some characteristics that
identify each specific element. Examples of these characteristics are the curvatures,
the holes, the edges, etc. In the case of digit recognition, these features could be the holes
inside the digits (for example in the eight, the six, and maybe the two as well) as well as the
angles between some straight lines (for example in the one, the four, and the seven).
Whenever an unknown image is to be recognized, its features are compared to these so that it
can be classified.
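One of the features mentioned above, the number of holes inside a digit, could be computed along the following lines. This is a sketch under stated assumptions: the image is binary (1 for the stroke, 0 for the background), a hole is a background region that does not touch the image border, and the function name `count_holes` is hypothetical.

```python
def count_holes(image):
    """Count background regions (0s) fully enclosed by foreground pixels (1s)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(start_r, start_c):
        """Mark every background pixel 4-connected to the start pixel."""
        stack = [(start_r, start_c)]
        while stack:
            r, c = stack.pop()
            if not (0 <= r < rows and 0 <= c < cols):
                continue
            if seen[r][c] or image[r][c] != 0:
                continue
            seen[r][c] = True
            stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])

    # Background touching the border lies outside the character, so it is not a hole.
    for r in range(rows):
        flood(r, 0)
        flood(r, cols - 1)
    for c in range(cols):
        flood(0, c)
        flood(rows - 1, c)

    # Every remaining unvisited background region is one enclosed hole.
    holes = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 0 and not seen[r][c]:
                holes += 1
                flood(r, c)
    return holes


# Crude 5x5 "zero": a ring of foreground pixels around one enclosed pixel.
ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(count_holes(ring))
```

A hole count alone cannot separate all ten digits, so in practice it would be combined with other features such as edges or angles.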

3.4 Tools
This project's main objective is to be able to read the images containing the handwritten
digits and be able to identify those digits using basic image correlation techniques. These
images are normally represented and read as matrices, in which every element portrays a
pixel. The image correlation technique takes these matrices and compares them using some
algorithms so as to identify the match that represents the digit we are trying to figure out.
This project will be mainly using matrices and heavy numerical computations, that is why it
is very important to consider the tools that would provide us with a suitable environment for
performing these computations.
3.4.1 Octave
Octave is free and open-source software that uses a high-level programming
language. It offers much of the functionality of MATLAB and is largely compatible with it. It
provides a simple and suitable interface for performing mathematical computations, along with
tools for solving mathematical problems such as common linear algebra problems. It is also
efficient in its use of resources, i.e., time and memory, for these operations. It is easy
to use when dealing with matrices, as it provides many functions and operations that make
them less costly to manipulate. In this project, we deal with images as matrices in which
each element represents a pixel, so it is necessary to choose a tool that makes our
computations easier and more efficient in terms of time and memory. Both MATLAB and Octave
are easy to learn and work with and provide a suitable environment for this kind of project.
We have opted for Octave as it is free and open source.
3.4.2 MNIST Database
The MNIST database, which stands for the Modified National Institute of Standards
and Technology database, is a very large dataset containing tens of thousands of handwritten
digits. This dataset was created by mixing different sets inside the original National Institute
of Standards and Technology (NIST) sets, so as to have a training set containing several types
and shapes of handwritten digits, as the NIST set was divided into digits written by high
school students and digits written by Census Bureau workers. The MNIST dataset has
been the target of a great deal of research on recognizing handwritten digits. This has
allowed the development and improvement of many different high-performance algorithms,
such as machine learning classifiers. In order to implement our
recognizer and test its performance, it is necessary to have a suitable dataset which contains a
large number of handwritten digits. This dataset should allow us to discover the
challenges and limitations of the image correlation technique and push us to look for ways and
rules to enhance it and assess its accuracy. We have opted for this dataset for
testing our program since it has proved its reliability and importance in the field.

3.5 Feasibility Study

From a technical perspective, since this project makes heavy use of numerical computations,
using Octave is a wise choice as it will make the program more efficient. This software will
also provide us with some libraries to read and manipulate the images that will make the
implementation process easier.
As for the dataset to use in the testing of the project, we have chosen the MNIST Database.
This database contains thousands of handwritten digits that have been used in the
development of programs with a similar aim. This dataset is open for public use at no
charge. It is also very convenient for our project and saves us time, since it can be used
directly as a test set without our having to make one ourselves.
Since all the tools to be used in this project are free of charge and very easy to use, we can
conclude that this project is very feasible in terms of financial resources as well as effort and
time.
CHAPTER 4
METHODOLOGY

4.1 Getting Familiar with the Tools

The first step we had to go through while working on this project was getting familiar with
the tools used, i.e., Octave and the MNIST dataset. After setting up the environment for Octave
and downloading the dataset, I started experimenting with both in
order to get familiar with them and know how to use them easily later on. Since all the
programming is mainly done in Octave, we had to install it along with its Graphical User
Interface on the computer and learn a little about its functions and how to use it. Octave
is free software which makes it very easy to work with matrices and vectors and is very
efficient in performing calculations on them. I started learning how to use it and looking
for the main functions that I would be using in the implementation of the project. For that, I
used some random images of digits to see how they can be read and modified as well as how
to apply some computations to them. Moreover, I had to investigate the format of the MNIST
dataset and get familiar with its representation. The MNIST dataset, which was used to create
our test set, contains thousands of handwritten digits, represented as matrices. It has been
used in the development of several programs and projects with the same aim as ours. After
downloading the file which contains the handwritten digits, I loaded it into Octave in
order to visualize the images and figure out how to use and manipulate them.

4.2 Creating the reference and test set

One of the main steps in the project is creating the reference and the test set that will both be
used in the implementation phase. The test set is to be used in order to assess the performance
of the program and evaluate its success or error rate. It is to be taken from the MNIST
dataset, since it contains the handwritten digits that we intend to recognize and identify. As
for the reference set, it is used to compare the test images and be able to identify the digit
they represent. It is to be created using different fonts.
CHAPTER 5
DATA

It is very necessary to know the kind of data we are using before we start the design and the
implementation of the program. That is why we had to have a look at its format to understand
how it is represented before creating the reference and the test set.

5.1. Dataset Format

The dataset that I have downloaded from the MNIST database contains 60,000 images of
handwritten digits, from zero to nine, all grouped in one file. Each of the images is of size 28
by 28 pixels and represents a digit. I have noticed that there is no pattern or order to the way
the images were organized in the file. The images are represented as matrices, of which the
elements represent the pixels. Also, each image has a label that indicates the digit
represented. This label was very helpful later on in order to be able to create the test set.
Furthermore, the data did not contain noise or any major problems to deal with, that is why it
was used without preprocessing it.
5.2. Reference Set
To be able to recognize the digit represented by a certain image, it is required to compare it
with other images containing known digits to be able to make the decision. For that it is
necessary to create a reference set which will contain all these images. That is to say, each
image we would want to recognize is to be compared to the images in the reference set. The
image with the highest match is the one that represents the right number. Since handwritten
digits differ from one person to another, the reference set needs to have digits in different
fonts. That is why we created six images of each digit using the online image editor
pixr.com, each one with a different font. The reference set contains images with the same
dimensions as the ones in the MNIST dataset, i.e., 28 by 28 pixels. These images
have a black background and a white font, which made it easier to use and manipulate them
later on using Octave. To make the comparison easier, we grouped the
six images representing the same digit into one file. The resulting reference set was thus ten
files, each one representing a digit from zero to nine and containing six images of that digit
in different fonts. The pixels of these images are then changed into zeros and ones, which
makes the overlapping of the images easier. The black background was initially represented
as zeros, so it is left the same. As for the pixels of the white font, each one of them was
represented with a different non-zero value depending on the shade of white. These non-zero
values are all converted into ones. The following image displays the digit "2" reference set.
The rest of the reference sets are in Appendix A.
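The conversion of pixel values into zeros and ones can be sketched in a few lines. The project performs this step in Octave; the Python version below is only an illustration, and the function name `binarize` and the sample values are hypothetical.

```python
def binarize(image):
    """Map every non-zero pixel to 1 and leave zero pixels unchanged."""
    return [[1 if pixel != 0 else 0 for pixel in row] for row in image]


# Hypothetical greyscale rows: 0 is the black background, other values are shades of white.
grey = [[0, 37, 255], [0, 128, 0]]
print(binarize(grey))  # [[0, 1, 1], [0, 1, 0]]
```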

5.3. Test Set

The program to be developed needs to be tested against some images that contain handwritten
digits so as to be able to assess its performance and calculate its success rate. That is why it is
very necessary to create a test set. The test set represents an example of the images containing
the handwritten digits which will have to be compared to the images in the reference set so as
to identify them. This set was formed using the file from the MNIST database. The original
file contained 60,000 images representing different digits. This made it difficult to look for
each number using the label when testing the program. In order to make it easier to access
each digit we want, we decided to store a number of images of each digit in a
separate file. We therefore stored 20 images of each digit in ten different files. That is
to say, the resulting test set was in the form of ten files, each of which represents a digit
and contains 20 images of it. These images were extracted from the initial file by reading
them and their labels using Octave. In order to make the manipulation of the matrices/images
easier, we had to make some modifications to the elements of all the matrices representing
the test set as well. The black pixels were originally represented as zeros, so they were left
the same. As for the white ones, each of them had a different non-zero value, so we turned
them all into ones.
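The selection of 20 images per digit using the labels can be sketched as follows. The project does this in Octave and stores each group in its own file; this Python illustration keeps the groups in a dictionary instead, and the function name `build_test_set` and the stand-in data are hypothetical.

```python
def build_test_set(images, labels, per_digit=20):
    """Group images by label, keeping at most per_digit images for each digit."""
    test_set = {digit: [] for digit in range(10)}
    for image, label in zip(images, labels):
        if len(test_set[label]) < per_digit:
            test_set[label].append(image)
    return test_set


# Hypothetical stand-in data: 1x1 "images" with labels cycling through 0..9.
labels = [i % 10 for i in range(300)]
images = [[[i]] for i in range(300)]

test_set = build_test_set(images, labels)
print({digit: len(group) for digit, group in test_set.items()})
```

Because the images in the original file follow no particular order, taking the first 20 matches per label gives an arbitrary sample of each digit.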
CHAPTER 6
PROGRAM CODE AND SAMPLE OUTPUT

This chapter presents a handwritten digit recognition program written in Python that trains
a neural network on the popular MNIST dataset. We use the Keras library with the TensorFlow
backend to build and train the neural network model.

6.1. Step-by-Step Explanation and Code

1. Install and Import Libraries
Make sure you have the necessary libraries installed:

pip install tensorflow

Then import the required modules:

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical

2. Load the Dataset

The MNIST dataset is preloaded in Keras and consists of 60,000 training images and
10,000 testing images of handwritten digits (0 to 9).

(x_train, y_train), (x_test, y_test) = mnist.load_data()

3. Preprocess the Data

Neural networks work better with normalized data. Therefore, we scale the pixel
values of images from a range of 0-255 to 0-1.

x_train, x_test = x_train / 255.0, x_test / 255.0

Also, we convert the labels to categorical (one-hot encoding), turning each label into a
vector where only the target index is 1 (e.g., 7 becomes [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]).

y_train = to_categorical(y_train)

y_test = to_categorical(y_test)
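
To make the two preprocessing steps concrete, the sketch below reproduces their effect on tiny inputs in plain Python. The helper names `scale_pixels` and `one_hot` are hypothetical and are not part of Keras; they only mirror what the division by 255.0 and `to_categorical` do to the data.

```python
def scale_pixels(image):
    """Scale 8-bit pixel values from the range 0-255 to the range 0-1."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def one_hot(label, num_classes=10):
    """Return a vector of zeros with a single 1 at index `label`."""
    vector = [0] * num_classes
    vector[label] = 1
    return vector


print(scale_pixels([[0, 255]]))  # [[0.0, 1.0]]
print(one_hot(7))                # [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
```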
4. Build the Model
We use a simple neural network with one input layer, one hidden layer, and one output
layer. This basic architecture is sufficient to reach good accuracy on the MNIST digits.

model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

5. Compile the Model

We specify the optimizer, loss function, and evaluation metric. For multi-class
classification, we use categorical cross-entropy as the loss function, and for
optimization, Adam is a commonly used optimizer.

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

6. Train the Model

Train the model on the training data for 5 epochs, with a batch size of 32. This can be
adjusted based on system capability and desired accuracy.

model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

7. Evaluate the Model

Check how the model performs on the test data.

test_loss, test_acc = model.evaluate(x_test, y_test)

print("Test accuracy:", test_acc)

8. Make Predictions
Finally, make predictions on new data (in this case, the test data) and display the first
5 predictions.

predictions = model.predict(x_test)

for i in range(5):
    print("Predicted label:", predictions[i].argmax(), "Actual label:", y_test[i].argmax())

6.2. Explanation of Key Concepts:

• Flatten Layer: This reshapes the 28x28 images into a flat array of 784 pixels for
input to the dense layer.
• Dense Layer: Fully connected layers; the hidden layer has 128 neurons, and the
output layer has 10 neurons (one for each digit).
• Activation Functions: ReLU for hidden layers, which helps introduce non-linearity,
and Softmax for output to ensure output values represent probabilities.
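
The two activation functions named above can be written out in plain Python as a minimal sketch. This is an illustration of the mathematics only, not the Keras implementation, and the sample scores are arbitrary.

```python
import math


def relu(values):
    """Replace negative values with 0 and keep positive values unchanged."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn arbitrary scores into positive values that sum to 1."""
    largest = max(values)
    # Subtracting the largest score avoids overflow without changing the result.
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, -1.0, 0.5]
print(relu(scores))
probabilities = softmax(scores)
print(probabilities, sum(probabilities))
```

The predicted digit is the index of the largest probability, which is what `argmax()` returns in the prediction step.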
6.3. FULL CODE:
Here’s the complete code:

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical

# Load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess data
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Build model
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train model
model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

# Evaluate model
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)

# Make predictions
predictions = model.predict(x_test)
for i in range(5):
    print("Predicted label:", predictions[i].argmax(), "Actual label:", y_test[i].argmax())

This model reaches roughly 97% test accuracy on the MNIST dataset (see the sample output in
Section 6.4) and serves as a simple, foundational introduction to image classification.
6.4. SAMPLE OUTPUT
1. Model Training Process
During training, Keras displays the loss and accuracy on the training and validation
data for each epoch. Here’s a sample output for 5 epochs:

Epoch 1/5
1500/1500 [==============================] - 4s 2ms/step - loss: 0.2941 - accuracy: 0.9165 - val_loss: 0.1498 - val_accuracy: 0.9553
Epoch 2/5
1500/1500 [==============================] - 3s 2ms/step - loss: 0.1285 - accuracy: 0.9616 - val_loss: 0.1152 - val_accuracy: 0.9647
Epoch 3/5
1500/1500 [==============================] - 3s 2ms/step - loss: 0.0898 - accuracy: 0.9730 - val_loss: 0.1021 - val_accuracy: 0.9685
Epoch 4/5
1500/1500 [==============================] - 3s 2ms/step - loss: 0.0687 - accuracy: 0.9785 - val_loss: 0.0952 - val_accuracy: 0.9712
Epoch 5/5
1500/1500 [==============================] - 3s 2ms/step - loss: 0.0537 - accuracy: 0.9830 - val_loss: 0.0957 - val_accuracy: 0.9714

2. Model Evaluation on Test Data

After training, the model is evaluated on the test data. The output will include
the test loss and test accuracy:

313/313 [==============================] - 0s 1ms/step - loss: 0.0889 - accuracy: 0.9732
Test accuracy: 0.9732
This indicates that the model has achieved 97.32% accuracy on the test
dataset.
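
The numbers in this output can be related back to the test set with a short calculation. The sketch below assumes the standard MNIST test set of 10,000 images and the default evaluation batch size of 32; it only reproduces the arithmetic and does not run the model.

```python
import math

test_images = 10_000     # size of the MNIST test set
batch_size = 32          # default batch size of model.evaluate
test_acc = 0.9732        # value reported in the sample run above

steps = math.ceil(test_images / batch_size)   # the "313/313" counter
correct = round(test_acc * test_images)       # correctly classified images
wrong = test_images - correct                 # misclassified images

print(steps, correct, wrong)  # 313 9732 268
```

In other words, an accuracy of 97.32% still corresponds to a few hundred misclassified test digits.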
3. Predictions
The code makes predictions for the first five images in the test dataset and
compares them with the actual labels. A sample output might look like this:

Predicted label: 7 Actual label: 7
Predicted label: 2 Actual label: 2
Predicted label: 1 Actual label: 1
Predicted label: 0 Actual label: 0
Predicted label: 4 Actual label: 4

This output shows that the model correctly predicted the labels for the first five
test images.

The test accuracy can vary slightly depending on factors such as training parameters,
hardware, and random weight initialization, but it should generally be around 97%.
CHAPTER 7
CONCLUSION AND FUTURE WORK

Conclusion
Optical Character Recognition is a broad field concerned with turning an image or a scanned
document containing characters into encoded text that machines can read. In this project, we
attempted to build a recognizer for handwritten digits using the MNIST dataset. The challenge
was to rely on basic image correlation techniques, rather than more sophisticated algorithms,
and to see how accurate such a mechanism can be made. We developed several versions and kept
refining each one to reach a higher recognition rate. The last version reached an accuracy of
57%. Unfortunately, we could not compare its performance with previously designed or
implemented systems, because we did not find any academic paper that tackles this method. The
accuracy we reached is far below that of machine learning, which reaches 99.3%; however, it
could be improved further. The goal of this project was to explore the field of OCR and to
devise techniques that avoid heavy computation. Even though the final result is not very
reliable, its accuracy is still well above the roughly 10% expected from random guessing
among ten digits.

Future work
Future steps would include a closer look at the results of all versions in order to find new
rules. Extracting and implementing these rules should improve the performance of these
versions. It would also be worthwhile to modify both the reference set and the rules so that
the program becomes more general and can identify both typed and handwritten digits.
Furthermore, we could make good use of the matrices that record the first maximum overlap of
each test image with the reference images, along with the number of pixels left out from
both. These matrices could be fed to clustering algorithms to build a program that recognizes
handwritten digits with much higher accuracy. Finally, we considered using linear or
higher-order regression in the versions we have developed in order to create more rules.
Regression lends itself to binary classification and is not well suited to choosing one digit
out of ten, so it could instead be used to decide which candidate is more suitable, the first
maximum or the second maximum. This would let us generate more rules and thus reach a higher
accuracy.
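
As a rough illustration of the overlap scores mentioned above, the sketch below ranks reference images by how many "on" pixels they share with a test image and reports the first and second maximum. It is an assumption about how such scores could be computed, not the implementation used in this project; the function names and the toy 3x3 images are hypothetical.

```python
def overlap(a, b):
    """Count pixels that are set (1) in both binary images."""
    return sum(pa & pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def left_out(a, b):
    """Count pixels set in exactly one of the two images."""
    return sum(pa ^ pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def rank_references(test_image, references):
    """Return (digit, overlap, left_out) tuples, best overlap first."""
    scores = [(digit, overlap(test_image, ref), left_out(test_image, ref))
              for digit, ref in references.items()]
    # Highest overlap first; fewer left-out pixels breaks ties.
    return sorted(scores, key=lambda s: (-s[1], s[2]))


# Toy 3x3 "digits", purely illustrative.
references = {
    1: [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    7: [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
}
test_image = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]

first, second = rank_references(test_image, references)[:2]
print(first)   # (1, 3, 1)
print(second)  # (7, 2, 5)
```

The gap between the first and second maximum, together with the left-out pixel counts, is the kind of feature that a clustering or regression step could take as input.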
REFERENCES

[1] C.Y. Suen, J. Kim, K. Kim, Q. Xu, L. Lam, Handwriting recognition: the last frontiers,
Proc. 15th ICPR, Barcelona, 2000, Vol. 4, pp. 1-10.
[2] T. Siva Ajay (July 2017), "Handwritten Digit Recognition Using Convolutional Neural
Networks", International Research Journal of Engineering and Technology (IRJET), Vol. 04,
Issue 07, pp. 2971-2976.
[3] Vladimir Golovko, Mikhno Egor, Aliaksandr Brich, and Anatoliy Sachenko (October 2016),
"A Shallow Convolutional Neural Network for Accurate Handwritten Digits Classification",
13th International Conference, PRIP, Minsk, Belarus, pp. 77-85.
[4] Hongjian Zhan, Shujing Lyu, and Yue Lu (August 2018), "Handwritten Digit String
Recognition using Convolutional Neural Network", 24th International Conference on Pattern
Recognition (ICPR), pp. 3729-3734.
[5] Ahlawat, S., Choudhary, A., Nayyar, A., Singh, S., & Yoon, B. (2020). Improved
handwritten digit recognition using convolutional neural networks (CNN). Sensors, 20(12),
3344. doi:10.3390/s20123344.
[6] N. Hagita, S. Naito, I. Masuda, Handprinted Kanji characters recognition based on pattern
matching method, Proc. ICTP, 1983, pp. 169-174.
[7] D.-S. Lee, S.N. Srihari, Handprinted digit recognition: a comparison of algorithms, Proc.
3rd IWFHR, 1993, pp. 153-164.
[8] U. Kreßel, J. Schürmann, Pattern classification techniques based on function
approximation, Handbook of Character Recognition and Document Image Analysis, World
Scientific, 1997, pp. 49-78.
