AJAY KUMAR GARG ENGINEERING
COLLEGE
27th DELHI-HAPUR BYPASS ROAD
GHAZIABAD-201001
SOFTWARE REQUIREMENT SPECIFICATION (SRS)
OF
OPTICAL CHARACTER RECOGNITION
B.Tech VII Semester-2010
In partial fulfillment of the Degree of B.Tech (CSE) 2007-2011
Under the guidance of:
Kirti Seth
Submitted by:
Prabha Kumari
Prerna Pal
Shashank Bhargava
Vibhuti Goel
Contents
1. Introduction
1.1 Purpose
1.2 Scope
2. Requirements
3. Features of the project
4. Description of project
5. Limitations
1. INTRODUCTION
Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation
of scanned images of handwritten, typewritten or printed text into machine-encoded text.
One example of OCR is shown below. A portion of a scanned image of text, borrowed from the web,
is shown along with the corresponding (human recognized) characters from that text.
of descriptive bibliographies of authors and presses. His ubiquity in the broad field of
bibliographical and textual study, his seemingly complete possession of it, distinguished him
from his illustrious predecessors and made him the personification of bibliographical
scholarship in his text.
Figure 1: Scanned image of text and its corresponding recognized representation
A few examples of OCR applications are listed here. The most common use of OCR is the first
item; people often wish to convert text documents to some sort of digital representation.
1. Scanning a document so that its text is available in a word processor.
2. Recognizing license plate numbers.
3. Recognizing zip codes on mail at the post office.
1.1 PURPOSE
The goal of Optical Character Recognition (OCR) is to classify optical patterns (often contained
in a digital image) corresponding to alphanumeric or other characters. The process of OCR
involves several steps including segmentation, feature extraction, and classification.
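The segmentation step mentioned above can be illustrated with a minimal sketch. It is written in Python for brevity (the project itself is in C#) and assumes the image has already been converted to a binary grid in which 1 marks ink and 0 marks background. The function name `segment_columns` and the rule of splitting on blank columns are assumptions made for this illustration, not the method used by the project.

```python
def segment_columns(image):
    """Split a binary image (list of rows, 1 = ink, 0 = background)
    into (start, end) column ranges, one per run of inked columns.

    Assumption for this sketch: characters are separated by at least
    one completely blank column.
    """
    if not image:
        return []
    width = len(image[0])
    # A column "has ink" if any row contains a 1 in that column.
    has_ink = [any(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new character begins here
        elif not ink and start is not None:
            segments.append((start, x))    # blank column closes the character
            start = None
    if start is not None:
        segments.append((start, width))    # character touching the right edge
    return segments
```

Each returned range is half-open, so `image_row[start:end]` selects the columns of one character. Touching or overlapping characters would need a more careful approach than this.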
1.2 SCOPE OF PROJECT
a.) Optical character recognition refers to the branch of computer science that involves reading
text from paper and translating the images into a form that the computer can manipulate (for
example, into ASCII codes).
b.) An OCR system enables you to take a book or a magazine article, feed it directly into an
electronic computer file, and then edit the file using a word processor.
c.) All OCR systems include an optical scanner for reading text, and sophisticated software for
analyzing images. Most OCR systems use a combination of hardware (specialized circuit boards)
and software to recognize characters, although some inexpensive systems do it entirely through
software.
d.) The potential of OCR systems is enormous because they enable users to harness the power of
computers to access printed documents. OCR is already being used widely in the legal
profession, where searches that once required hours or days can now be accomplished in a few
seconds.
e.) OCR is the technology long used by libraries and government agencies to make lengthy
documents available electronically.
f.) OCR also has considerable scope in virtual phonebook applications, which can scan visiting
cards and store the extracted information in a database.
2. REQUIREMENTS
a.) SOFTWARE REQUIREMENTS:
PLATFORM - VISUAL STUDIO 3.5
TOOLS - IMAGE PROCESSING TOOLS
b.) HARDWARE REQUIREMENTS:
PROCESSOR - P 4, 1.83 GHz
DISK SPACE - 1 GB
RAM - 1 GB
3. FEATURES OF PROJECT
a.) The character recognition program is written in C#.
b.) It recognizes the text entered by the user in a picture box. The software has a basic
artificial-intelligence capability: it can learn additional characters, so it can also be used for
other languages.
c.) The character data is stored in a text file in binary form, and the program uses a brute-force
algorithm to match input characters against the entries in that file.
d.) For many document input tasks, OCR is the most cost-effective and speedy method available.
e.) As a future enhancement, handwriting recognition can be implemented, by which the
computer can recognize the writing style of a particular person.
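The storage, learning and brute-force matching described in items b.) and c.) can be sketched as follows. This is an illustrative sketch in Python (the project itself is in C#), not the project's code. The file layout (one `label bits` pair per line), the function names `learn`, `load_templates` and `recognize`, and the use of Hamming distance as the match score are all assumptions made for the example.

```python
def hamming(a, b):
    """Count the positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def learn(path, label, bits):
    """Teach the program a new character by appending its bit pattern
    to the text file (assumed format: 'label bits' on each line)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{label} {bits}\n")


def load_templates(path):
    """Read every stored (label, bits) pair from the text file."""
    templates = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                templates.append((parts[0], parts[1]))
    return templates


def recognize(templates, bits):
    """Brute force: compare the input with every stored pattern and
    return the label of the closest one (None if nothing is comparable)."""
    best_label, best_dist = None, None
    for label, pattern in templates:
        if len(pattern) != len(bits):
            continue  # patterns of a different size cannot be compared
        d = hamming(pattern, bits)
        if best_dist is None or d < best_dist:
            best_label, best_dist = label, d
    return best_label
```

Because every stored pattern is examined for every input, the matching time grows linearly with the number of learned characters. That is acceptable for a small character set, which is the setting this sketch assumes.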
4. DESCRIPTION OF PROJECT
The Classification Process:
The following applies to classification in general, for any type of classifier. There are two steps
in building a classifier: training and testing. These steps can be broken down further into sub-steps.
1. Training
a.) Pre-processing – Processes the data so it is in a suitable form for…
b.) Feature extraction – Reduces the amount of data by extracting relevant information; this
usually results in a vector of scalar values.
c.) Model estimation – From the finite set of feature vectors, a model (usually statistical) is
estimated for each class of the training data.
2. Testing
a.) Pre-processing
b.) Feature extraction – (both same as above)
c.) Classification – Compare feature vectors to the various models and find the closest
match. One can use a distance measure.
1. Training: Training Data -> Pre-processing -> Feature Extraction -> Model Estimation
2. Recognition (Testing): Test Data -> Pre-processing -> Feature Extraction -> Classification
Figure 2: The pattern classification process
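The training and testing steps above can be sketched in a few lines. This is a minimal Python illustration (the project itself is in C#), and every concrete choice in it is an assumption. Pre-processing is taken to have already produced a binary grid, the features are row and column ink fractions, each class model is the mean feature vector, and the distance measure is Euclidean. The names `extract_features`, `train` and `classify` are hypothetical.

```python
import math


def extract_features(image):
    """Feature extraction: reduce a binary image (list of rows of 0/1)
    to a vector of scalars, here the ink fraction of each row followed
    by the ink fraction of each column."""
    rows = [sum(r) / len(r) for r in image]
    cols = [sum(r[x] for r in image) / len(image) for x in range(len(image[0]))]
    return rows + cols


def train(samples):
    """Model estimation: average the feature vectors of each class.
    `samples` is a list of (label, image) pairs of identical image size."""
    sums, counts = {}, {}
    for label, image in samples:
        vec = extract_features(image)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(models, image):
    """Classification: return the label whose model is closest to the
    feature vector of the test image (Euclidean distance)."""
    vec = extract_features(image)
    return min(models, key=lambda label: math.dist(models[label], vec))
```

A short usage example with two 3x3 characters:

```python
I = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
L = [[1, 0, 0], [1, 0, 0], [1, 1, 1]]
models = train([("I", I), ("L", L)])
print(classify(models, [[0, 1, 0], [0, 1, 0], [0, 1, 1]]))  # a slightly noisy "I"
```

A mean vector per class is about the simplest statistical model possible. It was chosen here only to keep the example short.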
5. LIMITATIONS
Advanced OCR systems can read text in a large variety of fonts, but they still have difficulty with
handwritten text, and script fonts that mimic handwriting remain problematic.