Article · September 2020
Murali Malaisamy
SRM Institute of Science and Technology
International Conference on Communication and Signal Processing, July 28 - 30, 2020, India
Reader and Object Detector for Blind
M. Murali, Shreya Sharma and Neel Nagansure
Abstract—This work aims to assist visually impaired people in reading text material and detecting objects in their surroundings. The input is an image captured by a web camera. The image is then processed either for text reading or for object detection, depending on the user's choice. A Raspberry Pi serves as the processing unit for the entire system. Text reading is performed with Optical Character Recognition (OCR) software, and the recognized text is converted into audio output using Text-to-Speech (TTS) synthesis. Other dependencies include the Tesseract library. Object detection, the other part of the project, is implemented using the TensorFlow Object Detection API. It can detect various objects in the surroundings and provide audio feedback about them. The model can be trained on different datasets depending on the user's needs, which makes the system scalable.

Index Terms—Raspberry Pi, OCR, Tesseract, TensorFlow

M. Murali is with SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India (e-mail: muralim@srmist.edu.in).
Shreya Sharma is with SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India (e-mail: ss9090@srmist.edu.in).
Neel Nagansure is with SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India (e-mail: nn7295@srmist.edu.in).
978-1-7281-4988-2/20/$31.00 ©2020 IEEE 0795
Authorized licensed use limited to: SRM University. Downloaded on September 10,2020 at 04:37:12 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION
In today's world, technology is advancing at a rapid rate and has found its way into every field of life. But this technology is of little use if it cannot come to the aid of disabled people. Education is considered one of the most important aspects of human life: the education you receive today shapes your tomorrow. But do blind people get the same level of education that sighted people are privileged to receive? The answer is no. Books read by blind people are printed in Braille, and a simple Braille book on counting shapes costs around 1300 rupees online. Imagine how much academic textbooks would cost over an entire educational lifetime. It also takes much longer to print a Braille textbook than a regular textbook. Both the time and the money needed are considerable, and not everyone can afford them. Apart from time and money, the learning curve for Braille is also steep. Fig. 1 shows the Braille alphabet; it takes more training to learn Braille than it takes to learn the regular alphabet. This steep learning curve is another reason Braille is not as efficient. Braille has also been slow to incorporate new technological advances and innovations. It has not evolved much as a writing system, and few new technologies include Braille in their use cases.

Fig. 1. Braille alphabets

Thus there are several reasons why Braille may not be the most helpful medium for the blind. With the aim of assisting disabled people, we decided to build something for the visually impaired and blind people of our society. The Reader acts like a live tutor: it recites to the learner the entire page that he or she wishes to read. We use computer vision coupled with IoT technology to build what we call a reader for the blind. Blind people can then read whichever book they want without spending large amounts of money and time to get it printed in Braille, which would make their lives simpler and more efficient. The Object Detector, on the other hand, acts like a pair of virtual eyes. It detects the objects present in the field of view of a camera. The live feed is analyzed by TensorFlow using a model trained on various datasets to maximize the detection rate. The Object Detector announces the objects present in the vicinity through a speaker, which lets the user know what is around them.

The rest of the paper is organized as follows. The literature survey and the problems in the existing work are presented in Section II and Section III, respectively. The proposed work and its implementation are described in Section IV and Section V, respectively. The result analysis is discussed in Section VI. The paper is concluded in Section VII, and future work is discussed in Section VIII.

II. LITERATURE SURVEY
The authors of [1] focused on creating a photo-to-speech application for the blind. The project, called Camera Reading for Blind People, aims to develop a mobile application that allows blind users to read text. To achieve this, Optical Character Recognition (OCR) and Text-to-Speech (TTS) synthesis are integrated, which enables the user to take a picture using a camera and hear the text present in the picture.
The work in [2] focuses on the complete integration of a text-to-speech system designed for blind people. The hardware consists of a webcam and a Raspberry Pi, which accepts an image of the page to be read out.
The software includes an OCR package as part of the text-to-speech conversion. The output is then fed to an audio amplifier. The authors used MATLAB to simulate the proposed system. They targeted libraries, auditoriums, and offices, for reading instructions and notices and for assistance in filling in application forms.
In [3], the hardware includes an ultrasonic sensor for environment scanning, and a GPS service is also employed. Two cameras placed on a pair of glasses generate a disparity map of the scene, and GPS is used to group objects based on location. The ultrasonic sensor detects obstacles at medium to long range. The system is optimized to work in real time.
The system proposed in [4] reads the text present on labels, products, and printed notes. It extracts text from an image and converts it to speech. The Raspberry Pi is provided with a battery backup, which makes the product portable.
In [5], methods such as YOLO, SSD, and CNN are used to provide users with the best object detection experience. However, the accuracy could have been improved.
The system proposed in [6] aims to assist blind people in detecting brightness and major colors in real time using RGB data from an external camera. It also helps the user identify fundamental objects. Its face detection capability further increases the scope of application. The hardware includes a Raspberry Pi and a Pi camera for detecting facial edges and objects. The software includes the YOLO algorithm and the MTCNN network.

III. PROBLEMS IN THE EXISTING MODEL
The existing models are capable of either detecting surrounding objects or reading text from an image. To our knowledge, no work has combined both technologies in a single model. Both technologies can benefit blind people, and they can be used to the fullest by integrating the OCR model and the object detection model. We have also worked to improve the frame rate of object detection by limiting the training dataset to everyday objects, focusing on a smoother and faster experience.

IV. PROPOSED WORK
This project presents a document reader and object detector for blind people, developed using a Raspberry Pi. It uses Optical Character Recognition (OCR) to read printed characters captured with a USB camera or Pi camera. OCR and TTS (Text-to-Speech) are used to convert images of printed text into an intermediate text form that is then converted to audio output. The TensorFlow software library is used for object detection. It can detect the majority of objects in our surroundings, including animals, vehicles, and humans.

Fig. 2. Distribution of Visually Impaired People across the world (total visually impaired population: 285 million).

Fig. 2 shows the distribution of visually impaired people across different regions of the world. The majority are in South Asia, while lower numbers are seen in the Middle East and North Africa. Sub-Saharan Africa and Europe account for 47.26% and 44.75%, respectively.

Fig. 3. Procedure of the Proposed System

The procedure of the proposed system is shown in Fig. 3. The user first chooses which module to run. If text is to be read, the webcam takes a photo of the printed text and feeds it to the Raspberry Pi, which converts it into audio output using OCR and TTS libraries. If objects are to be detected, the name of each detected object is displayed on the rectangle drawn around it in the webcam image, and that text is converted into audio output. This is achieved using TensorFlow.
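The two-branch procedure of Fig. 3 can be illustrated with a minimal Python sketch. It is not the code used in this work: the function names `ocr_image`, `speak_to_file`, `labels_to_sentence` and `run_module`, the output file name, and the fallback sentence are all hypothetical. The sketch assumes the Tesseract command-line tool and the gTTS package are installed, and it treats the object detector as a caller-supplied function that returns a list of label strings.

```python
import subprocess
from typing import Callable, List, Optional


def ocr_image(image_path: str) -> str:
    """Run the Tesseract command-line tool and return the recognized text."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout"],  # "stdout" prints the text instead of writing a file
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def speak_to_file(text: str, out_path: str = "output.mp3") -> str:
    """Convert text to an MP3 file with gTTS (requires internet access)."""
    from gtts import gTTS  # imported lazily so the sketch loads without gTTS
    gTTS(text=text, lang="en").save(out_path)
    return out_path


def labels_to_sentence(labels: List[str]) -> str:
    """Turn detected object names into one sentence, dropping repeated names."""
    unique = list(dict.fromkeys(labels))  # keeps first-seen order
    if not unique:
        return ""
    return "Detected: " + ", ".join(unique) + "."


def run_module(
    choice: str,
    image_path: str,
    ocr: Callable[[str], str] = ocr_image,
    detect: Optional[Callable[[str], List[str]]] = None,
    speak: Callable[[str], str] = speak_to_file,
) -> str:
    """Dispatch to the reader or detector branch, then produce audio output."""
    if choice == "read":
        text = ocr(image_path)
    elif choice == "detect":
        if detect is None:
            raise ValueError("a detect function must be supplied")
        text = labels_to_sentence(detect(image_path))
    else:
        raise ValueError("choice must be 'read' or 'detect'")
    if not text:
        text = "Nothing was recognized."  # hypothetical fallback message
    return speak(text)
```

A call such as `run_module("read", "page.jpg")` would follow the reader branch. Passing the OCR, detection, and speech steps as arguments keeps the two modules independent, so either one can be replaced or tested without the camera or speaker attached.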
Abbreviations and Acronyms
OCR - Optical Character Recognition
gTTS - Google Text-to-Speech
OpenCV - Open Source Computer Vision Library
TF - TensorFlow
The hardware includes a Raspberry Pi (INR 2300), a USB web camera (approx. INR 1500), and a speaker (INR 800 and above). Hence, the total hardware cost is about INR 4600 or more, that is, roughly INR 5000. The software used is open source, so it adds no cost to the model.
V. IMPLEMENTATION
A. Reading Text:
A.1 A text document that needs to be read is placed in the camera view. The document should be printed and written in English. The font size should not be too small to detect.
A.2 Use a USB web camera to take a picture of the text document. The camera acts as the input device of the system and is connected to the Raspberry Pi over USB. The picture is saved in .jpg format and sent for further processing.
A.3 The Raspberry Pi is the brain of the entire system. It is a miniaturized computer, and all processing is done on it. The Pi should have all the required applications, libraries, and dependencies installed.
A.3.1 Update the Pi: The update is shown in Fig. 4.
Fig. 4. Updating Raspberry Pi
A.3.2 Installing the Raspbian OS:
Step 1: Download Raspbian from the official website. The download is shown in Fig. 5.
Fig. 5. Downloading Raspbian OS from Official Website
Step 2: Unzip the downloaded file.
Step 3: Write the disk image to your microSD card.
Step 4: Put the SD card back into the Pi and boot up. Flashing the downloaded image onto the card is shown in Fig. 6.
Fig. 6. Flashing the downloaded OS onto the SD card
A.3.3 Install Python: Install Python 3 on your Raspberry Pi using the command
sudo apt-get install python3
A.3.4 Install OpenCV: Follow the installation guide at the address below to download and install the packages.
https://www.learnopencv.com/install-opencv-4-on-raspberry-pi/
A.3.5 Install Tesseract OCR:
Step 1: Install using the command
sudo apt-get install tesseract-ocr
Step 2: Check the installed version
tesseract -v
A.3.6 Install gTTS:
Step 1: Install gTTS on your Raspberry Pi using the command:
pip3 install gTTS
Now your Raspberry Pi is ready to perform text reading. All the necessary components have been installed.
A.4 Connect an amplifier to the Raspberry Pi using a Bluetooth or analog connection. The amplifier could be a speaker or any other audio device.
A.5 The captured picture is then processed by the OCR software and gTTS to read out the document.
B. Object Detection
B.1 Use a webcam connected to the Raspberry Pi to take a picture or get a live feed of the surroundings.
B.2 The web camera's input is given to the Raspberry Pi for further processing.
B.2.1 Install TensorFlow: Install TensorFlow on your Raspberry Pi using the command
pip3 install tensorflow
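Once the packages from steps A.3.3 to A.3.6 and B.2.1 are installed, a short check can confirm that they are visible to Python before continuing. The following is a minimal sketch, not part of the original setup: the function name `check_setup` is hypothetical, and the import names `cv2`, `gtts` and `tensorflow` are the usual module names for OpenCV, gTTS and TensorFlow, assumed here rather than taken from this paper.

```python
import importlib.util
import shutil
from typing import Dict, Iterable

# Assumed import names: OpenCV is imported as cv2 and gTTS as gtts.
PYTHON_MODULES = ("cv2", "gtts", "tensorflow")
# Command-line tools expected on the PATH.
COMMANDS = ("tesseract",)


def check_setup(
    modules: Iterable[str] = PYTHON_MODULES,
    commands: Iterable[str] = COMMANDS,
) -> Dict[str, bool]:
    """Report which Python modules and command-line tools can be found."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        status[name] = importlib.util.find_spec(name) is not None
    for cmd in commands:
        # shutil.which returns None when the executable is not on the PATH
        status[cmd] = shutil.which(cmd) is not None
    return status


if __name__ == "__main__":
    for item, found in check_setup().items():
        print(f"{item}: {'found' if found else 'MISSING'}")
```

The check only locates each package without importing it, so it stays quick on the Raspberry Pi. Any entry reported as missing points back to the corresponding installation step.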
B.2.2 The TensorFlow Object Detection API uses Protobuf, Google's Protocol Buffers library. Install it using the command
sudo apt-get install protobuf-compiler
B.2.3 Set up the directory and Python path. Download the repository from GitHub using
git clone --recurse-submodules https://github.com/tensorflow/models.git
B.2.4 Run the script using the command:
python3 Object_detection_picamera.py --usbcam
B.2.5 gTTS is already installed. Pass the names of the detected objects to gTTS to produce the output in audio format.
B.3 Connect an amplifier to the Raspberry Pi using a Bluetooth or analog connection. The amplifier could be a speaker or any other audio device.

VI. RESULT ANALYSIS
Successful results were obtained while executing the project. Text was read aloud in the first case, and objects were detected successfully in the second.

Fig. 7. Reader Part

Fig. 8. Object detection

During the analysis of the Reader part, it was noticed that text in a font size below 16 was difficult to detect, as shown in Fig. 7. Also, a plain background behind the text produced better and more accurate results than an abstract background. One hundred percent accuracy cannot be expected from either of the two modules. During object detection, the rectangular box that appears around the detected object automatically adjusts to the size of the object in the image, as shown in Fig. 8. The model can detect most everyday objects in the surroundings, such as a television set, bed, chair, fan, person, tree, bird, and animals. The model can be trained on new objects for a wider range of detection. The frame rate achieved is 1 frame per second and above, which is fast given the limited processing capability of the Raspberry Pi. The input fed to the program can sometimes produce inaccurate results due to bad lighting conditions. To help with this, an accuracy meter is provided that indicates how closely the detected object matches the actual object.

VII. CONCLUSION
This paper presents the implementation of the project 'Reader and Object Detector for Blind'. It was developed to aid blind people in everyday life and help them be independent. The project aimed to cover a broader aspect of life, and hence both parts were incorporated into one system. It assists blind people in reading printed text in pamphlets, books, magazines, and other printed material, including their everyday newspaper. The object detection feature can help blind people learn more about their surroundings without having to move around. The project also has certain limitations. It can only identify words in English. It can read words with a font size greater than or equal to 14. Only certain objects can be identified. The frame rate for object detection is low due to the limited computational power of the Raspberry Pi.

VIII. FUTURE WORK
The frame rate for object detection is very low. With better computational power, we will be able to increase the accuracy and speed of object detection. After training the model on other languages, the system would be able to read several other languages and reach more people across languages.

REFERENCES
[1] Roberto Neto and Nuno Fonseca, "Camera Reading For Blind People," Procedia Technology, vol. 16, pp. 1200-1209, 2014.
[2] D. Velmurugan, M. S. Sonam, S. Umamaheswari, S. Parthasarathy, and K. R. Arun, "A Smart Reader for Visually Impaired People Using Raspberry PI," IJESC, 2016, DOI 10.4010/2016.699, ISSN 2321 3361.
[3] Jamal S. Zraqou, Wissam M. Alkhadour, and Mohammad Z. Siam, "Real-Time Objects Recognition Approach for Assisting Blind People," International Journal of Current Engineering and Technology, E-ISSN 2277-4106, P-ISSN 2347-5161.
[4] Amal Jojie, Ashbin George, Dhanya Dhanalal Nayana J., "Book Reader for Blind," ISSN (e): 2250-3021, ISSN (p): 2278-8719, pp. 33-38.
[5] P. Rajeshwari, P. Abhishek, P. Srikanth, and T. Vinod, International Journal of Trend in Scientific Research and Development (IJTSRD), vol. 3, no. 3, Mar-Apr 2019. Available online: www.ijtsrd.com, e-ISSN 2456-6470.
[6] Ferdousi Rahman, Israt Jahan Ritun, Nafisa Farhin, and Jia Uddin, https://dl.acm.org/doi/proceedings/10.1145/3309074