Final Report
Controlling PowerPoint Using Hand Gestures
A Project Report submitted for the award of the degree of
Bachelor of Technology
in Computer Science and Engineering
Under the supervision of Dr. S. SRITHAR
K L E F, Green Fields
DECLARATION
The Project report entitled “Controlling PowerPoint Using Hand Gestures” is a record
of the bonafide work of Sk. Faiz Ahmed, P. Tushar, and P. Sandeep, submitted in
partial fulfillment of the requirements for the award of B.Tech in Computer Science
and Engineering to K L University. The results embodied in this report have not been
copied from any other departments/Universities/Institutions.
Signature of Student
CERTIFICATE
This is to certify that the Project report entitled “Controlling PowerPoint Using Hand
Gestures”, being submitted by Sk. Faiz Ahmed, P. Tushar, and P. Sandeep in partial
fulfillment of the requirements for the award of B.Tech in Computer Science and
Engineering to K L University, is a record of bonafide work carried out under our
guidance and supervision. The results embodied in this report have not been copied
from any other departments/Universities/Institutions.
Signature of Supervisor
S. SRITHAR
ACKNOWLEDGEMENT
I express my sincere gratitude to our HOD Dr. A. Senthil for his administration towards
our academic growth. I record it as my privilege to deeply thank him for providing us
with efficient faculty and the facilities to turn our ideas into reality.
I express my sincere thanks to our project supervisor, S. Srithar, for helping us
complete this work; irrespective of the time, he was available day and night to
encourage us and contribute ideas, and he motivated us to complete this report
successfully.
P. Tushar 190031258
P. Sandeep 190031228
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
I INTRODUCTION
I.I What Is a Gesture
I.II Types of Gestures
I.III Benefits of Gestures
I.IV Applications
II LITERATURE SURVEY
III THEORETICAL ANALYSIS
III.I Image Pre-processing
III.II Processing Types
III.III Frameworks
III.IV Libraries
IV EXPERIMENTAL INVESTIGATION
V EXPERIMENTAL RESULTS
VI PROBLEM STATEMENT
VII SYSTEM REQUIREMENTS
VIII FLOW CHART
IX CODE
X RESULTS AND ANALYSIS
X.I Testing Objectives
X.II Outputs
X.III Slide Changing
X.IV Cursor on Slide
X.V Drawing & Erasing
XI CONCLUSION
XI.I Conclusion
REFERENCES
PUBLICATION
LIST OF FIGURES
10.1 Results - 1
10.2 Results - 2
10.3 Results - 3
10.4 Results - 4
10.5 Analysis
ABSTRACT
CHAPTER-1
INTRODUCTION
The basic theme of a presentation is to show people how we imagine telling them
something, so the majority of people will use presentations at some point in their
lives, whether to present their current work or their plans for future work. To build
presentations we have many styles provided by Windows. However, to change a
slide we always need an external device, or we must be physically present at the
machine. So, in this era of new technologies and software, we thought of making
presentations more interactive, even for the people who are watching them.
Most systems have webcams these days, so we as a group thought that we should
capture a live video and recognize the gestures of the presenter to change the slide,
to show a cursor on a particular object in the presentation, to draw on the
presentation, and more. What does a presentation mean? A presentation is a
technique for conveying information from a speaker to an audience. Presentations
are usually demos, introductions, lectures, or speeches intended to enlighten,
convince, inspire, motivate, develop goodwill, or introduce a new idea. Presentation
abilities are one of the most important skill sets for students in higher education
institutions. Students are encouraged to adopt systematic approaches to support
their presentations in the classroom to help them develop their presenting abilities.
The fundamental reason for emphasizing presenting skills in higher education
is to assist students in establishing professionalism in the classroom. Students may
improve their presenting abilities using high-tech equipment in this day and age,
boosting their ability to convey material in a professional manner. Good presenting
abilities may also help students come up with better ideas, discover current
information, and build creative thinking. It is typical to see students use slides or
other materials when presenting, which helps them feel at ease in front of an
audience. When students develop original and fascinating slides to accompany their
discussion, they are able to generate fresh ideas. The use of presentation aids allows
for much more engaging discourse, and creating such aids may help students acquire
confidence. Students are generally allocated 10 to 15 minutes in the classroom to
present a certain topic. As a result, the more presentations they give, the more
they learn.
Applications of hand gestures in daily life:
Hand gesture recognition systems have been used for a variety of applications
in many fields, such as sign language translation, virtual worlds, smart surveillance,
robot control, medical systems, and so on. A summary of several hand gesture
application areas is shown below.
Because sign language is used for interpreting and explaining a certain topic
during a discussion, it has gained special attention. Many methods have been suggested
to identify gestures in various forms of sign languages. For instance, the boundary
histogram, an MLP neural network, and dynamic programming matching were used to
detect American Sign Language (ASL). Japanese Sign Language (JSL) was detected
using a recurrent neural network, covering 42 alphabets and 10 words. Arabic Sign
Language (ArSL) was detected using two types of neural networks, partially and fully
recurrent neural networks.
One of the intriguing applications in this sector is controlling a robot with
gestures. One study suggested a technique for directing a robot via hand position
signs that uses numbering to count the five fingers. Commands are given to the robot
to complete a certain task, with each sign having a distinct meaning and representing
a separate function; for example, "one" means "go ahead," "five" means "stop," and
so on.
Hand postures and gestures are used to control the television. The open and
close hand gestures are used to control TV activities such as turning the TV on and off,
increasing and decreasing the volume, muting the sound, and changing the channel.
Another contemporary use of hand motion is number recognition. In a real-
time system, an HMM was used to build an autonomous system that could extract and
detect meaningful gestures from the hand motions of Arabic digits ranging from 0 to 9.
CHAPTER-2
LITERATURE SURVEY
The hand signal is the finger position and posture of the hand that is commonly
employed for nonverbal communication [12]. Human hand and finger motions are
detected and identified in Python using a transfer learning method. Background
removal, hand ROI separation, contour detection, and finger recognition using
transfer learning are all part of this process flow, where a model is trained by a CNN
[5]. The major purpose, according to a review of the various other approaches
described by researchers, is to aid presenters in producing a great presentation
through more natural computer interaction [3].
We also studied the strategies and methods of hand gesture recognition through
the human-computer interface. Human-computer interaction is an important part of
most people's everyday lives. Our team investigated some Human-Computer
Interaction (HCI) strategies and methods for recognizing human hand gestures using
various methodologies [13]. The objective of gesture signal recognition is to build a
method that can categorize distinct human gestures and utilize them to control devices.
According to Freeman and Roth [14], gesture classification and interpolation
can be recognized from the orientation histogram. For dynamic gesture signals, the
spatio-temporal margin gradients feature vector is useful [11]. Hand gestures are a
unique approach to interaction between humans and computers. The hand gesture
method has the benefit of being simple to employ compared to previous approaches
[9]. The typical approach of utilizing a mouse and keyboard will be altered by
employing this technique, since hand gestures will be used to interact with the
computer. One such technology uses an ultrasonic sensor to categorize hand
movements in real time.
Gesture recognition enables you to manage your presentation without using
a remote control or even touching the screen of your smartphone or tablet. Because
PowerPoint can be used to create online lessons and demos, using gesture graphics
and symbols in our presentations to demonstrate how a programme is used may be
beneficial [6]. For example, if you can switch pages during a presentation using a
gesture, the presentation becomes more immersive and appealing to the audience [10].
This can also be beneficial for modern digital TVs that can recognize gestures,
such as the Samsung Smart TV. MS PowerPoint is an essential element of our
professional and academic lives.
Hand gesture recognition can be dynamic [8], automated, and resistant to
changes in pace, style, and hand position. Our method is based on action graphs,
which have resilience comparable to regular HMMs but require less training data,
since they allow states to be shared between movements [2].
Ahmed, K. et al. [1] developed a novel hand gesture signal recognition system:
a low-cost technique designed to achieve real-time hand gesture recognition. That
article discusses the idea of managing PowerPoint slides without a keyboard or
mouse, by utilizing hand gestures instead.
CHAPTER-3
THEORETICAL ANALYSIS
IMAGE PRE-PROCESSING:
Pre-processing is meant to raise the overall image quality so that we can study
the image in a much more thorough way. Pre-processing helps us eliminate unwanted
distortions and improve the few components that are crucial to the application we
are working on; those characteristics can change depending on the application.
Pre-processing refers to operations on pictures at the most basic level of abstraction,
where both input and output are intensity images. These iconic pictures are of the
same kind as the original sensor data, with an intensity image typically represented
by a matrix of image function values (brightnesses). Although geometric
transformations of images (e.g. rotation, scaling, and translation) are classified as
pre-processing methods here because similar techniques are used, the goal of
pre-processing is an improvement of the image data that suppresses unwanted
distortions or enhances image features important for further processing.
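The intensity-image operations described above can be sketched in a few lines; the luminosity weights and threshold value below are illustrative, mirroring what OpenCV's `cv2.cvtColor` and `cv2.threshold` compute, but implemented with NumPy alone so the sketch is self-contained:

```python
import numpy as np

def to_grayscale(img):
    # Luminosity weights for a BGR input, the same weighting that
    # cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) applies internally.
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    return np.rint(0.114 * b + 0.587 * g + 0.299 * r).astype(np.uint8)

def binarize(gray, thresh=127):
    # Same idea as cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY).
    return np.where(gray > thresh, 255, 0).astype(np.uint8)

# Synthetic 4x4 BGR frame: top half white, bottom half black.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[:2] = 255
mask = binarize(to_grayscale(frame))
print(mask[0, 0], mask[3, 3])  # → 255 0
```

In the real pipeline the same two steps run on each webcam frame before contour detection, so that lighting variations are reduced to a clean binary hand silhouette.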
The role of cameras in our connected future is shifting as human interaction
with technology becomes more natural and seamless. Camera makers are reinventing
their possibilities in the linked home, business, or school of the future, rather than
treating the camera as just a device for capturing and transmitting video. Gesture-
control cameras for conferences, in particular, are gaining popularity and creating
enthusiasm as the next generation of camera technology. These cameras watch hand
motions, rather than facial expressions or eye movements, in order to identify
pre-programmed gestures and respond by activating pre-determined actions. This
section looks at how such cameras might improve online meeting experiences, the
possible issues that may come from their incorporation into collaborative spaces,
and why we will be seeing a lot more of them in the near future.
FRAMEWORK:
LIBRARIES:
OpenCV: Applications for machine learning, image processing, and computer
vision all require OpenCV. Python, C++, Java, and many other programming
languages are supported by OpenCV. Photos and videos can be used to identify
objects, faces, or even a person's handwriting. It is a free library that has been
used for tasks including face recognition, object tracking, landmark recognition,
and many others. In this project, both video capture and hand detection are done
using it.
CHAPTER-4
EXPERIMENTAL INVESTIGATION
We have tried various tests in our project to find any mistakes or errors in
our experiment. In our investigation everything works fine, and moreover the cursor
option works absolutely accurately. The other three functions are: we can draw
anything on the presentation by showing our index finger; we can undo anything we
have drawn by showing three middle fingers; and we can highlight whatever we wish
to display in the presentation.
To extend our project we decided to control PDFs as well, so that any scrolling
document can be controlled using this project. We also tried the draw, erase, and
pointing options for PDFs, but due to glitches we have not completed them yet.
In the end, we achieved control of both presentations and PDF files using hand
gestures with a transfer learning algorithm.
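Both slide shows and PDF viewers page with the same arrow keys, which is why one set of gestures can drive either. A minimal sketch of that bridge, assuming keystroke synthesis (e.g. via pyautogui) as the delivery mechanism, which is not necessarily the exact mechanism in our final code:

```python
# Hypothetical mapping from recognized gesture actions to the key name
# that both PowerPoint slide shows and PDF viewers respond to.
KEY_FOR_ACTION = {
    "next_slide": "right",
    "previous_slide": "left",
}

def key_for(action):
    # Returns the key to synthesize, or None for unmapped actions.
    return KEY_FOR_ACTION.get(action)

# In the live loop one would then call, for example:
#   import pyautogui
#   pyautogui.press(key_for("next_slide"))
print(key_for("next_slide"))  # → right
```

Because the mapping is a plain table, extending the project from presentations to PDFs needed no new gesture logic, only the observation that the target application accepts the same keys.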
CHAPTER-5
EXPERIMENTAL RESULTS
When we show our gestures to the camera, the system recognizes them, and we can
also watch how the system recognizes our hand: it shows the skeleton of our hand,
i.e., whether our fingers are closed or open, in order to set and read the gesture.
We have already set some gestures and linked actions to them, so that whenever we
show a gesture, the system performs the action assigned to it.
In presentations there are five actions we can trigger with gestures: changing
the slide forward, changing the slide backward, drawing, undoing a drawing, and the
cursor. With these gestures we can present flawlessly without handling any external
device. This makes presentations more interactive and requires less effort compared
to the normal method.
The first action is moving the slide forward, which works accurately; to trigger
it we show our palm, i.e., all five fingers open. In the same way, to change the slide
backward we close all five fingers and show our fist to the system; it then moves the
slide back to the previous one. Both of these actions work 14 out of 15 times, and
even 15 out of 15 in well-lit surroundings when the camera starts to recognize.
The other three actions help the audience follow what is being explained; they
are most useful for interactive items, such as presentations containing graphs or
images, where the presenter wants to point out what he is teaching. For that purpose
we have three actions: first, we can draw on the slide wherever we want; the second
action undoes the drawing and works like an eraser; and the third action is for
pointing, i.e., with our finger we can point out anything on the presentation.
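The five bindings above can be summarized as a lookup table from cvzone-style `fingersUp()` patterns, ordered [thumb, index, middle, ring, pinky], to action names. The names and the pointer pattern here are illustrative, not the exact ones in our code:

```python
# Illustrative gesture table for the five actions described above.
GESTURE_ACTIONS = {
    (1, 1, 1, 1, 1): "next_slide",      # open palm
    (0, 0, 0, 0, 0): "previous_slide",  # closed fist
    (0, 1, 0, 0, 0): "draw",            # index finger only
    (0, 1, 1, 1, 0): "undo",            # three middle fingers
    (0, 1, 1, 0, 0): "pointer",         # index + middle as cursor
}

def classify(fingers):
    # Any unlisted pattern maps to "none" so the loop simply ignores it.
    return GESTURE_ACTIONS.get(tuple(fingers), "none")

print(classify([1, 1, 1, 1, 1]))  # → next_slide
print(classify([0, 1, 0, 0, 0]))  # → draw
print(classify([1, 1, 0, 0, 0]))  # → none
```

Keeping the bindings in one table also makes the gestures easy to customize without touching the recognition code.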
CHAPTER-6
PROBLEM STATEMENT
In the current world, with the growth of technology, we prefer using electronic
devices to showcase our work with the help of PowerPoint presentations. PowerPoint
is a popular presentation program developed by Microsoft, and it plays a very
important role in daily life. But to interact with a PowerPoint presentation we must
approach the device to change the slide, to write, or to explain. To make life easier,
we have come up with natural, live interaction with PowerPoint presentations. The
use of PDF documents has also increased in our day-to-day life, so we have created
a solution in which you need not rely on a device to change the slide or to change
the page in a PDF.
CHAPTER-7
SYSTEM REQUIREMENTS
- Webcam: a video camera that can stream images or video in real time to or
  across a network. In the proposed system, the webcam captures pictures of
  the hand motions the user makes.
- 4 GB RAM
- Any modern CPU
- 2.5 GB of disk space plus 1 GB for caches
- 1024x768 monitor resolution
- Operating system: Microsoft Windows 8 or later, macOS 10.14 or later, or
  any Linux distribution that supports GNOME, KDE, or Unity DE. PyCharm is
  not available for some Linux distributions, such as RHEL 6 or CentOS 6,
  that do not include GLIBC 2.14 or later.
CHAPTER-8
FLOW CHART
From the above flowchart we can see that when the implementation starts, the
system first recognizes our live actions. From those it detects which fingers are
open or closed; based on that it decides which action to perform and applies it to
the presentation.
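The "decide and apply" step also has to avoid firing the same action on every frame while a gesture is held. A minimal sketch of that debounce logic, with illustrative names mirroring the buttonPressed/counter variables in the code chapter:

```python
class GestureDebounce:
    """Suppress repeated triggers for `delay` frames after one fires.

    Illustrative sketch of the buttonPressed/counter idea, not the
    exact implementation used in our code."""

    def __init__(self, delay=30):
        self.delay = delay
        self.pressed = False
        self.counter = 0

    def fire(self):
        # Returns True only if the action is allowed to run now.
        if self.pressed:
            return False
        self.pressed = True
        return True

    def tick(self):
        # Call once per frame to count down the cool-off period.
        if self.pressed:
            self.counter += 1
            if self.counter > self.delay:
                self.counter = 0
                self.pressed = False

d = GestureDebounce(delay=2)
print(d.fire())  # → True (first trigger accepted)
print(d.fire())  # → False (still cooling off)
d.tick(); d.tick(); d.tick()
print(d.fire())  # → True (cool-off elapsed)
```

At 30 frames of delay and a typical 30 fps webcam, one held gesture therefore advances at most one slide per second.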
CHAPTER-9
CODE
from cvzone.HandTrackingModule import HandDetector
import cv2
import os
import numpy as np

# Parameters
width, height = 1280, 720
gestureThreshold = 300
folderPath = "Presentation"

# Camera Setup
cap = cv2.VideoCapture(0)
cap.set(3, width)
cap.set(4, height)

# Hand Detector
detectorHand = HandDetector(detectionCon=0.8, maxHands=1)

# Variables
pathImages = sorted(os.listdir(folderPath), key=len)  # slide images
delay = 30
buttonPressed = False
counter = 0
imgNumber = 0
annotations = [[]]
annotationNumber = -1
annotationStart = False

while True:
    # Get the camera frame and the current slide image
    success, img = cap.read()
    img = cv2.flip(img, 1)
    pathFullImage = os.path.join(folderPath, pathImages[imgNumber])
    imgCurrent = cv2.imread(pathFullImage)

    # Detect the hand, draw its skeleton and the gesture line
    hands, img = detectorHand.findHands(img)
    cv2.line(img, (0, gestureThreshold), (width, gestureThreshold), (0, 255, 0), 10)

    if hands and buttonPressed is False:
        hand = hands[0]
        cx, cy = hand["center"]
        lmList = hand["lmList"]
        fingers = detectorHand.fingersUp(hand)
        indexFinger = lmList[8][0], lmList[8][1]

        # Gesture checks (reconstructed to match the actions in Chapter 5)
        if cy <= gestureThreshold:  # hand must be above the green line
            if fingers == [1, 1, 1, 1, 1] and imgNumber < len(pathImages) - 1:
                imgNumber += 1  # open palm -> next slide
                buttonPressed = True
                annotations, annotationNumber = [[]], -1
            if fingers == [0, 0, 0, 0, 0] and imgNumber > 0:
                imgNumber -= 1  # closed fist -> previous slide
                buttonPressed = True
                annotations, annotationNumber = [[]], -1

        if fingers == [0, 1, 1, 0, 0]:  # index + middle -> pointer
            cv2.circle(imgCurrent, indexFinger, 12, (0, 0, 255), cv2.FILLED)

        if fingers == [0, 1, 0, 0, 0]:  # index only -> draw
            if annotationStart is False:
                annotationStart = True
                annotationNumber += 1
                annotations.append([])
            annotations[annotationNumber].append(indexFinger)
            cv2.circle(imgCurrent, indexFinger, 12, (0, 0, 255), cv2.FILLED)
        else:
            annotationStart = False

        if fingers == [0, 1, 1, 1, 0] and annotationNumber >= 0:
            annotations.pop(-1)  # three middle fingers -> undo last stroke
            annotationNumber -= 1
            buttonPressed = True
    else:
        annotationStart = False

    if buttonPressed:  # debounce so one gesture fires one action
        counter += 1
        if counter > delay:
            counter = 0
            buttonPressed = False

    # Redraw the stored annotation strokes on the current slide
    for annotation in annotations:
        for j in range(1, len(annotation)):
            cv2.line(imgCurrent, annotation[j - 1], annotation[j], (0, 0, 200), 12)

    cv2.imshow("Slides", imgCurrent)
    cv2.imshow("Image", img)

    key = cv2.waitKey(1)
    if key == ord('q'):
        break
CHAPTER-10
RESULTS & ANALYSIS
When we execute the above code, a live video with a green line is shown on the
display, and we can see the structure of our fingers and hands; to be precise, it looks
like a skeleton. With that we can see which finger is up and which finger is down:
our code records 1 if a finger is up and 0 if it is down. We can customize the code to
change the respective gestures.
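The 1/0 finger encoding can be sketched as follows. Landmark indices follow the 21-point MediaPipe hand model that cvzone's detector returns; the check itself is an illustration, not the exact cvzone implementation, and the thumb is omitted because it needs an x-axis comparison instead:

```python
# A finger counts as "up" (1) when its tip lies above its middle joint
# in image coordinates (smaller y means higher on screen).
TIP_IDS = [8, 12, 16, 20]   # index, middle, ring, pinky fingertips
PIP_IDS = [6, 10, 14, 18]   # the corresponding middle joints

def fingers_up(landmarks):
    """landmarks: list of (x, y) pairs indexed by landmark id."""
    return [1 if landmarks[tip][1] < landmarks[pip][1] else 0
            for tip, pip in zip(TIP_IDS, PIP_IDS)]

# Synthetic hand: every landmark at y=100 except a raised index fingertip.
hand = [(0, 100)] * 21
hand[8] = (0, 50)
print(fingers_up(hand))  # → [1, 0, 0, 0]
```

The resulting list of 1s and 0s is exactly the pattern the gesture table matches against.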
Now we can see that if we show a thumbs-up, the respective slide is changed, as
shown in the figure where the last slide of the presentation has changed. Another
point regarding our project: we should show our hand near the green line, which is
our centre line, because our algorithm works only if our hand is present on that line.
As the image shows, the person is showing his thumb and the slide has changed, i.e.,
moved forward.
Fig-10.2 Next page navigation
This image clearly shows that the person shows his thumb and the slide has
been changed to the next one. If we are in well-lit room conditions, our gestures
work accurately; in low-light conditions, the captured images have more noise. So,
to reduce image noise and get good output from the presentation, we have to be in a
well-lit room.
The other three functionalities are: if we show our index finger, we can draw
anything on the presentation; the second functionality is that we can undo anything
we have drawn by showing three middle fingers; and the third functionality is that
we can point out what we want to show in the presentation.
Fig-10.4 Draw option
As per our analysis, every function of our project is accurate, with an accuracy rate
of more than 80% based on our results, except the left-hand function. Moreover, the
live video also plays an important role: the environment should be uncluttered, there
should not be multiple palms in the video, the camera should be capable of capturing
the right images, and the live video should have little image noise. With less image
noise, the accuracy rate may be higher than our results.
CHAPTER-11
CONCLUSION
We conclude that with the help of a live camera we can operate presentation
slides and PDFs even without reaching the device we are using to display them. We
can effortlessly use our gestures to change the slides, to point something out, or to
draw on the presentation pages.
This makes life easier compared to physically reaching the device in the middle
of a presentation to change the slides or to point out what we are discussing.
With the help of this project, the PowerPoint presentation becomes easier than
the traditional way. We used some code and the webcam that is present in almost all
devices these days to make this possible. We conclude that we can operate slides with
hand gestures, recognized through the fingers and their actions, i.e., hand signs,
making it easy to present your presentations without distractions.
REFERENCES
[1] Ahmed, K., et al., "A new hand gestures recognition system", Indonesian Journal of
Electrical Engineering and Computer Science, Vol. 18, No. 1, pp. 49-55, 2020.
[2] A. Kurakin, et al., "A real time system for dynamic hand gesture recognition with a depth
sensor", Signal Processing Conference (EUSIPCO), January 2012.
[3] Damiete O. Lawrence, et al., "Impact of Human-Computer Interaction (HCI) on users in
higher educational system: Southampton University as a case study", International Journal
of Management Technology, Vol. 6, 2019.
[4] L. Chen, et al., "A Survey on Hand Gesture Recognition", 2013 International Conference on
Computer Sciences and Applications, pp. 313-316, 2013.
[5] Li, G., et al., "Hand gesture recognition based on convolution neural network", Cluster
Computing, Vol. 22 (Suppl 2), pp. 2719-2729, 2019.
[6] Meera Paulson, et al., "Smart Presentation Using Gesture Recognition", International
Journal for Research Trends and Innovation, Vol. 2, Issue 3, 2017.
[7] Munir Oudah, et al., "Hand Gesture Recognition Based on Computer Vision: A Review of
Techniques", Journal of Imaging, Vol. 6, p. 73, 2020.
[8] P. Molchanov, et al., "Multi-sensor system for driver's hand-gesture recognition", 2015 11th
IEEE International Conference and Workshops on Automatic Face and Gesture
Recognition (FG), pp. 1-8, 2015, doi: 10.1109/FG.2015.7163132.
[9] R. Remolda, et al., "A study on controlling a computer with hand gesture", International
Journal of Computer Science and Mobile Computing, Vol. 8, Issue 9, pp. 215-218, 2019.
[10] Sebastian Raschka, et al., "Machine Learning in Python: Main Developments and
Technology Trends in Data Science, Machine Learning, and Artificial Intelligence",
Multidisciplinary Digital Publishing Institute, 2020.
[11] S. J. Wan, et al., "Variance based color image quantization for frame buffer display",
Color Research & Application, Vol. 15, No. 1, pp. 52-58, 1990.
[12] Srinivasa Rao K., et al., "Hand Gesture Recognition and Appliance Control Using Transfer
Learning", International Journal of Engineering Research and Applications, Vol. 11,
Issue 7 (Series III), pp. 37-46, July 2021.
[13] Sundus Munir, et al., "Hand Gesture Recognition: A Review", International Journal of
Scientific & Technology Research, Vol. 10, 2021.
[14] William T. Freeman, et al., "Orientation Histograms for Hand Gesture Recognition",
Mitsubishi Electric Research Laboratories Inc., IEEE Intl. Wkshp. on Automatic Face and
Gesture Recognition, Zurich, 1995.