In-house Report
2024-2025
( VI Semester)
A Project Report on
“Gesture-Controlled Virtual Mouse Using Real-Time Hand Tracking”
submitted in partial fulfillment of the requirements for the degree of
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
Submitted by
22BTRCN047,22BTRCN068,22BTRCN110,
22BTRCN206, 22BTRCN211
CERTIFICATE
This is to certify that the project work titled “Gesture-Controlled Virtual Mouse Using
Real-Time Hand Tracking” is carried out by Bathala Harsha (22BTRCN047), Deekshith
R (22BTRCN068), Hrishikesh U Gowda (22BTRCN110), Poornesh D (22BTRCN206), and
Pranshu Jain (22BTRCN211), bonafide students of Bachelor of Technology at the School of
Engineering & Technology, Faculty of Engineering & Technology, JAIN (Deemed-to-be
University), Bangalore, in partial fulfillment of the in-house project requirement during the
year 2024-2025.
Place : Bangalore
Date :
Signature of Student(s)
Gesture-Controlled Virtual Mouse Using Real-Time Hand Tracking
ABSTRACT
The primary objective of this project is to create and deploy a gesture-based virtual
mouse system that integrates real-time hand tracking, mouse cursor movement,
mouse click recognition, and scrolling. The system takes advantage of a webcam to
recognize hand gestures, providing a natural, free-hand experience for users.
Computer vision libraries, including OpenCV and MediaPipe, are applied to track
the hand, and pyautogui is used for emulating the movement of the mouse on the
computer.
Besides, this project explores the viability of multi-hand gesture control wherein the
right hand is used for mouse movement and clicking, while the left hand may be
used for scrolling. A mode-switching functionality is also provided such that the user
can interchange the roles of the right and left hands. Through this project,
sophisticated concepts in interactive user interface design, machine learning, and
computer vision are demonstrated, which lead to a seamless and immersive user
experience.
TABLE OF CONTENTS
Chapter 1
1. Introduction
1.1 Background & Motivation
1.2 Objective
1.3 Delimitation of Research
1.4 Benefits of Research
Chapter 2
2. Literature Survey
2.1 Literature Review
2.2 Inferences Drawn from Literature Review
Chapter 3
3. Problem Formulation and Proposed Work
3.1 Introduction
3.2 Problem Statement
3.3 Proposed Algorithms
3.4 Proposed Work
Chapter 4
4. Implementation
Software Algorithm
Chapter 5
Results and Discussion
Chapter 6
Conclusions and Future Scope
Appendices
Appendix – I
Appendix – II
Information Regarding Student
Photograph Along With Guide
Chapter 1
1. INTRODUCTION
In the modern computer world, human-computer interaction (HCI) is growing more
complex, and the trend is moving toward gesture-based interfaces to enable users to
communicate with systems in a more natural way. The conventional input devices like the
mouse and keyboard are being supplemented, and in some cases replaced, by more natural,
hands-free alternatives. Computer vision-driven gesture systems are proving to be a
promising substitute for conventional input devices, offering a simple and seamless way of
commanding digital devices. The "Gesture-Controlled Virtual Mouse Using Real-Time
Hand Tracking" project aims to create a system through which people can control their
computers with natural hand movements, with greater ease and comfort of use as the
guiding goal. With the aid of real-time hand tracking, the system identifies specific
hand movements to simulate mouse movement, clicking, and scrolling, delivering an
immersive and user-friendly experience.
1.1 Background & Motivation
Breakthrough developments in computer vision and machine learning have greatly
affected human-computer interaction. Traditional input methods, such as the mouse and
keyboard, can be limiting and laborious, especially for users with disabilities or those who
prefer a more intuitive interface. Gesture input has the potential to remove these
limitations and let users operate their machines without ever touching a mouse or
keyboard. The motivation for this project is to explore gesture-based hand interaction as
an intuitive and natural way to work with a computer. Existing systems are often
hardware-intensive or require significant training. This project is intended to create a
lightweight, easy-to-use system that operates with nothing more than a standard webcam,
providing a simple yet stable way for users to control their devices through gestures.
Figure: Flowchart for developing the Virtual Mouse system using hand gesture recognition
and computer vision techniques.
1.2. Objective
The main goal of this project is to implement and create a gesture-based virtual mouse
system for interacting with computers by hand gestures. The application shall:
• Offer real-time tracking of hands to identify gestures and move the mouse cursor.
• Implement left and right click capabilities through certain hand gestures.
• Be able to enable multi-hand gestures, where each hand enables a different operation
(right hand cursor movement, left hand scroll).
• Have a mode-switching capability to flip the hands' roles for further control flexibility.
• Display real-time on-screen instructions that guide users in operating the system.
Ultimately, the system aims to combine accurate gesture recognition with intuitive user
engagement to produce a responsive, hands-free virtual mouse that improves the user
experience.
1.3 Delimitation of Research
To keep the project scope manageable and focused, the following restrictions are
imposed:
• The system uses a webcam to capture hand gestures, without requiring any special
hardware such as depth sensors.
• It does not support voice recognition or other advanced input modes; it relies solely
on hand gestures.
• The system employs a limited number of hand movements to ensure simplicity,
focusing on real-time cursor movement, clicks, and scrolling.
• The project has been implemented for Windows operating systems and may need
additional adaptations for cross-platform use in the future.
• The system does not employ advanced error handling or custom machine learning
model training for recognizing hand gestures; instead, it relies on pre-trained models
for efficiency.
These limitations keep the project feasible while still demonstrating essential principles
of hand gesture recognition and computer vision.
1.4 Benefits of Research
• Educational Value: The project offers a rich learning environment for studying
computer vision, hand tracking, and user interface design.
• Innovative Interaction: The project provides a new, intuitive, hands-free way of
interacting with the computer for anyone who prefers more comfortable input
methods.
• User Experience Centricity: With real-time gesture recognition, the project focuses
on creating an intuitive and user-centric experience.
• Scalability: The system design can be scaled in the future even more to include
additional features, such as voice operation, multi-device support, or more complex
gestures.
• Practical Application: The project could serve as the platform on which more
complex gesture-controlled systems are built, with uses across many industries such
as assistive technology, gaming, and design.
On a broader scale, the project explores the possibility of combining computer vision,
machine learning, and human-computer interaction to develop innovative systems that
enable more natural interaction between humans and technology.
Chapter 2
2. LITERATURE SURVEY
The analysis of current literature brings out the accelerated development and usability of
gesture-based Human-Computer Interaction (HCI), specifically in the field of virtual
mouse systems. The consistent finding across a series of studies is the greater use of
computer vision and real-time hand tracking to simulate the functionality of a traditional physical
mouse. Such systems usually utilize technologies like OpenCV, MediaPipe, or depth
sensors like Kinect to record hand movements through a regular webcam, allowing control
over cursor dynamics, clicking, and scrolling actions.
One of the main drivers of these developments is the minimization of reliance on physical
input devices. This is particularly important for optimizing accessibility for individuals
with physical disabilities and minimizing touch in public or shared computing
environments, which is particularly relevant in the context of health crises like the COVID-
19 pandemic. Studies also establish that gesture-controlled systems can be used in a range
of environments, including factories, remote locations, and public kiosks, where
conventional input is impractical or undesirable.
The literature reviewed exhibits very high rates of accuracy in gesture recognition and
system responsiveness. For example, there are some systems with over 99% recognition
rates, which indicates that there has been significant improvement in reliable input
interpretation. In spite of these improvements, a number of studies identify areas for
improvement. These include improving gesture recognition under varying lighting
conditions, refining motion smoothing algorithms, and ensuring consistent performance
across different user hand shapes and sizes.
In addition, the literature indicates the potential for expanding these systems into more
application-specific uses. Potential future enhancements include incorporating machine
learning methods for dynamic gesture recognition, as well as integrating multiple data
sources in different modalities to enhance system robustness and user flexibility.
In conclusion, the current body of work strongly underscores the promise of hand-
gesture-controlled virtual mouse systems. As computer vision and real-time processing
technologies continue to improve, gesture-control systems will provide increasingly
seamless, natural, and touchless user experiences, transforming paradigms in computing
interaction.
Chapter 3
3. PROBLEM FORMULATION AND PROPOSED WORK
3.1. Introduction
With the increasing need for more intuitive and user-friendly input systems, the
conventional inputs such as the mouse and keyboard are being replaced more and more by
gesture-based input systems. With the development of technology, particularly in computer
vision, there is a potential for developing hands-free interaction systems with a more natural
and effective method of interacting with digital devices. However, the majority of modern
gesture recognition systems are handicapped by the necessity for high accuracy, low
latency, or ease of use. The goal of this project is to build a solid system for a virtual mouse
that, with real-time hand tracking, allows users to move the cursor, click, and scroll at their
own discretion without any need for physical peripherals, thereby lowering barriers to use
and user comfort.
3.2 Problem Statement
The conventional method of mouse input is based largely on physical interaction, which may
be uncomfortable and inefficient, especially for disabled individuals or those wanting a more
ergonomic and intuitive interface to interact with their devices. Existing gesture-controlled
systems are generally hampered by inaccuracy in tracking, poor gesture recognition, and
unresponsiveness. Additionally, the majority of gesture-based systems require special
hardware or extensive calibration processes, thereby limiting their utility. This project
eliminates such obstacles by utilizing a real-time hand gesture-based virtual mouse system
based on computer vision technology such as OpenCV and MediaPipe to efficiently trace and
recognize the movements of hands with minimal latency. The system is user-friendly,
requiring only a standard webcam and no special hardware.
3.4 Proposed Work
The objective of this project is to develop an efficient and easy-to-use virtual mouse system
that supports real-time mouse cursor, click, and scrolling control with hand tracking. The
project will comprise the following components:
• Real-Time Gesture Recognition and Hand Tracking: OpenCV and MediaPipe will
track the movement of the hands and convert it to make the mouse pointer move.
• Mouse Functions and Cursor Movement: The users will drive the system with simple
hand motions such as opening, closing, or pointing, which will be interpreted as cursor
movement.
• Advanced User Experience: A role-swapping feature will let users exchange their
hands' roles, and real-time on-screen instructions will help users adapt to the
gestures.
• Backend Integration: PyAutoGUI will handle mouse movement and clicking, while the
system will use OpenCV for video capture and MediaPipe for hand gesture tracking.
• User Friendliness and Accessibility: The system shall be simple to use, with minimal
or no setup and calibration, and shall be usable by people with physical disabilities,
offering a hands-free alternative to traditional mouse input.
The system will provide an interactive, fluid experience that may be expanded in the future
to enable other capabilities such as voice command, multiple hand tracking, or
compatibility with other smart products. It will offer a fast, real-time, and personal solution
for controlling a mouse using hands.
Chapter 4
4. SOFTWARE ALGORITHM
Initialization Overview:
The virtual mouse system is intended to track and interpret hand movements through real-
time video feeds to manipulate the mouse pointer, simulate clicks, and facilitate scrolling.
The system employs MediaPipe for tracking hands, pyautogui for mimicking mouse
movement, and OpenCV for capturing and processing the video feed.
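As an illustration of the cursor-movement step, the mapping from MediaPipe's normalized landmark coordinates (in the range 0 to 1) to screen pixels might look like the following sketch. The margin value and function name are illustrative assumptions, not taken from the implementation; in the full system the result would be passed to pyautogui.moveTo().

```python
def landmark_to_screen(x_norm, y_norm, screen_w, screen_h, margin=0.1):
    """Map a normalized MediaPipe landmark coordinate (0..1) to screen pixels.

    A margin is cropped from the frame edges so the user can reach the
    screen corners without moving the hand out of the camera's view.
    (margin=0.1 is an illustrative value, not stated in the report.)
    """
    # Rescale the usable [margin, 1 - margin] band to the full [0, 1] range.
    usable = 1.0 - 2.0 * margin
    x = (x_norm - margin) / usable
    y = (y_norm - margin) / usable
    # Clamp so the cursor never leaves the screen.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    return int(x * (screen_w - 1)), int(y * (screen_h - 1))
```

The clamping step means hand positions inside the margin band simply pin the cursor to the nearest screen edge rather than producing out-of-range coordinates.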
5. Scrolling Control
• Scroll Detection:
Scrolling is handled by the left or right hand depending on the mode setting. The
system tracks the vertical movement of the index finger and, based on its motion
relative to the palm, determines the scroll direction.
• Scroll Sensitivity:
A scroll sensitivity factor adjusts the responsiveness of scrolling operations. A
minimum vertical movement threshold (0.005 normalized units) filters out jitter and
ensures a smoother scrolling experience.
6. Mode Switching
• Hand Role Assignment:
The system lets users dynamically swap the control and scroll roles of the left and
right hands via the 'M' key, offering flexibility for different user preferences.
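A minimal sketch of the 'M'-key role-swapping logic follows; the function and variable names are illustrative, and in the full system the key would come from OpenCV's cv2.waitKey loop.

```python
def handle_key(key, control_hand):
    """Return the hand assigned to cursor control after a keypress.

    'm' (or 'M') swaps the control/scroll roles of the two hands;
    any other key leaves the assignment unchanged. The hand not
    returned here is implicitly the scrolling hand.
    """
    if key in ('m', 'M'):
        return 'Left' if control_hand == 'Right' else 'Right'
    return control_hand
```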
7. Real-Time Instructions and Display
• Usage Instructions:
A semi-transparent overlay at the top of the screen displays real-time usage
instructions that guide the user. The instructions change based on the mode (left-hand
or right-hand control/scroll).
• Video Feed Display:
The instructions update dynamically so that the user always knows what each hand
currently does (i.e., which hand controls the cursor and which scrolls).
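The mode-dependent overlay text could be assembled along these lines; the exact wording and function name are illustrative, and the returned string would be drawn onto the frame with cv2.putText in the full system.

```python
def instruction_text(control_hand):
    """Build the one-line overlay instruction for the current mode.

    control_hand is 'Right' or 'Left'; the other hand scrolls.
    """
    scroll_hand = 'Left' if control_hand == 'Right' else 'Right'
    return (f"{control_hand} hand: move cursor / click | "
            f"{scroll_hand} hand: scroll | M: swap hands | Q: quit")
```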
8. Real-Time Updates and UI
• Camera Preview:
The video stream is resized and displayed in a self-contained window with real-
time user feedback about hand movements and gestures, improving the overall
user experience.
• Exit and Mode Switching:
Users can exit the application using the 'Q' key, and the 'M' key can alternate
between left and right hands without any glitches for control/scrolling.
9. Final Notes
• Hand Tracking and Gesture Recognition:
MediaPipe provides robust hand tracking with precise hand landmark detection.
The system targets meaningful landmarks such as the index, middle, and ring
fingertips to carry out actions such as clicks and scrolling.
• Usability:
The system provides smooth, natural control of cursor, click, and scroll with basic
hand movement, making interaction better without a traditional mouse or
touchpad.
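One common way to realize a click from fingertip landmarks, consistent with the landmark-based approach above, is a pinch test on the distance between two fingertips. The report does not specify its click gesture or threshold, so the landmark choice (MediaPipe thumb tip 4 and index tip 8) and the 0.05 threshold below are assumptions for illustration.

```python
import math

def is_click(thumb_tip, index_tip, threshold=0.05):
    """Detect a 'pinch' click from two normalized landmark positions.

    thumb_tip and index_tip are (x, y) tuples in normalized frame
    coordinates, e.g. MediaPipe hand landmarks 4 and 8. The 0.05
    threshold is an illustrative value, not one stated in the report.
    """
    dist = math.hypot(thumb_tip[0] - index_tip[0],
                      thumb_tip[1] - index_tip[1])
    return dist < threshold
```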
Chapter 5
5. RESULTS AND DISCUSSION
• Accuracy of Gestures:
The virtual mouse system shows high precision in converting hand movements into
gestures: the cursor follows the user's hand without noticeable lag, and real-time
tracking yields a smooth, responsive interface.
• User Experience:
The system's user-friendly design provides an engaging, hands-free experience with a
simple setup.
• Hand Landmarks:
Index, middle, and ring fingers are the most significant hand landmarks used to identify
gestures. The location of these landmarks relative to one another aids in distinguishing
between movement, scrolling, and click gestures.
• Smoothing Factor:
An alpha (exponential) smoothing factor plays a critical role in producing natural,
smooth cursor movement.
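A minimal sketch of alpha (exponential) smoothing as applied to the cursor position follows; the alpha default is an illustrative value, not one stated in the report.

```python
def smooth(prev, curr, alpha=0.2):
    """Exponentially smooth the cursor position.

    prev is the last smoothed (x, y) position and curr the newly
    detected one. Lower alpha gives smoother but laggier movement;
    alpha=0.2 is an assumed default.
    """
    return (prev[0] + alpha * (curr[0] - prev[0]),
            prev[1] + alpha * (curr[1] - prev[1]))
```

Applied every frame, this damps the pixel-level jitter of raw landmark detections at the cost of a slight trailing lag.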
• Gesture Heatmap:
A heatmap of cursor movement can be plotted to visualize hand positions over time.
This helps verify that the system is tracking the hand's motion correctly.
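Such a heatmap could be accumulated from logged cursor positions as follows; the grid size is an illustrative choice, and rendering the counts (e.g. with matplotlib's imshow) is left out.

```python
def build_heatmap(positions, grid_w=8, grid_h=8):
    """Accumulate normalized cursor positions into a coarse 2-D count grid.

    positions is a list of (x, y) pairs with components in [0, 1].
    The returned grid of visit counts shows where the hand spent the
    most time, which helps sanity-check the tracking.
    """
    grid = [[0] * grid_w for _ in range(grid_h)]
    for x, y in positions:
        # Clamp the top edge so x == 1.0 or y == 1.0 stays in range.
        col = min(int(x * grid_w), grid_w - 1)
        row = min(int(y * grid_h), grid_h - 1)
        grid[row][col] += 1
    return grid
```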
5.4 Discussion
• Advantages:
o Accuracy and Speed: The system delivers accurate gesture detection and quick response
time.
o Seamless Integration: The virtual mouse integrates smoothly with existing user
interfaces to improve interaction without the need for extra hardware.
o User Flexibility: Mode-switching gives users more flexibility, allowing either hand to
operate the system.
• Limitations & Future Enhancements:
o Gesture Complexity: Additional gestures can be introduced to broaden functionality
(multi-finger gestures for complex operations).
o Device Support: Subsequent versions may incorporate support for mobile devices or
compatibility with smart glasses for augmented reality experiences.
Conclusion:
The virtual mouse system successfully integrates hand gesture recognition and real-
world usability to provide an intuitive input method for cursor control, click events, and
scrolling with little delay.
Chapter 6
6. CONCLUSIONS AND FUTURE SCOPE
6.1 Conclusions
The virtual mouse project has successfully demonstrated an innovative, hands-free method
of controlling a computer through real-time tracking of hand gestures. Using OpenCV,
MediaPipe, and pyautogui, the system effectively maps hand movement to cursor action,
with the additional capability of clicking and scrolling. The project points out the potential
of computer vision in creating intuitive human-computer interaction systems.
Major accomplishments include real-time cursor tracking, click and scroll gesture
recognition, and dynamic switching of hand roles.
6.2 Future Scope
While the virtual mouse project has reached a functional and stable state, there are several
avenues for improvement and future expansion:
1. Gesture Recognition Improvements
o Multi-Finger Gestures: Improve gesture recognition to support more
sophisticated multi-finger gestures for more sophisticated actions, such as swipe
gestures or pinch-to-zoom.
o Improved Tracking: Enhance the precision and responsiveness of hand
tracking by introducing machine learning models that have been trained to detect
smaller hand motions or more subtle gestures.
o Finger Pose Detection: Add finger pose recognition so users can manipulate
tasks such as object rotation or scaling using their fingers in 3D space.
2. Personalization to Users
o Hand Preference Learning: The system could learn and retain user preferences,
such as which hand controls the cursor or scrolls, and adjust automatically based
on usage patterns.
o Customizable Gestures: Allow users to define their own gestures for specific
functions, such as launching a particular application or performing an action.
3. Performance Optimization
o Latency Reduction: Increase processing speed and lower latency to make the
cursor glide even smoother and respond quicker to clicking. This can be done
through optimized hand detection algorithms and more advanced hardware.
o Cross-Platform Support: Make the system function smoothly across platforms
(Windows, macOS, Linux) and interact with standard operating systems'
accessibility features.
4. Advanced Interaction Features
o Voice Control Integration: Integrate voice command with gesture control to
introduce a more powerful interaction framework. For instance, users would be
able to say "click" to mimic a click without executing the associated gesture.
o Haptic Feedback: Provide haptic feedback through a wearable device so that
users receive a vibration when they undertake an activity such as a click or scroll.
By incorporating these enhancements, the virtual mouse project could evolve into a powerful
and versatile tool, suitable for various applications such as accessibility, gaming, and interactive
environments. The development of additional features will enable the system to offer even more
immersive, user-friendly, and practical solutions for hands-free interaction with technology.
REFERENCES
[3] N. S. TK and A. Karande, "Real-Time Virtual Mouse using Hand Gestures for
Unconventional Environment," 2023 14th International Conference on Computing
Communication and Networking Technologies (ICCCNT), Delhi, India, 2023, pp. 1-6,
doi: 10.1109/ICCCNT56998.2023.10308331.
[4] X. Xue, W. Zhong, L. Ye and Q. Zhang, "The simulated mouse method based on
dynamic hand gesture recognition," 2015 8th International Congress on Image and Signal
Processing (CISP), Shenyang, China, 2015, pp. 1494-1498, doi:
10.1109/CISP.2015.7408120.
SYSTEM ARCHITECTURE