Pramoul

The document is a major project report titled 'VERB - O - CONTROL' submitted by Jadala Pramoul for the Master of Computer Applications degree. It outlines the development of a voice-controlled presentation and slide navigator system that enhances user interaction with presentation tools through voice commands. The project aims to improve accessibility and engagement in presentations, particularly benefiting users with mobility challenges, and is built using Python and various libraries for speech recognition and automation.


A MAJOR PROJECT REPORT

on

“VERB - O - CONTROL”
Submitted in partial fulfillment of the requirement for the award of the degree
of

MASTER OF COMPUTER APPLICATIONS


by

JADALA PRAMOUL
UID: 111723039144

Under the Guidance of

Dr. P. Dayaker
HoD, Assistant Professor

DEPARTMENT OF MCA
LOYOLA ACADEMY DEGREE & PG COLLEGE

Alwal, Secunderabad 500010


(Autonomous and affiliated to Osmania University)
Re-accredited with ‘A’ Grade by NAAC
“College with Potential for Excellence” by UGC
LOYOLA ACADEMY DEGREE & PG COLLEGE
Alwal, Secunderabad 500010
(Autonomous and affiliated to Osmania University)

Department
of

Master of Computer

Applications

CERTIFICATE

This is to certify that this Major-project entitled “VERB - O - CONTROL” is the bona fide
work carried out by JADALA PRAMOUL, bearing UID: 111723039144, in LOYOLA
ACADEMY DEGREE & PG COLLEGE and submitted to OSMANIA
UNIVERSITY in partial fulfillment of the requirements for the award of Master of
Computer Applications.
TABLE OF CONTENTS
S.NO TOPICS

A Acknowledgement
B Declaration
C Abstract
D Table of Content
E List of tables
F List of Figures

Chapter - 1 INTRODUCTION
1.1 Purpose, Aim And Objectives
1.2 Background Of Project
1.3 Scope Of Project
1.4 Modules Description

Chapter - 2 SYSTEM ANALYSIS
2.1 Hardware And Software Requirements
2.2 Software Requirements Specification
2.3 Scope
2.4 Existing System
2.5 Proposed System

Chapter - 3 TECHNOLOGIES USED
3.1 Machine Learning
3.2 Python Programming Language
3.3 Frameworks
3.4 Packages and Libraries

Chapter - 4 SYSTEM DESIGN AND UML DIAGRAMS
4.1 Software Design
4.2 Architecture
4.3 Unified Modeling Language (UML)

Chapter - 5 INPUT/OUTPUT DESIGN
5.1 Input Design
5.2 Output Design

Chapter - 6 IMPLEMENTATION

Chapter - 7 TESTING
7.1 Testing Objectives
7.2 Testing Methodologies
7.3 User Training
7.4 Maintenance
7.5 Testing Strategies
7.6 Test Cases

Chapter - 8 OUTPUT SCREENS

Chapter - 9 CONCLUSION AND FUTURE SCOPE
9.1 Conclusion
9.2 Future Scope

Chapter - 10 BIBLIOGRAPHY
10.1 Websites
10.2 References
LIST OF FIGURES

FIGURE NO. NAME

4.2 SYSTEM FLOW DIAGRAM
4.3.2.1 CLASS DIAGRAM
4.3.2.2 USE CASE DIAGRAM
4.3.2.3 SEQUENCE DIAGRAMS
4.3.2.4 ACTIVITY DIAGRAM
4.3.2.5 COMPONENT DIAGRAM
4.3.2.6 DATA FLOW DIAGRAM
4.3.2.7 DEPLOYMENT DIAGRAM
7.6 (TEST CASES)
7.6.1 UPLOADING PDFS
7.6.2 PROCESSING PDFS
7.6.3 FILES PROCESSED AND WAITING FOR QUESTIONS
Verb-o-Control DEPARTMENT OF MCA

LIST OF OUTPUTS

FIGURE.NO NAME

8.1 Output from Chatbot
8.2 Settings of application
8.3 Clear Caches
8.4 Screen Recording of Chatbot


ACKNOWLEDGEMENT

This acknowledgment transcends the reality of formality, and I would like to express deep
gratitude and respect to all those people behind this project who guided, inspired, and helped
me complete it.

I express my profound gratitude to Rev. Fr. Dr N.B Babu SJ, the Principal of Loyola Academy
Degree and PG College, and Dr. P. Dayaker, the Head of the MCA department, for giving me
this opportunity to pursue this Major-project, which I was passionate about, and helping me
add a feather to my hat of educational assets.

I extend my special thanks to my internal guide, Dr. P. Dayaker, for the time and effort that
he provided throughout the year. His guidance, advice, and suggestions were extremely
helpful to me during the completion of the major project, and I am deeply grateful to him.

I acknowledge that this Major-project was completed entirely by me and not by someone else.

Place: Hyderabad JADALA PRAMOUL


111723039144


DECLARATION

I, Jadala Pramoul, a student of NMCA, hereby declare that the Major-Project titled “VERB
- O - CONTROL”, which is submitted by me to Dr. P. Dayaker, Loyola Academy Degree
and PG College, Alwal, Secunderabad, in partial fulfillment of the requirement for the award of
the degree of Master of Computer Applications, has not previously formed the basis for the award
of any degree, diploma or other similar title or recognition. The author attests that permission
has been obtained for the use of any copyrighted material appearing in the Dissertation or
Major-Project report, other than brief excerpts requiring only proper acknowledgment in
scholarly writing, and all such use is acknowledged.

Place: Hyderabad JADALA PRAMOUL


111723039144


ABSTRACT

“Voice-Controlled Presentation & Slide Navigator” is an assistive desktop-based system
designed to simplify the way users interact with presentation tools. The platform allows seamless
control over PowerPoint presentations using voice commands, eliminating the need for manual
navigation via mouse or keyboard. Users can perform essential actions such as starting or stopping
a slideshow, moving to the next or previous slide, and jumping directly to a specific slide, simply
by speaking the corresponding command.

A standout feature of this system is its slide content search. This enables users to speak a
keyword or phrase, and the system intelligently identifies and jumps to the most relevant slide
based on slide content. This makes it particularly useful in situations where quick access to
specific content is needed, enhancing both the efficiency and interactivity of the presentation
experience.

The system also supports future-ready integrations like multi-language command recognition and
gesture-based controls, making it accessible to a wider range of users, including those with
mobility challenges. It is built using Python, integrating libraries such as speech_recognition,
pyautogui, and python-pptx, and optionally leverages AI frameworks for content understanding
and speech processing.

By offering a voice-first interface for presentation control, the system enhances the delivery of
lectures, meetings, and seminars. It not only improves the flow and professionalism of
presentations but also brings greater accessibility and engagement to digital communication.
“Voice-Controlled Presentation & Slide Navigator” represents a smart leap toward hands-free
interactions in modern presentation environments.


1. INTRODUCTION

This chapter provides an overview of the system’s purpose, aim, objectives, background, scope, and
module-wise breakdown. The project focuses on enhancing the presentation experience using
intelligent voice-controlled technology.

1.1 PURPOSE, AIM AND OBJECTIVES:

The purpose of this project is to modernize and simplify presentation control using voice commands.
Presenters often struggle with seamless transitions during presentations due to the constant need to
use a keyboard, mouse, or remote. This system aims to eliminate that friction by allowing full slide
navigation through natural speech.

The primary aim of the project is to develop a reliable, real-time voice-based navigation system
for PowerPoint or Google Slides, enabling users to move between slides, jump to specific slides, or
even search based on slide content using spoken commands. The solution is especially beneficial
for differently-abled users and presenters seeking hands-free operation for a smooth delivery.

The objectives of the project are outlined as follows:

1. Hands-Free Presentation Control: Design a system that replaces manual navigation
with voice commands.
2. Improved Accessibility: Build an inclusive solution that aids individuals with physical
disabilities or mobility issues.
3. Real-Time Slide Matching: Implement AI-based keyword detection to locate and jump
to specific slides.
4. Enhance Engagement & Professionalism: Enable speakers to focus on their audience
without technical distractions.


1.2 BACKGROUND OF PROJECT:

In today’s digital classrooms, corporate meetings, and conferences, presentations are central to
communication. Yet the existing methods of slide control (clickers, mice, or keyboards) require
manual intervention, which can interrupt the flow of a presentation.

Inspired by advancements in speech recognition and AI, this project was initiated to provide an
intuitive, voice-first interface for navigating presentation slides. The idea is rooted in making
presentations more fluid, accessible, and engaging. As speech-based assistants like Alexa and Siri
gain traction, it is logical to apply similar technology to the domain of public speaking and education.

The system also caters to those who may not be able to physically interact with traditional input
devices, making it a more inclusive tool. With voice commands like “next slide,” “go to slide five,”
or “search revenue,” the presenter can maintain eye contact with the audience while keeping full
control over the presentation content. This project aims to redefine the way we interact with digital
content in real-time environments.

1.3 SCOPE OF PROJECT:

The scope of this project is focused on the development and implementation of a Python-based
voice-controlled system for controlling presentation slides. It involves several integrated
technologies including speech recognition, keyword search, AI-based slide matching, and GUI
automation.

Key activities and deliverables include:

• Speech Command Capture: Use microphones and APIs to recognize voice input in real-time.
• Slide Text Extraction: Extract text content from PPTX slides using Python libraries for
matching.
• Command Interpretation & Mapping: Parse recognized speech and match to relevant
actions or content.
• Slide Automation: Use automation tools to control presentations based on user commands.
• Accessibility Support: Provide features that enhance usability for differently-abled users.
• Optional Enhancements: Multi-language recognition, keyword-based search, and
voice-based annotations.

1.4 MODULES DESCRIPTION:

The Voice-Controlled Presentation & Slide Navigator system is divided into multiple functional
modules, each focusing on a key aspect of the system.

1. Voice Input & Recognition Module:
Responsibility: Captures real-time voice commands using the microphone.
Technology: Utilizes Python’s speech_recognition library or Whisper API for high accuracy.

2. Command Mapping & Execution Module:
Responsibility: Converts recognized commands into corresponding navigation actions.
Technology: Python-based logic that maps speech to functions like slide jump, start, stop, etc.

3. Slide Content Search Module:
Responsibility: Matches spoken keywords with slide content to jump directly to the most
relevant slide.
Technology: Implements transformers or sentence-transformers models for semantic similarity.

4. Slide Navigation Automation Module:
Responsibility: Automates PowerPoint control like next/previous slide, go to specific slide.
Technology: Uses pyautogui to simulate key presses and control the presentation.

5. User Interaction & Accessibility Module:
Responsibility: Improves usability with optional multi-language support and aids
differently-abled users.
Technology: Includes support for language translation APIs and adjustable sensitivity levels.
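
The hand-off from the recognition module to the command mapping module can be illustrated with a small parser that turns recognized text into an (action, argument) pair. This is a minimal sketch under stated assumptions, not the project's implementation: the function name `parse_command`, the action labels, and the `NUMBER_WORDS` table are hypothetical names introduced only for illustration.

```python
import re

# Hypothetical lookup table for spoken numbers (illustrative, covers 1-10 only).
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def parse_command(text):
    """Map recognized speech to an (action, argument) pair.

    The action names are illustrative labels, not part of any library.
    """
    text = text.lower().strip()
    # "search <keyword>" is checked first so that keywords such as
    # "next steps" are not mistaken for navigation commands.
    if text.startswith("search "):
        return ("search", text[len("search "):].strip())
    # Matches "go to slide 5" as well as "slide five".
    match = re.search(r"slide (\d+|[a-z]+)", text)
    if match:
        token = match.group(1)
        number = int(token) if token.isdigit() else NUMBER_WORDS.get(token)
        if number is not None:
            return ("goto", number)
    if "next" in text:
        return ("next", None)
    if "previous" in text or "back" in text:
        return ("previous", None)
    return ("unknown", None)
```

Returning a plain tuple keeps the parser independent of the automation layer, so the navigation module can decide separately how each action is turned into key presses.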


2. SYSTEM ANALYSIS

This chapter provides a comprehensive understanding of the system’s design process, including
hardware and software requirements, a Software Requirement Specification (SRS), and
comparisons between the existing and proposed systems. It outlines both functional and non-
functional requirements, supporting the full development of the project.

2.1 HARDWARE AND SOFTWARE REQUIREMENTS

2.1.1 HARDWARE REQUIREMENTS:

• Processor: Intel i5 or AMD Ryzen 5 and above
• RAM: Minimum 4 GB (8 GB recommended)
• Storage: 40 GB or more
• Microphone: Integrated or external microphone for speech input
• Display: Monitor or projector for presentation display

2.1.2 SOFTWARE REQUIREMENTS:

• Programming Language: Python
• Speech Libraries: speech_recognition, pyttsx3, Whisper, Google Speech API
• Automation Libraries: pyautogui, keyboard, pywin32
• Presentation Library: python-pptx
• Operating System: Windows / Linux
• IDE: VS Code, PyCharm, or any Python-supported editor
• Others: Microsoft PowerPoint / Google Slides (for presentation)
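
Before running the system, it can help to confirm that the Python packages listed above are installed. The following is a minimal sketch using only the standard library; the helper name `missing_modules` and the `REQUIRED_MODULES` tuple are assumptions made for illustration, and only a subset of the listed libraries is checked.

```python
import importlib.util

# Import names of some of the packages listed above; "pptx" is the import
# name of python-pptx. Grouping them in one tuple is an illustrative choice.
REQUIRED_MODULES = ("speech_recognition", "pyttsx3", "pyautogui", "pptx")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```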

2.2 SOFTWARE REQUIREMENT SPECIFICATION

2.2.1 SRS Overview:

Software Requirement Specification (SRS) acts as a communication bridge between clients and
developers. It includes understanding the problem, identifying goals, and preparing structured
requirements before actual development begins.

Requirement Analysis:

• Understand current limitations of manual presentation navigation
• Identify use cases like jump-to-slide, keyword search, and start/stop control
• Focus on building an assistive tool, especially helpful for differently-abled users

Requirement Specification:

• Translate the needs into actionable technical goals
• Prepare module-wise functionality to be implemented
• Validate the specification for completeness and correctness

2.2.2 ROLE OF SRS:

• Minimizes misunderstanding between developers and end-users
• Forms the foundation for design, development, and validation
• Ensures system meets user expectations and objectives

Functional Requirements

• Voice Input Module:
Captures real-time speech from the user through a microphone

• Speech Recognition Module:
Converts spoken commands into text using Whisper / Google Speech API

• Command Mapping Module:
Interprets recognized text to determine the required presentation action

• Slide Navigation Module:
Executes actions such as "next slide", "go to slide 3", "search keyword"

• Content Search Module (AI optional):
Matches user keywords to slide contents using embeddings and NLP

• Feedback Module:
Provides verbal confirmation using TTS (text-to-speech)
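
The feedback requirement can be sketched by keeping the wording of the confirmation separate from the speech engine that voices it. This is an illustrative sketch only: `confirmation_message`, `give_feedback`, and the action labels are hypothetical names, and the `speak` callback defaults to `print` so that the example runs without audio hardware.

```python
def confirmation_message(action, argument=None):
    """Build the sentence to be spoken after a command is executed."""
    if action == "goto":
        return f"Jumping to slide {argument}."
    if action == "search":
        return f"Searching for {argument}."
    if action in ("next", "previous"):
        return f"Moving to the {action} slide."
    return "Sorry, I did not understand that command."


def give_feedback(action, argument=None, speak=print):
    """Send the confirmation to a speech callback and return the message.

    `speak` defaults to print so the sketch runs anywhere. With pyttsx3
    installed, a callback could call engine.say(text) followed by
    engine.runAndWait() on an engine created with pyttsx3.init().
    """
    message = confirmation_message(action, argument)
    speak(message)
    return message
```

Passing the speech function as a parameter means the same logic can be exercised in tests with a list's `append` method and wired to a real text-to-speech engine in the running application.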


Non-Functional Requirements

• Performance:
Must recognize commands in real-time and execute within 1–2 seconds

• Reliability:
Should work with consistent accuracy under varying voice tones

• Scalability:
Should be extendable to control Google Slides or PDF presentations

• Security:
Should handle only user’s voice input without unauthorized triggers

• Usability:
Clean and intuitive to operate during live presentations

• Maintainability:
Should be modular and well-commented for future updates
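
The performance requirement of 1–2 seconds can be checked during testing with a small timing helper. The sketch below is an assumption about how such a check might look; `timed_call`, `within_budget`, and `LATENCY_BUDGET_SECONDS` are hypothetical names, and the budget uses the upper bound stated in the requirement.

```python
import time

# Upper bound taken from the performance requirement above.
LATENCY_BUDGET_SECONDS = 2.0


def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def within_budget(elapsed, budget=LATENCY_BUDGET_SECONDS):
    """Return True when a command finished inside the latency budget."""
    return elapsed <= budget
```

Wrapping the command handler in `timed_call` during a test session would give one elapsed time per spoken command, which can then be compared against the budget.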

2.2.3 SCOPE

This document serves as the single source of truth for system requirements. It ensures the design
remains aligned with user expectations and that all functional and non-functional needs are well-
documented. All future changes will follow a defined approval process.

2.2.4 EXISTING SYSTEM

In existing presentation systems, navigation is done via:

• Mouse clicks
• Keyboard keys (arrows, numbers, function keys)
• External presentation remotes

2.2.4.1 Drawbacks of Existing System

• Manual Operation: Disrupts presentation flow
• Accessibility Barriers: Not friendly for users with mobility impairments
• Limited Interaction: No intelligent control like search or jump-to-keyword
• Distraction: Presenter’s attention is divided between speaking and clicking
• Lack of AI: No dynamic search or interpretation of commands

2.2.5 PROPOSED SYSTEM

The proposed voice-controlled presentation navigator addresses the limitations of existing systems
by offering:

• Hands-free navigation
• Keyword-based slide search
• Integration with presentation software
• Real-time feedback to user

2.2.5.1 Advantages of Proposed System

• Fully Hands-Free Operation
• Voice-Driven Precision
• AI-Powered Slide Matching (optional)
• Accessibility for Differently-Abled Users
• Minimal Setup Required


3. TECHNOLOGIES USED

3.1 Python Programming Language

Python is chosen for its ease of use and availability of vast libraries such as speech_recognition,
pyttsx3, pyautogui, and more. It supports rapid prototyping and integration with external APIs.

3.2 Speech Recognition & NLP

• Google Speech API / Whisper: For high-accuracy speech-to-text
• SentenceTransformers: For slide matching based on embeddings
• TTS (pyttsx3 or gTTS): To provide spoken feedback

3.3 Automation & Slide Control

• PyAutoGUI: To simulate keyboard actions
• Python-pptx: For reading and parsing slides
• PyWin32: For deeper PowerPoint integration (Windows)

3.4 Importance of AI & NLP

• Enables smarter interaction with content
• Helps understand context and intent behind voice commands
• Allows dynamic control like “search machine learning” or “go to intro slide”

3.5 Machine Learning Process (If AI Module Added)

• Data Collection: Slides and matching voice inputs
• Preprocessing: Tokenization, embedding, noise removal
• Modeling: Training on commands vs slide contents
• Inference: Live matching of commands to best-fit slide
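
The preprocessing and inference stages listed above can be illustrated without a trained model. The sketch below swaps the embedding-based matching named in this chapter for a plain bag-of-words cosine similarity, so it is a simplified stand-in rather than the SentenceTransformers approach; the function names `tokenize`, `cosine_similarity`, and `best_slide` are assumptions introduced for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Preprocessing: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_slide(query, slide_texts):
    """Inference: return the 1-based number of the best-matching slide.

    Returns None when no slide shares a token with the query.
    """
    query_vec = Counter(tokenize(query))
    best_num, best_score = None, 0.0
    for num, text in enumerate(slide_texts, start=1):
        score = cosine_similarity(query_vec, Counter(tokenize(text)))
        if score > best_score:
            best_num, best_score = num, score
    return best_num
```

Unlike sentence embeddings, word counts only match exact tokens, so a query such as "income" would not find a slide about "revenue". That gap is the motivation for the optional AI module described in this chapter.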

3.1.4 TYPES OF MACHINE LEARNING:

Suitability for ‘Verb - O - Control’:

Python provides seamless integration with speech recognition libraries, GUI automation tools, and
PowerPoint handling frameworks, making it the ideal language for this project. Its flexibility allows
for robust voice command interpretation and precise slide navigation, aligning perfectly with the
goals of a smart, real-time presentation control system.

Applications in Verb - O - Control:

• Voice command recognition
• Slide navigation (next, previous, go to slide X)
• Keyword-based slide search
• Real-time feedback on voice actions

3.3 Frameworks

SpeechRecognition:

• Overview: Python library that enables speech-to-text functionality using APIs such as
Google Speech API.
• Use in Project: Converts spoken commands like “next”, “slide four”, or “search
introduction” into actionable instructions.
• Advantages: Simple to implement, supports multiple engines, well-maintained.

pyautogui:

• Overview: Cross-platform GUI automation tool that simulates mouse and keyboard events.
• Use in Project: Used to control PowerPoint slides (next, previous, specific slide jumps) via
keyboard shortcuts.
• Advantages: Platform-independent, reliable, minimal setup.

python-pptx:

• Overview: Python library for reading and manipulating .pptx files.
• Use in Project: Parses slide content to enable keyword-based search and slide indexing.
• Advantages: Access to slide titles, text, and layout for accurate mapping and navigation.
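
The slide indexing mentioned above can be sketched as an inverted index that maps each word to the slides containing it. In the project the slide text would come from python-pptx; in this sketch the slides are plain strings so that the example runs on its own, and the names `build_index` and `find_slides` are hypothetical.

```python
import re
from collections import defaultdict


def build_index(slide_texts):
    """Map each lowercase word to the sorted list of 1-based slide numbers."""
    index = defaultdict(set)
    for num, text in enumerate(slide_texts, start=1):
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(num)
    return {word: sorted(nums) for word, nums in index.items()}


def find_slides(index, keyword):
    """Return the slide numbers that contain every word of the keyword phrase."""
    words = re.findall(r"[a-z0-9]+", keyword.lower())
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)
```

Building the index once when the presentation is loaded means each spoken search needs only dictionary lookups instead of a scan over every slide.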

3.4 Packages and Libraries

SpeechRecognition:

• Purpose: Captures and processes real-time voice input.
• Usage: Converts user voice into commands to control the PowerPoint presentation.

python-pptx:

• Purpose: Parses slide content and structure.
• Usage: Extracts text from slides to allow keyword-based navigation (e.g., “search
summary”).

pyautogui:

• Purpose: Automates GUI operations like keystrokes.
• Usage: Simulates slide transitions (next, previous, jump to slide) using voice-triggered key
events.

pyttsx3:

• Purpose: Offline text-to-speech engine.
• Usage: Provides spoken feedback to the user, such as “Jumping to slide five.”

NumPy & Pandas:

• Purpose: Handles intermediate data processing, if needed.
• Usage: May assist in managing extracted slide data or maintaining logs of user actions.

dotenv & os:

• Purpose: Handles environmental configurations and API keys (if used).
• Usage: Securely stores voice recognition keys or setup parameters.
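
Reading keys from the environment rather than hard-coding them can be sketched with the standard `os` module alone. The helper name `load_setting` and the variable name `SPEECH_API_KEY` used in the usage note are assumptions for illustration; when python-dotenv is installed, its `load_dotenv()` function can populate the environment from a `.env` file before these lookups run.

```python
import os


def load_setting(name, default=None, required=False):
    """Read a configuration value from the process environment.

    Raises RuntimeError when a required value is missing, so that a
    misconfigured deployment fails early instead of at the first API call.
    """
    value = os.environ.get(name, default)
    if required and value is None:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

A call such as `load_setting("SPEECH_API_KEY", required=True)` would then be the only place in the code that touches the key, which keeps it out of the source files.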


4. SYSTEM DESIGN & UML DIAGRAMS

System design is the transition from a user-oriented document to one aimed at programmers or
database personnel. The design is a solution: it describes how to approach the creation of a new
system and is composed of several steps. It provides the understanding and procedural details
necessary for implementing the system recommended in the feasibility study. Design goes through
logical and physical stages of development. Logical design reviews the present physical system,
prepares input and output specifications, details the implementation plan, and prepares a logical
design walkthrough.

4.1 SOFTWARE DESIGN:

In designing the software, the following principles are followed:

• Modularity and partitioning: The software is designed so that the system consists of a
hierarchy of modules, each serving to partition it into separate functions.
• Coupling: Modules should have little dependence on other modules of the system.
• Cohesion: Each module should carry out a single processing function.
• Shared use: Avoid duplication by allowing a single module to be called by others
that need the function it provides.

4.2 ARCHITECTURE:

An architecture diagram is a diagram of a system in which the principal parts or functions
are represented by blocks connected by lines that show the relationships of the blocks. The
block diagram is typically used for a higher-level, less detailed description aimed more at
understanding the overall concepts and less at understanding the details of implementation.


In Verb - O - Control, the user speaks a command into the microphone. The voice input and
recognition module converts the speech to text, the command mapping module interprets the text,
and the slide navigation module carries out the action by sending key presses to PowerPoint
through pyautogui. Slide text extracted from the .pptx file with python-pptx supports
keyword-based search. The application is developed in Python.

FIGURE 4.2: SYSTEM FLOW DIAGRAM


4.3 UNIFIED MODELING LANGUAGE (UML) :


The Unified Modeling Language is a standard language for specifying, visualizing, constructing,
and documenting a system and its components. It is a graphical language that provides a
vocabulary and a set of semantics and rules. The UML focuses on the conceptual and physical
representation of the system. It captures the decisions and understandings about systems that
must be constructed. It is used to understand, design, configure and control information about
the systems.
Depending on the development culture, some of these artifacts are treated more or less
formally than others. Such artifacts are not only the deliverables of a project; they are also
critical in controlling, measuring, and communicating about a system during its development
and after its deployment.
The UML addresses the documentation of a system's architecture and all of its details.
The UML also provides a language for expressing requirements and for tests. Finally, the UML
provides a language for modeling the activities of project planning and release management.

4.3.1 BUILDING BLOCKS OF UML:

The vocabulary of the UML encompasses three kinds of building blocks:

• Things
• Relationships
• Diagrams

4.3.1.1 Things in the UML:

Things are the abstractions that are first-class citizens in a model; relationships tie these
things together; diagrams group interesting collections of things.
There are four kinds of things in the UML:

• Structural things
• Behavioral things
• Grouping things
• Annotational things


1. Structural things are the nouns of UML models. The structural things used in the
project design are:
• First, a class is a description of a set of objects that share the same attributes,
operations, relationships and semantics.

Fig: Classes (example class Window with attributes origin and size, and operations open(),
close(), move() and display())

• Second, a use case is a description of a set of sequences of actions that a system
performs that yields an observable result of value to a particular actor.

Fig: Use Cases

• Third, a node is a physical element that exists at runtime and represents a
computational resource, generally having at least some memory and often
processing capability.

Fig: Nodes

2. Behavioral things are the dynamic parts of UML models. The behavioral thing used
is:
• Interaction: An interaction is a behavior that comprises a set of messages
exchanged among a set of objects within a particular context to accomplish a
specific purpose. An interaction involves a number of other elements,
including messages, action sequences (the behavior invoked by a message), and
links (the connection between objects).

Fig: Messages
4.3.1.2 Relationships in the UML:

There are four kinds of relationships in the UML:
• Dependency
• Association
• Generalization
• Realization

• A dependency is a semantic relationship between two things in which a change to
one thing may affect the semantics of the other thing (the dependent thing).

Fig: Dependencies

• An association is a structural relationship that describes a set of links, a link being a
connection among objects. Aggregation is a special kind of association, representing a
structural relationship between a whole and its parts.

Fig: Association

• A generalization is a specialization/generalization relationship in which objects of
the specialized element (the child) are substitutable for objects of the generalized
element (the parent).

Fig: Generalization

• A realization is a semantic relationship between classifiers, wherein one classifier
specifies a contract that another classifier guarantees to carry out.

Fig: Realization


4.3.2 UML DIAGRAMS:

4.3.2.1 CLASS DIAGRAM:

A class is a representation of an object and, in many ways, it is simply a template from
which objects are created. Classes form the main building blocks of an object-oriented
application. For example, although thousands of students attend a university, you would only
model one class, called Student, which would represent the entire collection of students.

FIGURE 4.3.2.1: CLASS DIAGRAM


4.3.2.2 USE CASE DIAGRAM:

A use case diagram is a graph of actors, a set of use cases enclosed by a system boundary,
communication associations between the actors and the use cases, and generalization among use
cases. The use case model defines the outside (actors) and inside (use cases) of the system’s
behavior.

FIGURE 4.3.2.2: USE CASE DIAGRAM

4.3.2.3 SEQUENCE DIAGRAM:

Sequence diagrams are used to represent the flow of messages, events and actions
between the objects or components of a system. Time is represented in the vertical direction,
showing the sequence of interactions of the header elements, which are displayed horizontally
at the top of the diagram.


FIGURE 4.3.2.3: SEQUENCE DIAGRAM

4.3.2.4 ACTIVITY DIAGRAM:

Activity diagrams represent the business and operational workflows of a system. An
activity diagram is a dynamic diagram that shows the activity and the event that causes the
object to be in a particular state.
So, what is the importance of an activity diagram, as opposed to a state diagram? A
state diagram shows the different states an object is in during the lifecycle of its existence in
the system, and the transitions between those states. These transitions depict the activities
causing them, shown by arrows.

FIGURE 4.3.2.4: ACTIVITY DIAGRAM


4.3.2.5 COMPONENT DIAGRAM:

In the Unified Modeling Language, a component diagram depicts how components
are wired together to form larger components and/or software systems. Component diagrams are
used to illustrate the structure of arbitrarily complex systems.

FIGURE 4.3.2.5: COMPONENT DIAGRAM


4.3.2.7 DEPLOYMENT DIAGRAM:

Deployment diagrams are used to visualize the topology of the physical components of a system
where the software components are deployed. They describe the static deployment view of a
system and consist of nodes and their relationships.

FIGURE 4.3.2.7: DEPLOYMENT DIAGRAM


5. INPUT/OUTPUT DESIGN

5.1 INPUT DESIGN:

Input design for the voice-controlled presentation system covers the ways in which the user
supplies data to the system. The primary input is a spoken command captured through the
microphone, such as "next slide", "go to slide 3", or "search keyword". The second input is the
PowerPoint (.pptx) file, whose slide text is extracted for keyword search. Speech that cannot be
recognized is ignored silently and produces no command.

5.2 OUTPUT DESIGN:

Output design focuses on presenting the results of processing to the user in a meaningful
form. The main output of the system is the navigation action performed on the presentation:
moving to the next or previous slide, starting or stopping the slideshow, jumping to a specific
slide, or jumping to the slide that matches a spoken keyword. The feedback module can also
provide spoken confirmation through text-to-speech, such as “Jumping to slide five.”


6. IMPLEMENTATION

# Import Required Libraries
import speech_recognition as sr
import pyautogui
from pptx import Presentation
import time
import os
import re

# Configuration: Set PowerPoint File Path
ppt_path = r"C:\Users\Pramoul Jadala\OneDrive\Desktop\c\Presentation.pptx"

# Load Slide Text Content for Keyword Search
# Each entry is (1-based slide number, lowercased text of all shapes on that slide)
prs = Presentation(ppt_path)
slide_texts = []
for idx, slide in enumerate(prs.slides):
    text = ""
    for shape in slide.shapes:
        # Skip shapes (such as pictures) that carry no text
        if hasattr(shape, "text"):
            text += shape.text.lower() + " "
    slide_texts.append((idx + 1, text.strip()))


# Word-to-number mapping for natural-language commands
word_to_num = {
    "one": 1, "first": 1,
    "two": 2, "second": 2,
    "three": 3, "third": 3,
    "four": 4, "fourth": 4,
    "five": 5, "fifth": 5,
    "six": 6, "sixth": 6,
    "seven": 7, "seventh": 7,
    "eight": 8, "eighth": 8,
    "nine": 9, "ninth": 9,
    "ten": 10, "tenth": 10,
}

# Voice listener function (silent on failure)
def listen_command():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.8)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except (sr.UnknownValueError, sr.RequestError):
        # Speech was unintelligible or the recognition service was unreachable
        return ""  # No output if not recognized


# Navigation functions

# Go to a specific slide number
def go_to_slide(slide_num):
    pyautogui.typewrite(str(slide_num))
    pyautogui.press("enter")

# Search for a keyword and jump to the first slide that contains it
def search_slide(keyword):
    for num, content in slide_texts:
        if keyword in content:
            go_to_slide(num)
            pyautogui.hotkey('ctrl', 'f')
            time.sleep(0.3)
            pyautogui.typewrite(keyword)
            return

# Handle voice commands
def handle_command(command):
    if any(w in command for w in ["next", "forward", "advance"]):
        pyautogui.press("right")
    elif any(w in command for w in ["previous", "reverse", "back"]):
        pyautogui.press("left")
    elif "start" in command:
        pyautogui.press("f5")
    elif "end" in command or "stop" in command:
        pyautogui.press("esc")
    elif "last" in command:
        go_to_slide(len(slide_texts))
    elif "first" in command or "front" in command:
        go_to_slide(1)
    elif "search" in command:
        keyword = command.split("search", 1)[1].strip()
        if keyword:
            search_slide(keyword)
    else:
        # Try matching an explicit "slide <number>" phrase
        match = re.search(r"slide (\d+)", command)
        if match:
            go_to_slide(int(match.group(1)))
            return
        # Match number words as whole words, so that "ten" does not fire on "listen"
        spoken_words = command.split()
        for word, num in word_to_num.items():
            if word in spoken_words:
                go_to_slide(num)
                return
        # Fallback: any number in the command
        numbers = re.findall(r'\b\d+\b', command)
        if numbers:
            go_to_slide(int(numbers[0]))

# Start the system
print("🚀 Voice PPT Navigator is running silently...")

# Open the PowerPoint file and start the presentation
os.startfile(ppt_path)  # os.startfile is available on Windows only
time.sleep(3)  # Wait for the file to load
pyautogui.press("f5")

# Continuous voice listening loop
while True:
    cmd = listen_command()
    if cmd:
        handle_command(cmd)
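
One possible refinement, which is not part of the listing above, is to separate command interpretation from the key presses so that the interpretation can be checked without a microphone or an open presentation. The sketch below is an assumption-laden illustration: `parse_command` and the action tuples it returns are hypothetical, it matches whole words rather than substrings, and it checks "search" before the other keywords so that a search phrase containing a word such as "first" is still treated as a search.

```python
import re

# Assumed vocabulary, taken from the cardinal entries of the word_to_num table.
WORD_TO_NUM = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
               "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}


def parse_command(command, total_slides):
    """Map a transcript to an (action, argument) tuple without side effects."""
    words = re.findall(r"[a-z0-9]+", command.lower())
    if "search" in words:
        keyword = " ".join(words[words.index("search") + 1:])
        return ("search", keyword) if keyword else ("ignore", None)
    if any(w in words for w in ("next", "forward", "advance")):
        return ("press", "right")
    if any(w in words for w in ("previous", "reverse", "back")):
        return ("press", "left")
    if "start" in words:
        return ("press", "f5")
    if "end" in words or "stop" in words:
        return ("press", "esc")
    if "last" in words:
        return ("goto", total_slides)
    if "first" in words or "front" in words:
        return ("goto", 1)
    # Fall back to the first digit string or number word in the transcript.
    for w in words:
        if w.isdigit():
            return ("goto", int(w))
        if w in WORD_TO_NUM:
            return ("goto", WORD_TO_NUM[w])
    return ("ignore", None)
```

A caller would then translate the returned tuple into the corresponding `pyautogui` call, keeping all side effects in one place.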


7. TESTING

Software testing is a critical element of software quality assurance and represents the
ultimate review of specification, design and code generation.

7.1 TESTING OBJECTIVES:

• To ensure that during operation the system will perform as per specification.
• To make sure that the system meets the user requirements during operation.
• To make sure that incorrect input, processing and output will be detected during
operation.
• To see that when correct inputs are fed to the system, the outputs are correct.
• To verify that the controls incorporated in the system work as intended.
• Testing is a process of executing a program with the intent of finding an error.
• A good test case is one that has a high probability of finding an as yet undiscovered
error.

The software has been tested using the strategies described below. Any errors
encountered were corrected, and the affected part of the program, procedure or function
was tested again until no further errors were found. A successful test is one that
uncovers an as yet undiscovered error.

Testing cannot prove that a system is free of errors, but successful system testing shows
that the system works as specified for the cases exercised. This gives confidence to the
system designer and to the users of the system, and prevents frustration during
implementation.

7.2 TESTING METHODOLOGIES:

• White box testing.
• Black box testing.
• Unit testing.
• Integration testing.
• User acceptance testing.
• Output testing.
• Validation testing.
• System testing.

1) White Box Testing:

White box testing is a test case design method that uses the control structure of the
procedural design to derive test cases. All independent paths in a module are exercised at
least once, all logical decisions are exercised on both their true and false sides, all loops
are executed at their boundaries and within their operational bounds, and internal data
structures are exercised to ensure their validity. Here the customer is given three chances to
enter a valid choice from the given menu, after which control exits the current menu.
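
As a small illustration of path and loop-boundary coverage, consider a keyword lookup over slide text. This is a sketch only: the function `first_slide_containing` is hypothetical and merely resembles the search step of the implementation. Each assertion below targets one independent path or one loop boundary (zero, one, or several iterations).

```python
def first_slide_containing(keyword, slide_texts):
    """Return the 1-based number of the first slide containing keyword, or None."""
    if not keyword:                                        # path 1: empty keyword
        return None
    for number, text in enumerate(slide_texts, start=1):   # loop: 0, 1 or many passes
        if keyword in text:                                # path 2: keyword found
            return number
    return None                                            # path 3: loop ends, no hit


# One check per independent path and loop boundary.
assert first_slide_containing("", ["intro"]) is None                  # path 1
assert first_slide_containing("x", []) is None                        # zero iterations
assert first_slide_containing("intro", ["intro"]) == 1                # one iteration
assert first_slide_containing("scope", ["intro", "design", "future scope"]) == 3
assert first_slide_containing("zzz", ["intro", "design"]) is None     # path 3
```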

2) Black Box Testing:

Black box testing attempts to find errors in the following categories: incorrect or missing
functions, interface errors, errors in data structures, performance errors, and initialization
and termination errors. Here all input data must match the expected data type to be accepted
as a valid entry.
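
A black box test looks only at inputs and expected outputs. The sketch below assumes a hypothetical converter, `to_slide_number`, that accepts a value only when it is a whole number within the range of the presentation; the test tables list valid and invalid entries without reference to how the function works internally.

```python
def to_slide_number(value, total_slides):
    """Convert an entry to a valid slide number or raise ValueError."""
    number = int(str(value).strip())   # raises ValueError for non-integer text
    if not 1 <= number <= total_slides:
        raise ValueError(f"slide {number} is outside 1..{total_slides}")
    return number


# Input/expected-output tables (illustrative values only).
VALID_CASES = [("3", 3), (" 10 ", 10), (1, 1)]
INVALID_CASES = ["", "three", "0", "11", "2.5", None]


def run_black_box(total_slides=10):
    """Return True if every valid case converts and every invalid case is rejected."""
    for raw, expected in VALID_CASES:
        if to_slide_number(raw, total_slides) != expected:
            return False
    for raw in INVALID_CASES:
        try:
            to_slide_number(raw, total_slides)
        except ValueError:
            continue
        return False   # an invalid entry was accepted
    return True
```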

3) Unit Testing:

Unit testing focuses verification effort on the smallest unit of software design, the
module. Unit testing exercises specific paths in a module's control structure to ensure
complete coverage and maximum error detection. This test focuses on each module
individually, ensuring that it functions properly as a unit; hence the name unit testing.
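
For a concrete picture of a unit test, the sketch below uses Python's standard `unittest` module on a single small function. The function `word_to_number` is a hypothetical wrapper around a table like the word-to-number mapping in the implementation; it is not taken from the project code.

```python
import unittest

# Assumed subset of the word-to-number table, for illustration.
WORD_TO_NUM = {"one": 1, "first": 1, "two": 2, "second": 2,
               "three": 3, "third": 3, "ten": 10, "tenth": 10}


def word_to_number(word):
    """Return the slide number for a cardinal or ordinal word, or None."""
    return WORD_TO_NUM.get(word.strip().lower())


class WordToNumberTest(unittest.TestCase):
    def test_cardinal(self):
        self.assertEqual(word_to_number("three"), 3)

    def test_ordinal(self):
        self.assertEqual(word_to_number("second"), 2)

    def test_case_and_whitespace(self):
        self.assertEqual(word_to_number(" Ten "), 10)

    def test_unknown_word(self):
        self.assertIsNone(word_to_number("eleven"))


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordToNumberTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes the four tests and reports any failure for this unit alone, independent of the speech and automation modules.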

4) Integration Testing:

Integration testing addresses the issues associated with the dual problems of verification
and program construction. After the software has been integrated, a set of higher-order tests
is conducted. The main objective of this testing process is to take unit-tested modules and
build a program structure that has been dictated by design.

The following are the types of Integration Testing:

• Top Down Integration:

This method is an incremental approach to the construction of program structure.
Modules are integrated by moving downward through the control hierarchy, beginning with
the main program module.

• Bottom Up Integration:

This method begins the construction and testing with the modules at the lowest level in
the program structure. Since the modules are integrated from the bottom up, processing
required for modules subordinate to a given level is always available and the need for stubs is
eliminated.
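
The role of a stub in top-down integration can be sketched as follows. The example is an assumption for illustration: `KeyboardStub` stands in for the lower-level key-press module, and this simplified `handle_command` takes the keyboard as a parameter, which the project listing does not do.

```python
class KeyboardStub:
    """Stand-in for the key-press layer; records keys instead of sending them."""

    def __init__(self):
        self.pressed = []

    def press(self, key):
        self.pressed.append(key)


def handle_command(command, keyboard):
    """Top-level module under test; the lower-level keyboard is injected."""
    words = command.lower().split()
    if "next" in words:
        keyboard.press("right")
    elif "previous" in words or "back" in words:
        keyboard.press("left")
    elif "start" in words:
        keyboard.press("f5")


def run_session(commands, keyboard):
    """Feed a sequence of commands through the handler and return the keys pressed."""
    for command in commands:
        handle_command(command, keyboard)
    return keyboard.pressed
```

With the stub in place, the top-level handler can be integrated and tested first; the real key-press module replaces the stub once it has been unit tested.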

5) User Acceptance Testing:

User Acceptance of a system is the key factor for the success of any system. The system
under consideration is tested for user acceptance by constantly keeping in touch with the
prospective system users at the time of developing and making changes wherever required. The
system developed provides a friendly user interface that can easily be understood even by a
person who is new to the system.

6) Output Testing:

After performing the validation testing, the next step is output testing of the proposed
system, since no system could be useful if it does not produce the required output in the
specified format. The outputs generated or displayed by the system under consideration are
tested by asking the users about the format they require. The output format is considered in
two ways: on screen and in printed form.

7) Validation Testing:

Validation testing is generally performed on the following fields:

• Text Field:

The text field can contain only a number of characters less than or equal to its size.
The text fields are alphanumeric in some tables and alphabetic in other tables. An incorrect
entry always flashes an error message.

• Numeric Field:

The numeric field can contain only the digits 0 to 9. An entry of any other character
flashes an error message. Each module is checked for accuracy and for whether it performs its
intended function.

• Preparation of Test Data:

The above testing is carried out with various kinds of test data. Preparation of test data
plays a vital role in system testing. After the test data has been prepared, the system under
study is tested using it. Errors uncovered while testing with this data are corrected using the
testing steps above, and the corrections are noted for future use.

• Using Live Test Data:

Live test data are those that are actually extracted from organization files. After a system
is partially constructed, programmers or analysts often ask users to key in a set of data from
their normal activities. The systems person then uses this data to partially test the system.
In other instances, programmers or analysts extract a set of live data from the files and
enter it themselves.

• Using Artificial Test Data:

Artificial test data are created solely for test purposes, since they can be generated to
test all combinations of formats and values. In other words, artificial data, which can
quickly be prepared by a data-generating utility program in the information systems department,
make it possible to test all logic and control paths through the program.

The most effective test programs use artificial test data generated by persons other than
those who wrote the programs. Often, an independent team of testers formulates a testing plan
using the system specifications.
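
Artificial test data that covers all combinations of formats and values can be generated mechanically. The sketch below is illustrative only: the word lists are hypothetical samples, and `itertools.product` from the Python standard library is used to enumerate every combination of verb, noun and number format.

```python
from itertools import product

# Hypothetical sample vocabulary for generated commands.
VERBS = ["go to", "open", "show"]
NOUNS = ["slide", "page"]
NUMBERS = ["1", "five", "tenth"]   # digit, cardinal word, ordinal word


def generate_commands():
    """Return every verb/noun/number combination as a command string."""
    return [f"{verb} {noun} {number}"
            for verb, noun, number in product(VERBS, NOUNS, NUMBERS)]
```

Three verbs, two nouns and three number formats yield 18 distinct commands, which can then be fed to the command handler in place of live speech.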

7.3 USER TRAINING:

Whenever a new system is developed, its users must be trained in the working of the
system so that it can be put to efficient use by those for whom it has been primarily
designed. For this purpose the normal working of the project was demonstrated to the
prospective users. Its working is easily understandable, and since the expected users are
people who have good knowledge of computers, the system is very easy to use.

7.4 MAINTENANCE:

Maintenance covers a wide range of activities, including correcting code and design errors.
To reduce the need for maintenance in the long run, the user's requirements were defined as
accurately as possible during system development. The system has been developed to satisfy
those requirements to the largest possible extent. As technology develops, it may be possible
to add many more features based on future requirements. The code and design are simple and
easy to understand, which will make maintenance easier.

7.5 TESTING STRATEGY:

A strategy for system testing integrates system test cases and design techniques into a
well-planned series of steps that results in the successful construction of software. The
testing strategy must incorporate test planning, test case design, test execution, and the
resultant data collection and evaluation. A strategy for software testing must accommodate
low-level tests that are necessary to verify that a small source code segment has been
correctly implemented as well as high-level tests that validate major system functions against
user requirements.


7.5.1 SYSTEM TESTING:

Software, once validated, must be combined with other system elements (e.g. hardware,
people, database). System testing verifies that all the elements work together properly and
that overall system function and performance are achieved. It also looks for discrepancies
between the system and its original objective, current specifications and system documentation.

7.5.2 UNIT TESTING:

In unit testing, the different modules are tested against the specifications produced
during the design of the modules. Unit testing is essential for verifying the code produced
during the coding phase, and hence its goal is to test the internal logic of the modules. Using
the detailed design description as a guide, important control paths are tested to uncover
errors within the boundary of the modules. This testing is carried out during the programming
stage itself. In this testing step, each module was found to be working satisfactorily with
regard to its expected output. In due course, the latest technology advancements will be taken
into consideration. As part of the technical build-up, many components of the system will be
generic in nature so that future projects can either use or interact with them.


7.6 TEST CASES:

7.6.1 Test case: project execution

7.6.1.1 Taking user’s instructions


7. OUTPUT SCREENS
7.1 Output


8. CONCLUSION & FUTURE SCOPE

8.1 CONCLUSION:

In conclusion, the development and implementation of the voice-controlled PPT and PDF
navigator represent a significant advancement in harnessing speech recognition and automation
technologies for seamless presentation control. By integrating intelligent voice command
processing and real-time content mapping, the system delivers accurate and efficient navigation
through slides and documents. The robust command interpretation and output feedback ensure
smooth interaction between the user and the interface, enhancing accessibility and convenience.
With support for natural language, flexible phrasing, and keyword-based search, the system
empowers users to control presentations effortlessly, even in noisy environments, improving
productivity and user experience.

8.2 FUTURE SCOPE:

The future scope for the voice-controlled PPT and PDF navigator presents numerous opportunities
for enhancement and innovation. Firstly, continuous improvements can be directed toward
refining voice recognition accuracy, especially in diverse acoustic environments and multilingual
settings. Additionally, integrating advanced natural language processing (NLP) models could
enable deeper understanding of user intent, allowing for more conversational and context-aware
interactions. The system can also evolve to support real-time collaboration, remote presentation
control, and integration with popular platforms like Google Slides or Zoom. Expanding support to
other document formats and incorporating AI-based content summarization and smart
recommendations can further enrich user experience. Collaborative development across
educational, corporate, and accessibility sectors can help in scaling this solution to broader use
cases and audiences.


9. BIBLIOGRAPHY

9.1 WEBSITES:

[1]. Voice Recognition using Python – GeeksforGeeks. Available: https://www.geeksforgeeks.org/voice-recognition-using-google-speech-api-in-python/

[2]. SpeechRecognition Library – PyPI. Available: https://pypi.org/project/SpeechRecognition/

[3]. PyAutoGUI Documentation. Available: https://pyautogui.readthedocs.io/

[4]. python-pptx Documentation. Available: https://python-pptx.readthedocs.io/

[5]. PyMuPDF (fitz) Documentation. Available: https://pymupdf.readthedocs.io/

[6]. OpenAI Whisper GitHub. Available: https://github.com/openai/whisper

[7]. Google Speech-to-Text API. Available: https://cloud.google.com/speech-to-text

[8]. www.python.org

[9]. https://realpython.com/working-with-pdf-files-in-python/

[10]. https://github.com/asweigart/pyautogui

[11]. Python Software Foundation. (2022). Python Documentation. [Online]. Available: https://docs.python.org/

[12]. SpeechRecognition GitHub. (2022). [Online]. Available: https://github.com/Uberi/speech_recognition

[13]. NumPy Documentation. (2022). [Online]. Available: https://numpy.org/doc/

[14]. PyPDF2 Documentation. (2022). [Online]. Available: https://pypdf2.readthedocs.io/

[15]. Pyttsx3 Text-to-Speech Library. (2022). [Online]. Available: https://pyttsx3.readthedocs.io/

[16]. Threading in Python – Official Docs. (2022). [Online]. Available: https://docs.python.org/3/library/threading.html

[17]. Platform Independent GUI Automation – PyAutoGUI Guide. (2022). [Online]. Available: https://automatetheboringstuff.com/2e/chapter20/


9.2 REFERENCES:

[1] Uberi. (2015). SpeechRecognition: Speech recognition module for Python. GitHub Repository. Retrieved from https://github.com/Uberi/speech_recognition

[2] Sweigart, A. (2015). Automate the Boring Stuff with Python: Practical Programming for Total Beginners. No Starch Press.

[3] Goldbaum, J. (2020). Real-Time Voice Command Recognition Using Python. Journal of Emerging Technologies in Computing Systems, 16(4), 87-94.

[4] Rao, P., & Kulkarni, S. (2021). Voice-assisted automation in smart environments using NLP. Procedia Computer Science, 189, 303–310.

[5] Wei, J., & Sun, Z. (2022). Smart document navigation through speech interaction: Design and evaluation. ACM Transactions on Interactive Intelligent Systems (TiiS), 12(1), 1–24.

[6] Ramesh, T., & Nair, R. (2022). A study on integrating voice-based systems with PDF and PowerPoint automation using Python. International Journal of Computer Applications, 184(3), 12–18.
