
Vikram – AI Voice Desktop Assistant

Abstract

An AI desktop assistant that talks like a human is a system that performs tasks and provides services for an individual in response to that individual's dictated commands. It does so through a synchronous process of recognizing speech and responding with synthesized speech. The most famous example is Apple Inc.'s "SIRI", which lets end users control mobile devices by voice, delegating requests to the appropriate services. Google has developed a similar application, "Google Voice Search", which is used on Android phones; however, these applications depend largely on Internet services. A desktop assistant with voice-recognition intelligence takes user input in the form of speech, processes it, and returns output in various forms, such as answering questions, making recommendations, and performing actions, as the end user requires. In the modern world, Artificial Intelligence (AI) has rapidly evolved to become an integral part of our daily lives, and AI-based personal assistants have gained popularity because of their ability to perform tasks efficiently and simplify human-computer interaction. The primary goal is to bridge the communication gap between humans and machines, creating a more engaging and satisfying user experience. The assistant leverages the latest advancements in Natural Language Processing (NLP) and AI technologies.

Keywords: SIRI, Google Voice Search, Web browser, Internet, Speech recognition.


List of Figures

Fig. 3.1 Methodology

Fig 4.2 Processing in Existing System

Fig 4.3 Proposed System Architecture

Fig 5.1 Use Case Diagram

Fig 5.2 Activity Diagram

Fig 5.3.1 Data Flow Diagram Level 0

Fig 5.3.2 Data Flow Diagram Level 1

List of Tables

Table 2.3 Summary of Literature Survey

Table 3.2.1 Timeline with Milestones

Table 3.2.2 Gantt Chart


Chapter 1

INTRODUCTION

In today's era almost all tasks are digitalized. With a smartphone in hand, the world is at your fingertips, and these days we do not even need our fingers: we simply speak the task and it is done. An AI voice assistant, also known as a virtual or digital assistant, is software that uses voice recognition technology, natural language processing, and Artificial Intelligence (AI) to respond to people. Virtual assistants understand natural-language voice commands and perform tasks for users. The assistant can also perform other activities such as reading news and weather updates, opening Google or YouTube, telling the time, playing music, and opening and closing applications and websites. This system is designed to be used efficiently on desktops. Personal assistant software improves user productivity by managing the user's routine tasks and by providing information from online sources. Vikram is effortless to use: call the wake-up word 'Vikram' followed by a command, and within seconds it is executed. This project was started on the premise that there is a sufficient amount of openly available data and information on the web that can be utilized to build a virtual assistant capable of making intelligent decisions for routine user activities.


1.1 Problem Statement

Despite the availability of multiple virtual assistants, their usage remains limited, particularly due to
issues in voice recognition. Many struggle to understand English spoken with non-native accents,
such as the Indian accent. While these assistants are optimized for mobile devices, desktop integration
is lacking. Therefore, there is a need for a desktop-based virtual assistant that accurately
understands English in an Indian accent. Furthermore, these assistants often fail to answer questions
correctly due to lack of context or intent recognition, requiring continuous optimization and large
amounts of data for efficient performance.

1.2 Fundamentals

The development of an AI desktop assistant relies on several key technologies and concepts that make
the interaction between humans and machines more intuitive. These are:
 Natural Language Processing (NLP): NLP is the field of AI that enables machines to
understand, interpret, and respond to human language. This allows the assistant to
understand voice commands or typed instructions and provide meaningful responses.
 Speech Recognition: This technology converts spoken words into text that the AI system
can understand. Libraries like Google’s Speech Recognition or Microsoft’s Speech SDK are
often used to build this feature.
 Machine Learning (ML): ML models are used to improve the assistant's ability to
understand user input over time. As more interactions occur, the system learns and improves
the accuracy of its responses.
 Task Automation: The assistant is designed to execute specific tasks like opening files,
controlling system settings, or sending emails, enhancing productivity and convenience.
 Voice Synthesis (Text-to-Speech): To provide responses audibly, text-to-speech
technologies are used, allowing the assistant to "speak" back to the user with the help of
voice generation tools.
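
Taken together, these components form a simple listen-understand-respond loop. The short sketch below illustrates that loop in Python using the SpeechRecognition and pyttsx3 packages; it is only a minimal illustration, and details such as the use of Google's free web recognizer and the "en-IN" language code are assumptions made for the example rather than fixed design decisions of this project.

import speech_recognition as sr   # speech-to-text
import pyttsx3                    # text-to-speech

recognizer = sr.Recognizer()
engine = pyttsx3.init()           # initialise the text-to-speech engine

def listen() -> str:
    """Capture one utterance from the default microphone and return it as lower-case text."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)   # compensate for background noise
        audio = recognizer.listen(source)
    try:
        # Google's free web recognizer; "en-IN" is an assumed choice for Indian English.
        return recognizer.recognize_google(audio, language="en-IN").lower()
    except sr.UnknownValueError:
        return ""                                      # speech was not understood

def speak(text: str) -> None:
    """Read the given text aloud."""
    engine.say(text)
    engine.runAndWait()

if __name__ == "__main__":
    command = listen()
    speak(f"You said: {command}" if command else "Sorry, I did not catch that.")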

1.3 Objectives

The primary objectives of this AI desktop assistant project are:


 Ease of Use: To create an assistant that is easy to use and interact with, using both text and
voice commands for seamless communication.
 Automation: To automate frequent tasks like searching the web, opening programs, and file
management, thereby reducing user workload.
 Efficiency: To improve productivity by offering quick solutions and command execution.
 Scalability: To design the system in a way that can be easily expanded in the future with
more features, such as integration with third-party apps or multi-language support.
 Customization: To allow users to personalize the assistant based on their preferences,
including voice type, commands, and task priorities.

1.4 Organization of the Project Report

The report is organized as follows:


 Chapter 1: Introduction – Provides an overview of the project, the problem statement, the fundamental concepts and technologies involved (AI, NLP, and speech recognition), the objectives, and the scope of the work.
 Chapter 2: Literature Survey – Reviews existing work on AI voice assistants and summarizes the advantages and limitations of each approach.
 Chapter 3: Project Overview – Describes the development model, deliverables, constraints, timeline, and methodology of the project.
 Chapter 4: Design of Algorithm – Discusses the algorithm, the existing and proposed system architectures, and the packages used.
 Chapter 5: Project Design & Process Workflow – Presents the use case, activity, and data flow diagrams together with the hardware and software specifications.
 Chapter 6: Conclusion and Future Scope – Summarizes the achievements of the project and potential areas for future enhancement.

1.5 Scope of the Project

The scope of this AI desktop assistant project encompasses the following areas:
 Core Functionalities: The assistant is built to perform tasks like speech-to-text conversion,
text-to-speech responses, web searches, file handling, and basic automation of system tasks
like opening applications.
 User Interaction: The project focuses on creating a smooth and intuitive interaction
experience for users through voice and text commands.
 Language Processing: The assistant will use NLP to understand user commands accurately
and respond accordingly.
 Task Management: It can manage routine tasks such as setting reminders, organizing to-do
lists, and controlling media or system functions.
 Limitations: The project, in its current state, will support only English and will have limited
integration with external applications. Future versions could include support for more
languages, APIs, and advanced features such as learning from user behaviour.


Chapter 2
Literature Survey

2.1 Introduction:

This chapter provides a review of the literature on AI desktop assistant systems, particularly
focusing on methods implemented using Python. It covers a variety of approaches, including speech
recognition systems, neural networks, and hybrid models. Each technique is evaluated based on its
advantages and limitations, offering insights into the current challenges and areas for improvement.
By summarizing the state of the art in AI assistants, this review aims to identify existing gaps and
opportunities for further research and development in creating more efficient, responsive, and
adaptable desktop assistants.
The techniques discussed in this review are selected based on their relevance to the development of
intelligent desktop assistants, with a focus on enhancing performance, improving user interaction, and
ensuring scalability. The chapter concludes with a summary of the findings, highlighting potential
future directions for research in this domain.

2.2 Literature review:

1. "AI Based Voice Assistant Using Python" by Deepak Shende, Ria Umahiya, Monika Raghorte – This paper discusses the design and implementation of a digital assistant. The project is built using open-source software modules with PyCharm Community backing, so it can accommodate updates in the near future.

2. "JARVIS" – AI Voice Assistant by Rajat Sharma, Adweteeya Dwivedi – An AI voice assistant system that uses speech recognition, gTTS, and other AI techniques along with neural networks and natural language processing to respond intelligently to the given circumstances or conditions.

3. "Voice Assistant Using Python and AI" by Divisha Pandey, Afra Ali, Shweta Dubey, Muskan Srivastava – The paper describes a new, emerging service for desktop users based on the Internet of Things, speech recognition, and other modern technologies such as artificial intelligence, natural language processing, and deep learning.

4. "Virtual Assistant Using Python" by Vedant Kulkarni, Department of Computer Engineering, MAEER's MIT Polytechnic, Pune – The new framework overcomes most of the limitations of the existing framework and works according to the given design specification. The resulting virtual assistant works more effectively and successfully takes voice input.

2.3 Summary of Literature Survey

Sr. 1 – Deepak Shende et al. [2]
Advantages: Uses open-source software, allowing for continuous updates and improvements; the modular design allows easy integration of new features.
Limitations: Struggles with voice recognition for accents outside the training dataset; limited customization for specific user needs, such as handling regional accents or complex commands.

Sr. 2 – Rajat Sharma et al. [3]
Advantages: Combines neural networks and NLP to improve context recognition; highly responsive to user inputs in various conditions.
Limitations: Requires large datasets to improve accuracy and responsiveness; struggles with personalization, especially in multi-user environments.

Sr. 3 – Divisha Pandey et al. [4]
Advantages: Efficient for desktop use and supports real-time voice interaction; uses modern technologies like deep learning and NLP to enhance performance.
Limitations: Limited support for voice recognition in regional accents; performance degrades when handling complex, multi-step commands.

Sr. 4 – Vedant Kulkarni [5]
Advantages: Provides better performance in recognizing accents, including Indian accents; can automate a wide range of desktop tasks, making it versatile.
Limitations: Requires high computational power, especially for deep learning tasks; the system still faces challenges with handling multi-user environments.

Table 2.3 Summary of Literature Survey



Chapter 3

Project Overview

The AI Desktop Assistant project is designed to function as a comprehensive voice-controlled


desktop assistant, enabling users to manage their entire desktop environment through natural
language commands. This assistant allows users to perform a variety of tasks, such as launching
applications, organizing files, setting reminders, and conducting web searches, all through voice
interactions. Additionally, it features the capability to generate and manipulate Excel sheets using
voice commands, simplifying data entry and management. By leveraging advanced AI and natural
language processing (NLP), the assistant enhances productivity and streamlines workflows, making
it an indispensable tool for everyday desktop tasks.

3.1 Project Development Model for AI Desktop Assistant "Vikram"

A. Why was the project initiated?


The project was initiated to create a voice-activated virtual assistant tailored to recognize English in
Indian accents and execute tasks on desktop systems. Existing virtual assistants are often optimized
for mobile devices and struggle to accurately interpret non-native English accents. "Vikram" aims
to bridge this gap by offering a desktop-based solution with improved accent recognition.

B. Project Deliverables
What’s being delivered (In Scope):
 A desktop-based AI voice assistant, "Vikram," capable of recognizing English spoken
in Indian accents.
 Core functionalities include:
o Voice command recognition and processing.
o Task execution like opening applications, playing music, browsing websites,
and checking weather/time.
o Integration with natural language processing (NLP) for improved
conversational abilities.
o Personalization features allowing users to customize commands and
assistant responses.
o Basic conversational AI using OpenAI or similar NLP APIs to handle user queries.
What’s not being delivered (Out of Scope):
 Advanced AI features such as deep learning models for emotion recognition or
continuous learning.
 Mobile integration or functionality outside of desktop systems.
 Support for multiple languages beyond English.
 Hardware development (e.g., dedicated smart devices).
 Complex, multi-user conversations or enterprise-level system integration.

Assumptions to Clarify the Deliverables:


 The assistant will be used primarily by individual users in a desktop environment.
 Users will have access to microphones and necessary hardware for speech input.

 Basic internet access is assumed for accessing APIs like OpenAI.


 Users will mostly be familiar with English and basic command structures.

Clarifications Needed:
 The specific desktop platforms supported (Windows, Linux, Mac).
 The extent of customization possible by the end-user (e.g., can users create
custom commands?).
 Any limitations on the number or type of tasks "Vikram" can handle simultaneously.
This model provides a clear understanding of the project’s scope, deliverables, and
assumptions, helping align project goals with user needs.
C. Project Constraints
1. Technical Constraints:
o Limited Indian accent training data may impact speech recognition accuracy.
o Hardware dependency: Microphone quality varies across desktop systems.
o Reliance on external APIs (OpenAI, speech recognition) can affect response
time, availability, or cost.
o Desktop-specific design limits adaptability to mobile platforms.
2. Resource Constraints:
o API usage costs may limit frequent or advanced queries.
o Team expertise in AI, voice recognition, and NLP may limit feature complexity.
3. Time Constraints:
o Strict development deadlines restrict time for extensive feature refinement.
o API/library updates may cause delays due to compatibility issues.
4. User Constraints:
o Software must run on a range of desktop specs, limiting features for low-end systems.
o Assumes basic user knowledge of voice assistants.
5. Regulatory & Ethical Constraints:
o Privacy concerns necessitate strict data protection and user consent protocols.
o Data storage must comply with local privacy laws.
3.2 Timeline with Milestones and Gantt Chart

 Timeline with milestones

Week | Start Date | End Date | Tasks & Milestones
1 | 2-Aug-2024 | 9-Aug-2024 | Project Initiation: Define objectives, identify stakeholders.
2 | 10-Aug-2024 | 16-Aug-2024 | Requirements Gathering: Create user stories and technical specifications.
3 | 17-Aug-2024 | 22-Aug-2024 | Sprint Planning: Set sprint goals and prioritize tasks.
4 | 23-Aug-2024 | 29-Aug-2024 | Development: Begin implementation of the speech recognition system.
5 | 30-Aug-2024 | 6-Sep-2024 | Development: Continue developing text-to-speech functionality.
6 | 7-Sep-2024 | 13-Sep-2024 | Development: Implement task execution modules (e.g., opening apps).
7 | 14-Sep-2024 | 18-Sep-2024 | Development: Integrate OpenAI API for advanced responses.
8 | 19-Sep-2024 | 22-Sep-2024 | Development: Implement conversational AI for natural interactions.
9 | 23-Sep-2024 | 27-Sep-2024 | Testing and Feedback: Conduct user testing and gather feedback.
10 | 28-Sep-2024 | 5-Oct-2024 | Refinement and Optimization: Analyze feedback and improve functionalities.
11 | 5-Oct-2024 | 12-Oct-2024 | Deployment: Release beta version for a broader audience.
12 | 12-Oct-2024 | 23-Oct-2024 | Iteration and Enhancement: Plan for future updates based on user data.

Table 3.2.1 Timeline with milestones



Gantt Chart
 Gantt Chart of AI Voice Desktop Assistant

Table 3.2.2 Gantt Chart



3.3 Methodology
1. Environment Setup and Library Installation
 Create a Virtual Environment: Isolate project dependencies using virtualenv or conda.
 Install Required Libraries: Use pip to install essential libraries:
o SpeechRecognition: For speech-to-text conversion.
o pyttsx3 or gTTS: For text-to-speech conversion.
o OpenAI: For accessing OpenAI's API for advanced language models.
o wikipedia: For accessing Wikipedia's knowledge base.
o pyaudio: For audio input/output.
o Other libraries as needed for specific functionalities.
2. Speech Recognition and Text-to-Speech
 Speech-to-Text:
o Use SpeechRecognition to capture audio input from a microphone.
o Process the audio to recognize spoken words and convert them into text.
 Text-to-Speech:
o Employ pyttsx3 or gTTS to synthesize text into natural-sounding speech.
o Customize voice, speed, and pitch as needed.
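
As a concrete illustration of the customization mentioned above, the sketch below shows how pyttsx3 properties might be adjusted. pyttsx3 exposes the speaking rate and volume directly, while pitch control depends on the underlying engine; the chosen voice index and values are examples, not the project's final settings.

import pyttsx3

engine = pyttsx3.init()

# Inspect the voices installed on the system and pick one.
# Index 0 is an assumption; on Windows SAPI5 the second voice is often a female voice.
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)

# Speaking rate (roughly words per minute) and volume in the range 0.0 to 1.0.
engine.setProperty("rate", 160)
engine.setProperty("volume", 0.9)

engine.say("Hello, I am Vikram, your desktop assistant.")
engine.runAndWait()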

Fig 3.1 Methodology



3. Core Functionality and Task Execution


 Intent Recognition:
o Use natural language processing techniques or rule-based approaches to identify
user intent from the recognized text.
 Task Execution:
o Develop modules for various tasks:
 Basic tasks: Time/date, weather, playing music, opening websites.
 Complex tasks: Searching the web, providing summaries,
translating languages.
 Error Handling and Fallback:
o Implement error handling mechanisms to gracefully handle unexpected inputs
or errors.
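
A rule-based intent recognizer of the kind described above can be as simple as keyword matching on the transcribed text, with a fallback reply when nothing matches. The sketch below shows one possible shape for it; the command phrases and handler functions are hypothetical placeholders, not the project's actual module names.

import datetime
import webbrowser

def tell_time() -> str:
    return datetime.datetime.now().strftime("The time is %I:%M %p")

def open_website(url: str) -> str:
    webbrowser.open(url)          # open the URL in the default browser
    return f"Opening {url}"

def handle_command(text: str) -> str:
    """Map recognized text to an action using simple keyword rules."""
    text = text.lower()
    if "time" in text:
        return tell_time()
    if "open youtube" in text:
        return open_website("https://www.youtube.com")
    if "open google" in text:
        return open_website("https://www.google.com")
    # Fallback for unexpected input (the error handling mentioned in step 3).
    return "Sorry, I don't know how to do that yet."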
4. Advanced Functionality with OpenAI
 API Integration:
o Set up an OpenAI API key and integrate it into the application.
o Use OpenAI's language models (e.g., GPT-3) to generate more sophisticated
and contextually relevant responses.
 Advanced Features:
o Contextual Understanding: Leverage OpenAI's models to understand the context
of user queries and provide more accurate responses.
o Creative Text Generation: Generate stories, poems, or scripts based on user prompts.
o Code Generation: Assist with coding tasks by generating code snippets or
explaining complex code.
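
The OpenAI integration described in this step can be sketched roughly as follows. The exact call depends on the installed version of the openai package and the model selected; the snippet assumes the current client interface with the gpt-3.5-turbo model purely as an example, and expects the API key in the OPENAI_API_KEY environment variable.

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])   # key supplied via environment variable

def ask_ai(question: str) -> str:
    """Send a single user query to the language model and return its reply."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",   # assumed model choice
        messages=[
            {"role": "system", "content": "You are Vikram, a helpful desktop assistant."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content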
5. Conversational AI and User Experience
 Dialog Management:
o Design a dialogue system to manage the flow of conversation, track the
conversation history, and maintain context.
 Natural Language Processing:
o Use techniques like intent recognition, entity extraction, and sentiment analysis
to understand user queries.
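
In practice, the dialogue management described here largely amounts to remembering the conversation so each new query is interpreted in context. A minimal approach is to keep a growing list of role-tagged messages and trim it so the context stays bounded; the sketch below shows the idea in isolation, and the names history, chat, and reply_fn are illustrative only.

history = [
    {"role": "system", "content": "You are Vikram, a helpful desktop assistant."}
]

def chat(user_text: str, reply_fn) -> str:
    """Append the user's turn, obtain a reply via reply_fn(history), and remember it."""
    history.append({"role": "user", "content": user_text})
    reply = reply_fn(history)          # e.g. a wrapper around the OpenAI call sketched earlier
    history.append({"role": "assistant", "content": reply})
    # Keep the system prompt plus only the most recent ten messages.
    del history[1:-10]
    return reply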

Chapter 4

Design of Algorithm

4.1 Algorithm Design for AI Desktop Assistant


1. Initialization
 Load Required Libraries:
o Import libraries such as SpeechRecognition, PyAudio, gTTS (Google Text-to-Speech), and Wikipedia.
 Initialize Systems:
o Set up the speech recognition and text-to-speech systems.
2. User Input Handling
 Listening for User Input:
o Continuously monitor the microphone for user speech.
 Record Audio:
o When audio input is detected, record the audio for processing.
3. Speech Recognition
 Convert Audio to Text:
o Use the SpeechRecognition module to convert recorded audio into text.
 Error Handling:
o If recognition fails, prompt the user to repeat their input.
4. Command Processing
 Parse Recognized Text:
o Analyze the recognized text to determine the user’s intent.
 Intent Detection:
o Use keyword matching or Natural Language Processing (NLP) techniques to
identify the appropriate action (e.g., playing music, fetching weather, retrieving
Wikipedia articles).

5. Task Execution
 Execute Commands:
o Based on the identified command, execute the appropriate action:
 Web Searches: Use web scraping or APIs to retrieve information.
 Music Playback: Access the music library and play the requested track.
 Time/Date Queries: Retrieve the current date and time.
 Wikipedia Queries: Fetch summaries of articles using the Wikipedia API.
6. External API Integration
 Real-Time Information:
o For commands requiring real-time data (e.g., weather updates, news), call
the appropriate external API.
 Data Parsing and Formatting:
o Parse and format the retrieved data to ensure user-friendly output.
7. Response Generation
 Generate Text Response:
o Create a response based on the results of the executed command.
 Text-to-Speech Conversion:
o Use the text-to-speech system to convert the text response into speech.
8. User Feedback
 Output Speech Response:
o Play the generated speech response back to the user.
 Prompt for Further Input:
o Ask the user if they have more commands or questions.
9. Loop Back
 Continuous Interaction:
o Return to the user input handling step to allow for ongoing interaction.
10. Exit Condition
o Recognize Exit Commands: Provide a mechanism for the user to exit the
assistant (e.g., by recognizing commands like "exit" or "quit").
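
Taken together, steps 1 to 10 amount to a single listen-interpret-act-respond loop. The sketch below shows one possible shape of that loop; it reuses the hypothetical listen(), speak(), and handle_command() helpers from the earlier sketches, and the wake word and exit phrases follow the report ("Vikram", "exit"/"quit").

WAKE_WORD = "vikram"
EXIT_WORDS = ("exit", "quit")

def run_assistant() -> None:
    speak("Vikram is ready.")
    while True:
        text = listen()                          # steps 2-3: capture and transcribe audio
        if not text:
            continue                             # nothing recognized; keep listening
        if not text.startswith(WAKE_WORD):
            continue                             # ignore speech that lacks the wake word
        command = text[len(WAKE_WORD):].strip()  # step 4: strip the wake word, keep the command
        if any(word in command for word in EXIT_WORDS):
            speak("Goodbye.")                    # step 10: exit condition
            break
        reply = handle_command(command)          # steps 5-7: execute the task and build a response
        speak(reply)                             # steps 8-9: read the response aloud and loop back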

4.2 Existing System Architecture


The audio command is taken as input through the microphone of the device. The voice assistant then analyzes the audio command and gives the appropriate output to the user. The working process of the existing system is shown below:
1. Audio Command: In an AI desktop assistant, audio commands are used to interact with the assistant
through spoken language. These commands are typically processed by the assistant’s speech
recognition capabilities, and the assistant responds accordingly.

2. Microphone : The microphone in an AI desktop assistant is responsible for capturing user voice
commands, which are then processed through various stages, including analog-to-digital
conversion, speech recognition, natural language understanding, response generation, and finally,
user interaction.

3. Speech Recognition: Speech recognition, also known as automatic speech recognition (ASR) or voice recognition, is a technology that enables a computer or machine to convert spoken language into text or commands. It is a critical component of various applications and systems, including voice assistants, transcription services, voice-controlled devices, and more.

Fig 4.2 Processing in existing system



4.3 Proposed System Architecture


In contrast, the proposed architecture for the AI Desktop Assistant aims to create a seamless and
efficient user experience, integrating advanced technologies to address the limitations of
existing systems. Below are the detailed components and functionalities of the proposed
architecture:

Detailed Components of Proposed Architecture


1. User Interface (UI):
o Functionality: Designed for both voice and text inputs, providing users with
visual feedback on actions. A user-friendly interface ensures ease of navigation
and operation.
o Features: Includes dashboards, quick command suggestions, and error
corrections for voice input.
2. Voice Recognition & NLP Module:
o Functionality: Combines advanced voice recognition with natural
language processing to understand and interpret user commands accurately.
o Features:
 Supports a wide range of accents and dialects.
 Adapts to user speech patterns over time, improving accuracy.
 Allows for more natural conversation flow, enabling users to issue
commands in a more conversational tone.
3. Centralized Task Management System:
o Functionality: Integrates various applications (calendars, emails, file
management) into a single interface.
o Benefits:
 Users can manage all tasks from one place, improving efficiency.
 Tasks can be executed using voice commands without the need to
switch between applications.
4. Integration Layer:
o Functionality: Acts as middleware connecting the assistant with third-party
services, allowing data retrieval and task execution across applications (e.g., syncing
with Google Calendar or accessing cloud storage).
o Benefits: Promotes flexibility and ensures that the assistant can leverage
existing services to enhance its capabilities.

5. Cloud Storage:

o Functionality: Provides secure storage for user data, preferences, and


interaction history, enabling synchronization across devices.
o Benefits:
 Users can access their information from any device.
 Enhanced data security and backup features reduce the risk of data loss.
6. AI Processing Logic:
o Functionality: Utilizes machine learning algorithms to analyze user interactions
and improve the assistant's performance over time.
o Benefits:
 Provides personalized responses based on user behavior and preferences.
 Increases the assistant's contextual awareness, enabling it to make
smarter decisions and recommendations.
Benefits of Proposed Architecture
 Enhanced Accessibility: Full voice control improves usability for individuals
with disabilities.
 Improved Efficiency: Centralized task management and seamless integrations save
users time and reduce frustration.
 Dynamic Learning: The AI component allows the system to evolve based on
user interactions, leading to a more tailored experience.
 Robust Data Handling: Cloud storage solutions provide flexible, secure, and
accessible data management.

Fig 4.3 Proposed system architecture

4.3.1 Packages Used:



 Speech Recognition: The SpeechRecognition library listens to the words spoken by the user, taking input from the microphone as a source, processes it to find its meaning, and converts it into text format. This library allows the machine to understand human language.
 Pyttsx3: pyttsx3 (Python text-to-speech) is the library that makes the voice assistant talk to us. It supports common text-to-speech engines, which convert text into speech, and the voice can be set to male or female according to requirements.
 Wikipedia: The wikipedia library is used to get information from Wikipedia on any topic, answer a query, or simply perform a Wikipedia search. This library needs an Internet connection to fetch results, and the assistant provides the results to the user in both text and voice format.
 Datetime: An essential module to support date and time functionality. Whenever the user wants to know the current date and time, or wants to schedule a task at a certain time, this module is used.
 PyAutoGUI: PyAutoGUI is a Python package that controls the mouse and the keyboard; it can simulate mouse cursor movement as well as clicks and key presses. Given a particular 2-D coordinate, it can click on an exact location on the screen.
 PyWhatkit: PyWhatKit is a Python library with a number of features, such as sending messages and images through WhatsApp, playing YouTube videos, converting images to ASCII, and sending emails.
 Keyboard: The keyboard library gives the user full control over the keyboard. In particular, the press() and write() functions help in controlling keyboard keys as well as writing messages on screen.
 Speedtest: The speedtest library is used to test Internet bandwidth. It helps evaluate the uploading as well as downloading speed of the Internet; the results are reported in megabits.
 OS: The os (Operating System) module in Python is used for interacting with the operating system. In particular, os.startfile() is used to open any application installed on the system.
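
To make the role of these packages concrete, the short sketch below strings a few of them together. The search phrases, the application path, and the specific calls are illustrative assumptions; os.startfile() works only on Windows.

import datetime
import os
import wikipedia
import pywhatkit
import speedtest

# wikipedia: fetch a two-sentence summary of a topic (requires an Internet connection).
print(wikipedia.summary("Artificial intelligence", sentences=2))

# datetime: report the current date and time.
print(datetime.datetime.now().strftime("%A, %d %B %Y %I:%M %p"))

# pywhatkit: play the first YouTube result for a query in the default browser.
pywhatkit.playonyt("lo-fi study music")

# speedtest: measure download and upload bandwidth (values are returned in bits per second).
st = speedtest.Speedtest()
print(f"Download: {st.download() / 1_000_000:.1f} Mbps, Upload: {st.upload() / 1_000_000:.1f} Mbps")

# os: open an installed application; os.startfile() is Windows-only and the path is a placeholder.
os.startfile(r"C:\Windows\System32\notepad.exe")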

Chapter 5

Project Design & Process Workflow



This section outlines the design and workflow for the AI Desktop Assistant project, including use case
diagrams, algorithms, and hardware/software specifications.

5.1 Use Case Diagram / Activity Diagram /DFD

Fig 5.1 Use case diagram

In this project there is only one user. The user issues a command to the system; the system then interprets it and fetches the answer. The response is sent back to the user.

5.2 Activity Diagram

Fig 5.2 Activity diagram

Initially, the system is in idle mode. When it receives the wake-up call it begins execution. The received command is identified as either a question to be answered or a task to be performed, and the specific action is taken accordingly. After the question has been answered or the task has been performed, the system waits for another command. This loop continues until it receives a quit command, at which point it goes back to sleep.

5.3 Data Flow Diagram

5.3.1 DFD Level 0 (Context Level Diagram)

Fig 5.3.1 DFD Level 0

5.3.2 DFD Level 1

Fig 5.3.2 DFD Level 1



5.4 Hardware Specification / Software Specification

Hardware Specification
1. Processor:
o Minimum: Intel Core i3 or equivalent
o Recommended: Intel Core i5 or higher
2. RAM:
o Minimum: 4 GB
o Recommended: 8 GB or higher
3. Storage:
o Minimum: 100 GB HDD
o Recommended: 256 GB SSD for faster performance
4. Audio:
o Integrated or external microphone and speakers
5. Graphics:
o Integrated graphics are sufficient, but a dedicated GPU can enhance performance
for more complex tasks.

Software Specification
1. Operating System:
o Windows 10 or higher
2. Python:
o Version 3.6 or higher
3. IDE:
o PyCharm or any other Python IDE
4. Required Libraries:
o pyttsx3: Text-to-speech conversion
o SpeechRecognition: Voice command recognition
o pyPDF2: PDF reading capabilities
o smtplib: For sending emails
o pywhatkit: For WhatsApp messaging
o pyautogui: For automating keyboard and mouse tasks
o pyQt or similar: For GUI development
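
Note that smtplib is part of the Python standard library rather than a package installed with pip. As an illustration of the email-sending task it supports, a minimal sketch follows; the SMTP host, port, sender address, and credentials are placeholders and would normally come from configuration rather than source code.

import smtplib
from email.message import EmailMessage

def send_email(to_addr: str, subject: str, body: str) -> None:
    """Send a plain-text email through an SMTP server (placeholder host and credentials)."""
    msg = EmailMessage()
    msg["From"] = "vikram.assistant@example.com"      # placeholder sender address
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)

    with smtplib.SMTP("smtp.example.com", 587) as server:             # placeholder SMTP host
        server.starttls()                                             # upgrade to an encrypted connection
        server.login("vikram.assistant@example.com", "app-password")  # placeholder credentials
        server.send_message(msg)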

Chapter 6

Conclusion & Future Scope

6.1 Conclusion

The AI desktop assistant represents a remarkable achievement in the field of Natural Language Processing and human-computer interaction. By emulating human-like conversation, it significantly enhances user interaction and usability, making technology more intuitive, engaging, and accessible, and it opens new possibilities for personal and professional use. Ongoing research, user feedback, and ethical considerations will be essential in shaping the future of such AI assistants. Moreover, a human-like AI desktop assistant has the potential to transform industries such as customer service, healthcare, and education. It can revolutionize the way we interact with technology, creating more inclusive and user-friendly experiences for people of all ages and backgrounds.

6.2 Future Scope

The future scope for AI desktop assistants is vast and promising, as these intelligent virtual
companions continue to evolve and integrate with various aspects of our digital and physical lives.
Here are some key areas of future development and opportunities for AI desktop assistants:

1. Enhanced Natural Language Understanding (NLU): Future AI desktop assistants will


exhibit even more advanced NLU capabilities, allowing for more natural and context-aware
conversations. They will better understand nuances, sarcasm, and multi-turn dialogues, making
interactions feel increasingly human-like.
2. Multimodal Interaction: AI desktop assistants will support a combination of voice, text, and
graphical inputs, enabling users to seamlessly switch between modes of communication based
on their preferences and context.
3. Education and Training: AI desktop assistants will play a role in personalized education and
training, offering learners tailored content, explanations, and guidance in various subjects and
skills.
4. Healthcare Support: They can provide health-related information, monitor vital signs,
remind users of medication schedules, and offer telehealth features to connect users with
healthcare professionals.
5. Business Applications: In the business world, AI desktop assistants will assist with tasks like
data analysis, customer support, and workflow automation, enhancing productivity and
decision-making.
6. E-commerce and Customer Service: They will offer personalized product
recommendations, answer customer queries, and assist with online shopping, improving the
customer experience.

References

1. Alotto, F., Scidà, I., and Osello, A. (2020). "Building modeling with artificial intelligence and speech recognition for learning purpose." Proceedings of the 7th EDULEARN20 Conference, Vol. 6.
2. Beirl, D., Rogers, Y., and Yuill, N. (2019). "Using voice assistant skills in family life." Computer-Supported Collaborative Learning Conference (CSCL), Vol. 1, International Society of the Learning Sciences, 96–103.
3. Canbek, N. G. and Mutlu, M. E. (2016). "On the track of artificial intelligence: Learning with intelligent personal assistants." Journal of Human Sciences, 13(1), 592–601.
4. Malodia, S., Islam, N., Kaur, P., and Dhir, A. (2021). "Why do people use artificial intelligence (AI)-enabled voice assistants?" IEEE Transactions on Engineering Management.
5. Nasirian, F., Ahmadian, M., and Lee, O.-K. D. (2017). "AI-based voice assistant systems: Evaluating from the interaction and trust perspectives."
6. Raja, K. D. P. R. A. (2020). "Jarvis AI using Python."
7. Terzopoulos, G. and Satratzemi, M. (2019). "Voice assistants and artificial intelligence in education." Proceedings of the 9th Balkan Conference on Informatics, 1–6.
8. Sharma, R. and Dwivedi, A. "JARVIS – AI Voice Assistant." International Journal of Science and Research (IJSR).

Acknowledgement

We would like to express our sincere gratitude to everyone who has supported us throughout this
project. Your invaluable guidance, constructive criticism, and friendly advice have been
instrumental to our success. We are particularly indebted to Dr. Umakant Gohatre, our project
guide, for his unwavering support, mentorship, and expertise. His guidance and constant
supervision, coupled with his provision of necessary information, have been invaluable. We are
also grateful to our overall major project coordinator, Dr. Umakant Gohatre, for guidance
throughout the entire project process. We would like to extend our heartfelt thanks to our friends
and family for their unwavering support, encouragement, and the conducive environment they
provided for our project work. Their contributions, including their participation in literature
surveys, have been invaluable to our success.
