Final Report
BY
Prof. Mrs. Khemnar K.C.
CERTIFICATE
At
Department of Artificial Intelligence and Machine Learning,
Sahyadri Valley College of Engineering and Technology Rajuri,
Rajuri - 412411
.......................                     .......................
Internal Examiner                           External Examiner
(Prof. Mrs. Khemnar K.C.)                   (Prof. )
Certificate By Guide
This is to certify that the project group [BE-004], comprising Ms. Shirure Sakshi Namdev,
Mr. Talole Talole Dilip, Ms. Argade Swati Babasaheb, and Ms. Jadhav Snehal Sanjay, has
completed the Project Stage II work under my guidance and supervision, and that I have
verified the work for its originality in the documentation, problem statement,
implementation, and results presented in the dissertation. Any reproduction of other
relevant work has been made with prior permission, has been given due attribution, and is
included in the references.
Place: Rajuri
ACKNOWLEDGEMENT
“Feeling gratitude and not expressing it, is like wrapping a present and not giving it”
The satisfaction that accompanies the successful completion of any task would be incomplete
without mentioning the people who made it possible. I am therefore grateful to a number of
individuals whose professional guidance and encouragement made it a very pleasant
endeavor to undertake this project.
I take this opportunity to express my profound gratitude and deep regards to my guide,
Prof. Mrs. Khemnar K.C., for her exemplary guidance, monitoring, and constant
encouragement throughout the course of this project. I also express a deep sense of
gratitude to Prof. Mrs. Khemnar K.C., Head of the AI & ML Engineering Department, for her
valuable guidance and encouragement.
I am obliged to our Principal, Dr. S. B. Zope, and Vice Principal, Prof. P. Balaramudu,
for their inspiration and co-operation.
Last but not least, I express my gratitude to all staff members of Sahyadri Valley
College of Engineering, with special thanks to my family, colleagues, and friends for
their moral support and help.
ABSTRACT
A virtual assistant is a software agent that can perform tasks or services for an individual.
The term "chatbot" is sometimes used to refer to virtual assistants in general, or
specifically to those accessed by online chat. Virtual Assistant (VA) here refers to software
agents that interpret natural-language requests, given by voice or text, and act on them.
This report discusses ways in which new technology can be harnessed to create an intelligent
Virtual Personal Assistant (VPA) with a focus on user-based information. The project is a
technical brief on virtual assistant technology and its opportunities and challenges in
different areas, and it also describes the challenges of applying the technology. In today's
advanced, hi-tech world, the need for independent living is recognized for visually impaired
people, whose main problem is social restriction: they struggle in unfamiliar surroundings
without any manual aid. Visual information is the basis for most tasks, so visually impaired
people are at a disadvantage because the necessary information about the surrounding
environment is not available. With recent advances in inclusive technology, it is possible
to extend the support given to people with visual impairment.
SYNOPSIS
Objective:
1. To develop a Python-based virtual assistant capable of performing various tasks from
voice commands, including web searches, YouTube interactions, and Wikipedia queries,
while providing spoken feedback to the user.
2. Voice Command Execution: To develop a virtual assistant that can accurately recognize
and execute user commands given via speech, such as performing web searches, playing
media on YouTube, and retrieving information from Wikipedia.
3. Interactive Feedback: To provide real-time, spoken feedback based on the recognized
commands, ensuring a smooth and engaging interaction through the text-to-speech
functionality.
Abstract:
A virtual assistant is a software agent that can perform tasks or services for an individual.
The term "chatbot" is sometimes used to refer to virtual assistants in general, or
specifically to those accessed by online chat. Virtual Assistant (VA) here refers to software
agents that interpret natural-language requests, given by voice or text, and act on them.
This report discusses ways in which new technology can be harnessed to create an intelligent
Virtual Personal Assistant (VPA) with a focus on user-based information. The project is a
technical brief on virtual assistant technology and its opportunities and challenges in
different areas. It focuses on the types of virtual assistants and the structural elements of
a virtual assistant system. In this project, we studied virtual environments and virtual
assistant interfaces, and we present applications of virtual assistants that provide
opportunities for humanity in various domains. The project also describes the challenges of
applying virtual assistant technology. In today's advanced, hi-tech world, the need for
independent living is recognized for visually impaired people, whose main problem is social
restriction: they struggle in unfamiliar surroundings without any manual aid. Visual
information is the basis for most tasks, so visually impaired people are at a disadvantage
because the necessary information about the surrounding environment is not available. With
recent advances in inclusive technology, it is possible to extend the support given to people
with visual impairment.
Problem Statement:
In today's fast-paced world, users increasingly look for hands-free, quick, and
convenient ways to interact with their devices. A voice-activated assistant provides a
solution by allowing users to perform tasks, retrieve information, and control other
applications or devices using only their voice. This project aims to develop a Python-based
voice assistant that listens to user commands, interprets the intent, and executes the
requested action.
Hardware Requirements
1. A computer or laptop with a working microphone
2. Speakers or headphones for audio output
Software Requirements
1. Python 3.x
2. Libraries: SpeechRecognition, pyttsx3, wikipedia, webbrowser, PyAutoGUI, time,
pywhatkit
Methodology
1. Initialization: The script initializes the speech recognition and text-to-speech engines.
2. Speech Recognition: Captures and converts spoken words into text using the
SpeechRecognition library.
3. Command Handling: Analyses the text to determine the appropriate action, including:
• Opening a Google or YouTube search result.
• Playing a YouTube video.
• Retrieving Wikipedia summaries.
4. Feedback: Uses text-to-speech to provide responses and feedback to the user.
5. Loop: Continuously listens for commands until a termination command ("good bye") is
detected. A minimal sketch of this loop is shown below.
Expected Outcomes
1. The virtual assistant will accurately recognize and respond to predefined voice commands.
2. Users will be able to perform Google searches, YouTube searches and playback, and
retrieve Wikipedia information through voice commands.
3. The assistant will provide spoken feedback, ensuring an interactive experience.
Conclusion:
The script demonstrates the feasibility of creating a functional virtual assistant using
Python with minimal dependencies. By integrating speech recognition and text-to-speech
functionalities, the script offers a practical tool for users seeking a basic voice-controlled
assistant. Future improvements could include more sophisticated command parsing,
enhanced error handling, and additional functionalities.
Chapter 1
INTRODUCTION
Voice assistants are artificial intelligence (AI) systems that enable users to interact with
devices and perform tasks using natural language voice commands. Voice assistants have
become increasingly popular in recent years, with many people using them to control smart
devices, access information, and perform a variety of tasks on their smartphones, smart
speakers, and other devices.
Voice assistants use natural language processing (NLP) algorithms and machine learning
techniques to understand and respond to user requests. They can be activated using a specific
trigger word or phrase, such as "Hey Siri" or "Ok Google," and can perform a wide range of
tasks, such as answering questions, setting reminders, playing music, or controlling smart
home devices.
Voice assistants have the potential to make many everyday tasks more convenient and
efficient, as they allow users to interact with devices and systems using their voice rather than
requiring them to use a physical interface or input commands manually. However, voice
assistants also raise privacy and security concerns due to the sensitive personal data that they
may collect, store, and process.
Overall, voice assistants are an emerging and rapidly evolving technology with the potential
to transform how people interact with devices and systems, and they will likely continue to
play an important role in the development of AI and the Internet of Things (IoT). A notable
design challenge is that the visual interface is missing: users simply cannot see or touch a
voice interface. Virtual assistants are software programs that help ease your day-to-day
tasks, such as showing weather reports or playing music. They can take commands via text
(online chatbots) or by voice.
Chapter 2
LITERATURE SURVEY
Authors: Manjusha Jadhav, Krushna Kalyankar, Gnaesh Narkhede and Swapnil Kharose
In this modern era, day-to-day life has become smarter and interlinked with technology. We
already know of voice assistants such as Google Assistant and Siri. Our voice assistant
system can act as a smart friend, daily schedule manager, to-do writer, calculator, and
search tool. The project works on speech input and gives output through speech and text on
screen. The assistant connects to the World Wide Web to provide the results the user
requires. Natural language processing algorithms help machines engage in communication
using natural human language in many forms.
Digitization brings new possibilities to ease our daily activities by means of assistive
technology. Amazon Alexa, Apple Siri, Microsoft Cortana, and Samsung Bixby, to name only a
few, have been successful in the age of smart personal assistants (SPAs). A voice assistant
is defined as a digital assistant that combines artificial intelligence, machine learning,
speech recognition, Natural Language Processing (NLP), speech synthesis, and various
actuation mechanisms to sense and influence the environment. Different NLP techniques are
used to convert speech to text (STT), process the text, convert text to speech (TTS), and
add various functionalities. However, SPA research is highly fragmented among disciplines
such as computer science, human-computer interaction, and information systems, which leads
to "reinventing the wheel" approaches and thus impedes progress and conceptual clarity. In
this paper, the authors present an exhaustive, integrative literature review to build a
solid basis for future research, contributing a consolidated, integrated view of prior
research and laying the foundation for an SPA classification scheme.
Authors: Prof. Suresh V. Reddy, Chandresh Chhari, Prajwal Wakde, Nikhil Kamble
In today's developed generation, how cool is it to build your own personal assistant like
Alexa or Siri? It is not very complex and can be done easily in Python. Personal virtual
assistants have been capturing a lot of attention lately, and chatbots are now common on
most business websites. The main agenda of this voice assistant is to make people's work
easier by giving immediate, computed results. Its fundamental mission is to reduce the use
of input devices such as the keyboard, mouse, and touch pen, which lessens both the hardware
cost and the space taken by such devices.
Chapter 3
METHODOLOGY
3.1 Overview of existing system
A virtual voice assistant is a software program that utilizes natural language processing and
voice recognition technologies to understand and respond to spoken commands and queries.
It allows users to interact with their devices, applications, and services using voice
commands, and can perform a wide range of tasks such as making phone calls, scheduling
appointments, setting reminders, and providing information. Some popular examples of
virtual voice assistants include Amazon Alexa, Google Assistant, and Apple Siri. These
AI-powered systems can be integrated with other devices and services to create a more
seamless and convenient user experience.
The main objective of building personal assistant software (a virtual assistant) is to use
semantic data sources available on the web and user-generated content, and to provide
knowledge from knowledge databases. The main purpose of an intelligent virtual assistant is
to answer questions that users may have. This may be done in a business environment, for
example on a business website with a chat interface. On the mobile platform, the intelligent
virtual assistant is available as a call-button-operated service where a voice asks the user
"What can I do for you?" and then responds to verbal input. Virtual assistants can save you
a tremendous amount of time: we spend hours on online research and then on writing up the
report in our own terms. Provide a topic for research and continue with your tasks while the
assistant does the research. Another difficult task is remembering test dates, birthdays, or
anniversaries; it comes as a surprise when you enter the class and realize there is a class
test today. Tell the assistant in advance about your tests, and it reminds you well ahead of
time so you can prepare. One of the main advantages of voice search is its rapidity: voice
is reputed to be four times faster than written search, since we can write about 40 words
per minute but speak around 150 in the same period of time. In this respect, the ability of
personal assistants to accurately recognize spoken words is a prerequisite for their
adoption by consumers.
Chapter 4
IMPLEMENTATION PLAN
Planning plays an important role in the successful completion of a project. This plan acts
as a checklist of the tasks done. It helps in stating the amount of work to be done in a
stipulated period of time, and it makes it possible to judge our progress against the time
chart and set milestones for the project.
4.1 Data Flow Diagram
4.1.1 Level 0 Data Flow Diagram
[Block diagram and Level 0 data flow diagram figures]
Month            Week              Work Planned
August 2024      3rd & 4th week    Presentation with design; preparation of preliminary report.
September 2024   1st week          Project Stage-I; coding (at least 2 modules finished, 30% of total work).
September 2024   2nd & 3rd week    First demonstration of project work expected (60% of total work).
September 2024   4th week          Discussion with Project Guide; test plan, design, and installation.
October 2024     1st & 2nd week    Completion of remaining work and any changes suggested by the Project Guide; final project demonstration.
Chapter 5
SOFTWARE REQUIREMENT SPECIFICATION
5.1 Libraries:
5.1.1 pyttsx3
pyttsx3 is a text-to-speech conversion library in Python, used to convert the text passed to
it into speech. It is compatible with Python 2 and 3. An application invokes the
pyttsx3.init() factory function to get a reference to a pyttsx3 engine. It is a very
easy-to-use tool that converts the entered text into speech. The module supports two voices,
one female and one male, provided by "sapi5" on Windows.
Command to install: pip install pyttsx3
It supports three TTS engines:
• sapi5 - SAPI5 on Windows
• nsss - NSSpeechSynthesizer on Mac OS X
• espeak - eSpeak on every other platform
5.1.2 SpeechRecognition
Speech recognition allows computers to understand human language: it is a machine's ability
to listen to spoken words and identify them. We can use speech recognition in Python to
convert the spoken words into text, make a query, or give a reply. The library supports many
speech recognition engines and APIs, including the Google Speech Engine and the Google Cloud
Speech API.
Command to install: pip install SpeechRecognition
5.1.3 wikipedia
This is a Python library that makes it easy to access and parse data from Wikipedia: search
Wikipedia, get article summaries, get data such as links and images from a page, and more.
Wikipedia itself is a multilingual online encyclopedia.
Command to install: pip install wikipedia
5.1.4 webbrowser
The webbrowser module is a convenient web-browser controller. It provides a high-level
interface for displaying web-based documents to users. webbrowser can also be used as a CLI
tool: it accepts a URL as the argument, where the optional parameter -n opens the URL in a
new browser window, if possible, and -t opens it in a new browser tab. This is a built-in
module, so no installation is required.
5.1.5 datetime
This module is used to get the date and time for the user. It is a built-in module, so there
is no need to install it externally. The datetime module supplies classes to work with dates
and times. Dates and datetimes are objects in Python, so when we manipulate them, we are
actually manipulating objects, not strings or timestamps.
5.1.6 time
This module provides many ways of representing time in code, such as objects, numbers, and
strings. It also provides functionality beyond representing time, such as waiting during
code execution and measuring the efficiency of our code. This is a built-in module, so no
installation is necessary.
5.1.7 requests
The requests module allows you to send HTTP requests from Python. An HTTP request returns a
Response object with all the response data. With it, we can add content such as headers,
form data, multipart files, and parameters via simple Python calls, and access the response
data in the same way.
Command to install: pip install requests
5.1.8 pywhatkit
Python provides numerous libraries, and pywhatkit is one of them. The pywhatkit module can
send WhatsApp messages from a Python script: with a few lines of code, we can send a message
to a desired number. It uses WhatsApp Web to send these messages, and it also provides
helpers for Google search and YouTube playback.
Command to install: pip install pywhatkit
5.1.9 PyAutoGUI
PyAutoGUI is a cross-platform Python module that allows you to programmatically control the
mouse and keyboard, making it a powerful tool for automating tasks within graphical user
interfaces (GUIs).
Command to install: pip install pyautogui
5.2.1 Python
Python is an object-oriented (OOP-based), high-level, interpreted programming language. It
is a robust, highly useful language focused on rapid application development (RAD). Python
makes writing and executing code easy, and it can often implement the same logic in as
little as one-fifth of the code required by other OOP languages. Python provides a huge list
of benefits, and its usage is not limited to any one activity: its growing popularity has
allowed it to enter some of the most popular and complex fields, such as Artificial
Intelligence (AI), Machine Learning (ML), natural language processing, and data science.
Python has libraries for every need of this project; the libraries used here include
SpeechRecognition to recognize voice, pyttsx3 for text-to-speech, and Selenium for web
automation.
Chapter 6
SYSTEM DESIGN
6.1 Requirement Analysis
In order to effectively design and develop a system, it is important to understand and
document its requirements. The process of gathering and documenting the requirements of a
system is known as requirement analysis. It helps to identify the goals of the system, the
stakeholders, and the constraints within which the system will be developed. The
requirements serve as a blueprint for the development of the system and provide a reference
point for testing and validation.
● Hardware Requirements
• Processor – 2.3 GHz or more
• RAM – 4 GB or more
• Disk Space – 50 GB or more
• Input Devices – Microphone & Keyboard
• Output Devices – Speaker & Monitor
• Internet Connection
● Software Requirements
• Python 3.12
• APIs:
  - News API
  - WolframAlpha API
  - OpenWeatherMap API
  - TMDB API
  - DreamStudio API
Chapter 7
OUTCOMES
The outcomes of a virtual voice assistant project can be quite broad and vary with the goals
and specific implementation of the assistant. Key outcomes such a project could yield
include operational efficiency:
• Automated routine tasks: the assistant can handle tasks like scheduling, reminders, and
FAQs, reducing the workload on support staff.
• 24/7 availability: a virtual assistant can operate around the clock, providing support
outside of business hours and increasing efficiency.
• Reduced costs: with automation, organizations save on labor costs, as the need for human
intervention in routine tasks decreases.
Chapter 8
CODING SECTION
import asyncio
from random import randint
from PIL import Image
import requests
from dotenv import get_key
import os
from time import sleep
try:
    # Try to open and display the image.
    img = Image.open(image_path)
    print(f"Opening image: {image_path}")
    img.show()
    sleep(1)  # Pause for 1 second before showing the next image.
except IOError:
    print(f"Unable to open {image_path}")
# Poll the data file until an image-generation request is flagged
# (loop condition reconstructed; the original check is not shown in this fragment).
while True:
    try:
        # Read the status and prompt from the data file.
        with open(r"Frontend\Files\ImageGeneration.data", "r") as f:
            Data: str = f.read()
        if "True" in Data:
            break  # A generation request is pending; handle it.
        else:
            sleep(1)  # Wait for 1 second before checking again.
    except:
        pass
#############################################################################
# Import required libraries
from AppOpener import close, open as appopen # Import functions to open and close apps.
from webbrowser import open as webopen # Import web browser functionality.
from pywhatkit import search, playonyt # Import functions for Google search and YouTube playback.
from dotenv import dotenv_values # Import dotenv to manage environment variables.
from bs4 import BeautifulSoup # Import BeautifulSoup for parsing HTML content.
from rich import print # Import rich for styled console output.
from groq import Groq # Import Groq for AI chat functionalities.
import webbrowser # Import webbrowser for opening URLs.
import subprocess # Import subprocess for interacting with the system.
import requests # Import requests for making HTTP requests.
import keyboard # Import keyboard for keyboard-related actions.
import asyncio # Import asyncio for asynchronous programming.
import os # Import os for operating system functionalities.
# Fragment from the content-writing handler:
Topic: str = Topic.replace("Content ", "")  # Remove "Content " from the topic.
ContentByAI = ContentWriterAI(Topic)  # Generate content using AI.
def OpenApp(app, sess=requests.session()):  # (enclosing function signature reconstructed)
    try:
        appopen(app, match_closest=True, output=True, throw_error=True)  # Attempt to open the app.
        return True  # Indicate success.
    except:
        # Nested function to extract links from HTML content.
        def extract_links(html):
            if html is None:
                return []
            soup = BeautifulSoup(html, 'html.parser')  # Parse the HTML content.
            links = soup.find_all('a', {'jsname': 'UWckNb'})  # Find relevant result links.
            return [link.get('href') for link in links]  # Return the links.
        # if response.status_code == 200:
        #     return response.text  # Return the HTML content.
        # else:
        #     print("Failed to retrieve search results.")  # Print an error message.
        #     return None
        # if html:
        #     link = extract_links(html)[0]  # Extract the first link from the search results.
        #     webopen(link)  # Open the link in a web browser.
        #     return True
        # if html:
        #     links = extract_links(html)
        #     if links:  # Ensure we have valid links before accessing index 0.
        #         webopen(links[0])  # Open the first search result.
        #     else:
        #         print(f"No valid links found for '{app}', opening Google search instead.")
        #         webopen(f"https://www.google.com/search?q={app}")  # Open the Google search results page.
        # else:
        #     print("Failed to retrieve Google search results.")
if "chrome" in app:
pass # Skip if the app is Chrome.
else:
try:
close(app, match_closest=True, output=True, throw_error=True) # Attempt to close the app.
return True # Indicate success.
except :
return False # Indicate failure.
else:
print(f"No Function Found. For {command}") # Print an error for unrecognized commands.
# if __name__ == "__main__":
# asyncio.run(Automation(["open facebook", "open instagam", "open telegram", "play bewajah",
"content resigntion letter"]))
############################################################################
from groq import Groq  # Import the Groq library to use its API.
from json import load, dump  # Import functions to read and write JSON files.
import datetime  # Import the datetime module for real-time date and time information.
from dotenv import dotenv_values  # Import dotenv_values to read environment variables from a .env file.

# Load the environment variables from the .env file.
env_vars = dotenv_values(".env")

# Retrieve specific environment variables for username, assistant name, and API key.
Username = env_vars.get("Username")
Assistantname = env_vars.get("Assistantname")
GroqAPIKey = env_vars.get("GroqAPIKey")
# Define a system message that provides context to the AI chatbot about its role and behavior.
System = f"""Hello, I am {Username}, You are a very accurate and advanced AI chatbot named
{Assistantname} which also has real-time up-to-date information from the internet.
*** Do not tell time until I ask, do not talk too much, just answer the question.***
*** Reply in only English, even if the question is in Hindi, reply in English.***
*** Do not provide notes in the output, just answer the question and never mention your training data. ***
"""
try:
    # Load the existing chat log from the JSON file.
    with open(r"Data\ChatLog.json", "r") as f:
        messages = load(f)
except FileNotFoundError:
    messages = []  # (fallback reconstructed: start with an empty chat log)

Answer = Answer.replace("</s>", "")  # Clean up any unwanted tokens from the response.
#################################################################################
function startRecognition() {
    // Use the browser's SpeechRecognition API (prefixed on Chromium-based browsers).
    recognition = new (window.SpeechRecognition || window.webkitSpeechRecognition)();
    recognition.lang = '';
    recognition.continuous = true;
    recognition.onresult = function(event) {
        const transcript = event.results[event.results.length - 1][0].transcript;
        output.textContent += transcript;
    };
    recognition.onend = function() {
        recognition.start();  // Restart automatically so listening is continuous.
    };
    recognition.start();
}
function stopRecognition() {
    recognition.stop();
    output.innerHTML = "";
}
</script>
</body>
</html>'''
# Replace the language setting in the HTML code with the input language from the environment variables.
HtmlCode = str(HtmlCode).replace("recognition.lang = '';", f"recognition.lang = '{InputLanguage}';")

# Poll the page for recognized text (fragment; driver and By come from Selenium,
# imported elsewhere in this module).
while True:
    try:
        # Get the recognized text from the HTML output element.
        Text = driver.find_element(by=By.ID, value="output").text
        if Text:
            # Stop recognition by clicking the stop button.
            driver.find_element(by=By.ID, value="end").click()
    except:
        pass  # (error handling elided in this fragment)
##################################################################################
env_vars = dotenv_values(".env")
Username = env_vars.get("Username")
Assistantname = env_vars.get("Assistantname")
DefaultMessage = f'''{Username} : Hello {Assistantname}, How are you?
{Assistantname} : Welcome {Username}. I am doing well. How may i help you?'''
subprocesses = []
Functions = ["open", "close", "play", "system", "content", "google search", "youtube search"]
def ShowDefaultChatIfNoChats():
File = open(r'Data\ChatLog.json', "r", encoding='utf-8')
if len(File.read())<5:
with open(TempDirectoryPath('Database.data'),'w', encoding='utf-8') as file:
file.write("")
with open(TempDirectoryPath('Responses.data'), 'w', encoding='utf-8') as file:
file.write(DefaultMessage)
def ReadChatLogJson():
    with open(r'Data\ChatLog.json', 'r', encoding='utf-8') as file:
        chatlog_data = json.load(file)
    return chatlog_data

def ChatLogIntegration():
    json_data = ReadChatLogJson()
    formatted_chatlog = ""
    for entry in json_data:
        if entry["role"] == "user":
            formatted_chatlog += f"User: {entry['content']}\n"
        elif entry["role"] == "assistant":
            formatted_chatlog += f"Assistant: {entry['content']}\n"
    formatted_chatlog = formatted_chatlog.replace("User", Username + " ")
    formatted_chatlog = formatted_chatlog.replace("Assistant", Assistantname + " ")
def ShowChatsOnGUI():
    File = open(TempDirectoryPath('Database.data'), "r", encoding='utf-8')
    Data = File.read()
    if len(str(Data)) > 0:
        lines = Data.split('\n')
        result = '\n'.join(lines)
        File.close()
        File = open(TempDirectoryPath('Responses.data'), "w", encoding='utf-8')
        File.write(result)
        File.close()

def InitialExecution():
    SetMicrophoneStatus("False")
    ShowTextToScreen("")
    ShowDefaultChatIfNoChats()
    ChatLogIntegration()
    ShowChatsOnGUI()

InitialExecution()
def MainExecution():
    TaskExecution = False
    ImageExecution = False
    ImageGenerationQuery = ""

    SetAssistantStatus("Listening ...")
    Query = SpeechRecognition()
    ShowTextToScreen(f"{Username} : {Query}")
    SetAssistantStatus("Thinking ...")
    Decision = FirstLayerDMM(Query)

    print("")
    print(f"Decision : {Decision}")
    print("")

    if ImageExecution == True:
        try:
            # Run the image-generation backend as a separate process.
            p1 = subprocess.Popen(['python', r'Backend\ImageGeneration.py'],
                                  stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                                  stdin=subprocess.PIPE, shell=False)
            subprocesses.append(p1)
        except Exception as e:
            print(f"Error starting ImageGeneration.py: {e}")
    # (Computation of G, R, and Merged_query is elided in this fragment.)
    # If a real-time query is present, answer via the real-time search engine.
    if (G and R) or R:
        SetAssistantStatus("Searching ...")
        Answer = RealtimeSearchEngine(QueryModifier(Merged_query))
        ShowTextToScreen(f"{Assistantname} : {Answer}")
        SetAssistantStatus("Answering ...")
        TextToSpeech(Answer)
        return True
    else:
        for Queries in Decision:
            if "general" in Queries:
                SetAssistantStatus("Thinking ...")
                QueryFinal = Queries.replace("general", "")
                Answer = ChatBot(QueryModifier(QueryFinal))
                ShowTextToScreen(f"{Assistantname} : {Answer}")
                SetAssistantStatus("Answering ...")
                TextToSpeech(Answer)
                return True
def FirstThread():
    while True:
        CurrentStatus = GetMicrophoneStatus()
        if CurrentStatus == "True":
            MainExecution()
        else:
            AIStatus = GetAssistantStatus()
            if "Available..." in AIStatus:  # (condition reconstructed)
                sleep(0.1)
            else:
                SetAssistantStatus("Available... ")

def SecondThread():
    GraphicalUserInterface()

if __name__ == "__main__":
    thread2 = threading.Thread(target=FirstThread, daemon=True)
    thread2.start()
    SecondThread()
################################################################################
def AnswerModifier(Answer):
    lines = Answer.split('\n')
    non_empty_lines = [line for line in lines if line.strip()]
    modified_answer = '\n'.join(non_empty_lines)
    return modified_answer

def QueryModifier(Query):
    new_query = Query.lower().strip()
    query_words = new_query.split()
    question_words = ["how", "what", "who", "where", "when", "why", "which", "whose",
                      "whom", "can you", "what's", "where's", "how's"]
    # End questions with "?" (branch reconstructed) and statements with ".".
    if any(word + " " in new_query for word in question_words):
        if query_words[-1][-1] in ['.', '?', '!']:
            new_query = new_query[:-1] + "?"
        else:
            new_query += "?"
    else:
        if query_words[-1][-1] in ['.', '?', '!']:
            new_query = new_query[:-1] + "."
        else:
            new_query += "."
    return new_query.capitalize()
def SetMicrophoneStatus(Command):
    with open(rf'{TempDirPath}\Mic.data', "w", encoding='utf-8') as file:
        file.write(Command)

def GetMicrophoneStatus():
    with open(rf'{TempDirPath}\Mic.data', "r", encoding='utf-8') as file:
        Status = file.read()
    return Status

def SetAssistantStatus(Status):
    with open(rf'{TempDirPath}\Status.data', "w", encoding='utf-8') as file:
        file.write(Status)

def GetAssistantStatus():
    with open(rf'{TempDirPath}\Status.data', "r", encoding='utf-8') as file:
        Status = file.read()
    return Status

def MicButtonInitialed():
    SetMicrophoneStatus("False")

def MicButtonClosed():
    SetMicrophoneStatus("True")

def GraphicsDirectoryPath(Filename):
    Path = rf'{GraphicsDirPath}\{Filename}'
    return Path

def TempDirectoryPath(Filename):
    Path = rf'{TempDirPath}\{Filename}'
    return Path

def ShowTextToScreen(Text):
    with open(rf'{TempDirPath}\Responses.data', "w", encoding='utf-8') as file:
        file.write(Text)
class ChatSection(QWidget):
    def __init__(self):
        super(ChatSection, self).__init__()
        layout = QVBoxLayout(self)
        layout.setContentsMargins(-10, 40, 40, 100)
        layout.setSpacing(-100)
        self.chat_text_edit = QTextEdit()
        self.chat_text_edit.setReadOnly(True)
        self.chat_text_edit.setTextInteractionFlags(Qt.NoTextInteraction)  # No text interaction.
        self.chat_text_edit.setFrameStyle(QFrame.NoFrame)
        layout.addWidget(self.chat_text_edit)
        self.setStyleSheet("background-color: black;")
        layout.setSizeConstraint(QVBoxLayout.SetDefaultConstraint)
        layout.setStretch(1, 1)
        self.setSizePolicy(QSizePolicy(QSizePolicy.Expanding, QSizePolicy.Expanding))
        text_color = QColor(Qt.blue)
        text_color_text = QTextCharFormat()
        text_color_text.setForeground(text_color)
        self.chat_text_edit.setCurrentCharFormat(text_color_text)
        self.gif_label = QLabel()
        self.gif_label.setStyleSheet("border: none;")
        movie = QMovie(GraphicsDirectoryPath('Jarvis.gif'))
        max_gif_size_W = 480
        max_gif_size_H = 270
        movie.setScaledSize(QSize(max_gif_size_W, max_gif_size_H))
        self.gif_label.setAlignment(Qt.AlignRight | Qt.AlignBottom)
        self.gif_label.setMovie(movie)
        movie.start()
        layout.addWidget(self.gif_label)
        self.label = QLabel("")
        self.label.setStyleSheet("color: white; font-size: 16px; margin-right: 195px; border: none; margin-top: -30px;")
        self.label.setAlignment(Qt.AlignRight)
        layout.addWidget(self.label)
        layout.setSpacing(-10)
        font = QFont()
        font.setPointSize(13)
        self.chat_text_edit.setFont(font)
        self.timer = QTimer(self)
        self.timer.timeout.connect(self.loadMessages)
        self.timer.timeout.connect(self.SpeechRecogText)
        self.timer.start(5)
        self.chat_text_edit.viewport().installEventFilter(self)
        self.setStyleSheet("""
QScrollBar:vertical {
border:none;
background:black;
width:10px;
margin:0px 0px 0px 0px;
}
QScrollBar::handle:vertical {
background:white;
min-height:20px;
}
QScrollBar::add-line:vertical {
background:black;
subcontrol-position:bottom;
subcontrol-origin:margin;
height:10px;
}
QScrollBar::sub-line:vertical {
background:black;
subcontrol-position:top;
subcontrol-origin:margin;
height:10px;
}
QScrollBar::up-arrow:vertical, QScrollBar::down-arrow:vertical {
border:none;
background:none;
color:none;
}
QScrollBar::add-page:vertical, QScrollBar::sub-page:vertical {
background:none;
}
""")
    def loadMessages(self):
        global old_chat_message
        if messages is None:
            pass
        elif len(messages) <= 1:
            pass
        elif str(old_chat_message) == str(messages):
            pass
        else:
            self.addMessage(message=messages, color='White')
            old_chat_message = messages

    def SpeechRecogText(self):
        with open(TempDirectoryPath('Status.data'), "r", encoding='utf-8') as file:
            messages = file.read()
        self.label.setText(messages)

    def toggle_icon(self, event=None):  # (method signature reconstructed for this fragment)
        if self.toggled:
            self.load_icon(GraphicsDirectoryPath('voice.png'), 60, 60)
            MicButtonInitialed()
        else:
            self.load_icon(GraphicsDirectoryPath('mic.png'), 60, 60)
            MicButtonClosed()
class InitialScreen(QWidget):
    def SpeechRecogText(self):
        with open(TempDirectoryPath('Status.data'), "r", encoding='utf-8') as file:
            messages = file.read()
        self.label.setText(messages)

    def toggle_icon(self, event=None):  # (method signature reconstructed for this fragment)
        if self.toggled:
            self.load_icon(GraphicsDirectoryPath('Mic_on.png'), 60, 60)
            MicButtonInitialed()
        else:
            self.load_icon(GraphicsDirectoryPath('Mic_off.png'), 60, 60)
            MicButtonClosed()
class MessageScreen(QWidget):
    def __init__(self, parent=None):
        super().__init__(parent)
        desktop = QApplication.desktop()
        screen_width = desktop.screenGeometry().width()
        screen_height = desktop.screenGeometry().height()
        layout = QVBoxLayout()
        label = QLabel("")
        layout.addWidget(label)
        chat_section = ChatSection()
        layout.addWidget(chat_section)
        self.setLayout(layout)
        self.setStyleSheet("background-color: black;")
        self.setFixedHeight(screen_height)
        self.setFixedWidth(screen_width)
class CustomTopBar(QWidget):
    def __init__(self, parent, stacked_widget):
        super().__init__(parent)
        self.initUI()
        self.current_screen = None
        self.stacked_widget = stacked_widget

    def initUI(self):
        self.setFixedHeight(50)
        layout = QHBoxLayout(self)
        layout.setAlignment(Qt.AlignRight)
        home_button = QPushButton()
        home_icon = QIcon(GraphicsDirectoryPath("Home.png"))
        home_button.setIcon(home_icon)
        home_button.setText("Home")
        home_button.setStyleSheet("height:40px; line-height:40px; background-color:white; color:black")
        message_button = QPushButton()
        message_icon = QIcon(GraphicsDirectoryPath("Chats.png"))
        message_button.setIcon(message_icon)
        message_button.setText("Chat")
        message_button.setStyleSheet("height:40px; line-height:40px; background-color:white; color:black")
        minimize_button = QPushButton()
        minimize_icon = QIcon(GraphicsDirectoryPath('Minimize2.png'))
        minimize_button.setIcon(minimize_icon)
        minimize_button.setStyleSheet("background-color:white")
        minimize_button.clicked.connect(self.minimizeWindow)
        self.maximize_button = QPushButton()
        self.maximize_icon = QIcon(GraphicsDirectoryPath('Maximize.png'))
        self.restore_icon = QIcon(GraphicsDirectoryPath('Minimize.png'))
        self.maximize_button.setIcon(self.maximize_icon)
        self.maximize_button.setFlat(True)
        self.maximize_button.setStyleSheet("background-color:white")
        self.maximize_button.clicked.connect(self.maximizeWindow)
        close_button = QPushButton()
        close_icon = QIcon(GraphicsDirectoryPath('Close.png'))
        close_button.setIcon(close_icon)
        close_button.setStyleSheet("background-color:white")
        close_button.clicked.connect(self.closeWindow)
        line_frame = QFrame()
        line_frame.setFixedHeight(1)
        line_frame.setFrameShape(QFrame.HLine)
        line_frame.setFrameShadow(QFrame.Sunken)
        line_frame.setStyleSheet("border-color: black;")
        title_label = QLabel(f"{str(Assistantname).capitalize()} ")  # Advanced virtual assistant name.
        title_label.setStyleSheet("color:black; font-size:18px; background-color:white")
        home_button.clicked.connect(lambda: self.stacked_widget.setCurrentIndex(0))
        message_button.clicked.connect(lambda: self.stacked_widget.setCurrentIndex(1))
        layout.addWidget(title_label)
        layout.addStretch(1)
        layout.addWidget(home_button)
        layout.addWidget(message_button)
        layout.addStretch(1)
        layout.addWidget(minimize_button)
        layout.addWidget(self.maximize_button)
        layout.addWidget(close_button)
        layout.addWidget(line_frame)
        self.draggable = True
        self.offset = None
    def minimizeWindow(self):
        self.parent().showMinimized()

    def maximizeWindow(self):
        if self.parent().isMaximized():
            self.parent().showNormal()
            self.maximize_button.setIcon(self.maximize_icon)
        else:
            self.parent().showMaximized()
            self.maximize_button.setIcon(self.restore_icon)

    def closeWindow(self):
        self.parent().close()

    def showMessageScreen(self):
        if self.current_screen is not None:
            self.current_screen.hide()
        message_screen = MessageScreen(self)
        layout = self.parent().layout()
        if layout is not None:
            layout.addWidget(message_screen)
        self.current_screen = message_screen
class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowFlags(Qt.FramelessWindowHint)
        self.initUI()

    def initUI(self):
        desktop = QApplication.desktop()
        screen_width = desktop.screenGeometry().width()
        screen_height = desktop.screenGeometry().height()
        stacked_widget = QStackedWidget(self)
        initial_screen = InitialScreen()
        message_screen = MessageScreen()
        stacked_widget.addWidget(initial_screen)
        stacked_widget.addWidget(message_screen)
        self.setGeometry(0, 0, screen_width, screen_height)
        self.setStyleSheet("background-color: black;")
        top_bar = CustomTopBar(self, stacked_widget)
        self.setMenuWidget(top_bar)
        self.setCentralWidget(stacked_widget)

def GraphicalUserInterface():
    app = QApplication(sys.argv)
    window = MainWindow()
    window.show()
    sys.exit(app.exec_())

if __name__ == "__main__":
    GraphicalUserInterface()
Chapter 9
OUTPUT
[Screenshots of the running assistant]
Chapter 10
SYSTEM OVERVIEW
10.1 Advantages
1 Instant Access to Information: Users can quickly get answers to questions, check weather
forecasts, news updates, traffic conditions, and more, simply by asking. It's like having an
expert available instantly.
2 Personalization: Voice assistants can learn and adapt to individual user preferences, habits,
and past interactions over time. This allows them to provide personalized recommendations,
tailored responses, and proactive assistance.
3 Natural Interaction: Interacting with a voice assistant through spoken commands feels
more natural and intuitive than traditional input methods. This makes technology more user-
friendly for a wider demographic, including children and the elderly.
4 Multilingual Support: Many voice assistants support multiple languages and dialects,
breaking down language barriers and catering to a global audience.
10.2 Limitations
1. Performance: Voice assistants may have limitations in their performance, such as the
speed at which they can process and respond to user requests, or the complexity of the
tasks they can handle.
2. Privacy and Security: Voice assistants may not always clearly communicate their data
collection and sharing practices to users, which raises concerns about transparency and
consent.
3. Customization: Voice assistants may not offer users a high degree of customization or
control over their functionality, which can limit their usefulness and appeal.
4. Accuracy: Voice assistants may not always accurately understand or respond to user
requests and queries, which can lead to frustration and a poor user experience.
5. Capabilities: Voice assistants may not support all tasks or functions that users want to
perform, and they may not be able to integrate with all devices or systems.
10.3 Features
• It can get real-time information such as news headlines, weather reports, the IP address,
internet speed, and system stats.
• It can fetch entertaining content such as jokes and the latest movies or TV series, and
play songs and videos on YouTube.
• It can generate an image from given text and send an email. It can perform system
operations such as opening/closing/switching tabs; copying, pasting, deleting, and
selecting text; creating a new file; taking a screenshot; and minimizing, maximizing,
switching, and closing windows.
• It can get brief information on any topic, perform arithmetic operations, and answer
general-knowledge questions.
• It can perform a Google search and find a map of, or the distance between, two places on
Google Maps.
• We can also get the chat history along with the date and time of each query.
Chapter 11
CONCLUSION
In conclusion, the voice assistant developed in this project is capable of performing various
tasks such as browsing the internet, sending emails, generating images, and interacting with
the user through conversation. It does so by utilizing various APIs and technologies such as
Stability AI's DreamStudio, Google Speech Recognition, and SMTP.
The voice assistant is also able to perform system tasks such as opening and closing tabs,
windows, and applications, as well as taking screenshots and manipulating text in the
clipboard.
Chapter 12
BIBLIOGRAPHY
Prof. Suresh V. Reddy, Chandresh Chhari, Prajwal Wakde, and Nikhil Kamble, "Personal
Desktop Virtual Voice Assistant using Python," IJCRT, 2022.
OTHER DOCUMENTATION
13.1 Base paper
13.2 Published paper
13.3 Plagiarism