Research Paper: A.I. Based Virtual Personal Assistant
Abhay [abhaypmms@gmail.com]
Sooraj Pratap [soorajpratap05@gmail.com]
Abhishek Gautam [abhishekgautam730249@gmail.com]
Ritesh Yadav [yadavritesh829@gmail.com]
(Department of C.S.E. & I.T., Veer Bahadur Singh Purvanchal University, Jaunpur 211001)
Keywords: AI, Python, Voice Assistant, Virtual Assistant, Speech Recognition, Personal Assistant, Desktop Assistant
                                                ABSTRACT
Python is a popular language with a large library ecosystem, so it is easy to write a script for a voice
assistant in Python. The instructions for the assistant can be handled as per the requirements of the user.
Speech recognition is the process of converting speech into text, and it is commonly used in voice assistants
such as Alexa and Siri. In Python, the SpeechRecognition library allows us to convert speech into text. It was
an interesting task to make my own assistant. With it, it became easier to send emails without typing a word,
to search Google without opening the browser by hand, and to perform many other daily tasks, such as playing
music or opening a favourite IDE, with a single voice command. Technology has now advanced to the point where
machines can perform many routine tasks as effectively as we can, and in some cases more effectively.
Functionalities of this project include:
1. It can send emails.
2. It can read PDF.
3. It can send text on WhatsApp.
4. It can open command prompt, your favourite IDE, notepad etc.
5. It can play music.
6. It can do Wikipedia searches for you.
7. It can open websites like Google, YouTube, etc., in a web browser.
8. It can give weather forecast.
9. It can give desktop reminders of your choice.
10. It can have some basic conversation.
A natural question is in what sense this is an AI. Strictly speaking, the virtual assistant I have created is
not an AI; it is the output of a bundle of conditional statements. Fundamentally, however, the main purpose of
AI machines is to perform human tasks with the same efficiency as humans or even more efficiently, and the
assistant is built toward that purpose.
1. INTRODUCTION
Artificial Intelligence, when applied to machines, gives them the capability to think like humans: a
computer system is designed to carry out tasks that typically require human interaction. Python is a
popular language with a large library ecosystem, so it is easy to write a script for a voice assistant
in Python. The instructions for the assistant can be handled as per the requirements of the user.
Speech recognition is the process of converting speech into text, and it is commonly used in voice
assistants such as Alexa and Siri. In Python, the SpeechRecognition library allows us to convert speech
into text. It was an interesting task to make my own assistant. With it, it became easier to send
emails without typing a word, to search Google without opening the browser by hand, and to perform
many other daily tasks, such as playing music or opening a favourite IDE, with a single voice command.
Technology has now advanced to the point where machines can perform many routine tasks as effectively
as we can, and in some cases more effectively.

By making this project, I realized that AI is decreasing human effort and saving time in every field.
Because the voice assistant builds on AI techniques such as speech recognition, the results it provides
are intended to be accurate and efficient. The assistant helps to reduce human effort and save time
while performing a task: it removes the need for typing and behaves like another individual to whom we
are talking and whom we are asking to perform the task. For the tasks it supports, the assistant is
comparable to a human assistant and can be faster. The libraries and packages used to build this
assistant were chosen with time complexity in mind, to keep response time low.

The functionalities include sending emails, reading PDFs, sending text on WhatsApp, opening the command
prompt, a favourite IDE, Notepad, etc., playing music, doing Wikipedia searches, opening websites such
as Google and YouTube in a web browser, giving weather forecasts, giving desktop reminders of the
user's choice, and holding some basic conversation. The PyCharm IDE was used for this project, and I
created all Python files in PyCharm. Along with this, I used the following modules and libraries in my
project: pyttsx3, SpeechRecognition, and datetime.

2. FEATURES
➢ Tasks: A task is a personal or work-related assignment you want to track through completion. A task
  can occur once or repeat (a recurring task). A recurring task can repeat at regular intervals or
  repeat based on the date you mark the task complete. For example, you might want to send a status
  report to your manager on the last Friday of every month, and get a haircut when one month has passed
  since your last haircut. Recurring tasks are added one at a time to the task list. When you mark one
  occurrence of the task complete, the next occurrence appears in the list. Users can also create task
  requests. A task request enables the user to assign tasks to other people and to receive task
  requests from others. When someone assigns a task, that person gives up ownership of the task (unless
  the task is declined). Anyone who assigns a task can keep an updated copy in their task list and
  receive status reports for the task. Associated with task requests is a task list.
➢ Internet Applications: The Personal Assistant allows users to access, customize, and engage the
  internet to help them source information ranging from weather, directions, and schedules to stock
  performance, competitive data, and news, all using simple, conversational voice commands, e.g. for
  trip management, airline reservations, and hotel reservations. The convergence of the richness of the
  internet and the accessibility and mobility of the phone is now forming a vast new network, a Voice
  Web, where internet content can be accessed from any phone, anywhere, using the human voice. A voice
  portal can be defined as "speech-enabled access to Web based information." In other words, a voice
  portal provides telephone users with a natural-language interface to access and retrieve Web content.
  An internet browser can provide Web access from a computer but not from a telephone. A voice portal
  is a way to do that.
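The two recurrence rules described under Tasks (a fixed schedule such as the last Friday of every month, and a rule based on the date a task was marked complete) can be sketched in Python as follows. This is a minimal sketch and not part of the project's code: the function names last_friday and one_month_after are hypothetical, and "one month later" is assumed to mean the same day of the following month, clamped to that month's length.

```python
import calendar
from datetime import date


def last_friday(year: int, month: int) -> date:
    """Return the date of the last Friday in the given month."""
    last_day = calendar.monthrange(year, month)[1]
    end_of_month = date(year, month, last_day)
    # weekday(): Monday is 0 and Friday is 4; step back to the nearest Friday.
    offset = (end_of_month.weekday() - calendar.FRIDAY) % 7
    return end_of_month.replace(day=last_day - offset)


def one_month_after(completed: date) -> date:
    """Return the next due date for a task that recurs one month after completion.

    Assumption: the same day number in the next month, clamped to the
    length of that month (31 January becomes the last day of February).
    """
    year = completed.year + completed.month // 12
    month = completed.month % 12 + 1
    day = min(completed.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)
```

With these helpers, a status report scheduled for May 2024 would fall on last_friday(2024, 5), and a haircut completed on 15 December 2023 would next be due on one_month_after(date(2023, 12, 15)).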
3. SYSTEM ARCHITECTURE
➢ Take Command: This function takes the user's command as input through the microphone and returns it
  as a string.
➢ Wish Me: This function greets the user according to the time of day with "Good Morning", "Good
  Afternoon", or "Good Evening".
➢ Task Execution: This function contains the definitions of all tasks, such as sending email, reading a
  PDF, and fetching news, together with a chain of if-conditions for commands such as "open google",
  "open notepad", "search on Wikipedia", "play music", and "open command prompt".

[Figure: system architecture flow diagram. Labels: User, Voice Input, Microphone, Speech-to-Text
Conversion, Keyword, Other Application, WWW (Request, Response), Text-to-Speech Conversion, Speaker,
Voice Output.]

4. METHODOLOGY
[Figure: methodology block diagram. Labels: Voice Input, Speech Recognition Module, Python Backend, API
Calls, System Calls, Extracting Data, Text to Speech, Speech Output.]

1) Python: Python is a high-level, interpreted programming language that supports object-oriented
   programming (OOP). It is a robust, highly useful language focused on rapid application development
   (RAD). Python makes code easy to write and execute, and it can often implement the same logic in as
   little as one-fifth of the code needed in other OOP languages. Its use is not limited to one kind of
   activity: its growing popularity has carried it into complex fields such as Artificial Intelligence
   (AI), Machine Learning (ML), natural language processing, and data science. Python has libraries for
   every need of this project. For JARVIS, the libraries used include SpeechRecognition to recognize a
   voice, pyttsx3 for text-to-speech, and Selenium for web automation. Python is reasonably efficient,
   and efficiency is usually not a problem for small examples. If your Python code is not efficient
   enough, a general procedure to improve it is to find out what is taking most of the time and
   implement just that part more efficiently in a lower-level language. This results in much less
   programming and more efficient code (because you will have more time to optimize) than writing
   everything in a low-level language.
2) Quepy: Quepy is a Python framework to transform natural language questions into queries in a
   database query language. It can be easily customized to different kinds of questions in natural
   language and database queries. So, with a little coding, you can build your own system for natural
   language access to your database.
3) Pyttsx3: Pyttsx3 stands for Python Text to Speech. It is a cross-platform Python wrapper for
   text-to-speech synthesis, supporting common text-to-speech engines on Mac OS X, Windows, and Linux.
   It works with both Python 2.x and 3.x versions. Its main advantage is that it works offline.
4) NLP and Voice Recognition: Natural language processing (NLP) techniques are used to process and
   understand the voice commands the desktop voice assistant receives. This may involve tasks such as
   speech recognition, language understanding, intent recognition, and context extraction, to
   accurately interpret the user's commands.
5) SQLite: SQLite is a capable library that provides an in-process relational database for efficient
   storage of small-to-medium sized data sets. It supports most of the common features of SQL
   (Structured Query Language) with few exceptions. Best of all, most Python users do not need to
   install anything to get started working with SQLite, as the standard library in most distributions
   ships with the sqlite3 module. SQLite runs embedded in the same process as your application,
   allowing you to easily extend SQLite with your own Python code. SQLite provides quite a few hooks, a
   reasonable subset of which are implemented by the standard library database driver.

5. SOFTWARE REQUIREMENTS
➢ The operating system should be Windows 7 or higher.
➢ The kernel version should be 3.0.16 or higher.
➢ Support for other basic applications such as maps, calendar, camera, and web connection.

6. USED LIBRARIES
➢ pyttsx3: A Python library that converts text to speech.
➢ SpeechRecognition: A Python module that converts speech to text.
➢ pywhatkit: A Python library for sending a WhatsApp message at a particular time, with some
  additional features.
➢ datetime: This module provides the current date and time.
➢ wikipedia: A Python module for searching anything on Wikipedia.
➢ smtplib: A Python module implementing a Simple Mail Transfer Protocol (SMTP) client, which allows us
  to send mail through a mail server.
➢ PyPDF2: A Python module that can read, split, and merge PDF files.
➢ pyjokes: A Python library that contains lots of interesting jokes.
➢ webbrowser: Provides an interface for displaying web-based documents to users.
➢ PyAutoGUI: A Python library for automating the graphical user interface.
➢ os: Provides operating-system-related functionality.
➢ sys: Allows operating on the interpreter, as it provides access to the variables and functions that
  interact strongly with the interpreter.
➢ speedtest: A Python library that is useful for testing the internet speed. It is basically a
  command-line interface for checking the internet bandwidth. An internet speed test is usually run to
  measure the transfer speed between your device and the server you want to connect to, over the
  internet connection you are using. It usually displays the upload and download speeds as the result.
➢ Decouple: Decouple helps you to organize your settings so that you can change parameters without
  having to redeploy your app. It also makes it easy to:
    • store parameters in .env files;
    • define comprehensive default values;
    • properly convert values to the correct data type.
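The three functions described in Section 3 (Take Command, Wish Me, and Task Execution) can be sketched with the libraries listed above. This is a minimal sketch under stated assumptions, not the project's actual code: the names wish_me, take_command, speak, and choose_task and the returned task labels are hypothetical, and the hour boundaries (12 and 18) are assumed. take_command and speak need the third-party packages SpeechRecognition (with PyAudio for microphone access) and pyttsx3, so they are imported only when those functions are called.

```python
from datetime import datetime


def wish_me(now=None):
    """Return a greeting for the time of day (hour boundaries are assumed)."""
    if now is None:
        now = datetime.now()
    if now.hour < 12:
        return "Good Morning"
    if now.hour < 18:
        return "Good Afternoon"
    return "Good Evening"


def take_command():
    """Listen on the default microphone and return the recognized text.

    Returns an empty string when speech is unintelligible or the
    recognition service cannot be reached.
    """
    import speech_recognition as sr  # third-party: SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except (sr.UnknownValueError, sr.RequestError):
        return ""


def speak(text):
    """Read the given text aloud through the default speech engine."""
    import pyttsx3  # third-party: pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


def choose_task(command):
    """Map a spoken command to a task label using a chain of if-conditions."""
    command = command.lower()
    if "open google" in command:
        return "open_google"
    if "open notepad" in command:
        return "open_notepad"
    if "wikipedia" in command:
        return "search_wikipedia"
    if "play music" in command:
        return "play_music"
    if "open command prompt" in command:
        return "open_command_prompt"
    return "unknown"
```

A session would typically call speak(wish_me()) once and then repeat choose_task(take_command()) in a loop, running the handler for each returned label. The handlers themselves (for example, opening a site with the standard webbrowser module) are left out of this sketch.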
7. WORKING PRINCIPLE
The five steps in Natural Language Processing are:
1) Lexical Analysis
2) Syntactic Analysis
3) Semantic Analysis
4) Discourse Integration
5) Pragmatic Analysis

Speech Recognition Architecture:
[Figure: speech recognition architecture. Labels: Cepstral Feature Extraction, O, MFCC Features,
Gaussian Acoustic Model, Phone Likelihoods, P(O|W), HMM Lexicon, N-gram Language Model, P(W), Viterbi
Decoder.]

8. REAL LIFE APPLICATION
➢ Saves time: A desktop voice assistant works on the voice commands given to it. It can do voice
  searching and voice-activated device control, and it lets us complete a set of tasks.
➢ Conversational interaction: It makes it easier to complete a task, because it carries out the task
  automatically using the essential modules or libraries of Python, in a conversational way. When a
  user instructs it to perform a task, it feels like giving the task to a human assistant, because
  both giving the input and getting the desired output (the completed task) happen through
  conversation.
➢ Reactive nature: The desktop assistant is reactive, which means it recognizes spoken commands,
  interprets the context provided by the user, and responds in the same human-understandable language,
  English. The user therefore finds its reactions informed and smart.
➢ Multitasking: Its main application can be its multitasking ability. It can ask for instructions
  continuously, one after another, until the user says "QUIT".
➢ No trigger phrase: It asks for an instruction and listens to the response given by the user without
  needing any trigger phrase, and only then executes the task.
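The decoding stage named in the Speech Recognition Architecture of Section 7 can be illustrated with a small Viterbi decoder. In that architecture the decoder searches for the word sequence W that maximizes P(O|W) · P(W) for the observed features O. The sketch below is a deliberately simplified toy and not the recognizer used by the project: it assumes a two-state hidden Markov model ("sil" for silence, "speech") with discrete observations ("low", "mid", "high" frame energy) in place of MFCC vectors and Gaussian acoustic models, and every probability in it is a hypothetical value chosen for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely state sequence.

    Assumes a non-empty observation sequence.
    """
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    probability = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return probability, path


# Hypothetical toy model: all numbers below are illustrative assumptions.
STATES = ("sil", "speech")
START_P = {"sil": 0.6, "speech": 0.4}
TRANS_P = {
    "sil": {"sil": 0.7, "speech": 0.3},
    "speech": {"sil": 0.4, "speech": 0.6},
}
EMIT_P = {
    "sil": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "speech": {"low": 0.1, "mid": 0.3, "high": 0.6},
}

probability, path = viterbi(("low", "mid", "high"), STATES, START_P, TRANS_P, EMIT_P)
```

For the observation sequence low, mid, high, the most likely state path under these toy numbers is sil, sil, speech. A real recognizer applies the same dynamic-programming idea over phone-level states, with the HMM lexicon and the N-gram language model supplying the transition structure.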
                9. EXPECTED OUTCOME
 ➢ Password -
                   Figure 9.1- Password
➢ Alarm -
                      Figure 9.2 - Alarm
➢ Google Search -
                    Figure 9.3 – Input Google Search
                    Figure 9.4 – Output Google Search
                                            10. Conclusion
Using NLP combined with artificial intelligence, we have completed the design of an intelligent virtual
digital assistant that can manage applications, reply to user queries, carry out internet searches, converse
with the user, and control the device. Its tasks include searching Wikipedia; opening Google, YouTube,
Facebook, MS Word, MS Excel, and MS PowerPoint; sending mail; logging in to Gmail by voice; locking,
restarting, and shutting down the PC; playing music; setting alarms; and getting weather notifications. Thus,
after examining current voice assistant systems, we have come up with this proposed system, which is
environment friendly and helps us to be more organized. Many enhancements remain for future work, such as
using recent machine learning algorithms to address the needs of users with various disabilities.
                                        11. Acknowledgment
The Lord has been faithful in granting the strength, wisdom, knowledge, and courage needed throughout this
period of study. We thank the Almighty God and Father above for life and for the success of this study. We
wish to show immense appreciation to our supervisor, Mr. Pravin Kumar Pandey, who carried out his duties of
supervising this work in a passionate and expeditious manner. We are thankful and fortunate to have received
constant encouragement, support, and guidance from the HOD, Dr. Sanjeev Gangwar, and all the teaching staff
of the C.S.E. department of Veer Bahadur Singh Purvanchal University, Jaunpur, which helped us in successfully
completing the research paper and project work. Our heartfelt gratitude goes to our parents, siblings, and
loved ones for their support in diverse forms. Our sincere appreciation goes to our good friends as well as
our course mates. GOD BLESS YOU ALL.