AI-Powered Interview Assistant
A main project thesis submitted in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
by
This is to certify that the main project entitled “AI-Powered Interview Assistant” being
submitted by
in partial fulfilment of the requirements for the award of the degree of “Bachelor of Technology” in Computer Science and Engineering (AI & ML) to Jawaharlal Nehru Technological University, Kakinada, is a record of bona fide work done under my guidance and supervision during the VIII semester of the academic year 2023-2024.
The results embodied in this record have not been submitted to any other university
or institution for the award of any Degree or Diploma.
ACKNOWLEDGEMENT
We would like to express our deep sense of gratitude to our esteemed institute Gayatri
Vidya Parishad College of Engineering (Autonomous), which has provided us an
opportunity to fulfill our cherished desire.
We express our sincere thanks to our principal Dr. A. B. KOTESWARA RAO, Gayatri
Vidya Parishad College of Engineering (Autonomous) for his encouragement to us during
this project, giving us a chance to explore and learn new technologies in the form of mini
projects.
We express our deep sense of gratitude to Dr. D. UMA DEVI, Associate Professor and Associate Head of CSE with AI & ML and In-charge Head of the Department of Computer Science and Engineering, Gayatri Vidya Parishad College of Engineering (Autonomous), for giving us an opportunity to do the project in college.
We express our profound gratitude and our deep indebtedness to our guide Dr. R. Seeta
Sireesha, Associate Professor, Department of Computer Science and Engineering,
whose valuable suggestions, guidance and comprehensive assessments helped us a lot in
realizing our project.
We also thank our coordinators, Dr. CH. SITA KUMARI, Associate Professor, Department of Computer Science and Engineering, Ms. K. SWATHI, Assistant Professor, Department of Computer Science and Engineering, and Ms. P. POOJA RATNAM, Assistant Professor, Department of Computer Science and Engineering, for their kind suggestions and guidance toward the successful completion of our project work.
ABSTRACT
In the rapidly evolving educational landscape, technical students face the daunting task of balancing coursework, internships, and part-time jobs. The challenge intensifies as graduation
nears, with insufficient interview preparation fostering anxiety and eroding confidence. The
rigidity of schedules and fixed interview timelines exacerbates the issue, impeding students'
ability to translate theoretical knowledge into practical skills demanded by the competitive
tech industry. Existing skill development models often fall short, lacking personalized one-
on-one interview experiences and struggling to accurately assess performance and areas for
improvement. To address these shortcomings, we propose a novel approach leveraging the OpenAI API and LangChain. This solution tailors interview questions to individual resumes, enhancing the learning experience. Streamlit facilitates seamless interaction, while integration with OpenAI enriches the sophistication of the simulation. The result is a comprehensive solution that represents a paradigm shift in interview preparation, empowering students with the confidence and skills needed to excel in job interviews amid rapid technological advancements.
INDEX
CHAPTER 1. INTRODUCTION ............................................. 1
    1.1 About the Algorithm ......................................... 2
    1.2 Purpose ..................................................... 7
    1.3 Scope ....................................................... 7
CHAPTER 2. SRS DOCUMENT ............................................. 9
    2.1 Functional Requirements ..................................... 9
    2.2 Non-Functional Requirements ................................. 9
    2.3 Minimum Hardware Requirements .............................. 10
    2.4 Minimum Software Requirements .............................. 10
CHAPTER 3. ANALYSIS ................................................ 11
CHAPTER 4. SOFTWARE DESCRIPTION .................................... 16
CHAPTER 5. PROJECT DESCRIPTION ..................................... 19
    5.1 Problem Definition ......................................... 19
    5.2 Project Overview ........................................... 19
    5.3 Module Description ......................................... 20
1. INTRODUCTION
1.1 OBJECTIVE
The primary objective of this application is to revolutionize the interview preparation
process, offering a holistic solution that combines technological innovation with strategic
foresight. Through the seamless integration of Streamlit, LangChain, and OpenAI, it
endeavors to provide individuals with a tailored and immersive experience that transcends
traditional methods of preparation[2].
Firstly, the application aims to streamline the preparation journey by leveraging
Streamlit's interactive interface, ensuring user-friendly navigation and engagement[6].
Additionally, through LangChain's text processing capabilities, it seeks to automate the
extraction of relevant information from resumes, facilitating the generation of personalized
interview questions that align with the candidate's skills and experiences.
Moreover, the application aspires to enhance candidates' communication proficiency through the integration of audio recording functionality. By leveraging speech-to-text conversion, it enables individuals to articulate their responses verbally, fostering a more dynamic and immersive preparation experience[3]. Through iterative analysis and feedback mechanisms, the application empowers candidates to refine their responses and elevate their interview performance.
Overall, the objective of this application is to empower individuals with the tools and
insights necessary to confidently navigate the interview process and secure their desired
career opportunities. By harnessing the combined capabilities of Streamlit, LangChain, and
OpenAI, it aims to redefine the paradigm of interview preparation, setting a new standard
for efficiency, effectiveness, and empowerment[2].
Document loaders:
Document loaders act as the primary entry point for bringing data into our system. They
provide the initial step in the data ingestion process, facilitating the seamless integration of
textual content from various sources.
Text Loader:
The Text Loader component serves as a foundational element in our system, responsible for
sourcing textual documents from various data repositories. By seamlessly interfacing with
diverse sources including local files and cloud-based storage solutions, Text Loader ensures
the reliable acquisition of data essential for subsequent processing and analysis[1].
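As a rough illustration, the snippet below sketches how a local file might be loaded with LangChain's TextLoader; it is a minimal sketch assuming the classic LangChain import paths used in the Chapter 7 sample code, and "resume.txt" is an illustrative file name.

    from langchain.document_loaders import TextLoader

    loader = TextLoader("resume.txt", encoding="utf-8")
    documents = loader.load()              # returns a list of Document objects
    print(documents[0].page_content[:200]) # preview the first 200 characters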
Text Splitters:
Text Splitter efficiently breaks down large documents into manageable chunks, enhancing processing efficiency and enabling targeted analysis.
• Coherent Chunking: Utilizes advanced algorithms to ensure that text chunks maintain coherence and relevance, preserving the contextual integrity of the original document.
• Optimized Processing: By segmenting text into smaller units, Text Splitter optimizes subsequent retrieval and analysis processes, facilitating faster and more accurate information extraction.
Character Text Splitter:
At the core of our data preprocessing pipeline, the Character Text Splitter module plays a
pivotal role in segmenting large textual documents into manageable fragments. Utilizing
sophisticated character-based splitting algorithms, this component optimizes data processing
efficiency and enhances retrieval performance by isolating relevant sections of text.
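A minimal sketch of character-based splitting follows, assuming the classic LangChain CharacterTextSplitter API; the chunk sizes and stand-in text are illustrative, not the project's actual settings.

    from langchain.text_splitter import CharacterTextSplitter

    # Stand-in text; in the application this would be the extracted resume content.
    long_document_text = "Education: B.Tech CSE\nSkills: Python, SQL\nProjects: ...\n" * 40

    splitter = CharacterTextSplitter(
        separator="\n",      # prefer splitting at newlines
        chunk_size=1000,     # target characters per chunk
        chunk_overlap=100,   # overlap preserves context across chunk boundaries
    )
    chunks = splitter.split_text(long_document_text)
    print(len(chunks), "chunks")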
Vector Database:
In the ever-evolving landscape of artificial intelligence, vector databases stand as pivotal
solutions, indexing and storing vector embeddings to enable swift retrieval and similarity
searches. As we navigate through the AI revolution, these databases emerge as
indispensable tools, addressing the escalating complexity and scale of modern data
processing. By harnessing the semantic richness embedded within vector representations,
they empower applications reliant on large language models and generative AI, facilitating
efficient knowledge retrieval and long-term memory maintenance.
Through seamless integration with embedding models, these databases augment AI
capabilities, facilitating tasks such as semantic information retrieval with unparalleled
efficiency. Thus, they play a
pivotal role in enhancing the effectiveness of AI-driven applications, embodying the
synergy between advanced data management and transformative AI innovation.
Fig-1.3 Vector Database
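The sketch below illustrates, with toy 3-dimensional vectors, the cosine-similarity comparison that underlies such similarity searches; real embedding models return vectors with hundreds or thousands of dimensions, and the chunk labels are illustrative.

    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        # Cosine similarity: 1.0 means identical direction, 0.0 means unrelated.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    query_vec = np.array([0.9, 0.1, 0.2])   # toy embedding of a query
    stored = {
        "chunk about Python projects": np.array([0.8, 0.2, 0.1]),
        "chunk about hobbies": np.array([0.1, 0.9, 0.3]),
    }
    # Rank stored chunks by similarity to the query vector.
    best = max(stored, key=lambda k: cosine_similarity(query_vec, stored[k]))
    print(best)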
FAISS:
FAISS (Facebook AI Similarity Search) is a cutting-edge library designed for efficient
similarity search and clustering of high-dimensional vector data. Developed by Facebook AI
Research, FAISS offers optimized algorithms tailored for large-scale datasets encountered
in AI applications. Its advanced indexing techniques, such as Product Quantization (PQ) and
Hierarchical Navigable Small World (HNSW), ensure rapid and accurate nearest neighbor
search operations.
FAISS supports essential functionalities such as CRUD operations and metadata filtering, simplifying data management. Additionally, FAISS enables horizontal scaling, distributing index structures across multiple machines for enhanced performance and scalability. As a cornerstone technology, FAISS empowers AI systems with swift and precise retrieval of semantic information.
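A minimal sketch of building and querying a FAISS store through LangChain follows, mirroring the FAISS.from_texts usage in the Chapter 7 sample code; it assumes an OPENAI_API_KEY is set in the environment, and the example texts are illustrative.

    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores import FAISS

    # Illustrative resume chunks; a real run embeds the extracted PDF text.
    texts = ["Worked on a Django web application", "Built a CNN image classifier"]
    store = FAISS.from_texts(texts, embedding=OpenAIEmbeddings())

    # Nearest-neighbour search over the embedded chunks.
    docs = store.similarity_search("deep learning experience", k=1)
    print(docs[0].page_content)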
Retrieval:
Retrieval mechanisms orchestrate the process of fetching relevant data based on user
queries, bridging the gap between raw data and actionable insights. The
RetrievalQAWithSourcesChain leverages sophisticated algorithms to identify and retrieve
pertinent information, taking into account multiple data sources and query types. By
employing techniques such as semantic search and ensemble retrieval, it enhances the
precision and comprehensiveness of search results, empowering users with actionable
knowledge.
RetrievalQAWithSourcesChain:
The RetrievalQAWithSourcesChain module represents the pinnacle of our system's retrieval
capabilities. Incorporating advanced algorithms, this component enables users to pose
complex queries and retrieve relevant documents with exceptional efficiency. By integrating
multiple data sources and leveraging semantic understanding,
RetrievalQAWithSourcesChain empowers users to extract actionable insights from vast
repositories of textual data with unparalleled accuracy and speed.
Fig-1.5 Retrieval
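The sketch below shows one plausible wiring of a vector store into RetrievalQAWithSourcesChain, assuming the classic LangChain API and an OpenAI key in the environment; the chunks, source labels, and question are illustrative.

    from langchain.chains import RetrievalQAWithSourcesChain
    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.llms import OpenAI
    from langchain.vectorstores import FAISS

    # Illustrative chunks; the "source" metadata is required by this chain type.
    store = FAISS.from_texts(
        ["Built a REST API in Flask", "Led a data-visualization project"],
        embedding=OpenAIEmbeddings(),
        metadatas=[{"source": "resume-p1"}, {"source": "resume-p2"}],
    )
    chain = RetrievalQAWithSourcesChain.from_chain_type(
        llm=OpenAI(temperature=0),
        chain_type="stuff",              # stuff retrieved chunks into one prompt
        retriever=store.as_retriever(),
    )
    result = chain({"question": "What frameworks has the candidate used?"})
    print(result["answer"], result["sources"])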
Streamlit UI:
The Streamlit UI component serves as the user-facing interface of our system, providing
intuitive access to its functionalities. Designed for simplicity and ease of use, Streamlit UI
enables users to explore, query, and visualize data effortlessly. By offering a seamless and
interactive experience, the UI enhances user engagement and ensures efficient utilization of
our system's capabilities across diverse applications and use cases.
Built upon Streamlit's framework, the UI offers a user-friendly experience, enabling
effortless access to various functionalities and insights. Concurrently, project coding
encompasses the implementation of underlying algorithms and logic, ensuring the
robustness and functionality of our system. Through meticulous coding practices and
adherence to best practices, we uphold the integrity and reliability of our solution[6].
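As a rough sketch of this kind of interface, the snippet below uses a few core Streamlit widgets that also appear in the Chapter 7 sample code; the widget labels are illustrative rather than the project's exact UI.

    import streamlit as st

    st.title("AI Powered Interview Assistant")
    resume = st.file_uploader("Upload your resume", type="pdf")
    if resume is not None:
        st.write(f"Received {resume.name}")
        if st.button("Generate questions"):
            st.write("Generated questions would appear here...")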
1.3 PURPOSE
The purpose of the provided code and application is to streamline and enhance the interview
preparation process for job seekers. By leveraging advanced technologies such as Streamlit,
LangChain, and OpenAI, the application offers a sophisticated platform for generating
personalized technical interview questions based on the content of uploaded resumes[2].
Through seamless integration with document loaders and text splitters, the
application efficiently extracts relevant information from resumes, ensuring that generated
questions are tailored to each candidate's unique skills and experiences.
Additionally, the incorporation of audio recording functionality allows candidates to
verbally respond to interview questions, fostering dynamic and immersive preparation
sessions[3]. The application's objective is to empower job seekers with the tools and
resources needed to confidently navigate the interview process and secure their desired
career opportunities.
Overall, the code and application aim to revolutionize interview preparation by
providing a user-friendly interface, intelligent question generation capabilities, and
interactive features for audio-based responses[4].
By combining cutting-edge technologies with a focus on user-centric design, the
application strives to enhance the efficiency, effectiveness, and confidence of job seekers as
they prepare for interviews. With its comprehensive approach and innovative features, the
application sets out to redefine the standard for interview preparation in the modern job
market.
At its core, the application seeks to empower individuals with a strategic advantage
in their career pursuits. Through intelligent question generation and personalized feedback
mechanisms, it fosters a deeper understanding of one's strengths and areas for improvement,
enabling candidates to showcase their capabilities with confidence and precision during
interviews.
1.4 SCOPE
The scope of our project encompasses the development of a comprehensive platform
tailored to streamline the interview process through the integration of advanced AI
technologies. By leveraging natural language processing and machine learning algorithms,
our application aims to analyze candidate resumes, generate personalized technical
questions, and facilitate efficient evaluation of their skills and experiences. With a focus on
enhancing both the candidate and recruiter experience, our platform seeks to revolutionize
traditional hiring practices by providing a data-driven approach to talent assessment and
selection. Here are some potential areas of focus; a compact sketch of how these pieces fit together follows the list:
• Document Loaders: Retrieve documents from diverse sources, including private S3 buckets and public websites, with integrations for major providers such as AirByte and Unstructured[1].
• Text Splitters: Segment large documents into manageable chunks using specialized algorithms for different document types such as code, Markdown, etc.
• Embedding Models: Generate embeddings that capture the semantic meaning of text, with over 25 integrations spanning open-source and proprietary API models.
• Vector Stores: Facilitate efficient storage and retrieval of embeddings, with integrations for over 50 vector stores ranging from open-source local options to cloud-hosted proprietary solutions.
• Retrievers: Retrieve relevant documents from the database using various algorithms, from basic semantic search to advanced techniques like Parent Document Retriever, Self Query Retriever, and Ensemble Retriever.
• Indexing: Sync data from any source into a vector store for efficient retrieval, preventing duplication and re-computation of unchanged content, minimizing resource utilization while improving search results.
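The sketch below ties these focus areas into one compact pipeline under the classic LangChain API used elsewhere in this report; the file name, chunk sizes, and query are illustrative.

    from langchain.document_loaders import TextLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores import FAISS

    docs = TextLoader("resume.txt").load()                     # 1. load documents
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_documents(docs)                    # 2. split into chunks
    store = FAISS.from_documents(chunks, OpenAIEmbeddings())   # 3. embed and store
    retriever = store.as_retriever(search_kwargs={"k": 3})     # 4. configure retrieval
    relevant = retriever.get_relevant_documents("machine learning projects")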
2. SRS DOCUMENT
A software requirements specification (SRS) is a document that describes what the software
will do and how it will be expected to perform.
• Operability: The interface of the system will be consistent.
• Usability: The user interface must be intuitive and provide clear instructions, accommodating candidates of diverse technical backgrounds throughout the interview process.
• Efficiency: Once users have learned the system through interaction, they can perform their tasks easily.
• Understandability: The user-friendly interface makes the system easy for users to understand.
3. ANALYSIS
responses to these questions, fostering immersive preparation sessions. With intuitive user
interface design powered by Streamlit, our application aims to elevate interview readiness to
new heights, empowering candidates with confidence and proficiency.
For training our interview question generation model, we employ a combination of advanced techniques (a prompt-level sketch follows the list):
• Contextual Analysis: Utilizing the OpenAI API through LangChain, we capture nuanced patterns within resume content to generate contextually relevant questions[2].
• Semantic Understanding: Leveraging LangChain's language processing
capabilities, we assess the semantic relevance of questions generated from resume
data.
• Fluency Optimization: Fine-tuning OpenAI's GPT models ensures the fluency and
coherence of interview questions, enhancing their natural language generation
capabilities.
• Personalization Strategies: Implementing LangChain's adaptive algorithms, we
tailor question generation based on individual user feedback and preferences[5].
• Interactive Learning: Integrating user interaction mechanisms, we employ ensemble learning approaches to refine question generation processes, incorporating user input for continual enhancement.
• Iterative Improvement: Through iterative training and model refinement using LangChain's resources, we continuously optimize the question generation process, ensuring the highest quality output[5].
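As a prompt-level illustration of contextual, personalized question generation, the sketch below uses a simple LLMChain; the prompt template is an assumption for illustration, not the project's exact prompt, and an OPENAI_API_KEY is assumed in the environment.

    from langchain.chains import LLMChain
    from langchain.llms import OpenAI
    from langchain.prompts import PromptTemplate

    # Hypothetical prompt template standing in for the project's actual prompt.
    template = PromptTemplate(
        input_variables=["resume_text", "level", "n"],
        template=(
            "Given the following resume content:\n{resume_text}\n"
            "Generate {n} {level}-level technical interview questions "
            "grounded in the candidate's skills and projects."
        ),
    )
    chain = LLMChain(llm=OpenAI(temperature=0.6), prompt=template)
    questions = chain.run(resume_text="Skills: Python, ML ...", level="medium", n=5)
    print(questions)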
The proposed system offers several advantages:
• It is very time-saving
• Dynamic question generation
• Accurate results
• Automated resume parsing
• User-friendly graphical interface
• Highly reliable
• Cost-effective
The feasibility study considers the proposal along the following dimensions:
• Operational: Defines the urgency of the problem and the acceptability of any solution. It includes people-oriented and social issues: internal issues such as manpower problems, labor objections, manager resistance, organizational conflicts, and policies, as well as external issues including social acceptability, legal aspects, and government regulations.
• Technical: Is the proposal within the limits of current technology? Does the technology exist at all? Is it available within the given resources?
• Economic: Is the project possible given resource constraints? Are the benefits that will accrue from the new system worth the costs? What are the savings, tangible and intangible, that will result from the system? What are the development and operational costs?
• Schedule: What constraints exist on the project schedule, and can they reasonably be met?
ECONOMIC FEASIBILITY:
Economic analysis, also referred to as cost/benefit analysis, is the most frequently used method for evaluating the effectiveness of a new system. In economic analysis, the procedure is to determine the benefits and savings that are expected from a candidate system and compare them with the costs. The economic feasibility study considers the price and all kinds of expenditure related to the scheme before the project starts, which also improves project reliability. It helps decision-makers decide whether the planned scheme should proceed now or later, depending on the financial condition of the organization, and evaluates the cost benefits of the proposed scheme. Economic feasibility also considers:
• The cost of packaged software.
• The cost of a full system study.
• Whether the system is cost-effective.
TECHNICAL FEASIBILITY:
A large part of determining resources has to do with assessing technical feasibility, which considers the technical requirements of the proposed project. The technical requirements are then compared to the technical capability of the organization. The systems project is considered technically feasible if the internal technical capability is sufficient to support the project requirements. The analyst must find out whether current technical resources can be upgraded or added to in a manner that fulfills the request under consideration. This is where the expertise of system analysts is beneficial, since using their own experience and their contact with vendors they will be able to answer the question of technical feasibility.
OPERATIONAL FEASIBILITY:
Operational feasibility is a measure of how well a proposed system solves the problems and takes advantage of the opportunities identified during scope definition, and how it satisfies the requirements identified in the requirements analysis phase of system development. Operational feasibility refers to the availability of the operational resources needed to extend research results beyond the setting in which they were developed, where all the operational requirements are minimal and easily accommodated. In addition, operational feasibility includes any rational compromises users make in adjusting the technology to the limited operational resources available to them. Operational feasibility also addresses questions such as:
• Does the current mode of operation provide adequate response time?
• Does the current mode of operation make maximum use of resources?
• Is the solution suggested by the software development team acceptable?
• Does the operation offer an effective way to control the data?
• Our project operates with a processor, and the packages installed are supported by the system.
4. SOFTWARE DESCRIPTION
4.2 LangChain
LangChain is an open-source framework for developing applications powered by large language models (LLMs). It provides modular abstractions (document loaders, text splitters, embedding models, vector stores, retrievers, chains, and memory) that can be composed into end-to-end pipelines connecting LLMs to external data sources. By standardizing how prompts, models, and data interact, LangChain makes it straightforward to build context-aware applications such as question answering over documents, chatbots, and summarization tools. In this project, LangChain orchestrates resume ingestion, chunking, embedding, retrieval, and question generation, tying the OpenAI models to the candidate's own data[5].
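A minimal sketch of the LangChain pattern this project relies on, a question-answering chain run over documents (Chapter 7 shows the full usage); it assumes an OPENAI_API_KEY in the environment, and the document content is illustrative.

    from langchain.llms import OpenAI
    from langchain.chains.question_answering import load_qa_chain
    from langchain.schema import Document

    # An illustrative document; in this project the documents come from a
    # similarity search over the embedded resume.
    docs = [Document(page_content="Skills: Python, SQL, TensorFlow")]
    chain = load_qa_chain(llm=OpenAI(temperature=0.6), chain_type="stuff")
    answer = chain.run(input_documents=docs, question="List the candidate's skills.")
    print(answer)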
4.3 Python
Python is an interpreted, object-oriented, high-level programming language with dynamic
semantics. Its high-level built-in data structures, combined with dynamic typing and
dynamic binding, make it very attractive for Rapid Application Development, as well as for
use as a scripting or glue language to connect existing components together. Python's
simple, easy to learn syntax emphasizes readability and therefore reduces the cost of
program maintenance. Python supports modules and packages, which encourages program
modularity and code reuse. The Python interpreter and the extensive standard library are
available in source or binary form without charge for all major platforms, and can be freely
distributed.
4.4 OpenAI
OpenAI stands as a pioneering research organization at the forefront of artificial intelligence
development, dedicated to advancing the boundaries of AI technology and its accessibility.
Renowned for its groundbreaking research and innovative solutions, OpenAI strives to
democratize AI through its APIs, tools, and research findings, empowering developers,
businesses, and researchers worldwide. With a focus on responsible AI deployment, OpenAI
fosters collaborations, conducts cutting-edge research, and promotes ethical AI practices. Its
contributions span various domains, from natural language processing and computer vision
to reinforcement learning and robotics. Through its commitment to transparency and
collaboration, OpenAI continues to shape the future of AI, driving impactful advancements
that benefit society as a whole[2].
4.5 PyCharm
PyCharm stands as a premier integrated development environment (IDE) meticulously
crafted for Python programming, renowned for its robust features and user-friendly
interface. Developed by JetBrains, PyCharm offers a comprehensive suite of tools designed
to enhance the productivity and efficiency of Python developers. Its intelligent code
completion, advanced debugging capabilities, and seamless integration with version control
systems streamline the development workflow. PyCharm provides support for various
Python frameworks and libraries, facilitating the creation of diverse applications ranging
from web development to data analysis and machine learning. With its extensive plugin
ecosystem and customizable settings, PyCharm caters to the unique needs of developers,
enabling them to build high-quality software with ease. Whether working on personal
projects or large-scale enterprise applications, PyCharm remains a preferred choice for
Python developers seeking a feature-rich and intuitive development environment.
4.6 Streamlit
Streamlit is a Python library that simplifies the creation of interactive web applications for
data science and machine learning projects. It offers a straightforward and intuitive way to
build user-friendly interfaces without the need for extensive web development experience.
With Streamlit, developers can seamlessly integrate data visualizations, input widgets, and
text elements to create dynamic applications that enable users to explore and interact with
data in real-time. Its declarative syntax and automatic widget rendering make prototyping
and deploying applications quick and efficient. Streamlit's seamless integration with popular
data science libraries like Pandas, Matplotlib, and TensorFlow further enhances its
capabilities, allowing developers to leverage their existing knowledge and tools. Overall,
Streamlit empowers data scientists and machine learning engineers to share insights,
prototypes, and models with stakeholders effectively, accelerating the development and
deployment of data-driven applications[6].
5. PROJECT DESCRIPTION
Fig-5.1 Overview
The steps involved in the project are:
The output of our project is a user-friendly interface where technical students can upload their resumes; the final result is a score between 0 and 100 for each question, along with areas for improvement.
Python Streamlit is a powerful open-source framework that allows developers to create
interactive web applications with ease. With its simple and intuitive syntax, developers can
quickly build data-driven applications using Python code. Streamlit provides various
components and widgets for creating interactive elements such as buttons, sliders, and
charts, making it ideal for building user-friendly interfaces.
5.3 Module Description
answering questions sequentially. Each question triggers the initiation of audio recording
upon user interaction with a designated button, leveraging the device's microphone.
Recorded audio is subsequently transcribed into text format using the Google Speech
Recognition API[3]. Captured responses are stored in session state variables for subsequent
analysis. Upon completion of all questions, the application proceeds to analyze user
responses. A formulated query facilitates scrutiny of questions and corresponding answers,
yielding scores and areas for improvement for each question-answer pair. LangChain's
question-answering capabilities process the query, presenting findings to the user via the
application interface.
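The sketch below shows one plausible shape for this record-then-transcribe step; the record_audio helper is an assumption about how the sample code (its definition is not reproduced in Chapter 7) captures microphone input with sounddevice and soundfile.

    import tempfile
    import sounddevice as sd
    import soundfile as sf
    import speech_recognition as sr

    def record_audio(duration: int, fs: int) -> str:
        """Record `duration` seconds of mono audio and return a WAV file path."""
        frames = sd.rec(int(duration * fs), samplerate=fs, channels=1)
        sd.wait()                                  # block until recording finishes
        tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
        sf.write(tmp.name, frames, fs)
        return tmp.name

    def transcribe(path: str) -> str:
        r = sr.Recognizer()
        with sr.AudioFile(path) as source:
            audio = r.record(source)
        return r.recognize_google(audio)           # Google Web Speech API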
Fig 5.3 - Implementations
workflow.
• The RecursiveCharacterTextSplitter() class initializes a text splitter object designed
to break down text into smaller, manageable chunks. This functionality aids in
processing large volumes of text efficiently, particularly when dealing with lengthy
documents such as PDF files.
• Furthermore, the OpenAIEmbeddings() class initializes an embeddings object used
for handling text embeddings, which are essential for various natural language
processing tasks such as semantic similarity analysis.
• The FAISS.from_texts() method is employed to create a FAISS vector store from
the text data extracted from documents. This vector store facilitates efficient
similarity searches and other vector-based operations, enhancing the performance
of the question-answering system.
• Within the Streamlit framework, several functions and widgets are utilized to create
the user interface and manage application state. Functions such as
st.file_uploader(), st.header(), st.write(), st.button(), st.title(), st.empty(), and
st.error() are employed to display text, widgets, and interactive elements on the
Streamlit app.
• Additionally, the st.session_state attribute is utilized to access and manage session state variables, enabling data persistence and user-interaction tracking within the application (see the sketch below). Overall, these functionalities contribute to the creation of a user-friendly and interactive interview preparation tool.
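A minimal sketch of the session-state pattern, assuming keys that mirror those in the Chapter 7 sample code; Streamlit re-executes the script on every interaction, while st.session_state persists across reruns.

    import streamlit as st

    # Initialise once; these keys survive Streamlit's script reruns.
    if "questions" not in st.session_state:
        st.session_state.questions = []
    if "recorded_answers" not in st.session_state:
        st.session_state.recorded_answers = {}

    if st.button("Add placeholder question"):
        st.session_state.questions.append("Tell us about your latest project.")
    st.write(f"{len(st.session_state.questions)} question(s) stored")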
6. SYSTEM DESIGN
6.2 USE CASE
The application is an AI-powered interview assistant that automates interview preparation and analysis for both recruiters and interviewees. Here's a breakdown of the use case:
For Recruiters:
For Interviewees:
• Upload Resume: The interviewee uploads their resume for interview preparation.
• Personalized Questions: The application generates relevant technical questions
based on their resume.
• Practice Interview: The interviewee can practice answering the generated questions
and record their responses.
• Self-Assessment: The interviewee can get feedback on their answers, including
scores and areas for improvement.
• PDF Text Extraction: The script utilizes a PDF parsing library to extract text from the uploaded resume, converting the PDF content into a readable format (see the sketch after this list).
• Text Chunking: The extracted text is then divided into manageable chunks or
segments, which could correspond to different sections of the resume such as
education, work experience, skills, etc. This segmentation helps in generating
relevant questions based on different aspects of the resume.
• Question Generation: Based on the segmented content of the resume, the script
generates technical questions related to the candidate's skills, experience, and
educational background. These questions are formulated programmatically to
assess the candidate's proficiency in various technical domains.
• Audio Question Playback: The generated questions are presented to the user
audibly, one by one. The user listens to each question and prepares their response.
• Audio Answer Recording: The user records their answers to the questions via audio
input. This audio data is captured and stored for further processing.
• Speech-to-Text Conversion: The recorded audio answers are then converted into
text format using speech recognition algorithms. This enables the script to analyze
and evaluate the responses effectively.
• Storage of Answer Data: The converted text answers are stored in a structured
format, associating each response with its corresponding question for later analysis.
• Answer Analysis and Scoring: Finally, the script utilizes Streamlit, a web
application framework, to present the recorded answers to the user. It analyzes each
answer, providing a score and highlighting areas of improvement based on
predefined criteria. This analysis could involve assessing the completeness,
relevance, and clarity of the responses relative to the questions asked.
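A minimal sketch of the extraction step above using PyPDF2's PdfReader, the parser imported in the Chapter 7 sample code; the file name is illustrative.

    from PyPDF2 import PdfReader

    reader = PdfReader("resume.pdf")           # illustrative file name
    text = ""
    for page in reader.pages:
        text += page.extract_text() or ""      # extract_text() may return None
    print(text[:300])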
7. IMPLEMENTATION
7.1 RAW DATA
Fig 7.4 - Chunk division (Complete
7.2 SAMPLE CODE (app.py)
import streamlit as st
import pickle
from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.callbacks import get_openai_callback
import os
import time
import sounddevice as sd
import soundfile as sf
import tempfile
import speech_recognition as sr

# Sidebar contents
with st.sidebar:
    st.title('🤗💬 AI Powered Interview Assistant')
    st.markdown('''
    ## About
    Team Members:
    - Rameez Ahmad
    - Vijay
    - Nuzhat
    - J Bhavya
    ''')

# Add a level selection dropdown
# level_options = ['Easy', 'Medium', 'Hard']
# selected_level = st.selectbox("Select question level:", level_options)

def convert_audio_to_text(audio_file):
    """Transcribe a recorded WAV file using the Google Web Speech API."""
    r = sr.Recognizer()
    try:
        with sr.AudioFile(audio_file) as source:
            audio_data = r.record(source)
        text = r.recognize_google(audio_data)
        return text
    except sr.UnknownValueError:
        # Speech was unintelligible
        return ""
    except sr.RequestError as e:
        return f"Could not request results from Google Speech Recognition service; {e}"

# Sample rate (Hz)
fs = 44100
# Default recording duration (seconds)
duration = 30

# Main function
def main():
    st.header("Personalized Interviewer 💬")
    try:
        # (Resume upload, PDF text extraction, chunking, vector-store creation,
        # and the level/job-role widgets are omitted at page breaks in the
        # original listing.)

        # Check if the extracted text is empty
        if not text.strip():
            st.error("No text found in the PDF.")
            return

        # Generate questions
        query = (f"Give {st.session_state.n} {selected_level.lower()} level "
                 f"technical questions on the skills and projects from the "
                 f"above pdf for {job_role.lower()}")
        if query:
            docs = st.session_state.vectorstore.similarity_search(query=query, k=3)
            llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0.6,
                         max_tokens=500)
            chain = load_qa_chain(llm=llm, chain_type="stuff")
            with get_openai_callback() as cb:
                response = chain.run(input_documents=docs, question=query)
            if not st.session_state.questions:
                st.session_state.questions = list(response.split('?'))[0:-1]

        st.header("Questions")
        for i, question in enumerate(st.session_state.questions):
            st.write(f"{question}")
            start_recording = st.button(f"Start Answering {i + 1}")
            if start_recording:
                st.write("Listening...")
                audio_file = record_audio(duration, fs)
                st.write("Time's up!")
                # (Transcription of the recording into `text` and the
                # initialization of curr_qns_ans are omitted at a page break.)
                curr_qns_ans["Answer"] = text
                st.session_state.recorded_answers[i] = curr_qns_ans

        query = f"""Analyze all the above questions and corresponding answers and
        give a score between 0 to 100 and also provide the areas of improvement for
        betterment of the candidate. The list of questions and answers are as follows,
        providing a review only for answered questions:
        {str(st.session_state.recorded_answers)}. Give analysis for every question and
        corresponding answer. The format of the review is '[Question number] :
        [score]/100 Areas of improvement: [suggestions to improve]'. Every question's
        response should be separated by '###'. For example:
        Question 2: Score - N/A Areas of improvement: The candidate did not provide
        an answer for this question, so no score or areas of improvement can be given
        and question number starts from 1. Please give each answer separated by ### """

        count = 0
        for i, question in enumerate(st.session_state.questions):
            if st.session_state.recorded_answers[i]["Answer"] != "Not Answered Yet":
                count += 1
        answered_all = (count == st.session_state.n)

        if query and answered_all:
            st.title("Analysis and Review")
            docs = st.session_state.vectorstore.similarity_search(query=query, k=3)
            llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0.6,
                         max_tokens=1000)
            chain = load_qa_chain(llm=llm, chain_type="stuff")
            with get_openai_callback() as cb:
                response = chain.run(input_documents=docs, question=query)
            reviews = response.split("###")
            for review in reviews:
                st.write(review)
            # st.write(response)
    except Exception as e:
        st.error(f"An error occurred: {str(e)}")
7.3 Results
The application operates by first receiving user input. It then processes this input
using algorithms based on the proposed system's design. Next, it executes the
necessary actions, generates output, and presents it to the user. Finally, it may prompt
for further interaction or loop back to receive additional input.
Fig 7.6 – Home Page
Fig 7.7 - Resume uploading, question level selection, and job role entry
Fig 7.9 - Interview Questions generation from Resume
Fig 7.13 - Analysis and suggestions after all questions have been answered
Fig 7.14 - Error and Exception handling for invalid resumes
Importance of Testing
Software testing is imperative, yet the process is often skipped, and the product and business may suffer as a result. To understand the importance of testing, here are some key points:
• Software testing saves money
• Provides security
• Improves product quality
• Customer satisfaction
Testing can be done in different ways. The main idea behind testing is to reduce errors with minimum time and effort.
Fig 8.1 – Partial Resume
Test case 2: Upon uploading a valid resume with the 'Easy' difficulty level selected and no job description provided, questions were generated based solely on the resume and the selected level.
Test case 3: When attempting to upload a file in a format other than PDF, a warning message is displayed indicating that the file type is not allowed.
Our project leverages advanced technologies like LangChain and OpenAI to revolutionize the interview preparation process[5]. By automating question generation based on resume content and enabling personalized audio responses, we offer a dynamic and efficient platform for candidates to hone their interview skills. The seamless integration of machine learning algorithms ensures objectivity, fairness, and real-time feedback, enhancing the overall interview experience.
Both benefits and drawbacks exist with our project. On the positive side, it automates
question generation and response recording, streamlining the interview preparation process.
Additionally, it provides personalized feedback and analysis, enhancing candidate
performance and confidence. However, reliance on machine learning algorithms may
introduce biases or inaccuracies in question generation, impacting the quality of interview
practice. Our system may not fully replicate the nuances of human interaction in interview
scenarios, and users should supplement their preparation with real-world practice and
feedback.
The main challenge we faced while working on this project is that the need for internet connectivity and API access may limit accessibility and usability in certain environments.
The future scope of our project is expansive, driven by our overarching objective of revolutionizing interview preparation processes. As we continue to refine our system, we aim to leverage cutting-edge technologies to enhance user experience and effectiveness. This includes exploring advanced natural language processing techniques to generate more contextually relevant and diverse interview questions[1]. Additionally, we envision integrating machine learning algorithms to provide personalized feedback and performance analytics to users. Moreover, we plan to expand the application's capabilities by incorporating features such as mock interview simulations and industry-specific question sets. These enhancements will ensure that our platform remains at the forefront of interview preparation innovation, catering to diverse user needs and preferences. These are some of the upgrades and enhancements we intend to make.
11. REFERENCE LINKS
6. Streamlit: https://docs.streamlit.io/