
Beans.toi.ai

A
Project Report
Submitted in partial fulfillment of the requirements for the award of the degree of

Bachelor of Engineering
in
Electronics & Computer Engineering

Submitted by
Purbasha Roy

Roll No. 102015150

Faculty Mentor: Dr. Geetika Dua, Assistant Professor (Contractual-II), ECED, TIET, Patiala

Industry Mentor: Mr. Siddharth Jain, Senior Engineering Manager, Bennett, Coleman & Co. Ltd., Noida

ELECTRONICS AND COMMUNICATION ENGINEERING DEPARTMENT


TIET, PATIALA-147004, PUNJAB
INDIA
May/June 2024

Certificate
Certified that project entitled Beans.toi.ai which is being submitted by Purbasha Roy (University
Registration No. 102015150) to the Department of Electronics and Communication
Engineering, TIET, Patiala, Punjab, is a record of project work carried out by her under the
guidance and supervision of Mr. Siddharth Jain. The matter presented in this project report does
not incorporate without acknowledgement any material previously published or written by any other
person except where due reference is made in the text.

Purbasha Roy
102015150

Faculty Mentor: Dr. Geetika Dua, Assistant Professor (Contractual-II), ECED, TIET, Patiala

Industry Mentor: Mr. Siddharth Jain, Senior Engineering Manager, Bennett, Coleman & Co. Ltd., Noida

Acknowledgement
I would like to express my heartfelt gratitude and extend my sincerest appreciation to
Bennett, Coleman & Co. Ltd. for providing me with the invaluable opportunity to
participate in the 6-month internship program. This internship has been an enriching and
transformative experience that has significantly contributed to my personal and professional
growth.

First and foremost, I would like to express my gratitude to the entire Bennett, Coleman &
Co. Ltd. team for their warm welcome and continuous support throughout my internship
journey. The guidance, mentorship, and encouragement I received from my supervisor and
colleagues were instrumental in shaping my understanding of the industry and enhancing my
skills.

I am deeply grateful for the trust and responsibility that Bennett, Coleman & Co. Ltd.
bestowed upon me during my internship. The hands-on experience and exposure to real-world
challenges have allowed me to apply the knowledge I gained during my academic studies,
fostering a deeper understanding of the practical aspects of my field.

I would like to extend special appreciation to my supervisor, Mr. Siddharth Jain, for his
unwavering support, patience, and willingness to share his expertise. His guidance not only
helped me navigate complex tasks but also encouraged me to think critically, solve problems,
and develop innovative solutions. His commitment to my growth and development has been
truly invaluable.

I would also like to express my gratitude to my colleagues and teammates for their
collaboration, camaraderie, and willingness to share their knowledge. The inclusive and
supportive work environment fostered by the entire team played a significant role in creating a
positive and conducive learning experience.

Furthermore, I would like to thank the HR department for their efficient organization of the
internship program, from the application process to onboarding and ongoing support. Their
professionalism and dedication ensured a smooth and seamless experience, allowing me to
fully focus on my learning and contributions.

I would like to extend my heartfelt appreciation to the entire ECED department for their
unwavering support, guidance, and mentorship throughout my internship tenure. In particular,
I am grateful to my faculty mentor, Dr. Geetika Dua, Assistant Professor, for her continuous
support and guidance. The knowledge and skills I have gained during this period have been
instrumental in shaping my professional growth. I would also like to acknowledge the
collaborative environment fostered by the entire team; the open-door policy and willingness
to answer my questions or provide assistance made me feel welcome and supported
throughout my internship.

Last but not least, I would like to thank my family and friends for their continuous support
and help during the entire tenure of my internship, and for making this experience very fruitful.

Abstract

This project report documents innovative research and development activities undertaken during
a Bachelor of Engineering internship at Bennett Coleman & Co. Ltd. (BCCL). The work focused
on applying advanced AI and ML technologies to address operational challenges within the
company, leading to several key projects, methodologies, and outcomes.

The first project developed a copyright infringement detection model using Copyscape and
TinEye APIs to scan for duplicated content online, ensuring compliance with intellectual
property laws. This demonstrated AI's role in automating copyright protection.

Another project was an AI-powered research assistant model, integrating Bing and Google
Custom Search APIs to conduct internet searches, compile URLs, and provide summarized
content. This optimized research workflows, enabling quick access to relevant information.

Additionally, an AI-driven data analysis model was created to analyze data from CSV or Excel
files, detect trends, and generate visualizations using OpenAI's GPT-4. This facilitated data
interpretation and decision-making for the editorial team.

The report also covers the development of a multilingual speech-to-text AI model for
transcribing and translating reporter speech into text across multiple languages using OpenAI's
Whisper and GPT-3.5 models. This improved reporting efficiency and transcription accuracy.

Further projects included enhancing live speech-to-text transcription systems and developing an
AI content generation model to assist content creators by producing high-quality content based
on user-provided topics.

In summary, the report highlights the successful integration of AI and ML technologies at
BCCL, demonstrating their potential to optimize workflows, enhance content quality, and drive
innovation in the media industry, thereby improving efficiency and competitiveness in the digital
landscape.

List of Figures

Figure No. Description Page No.

Figure 1.1 Bennett, Coleman & Co. Ltd. Logo 12
Figure 2.1 Sample Text for the Profanity Check 22
Figure 2.2 Results of the Copyscape Internet Search for Potential Content Matches 23
Figure 2.3 Proof of result - Screenshot of an article from Fortune India discussing Nykaa's business performance during the pandemic 23
Figure 3.1 Search Query of Google Custom Search in the Backend 29
Figure 3.2 Result of Google Custom Search – Displaying Title, URL, and Snippet 29
Figure 3.3 Code for searching the query using Bing API 30
Figure 4.1 Code Overview of Assistant creation, instruction prompt and tool selection 35
Figure 4.2 Handling Graphs and the Explanation 38
Figure 4.3 Giving the Meta Attributes of the input file 39
Figure 4.4 Summary of the Input File 40
Figure 4.5 Graph generated by the model showing the relationship between Price in INR and Storage Size 41
Figure 4.6 Graph created by the model depicting the relationship between Brand and Average Rating 41
Figure 4.7 Response to the user's query 42
Figure 5.1 Code overview of the assistant's instructions and document storage in vector space 46
Figure 5.2 Summary of the uploaded document 47
Figure 5.3 Model accurately answered the user query 48
Figure 6.1 Transcribed text for the given mp3 file as input 54
Figure 6.2 Translated and Paraphrased Text 55
Figure 7.1 Code of Real Time STT 59
Figure 7.2 Live Speech to Text 60
Figure 8.1 Code Overview: Handling Prompt Reception and Transmission to the Model 63
Figure 8.2 User Query and Response Utilizing the Anthropic Model 64
List of Tables

Table No. Description Page No.


Table 4.1 Test dataset containing 3,120 rows 38

Acronyms and Abbreviations

BCCL Bennett, Coleman & Co. Ltd.

URL Uniform Resource Locator


NLP Natural Language Processing

API Application Programming Interface

CSV Comma Separated Values

NER Named Entity Recognition

BERT Bidirectional Encoder Representations from Transformers

HTTP Hypertext Transfer Protocol

XML Extensible Markup Language

STT Speech to Text

Table of Contents
Page No.
Certificate 2
Acknowledgement 3
Abstract 5
List of figures 6
List of tables 8
Acronyms and abbreviations 9
Chapter-1 Introduction
1.1. About the Company 12
1.2. About the Department 14
1.3. List of Projects 16
Chapter-2 Development of a Copyright Infringement Detection Model
2.1 Objective 19
2.2 Introduction 19
2.3 Prior Work 20
2.4 Methodology 20
2.5 Results 22
2.6 Conclusion 24

Chapter-3 Designing an AI-Powered Research Assistant Model


3.1 Objective 25
3.2 Introduction 25
3.3 Prior Work 26
3.4 Methodology 26
3.5 Results 29
3.6 Conclusion 30

Chapter-4 Implementation of an AI-Driven Data Analysis Model


4.1 Objective 32
4.2 Introduction 32
4.3 Prior Work 33
4.4 Methodology 34
4.5 Results 38

4.6 Conclusion 42

Chapter-5 Creation of an AI Document Analyzer


5.1 Objective 43
5.2 Introduction 43
5.3 Prior Work 44
5.4 Methodology 45
5.5 Results 47
5.6 Conclusion 48

Chapter-6 Design and Development of a Multilingual Speech-to-Text AI Model


6.1 Objective 49
6.2 Introduction 49
6.3 Prior Work 50
6.4 Methodology 51
6.5 Results 54
6.6 Conclusion 55

Chapter-7 Enhancement of Live Speech-to-Text


7.1 Objective 57
7.2 Introduction 57
7.3 Prior Work 57
7.4 Methodology 58
7.5 Results 60
7.6 Conclusion 60

Chapter-8 Development of an AI Content Generation Model


8.1 Objective 61
8.2 Introduction 61
8.3 Methodology 61
8.4 Results 64
8.5 Conclusion 64

Chapter-9 Future Work 65

References 69
Reflective Diary 70

Chapter-1
Introduction

1.1. About the Company

Figure 1.1 Bennett, Coleman & Co. Ltd. Logo

“Let Truth Prevail” - Bennett, Coleman & Co. Ltd.

Bennett, Coleman & Co. Ltd. (BCCL), also known as The Times Group, is one of India's
largest and most influential media conglomerates. Established in 1838, BCCL has a rich
history of providing quality news and information to the Indian public. The company owns
and operates a wide array of media outlets, including newspapers, magazines, television
channels, radio stations, and digital platforms.

BCCL's flagship publication, The Times of India, is the world's largest-selling English-
language daily newspaper. Its dominance in the print media sector is not limited to The
Times of India. The company publishes several other influential newspapers catering to
different demographics and linguistic groups. The Economic Times, for instance, is one of the
most respected business dailies in India, offering in-depth analysis and coverage of financial
markets, industry trends, and economic policies. Maharashtra Times and Navbharat Times
serve the regional and Hindi-speaking audiences, respectively, ensuring that BCCL's reach
extends across diverse linguistic and cultural segments.

Recognizing the growing influence of television, BCCL expanded into the broadcast media
sector with significant success. Times Now, the company's premier English news channel, has
garnered a substantial viewership for its incisive news reporting and analysis. ET Now
focuses on business and economic news, providing viewers with expert insights into the
financial world. Additionally, Zoom, BCCL's entertainment channel, caters to the youth
demographic with a mix of music, Bollywood news, and celebrity gossip.

BCCL's foray into radio broadcasting has also been marked by notable achievements. Radio
Mirchi, the company's radio network, operates numerous stations across India. Known for its
catchy tagline "It's Hot!", Radio Mirchi has become a leading name in the FM radio space,
offering a mix of music, talk shows, and entertainment content. The network's popularity is a
testament to BCCL's ability to adapt to changing media consumption habits and deliver
content that resonates with a broad audience.

In the digital age, BCCL has successfully transitioned to online platforms, ensuring its
relevance in a rapidly evolving media landscape. TimesofIndia.com and Indiatimes.com are
among the most visited news websites in India, providing real-time updates, multimedia
content, and interactive features. The company's digital strategy emphasizes innovation and
user engagement, leveraging social media and mobile applications to reach a tech-savvy
audience.

BCCL's slogan, "Let Truth Prevail," encapsulates the company's commitment to journalistic
integrity and ethical reporting. This guiding principle has been the cornerstone of BCCL's
operations, driving its mission to deliver accurate, unbiased, and comprehensive news
coverage. In an era of misinformation and sensationalism, BCCL's adherence to truth and
objectivity sets it apart as a trusted source of information.

Beyond its commercial ventures, BCCL is also actively involved in various social and
community initiatives. The Times Foundation, the company's philanthropic arm, undertakes
numerous projects aimed at education, healthcare, disaster relief, and environmental
conservation. These initiatives reflect BCCL's dedication to contributing positively to society
and making a meaningful impact beyond the realm of media.

The headquarters of Bennett, Coleman & Co. Ltd. in Mumbai serves as the nerve center of its
extensive media operations, driving its mission to deliver quality journalism and content.
Through strategic international offices, digital platforms, content syndication, and global
collaborations, BCCL has successfully extended its reach beyond India, making a significant
impact on the global media industry. The company's commitment to innovation, integrity, and
excellence continues to guide its growth and influence worldwide.

1.2. About the Department

The IT Department at Bennett, Coleman & Co. Ltd. (BCCL), also known as The Times
Group, is a vital component that supports and drives the company's extensive media
operations. This department is responsible for managing the technological infrastructure,
ensuring seamless operations, and implementing innovative solutions to keep BCCL at the
forefront of the media industry.

The IT Department at BCCL is organized into several key divisions, each focusing on
different aspects of technology and information systems:

1. IT Infrastructure Management
Infrastructure Management: This team is responsible for maintaining and managing
BCCL's IT infrastructure, including servers, networks, data centers, and cloud services. They
ensure that all systems are running efficiently and securely, providing a robust backbone for
the company's operations.
Network Administration: Network administrators manage the company's internal and
external networks, ensuring reliable connectivity, managing bandwidth, and securing network
communications.

2. Technical Support and Help Desk


Technical Support: The technical support team provides assistance to employees across all
departments, addressing hardware and software issues to minimize downtime. They handle
troubleshooting, repairs, and support requests to ensure that employees can perform their
tasks without technical hindrances.
Help Desk: The help desk serves as the first point of contact for IT-related queries and issues,
offering quick resolutions and escalating more complex problems to specialized teams as
needed.

3. Cybersecurity
Security Operations: The cybersecurity team protects BCCL's digital assets and IT
infrastructure from cyber threats. They implement security measures such as firewalls,
intrusion detection systems, and encryption to safeguard sensitive data.
Incident Response: This team handles security incidents, ensuring quick and effective
responses to mitigate risks. They conduct regular security audits and stay updated with the
latest security trends to prevent breaches.

4. Software Development and Maintenance
Application Development: This division develops and maintains software applications used
across the company. This includes content management systems (CMS), and other custom
applications that streamline operations and improve productivity.
Web and Mobile Development: Dedicated teams focus on creating and maintaining BCCL’s
digital properties, such as TimesofIndia.com and mobile apps. They ensure these platforms
are user-friendly, responsive, and equipped with the latest features.

5. AI Research and Development


Emerging Technologies: The AI R&D team explores and develops new AI technologies and
applications that can be integrated into BCCL’s operations. They stay updated with the latest
advancements in AI and ML, experimenting with innovative solutions to drive the company’s
growth.
Prototype Development: This division creates prototypes and pilots AI-based solutions,
testing their feasibility and effectiveness before full-scale implementation.

6. Advertising Optimization
Ad Targeting: AI algorithms optimize ad targeting by analyzing user data and behavior,
ensuring that advertisements are relevant and effective. This improves the efficiency of
advertising campaigns and increases revenue.
Performance Analytics: AI tools monitor and analyze the performance of ad campaigns in
real-time, providing actionable insights to advertisers and marketers.

I had the opportunity to work in the AI Research and Development (R&D) Department. I was
involved in exploring emerging AI and machine learning (ML) solutions, where I researched
the latest advancements and trends in AI technology to identify potential applications for the
company. My responsibilities included developing prototypes for various AI-driven projects,
which involved coding, testing algorithms, and ensuring they met the desired objectives.

1.3. List of Projects

During my internship in the AI Research and Development (R&D) Department, I had the
opportunity to work on several innovative projects. These projects aimed to enhance the
company's editorial capabilities, streamline content creation processes, and integrate advanced
AI solutions into BCCL's operations. Here is a detailed list of the key projects I was involved
in:

Project 1
Development of a Copyright Infringement Detection Model
In this project, I worked on developing a model to detect potential copyright infringement in
the content generated by reporters and AI tools. We used APIs from Copyscape to create a
comprehensive search framework that scans the internet and various digital repositories for
duplicated content. The model generated detailed reports indicating the specific URLs, word
counts, and percentages of content overlap, ensuring thorough compliance with intellectual
property laws.

Project 2
Designing an AI-Powered Research Assistant Model
I contributed to creating an AI-powered research assistant that conducts internet searches
based on user queries, compiles relevant URLs, and provides summarized content. We
integrated APIs from Bing and Google Custom Search to ensure comprehensive and accurate
search results. This tool enabled users to quickly access relevant information and synthesized
summaries, streamlining the research process. This project underscored the potential of AI in
optimizing research workflows.

Project 3
Implementation of an AI-Driven Data Analysis Model
I was involved in building an AI model designed to analyze data from CSV or Excel files,
providing summaries, detecting trends, and generating visualizations. We initially leveraged
OpenAI's GPT-4-0125-preview model and later switched to GPT-4 Omni models for their
advanced data interpretation capabilities. This allowed us to generate detailed summaries, extract
meta attributes, and respond accurately to user queries. Additionally, we implemented user-
requested and auto-generated graph visualizations to illustrate data trends effectively. This
project aimed to streamline data analysis processes, making it easier for the editorial team to
interpret and utilize large datasets. The AI-driven model proved instrumental in enhancing
data-driven decision-making within the organization.

Project 4
Creation of an AI Document Analyzer
My role involved developing an AI tool that could analyze documents, extract relevant text
segments, and provide contextual answers to specific queries. We began with the OpenAI
GPT-4-0125-preview model and later moved to GPT-4 Omni models to improve response time,
utilising vector databases for embeddings in the Knowledge Retrieval method. This approach
facilitated efficient extraction and evaluation of textual content from diverse documents. The
AI Document Analyzer was particularly useful for handling complex
diverse documents. The AI Document Analyzer was particularly useful for handling complex
documents, such as legal briefs and financial reports, offering valuable insights and enhancing
document comprehension capabilities. This tool significantly improved the efficiency and
accuracy of document analysis processes within the company.

Project 5
Design and Development of a Multilingual Speech-to-Text AI Model
I worked on developing a multilingual speech-to-text AI model aimed at transcribing and
translating reporter speech into text across multiple languages with high accuracy. We utilized
OpenAI's Whisper model for speech recognition, which provided robust capabilities in
recognizing various languages. To ensure the transcriptions were clear and readable, we
incorporated GPT-3.5 for rephrasing the text into coherent and meaningful language. This
project significantly improved the efficiency of the reporting process, enabling instant and
accurate textual output. The model supported various file formats, including mp3, mp4, and
wav, ensuring flexibility in usage. Overall, this tool enhanced the accessibility and quality of
transcriptions for diverse audiences.

Project 6
Enhancement of Live Speech-to-Text
The goal of this project was to improve live speech-to-text transcription systems so that
reporting and interviews could be transcribed accurately and in real time. To increase
transcription accuracy, we used a variety of voice datasets, noise-reduction strategies, and
sophisticated services such as AssemblyAI and AWS live speech-to-text. This
improvement increased the effectiveness of live reporting by guaranteeing that live
transcriptions were instantaneously coherent and publication-ready. The enhanced live
speech-to-text system represented a major breakthrough in the production of real-time reports
and information.

Project 7
Development of an AI Content Generation Model
In this project, I developed an AI model to generate content based on user-provided topics.
The model aimed to assist content creators by producing relevant and high-quality content
tailored to specific topics. We ensured the generated content was coherent, informative, and
adhered to the editorial standards of BCCL. Integrating this model into the content
management system streamlined the content creation process, allowing for quicker and more
efficient generation of articles and reports. This tool significantly enhanced the editorial
capabilities of BCCL, demonstrating the practical applications of AI in content creation.

Chapter-2
Development of a Copyright Infringement Detection
Model

2.1 Objective:
The objective of this project is to develop a robust and efficient model capable of detecting
potential copyright infringement in content generated by reporters and AI tools. By leveraging
APIs from Copyscape, the model aims to create a comprehensive search framework that scans
the internet and various digital repositories for duplicated content. The ultimate goal is to
generate detailed reports that indicate specific URLs, word counts, and percentages of content
overlap, thereby ensuring thorough compliance with intellectual property laws.

2.2 Introduction:
In this project, I focused on developing a model to detect potential copyright infringement in
content generated by reporters and AI tools. With the increasing volume of digital content and
the ease of replication, copyright infringement has become a significant concern. To address
this issue, the project utilized the powerful search capabilities of Copyscape APIs, creating a
comprehensive search framework designed to scan the internet and various digital repositories
for duplicated content. This framework ensures thorough compliance with intellectual
property laws and helps safeguard original content.

The model operates by performing extensive searches across numerous digital platforms to
identify instances of content overlap. Utilizing the advanced algorithms of Copyscape, it can
detect even subtle similarities between pieces of content. Once potential infringements are
identified, the model generates detailed reports that include specific URLs where the
duplicated content was found, word counts, and percentages of content overlap. These reports
are invaluable for understanding the extent of the infringement and taking appropriate actions
to address it.

The Copyscape API, for instance, performs text searches on the internet to find exact matches
and partial matches of content. The integration of these APIs into the model leverages their
extensive databases and sophisticated detection mechanisms, ensuring accurate and reliable
identification of copyright issues.

This project aims to provide a reliable tool for organizations and individuals to monitor and
manage the originality of their content. By fostering a culture of respect for intellectual
property, the model not only helps in protecting rights but also promotes ethical content
creation practices. The comprehensive and detailed reports generated by the model assist in
ensuring that all content remains within the legal boundaries of copyright laws, thus
mitigating the risk of legal issues and maintaining the integrity of the content produced.

2.3 Prior Work:


Before embarking on this project, it was essential to have a solid understanding of the tools
and technologies involved in detecting copyright infringement. Key components included:
1. Understanding Copyscape APIs: This API is designed to search the internet for
duplicate content, providing detailed information about potential matches, including
URLs, word counts, and percentages of content overlap. Familiarity with its operation,
including how to configure and use it for effective searches, was crucial.

2. Familiarity with API Integration and Data Parsing: Knowledge of how to make API
calls, handle responses, and manage authentication was essential. This includes
understanding HTTP methods, response handling, and error management. Skills in
parsing XML and JSON responses from the APIs to extract meaningful data for reporting
were also required; using tools like xml2js in JavaScript to convert API responses into
usable formats was part of this expertise.

3. Basic Knowledge of Intellectual Property Laws: A foundational understanding of


copyright laws and the principles of intellectual property was necessary to grasp the
importance of detecting infringement and ensuring compliance.

4. Familiarity with Similar Projects or Tools: Reviewing existing tools and methods for
copyright infringement detection provided insights into best practices and common
pitfalls. This included studying the methodologies and effectiveness of existing solutions
to build a more robust and reliable model.
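To make the data-parsing skills in item 2 concrete, the sketch below shows the kind of nested object that xml2js produces from a Copyscape-style XML response and how the fields used in the reports (URL, title, words matched) can be pulled out. The object shape and field names here are illustrative assumptions, not the exact Copyscape schema:

```javascript
// Hypothetical shape of a Copyscape response after parsing with xml2js
// (using the explicitArray: false option). Field names are illustrative.
const parsed = {
  response: {
    result: [
      { url: 'https://example.com/a', title: 'Article A', minwordsmatched: '120' },
      { url: 'https://example.com/b', title: 'Article B', minwordsmatched: '45' },
    ],
  },
};

// Extract the fields reported by the model: URL, title, and words matched.
// xml2js returns all values as strings, so numeric fields need conversion.
const matches = (parsed.response.result || []).map((r) => ({
  url: r.url,
  title: r.title,
  wordsMatched: Number(r.minwordsmatched),
}));
```

In the real pipeline this object would come from xml2js.parseString on the API response rather than a literal.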

2.4 Methodology:
The methodology for developing the copyright infringement detection model involved several
key steps, utilizing various tools and APIs to create a comprehensive search framework.
By following this methodology, the project creates an efficient and effective model for
detecting potential copyright infringements in text content. The integration of Copyscape API,
combined with robust error handling and comprehensive result processing, ensures that the

model provides reliable and detailed reports on potential copyright issues, aiding in
compliance with intellectual property laws.
Below is an outline of the methodology:

1. Setup and Configuration:


 Environment Setup: The project starts with setting up the environment using
Node.js and Express.js. Essential packages like axios for making HTTP requests,
xml2js for parsing XML data, and body-parser for handling request bodies are
installed and configured.
 Environment Variables: Sensitive information such as the Copyscape API
username, key, and URL are stored in environment variables for security
purposes, loaded through the dotenv package.

2. API Integration:
 Copyscape API: The core functionality involves making API calls to Copyscape.
A function copyscape_api_call is defined to handle GET and POST requests to
the Copyscape API, forming the request URL with the necessary parameters and
managing the response.
 Text Search Endpoint: An endpoint /copyscape/textsearch is set up to handle
POST requests. This endpoint receives text data, processes it, and makes a call to
the Copyscape API to search for potential copyright infringements on the internet.

3. Text Processing:
 Input Handling: The text to be checked for copyright infringement is received in
the body of the POST request. The total number of words in the text is calculated
to aid in the analysis of the results.
 Encoding and Search Level: The text is encoded in UTF-8, and a parameter full is
set to control the comprehensiveness of the search (0 for basic, higher values for
more comprehensive searches).

4. API Call and Response Handling:


 Making the API Call: The copyscape_api_call function is invoked with the
appropriate parameters, sending the text to Copyscape for analysis.
 Parsing the Response: The XML response from Copyscape is parsed using xml2js
to convert it into a JSON object. This object contains the results of the search,
including URLs, titles, and the number of matching words.


5. Result Processing and Reporting:
 Result Extraction: The results from the API response are extracted and processed.
Each result includes the URL, title of the matched content, the minimum number
of words matched, and the percentage of the text that matches.
 JSON Response: The processed results are formatted into a JSON array and sent
back as the response to the POST request. If no matches are found, a 404 status
with a "No matches found" message is returned.

6. Error Handling:
 Robust Error Management: The methodology includes robust error handling to
manage issues such as missing text input, API call failures, and parsing errors.
These errors are logged, and appropriate HTTP status codes and error messages
are returned.

7. Server Setup:
 Express Server: The Express server is set up to listen on a specified port, enabling
the API to handle requests and return responses.
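The request construction and result processing described in steps 2-5 can be sketched with a few small helpers. This is a minimal illustration under stated assumptions, not the production code: the query-parameter names (u, k, o, e, c) are guesses based on the description above, and the exact interface should be taken from the Copyscape API documentation.

```javascript
// Sketch of the text-search flow described in steps 2-5 above.
// Parameter names are assumptions; consult the Copyscape API docs.

// Step 2: form the request URL with the necessary parameters.
function buildTextSearchUrl(baseUrl, username, apiKey, full) {
  const params = new URLSearchParams({
    u: username,     // account username (assumed parameter name)
    k: apiKey,       // API key (assumed parameter name)
    o: 'csearch',    // text-search operation (assumed)
    e: 'UTF-8',      // step 3: the text is UTF-8 encoded
    c: String(full), // search comprehensiveness: 0 = basic
  });
  return `${baseUrl}?${params.toString()}`;
}

// Step 3: total word count of the submitted text, used to compute overlap.
function countWords(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

// Step 5: turn parsed results into the JSON array returned to the client.
function formatResults(results, totalWords) {
  return results.map((r) => ({
    url: r.url,
    title: r.title,
    minWordsMatched: Number(r.minwordsmatched),
    percentMatched: Math.round((Number(r.minwordsmatched) / totalWords) * 100),
  }));
}
```

In the actual service these helpers would sit behind the /copyscape/textsearch Express endpoint, with the XML response first converted to JSON via xml2js.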

2.5 Results:

Fig 2.1 Sample Text for the Profanity Check

Fig 2.2 Results of the Copyscape Internet Search for Potential Content Matches

Fig 2.3 Proof of result - Screenshot of an article from Fortune India discussing Nykaa's
business performance during the pandemic.

2.6 Conclusion:
The results of the copyright infringement detection model were obtained by conducting
extensive tests using various text samples generated by reporters and AI tools. The key
outcomes of these tests are summarized below:
1. Detection Accuracy:
 The model successfully identified instances of content overlap with high accuracy. It was
able to detect both exact matches and partial matches of content across multiple sources on
the internet.
 Detailed reports were generated for each test, highlighting the specific URLs, titles of the
matched content, the minimum number of words matched, and the percentage of text
overlap.

2. Comprehensive Search Results:


 By utilizing the Copyscape API, the model performed thorough searches across the web,
covering a wide range of digital repositories. This ensured that even subtle similarities in
content were detected.
 The search comprehensiveness was adjustable, allowing for basic to in-depth searches
depending on the requirements. In more comprehensive searches, the model demonstrated
its capability to uncover less obvious instances of duplication.

3. Efficiency and Performance:


 The model performed efficiently, with response times from the Copyscape API being
quick enough to allow for real-time or near-real-time analysis of content.
 The text processing and response handling mechanisms were optimized to manage large
volumes of data, making the model scalable for use in high-demand environments.

4. Error Handling and Reliability:


 The robust error handling mechanisms ensured that the model operated reliably even in
the face of API call failures, network issues, or input errors. Meaningful error messages and
appropriate HTTP status codes were returned, maintaining clarity in the communication of
issues.

 The system handled edge cases effectively, such as extremely short texts or texts with no
significant matches, providing appropriate feedback without causing system crashes or
failures.

Chapter-3
Designing an AI-Powered Research Assistant Model

3.1 Objective:
The objective of this project is to design an AI-powered research assistant model that integrates
various APIs, such as the Bing API and Google Custom Search API, to efficiently respond to
user queries by scouring the internet for relevant information. The model aims to provide users
with quick access to pertinent online content by generating outputs that include URLs of
websites containing the desired information, accompanied by succinct summaries extracted from
these sources. This methodology empowers users with a powerful and insightful research tool,
enhancing their ability to access and utilize relevant online content in a summarized and efficient
manner.

3.2 Introduction:
In this project, we developed an AI-powered research assistant model designed to significantly
enhance the efficiency and effectiveness of online research. With the vast amount of information
available on the internet, finding relevant and accurate content can be a time-consuming and
challenging task. Our model addresses this challenge by leveraging advanced APIs, including the
Bing API and Google Custom Search API, to automate and streamline the search process.

The core functionality of the model involves receiving user queries and using these APIs to
perform comprehensive searches across the web. When a user submits a query, the model sends
requests to the Bing and Google Custom Search APIs, which scour the internet for contextually
relevant information. These APIs return search results that include various web pages and
documents related to the query.

To enhance the usability of these search results, the model processes the returned data to
generate outputs that are both informative and concise. Each result includes the URL of the
relevant website and a succinct summary extracted from the content of that page. This summary
provides a quick overview of the information available on the site, allowing users to quickly
assess its relevance without having to visit each page individually.

The integration of these powerful search APIs ensures that the research assistant model performs
thorough and accurate searches, covering a wide range of online sources. This comprehensive
approach increases the likelihood of retrieving the most pertinent information for any given
query.

Moreover, by presenting the search results in a summarized format, the model enhances the user
experience by making the information more accessible and easier to digest. Users can quickly

scan through the summaries to find the content that best meets their needs, significantly reducing
the time and effort required for online research.

Overall, this project aims to empower users with a robust tool that facilitates efficient and
insightful research. By automating the search process and providing concise, relevant summaries,
the AI-powered research assistant model helps users access and utilize valuable online content
more effectively, supporting their research activities in a streamlined and user-friendly manner.

3.3 Prior Work:


Before starting the development of the AI-powered research assistant model, several
foundational steps were essential to ensure successful implementation. A thorough understanding
of the Bing Custom Search API and Google Custom Search API was necessary, including
configuring the APIs, crafting search queries, handling responses, and integrating them into the
application framework. Setting up a Node.js environment and an Express.js server was crucial to
handle incoming requests and responses efficiently, involving the installation of key packages
like axios for HTTP requests and dotenv for securely managing environment variables.

Understanding the rate limits imposed by the APIs was important to prevent exceeding the
allowed number of requests, which could lead to blocked access. Strategies like request throttling
or caching were considered for effective rate limit handling. Additionally, reviewing existing
research assistant tools and similar search applications provided insights into best practices and
effective solutions, helping identify features and functionalities to incorporate or improve upon.
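The request-throttling strategy mentioned above can be sketched as a small helper. The delay computation is kept as a plain function, and the interval value is purely illustrative rather than a quota published by either API.

```javascript
// Compute how long to wait before the next call so that consecutive
// calls are at least `intervalMs` apart (0 means no wait is needed).
function computeDelay(lastCallAt, now, intervalMs) {
  return Math.max(0, lastCallAt + intervalMs - now);
}

// Wrap an async API call so invocations are spaced out over time.
function makeThrottler(intervalMs) {
  let lastCallAt = 0;
  return async function throttled(apiCall) {
    const wait = computeDelay(lastCallAt, Date.now(), intervalMs);
    lastCallAt = Date.now() + wait;
    if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
    return apiCall();
  };
}
```

A caching layer keyed on the query string would complement this by avoiding repeat calls for identical searches entirely.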

3.4 Methodology:
3.4.1 Methodology for Suggestion.js (Google Custom Search API)

1. Setup and Configuration:

 Import Required Modules: The axios library is imported for making HTTP
requests, and the fs module is used for reading the API key and Search Engine
ID from files.
 Read API Credentials: The Google Custom Search API key and Search
Engine ID are read from files using fs.readFileSync.

2. Define Search Query:

 A search query is defined, which will be used to search for relevant content on the
internet. For example, the query could be "5 ways to remain healthy".

3. Construct API Request:

 API Endpoint: The Google Custom Search API endpoint is defined as
https://www.googleapis.com/customsearch/v1.

 Parameters: The query parameters, including the search query, API key, and
Search Engine ID, are constructed into an object.

4. Make API Request:

 An HTTP GET request is made to the Google Custom Search API using
axios.get, passing the constructed URL and parameters.

5. Process API Response:

 The response from the API is processed to extract relevant information. If the
response contains items, each item’s title, URL, and snippet are logged to the
console.
 Error Handling: If the API request fails, an error message is logged to the
console.

6. Output Results:

 The relevant search results, including titles, URLs, and snippets, are printed to the
console, providing a summarized view of the content found.
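The six steps above can be sketched as below. The endpoint and parameter names (q, key, cx) follow the Google Custom Search JSON API; the credentials are placeholders, and the actual axios call is shown only as a comment so the request-building and response-processing logic can run against a mock response.

```javascript
const GOOGLE_CSE_ENDPOINT = 'https://www.googleapis.com/customsearch/v1';

// Step 3: construct the query parameters for the API request.
function buildParams(query, apiKey, searchEngineId) {
  return { q: query, key: apiKey, cx: searchEngineId };
}

// Step 5: extract title, URL and snippet from each returned item.
function summarizeItems(data) {
  return (data.items || []).map((item) => ({
    title: item.title,
    url: item.link,
    snippet: item.snippet,
  }));
}

// Step 4 (real call, requires credentials):
// const { data } = await axios.get(GOOGLE_CSE_ENDPOINT, {
//   params: buildParams('5 ways to remain healthy', apiKey, searchEngineId),
// });

// Mock response shaped like the API's items array
const mockData = {
  items: [
    {
      title: '5 Ways to Stay Healthy',
      link: 'https://example.com/health',
      snippet: 'Simple everyday habits that support long-term health...',
    },
  ],
};
console.log(summarizeItems(mockData));
```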

3.4.2 Methodology for Bingexpress.js (Bing API)

1. Setup and Configuration:

 Import Required Modules: The express library is imported for setting up the
server, and dotenv is used for managing environment variables.
 Express Server Configuration: An Express application is created, and JSON
parsing middleware is configured.

2. Read Environment Variables:

 The API credentials and custom configuration ID for the Bing API are read from
environment variables using dotenv.

3. Define Search Endpoint:

 A POST endpoint /search is defined to handle search requests. The search term is
extracted from the request body or query parameters.

4. Validate Input:

 The search term is validated to ensure it is not empty. If it is missing, a 400 status
code and error message are returned.

5. Construct API Request:

 API Endpoint: The Bing Custom Search API endpoint is defined,
incorporating the search term and custom configuration ID.
 Headers: The request headers include the subscription key for the Bing API.

6. Make API Request:

 An HTTP GET request is made to the Bing Custom Search API using fetch,
passing the constructed URL and headers.

7. Process API Response:

 The response from the API is parsed as JSON. If the response contains web
pages, each web page’s name, URL, and snippet are mapped into an array and
returned as the JSON response.
 Error Handling: If the API request fails or the response is not OK, an error is
logged, and a 500 status code with an error message is returned.

8. Start Server:

 The Express server is configured to listen on a specified port, making the search
functionality available via the defined endpoint.
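A condensed sketch of the handler logic described above, split into two testable functions. The endpoint URL and the Ocp-Apim-Subscription-Key header follow the Bing Custom Search API; the key and configuration ID are placeholders, and the Express wiring and fetch call are omitted so the logic runs standalone.

```javascript
// Steps 4-5: validate the search term and build the request details.
function buildBingRequest(term, subscriptionKey, customConfigId) {
  if (!term || !term.trim()) {
    return { error: { status: 400, message: 'Search term is required' } };
  }
  const url =
    'https://api.bing.microsoft.com/v7.0/custom/search' +
    `?q=${encodeURIComponent(term)}&customconfig=${customConfigId}`;
  return { url, headers: { 'Ocp-Apim-Subscription-Key': subscriptionKey } };
}

// Step 7: map the webPages section of the JSON response into the
// name/url/snippet array returned to the client.
function mapWebPages(json) {
  return (json.webPages?.value || []).map((page) => ({
    name: page.name,
    url: page.url,
    snippet: page.snippet,
  }));
}
```

In the real bingexpress.js, buildBingRequest's output would feed the fetch call inside the POST /search handler, and mapWebPages's result would be sent with res.json; an error object would map to the 400 response.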

By following these methodologies, both the suggestion.js and bingexpress.js implementations
provide efficient mechanisms to search the web for relevant content using the Google Custom
Search and Bing APIs, respectively.

3.5 Results:

Fig 3.1 Search Query of Google Custom Search in the Backend

Fig 3.2 Result of Google Custom Search – Displaying Title, URL, and Snippet.

Fig 3.3 Code for searching the query using Bing API

3.6 Conclusions:
The implementation of the AI-powered research assistant model using both the Google Custom
Search API (suggestion.js) and the Bing Custom Search API (bingexpress.js) yielded significant
results in terms of accuracy, efficiency, and usability. Below is a summary of the outcomes
observed during the testing phase:

1. Accuracy of Search Results:

 Both implementations effectively retrieved relevant information based on the
user queries. The results included URLs, titles, and concise snippets,
providing users with a quick overview of the content.
 The Google Custom Search API returned results that were highly relevant to
the search queries, offering a diverse set of sources and information.
 Similarly, the Bing Custom Search API demonstrated high accuracy, with the
search results being closely aligned with the user’s search intent with respect
to Times of India URLs.

2. Efficiency and Performance:

 The response times for both APIs were satisfactory, ensuring that users
received prompt feedback on their queries. The Google Custom Search API
and Bing API both returned results quickly, making the research assistant
model efficient for real-time use.

 The server setup for the Bing API (bingexpress.js) performed reliably under
various load conditions, maintaining consistent response times and handling
multiple requests concurrently.

3. Usability and User Experience:

 The search results were presented in a user-friendly format, with titles, URLs,
and snippets clearly displayed. This made it easy for users to assess the
relevance of each result without needing to visit each link.
 The implementation of robust error handling ensured that users received
meaningful error messages when issues occurred, such as missing search
terms or network errors, enhancing the overall user experience.

4. Integration and Functionality:

 The integration of the APIs into the respective server setups was seamless,
demonstrating the feasibility of using these APIs in a production environment.
 Both implementations allowed for easy modification of search queries and
parameters, providing flexibility in how the search functionality could be used
and adapted for different requirements.

Overall, the results from both implementations indicate that the AI-powered research assistant
model is highly effective in retrieving relevant and accurate information from the web. The use
of Google Custom Search API and Bing Custom Search API provided a comprehensive and
efficient solution for user queries, ensuring quick access to pertinent online content. The model's
ability to present summarized information in a user-friendly manner significantly enhances its
usability, making it a valuable tool for research and information retrieval.

Chapter-4
Implementation of an AI-Driven Data Analysis Model

4.1 Objective:
The objective of this project is to develop an AI-driven data analysis model that enhances
data interpretation and utilization in Excel files. This model will automatically generate
concise summaries, extract and present meta attributes (such as file size, creation date, and
number of sheets), create visualizations based on user queries, and answer specific questions
related to the document using advanced natural language processing. By integrating these
functionalities, the project aims to streamline data analysis, improve decision-making, and
provide users with clear, data-driven insights.

4.2 Introduction:
In the implementation of the AI-Driven Data Analysis Model, we strategically utilized the
OpenAI Assistant API, specifically selecting the GPT-4-0125-preview model for its
advanced capabilities in data interpretation from CSV and Excel files. This selection enabled
the generation of summaries, extraction of meta attributes, and accurate responses to user
queries. To cater to graph visualization needs, our approach was twofold: enabling user-
requested chart creation and autonomously generating graphs to illustrate observed trends
from the data. Additionally, to refine the outputs for enhanced readability, we employed the
GPT-3.5 model for its rephrasing efficiency, offering a balance between performance and
speed. Among the three assistant APIs available, we chose the "code interpreter" mode for
its adeptness at handling complex data analysis tasks and generating programmatically
accurate results. This choice was instrumental in our ability to provide meaningful data
visualizations and insightful summaries, further enriched by our post-processing rephrasing
with GPT-3.5.

Later on, we switched to the GPT-4 Omni models, which offered enhanced performance and
greater flexibility in handling diverse and complex data analysis tasks. This transition
allowed us to further refine our model's capabilities, ensuring even more accurate, fast, and
accessible insights for our editorial team and end-users. This nuanced application of
OpenAI's technologies demonstrates our innovative approach to leveraging AI for
comprehensive data analysis, ensuring that our users receive the highest quality of insights
and data-driven decision-making support.

4.3 Prior Work:
To create the five scripts, several foundational steps and prior work are essential. The first
step involves setting up the development environment. This includes installing Node.js and
npm, which are necessary for running the JavaScript code and managing project
dependencies. Additionally, essential packages such as express for building the server,
multer for handling file uploads, dotenv for managing environment variables, and the
OpenAI SDK for interacting with OpenAI's API must be installed; Node's built-in fs module
handles file system operations. Proper environment configuration is also crucial, requiring
the creation of a .env file to securely store the OpenAI API key and any other necessary
environment variables.

Next, handling file uploads is a critical task. Multer needs to be configured in each script to
manage file uploads efficiently, specifying appropriate destination folders for the uploaded
files. Alongside this, setting up an Express server is essential to handle incoming HTTP
requests. This setup ensures that files can be uploaded, stored temporarily, and then
processed by the AI model.

Integrating the OpenAI API is another significant step. This involves securely managing the
OpenAI API key using environment variables loaded through dotenv. The scripts need to
implement functionality to upload files to OpenAI’s servers for further analysis. This
process ensures that the data is accessible to the AI models for processing and generating
insights.

Configuring the AI models and assistants is a crucial part of the setup. Each script requires
clearly defined instructions tailored to its specific function, whether it be summarizing data,
extracting meta attributes, generating graphs, or answering user queries. The assistants are
created using the OpenAI API, initially with the GPT-4-0125-preview model and later
transitioning to the more advanced GPT-4 Omni models to enhance performance and
flexibility.

Handling threads and messages effectively is vital for user interactions. This involves
creating threads for each user interaction session and defining the initial messages to set the
context for the AI assistant. Proper message handling ensures that user queries are
processed, and appropriate responses are generated using the AI model.

4.4 Methodology:

4.4.1 Methodology to Implement Summary, Meta-attributes, and User Input Query


The methodology for implementing the explain.js script involves several key steps to ensure
the effective analysis and explanation of CSV file contents using OpenAI's GPT-4o model.
Below is a detailed description of the methodology:

1. Setup and Configuration:


 Environment Setup: Install necessary packages, including express, multer, dotenv, and
the OpenAI SDK (the built-in fs module is also used). Create a .env file to securely store
the OpenAI API key and load it using dotenv.
 Server Initialization: Initialize an Express server to handle incoming HTTP requests.

2. File Upload Handling:


 Multer Configuration: Set up multer to handle file uploads, specifying the destination
folder for storing uploaded files temporarily.
 Express Middleware: Configure Express to parse JSON requests and handle file uploads
using multer.

3. OpenAI API Integration:


 File Upload to OpenAI: Implement functionality to upload the CSV file to OpenAI's
servers for analysis. This involves reading the file from the temporary storage and using the
OpenAI API to create a file resource.

4. Assistant Configuration:
 Define Instructions: Provide clear instructions for the AI assistant. In this case, the
instructions are to analyze the provided CSV file, detailing its structure (rows, columns) and
elucidating any patterns, trends, or insights. The explanation should be clear and structured,
suitable for someone without a data analysis background.
 Assistant Creation: Create an AI assistant using the OpenAI API with the defined
instructions and the specified model (initially GPT-4o).

5. Thread and Message Handling:


 Create Thread: Initiate a new thread for the user interaction session.
 User Request Handling: Define the initial user message asking for a detailed explanation
of the CSV file. Attach the uploaded file to this message and specify the use of the
"code_interpreter" tool.

6. Execution and Status Checking:
 Run Execution: Trigger the execution of the assistant's run using the created thread and
assistant ID.
 Status Monitoring: Implement a function to check the status of the run periodically. If the
run is completed, retrieve the messages generated by the assistant.

7. Result Processing and Response Handling:


 Retrieve and Format Results: Once the run is completed, retrieve the messages from the
thread. Extract the content generated by the assistant, which includes the detailed
explanation of the CSV file.
 Send Response: Format the extracted content and send it back as the response to the
user's request.

8. Server Listening:
 Start Server: Configure the Express server to listen on a specified port (e.g., port 3000) to
handle incoming requests.
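Steps 6 and 7 (run execution and status polling) can be condensed into one helper. The method names mirror the OpenAI Node SDK's Assistants API (client.beta.threads.runs.create/retrieve, messages.list), but the client is taken as a parameter so the flow can be exercised with a stub instead of live credentials; the interval and retry limit are illustrative defaults.

```javascript
// Start a run for the given thread/assistant, poll until it completes,
// then return the thread's messages. Throws on failure or timeout.
async function runAndWait(client, threadId, assistantId,
                          { intervalMs = 1000, maxTries = 60 } = {}) {
  const run = await client.beta.threads.runs.create(threadId, {
    assistant_id: assistantId,
  });
  for (let attempt = 0; attempt < maxTries; attempt++) {
    const current = await client.beta.threads.runs.retrieve(threadId, run.id);
    if (current.status === 'completed') {
      // Retrieve everything generated by the assistant for this thread
      return client.beta.threads.messages.list(threadId);
    }
    if (current.status === 'failed') {
      throw new Error('Assistant run failed');
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for assistant run');
}
```

The same helper serves both explain.js and the graph script, since both trigger a run and wait for its completion before formatting the response.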

Fig 4.1 Code overview of assistant creation, the instruction prompt, and tool selection

4.4.2 Methodology to Implement Graphs

1. Setup and Configuration:

 Environment Setup: Install necessary packages, including express, multer, dotenv, and
the OpenAI SDK (the built-in fs module is also used). Create a .env file to securely store
the OpenAI API key and load it using dotenv.

 Server Initialization: Initialize an Express server to handle incoming HTTP requests.

2. File Upload Handling:

 Multer Configuration: Set up multer to handle file uploads, specifying the destination
folder for storing uploaded files temporarily.

 Express Middleware: Configure Express to parse JSON requests and handle file uploads
using multer.

3. OpenAI API Integration:

 File Upload to OpenAI: Implement functionality to upload the CSV file to OpenAI's
servers for analysis. This involves reading the file from the temporary storage and using the
OpenAI API to create a file resource.

4. Assistant Configuration:

 Define Instructions: Provide clear instructions for the AI assistant. In this case, the
instructions are to analyze data from the provided CSV file and generate charts that
accurately represent identified trends. The assistant should ensure all axes are clearly labeled
with appropriate units of measurement.

 Assistant Creation: Create an AI assistant using the OpenAI API with the defined
instructions and the specified model (GPT-4o).

5. Thread and Message Handling:

 Create Thread: Initiate a new thread for the user interaction session.

 User Request Handling: Define the initial user message asking for specific graphs or
visualizations based on the CSV data. Attach the uploaded file to this message and specify
the use of the "code_interpreter" tool.

6. Execution and Status Checking:

 Run Execution: Trigger the execution of the assistant's run using the created thread and
assistant ID.

 Status Monitoring: Implement a function to check the status of the run periodically. If the
run is completed, retrieve the messages and generated images by the assistant.

7. Result Processing and Response Handling:

 Retrieve and Format Results: Once the run is completed, retrieve the messages from the
thread. Extract the content generated by the assistant, including the generated graphs.

 Process Images: Convert the generated images to a format suitable for web display (e.g.,
base64 encoded strings).

 Send Response: Format the extracted content and generated graphs, and send them back
as the response to the user's request.

8. Server Listening:

 Start Server: Configure the Express server to listen on a specified port (e.g., port 4000) to
handle incoming requests.
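The image post-processing step can be sketched as below: after the generated chart's raw bytes are downloaded (via the SDK's file-content call, not shown here), they are encoded as a base64 data URL for web display. PNG is assumed as the output format of the code interpreter's charts.

```javascript
// Convert raw image bytes into a data URL suitable for an <img> src.
function toDataUrl(imageBytes, mimeType = 'image/png') {
  const base64 = Buffer.from(imageBytes).toString('base64');
  return `data:${mimeType};base64,${base64}`;
}

// Example with a placeholder 4-byte buffer (not a real PNG)
const url = toDataUrl(Buffer.from([0x89, 0x50, 0x4e, 0x47]));
console.log(url); // data:image/png;base64,iVBORw==
```

Returning data URLs keeps the JSON response self-contained: the front end can render each graph directly without a second request for the image file.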

Fig 4.2 Handling Graphs and the Explanation

4.5 Results:

Brand   Model   Color             Memory   Storage   Rating   Selling Price   Original Price
OPPO    A53     Moonlight Black   4 GB     64 GB     4.5      11990           15990
OPPO    A53     Mint Cream        4 GB     64 GB     4.5      11990           15990
OPPO    A53     Moonlight Black   6 GB     128 GB    4.3      13990           17990
OPPO    A53     Mint Cream        6 GB     128 GB    4.3      13990           17990
OPPO    A53     Electric Black    4 GB     64 GB     4.5      11990           15990
OPPO    A53     Electric Black    6 GB     128 GB    4.3      13990           17990
OPPO    A12     Deep Blue         4 GB     64 GB     4.4      10490           11990
OPPO    A12     Black             3 GB     32 GB     4.4      9490            10990
OPPO    A12     Blue              3 GB     32 GB     4.4      9490            10990
OPPO    A12     Flowing Silver    3 GB     32 GB     4.4      9490            10990
… 3112 more rows in the dataset

Table 4.1 Test Dataset Containing 3120 Rows

Fig 4.3 Giving the Meta Attributes of the input file

Fig 4.4 Summary of the Input File

Fig 4.5 Graph generated by the model showing the relationship between Price in INR and
Storage Size.

Fig 4.6 Graph created by the model depicting the relationship between Brand and Average
Rating.

Fig 4.7 Response to the user's query

4.6 Conclusion:

1. Efficient File Upload Handling: The server was successfully configured to handle file
uploads and JSON requests using multer and Express middleware. This ensured that CSV
files could be uploaded and processed efficiently.

2. Accurate Execution and Status Checking: The assistant's runs were successfully
triggered, and the status of each run was monitored periodically. This ensured that the
results were retrieved as soon as the analysis was completed.

3. Clear and Informative Result Processing: The results generated by the AI assistant,
including detailed graphs and explanations, were retrieved and formatted properly. The
graphs were accurately labelled with appropriate units of measurement and provided
clear visual representations of the data trends and insights.

4. User Query Handling: User input queries were handled well, with the AI assistant
providing accurate and relevant responses. The responses were clear and informative,
addressing the user’s questions effectively.

5. Perfect Graph Display: The generated graphs were displayed perfectly, providing users
with visually appealing and accurate representations of their data. The graphs were well-
labelled, making it easy for users to understand the trends and insights presented.

Chapter-5
Creation of an AI Document Analyzer

5.1 Objective:
The objective was to describe the development of an AI-powered document analyzer,
which consists of two sub-models. The first sub-model focuses on summarizing text
documents, providing concise and coherent summaries of large textual data. The second
sub-model is designed to answer user queries related to the content of the provided
documents, facilitating efficient information retrieval and enhancing user interaction with
the document's content. This chapter aims to detail the design, implementation, and
evaluation of these sub-models, highlighting their functionalities, integration, and overall
performance in analyzing and interpreting textual information.

5.2 Introduction:
In the rapidly evolving field of artificial intelligence, the ability to process and understand
large volumes of textual data has become increasingly important. The project described
in this chapter focuses on the creation of an AI-powered document analyzer designed to
enhance the efficiency and accuracy of text analysis. This tool incorporates two key
functionalities: the ability to summarize lengthy text documents and the capability to
answer user queries based on the content of the provided documents. These features aim
to streamline information retrieval and improve user interaction with complex textual
data.

The first sub-model of the AI document analyzer is dedicated to text summarization. This
component is engineered to process extensive text documents, extracting the most critical
information to generate concise and coherent summaries. The summarization capability is
essential for users who need to quickly grasp the main points of lengthy documents
without reading them in their entirety. This functionality leverages advanced natural
language processing (NLP) techniques to ensure that the summaries are both accurate and
meaningful, retaining the essence of the original text while reducing its length.

The second sub-model focuses on answering user queries related to the content of the
documents. This feature allows users to input specific questions and receive detailed,
contextually relevant answers based on the analyzed documents. By integrating
sophisticated NLP algorithms and leveraging a deep understanding of the document's
content, this sub-model enhances the user experience by providing precise and
informative responses. Together, these sub-models form a comprehensive AI document

analyzer that significantly improves the efficiency and effectiveness of handling large
textual datasets.

5.3 Prior Work:


Implementing an AI Document Analyzer requires a comprehensive understanding and
execution of several foundational tasks. First and foremost, a solid grasp of Natural
Language Processing (NLP) techniques is essential. This includes tokenization, which
breaks text into individual words or phrases; part-of-speech tagging, which identifies
grammatical parts of speech; named entity recognition (NER) to detect and classify
proper nouns and other named entities; and sentiment analysis to understand the
emotional tone behind the text. Additionally, familiarity with deep learning models,
particularly those used in NLP, such as Recurrent Neural Networks (RNNs), Long Short-
Term Memory (LSTM) networks, and Transformer models like BERT and GPT, is
crucial. These models help in understanding the context and semantics in text, which is
vital for tasks like summarization and question answering.

Data collection and preprocessing form another critical step. This involves gathering a
large corpus of text data for training and evaluation, followed by data cleaning to remove
irrelevant information and correct errors. Data annotation is necessary to label data for
tasks such as summarization or question-answering, enabling the training of supervised
models. Creating training and test sets helps in evaluating model performance accurately.
Understanding different summarization techniques is also essential. Extractive
summarization involves identifying and extracting key sentences or phrases, while
abstractive summarization generates new sentences that capture the essence of the
original text, often requiring more advanced models and a deep understanding of
language.

Building a question-answering system involves implementing retrieval-based models that
find relevant passages or sentences and generative models that produce answers based on
the document's content. Familiarity with software libraries and frameworks commonly
used in NLP and machine learning, such as TensorFlow or PyTorch for building and
training models, NLTK or SpaCy for basic NLP tasks, and Hugging Face Transformers
for leveraging pre-trained models, is necessary. Additionally, designing a user-friendly
interface for document upload and query input requires web development skills, using
frameworks like React, Angular, or Vue.js for front-end development and Node.js, Flask,
or Django for backend development. Rigorous evaluation and testing using metrics like
ROUGE scores for summarization quality and precision, recall, and F1 scores for
question-answering accuracy, along with user testing to gather feedback, ensure the tool
is robust and reliable. These preparatory steps collectively form the essential groundwork
for implementing an effective AI Document Analyzer.

5.4 Methodology:
1. Environment Setup
The first step involves setting up the development environment. This includes
configuring a server using Express.js to handle HTTP requests and responses. The
environment is further configured with the help of dotenv to manage environment
variables securely. The OpenAI API is integrated into the project to utilize advanced
language models for document processing and analysis.

2. File Upload Configuration


A robust file upload mechanism is established using Multer, a middleware for handling
multipart/form-data, which is primarily used for uploading files. The configuration
ensures that only PDF files are accepted to maintain consistency in document formats.
Each uploaded file is stored in a specific directory with a unique filename to prevent
overwriting and conflicts.

3. API Endpoint for Document Analysis


An API endpoint is created to handle the document analysis requests. When a user
uploads a document, the file is processed and sent to the OpenAI assistant. The assistant
is configured with detailed instructions to analyze the document thoroughly, paying close
attention to its content. This analysis helps generate comprehensive and direct responses
to user queries based on the document's information.

4. Assistant Configuration and Integration


The OpenAI assistant is configured to perform document analysis tasks. This involves
setting up the assistant with specific instructions that guide its behavior. The assistant is
tasked with thoroughly examining the document, understanding its content, and
responding accurately to user queries. This setup ensures that the assistant can provide
relevant and detailed information based on the analyzed document.

5. Document Analysis and Response Generation


Upon receiving a document, the assistant processes it to extract meaningful information.
The assistant generates summaries and answers user queries by leveraging the pre-trained
language models from OpenAI. This step involves both extracting key information and
generating new content that is coherent and contextually accurate.

6. Error Handling and File Management


Robust error handling mechanisms are put in place to manage any issues that arise during
file upload and processing. This includes handling invalid file formats, processing errors,
and cleaning up files after processing to ensure efficient use of storage. The system
ensures that users receive clear feedback in case of errors and that files are managed
securely.

7. Deployment and User Interface


The final step involves deploying the application on a web server, ensuring it is
accessible to users. A user-friendly interface is designed to allow easy document uploads
and query inputs. The interface is developed using modern web development frameworks
to ensure a smooth user experience. Continuous monitoring and updates are performed to
maintain the system's performance and accuracy.

Fig 5.1 Code overview of the assistant's instructions and document storage in vector
space.

5.5 Results:

Fig 5.2 Summary of the uploaded document

Fig 5.3 Model accurately answered the user query.

5.6 Conclusion:
1. Summarization Accuracy: The summarization model produced concise and accurate
summaries, effectively capturing essential information from lengthy documents. High
ROUGE scores confirmed the quality of the summaries.

2. Query Response Relevance: The question-answering model accurately and contextually
addressed user queries based on the document content. The high precision, recall, and F1
scores demonstrated the model's effectiveness in providing relevant answers.

3. System Performance and Scalability: The system performed reliably under various
loads, efficiently managing document analysis and user queries, demonstrating robust
performance and scalability.

Chapter-6
Design and Development of a Multilingual Speech-to-
Text AI Model

6.1 Objective:
The objective of the Design and Development of a Multilingual Speech-to-Text AI Model
project at BCCL is to develop a state-of-the-art system that can accurately and efficiently
convert voice notes (in MP3 format) from reporters into written text across multiple
languages. This project aims to streamline the workflow of reporters by providing a
reliable tool for transcribing spoken content into text, thereby enhancing the speed and
accuracy of news reporting. By supporting a wide range of languages and dialects, the
model ensures that reporters from diverse linguistic backgrounds can utilize this
technology. Key goals include achieving high accuracy in speech recognition for each
supported language, enabling efficient transcription to facilitate timely news
dissemination, and incorporating adaptive learning features to continually improve the
model’s performance.

6.2 Introduction:
In the rapidly evolving landscape of digital journalism, the ability to quickly and
accurately transcribe and translate spoken content into written text is becoming
increasingly vital. The Design and Development of a Multilingual Speech-to-Text AI
Model project at BCCL aims to address this critical need by creating an advanced system
capable of converting voice notes, specifically in MP3 format, from reporters into text
across multiple languages. This project is designed to streamline the workflow of
journalists, enhancing the speed and accuracy of news reporting, and ensuring
accessibility for reporters from diverse linguistic backgrounds.

The proposed system leverages cutting-edge speech recognition technology to transcribe
spoken words into text with high precision. Furthermore, it incorporates robust
translation capabilities to support a wide array of languages and dialects, thus fostering
inclusivity and broadening the reach of news content. By integrating adaptive learning
features, the model is designed to continuously improve its performance, ensuring that it
remains reliable and effective over time.

This initiative is aligned with BCCL's commitment to innovation and excellence in
journalism. By providing a reliable tool for transcription and translation, the project not
only enhances the efficiency of news reporting but also contributes to the preservation
and promotion of linguistic diversity. The seamless integration of this AI model with
BCCL’s existing digital infrastructure will provide a user-friendly interface that
minimizes errors and maximizes productivity, ultimately setting a new standard for
multilingual news transcription and translation in the industry.

6.3 Prior Work:


1. Requirement Analysis:
 Conduct a comprehensive needs assessment to understand the specific requirements of
reporters and end-users.
 Identify the range of languages and dialects to be supported, focusing on those most
relevant to BCCL's operations.

2. Data Collection:
 Gather a diverse dataset of audio recordings in various languages and dialects to train and
test the model.
 Ensure that the dataset includes a variety of accents, speech patterns, and audio qualities
to enhance model robustness.

3. Technology Stack Selection:


 Choose appropriate technologies and frameworks for developing the AI model, including
programming languages, libraries, and tools for speech recognition and natural language
processing.
 Evaluate and select cloud service providers for hosting the model and handling
computational requirements.

4. Model Development and Integration:


 Develop the initial version of the speech-to-text model using existing frameworks like
OpenAI's Whisper for transcription and GPT-4 for text processing and error correction.
 Integrate the model with a web service using frameworks like Express.js and Multer for
handling file uploads and API endpoints, as illustrated in the provided scripts.

5. Documentation and Training:


 Prepare comprehensive documentation detailing the model's functionalities, usage
instructions, and troubleshooting guides.
 Provide training sessions for reporters and other users to familiarize them with the new
system and its benefits.

6.4 Methodology:
6.4.1 Methodology for Transcription
1. Setup and Initialization:

Express.js and Multer Configuration:


- Import necessary modules including express, multer, fs, and OpenAI.
- Configure multer for file storage, specifying the destination folder (./uploads)
and the naming convention for uploaded files (file.fieldname + '-' + Date.now() +
'.mp3').

OpenAI Initialization:
- Instantiate the OpenAI client to interact with OpenAI's models for transcription
and translation.

2. File Upload Handling:

Endpoint Definition:
- Define an Express.js route (/transcribe) to handle POST requests for file
uploads.
- Use multer middleware to process single file uploads, storing the file in the
specified directory.

3. Transcription Process:

File Processing:
- Extract the file path from the uploaded file.
- Use OpenAI's Whisper model to transcribe the audio file by reading the file
stream (fs.createReadStream(filePath)).

Transcription Text Handling:


- Extract the transcribed text from the transcription response.
- Replace newline characters (\n) with escape sequences for proper formatting in
JSON responses.
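A sketch of this step follows. The Whisper call follows the OpenAI Node SDK shape and is commented out because it needs the SDK and an API key; escapeNewlines is our own name for the newline handling described above.

```javascript
// Transcription call sketch (assumed OpenAI Node SDK shape):
// const transcription = await openai.audio.transcriptions.create({
//   file: fs.createReadStream(filePath),
//   model: 'whisper-1',
// });

// Escape raw newlines so the transcript survives JSON framing intact.
function escapeNewlines(text) {
  return text.replace(/\n/g, '\\n');
}
```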

51 | P a g e
4. Text Correction and Formatting:

Conversation Setup:
- Define a conversation for the OpenAI GPT-4 model, including a system prompt
instructing the model to correct grammar errors, remove offensive language, and
organize content into clear paragraphs.
- Pass the transcribed text to the model as the user input.

GPT-4 Interaction:
- Create a completion request to the GPT-4 model with the defined conversation,
specifying parameters like max_tokens, temperature, top_p, frequency_penalty,
and presence_penalty.

Text Formatting:
- Extract the corrected and formatted text from the GPT-4 model's response.
- Replace newline escape sequences with actual newline characters for
readability.
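The correction request and formatting steps above can be sketched as follows; the system prompt wording and the sampling parameter values are illustrative assumptions, while the parameter names match those listed in the methodology.

```javascript
// Illustrative build-up of the GPT-4 correction request described above.
function buildCorrectionRequest(transcribedText) {
  return {
    model: 'gpt-4',
    messages: [
      {
        role: 'system',
        content:
          'Correct grammar errors, remove offensive language, and organize ' +
          'the content into clear paragraphs.',
      },
      { role: 'user', content: transcribedText },
    ],
    max_tokens: 1024,       // assumed values for the listed parameters
    temperature: 0.3,
    top_p: 1,
    frequency_penalty: 0,
    presence_penalty: 0,
  };
}

// Restore escaped newlines for readability, as described above.
function unescapeNewlines(text) {
  return text.replace(/\\n/g, '\n');
}

// Usage (assumed):
// const completion = await openai.chat.completions.create(buildCorrectionRequest(text));
```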

5. Response Handling:

JSON Response:
- Send the corrected and formatted text back to the client as a JSON response.

6. Server Configuration:

Port Setup:
- Define the port on which the Express.js server will run (process.env.PORT ||
3000).

Server Initialization:
- Start the Express.js server and log a message indicating the server is running.
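The server configuration above reduces to a small sketch; the log message is illustrative.

```javascript
// Port fallback described above: use PORT from the environment, else 3000.
function resolvePort(envPort) {
  return envPort || 3000;
}

// Usage (assumed):
// const port = resolvePort(process.env.PORT);
// app.listen(port, () => console.log(`Server running on port ${port}`));
```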

6.4.2 Methodology for Translation

1. Setup and Initialization:

Importing Modules:
- Import necessary modules including express, multer, fs, and OpenAI to handle
server setup, file upload, and interaction with OpenAI's models.

Initializing Express App:
- Create an instance of the Express application.

2. File Storage Configuration:

Multer Configuration:
- Configure multer for file storage, specifying the destination folder (./uploads)
and the naming convention for uploaded files (appending a timestamp to the
original file name).

3. Defining the Endpoint:

Setting Up Routes:
- Define an endpoint (/transcribe) using Express.js to handle POST requests for
file uploads.
- Use multer middleware to process single file uploads, storing the file in the
specified directory.

4. Processing the Uploaded File:

File Path Extraction:


- Extract the file path from the uploaded file.

Transcription Request:
- Use OpenAI’s Whisper model to transcribe the audio file by reading the file
stream (fs.createReadStream(filePath)).
- Capture the transcription text from the response.

5. Handling Transcription Text:

Translation and Correction:


- Utilize OpenAI's translation capabilities to translate the transcribed text into the
target language.
- Pass the translated text to GPT-4 for proofreading and correction, focusing on
grammar and offensive language removal.
- Extract the corrected content from the GPT-4 model’s response.

6. Sending the Response:

JSON Response:
- Send the translated and corrected text back to the client as a JSON response.

7. Server Configuration:

Port Setup:
- Define the port on which the Express.js server will run (process.env.PORT ||
3000).

Server Initialization:
- Start the Express.js server and log a message indicating the server is running.

6.5 Results:

Fig 6.1 Transcribed text for the given mp3 file as input

Fig 6.2 Translated and Paraphrased Text

6.6 Conclusion:
The Design and Development of a Multilingual Speech-to-Text AI Model project at
BCCL represents a significant advancement in the field of digital journalism. By creating
a robust and reliable system capable of accurately converting voice notes into written text
across multiple languages, the project addresses a critical need for timely and precise
news reporting. The implementation of state-of-the-art speech recognition technology,
exemplified by OpenAI's Whisper model, and the utilization of advanced natural
language processing capabilities through GPT-4, ensure high accuracy and adaptability in
transcription and translation tasks.
This project's success is marked by several key achievements:

1. Enhanced Workflow Efficiency: The system streamlines the transcription process for
reporters, significantly reducing the time and effort required to convert spoken content
into written text. This efficiency gain directly translates to quicker news dissemination
and more timely reporting.

2. Broad Language Support: By supporting a wide range of languages and dialects, the
model ensures inclusivity and accessibility for reporters from diverse linguistic backgrounds.
This capability fosters greater linguistic diversity in news reporting and enables a broader
audience reach.

3. Adaptive Learning: The incorporation of adaptive learning features allows the model to
continually improve its performance over time. This ensures that the system remains
reliable and effective, adapting to new speech patterns and languages as they emerge.

4. Preservation and Promotion of Linguistic Diversity: The project not only enhances
operational efficiency but also contributes to the preservation and promotion of linguistic
diversity in journalism. By accurately transcribing and translating content across multiple
languages, the system supports the representation of various linguistic communities in
news media.

In conclusion, the successful implementation of the Multilingual Speech-to-Text AI
Model at BCCL sets a new standard for multilingual news transcription and translation.
This project not only enhances the efficiency and accuracy of news reporting but also
underscores BCCL's commitment to innovation, excellence, and inclusivity in
journalism. As the model continues to evolve and improve, it promises to further
revolutionize the field of digital journalism, ensuring that high-quality, linguistically
diverse news content remains accessible to all.

Chapter-7
Enhancement of Live Speech-to-Text

7.1 Objective:
To develop and enhance a real-time speech-to-text system that accurately transcribes
spoken language into written text with high precision and minimal latency. The ultimate
goal is to support various applications within BCCL, including live broadcasting,
automated captioning, and assistive technologies for individuals with hearing
impairments, thereby enhancing accessibility and communication across diverse
audiences.

7.2 Introduction:
The ability to accurately transcribe spoken language into written text in real-time has
become an essential technology in various fields, from media broadcasting to assistive
communication tools. As the demand for seamless and accurate live transcription services
increases, there is a growing need to enhance existing speech-to-text systems to achieve
higher precision and reliability. The "Enhancement of Live Speech-to-Text" project at
BCCL focuses on advancing current speech recognition technologies to meet these
demands.

This project aims to develop a robust real-time speech-to-text system that leverages the
latest advancements in machine learning and natural language processing (NLP). By
integrating cutting-edge algorithms and state-of-the-art speech recognition techniques,
the project seeks to improve the accuracy, speed, and reliability of live transcriptions. The
enhanced system will support various applications within BCCL, including live
broadcasting, automated captioning, and assistive technologies for individuals with
hearing impairments.

The project's ultimate goal is to provide a high-quality, accessible solution that enhances
communication and accessibility across diverse audiences, ensuring that spoken content
is effectively and accurately transcribed into written text in real-time.

7.3 Prior Work:


 Integrate the transcription system with live audio input sources, such as microphones.
 Collect a diverse and representative dataset of audio recordings from various sources and
environments.
 Evaluate and select appropriate speech recognition APIs and libraries (e.g., AssemblyAI,
Google Speech-to-Text, Microsoft Azure Speech Service).

7.4 Methodology:

1. Setup and Initialization

Import Libraries:
The AssemblyAI library is imported to facilitate real-time transcription of audio data.
The time library is commented out, indicating it might be used for additional features
like timing or delays, but is not essential for the current implementation.

API Key Configuration:


The API key is set to authenticate and enable access to the AssemblyAI services.

2. Event Handlers Definition


Session Opened:
This function handles the event when a real-time transcription session is opened,
providing the session ID for reference.

Error Handling:
This function handles errors that may occur during the transcription process, printing
the error details for debugging purposes.

Session Closed:
This function handles the event when a transcription session is closed, indicating the
end of the session.

Data Handling:
This function handles incoming transcript data. It first checks if the transcript
contains any text. If it is a final transcript, it prints the text with a new line.
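The data-handling rule above can be sketched independently of the AssemblyAI SDK. The message_type field follows AssemblyAI's real-time message format; the helper name and output callback are our own.

```javascript
// Keep only non-empty, final transcripts, as described in the data handler.
function handleTranscript(transcript, print) {
  if (!transcript.text) return;                     // skip empty payloads
  if (transcript.message_type === 'FinalTranscript') {
    print(transcript.text + '\n');                  // final text, new line
  }
}
```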

3. Transcriber Initialization
A RealtimeTranscriber instance is created with a specified sample rate and the
previously defined event handlers. This instance will manage the real-time
transcription process.

4. Transcriber Connection
The transcriber connects to the AssemblyAI service, initiating the transcription
session.

5. Microphone Stream Setup
A MicrophoneStream object is created to capture audio from the microphone at the
specified sample rate.

6. Streaming Audio to Transcriber


The captured audio stream is passed to the transcriber for real-time processing and
transcription.

7. Closing the Transcription Session


The transcription session is closed, ending the audio stream and transcription process.

Fig 7.1 Code of Real Time STT

7.5 Results:

Fig 7.2 Live Speech to Text

7.6 Conclusion:
The implementation of real-time transcription using AssemblyAI proved to be efficient
and effective. The methodology employed facilitated a robust transcription process,
handling audio data in real-time and managing session events proficiently. The results
demonstrated high accuracy and performance, making it a reliable solution for real-time
audio transcription needs. The overall user experience was positive, with minimal issues
encountered during the testing phase.

Chapter-8
Development of an AI Content Generation Model

8.1 Objective:
To design, develop, and implement an AI-powered content generation model capable of
producing high-quality, contextually relevant, and engaging textual content across
various domains. The model aims to leverage advanced artificial intelligence to
understand and emulate human language patterns, ensuring coherence, creativity, and
accuracy in generated content. The ultimate goal is to enhance productivity, support
creative processes, and meet the diverse content needs of users and organizations.

8.2 Introduction:
The rapid advancements in artificial intelligence (AI) and natural language processing
(NLP) have opened new frontiers in automated content generation. In an era where the
demand for high-quality, contextually accurate, and engaging content is ever-increasing,
the development of AI-powered models capable of generating human-like text has
become a crucial innovation. This project focuses on the development of an AI content
generation model designed to produce coherent and contextually relevant textual content
across various domains.

To accomplish this, we have leveraged the cutting-edge technologies and methodologies
pioneered by Anthropic. By integrating Anthropic's advanced AI frameworks and NLP
algorithms, we aim to create a model that not only understands and replicates human
language patterns but also enhances creativity and productivity in content creation. The
resulting AI model is intended to serve a wide range of applications, from supporting
creative professionals to automating routine content generation tasks in businesses,
thereby addressing the diverse and dynamic needs of modern content consumers.

8.3 Methodology:
1. Setup and Configuration:
Environment Configuration:
 Import necessary libraries including express for server setup and
Anthropic for interaction with the AI model.
 Load environment variables using dotenv to manage sensitive data such as
the API key securely.

API Key Setup:
 Store the API key for Anthropic in environment variables and retrieve it
using process.env.

2. Server Initialization:
Express Application Setup:
 Initialize an Express application and set up the server to listen on a
specified port (e.g., 3000).
 Use the express.text() middleware to handle incoming text data in POST
requests.

3. Anthropic Client Configuration:


Client Initialization:
 Create an instance of the Anthropic client using the retrieved API key for
authentication and interaction with the AI model.

4. Endpoint Creation:
POST Endpoint for Content Generation:
 Define a POST endpoint (/generate) to handle incoming content
generation requests.
 Extract the prompt from the request body and validate it to ensure it is not
empty.

5. Content Generation Logic:


AI Model Interaction:
 Use the client.messages.create method to send the prompt to the AI model
(claude-3-sonnet-20240229) and specify the maximum number of tokens
(1024) for the generated content.
 Structure the message payload with the user role and prompt content.

Response Handling:
 Parse the response from the AI model and filter for text content.
 Concatenate the filtered text content into a coherent response.
 Send the generated content back to the client as a plain text response.
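The request payload and response filtering described above can be sketched as follows. The model name and 1024-token cap come from the text; the helper names are our own.

```javascript
// Build the Anthropic messages payload described in step 5.
function buildClaudeRequest(prompt) {
  return {
    model: 'claude-3-sonnet-20240229',
    max_tokens: 1024,
    messages: [{ role: 'user', content: prompt }],
  };
}

// Keep only text blocks from the response content array and join them.
function extractText(content) {
  return content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
}

// Usage (assumed, Anthropic Node SDK):
// const msg = await client.messages.create(buildClaudeRequest(prompt));
// res.type('text/plain').send(extractText(msg.content));
```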

6. Error Handling:
Request Validation:
 Return a 400 Bad Request status if the prompt is missing or invalid.
Exception Management:
 Catch and log any errors that occur during the content generation process.

 Return a 500 Internal Server Error status if an error occurs while
processing the request.

7. Server Execution:
Server Listening:
Start the Express server and log the URL where the server can be accessed (e.g.,
http://localhost:3000).

Fig 8.1 Code Overview: Handling Prompt Reception and Transmission to the Model

8.4 Results:

Fig 8.2 User Query and Response Utilizing the Anthropic Model

8.5 Conclusion:
1. Utilized Anthropic's model claude-3-sonnet-20240229 for content generation.

2. Parsed and filtered AI responses to extract and return text content.

3. Robust error handling with meaningful error messages and HTTP status codes.

4. Quick response generation, typically within a few seconds.

5. High-quality, coherent, and contextually relevant content.

6. Efficient handling of concurrent requests, maintaining performance.

Chapter-9
Future Work

Project 1: Development of a Copyright Infringement Detection Model


1. Integration with BCCL’s Content Management Systems:
- Integrate the copyright infringement detection model with BCCL’s existing content
management systems (CMS) to streamline the detection process and ensure all content is
automatically checked before publication.

2. Scalability Optimization:
- Enhance the scalability of the detection model to handle the high volume of content
produced by BCCL across its various media outlets, ensuring consistent performance and
reliability.

3. Multilingual Support and Localization:


- Expand the model to support detection of copyright infringement in regional languages
relevant to BCCL’s diverse audience base, ensuring comprehensive coverage across all
content.
- Adapt the tool to account for regional copyright laws and cultural nuances, providing
more accurate and contextually relevant detection.

Project 2: Designing an AI-Powered Research Assistant Model


1. Enhanced Integration with BCCL's Content
- Develop APIs to integrate the research assistant directly with Times Group’s content
repositories, allowing for seamless access to proprietary articles, news stories, and
archives.

2. Personalization and User Profiling


- Implement user profiling to tailor search results based on individual user preferences
and past interactions. This can help deliver more relevant and personalized content.

3. Advanced Natural Language Processing (NLP) Techniques


- Integrate sentiment analysis to provide users with an understanding of the tone and
sentiment of the articles retrieved, helping them to quickly gauge the nature of the
content.

4. Multi-Language Support
- Expand the model to support multiple languages, catering to Times Group’s diverse
audience. This includes integrating translation APIs and developing language-specific
models.

Project 3: Implementation of an AI-Driven Data Analysis Model


1. Advanced Data Visualization and Reporting
- Create customizable dashboards for different user roles within the Times Group,
allowing users to create and save personalized views and reports.

2. Natural Language Processing (NLP) Enhancements


- Enhance the model's NLP capabilities to better understand and respond to complex
queries, including multi-step questions and queries with conditional logic.
- Improve the summarization features to handle large documents more effectively,
providing concise and accurate summaries of lengthy reports and articles.

3. Multi-Language Support and Localization


- Expand the model to support analysis and summarization in multiple languages,
catering to the diverse linguistic needs of Times Group’s audience.

4. Collaboration and User Feedback Integration


- Establish a feedback loop with Times Group’s editorial and data teams to continually
refine and improve the model based on user feedback and real-world usage.
- Develop features that allow multiple users to collaborate on data analysis projects,
sharing insights and visualizations in real-time.

Project 4: Creation of an AI Document Analyzer


1. Multi-Language Support
To cater to a global audience, adding support for multiple languages would be beneficial.
This would involve training the models on a diverse dataset that includes various
languages, enabling the AI Document Analyzer to summarize and answer queries in
languages other than English.

2. Contextual Understanding and Personalization


Enhancing the model's ability to understand context and personalize responses based on
user preferences and history could significantly improve user experience. Implementing
user profiles and adaptive learning algorithms would allow the tool to tailor its
summaries and query responses to individual users, making interactions more relevant
and efficient.

3. Integration with Content Management Systems (CMS)


Integrating the AI Document Analyzer with popular CMS platforms used by Times
Group could streamline document management processes. This would allow for seamless
analysis, summarization, and querying of documents directly within the CMS, enhancing
workflow efficiency.

4. User Feedback and Iterative Improvements


Implementing a feedback mechanism to collect user input on the quality of summaries
and query responses can drive iterative improvements. By analyzing user feedback, the
development team can identify areas for enhancement and continuously refine the AI
Document Analyzer.

5. Collaboration Features
Adding collaboration features that allow multiple users to interact with the same
document simultaneously could enhance team productivity. Features such as shared
annotations, real-time collaboration, and version control would make the AI Document
Analyzer a valuable tool for collaborative document analysis.

Project 5: Design and Development of a Multilingual Speech-to-Text AI Model
1. Integration with Times Group's Digital Platforms
Seamlessly integrating the speech-to-text model with Times Group’s existing digital
platforms, such as content management systems (CMS) and news publishing tools, will
streamline the workflow for journalists. This integration can facilitate automatic
transcription and translation of audio content directly within the CMS, reducing manual
intervention and speeding up the publication process.

2. Enhanced Security and Privacy Measures


Ensuring the security and privacy of the audio data processed by the model is paramount.
Future work should focus on implementing robust encryption methods, secure data
storage solutions, and compliance with data protection regulations to safeguard sensitive
information and maintain user trust.

3. Multi-Modal Integration
Exploring the integration of multi-modal inputs, such as combining speech recognition
with image and text analysis, can enhance the overall functionality of the system. This
approach can provide a more comprehensive understanding of the content, enabling
richer and more accurate transcriptions and translations.

Project 6: Enhancement of Live Speech-to-Text


1. Enhanced Language and Accent Support
Building upon the current capabilities, future enhancements should include support for
more languages and regional accents. This involves collecting and training on diverse
datasets to improve the model's ability to accurately transcribe speech in different
languages and with various accents. This enhancement is crucial for Times Group’s
diverse linguistic audience.

2. Real-Time Translation Integration


Incorporating real-time translation capabilities alongside transcription can greatly benefit
multilingual reporting. By integrating translation APIs, the system can provide instant
translations of transcribed text, allowing for real-time multilingual broadcasts and
accessibility for a global audience.

Project 7: Development of an AI Content Generation Model


1. Multilingual Content Generation
To cater to a global audience, incorporating multilingual support into the AI content
generation model is essential. This will enhance the model’s versatility and broaden its
application across different regions and languages.

2. Real-Time Content Updates and Dynamic Generation


Implementing real-time content updates and dynamic generation capabilities will enable
the model to produce up-to-date content based on current events and trends.

References

1. Copyscape API Guide: https://www.copyscape.com/api-guide.php

2. NeuralNine, Automate Google Search API in Python: https://www.youtube.com/watch?v=TddYMNVV14g&t=139s

3. Jie Jenn, Google Custom Search API Guide: https://youtu.be/D4tWHX2nCzQ?si=dsmZKBEe77H3Mgts

4. Bing Custom Search API Quickstart Guide: Quickstart: Call your Bing Custom Search endpoint using Node.js - Bing Search Services | Microsoft Learn

5. OpenAI Code Interpreter Documentation: Code Interpreter - OpenAI API

6. OpenAI File Search Documentation: File Search - OpenAI API

7. Whisper Speech to Text Guide: Speech to text - OpenAI API

8. Reference GitHub link for PNG to Base64 image conversion: https://gist.github.com/MarkoCen/0ee9437439e00e313926

9. Image to Base64 Conversion Using NPM: https://www.npmjs.com/package/image-to-base64

10. AssemblyAI Streaming Speech to Text Documentation: https://www.assemblyai.com/docs/getting-started/transcribe-streaming-audio-from-a-microphone/python

11. Anthropic Quickstart Guide: https://docs.anthropic.com/en/docs/quickstart-guide

Reflective Diary

29 January, 2024:
Introductory meeting with the manager
Brief introduction to the team
Discussion about different departments
Summary of my previous projects

30 January, 2024:
Provided an overview of the organization's work culture.
Presented the current products and applications they are developing and gave a brief explanation
of my project.
Asked me to explore various foundational AI modules.

31 January, 2024:
Investigated various AI modules, including: AWS Bedrock base models, such as Stability AI,
Claude Model, and Titan Text G1 Express, OpenAI models
Reviewed the OpenAI documentation.

1 February, 2024:
Gained knowledge on using the OpenAI API key.
Understood the various factors influencing the API key and their impact on the model's
performance, including: Temperature, Maximum tokens, Top_p, Number of responses (N),
Frequency penalty, Presence penalty

2 February, 2024:
Investigated various AI image generation models, including: DALL-E, MidJourney, Stability AI,
Imagen, Adobe Firefly
Reviewed the full documentation for each model.
Evaluated the advantages and disadvantages of each model to improve outcomes.

Week 1 (29 January - 2 February, 2024)


- Held introductory meetings with the manager and team, and discussed departments, my
project, and previous work.
- Explored the organization's work culture, current projects, and foundational AI modules,
including AWS Bedrock and OpenAI models.
- Investigated AI image generation models, reviewed documentation, and evaluated model
performance factors and capabilities.

5 February, 2024:
Delivered a presentation outlining the advantages and disadvantages of various image generation
models.
Engaged in a discussion about the AI models that are available for text generation and text
summarization.

6 February, 2024:
Received the details or credentials for the API key.
Provided with a full walkthrough of the code for the proof of concept project we are developing.

7 February, 2024:
Tasked with reviewing the text and images generated by AI models for profanity and copyright
issues.
Researched various AI models and websites for profanity checks, including: writuverify.com,
zakkie.com, membrace.ai, Profanity check application

8 February, 2024:
Created a presentation on the various profanity check models I discovered.
The presentation distinguished the models based on:
- Their accuracy levels
- The number of parameters they were trained on
- The number of training epochs
- The suitability of their APIs for business-related purposes

9 February, 2024:
Assigned to explore the Google Gemini model.
Evaluated AI models including Gemini Ultra, Gemini Pro, GPT-4, GPT-3.5, Claude 2, and
LLaMA-2 on tasks such as math problems, coding challenges, essay-related tasks, and multiple-
choice questions, and assessed their accuracy.
Studied React.
Investigated available models for detecting copyright infringement.

Week 2 (5 February - 9 February, 2024)


- Delivered presentations on image generation and profanity check models, discussing their
advantages, disadvantages, and suitability for business use.
- Reviewed AI-generated content for profanity and copyright issues, researched profanity
check models, and explored various AI text generation models and their capabilities.
- Received API key details, walked through the proof of concept project code, and evaluated
multiple AI models (including Google Gemini, GPT-4, and Claude 2) on diverse tasks,
while studying React and copyright detection models.

12 February, 2024:
Began studying React.
Further researched copyright infringement detection models, including:
- Pex for video and audio content
- TinEye for images
- CopyScape for text
- Pixsy for images
Decided to use TinEye and CopyScape from the researched models.

13 February, 2024:
Investigated how CopyScape operates in Python, including its functions, libraries, and
parameters.
Developed a basic mini chatbot using OpenAI.

14 February, 2024:
Studied Node.js.
Developed the frontend of the project utilizing CSS and JavaScript.

15 February, 2024:
Attempted to implement a profanity check using OpenAI instead of relying on a third-party
application.
Developed code to detect profanity in text.
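One supported route for a profanity check with OpenAI (rather than a third-party service) is the Moderations endpoint; the report does not show the original code, so the sketch below is an assumption. The helper only interprets a moderation-style response, so it can be exercised without an API key.

```python
# Minimal sketch of profanity/abuse screening via a moderation-style result.
# The live API call is shown in comments; the helper is pure and testable.

def flagged_categories(moderation_result):
    """Return the sorted category names that the moderation model flagged."""
    categories = moderation_result.get("categories", {})
    return sorted(name for name, hit in categories.items() if hit)

# With a live key, the result dict would come from something like:
# from openai import OpenAI
# client = OpenAI()
# resp = client.moderations.create(input=text)
# result = resp.results[0].model_dump()

sample = {"flagged": True,
          "categories": {"harassment": True, "hate": False, "violence": True}}
# flagged_categories(sample) returns the flagged names, e.g. harassment, violence
```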

16 February, 2024:
Held a meeting to review the progress of the work.
Explored available APIs for retrieving the source code of text generated by AI models.
Identified two suitable APIs:
- Bing Search API
- Google Custom Search API

Week 3 (12 February - 16 February, 2024)


- Continued studying React and Node.js, and researched copyright infringement detection
models, deciding to use TinEye and CopyScape.
- Developed a basic chatbot using OpenAI and worked on the project's frontend with CSS
and JavaScript.
- Implemented profanity detection using OpenAI and reviewed work progress, while
exploring APIs (Bing Search API and Google Custom Search API) for retrieving source
code of AI-generated text.

19 February, 2024:
Created and deployed a Python model to retrieve the source of text content.
Developed a model that provides the title, link, and snippet for the given query.
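The title/link/snippet retrieval above can be sketched assuming the Google Custom Search JSON API, whose responses carry an "items" array with exactly those fields. The parsing is separated from the network call so it can be tested on a canned response; the API key and engine ID names are placeholders.

```python
# Sketch: extract (title, link, snippet) triples from a Custom Search response.

def extract_results(response_json):
    """Pull title, link, and snippet out of each search result item."""
    return [
        {"title": item.get("title"),
         "link": item.get("link"),
         "snippet": item.get("snippet")}
        for item in response_json.get("items", [])
    ]

# A live query (needs an API key and a search-engine ID) would be roughly:
# import requests
# resp = requests.get("https://www.googleapis.com/customsearch/v1",
#                     params={"key": API_KEY, "cx": ENGINE_ID, "q": query})
# results = extract_results(resp.json())
```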

20 February, 2024:
Recreated the work from the previous day using Node.js.
Started learning Express.js.

21 February, 2024:
Developed copyright infringement detection code utilizing the CopyScape API with both Python
and Node.js.

22 February, 2024:
Employed OpenAI for profanity checking by manually inputting 100 swear words to obtain
results.
Retrieved a tabular dataset from the internet to test with OpenAI and analyze the outcomes.

23 February, 2024:
Created prompts to evaluate the model's use case and identify any existing bugs.
Conducted tests on a 10,000-row dataset using OpenAI, generated results for various potential
queries, and manually compared these results with the dataset.

Week 4 (19 February - 23 February, 2024)


- Developed and deployed models in Python and Node.js for retrieving text content sources,
providing titles, links, and snippets for queries.
- Created copyright infringement detection code using the CopyScape API, and implemented
profanity checking with OpenAI by manually testing 100 swear words.
- Conducted extensive testing with a 10,000-row dataset using OpenAI, evaluating model
performance and identifying bugs through manual comparison of results.

26 February, 2024:
Studied the documentation and researched the implementation of the Bing API.
Conducted extensive testing of the implementation in a controlled testing environment.
Identified and resolved errors through debugging.

27 February, 2024:
Studied the root routes in Express.js.
Gathered a dataset for testing with OpenAI.

28 February, 2024:
Gathered additional datasets related to court cases.
Assessed a court case PDF dataset using OpenAI.
Manually reviewed and analyzed the results and outcomes of the queries.

29 February, 2024:
Examined a dataset of medical papers using OpenAI and assessed the accuracy of the results.

1 March, 2024:
Presented findings on the accuracy and effectiveness of OpenAI in analyzing court cases and
medical report documents.
Investigated available AI tools to enhance the accuracy of the aforementioned analysis.

Week 5 (26 February - 1 March, 2024)


- Implemented and tested the Bing API, resolving errors through debugging, and studied
Express.js root routes.
- Collected and tested datasets, including court cases and medical papers, using OpenAI, and
manually reviewed the results.
- Presented findings on OpenAI's accuracy in analyzing court cases and medical documents,
and researched additional AI tools to improve analysis accuracy.

4 March, 2024:
Deployed the project live using Gemini Pro, applying the same foundational concept.
Implemented the solution using Python and Node.js.

5 March, 2024:
Conducted testing with Gemini Pro.
Developed multiple prompts to evaluate its accuracy and delivered a presentation.
Highlighted the incorrect outputs generated by the model and compared its precision with other
models we used.

6 March, 2024:
Studied Express.js.
Tried to use the OpenAI API key to develop a model for analyzing Excel files.

7 March, 2024:
Began studying the OpenAI Assistant API, including reviewing its documentation and functionality.
Developed a minor project utilizing the OpenAI Assistant API.

8 March, 2024:
Attempted to analyze tabular data using the Assistant APIs.
Developed a workflow for the aforementioned model.

Week 6 (4 March - 8 March, 2024)


- Deployed the project using Gemini Pro in Python and Node.js, and tested its accuracy through
multiple prompts, comparing its performance with other models.
- Continued studying Express.js and OpenAI Assistant API keys, developing minor projects and
workflows for analyzing tabular data and Excel files.

11 March, 2024:
Assigned to develop a model for analyzing writing patterns and styles from text extracted from
URLs.
Conducted web scraping to obtain paragraph text from URLs.

12 March, 2024:
Developed a model using an OpenAI API key to perform the previously mentioned task.
Encountered multiple errors during the process.
Managed to fix some of the errors.

13 March, 2024:
Fixed all errors from the previous day and attempted to implement the model using the OpenAI
Assistant API key.
The model operated successfully, accomplishing all tasks, and was submitted for review.

14 March, 2024:
Investigated the OpenAI Assistant API keys.
Studied function calling, knowledge retrieval, and code interpretation.

15 March, 2024:
Assigned to create a model for analyzing text in PDF or Word documents, summarizing the
model's output and addressing the user's query.
Researched models utilizing knowledge retrieval and vector databases, gaining insights into their
backend operations.
Created a flowchart for navigation.

Week 7 (11 March - 15 March, 2024)


- Developed a model for analyzing writing patterns from web-scraped text using OpenAI API
and successfully implemented it with the OpenAI Assistant API key.
- Investigated OpenAI Assistant API functionalities and researched models for text analysis
in PDF/Word documents, creating a flowchart for backend operations.

18 March, 2024:
Started developing the PDF document analysis model using the Assistant APIs and the
knowledge retrieval model with the OpenAI API key.
Created a Python model to perform the described operation.

19 March, 2024:
Fixed all compiler and API errors encountered by the model.
Tested the model with over 70 different PDF files.
Measured the model's accuracy.

20 March, 2024:
Showcased the model to my manager, detailing its advantages, disadvantages, and use cases.
Was then asked to convert the model to Node.js code for frontend deployment.

21 March, 2024:
Translated the code to Node.js and confirmed its functionality.
Once the results were satisfactory, the hardcoded file was converted into an Express.js route and tested using POSTMAN.
Submitted the code for further testing by senior management to identify and address any bugs.

22 March, 2024:
Tasked with researching and understanding the Anthropic Models, and comparing their accuracy
to other similar models.
Reviewed the documentation and operation of the Anthropic API.

Week 8 (18 March - 22 March, 2024)


- Developed and tested a PDF document analysis model using OpenAI API and Assistant APIs,
measuring its accuracy and showcasing it to the manager.
- Converted the model to Node.js for frontend deployment, tested it with Express.js and
POSTMAN, and submitted it for further testing.
- Researched Anthropic Models, comparing their accuracy to other models, and reviewed the
Anthropic API documentation.

26 March, 2024:
Created two text generation models using Anthropic API keys: one for streaming messages and
another for generating simple response messages.
Evaluated the functionality of the existing text generation model against the Anthropic model,
analyzing the content quality and response accuracy in both models.

27 March, 2024:
Corrected the issue where the PDF Analyzer model failed to display results in POSTMAN.
Fixed the API bug.
Tested the fix using over 100 papers and 300 questions to ensure the bug was resolved.

28 March, 2024:
Persisted with testing and successfully resolved the bug.
Submitted the code for further production, preparing to go live on the platform.

29 March, 2024:
Assigned to research various audio-to-text converter models, transcribe the audio files, and
format the text into coherent paragraphs.
Investigated the OpenAI Whisper model, Google Cloud speech-to-text model, and the AWS
model for similar tasks.
Compiled and presented the advantages and disadvantages of each model.
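The formatting step mentioned above (turning a flat transcript into coherent paragraphs) can be sketched independently of whichever speech-to-text model produces the text. The sentences-per-paragraph count is an arbitrary illustrative choice, not the project's actual rule.

```python
import re

def to_paragraphs(transcript, sentences_per_paragraph=3):
    """Group a flat transcript into paragraphs of a few sentences each."""
    # Split after sentence-ending punctuation, dropping empty fragments.
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", transcript.strip())
                 if s.strip()]
    n = sentences_per_paragraph
    # Join consecutive sentence groups with blank lines between paragraphs.
    return "\n\n".join(" ".join(sentences[i:i + n])
                       for i in range(0, len(sentences), n))
```

The transcript itself would come from any of the models under evaluation, e.g. Whisper's transcription output.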

Week 9 (26 March - 29 March, 2024)


- Created and evaluated two text generation models using the Anthropic API, comparing
their performance with an existing model.
- Fixed and tested the PDF Analyzer model, resolving an API bug and ensuring
functionality with extensive testing before preparing it for production.
- Researched audio-to-text converter models, including OpenAI Whisper, Google Cloud,
and AWS models, and presented their advantages and disadvantages.

1 April, 2024:
Developed a speech-to-text model utilizing AWS Transcribe services.
Fixed the issues related to connecting the model with the microphone.

2 April, 2024:
Created a basic frontend website to utilize the AWS Transcribe functionality.
Connected the frontend interface with the backend model.

3 April, 2024:
Created a model for transcription and translation using the OpenAI Whisper model.
Attempted to develop a live speech-to-text model.

4 April, 2024:
Investigated the Google model and AssemblyAI models for live speech-to-text transcription.
Began developing logic using Python's built-in functions.

5 April, 2024:
Developed a live speech-to-text model using Python's built-in libraries, but it was rejected because the results, although accurate, arrived with a noticeable delay.
Tested the Amazon STT model, which also produced lagging outputs.

Week 10 (1 April - 5 April, 2024)


- Developed a speech-to-text model using AWS Transcribe services and created a basic
frontend to utilize this functionality.
- Created a transcription and translation model using the OpenAI Whisper model and
attempted to implement a live speech-to-text model.
- Investigated Google and AssemblyAI models for live transcription and developed a live
speech-to-text model using Python, but faced issues with delayed outputs, testing both
Python's libraries and Amazon STT with similar lagging results.

8 April, 2024:
Developed the model using AssemblyAI API keys.
Fixed several microphone-related issues and other bugs.

9 April, 2024:
Began developing a new model to process Excel sheets, providing summaries and responding to
user queries based on the input data.

10 April, 2024:
Began investigating available APIs and models for similar functionalities.
Fixed network latency issues in the PDF analysis model.

11 April, 2024:
Examined the documentation to understand the code interpreter functionality provided by
OpenAI Assistant APIs.
Began coding to implement this functionality.

12 April, 2024:
Started coding to implement the summary functionality for the model.
Fixed new issues that arose in the CopyScape model.

Week 11 (8 April - 12 April, 2024)


- Developed a model using AssemblyAI API keys, fixing microphone and other bugs.
- Started developing a model to process Excel sheets for summaries and queries, and
investigated APIs for similar functionalities.
- Resolved network latency issues in the PDF analysis model and began implementing the
code interpreter functionality with OpenAI Assistant APIs.
- Worked on implementing summary functionality and fixed issues in the CopyScape model.

15 April, 2024:
Submitted the code to summarize the Excel file.
Began working on extracting meta-attributes.
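The report does not show the original meta-attribute extraction code (the project later leaned on OpenAI Assistant tooling), so the following is a standard-library sketch of the kind of attributes involved, assuming the Excel sheet has been exported to CSV.

```python
import csv
import io
import statistics

def meta_attributes(csv_text):
    """Return column names, row count, and min/max/mean for numeric columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {"columns": [], "row_count": 0, "numeric": {}}
    numeric = {}
    for col in rows[0]:
        try:
            values = [float(r[col]) for r in rows]
        except (TypeError, ValueError):
            continue  # skip non-numeric columns
        numeric[col] = {"min": min(values), "max": max(values),
                        "mean": statistics.mean(values)}
    return {"columns": list(rows[0]), "row_count": len(rows), "numeric": numeric}
```

These attributes (schema, size, per-column ranges) are the sort of context a summarization prompt can be seeded with.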

16 April, 2024:
Removed unnecessary content and tags from the output and paraphrased it for summarization.
Submitted the code for extracting meta-attributes.

17 April, 2024:
Translated the summarization and meta-attributes code into Express.js.
Tested the code with over 50 datasets, each containing more than 200,000 rows.

18 April, 2024:
Received approval for the summarization and meta-attributes code.
Connected the approved models to the frontend interface.

19 April, 2024:
Began developing functionality to generate graphs as output.
Investigated methods to create graphs from OpenAI based on Excel sheet data and integrate them
with the frontend.
Explored ways to display the generated graphs in the frontend interface.

Week 12 (15 April - 19 April, 2024)


- Developed and submitted code for summarizing Excel files and extracting meta-attributes,
then translated it into Express.js and tested with large datasets.
- Received approval for the summarization and meta-attributes code, connected it to the
frontend, and began developing functionality to generate and display graphs based on Excel
data using OpenAI.

22 April, 2024:
Began creating prompts to generate accurate graphs.
Reviewed the Assistant API documentation for guidance.

23 April, 2024:
Developed code to generate 3 graphs and output them in PNG format.

24 April, 2024:
Modified the code to improve data visualizations.
Further paraphrased the output from the summarization model for better clarity.

25 April, 2024:
Learned the method to display PNG images in Postman for testing purposes.
Fixed the issue of incorrect image generation.

26 April, 2024:
Developed a code to provide a concise summary of the graph's display, including the underlying
logic, a brief overview of the graph, and the trends identified from the analysis.

Week 13 (22 April - 26 April, 2024)


- Created prompts and developed code to generate accurate graphs in PNG format, improving
data visualizations and paraphrasing summarization model output for clarity.
- Learned how to display PNG images in Postman and fixed issues with incorrect image
generation.
- Developed code to provide concise summaries of graphs, including underlying logic, brief
overviews, and identified trends.

29 April, 2024:
Eliminated irrelevant content from the text output and organized the content according to the
graph model's sequence.
Created a summary with appropriate line breaks, ensuring no profanity was included.

30 April, 2024:
Began the process of converting graphs from PNG format to Base64 format.
Sought guidance and hints from GitHub, YouTube, and Google.

2 May, 2024:
Created the code to convert the graph into Base64 format.
Fixed all the linking errors in the model.
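The PNG-to-Base64 conversion above is a small, self-contained step: read the image bytes and encode them so the graph can travel inside a JSON response to the frontend.

```python
import base64

def png_to_base64(path):
    """Encode a PNG file's bytes as a Base64 ASCII string."""
    with open(path, "rb") as fh:
        return base64.b64encode(fh.read()).decode("ascii")

def base64_to_png(b64_string, path):
    """Decode a Base64 string back to a PNG file (useful for verification)."""
    with open(path, "wb") as fh:
        fh.write(base64.b64decode(b64_string))
```

On the frontend, such a string is typically embedded as a data URI, i.e. prefixed with `data:image/png;base64,` in an `img` tag's `src`.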

3 May, 2024:
Verified the code for graph generation and its text output using POSTMAN.
Performed testing on more than 25 documents.
Created two sub-models:
- A model that randomly selects 2 graphs from the sheet.
- A model that allows users to query any graph and receive it as output.

Week 14 (29 April - 3 May, 2024)


- Refined text output by removing irrelevant content and organizing it according to the graph
model, ensuring profanity-free summaries with appropriate formatting.
- Converted graphs from PNG to Base64 format, fixing linking errors and verifying the code
using POSTMAN.
- Tested the graph generation and text output, and developed two sub-models: one for random
graph selection and another for user-query-based graph retrieval.

6 May, 2024:
Completed the second phase of the model, focusing on obtaining graph outputs based on user
queries.
Successfully developed the prompt and submitted the model for evaluation.

7 May, 2024:
Resolved graph-related issues as per my manager's feedback.
Improved the prompt by making it more concise and providing clearer explanations.

8 May, 2024:
Integrated the graph model with the frontend interface.

9 May, 2024:
Began developing the next section of the module to handle user queries about the Excel
document and compute the corresponding output.
Investigated the built-in Python tools available for performing these computations.

10 May, 2024:
Began developing the module utilizing Python's built-in libraries such as NumPy and Pandas.
Reviewed the documentation for the Bing API.

Week 15 (6 May - 10 May, 2024)


- Completed the user query-based graph output model, refined prompts, and submitted it for
evaluation, resolving issues based on feedback and integrating it with the frontend interface.
- Started developing a module to handle user queries about Excel documents, using Python's
built-in libraries like NumPy and Pandas, and reviewed Bing API documentation for
additional functionality.

13 May, 2024:
Attempts to resolve the issue with the Python libraries were unsuccessful, so the Assistant API was employed to address the problem.
Successfully developed code to handle user queries.

14 May, 2024:
Conducted extensive testing of the model using more than 70 documents and over 210 questions.

15 May, 2024:
Integrated the backend model with the frontend website.
Successfully finished all tasks related to the tabular format data.

16 May, 2024:
Began utilizing the BING API to achieve functionality similar to Google Custom Search, but
with searches restricted to the Times of India website.
Addressed and resolved issues related to API keys and custom configuration ID.

17 May, 2024:
Developed code to utilize the BING API, successfully obtaining output in the form of URLs to
websites along with brief summaries.
Submitted the code for review.
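The site-restricted search above can be sketched against the Bing Web Search v7 API, using the standard `site:` query operator for the restriction. The exact domain string and key name below are illustrative assumptions, and the request-building and response-parsing are kept pure so they can be tested without a subscription.

```python
def build_bing_request(query, site="timesofindia.indiatimes.com",
                       subscription_key="YOUR_KEY", count=5):
    """Return (url, headers, params) for a site-restricted Bing web search."""
    return (
        "https://api.bing.microsoft.com/v7.0/search",
        {"Ocp-Apim-Subscription-Key": subscription_key},
        {"q": f"site:{site} {query}", "count": count},
    )

def extract_pages(response_json):
    """Pull (url, snippet) pairs from a Bing webPages response."""
    pages = response_json.get("webPages", {}).get("value", [])
    return [{"url": p.get("url"), "snippet": p.get("snippet")} for p in pages]

# A live call would be e.g.:
# import requests
# url, headers, params = build_bing_request("budget 2024")
# results = extract_pages(requests.get(url, headers=headers, params=params).json())
```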

Week 16 (13 May - 17 May, 2024)


- Resolved issues with Python libraries using the Assistant API, developed code to handle
user queries, and extensively tested the model with numerous documents and questions.
- Integrated the backend model with the frontend website, completing tasks related to tabular
data.
- Utilized the BING API for restricted searches on the Times of India website, developing
and submitting code to obtain URLs and brief summaries.

20 May, 2024:
Demonstrated the gamma.ai website.
Discussed upgrading from GPT-4 to GPT-4o and the necessary modifications to our model to improve speed.

21 May, 2024:
Converted all PDF analysis models into sub-modules.
Modified the code to accommodate the upgrades for Data Analysis Model.

22 May, 2024:
Enhanced the Document Analyzer with upgrades.
Incorporated web scraping to extract text from any URL.
Implemented NLP techniques to analyze writing patterns of the scraped text.
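The paragraph-scraping step above can be sketched using only the standard library (the project may well have used requests/BeautifulSoup instead): an `HTMLParser` subclass that collects the text of every `<p>` element.

```python
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collect the whitespace-normalized text content of every <p> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # current nesting level inside <p> tags
        self.paragraphs = []
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                text = " ".join("".join(self._buf).split())
                if text:
                    self.paragraphs.append(text)
                self._buf = []

    def handle_data(self, data):
        if self.depth:
            self._buf.append(data)

def scrape_paragraphs(html):
    """Return the list of paragraph texts found in an HTML document."""
    parser = ParagraphExtractor()
    parser.feed(html)
    return parser.paragraphs

# Fetching the page would be e.g.:
# import urllib.request
# html = urllib.request.urlopen(url).read().decode("utf-8", "replace")
```

The extracted paragraphs then feed the writing-pattern analysis described above.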

23 May, 2024:
Utilized the GPT-4o model to analyze writing patterns in conjunction with NLP techniques.
Attempted to embed the analyzed patterns into Google Docs.

24 May, 2024:
Developed a model to scrape text from any URL.
Incorporated translation for text in languages other than English.
Embedded the translated text into Google Docs.

Week 17 (20 May - 24 May, 2024)


- Discussed upgrading to GPT-4o and made necessary modifications to improve model
speed, converting PDF analysis models into sub-modules and enhancing the Document
Analyzer.
- Implemented web scraping and NLP techniques to analyze writing patterns, integrating
these with the GPT-4o model and embedding the results into Google Docs.
- Developed a model for web scraping text from any URL, including translation for
non-English text, and successfully embedded the translated text into Google Docs.

