Chatbot Documentation Task

The document outlines various online chatbot platforms that support PDF uploads and Q&A, including ChatGPT, ChatPDF, Humata.ai, AskYourPDF, and SciSummary, each with unique features, pros, and cons. Additionally, it describes a custom chatbot system design that involves components like PDF parsing, text chunking, embedding, and user query processing to enable interactive document reading and question answering. The outcome aims to create a user-friendly chatbot for applications in education, legal, healthcare, and research sectors.

Uploaded by

1508madhav
Copyright
© All Rights Reserved
Task 1: Identify Existing Online Chatbot Platforms That Support PDF Upload and Q&A

1. ChatGPT (with File Upload Support)

Platform: chat.openai.com

Features:

• Users can upload PDF documents directly into the chat (available with ChatGPT Plus and Pro accounts using GPT-4).
• GPT-4 scans and reads the file content.
• Users can ask questions in natural language, and ChatGPT responds based on the document content.
• Handles long-form documents and multiple files.

Pros:

• Highly accurate answers.
• Good contextual understanding of documents.
• Ability to reference specific sections, tables, or figures.

Cons:

• Not free (requires paid subscription).
• Sometimes has limitations with scanned or image-based PDFs.

2. ChatPDF

Website: https://www.chatpdf.com

Features:

• Upload a PDF, and the system creates a chatbot interface to interact with the document.
• Extracts and summarizes content.
• Allows natural language Q&A.
• Works well for research papers, textbooks, and reports.

Pros:

• Free tier available.
• Simple and fast user interface.
• Good for academic documents.

Cons:

• File size limit on free plan.
• May struggle with complex formatting or visual data like charts.
• Limited memory/context window.

3. Humata.ai

Website: https://www.humata.ai

Features:

• Upload PDFs and chat with them instantly.
• Designed for deep understanding of documents, especially research papers, contracts, and legal documents.
• Can generate summaries and explain complex sections.

Pros:

• Attractive, modern interface.
• Great for technical papers and scientific content.
• Offers citation-based answers.

Cons:

• Requires signup.
• Free plan has restrictions on usage and file size.


4. AskYourPDF

Website: https://askyourpdf.com

Features:

• Drag and drop PDFs to create a chat interface.
• Uses embeddings and LLMs to find the relevant sections and answer from them.
• Chrome extension also available.

Pros:

• Easy to use and fast setup.
• Can integrate with ChatGPT API.
• Document history is saved for future access.

Cons:

• Some limitations in free version.
• May misinterpret layout-heavy PDFs.

5. SciSummary

Website: https://www.scisummary.com

Features:

• Tailored for summarizing scientific papers.
• Supports PDF upload, summarization, and question answering.
• Generates concise outputs for academic content.

Pros:

• Highly specialized for research papers.
• Useful for students, researchers, and academics.

Cons:

• Not general-purpose (best for scientific PDFs).
• Limited interactivity compared to other tools.


Task 2: Building a Custom Chatbot System
Objective:
Design and implement a chatbot capable of reading PDF documents, extracting key information, and answering user questions interactively.

System Components:

1. PDF Parser:

o Use Python libraries like PyMuPDF, pdfplumber, or pdfminer.six to extract clean text from PDFs.

o Handle multi-column layouts, and apply OCR to scanned or image-based pages if necessary.
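
The parsing step can be sketched as follows. This is a minimal illustration, assuming PyMuPDF is installed (it is imported as `fitz`); the function names `extract_pdf_text` and `clean_extracted_text` are hypothetical, and the clean-up rules are simple heuristics rather than a complete solution. Multi-column layouts and OCR are not handled here.

```python
import re


def extract_pdf_text(path: str) -> str:
    """Return the raw text of every page in the PDF at `path`."""
    import fitz  # PyMuPDF, imported lazily so the cleaner below works without it

    doc = fitz.open(path)
    try:
        return "\n".join(page.get_text() for page in doc)
    finally:
        doc.close()


def clean_extracted_text(raw: str) -> str:
    """Tidy raw PDF text with two heuristics (an assumption, not a standard)."""
    # Rejoin words hyphenated across a line break: "seman-\ntic" -> "semantic".
    # Only lowercase letters are joined, so genuine hyphens before digits or
    # capitals are left alone; real compounds split at a line end can still be
    # merged by mistake.
    text = re.sub(r"([a-z])-[ \t]*\n[ \t]*([a-z])", r"\1\2", raw)
    # Treat blank lines as paragraph breaks and collapse whitespace inside
    # each paragraph.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

A typical call would be `clean_extracted_text(extract_pdf_text("report.pdf"))`, where the file name is a placeholder.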

2. Text Chunking and Embedding:

o Split the extracted text into manageable chunks (e.g., using sentence or paragraph breaks).

o Convert text into embeddings using models like OpenAI's text-embedding-ada-002, Sentence-BERT, or HuggingFace Transformers.
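
One possible way to do the chunking is shown below. This is a sketch only: the function name `chunk_text`, the default sizes, and the sentence-splitting regex are assumptions, and `max_chars` is a soft target (a chunk can exceed it when a single sentence or the carried-over overlap is long). Each resulting chunk would then be passed to the chosen embedding model, which is not shown.

```python
import re


def chunk_text(text: str, max_chars: int = 500, overlap: int = 1) -> list[str]:
    """Group sentences into chunks of roughly `max_chars` characters.

    The last `overlap` sentences of a chunk are repeated at the start of the
    next one so that context is not cut off at a chunk boundary.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            # Close the current chunk and carry the overlap sentences forward.
            chunks.append(" ".join(current))
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Overlap trades some duplicated storage for better recall when an answer straddles two chunks; setting `overlap=0` disables it.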

3. Vector Store:

o Store embeddings in a vector database like FAISS or ChromaDB for similarity search.

4. User Query Processing:

o When a user asks a question, convert the query into an embedding.

o Perform semantic search in the vector database to retrieve the most relevant text chunks.
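
The store-and-search idea in components 3 and 4 can be illustrated with a small in-memory example. Everything here is a stand-in: `toy_embed` uses word counts instead of a real embedding model, so it matches shared words rather than meaning, and `TinyVectorStore` is a hypothetical class that plays the role FAISS or ChromaDB would play in the actual system. Only the ranking by cosine similarity carries over.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        # Embed once at insert time and keep the vector next to the text.
        self._items.append((chunk, toy_embed(chunk)))

    def search(self, query: str, k: int = 3) -> list[str]:
        # Embed the query the same way, then rank chunks by similarity.
        query_vec = toy_embed(query)
        scored = [(cosine(query_vec, vec), chunk) for chunk, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]
```

In use, each chunk is added once with `store.add(chunk)`, and every user question is answered from `store.search(question, k=3)`.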

5. Answer Generation:

o Feed the retrieved chunks and question into an LLM (e.g., OpenAI GPT-4 or HuggingFace model) to generate a contextual answer.
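
Feeding the chunks and the question to the model usually comes down to assembling one prompt. The sketch below shows one way to do that; the function name `build_prompt` and the instruction wording are assumptions, and the call to the LLM itself is left out because it depends on the provider chosen.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into a single prompt."""
    # Number the excerpts so the model can point to the one it used.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the numbered excerpts below. "
        "If the excerpts do not contain the answer, say so.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Restricting the model to the supplied excerpts is a common way to reduce answers that are not grounded in the uploaded PDF.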

6. Chat Interface:

o Create a frontend using Streamlit, Gradio, or a web-based chatbot UI.

o Allow users to upload PDFs and interact via chat.

Tools and Libraries Used:


• Python
• PyMuPDF / pdfplumber
• LangChain or LlamaIndex
• OpenAI API or HuggingFace Transformers
• FAISS / Chroma
• Streamlit / Gradio

Outcome

• A functioning chatbot that accurately reads PDFs and answers questions in real time.
• Easy-to-use interface with upload, chat, and response features.
• Potential applications in education, legal, healthcare, and research industries.
