A modern RAG (Retrieval-Augmented Generation) application for interactive PDF document querying
Features • Architecture • Tech Stack • Installation • Usage • Development • License
PdfChat is a full-stack PDF chat application: users upload PDF files, the backend processes them into vector embeddings stored in a Qdrant vector database, and a conversational AI interface lets users query the content. The system uses local machine learning models for embeddings and OpenRouter for chat completions.
- 📄 PDF Management: Upload and process multiple PDFs with streamlined interface
- 🔍 Semantic Search: AI-powered search through your documents using natural language questions
- 💬 Conversational AI: Chat with your documents using OpenRouter's GPT-4o integration
- 🧠 Local Embeddings: Generate vector embeddings locally using Xenova Transformers
- 🚀 Real-time Updates: Track PDF processing with visual progress indicators
- 🔐 Authentication: Secure user management through Clerk authentication
- 🎨 Modern UI: Clean, responsive design with glassmorphism and dark mode
```
┌─────────────┐       ┌─────────────┐       ┌─────────────┐
│  Frontend   │◄─────►│   Backend   │◄─────►│ Vector Store│
│  (Next.js)  │       │  (Node.js)  │       │  (Qdrant)   │
└─────────────┘       └─────────────┘       └─────────────┘
       ▲                     │                     ▲
       │                     ▼                     │
       │              ┌─────────────┐              │
       │              │  Job Queue  │              │
       │              │  (BullMQ)   │              │
       │              └─────────────┘              │
       │                     │                     │
       │                     ▼                     │
       │              ┌─────────────┐              │
       └─────────────►│   LLM API   │◄─────────────┘
                      │ (OpenRouter)│
                      └─────────────┘
```
- User uploads a PDF through the frontend
- Backend queues the PDF for processing using BullMQ
- Worker processes the PDF (see the sketch after this list):
- Extracts text
- Splits into chunks
- Generates embeddings using Xenova Transformers
- Stores vectors in Qdrant
- User asks questions through the chat interface
- Backend retrieves relevant document chunks from Qdrant
- Retrieved context is sent to OpenRouter along with the query
- AI-generated response is displayed to the user
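The ingestion half of this flow runs inside the BullMQ worker. The sketch below is a minimal illustration rather than the repository's actual `worker.js`: the queue name, collection name, and chunking parameters are assumptions, and the packages shown are one plausible combination of LangChain loaders, Xenova Transformers, and the Qdrant client.

```js
// Minimal sketch of the ingestion pipeline inside the BullMQ worker (illustrative only).
// Queue name, collection name, and chunking parameters are assumptions.
import { Worker } from "bullmq";
import { randomUUID } from "node:crypto";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { pipeline } from "@xenova/transformers";
import { QdrantClient } from "@qdrant/js-client-rest";

const qdrant = new QdrantClient({ url: "http://localhost:6333" });
// Load the local embedding model once; all-MiniLM-L6-v2 yields 384-dimensional vectors.
const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

new Worker(
  "pdf-processing",
  async (job) => {
    // 1. Extract text from the uploaded PDF
    const docs = await new PDFLoader(job.data.path).load();

    // 2. Split the text into overlapping chunks
    const splitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 200 });
    const chunks = await splitter.splitDocuments(docs);

    // 3. Embed each chunk locally with Xenova Transformers
    const points = [];
    for (const chunk of chunks) {
      const output = await embed(chunk.pageContent, { pooling: "mean", normalize: true });
      points.push({
        id: randomUUID(),
        vector: Array.from(output.data),
        payload: { text: chunk.pageContent, source: job.data.filename },
      });
    }

    // 4. Store the vectors in Qdrant (assumes the collection already exists)
    await qdrant.upsert("pdf_chunks", { wait: true, points });
  },
  { connection: { host: "localhost", port: 6379 } } // Valkey started by docker-compose
);
```

Embedding locally keeps document text off third-party APIs; only the retrieved chunks are later sent to the LLM.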
- Framework: Next.js 15 with App Router and React 19
- UI Components: Custom components with Tailwind CSS
- Authentication: Clerk integration
- State Management: React hooks
- Runtime: Node.js with Express 5
- Vector Database: Qdrant (via Docker)
- Queue System: BullMQ with Valkey (Redis alternative)
- Document Processing: LangChain document loaders and text splitters
- Embeddings: Local Xenova Transformers (all-MiniLM-L6-v2)
- LLM Provider: OpenRouter with GPT-4o
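As a rough illustration of how these pieces meet on the query side, the sketch below embeds a question, searches Qdrant, and forwards the retrieved context to GPT-4o through OpenRouter's OpenAI-compatible endpoint. The route path, port, collection name, and prompt are assumptions, not the repository's actual `index.js`.

```js
// Sketch of the query side (illustrative; route, port, and collection name are assumed).
import express from "express";
import OpenAI from "openai";
import { pipeline } from "@xenova/transformers";
import { QdrantClient } from "@qdrant/js-client-rest";

const app = express();
app.use(express.json());

const qdrant = new QdrantClient({ url: "http://localhost:6333" });
const embed = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");
// OpenRouter exposes an OpenAI-compatible API, so the openai SDK can point at it.
const llm = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENAI_API_KEY,
});

app.post("/chat", async (req, res) => {
  const { question } = req.body;

  // Embed the question and retrieve the most similar chunks from Qdrant
  const queryVector = await embed(question, { pooling: "mean", normalize: true });
  const hits = await qdrant.search("pdf_chunks", {
    vector: Array.from(queryVector.data),
    limit: 5,
  });
  const context = hits.map((h) => h.payload.text).join("\n---\n");

  // Send the retrieved context plus the question to GPT-4o via OpenRouter
  const completion = await llm.chat.completions.create({
    model: "openai/gpt-4o",
    messages: [
      { role: "system", content: `Answer using only the following context:\n${context}` },
      { role: "user", content: question },
    ],
  });

  res.json({
    answer: completion.choices[0].message.content,
    sources: hits.map((h) => h.payload.source),
  });
});

app.listen(8000);
```

Because OpenRouter speaks the OpenAI API, switching models is a one-line change to the `model` field.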
- Node.js (v18+)
- Docker and Docker Compose
- OpenRouter API key
- Hugging Face API key (optional)
```bash
git clone https://github.com/yourusername/pdfchat.git
cd pdfchat
docker-compose up -d
```
This starts:
- Qdrant on port 6333 (UI dashboard available at http://localhost:6333/dashboard)
- Valkey (Redis alternative) on port 6379
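For reference, a minimal `docker-compose.yml` providing those two services might look like the following; the repository's actual file, image tags, and volume names may differ.

```yaml
# Illustrative sketch; the repo's actual docker-compose.yml may differ.
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"   # REST API and dashboard
    volumes:
      - qdrant_data:/qdrant/storage
  valkey:
    image: valkey/valkey
    ports:
      - "6379:6379"   # Used by BullMQ

volumes:
  qdrant_data:
```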
For the server:
```bash
cd server
# Create .env file with necessary API keys and configuration
```
Required server environment variables:
- `OPENAI_API_KEY`: Your OpenRouter API key
- `HUGGINGFACE_API_KEY`: Your Hugging Face API key (optional)
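A minimal `server/.env` would therefore look like this (values are placeholders):

```
OPENAI_API_KEY=your-openrouter-api-key
HUGGINGFACE_API_KEY=your-huggingface-api-key
```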
For the client (if using authentication):
```bash
cd client
# Create .env.local with Clerk credentials
```
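Assuming Clerk's standard Next.js variable names (check your Clerk dashboard for the exact keys), `client/.env.local` would contain something like:

```
NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY=your-clerk-publishable-key
CLERK_SECRET_KEY=your-clerk-secret-key
```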
Backend:
```bash
cd server
npm install
npm run dev          # Start the main server
npm run dev:worker   # In a separate terminal, start the worker
```
Frontend:
```bash
cd client
npm install
npm run dev
```
- Navigate to http://localhost:3000 in your browser
- Upload a PDF document
- Wait for processing to complete (observe the progress indicator)
- Start asking questions about your document in the chat interface
- View AI-generated responses with source references
```
pdfchat/
├── client/                # Frontend (Next.js)
│   ├── app/               # App router structure
│   ├── components/        # Reusable UI components
│   └── lib/               # Utility functions
├── server/                # Backend (Node.js)
│   ├── index.js           # Express server
│   ├── worker.js          # BullMQ worker for PDF processing
│   └── uploads/           # Temporary PDF storage
└── docker-compose.yml     # Docker config for infrastructure
```
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Distributed under the MIT License. See `LICENSE` for more information.
- LangChain for document processing
- Qdrant for vector storage
- OpenRouter for AI completions
- Xenova Transformers for local embeddings
- Next.js for the frontend framework
- Clerk for authentication