IntelliDocs is an AI-powered document analyzer that lets you upload documents (PDFs, images, DOCX, TXT) and ask natural-language questions about their content. The application runs OCR on scanned documents and uses an AI model to answer questions about document content, including word counts, line locations, and summaries.
- Multi-Format Support: Upload PDFs, images (JPG, PNG, GIF, BMP, WEBP), DOCX, TXT, PPT, and PPTX files
- Serverless OCR: Extract text from scanned PDFs and images using Tesseract.js (WASM-based, no system dependencies)
- AI-Powered Analysis: Ask natural language questions about your documents using GPT-4o-mini
- Cloud Storage: Secure file storage and metadata management using Supabase
- Intelligent Text Extraction:
  - Native PDF text extraction with OCR fallback
  - DOCX text extraction using Mammoth
  - Image-to-text conversion
- Advanced Queries:
  - Count word occurrences
  - Find specific line numbers
  - Summarize content
- Modern UI: Beautiful, responsive interface built with React, TailwindCSS, and Shadcn UI
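The word-count and line-number queries listed above can be illustrated with plain string handling. The following is a minimal sketch, not the project's actual implementation: the function names `countOccurrences` and `findLineNumbers` are hypothetical, and it assumes case-insensitive matching on plain words.

```typescript
// Hypothetical helpers: names and matching rules are assumptions, not IntelliDocs code.

/** Escape characters that have a special meaning inside a regular expression. */
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/** Count whole-word, case-insensitive occurrences of `word` (assumes a plain word). */
function countOccurrences(text: string, word: string): number {
  if (word.trim() === "") return 0;
  const pattern = new RegExp(`\\b${escapeRegExp(word)}\\b`, "gi");
  return (text.match(pattern) ?? []).length;
}

/** Return the 1-based numbers of the lines that contain `phrase`, ignoring case. */
function findLineNumbers(text: string, phrase: string): number[] {
  if (phrase === "") return [];
  const needle = phrase.toLowerCase();
  const result: number[] = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (line.toLowerCase().includes(needle)) result.push(index + 1);
  });
  return result;
}
```

Deterministic helpers like these could complement the AI model, since exact counts and line positions are easier to compute directly than to ask a language model for.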
- Next.js 15 - React framework with App Router
- React 19 - UI library
- TypeScript - Type-safe JavaScript
- TailwindCSS 4 - Utility-first CSS framework
- Shadcn UI - Reusable components
- Supabase - PostgreSQL Database & File Storage
- Next.js API Routes - Serverless API endpoints
- OpenAI API - AI-powered document analysis (via GitHub Models)
- Tesseract.js - WebAssembly OCR (Serverless compatible)
- unpdf - PDF text extraction
- mammoth - DOCX to text conversion
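One way the extraction libraries above might be selected per file is sketched below. It is an assumption-laden illustration: the `Strategy` names, the `chooseStrategy` and `needsOcrFallback` helpers, and the 20-characters-per-page threshold are all hypothetical, and PPT/PPTX handling is left out because this README does not say how those formats are extracted.

```typescript
// Hypothetical routing sketch: strategy names and the threshold are assumptions.
type Strategy = "pdf-native" | "docx" | "ocr" | "plain-text" | "other";

const IMAGE_EXTENSIONS = new Set(["jpg", "jpeg", "png", "gif", "bmp", "webp"]);

/** Pick an extraction strategy from the file extension (case-insensitive). */
function chooseStrategy(fileName: string): Strategy {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (ext === "pdf") return "pdf-native"; // e.g. unpdf, with OCR as a fallback
  if (ext === "docx") return "docx"; // e.g. mammoth
  if (ext === "txt") return "plain-text";
  if (IMAGE_EXTENSIONS.has(ext)) return "ocr"; // e.g. Tesseract.js
  return "other"; // not covered by this sketch (includes PPT/PPTX)
}

/**
 * Decide whether a PDF looks scanned: if native extraction produced very few
 * non-whitespace characters per page, fall back to OCR.
 */
function needsOcrFallback(
  nativeText: string,
  pageCount: number,
  minCharsPerPage = 20,
): boolean {
  const visibleChars = nativeText.replace(/\s+/g, "").length;
  return visibleChars < Math.max(1, pageCount) * minCharsPerPage;
}
```

For example, a three-page PDF whose native extraction yields only whitespace would be routed to OCR, while one that yields a few hundred characters would not.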
Before running this project, ensure you have:
- Node.js (v20 or higher)
- Supabase Account (for database and storage)
- GitHub Account (for access to GitHub Models)
git clone <repository-url>
cd intelli-docs
npm install
- Create a new project at database.new.
- Run the following SQL in the Supabase SQL Editor to set up the schema:
-- Create documents table
create table if not exists documents (
  id uuid default gen_random_uuid() primary key,
  original_name text not null,
  storage_path text not null,
  extracted_text text,
  file_type text,
  file_size bigint,
  lines_count int,
  word_count int,
  created_at timestamp with time zone default timezone('utc'::text, now()) not null
);
-- Create storage bucket
insert into storage.buckets (id, name, public)
values ('documents', 'documents', true)
on conflict (id) do nothing;
-- Set security policies
create policy "Public Access" on storage.objects for select using ( bucket_id = 'documents' );
create policy "Public Upload" on storage.objects for insert with check ( bucket_id = 'documents' );
Create a .env.local file in the root directory:
# GitHub Token for AI Models (Required)
GITHUB_TOKEN=your_github_personal_access_token
# Supabase Configuration (Required)
NEXT_PUBLIC_SUPABASE_URL=your_supabase_project_url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_supabase_anon_key
npm run dev
Open http://localhost:3000 in your browser.
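If the app fails to start, a missing entry in .env.local is a likely cause. The sketch below shows one way to check the three required variables early. The variable names come from the .env.local listing; the `loadEnv` helper and its error message are hypothetical and not part of the project.

```typescript
// Variable names are from .env.local; the helper itself is a hypothetical sketch.
const REQUIRED_ENV_VARS = [
  "GITHUB_TOKEN",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
] as const;

type RequiredEnvVar = (typeof REQUIRED_ENV_VARS)[number];
type AppEnv = Record<RequiredEnvVar, string>;

/** Return the required variables, or throw one error naming every missing one. */
function loadEnv(source: Record<string, string | undefined>): AppEnv {
  const missing = REQUIRED_ENV_VARS.filter((name) => !source[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    GITHUB_TOKEN: source["GITHUB_TOKEN"] as string,
    NEXT_PUBLIC_SUPABASE_URL: source["NEXT_PUBLIC_SUPABASE_URL"] as string,
    NEXT_PUBLIC_SUPABASE_ANON_KEY: source["NEXT_PUBLIC_SUPABASE_ANON_KEY"] as string,
  };
}
```

In a Node.js context this could be called as `loadEnv(process.env)` at startup, so a misconfigured environment is reported once instead of surfacing later as a failed API call.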
This project is open source and available under the MIT License.