This repository was archived by the owner on Nov 25, 2025. It is now read-only.

Conversation

@BurhanCantCode

Summary

Adds a new POST /extract/pdf endpoint to handle PDF text extraction server-side. This lets the React SDK offload PDF processing to the backend, reducing the client bundle size by ~2MB (by removing the pdfjs-dist dependency).

Changes

  • ✅ Add pdf-parse dependency (lightweight, 25KB vs 2MB pdfjs-dist)
  • ✅ Create ExtractPdfDto and ExtractPdfResponseDto following NestJS patterns
  • ✅ Add extractPdfText endpoint in ExtractorController with file validation
    • Validates file type (application/pdf)
    • Validates file size (max 50MB)
    • Returns extracted text and page count
  • ✅ Add unit tests for PDF file validation (size/type checks)
  • ✅ Follow existing audio.controller.ts pattern for consistency
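The validation rules described above (type check, 50MB cap) could be sketched roughly as follows. This is an illustrative standalone function, not the PR's actual controller code; the names `validatePdfUpload`, `UploadedFile`, and `MAX_PDF_BYTES` are assumptions for the sketch.

```typescript
// Hypothetical sketch of the upload validation described in the PR.
// Names and shapes are illustrative, not the actual ExtractorController code.
const MAX_PDF_BYTES = 50 * 1024 * 1024; // 50MB limit from the PR

interface UploadedFile {
  mimetype: string; // e.g. "application/pdf"
  size: number;     // bytes
}

// Returns an error message, or null when the file passes both checks.
function validatePdfUpload(file: UploadedFile): string | null {
  if (file.mimetype !== "application/pdf") {
    return "Only application/pdf files are accepted";
  }
  if (file.size > MAX_PDF_BYTES) {
    return "File exceeds the 50MB limit";
  }
  return null;
}
```

In a NestJS controller this kind of check would typically live in a file validation pipe or interceptor rather than a bare function.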

API Endpoint

```
POST /extract/pdf
Content-Type: multipart/form-data

Request:
  file: PDF file (max 50MB)

Response:
  { "text": "extracted text content...", "pages": 5 }
```

Testing

  • ✅ 3 unit tests added (file size validation, file type validation, valid file acceptance)
  • ✅ All tests passing
  • ✅ Follows NestJS best practices and project conventions

Related

This follows:
✅ Conventional Commits format (feat(api):)
✅ Clear summary of changes
✅ Technical details about the implementation
✅ Testing information
✅ Links to related PRs

BurhanCantCode and others added 3 commits October 26, 2025 19:02
Add POST /extract/pdf endpoint to handle PDF text extraction on the server.
This allows the React SDK to offload PDF processing, reducing client bundle
size by ~2MB (removal of pdfjs-dist dependency).

Changes:
- Add pdf-parse dependency (lightweight, 25KB)
- Create ExtractPdfDto and ExtractPdfResponseDto following NestJS patterns
- Add extractPdfText endpoint in ExtractorController with file validation
- Add unit tests for PDF file validation (size/type checks)
- Follow existing audio.controller.ts pattern for consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Resolve package-lock.json conflicts by regenerating from package.json.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

feat(api): add server-side PDF text extraction endpoint
@vercel

vercel bot commented Oct 30, 2025

@BurhanCantCode is attempting to deploy a commit to the tambo ai Team on Vercel.

A member of the Team first needs to authorize it.

@michaelmagan
Collaborator

Hey @BurhanCantCode! Thanks for the PR. I reviewed the server-side PDF extraction implementation. After discussing internally, we do need to update our Tambo Cloud API, but we don't need any new API routes. Instead, we need to extend our current image handling to support more file types, using S3 instead of storing in the database.

Many of the frontier models support file inputs directly. File types to support (from OpenAI, Anthropic, and Gemini):

Common across all three providers:

  • Documents: PDF, TXT, HTML, JSON, MD
  • Images: JPG/JPEG, PNG, GIF, WEBP
  • Spreadsheets: CSV

Additional formats:

  • DOCX, PPTX (OpenAI & Anthropic)
  • RTF, ODT (Anthropic & Gemini)
  • TSV, XLSX (Gemini)
  • Code files (.py, .js, .java, .cpp, etc.)
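The per-provider support sets above suggest a simple allowlist check. This is a rough sketch under assumptions: the sets below are partial (they omit the code-file extensions, for instance), and `SUPPORTED`/`isSupported` are hypothetical names, not an existing Tambo API.

```typescript
// Illustrative (partial) allowlists per provider, based on the formats
// listed above. The exact sets and names are assumptions for this sketch.
const SUPPORTED: Record<string, Set<string>> = {
  openai:    new Set(["pdf", "txt", "html", "json", "md", "csv",
                      "jpg", "jpeg", "png", "gif", "webp", "docx", "pptx"]),
  anthropic: new Set(["pdf", "txt", "html", "json", "md", "csv",
                      "jpg", "jpeg", "png", "gif", "webp", "docx", "pptx",
                      "rtf", "odt"]),
  gemini:    new Set(["pdf", "txt", "html", "json", "md", "csv",
                      "jpg", "jpeg", "png", "gif", "webp", "rtf", "odt",
                      "tsv", "xlsx"]),
};

// Check a filename's extension against the provider's allowlist.
function isSupported(provider: string, filename: string): boolean {
  const ext = filename.split(".").pop()?.toLowerCase() ?? "";
  return SUPPORTED[provider]?.has(ext) ?? false;
}
```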

Implementation approach:

  • Frontend: Update the component to start uploading files to S3 immediately when attached
  • Upload handling: Ensure messages wait for uploads to complete before sending
  • Message content: Pass the message type and S3 location in the message content
  • API side: Update existing logic to fetch from S3 and pass to the LLM provider (similar to current image handling) - no new endpoints needed
  • Configuration: Allow engineers to specify which file types they want to support (since different models support different sets)
  • Documentation: Update docs to explain supported file types per model

No need for pdf-parse or any extraction logic - the LLM providers handle that themselves.

Here are the relevant docs for supported file types:
OpenAI File Search Supported Files
Anthropic Files API
Gemini Document Processing

Could you refactor this to follow the S3 upload pattern instead of adding a new extraction endpoint?

