@BurhanCantCode

Summary

Extends file upload to support PDFs and text files (.txt, .md) in addition to images. Files are processed client-side (base64 data URLs for images, PDF.js text extraction for PDFs, FileReader for text files), and the UI shows preview cards with loading states.

Changes

Core Functionality

  • Added useMessageFiles hook to replace useMessageImages (fully backwards compatible)
  • Added pdfjs-dist dependency for client-side PDF text extraction
  • Implemented file type detection with content processing:
    • Images: Converted to base64 data URLs
    • PDFs: Text extracted via PDF.js
    • Text files: Content read directly with FileReader API
  • Added file limits: 10 files max, 50MB per file, 100MB total
  • Implemented content truncation (100KB total, 50KB per file) to prevent payload size errors
  • Enhanced error handling for connection/timeout errors with actionable messages
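The file type detection described above could look roughly like the following. This is a minimal sketch, not the PR's code: `FileKind`, `classifyFile`, and `TEXT_EXTENSIONS` are hypothetical names, and the extension fallback assumes that some browsers report an empty MIME type for `.md` files.

```typescript
// Hypothetical names; the PR's actual identifiers may differ.
type FileKind = "image" | "pdf" | "text" | "unsupported";

// Extensions named in the PR summary.
const TEXT_EXTENSIONS = [".txt", ".md"];

/** Classify a file by MIME type, falling back to the extension for text files. */
function classifyFile(name: string, mimeType: string): FileKind {
  const lowerName = name.toLowerCase();
  if (mimeType.startsWith("image/")) return "image";
  if (mimeType === "application/pdf") return "pdf";
  if (mimeType === "text/plain" || mimeType === "text/markdown") return "text";
  // Assumption: the MIME type may be empty or generic, so check the extension too.
  if (TEXT_EXTENSIONS.some((ext) => lowerName.endsWith(ext))) return "text";
  return "unsupported";
}
```

Each kind would then be routed to its own processing step (data URL, PDF text extraction, or plain text read).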

UI Improvements

  • Professional file preview system:
    • Images: Expandable preview badges (integrated with existing context badge feature)
    • PDFs/Text: File cards with type icons (📄 PDF, 📝 text)
  • Loading states with spinner animations during file processing
  • Image thumbnails with hover-to-remove functionality
  • File counter displaying "Attached files (X/10)"
  • File size display in human-readable format
  • Full drag-and-drop support for all file types
  • MessageAttachments component for displaying file metadata in chat messages
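For the human-readable file size display, a helper along these lines would be enough. This is a sketch only: `formatFileSize` is a hypothetical name, and the choice of 1024-based units and one decimal place is an assumption, since the PR does not state the exact format.

```typescript
/** Format a byte count as a short human-readable string (1024-based units, assumed). */
function formatFileSize(bytes: number): string {
  const units = ["B", "KB", "MB", "GB"];
  let value = bytes;
  let unitIndex = 0;
  // Step up one unit at a time until the value fits or units run out.
  while (value >= 1024 && unitIndex < units.length - 1) {
    value /= 1024;
    unitIndex++;
  }
  // Whole numbers for bytes, one decimal place for larger units.
  const digits = unitIndex === 0 ? 0 : 1;
  return `${value.toFixed(digits)} ${units[unitIndex]}`;
}

console.log(formatFileSize(1536)); // "1.5 KB"
```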

Developer Experience

  • Comprehensive JSDoc documentation for all new functions
  • Added use-message-files.test.ts with full test coverage
  • Updated message-builder.test.ts for file content handling
  • All file processing happens asynchronously with proper error handling

Backwards Compatibility

  • ✅ Old useMessageImages API still works via exports
  • StagedImage type aliased to StagedFile
  • ✅ Existing image-only workflows remain unchanged

Testing

  • ✅ All 254 tests passing
  • ✅ ESLint passing (0 errors, 0 warnings)
  • ✅ TypeScript builds successfully
  • ✅ Tested file upload flow:
    • Single and multiple files
    • Mixed file types (images + PDFs + text)
    • File size limits
    • Content truncation for large files
    • Drag-and-drop functionality
    • Error handling

Technical Details

File Processing:

  • Images processed via FileReader.readAsDataURL()
  • PDFs processed via pdfjs-dist with text extraction from all pages
  • Text files read via FileReader.readAsText()
  • All processing happens client-side for privacy

Content Management:

  • Text content automatically truncated with clear notices
  • File content sent to AI with proper formatting
  • File markers filtered from displayed user messages
  • Images sent as separate content parts
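The truncation rule (50KB per file, 100KB total, with a notice) can be sketched as below. The names `truncateAttachments`, `TextAttachment`, and `TRUNCATION_NOTICE` are hypothetical, the notice wording is invented for illustration, and the limits are counted in characters rather than bytes as a simplifying assumption.

```typescript
// Limits from the PR summary, interpreted here as character counts (assumption).
const MAX_CHARS_PER_FILE = 50 * 1024;
const MAX_CHARS_TOTAL = 100 * 1024;
// Hypothetical notice text; the PR does not quote the actual wording.
const TRUNCATION_NOTICE = "\n[Content truncated]";

interface TextAttachment {
  name: string;
  text: string;
}

/** Apply the per-file limit first, then whatever remains of the total budget. */
function truncateAttachments(files: TextAttachment[]): TextAttachment[] {
  let remaining = MAX_CHARS_TOTAL;
  return files.map((file) => {
    const limit = Math.min(MAX_CHARS_PER_FILE, remaining);
    const truncated = file.text.length > limit;
    const kept = truncated ? file.text.slice(0, limit) : file.text;
    remaining -= kept.length;
    return { name: file.name, text: truncated ? kept + TRUNCATION_NOTICE : kept };
  });
}
```

With three 60KB files, the first two are cut to 50KB each and the third is reduced to the notice alone, because the total budget is already spent.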

Security:

  • File type validation before processing
  • Size limits enforced at multiple levels
  • No server-side processing required
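The size limits listed earlier (10 files, 50MB per file, 100MB total) could be enforced with a check like this before any processing starts. It is a sketch under assumptions: `validateFiles` and `FileInfo` are hypothetical names, the error strings are illustrative, and 1MB is taken as 1024 × 1024 bytes.

```typescript
// Limits from the PR summary; 1 MB = 1024 * 1024 bytes is an assumption.
const MAX_FILES = 10;
const MAX_FILE_BYTES = 50 * 1024 * 1024;
const MAX_TOTAL_BYTES = 100 * 1024 * 1024;

interface FileInfo {
  name: string;
  size: number; // bytes
}

/** Return an error message if adding `incoming` would break a limit, else null. */
function validateFiles(staged: FileInfo[], incoming: FileInfo[]): string | null {
  if (staged.length + incoming.length > MAX_FILES) {
    return `At most ${MAX_FILES} files can be attached.`;
  }
  const tooLarge = incoming.find((file) => file.size > MAX_FILE_BYTES);
  if (tooLarge) {
    return `${tooLarge.name} exceeds the 50MB per-file limit.`;
  }
  const totalBytes = [...staged, ...incoming].reduce((sum, file) => sum + file.size, 0);
  if (totalBytes > MAX_TOTAL_BYTES) {
    return "Attached files exceed the 100MB total limit.";
  }
  return null;
}
```

Returning a message rather than throwing keeps the check easy to surface in the input UI.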

Files Changed

  • react-sdk/package.json - Added pdfjs-dist dependency
  • react-sdk/src/hooks/use-message-files.ts (NEW) - Core file handling hook
  • react-sdk/src/hooks/__tests__/use-message-files.test.ts (NEW) - Comprehensive tests
  • react-sdk/src/util/message-builder.ts - Content builder with truncation
  • react-sdk/src/providers/tambo-thread-input-provider.tsx - Updated provider
  • react-sdk/src/index.ts - Backwards-compatible exports
  • showcase/src/components/ui/message-input.tsx - Professional file preview UI
  • showcase/src/components/ui/message.tsx - MessageAttachments component
  • showcase/src/components/ui/thread-content.tsx - Renders attachments
  • showcase/src/lib/thread-hooks.ts - Content filtering utilities
  • Updated tests in react-sdk/src/util/__tests__/message-builder.test.ts

Screenshots

[Two screenshots; images not available]

Extends file upload functionality to support PDFs and text files (.txt, .md) in addition to images.

Key Changes:
- Added useMessageFiles hook replacing useMessageImages (backwards compatible)
- Added pdfjs-dist dependency for client-side PDF text extraction
- Implemented file type detection with content processing:
  - Images: converted to base64 data URLs
  - PDFs: text extracted via PDF.js
  - Text files: content read directly
- Added file limits: 10 files max, 50MB per file, 100MB total
- Implemented content truncation (100KB total, 50KB per file) to prevent payload size errors
- Enhanced error handling for connection/timeout errors

UI Improvements:
- Professional file preview cards with loading states
- File type icons (PDF, text, images)
- Image thumbnails with hover-to-remove functionality
- File counter and size display
- Drag-and-drop support for all file types
- MessageAttachments component for displaying file info in chat

Tests:
- Added comprehensive tests for useMessageFiles hook
- Updated message-builder tests for file content handling
- All 254 tests passing

Backwards Compatibility:
- Old useMessageImages API still works via exports
- StagedImage type aliased to StagedFile
@vercel

vercel bot commented Oct 24, 2025

@BurhanCantCode is attempting to deploy a commit to the tambo ai Team on Vercel.

A member of the Team first needs to authorize it.

BurhanCantCode and others added 3 commits October 26, 2025 19:15
…action method to fetch from tambo-cloud

Key Changes:
- Removed pdfjs-dist dependency from package.json.
- Updated extractPdfText function to use a fetch API for PDF text extraction instead of pdfjs-dist.
- Adjusted tests to mock fetch for PDF API calls.

This change streamlines the PDF extraction process and reduces package size.

Remove unused canvas dependencies that were pulled in by pdfjs-dist.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Resolved conflicts:
- showcase/src/lib/thread-hooks.ts: Kept file content marker filtering logic
  needed for PDF/text file support feature
- package-lock.json: Regenerated after accepting main version

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

BurhanCantCode and others added 7 commits October 31, 2025 00:45
Fix lint warning for undeclared environment variable.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Remove fileToDataUrl, extractPdfText, readTextFile functions
- Replace dataUrl/textContent with storagePath in StagedFile
- Update ALLOWED_FILE_TYPES to include CSV and specific image formats
- Remove text file content embedding from message builder
- Prepare for storage-based file upload implementation

- Add uploadFile function that calls /extract/pdf endpoint
- Upload files immediately when added to staged files
- Store storagePath in StagedFile for message content
- Handle upload errors and filter out failed uploads
- Use TamboClient for API requests with proper auth

Check for pending uploads and failed uploads before allowing submission

- Images: Pass as image_url with storage:// URL
- Documents: Pass as text with storage:// URL pattern
- Backend will fetch from storage and convert to proper format for LLM

- Introduced new MIME type handling for various file formats including images, documents, and text files.
- Updated file upload validation to support a wider range of file types in the message input.
- Improved file icon representation based on MIME type in message components.
- Added functions to infer MIME types from file extensions and storage references.
- Enhanced error handling for unsupported file types during upload.

Update test assertions and mock data to use storagePath property and
storage:// URL format instead of dataUrl/textContent. Remove blank
lines within functions per coding standards.
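
The storage-based commits above describe passing images as image_url parts and documents as text parts, both carrying a storage:// URL. A rough sketch of that mapping follows. Only `storagePath` and `StagedFile` come from the commit messages; the `mimeType` field, the `ContentPart` shape, and `toContentPart` are assumptions for illustration, and the real message builder may format document parts differently.

```typescript
// `storagePath` is named in the commits; the other fields are assumed.
interface StagedFile {
  name: string;
  mimeType: string;
  storagePath: string;
}

// Assumed content-part shape, modeled on common chat-completion formats.
type ContentPart =
  | { type: "image_url"; image_url: { url: string } }
  | { type: "text"; text: string };

/** Map an uploaded file to a message content part that references storage. */
function toContentPart(file: StagedFile): ContentPart {
  const url = `storage://${file.storagePath}`;
  if (file.mimeType.startsWith("image/")) {
    return { type: "image_url", image_url: { url } };
  }
  // Documents travel as text; per the commits, the backend resolves the reference.
  return { type: "text", text: url };
}
```

The client never embeds file content in this scheme; it only sends references, which is consistent with the removal of dataUrl and textContent from StagedFile.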