Luke Encrapera edited this page Sep 5, 2024 · 3 revisions

Welcome to the OCR wiki!

Current Features of the OCR Application:

  • Single Image Processing: Users can upload and process a single image for text extraction using Tesseract OCR.
  • Image Preview: The application shows a preview of the uploaded image before processing it.
  • Download Extracted Text: After processing, users can download the extracted text as a .txt file.
  • Copy to Clipboard: Users can copy the extracted text directly to their clipboard for convenience.
  • File Size Validation: Rejects uploaded files that exceed the configured size limit (e.g., 5 MB).
  • Large Language Model (LLM) Integration: An LLM reviews the extracted text to improve its accuracy.
  • LLM Text Revision: Users can revise the extracted text with an LLM and download or copy the revised text.
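
As a rough illustration of the file size validation and single-image extraction features listed above, here is a minimal Python sketch. It is not the application's actual code: the names `MAX_FILE_SIZE_BYTES`, `validate_file_size`, and `extract_text` are hypothetical, the 5 MB limit is read as 5 × 1024 × 1024 bytes (an assumption), and the OCR step assumes the `pytesseract` and Pillow packages with a local Tesseract install.

```python
# Assumed limit: the wiki's "5 MB" example, interpreted as binary megabytes.
MAX_FILE_SIZE_BYTES = 5 * 1024 * 1024


def validate_file_size(data: bytes, limit: int = MAX_FILE_SIZE_BYTES) -> bool:
    """Return True when the uploaded payload is no larger than the limit."""
    return len(data) <= limit


def extract_text(image_path: str) -> str:
    """Run Tesseract OCR on one image and return the recognized text."""
    # Imported lazily so the size check can be used without OCR dependencies.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)
```

In this sketch the size check would run before the image is handed to `extract_text`, so oversized uploads are rejected without invoking Tesseract.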

Planned Features:

  • Batch Processing: The ability to upload and process multiple images at once.
  • Azure Cloud Deployment: Deploy the OCR tool to Azure for scalable and secure cloud-based processing.
  • Multi-Language Support: The ability to process and extract text in multiple languages (e.g., English, Hindi, and Tamil).
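
One possible shape for the planned batch and multi-language features is sketched below in Python. This is only an assumption about how they might be built, not a committed design: `build_lang_arg` and `process_batch` are hypothetical helpers. Tesseract accepts several languages joined with `+` (for example `eng+hin+tam` for English, Hindi, and Tamil), provided the matching trained-data files are installed.

```python
from typing import Callable, Dict, Iterable, List


def build_lang_arg(codes: List[str]) -> str:
    """Join Tesseract language codes into one argument, e.g. 'eng+hin+tam'."""
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def process_batch(
    paths: Iterable[str], ocr_fn: Callable[[str], str]
) -> Dict[str, str]:
    """Apply an OCR function to each path, recording failures per file."""
    results: Dict[str, str] = {}
    for path in paths:
        try:
            results[path] = ocr_fn(path)
        except Exception as exc:
            # One unreadable image should not abort the whole batch.
            results[path] = f"ERROR: {exc}"
    return results
```

The OCR step is passed in as `ocr_fn` so the batch loop stays independent of the OCR backend. With `pytesseract`, that callable could wrap `pytesseract.image_to_string(image, lang=build_lang_arg(["eng", "hin", "tam"]))`.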