Gemma-3 OCR combines computer vision and NLP, using the Gemma-3 Vision model for OCR and for cleaning up the extracted text. Built on Streamlit and Ollama, this self-contained system converts images into clear, markdown-rendered output, with a focus on accuracy and confidentiality.
This tool helps you convert documents to markdown, extract raw text, and locate specific content with bounding boxes. Markdown conversion takes about 20 seconds and a locate task about 3 seconds. See the info at the bottom of the page for more details.
This project demonstrates a classic OCR pipeline: a Flask app takes an image, applies an OpenCV preprocessing pipeline, and uses Tesseract OCR to digitize Vietnamese invoices (Bách Hóa Xanh).
This project transforms messy invoice images into a structured, searchable knowledge base. The pipeline automatically extracts text with Tesseract, uses Google Gemini to parse fields (vendor, total, date), stores data in Milvus, and enables natural language queries via a LangChain-powered chatbot.
A minimalist SOTA LaTeX OCR model with only 20M parameters that runs in the browser. Includes the full training pipeline, suitable for self-study.
Django app that uses Tesseract OCR and a Gemini-based analyzer to extract nutrition info from images, match harmful ingredients, and persist scan results.
autoPDFtagger is a Python tool designed for efficient home-office organization, focusing on digitizing and organizing both digital and paper-based documents. By automating the tagging of PDF files, including image-rich documents and scans of varying quality, it aims to streamline the organization of digital archives.
AI-driven number plate recognition web app built with Flask, YOLOv8, and EasyOCR. Instantly detects and reads license plates from images or live webcam, flags stolen vehicles via database lookup, and is powered by a custom Roboflow-annotated dataset. 🚗✨
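The "flags stolen vehicles via database lookup" step in the last entry can be illustrated with a minimal sketch. It is not taken from that repository: the table name `stolen`, the helper names, and the normalization rule (uppercase, keep only letters and digits) are all assumptions made for illustration, and SQLite stands in for whatever database the app actually uses.

```python
import re
import sqlite3


def normalize_plate(raw: str) -> str:
    """Uppercase OCR text and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def build_stolen_db(plates):
    """Create an in-memory table of stolen plates (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stolen (plate TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO stolen (plate) VALUES (?)",
        [(normalize_plate(p),) for p in plates],
    )
    conn.commit()
    return conn


def is_stolen(conn, ocr_text: str) -> bool:
    """Return True if the normalized OCR reading matches a stored plate."""
    plate = normalize_plate(ocr_text)
    row = conn.execute(
        "SELECT 1 FROM stolen WHERE plate = ?", (plate,)
    ).fetchone()
    return row is not None
```

Normalizing both the stored plates and the OCR reading in the same way keeps spacing and punctuation differences in the OCR output from causing a missed match. Character confusions such as O versus 0 would need extra handling that this sketch leaves out.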