Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Updated Feb 13, 2026 (Python)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
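Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Includes built-in multi-language libraries.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, and more.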
A community-supported supercharged document management system: scan, index and archive all your documents
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
yolo3+ocr
Text detection based mainly on the CTPN (Connectionist Text Proposal Network) model in TensorFlow, with ID card detection.
pix2tex: Using a ViT to convert images of equations into LaTeX code.
All-in-One Development Tool based on PaddlePaddle
A synthetic data generator for text recognition
Convert documentation websites, GitHub repositories, and PDFs into Claude AI skills with automatic conflict detection
[python3.6] Natural scene text detection implemented in TensorFlow, with variable-length scene text OCR using CTPN + CRNN + CTC implemented in Keras/PyTorch.
Official implementation of Character Region Awareness for Text Detection (CRAFT)
Translate manga/images: one-click translation of text in all kinds of images. https://cotrans.touhou.ai/ (no longer working)
Handwritten Text Recognition (HTR) system implemented with TensorFlow.
Effortless data labeling with AI support from Segment Anything and other awesome models.
[CAPTCHA recognition: training] This project uses CNN/ResNet/DenseNet + GRU/LSTM + CTC/CrossEntropy to recognize CAPTCHAs (verification codes). It covers model training only.