Stars
Easiest and laziest way to build multi-agent LLM applications.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Semantic search built on Paddle and deployed online, with multilingual support.
🛡️ A list of all profile badges and how to obtain each one 🛡️
PolarDB-X is a cloud native distributed SQL Database designed for high concurrency, massive storage, complex querying scenarios.
Repository with code for object detection trajectory matching of unmarked honeybees in 2D.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Solutions and code from the AI competitions I participated in
lingskr / BurmeseCorpus-OCR
Forked from 1chimaruGin/BurmeseCorpus
Burmese Language Corpus
Use Burmese text rendering to generate data suitable for training OCR
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.
Tesseract Open Source OCR Engine (main repository)
A Multi-Modal Dataset of Chinese Governmental Documents
Convert captured images to text using BaiduOCR, GoogleOCR, WindowsOCR, tesseractOCR, RapidOCR or Capture2Text, and translate the resulting text using Google, Chatgpt, Edgegpt, DeepL or many more. D…
Advanced real-time screen translator for games, hardcoded subtitles in videos, static text, etc.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
This repo contains a script that uses Tesseract OCR to convert PDF ebooks to text format.
Generate text images for training deep learning OCR models
Generate text line images for training deep learning OCR models
PaddleOCR provides 80 language models, but it still cannot cover all of them. The published models can achieve better results by fine-tuning them with your own recognition data.
This project implements image-to-text recognition for the Myanmar language using the Tesseract 4.0 OCR engine.
Extract text from tables in images. Uses OpenCV to detect margin lines and PyTesseract to recognize Burmese text.