Offline speech recognition API for Android, iOS, Raspberry Pi
Face recognition with deep neural networks
Open Source OCR Engine
Port of OpenAI's Whisper model in C/C++
Build your own AI friend
State-of-the-art 2D and 3D Face Analysis Project
Robust Speech Recognition via Large-Scale Weak Supervision
Contexts Optical Compression
Awesome multilingual OCR toolkits based on PaddlePaddle
Lightweight Face Recognition and Facial Attribute Analysis
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
OCRmyPDF adds an OCR text layer to scanned PDF files
Speech-to-text, text-to-speech, and speaker recognition
Interactive video and image annotation tool for computer vision
A GUI Agent app based on UI-TARS to control your computer using AI
Pure JavaScript Multilingual OCR
Captcha solver extension for humans
High-performance neural network inference framework for mobile
Cross-platform, customizable ML solutions for live and streaming media
Open Source Computer Vision Library
ZBar is an open source software suite for reading bar codes
Speech recognition module for Python
A free, open source, and extensible speech-to-text application
Ready-to-use OCR with 80+ supported languages
Open-Source Python 3 tool for recognizing layouts, tables, and math