🔍 Read lips in videos with an end-to-end deep learning model, enhancing accessibility and transcribing speech from mouth movements effectively.
Updated Feb 10, 2026 - Python
🎤 Enable offline speech recognition in React Native using sherpa-onnx, supporting various model architectures for reliable performance.
React Native TurboModule for Sherpa-ONNX speech processing (STT/TTS/Diarization/VAD) that runs completely offline on the device. Supports Android and iOS.
Speech-to-text server framework with next-gen Kaldi
Connectionist Temporal Classification (CTC) decoder with dictionary and language model.
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
ioBroker Adapter for myUplink.com for Nibe Heat Pumps
Offline Speech-to-Text for React Native using sherpa-onnx. Supports Zipformer, Paraformer, NeMo CTC, Whisper & more.
OCR system with Deep Learning (CRNN + CTC) trained on 100k synthetic images; includes a web app built with Gradio
CTF writeups and proof files from TCS HackQuest Season 10 cybersecurity challenges.
Thulium is a production-ready Python library for offline handwritten text recognition (HTR) supporting 52+ languages across Latin, Cyrillic, Greek, Arabic, Hebrew, Devanagari, Chinese, Japanese, Korean, and Georgian scripts.
ASR-LE is an advanced ASR evaluation + observability toolkit that goes beyond WER: it shows where errors happen in time, estimates streaming p95 first-word latency, generates “worst moments” automatically, and produces reusable artifacts (report.json, timeline bins, moments, etc.)
Website highlighting open source projects by Cebu's own talent for every #hacktoberfest from here on.
Deep learning lip-reading model using Conv3D + BiLSTM + CTC architecture. Transcribes speech from mouth region video clips for accessibility applications.
A Deep-Learning-Based Chinese Speech Recognition System
Solving NTHU AIS captchas with deep learning models, implemented as APIs.
A simple and interactive CTC (Cost to Company) Calculator built using React, TypeScript, and Vite. This tool helps users break down their total salary package into individual components like Basic Salary, HRA, DA, LTA, Special Allowance, and more. It supports monthly and yearly modes, allowing easy toggling between the two views.