🎤 Enable voice recognition for the Doubao input method using Python; ideal for learning and research with a focus on audio processing.
Updated Apr 4, 2026 - Python
🔍 Read lips in videos with an end-to-end deep learning model, enhancing accessibility and transcribing speech from mouth movements effectively.
Russian ASR system from scratch. Conformer encoder, CTC/Attention hybrid decoder, SpecAugment, beam search with LM fusion.
NVIDIA Conformer-CTC and Conformer-Transducer (RNN-T) running natively on Apple Silicon via MLX. Loads NeMo checkpoints directly.
Deep learning-based handwritten OCR system using CNN-RNN-CTC with evaluation via CER/WER metrics.
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
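For context, best path decoding is the simplest of the algorithms named in the entry above. It takes the most likely label at each time step, collapses consecutive repeats, and then drops the blank label. The sketch below is a minimal illustration and is not taken from the listed repository: the function name `best_path_decode`, the `labels` list, and the choice of index 0 as the blank are all assumptions made for the example.

```python
from itertools import groupby


def best_path_decode(frame_probs, labels, blank=0):
    """Greedy (best path) CTC decoding.

    frame_probs: list of per-frame probability lists, one entry per label.
    labels: list mapping label index -> character (hypothetical layout,
            with the blank symbol stored at index `blank`).
    """
    # 1. Pick the most probable label index in every frame.
    best = [max(range(len(row)), key=row.__getitem__) for row in frame_probs]
    # 2. Collapse runs of identical consecutive labels.
    collapsed = [index for index, _ in groupby(best)]
    # 3. Remove blanks and map the remaining indices to characters.
    return "".join(labels[i] for i in collapsed if i != blank)


if __name__ == "__main__":
    labels = ["-", "a", "b"]  # index 0 is the blank (assumption)
    frame_probs = [
        [0.1, 0.8, 0.1],    # a
        [0.2, 0.7, 0.1],    # a (repeat, collapsed)
        [0.9, 0.05, 0.05],  # blank separates the two a's
        [0.1, 0.8, 0.1],    # a
        [0.1, 0.1, 0.8],    # b
    ]
    print(best_path_decode(frame_probs, labels))
```

The blank between the second and fourth frames is what lets the doubled letter survive the collapse step. Beam search and the other listed decoders instead track several candidate labelings at once.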
CTF writeups and proof files from TCS HackQuest Season 10 cybersecurity challenges.
Thulium is a production-ready Python library for offline handwritten text recognition (HTR) supporting 52+ languages across Latin, Cyrillic, Greek, Arabic, Hebrew, Devanagari, Chinese, Japanese, Korean, and Georgian scripts.
ASR-LE is an advanced ASR evaluation + observability toolkit that goes beyond WER: it shows where errors happen in time, estimates streaming p95 first-word latency, generates “worst moments” automatically, and produces reusable artifacts (report.json, timeline bins, moments, etc.)
Deep learning lip-reading model using Conv3D + BiLSTM + CTC architecture. Transcribes speech from mouth region video clips for accessibility applications.
The goal of this project is to accurately transcribe handwritten English word images into text using a deep learning model. This is achieved through the use of the IAM Handwriting Dataset (a labeled image corpus), a CNN + BLSTM + CTC architecture, and an end-to-end training and inference pipeline.
A Deep-Learning-Based Chinese Speech Recognition System
⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can be written with characters or subwords.
E2E Speech Recognition Toolkit with Hydra and PyTorch Lightning
CRNN/LPRNet/STNet + CTC + CCPD