Starred repositories
The swiss army knife of lossless video/audio editing
C inference for Qwen3-ASR 0.6B and 1.7B transcription models
A fast and soft pattern search for trillion-scale corpora.
Offline streaming speech-to-text in the browser
A Streaming-Native Serving Engine for TTS/STS Models
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
Real-time, lightweight software for generating non-linguistic behaviors (turn-taking, backchannel, and head-nodding) in conversational AIs
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.
VoiceBench: Benchmarking LLM-Based Voice Assistants
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
🎙️ AI Dictation App - Open Source and Local-first ⚡ Type 3x faster, no keyboard needed. 🆓 Powered by open source models, works offline, fast and accurate.
Chrome extension that analyzes tweets on the X timeline based on the X algorithm's weights
Massive open Japanese speech corpus
A free, open source, and extensible speech-to-text application that works completely offline.
Browser automation CLI for AI agents
Curated list of design and UI resources: stock photos, web templates, CSS frameworks, UI libraries, tools, and much more
Training code for FAcodec presented in NaturalSpeech3
Unsupervised Speech Decomposition Via Triple Information Bottleneck
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines
A lightweight text-to-speech model with zero-shot voice cloning
Conversion between Traditional and Simplified Chinese
A highly compressive and high-quality neural audio codec for speech models.
Hono <-> React Router Adapter