- Tencent
- Chengdu, China
Stars
A lightweight library for normalizing speech transcripts before computing WER
A modern GUI client based on Tauri, designed to run on Windows, macOS, and Linux for a tailored proxy experience
The project is associated with the recently-launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recogniti…
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enablin…
A 10000+ hours dataset for Chinese speech recognition
Large, modern dataset for speech recognition
Control Gmail, Google Calendar, Docs, Sheets, Slides, Chat, Forms, Tasks, Search & Drive with AI - Comprehensive Google Workspace / G Suite MCP Server & CLI Tool
A 26M function-calling model that runs on incredibly small devices
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
A Conversational Speech Generation Model
Inference and training library for high-quality TTS models.
A toolkit for processing speech data and creating speech datasets
AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data
A Kubernetes media gateway for WebRTC. Contact: info@l7mp.io
Production-grade engineering skills for AI coding agents.
Agent Skills for Google products and technologies
A cross-platform desktop All-in-One assistant for Claude Code, Codex, OpenCode, OpenClaw, Gemini CLI & Hermes Agent. Only official website: ccswitch.io
The official Lark/Feishu CLI tool, maintained by the larksuite team — built for humans and AI Agents. Covers core business domains including Messenger, Docs, Base, Sheets, Calendar, Mail, Tasks, Me…
ESC-50: Dataset for Environmental Sound Classification
The Audio Set Ontology aims to provide a comprehensive set of categories to describe sound events.
ACL 2026 - Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
MeetEval - A meeting transcription evaluation toolkit
A blazing fast AI Gateway with integrated guardrails. Route to 1,600+ LLMs, 50+ AI Guardrails with 1 fast & friendly API.
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
All parts of Claude Code's system prompt, 27 builtin tool descriptions, sub agent prompts (Plan/Explore/Task), utility prompts (CLAUDE.md, compact, statusline, magic docs, WebFetch, Bash cmd, secur…
Faster Whisper transcription with CTranslate2