kingfener

king kingfener

A man opening up a new world

2 followers · 3 following

Beijing

Achievements

heretic Public
Forked from p-e-w/heretic

Fully automatic censorship removal for language models

Python GNU Affero General Public License v3.0 Updated Apr 25, 2026
sam-audio Public
Forked from facebookresearch/sam-audio

Audio segmentation based on text, visual, and time-span cues: The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example noteb…

Python Other Updated Dec 17, 2025
vlash Public
Forked from mit-han-lab/vlash

Vision-based robot action execution: Real-Time VLAs via Future-state-aware Asynchronous Inference.

Python Apache License 2.0 Updated Dec 3, 2025
CarelessWhisper-Streaming Public
Forked from tomer9080/WhisperRT-Streaming

Causal streaming adaptation of OpenAI Whisper for real-time transcription on small audio chunks.

Python Other Updated Sep 18, 2025
Kimi-Audio Public
Forked from MoonshotAI/Kimi-Audio

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python Updated Apr 28, 2025
Orpheus-TTS Public
Forked from canopyai/Orpheus-TTS

TTS Towards Human-Sounding Speech

Python Apache License 2.0 Updated Mar 23, 2025
silentcipher Public
Forked from SesameAILabs/silentcipher

Deep Audio Watermarking (audio watermark)

Python MIT License Updated Mar 17, 2025
Spark-TTS Public
Forked from SparkAudio/Spark-TTS

Spark-TTS Inference Code

Python Apache License 2.0 Updated Mar 5, 2025
zipEnhancer Public
Forked from boreas-l/zipEnhancer

This project is derived from zipEnhancer, a speech denoising model open-sourced by Alibaba.

Python Updated Mar 4, 2025
async_cosyvoice Public
Forked from qi-hua/async_cosyvoice

Accelerates CosyVoice2 inference with vLLM

Jupyter Notebook Apache License 2.0 Updated Mar 2, 2025
TTS-LLaSA_training Public
Forked from zhenye234/LLaSA_training

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python Other Updated Feb 14, 2025
unsloth-LLM-finetuning Public
Forked from unslothai/unsloth

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory

Python Apache License 2.0 Updated Feb 10, 2025
Qwen-Agent Public
Forked from QwenLM/Qwen-Agent

Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Python Other Updated Jan 24, 2025
MiniCPM-o Public
Forked from OpenBMB/MiniCPM-V

Multimodal speech LLM: MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python Apache License 2.0 Updated Jan 17, 2025
google-research Public
Forked from google-research/google-research

Google Research

Jupyter Notebook Apache License 2.0 Updated Jan 9, 2025
data-Thorsten-Voice Public
Forked from thorstenMueller/Thorsten-Voice

Speech data: Thorsten-Voice: A free-to-use, offline-working, high-quality German TTS voice should be available for every project without any license struggles.

Python Creative Commons Zero v1.0 Universal Updated Jan 8, 2025
openai-cookbook Public
Forked from openai/openai-cookbook

Examples and guides for using the OpenAI API

MDX MIT License Updated Jan 8, 2025
vector-quantize-pytorch Public
Forked from lucidrains/vector-quantize-pytorch

Vector (and Scalar) Quantization, in Pytorch

Python MIT License Updated Jan 7, 2025
pycorrector Public
Forked from shibing624/pycorrector

pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box.

Python Apache License 2.0 Updated Dec 26, 2024
versa Public
Forked from wavlab-speech/versa

Versatile Evaluation of Speech and Audio

Python Apache License 2.0 Updated Dec 25, 2024
WavChat Public
Forked from jishengpeng/WavChat

A Survey of Spoken Dialogue Models (60 pages)

Updated Nov 28, 2024
snac Public
Forked from hubertsiuzdak/snac

audio codec: Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate

Python MIT License Updated Nov 19, 2024
streaming-ChatTTS Public
Forked from pengzhendong/streaming-ChatTTS

Jupyter Notebook Apache License 2.0 Updated Oct 30, 2024
GLM-4-Voice Public
Forked from zai-org/GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model; TTS quality is good

Python Apache License 2.0 Updated Oct 30, 2024
spiritlm Public
Forked from facebookresearch/spiritlm

Emotion-preserving audio LLM: Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".

Python Other Updated Oct 28, 2024
SNAC-Vocos Public
Forked from hertz-pj/SNAC-Vocos

A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos.

Python Updated Oct 28, 2024
midi-fluidsynth Public
Forked from FluidSynth/fluidsynth

MIDI playback: Software synthesizer based on the SoundFont 2 specifications

C GNU Lesser General Public License v2.1 Updated Oct 20, 2024
amt-apc Public
Forked from misya11p/amt-apc

Music, automatic piano cover: AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model

Python MIT License Updated Oct 19, 2024
qa-mdt Public
Forked from ivcylc/OpenMusic

Text-to-music generation: 241010-SOTA Text-to-music (TTM) Generation (OpenMusic)

Python MIT License Updated Oct 9, 2024
ml-depth-pro Public
Forked from apple/ml-depth-pro

Apple depth map estimation: Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.

Python 1 Other Updated Oct 5, 2024
