cc-cherie

6 followers · 38 following

Starred repositories

moonshine-ai / moonshine

Very low latency speech to text, intent recognition, and text to speech, for building voice agents and interfaces

C 8,445 458 Updated Jun 2, 2026

muratcankoylan / Agent-Skills-for-Context-Engineering

A comprehensive collection of Agent Skills for context engineering, multi-agent architectures, and production agent systems. Use when building, optimizing, or debugging agent systems that require e…

Python 16,546 1,343 Updated May 26, 2026

timercrack / trader

Automated futures trading

C 8,255 1,897 Updated Feb 28, 2026

tuanio / whisper-ctc

Whisper Encoder (extracted from pretrained) with a Linear on top and solve using CTC criterion

Python 7 2 Updated Jul 3, 2023

xingchensong / TouchNet

A native-PyTorch library for large scale M-LLM (text/audio) training with tp/cp/dp.

Python 230 30 Updated Apr 8, 2026

boson-ai / higgs-audio

Text-audio foundation model from Boson AI

Python 8,195 629 Updated Jun 5, 2026

BinWang28 / audio-ai-hub

The hub for audio AI research: papers, open models, benchmarks & datasets across audio LLMs, speech recognition, TTS, music & audio generation.

Python 932 48 Updated Jun 8, 2026

luhengshiwo / LLMForEverybody

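LLM knowledge sharing that everyone can understand; a must-read before spring/autumn campus-recruitment LLM interviews, so you can talk confidently with your interviewer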

Jupyter Notebook 6,719 627 Updated May 31, 2026

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 22,473 2,303 Updated Jun 3, 2026

OpenVoiceOS / ovos-number-parser

Python 3 3 Updated Mar 31, 2026

traderpedroso / xphoneBR

XphoneBR is a Brazilian portuguese transformer base grapheme-to-phoneme and normalization tool modeling library that leverages recent deep learning technology and is optimized for usage in producti…

Python 12 Updated Aug 28, 2024

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …

Python 14,500 1,475 Updated Jun 13, 2026

mbzuai-oryx / LLMVoX

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Python 308 40 Updated May 16, 2025

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

17,882 1,128 Updated May 1, 2026

virattt / ai-hedge-fund

An AI Hedge Fund Team

Python 60,061 10,616 Updated Jun 9, 2026

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 9,322 785 Updated Mar 26, 2026

collabora / whisper-finetuning

Whisper finetuning

Python 17 4 Updated Apr 9, 2025

InternLM / xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 5,152 426 Updated Jun 12, 2026

axolotl-ai-cloud / axolotl

Go ahead and axolotl questions

Python 12,045 1,368 Updated Jun 13, 2026

neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Jupyter Notebook 14,862 2,045 Updated Nov 19, 2024

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 58,674 6,416 Updated Apr 30, 2026

myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 36,710 4,098 Updated Apr 19, 2025

Vaibhavs10 / insanely-fast-whisper

Jupyter Notebook 12,967 954 Updated Oct 25, 2025

lucidrains / denoising-diffusion-pytorch

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Python 10,604 1,283 Updated Feb 11, 2026

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 22,147 2,697 Updated Jan 23, 2026

Stability-AI / generative-models

Generative Models by Stability AI

Python 27,188 3,095 Updated Dec 16, 2025

lifeiteng / vall-e

PyTorch implementation of VALL-E(Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html

Python 2,206 333 Updated Sep 10, 2025

himansh005 / emotionally_consistent_speech

Forked from b04901014/FT-w2v2-ser

Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition

Jupyter Notebook 1 Updated Dec 12, 2022

speechbrain / benchmarks

This repository contains the SpeechBrain Benchmarks

Python 140 46 Updated Feb 3, 2026

youngbin-ro / audiotext-transformer

Multimodal Transformer for Korean Sentiment Analysis with Audio and Text Features

Python 28 7 Updated Sep 7, 2021

Starred topics

thai-language