Daisyqk

9 followers · 4 following

Achievements

Stars

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,436 11,108 Updated Nov 7, 2025

DIYgod / RSSHub

🧡 Everything is RSSible

TypeScript 39,708 8,702 Updated Nov 7, 2025

NeuroTechX / moabb

Mother of All BCI Benchmarks

Python 867 217 Updated Nov 7, 2025

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 14,126 3,252 Updated Nov 7, 2025

immersive-translate / immersive-translate

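Immersive bilingual web page translation extension, supporting input box translation, mouse-hover translation, and translation of PDF, Epub, subtitle, and TXT files - Immersive Dual Web Page Translation Extension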

16,450 943 Updated Nov 7, 2025

braindecode / braindecode

Deep learning software to decode EEG, ECG or MEG signals

Python 1,067 233 Updated Nov 7, 2025

vipshop / cache-dit

A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗Diffusers.

Python 527 20 Updated Nov 7, 2025

index-tts / index-tts

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 14,904 1,684 Updated Nov 7, 2025

dw-dengwei / daily-arXiv-ai-enhanced

Automatically crawl arXiv papers daily and summarize them using AI. Illustrating them using GitHub Pages.

JavaScript 2,034 654 Updated Nov 7, 2025

shuaijiang / Whisper-Finetune

Forked from yeyupiaoling/Whisper-Finetune

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…

C 308 24 Updated Nov 7, 2025

Lightning-AI / litgpt

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 12,903 1,347 Updated Nov 7, 2025

Soul-AILab / SoulX-Podcast

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 1,710 177 Updated Nov 6, 2025

fishaudio / fish-speech

SOTA Open Source TTS

Python 24,003 1,958 Updated Nov 6, 2025

yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader

Python 134,136 10,773 Updated Nov 5, 2025

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 20,001 2,085 Updated Nov 5, 2025

espnet / espnet

End-to-End Speech Processing Toolkit

Python 9,568 2,343 Updated Nov 5, 2025

k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…

C++ 8,743 968 Updated Nov 5, 2025

cai525 / Transformer4SED

This repository aims to collect Transformer-based sound event detection (SED) algorithms.

Jupyter Notebook 76 5 Updated Nov 4, 2025

OpenMOSS / MOSS-TTSD

MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech…

Python 1,014 87 Updated Nov 4, 2025

jianchang512 / pyvideotrans

Translate the video from one language to another and add dubbing.

Python 15,125 1,765 Updated Nov 4, 2025

datawhalechina / self-llm

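"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment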

Jupyter Notebook 25,764 2,591 Updated Nov 4, 2025

xcLee001 / SonicVale

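An open-source multi-character, multi-emotion AI voice-over generation platform that supports automatic dubbing and export for novels, scripts, videos, and other content.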

Python 194 25 Updated Nov 4, 2025

DSXiangLi / DecryptPrompt

A summary of Prompt & LLM papers, open-source data & models, and AIGC applications

3,274 316 Updated Nov 3, 2025

kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 9,077 827 Updated Nov 3, 2025

ASLP-lab / OSUM

OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU.

Python 450 29 Updated Oct 29, 2025

lucidrains / vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Python 24,370 3,428 Updated Oct 28, 2025

vibevoice-community / VibeVoice

VibeVoice: Expressive, longform conversational speech synthesis. (Community fork)

Python 685 270 Updated Oct 27, 2025

k2-fsa / ZipVoice

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 692 91 Updated Oct 27, 2025

Eric-LRL / Brain-JEPA

Official codebase for "Brain-JEPA: Brain Dynamics Foundation Model with Gradient Positioning and Spatiotemporal Masking" (NeurIPS 2024, Spotlight).

Python 140 34 Updated Oct 27, 2025

FireRedTeam / FireRedTTS2

Long-form streaming TTS system for multi-speaker dialogue generation

Python 1,198 106 Updated Oct 26, 2025