🏠
study of learning

Memory for AI Agents in 6 lines of code

Python 10,399 953 Updated Dec 20, 2025

SoTA open-source TTS

Python 16,438 2,247 Updated Dec 15, 2025

vits2 backbone with multilingual-bert

Python 8,643 1,255 Updated Dec 15, 2025

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 43,937 5,858 Updated Aug 16, 2024

GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Python 2,052 139 Updated Dec 18, 2025

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 53,304 5,834 Updated Dec 19, 2025

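A lightweight, flexible, easy-to-use Python tool for generating and exporting JianYing (剪映) drafts, for building fully automated video editing and remix pipelines. The CapCut version of this project is being developed at https://github.com/GuanYixuan/pyCapCut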

Python 2,421 486 Updated Nov 5, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 65,796 12,074 Updated Dec 20, 2025

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 18,095 2,010 Updated Dec 17, 2025
Python 472 43 Updated May 19, 2025

A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

Python 988 117 Updated Dec 15, 2025

Faster Whisper transcription with CTranslate2

Python 19,539 1,630 Updated Nov 19, 2025

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 19,248 2,049 Updated Oct 21, 2025

Robust Speech Recognition via Large-Scale Weak Supervision

Python 92,163 11,547 Updated Dec 15, 2025

A generative speech model for daily dialogue.

Python 38,356 4,163 Updated Dec 3, 2025

[AAAI 2026] EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation

Python 672 69 Updated Nov 24, 2025

You can use EchoMimic in ComfyUI

Python 680 78 Updated Aug 26, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 16,846 2,021 Updated Dec 2, 2025

Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

TypeScript 163,752 52,309 Updated Dec 20, 2025

ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…

Python 12,856 1,778 Updated Dec 19, 2025

Nodes related to video workflows

Python 1,394 252 Updated Dec 17, 2025

IndexTTS Voice Cloning: Supports two-person dialogue

Python 458 43 Updated Nov 7, 2025

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 97,443 11,046 Updated Dec 20, 2025

Gen: Friendly & Safer GORM powered by Code Generation

Go 2,513 349 Updated Dec 15, 2025

Community maintained hardware plugin for vLLM on Ascend

Python 1,485 670 Updated Dec 20, 2025

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 60,921 7,523 Updated Oct 4, 2025

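Alibaba's distributed database synchronization system (built to handle geographically separate data centers in China and the US)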

Java 8,131 2,483 Updated May 25, 2024

📥 An IMAP library for Go clients and servers

Go 2,274 336 Updated Dec 16, 2025

⚡️ Express inspired web framework written in Go

Go 38,775 1,925 Updated Dec 19, 2025

The Official Golang driver for MongoDB

Go 8,492 919 Updated Dec 18, 2025