Python 145 23 Updated Oct 25, 2024

Multilingual Voice Understanding Model

Python 3,224 298 Updated Oct 18, 2024

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,516 204 Updated Aug 1, 2024

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Python 6,082 605 Updated Oct 25, 2024

Open source real-time translation app for Android that runs locally

C++ 6,727 506 Updated Sep 27, 2024

Foundational model for human-like, expressive TTS

Python 3,846 658 Updated Jul 30, 2024

A generative speech model for daily dialogue.

Python 31,904 3,477 Updated Oct 21, 2024

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 13,632 1,090 Updated May 23, 2024

Inference and training library for high-quality TTS models.

Python 4,514 457 Updated Oct 14, 2024

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 306 21 Updated Sep 3, 2024

Official repo for WavCraft, an AI agent for audio creation and editing

Python 652 96 Updated Sep 13, 2024

Awesome speech/audio LLMs, representation learning, and codec models

667 31 Updated Oct 29, 2024

Generate high-definition short videos with one click using large AI models (LLMs).

Python 16,690 2,657 Updated Jul 26, 2024

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 4,606 576 Updated Jul 2, 2024

A lightweight library for Frechet Audio Distance calculation.

Python 233 24 Updated Sep 4, 2024

trying to reproduce suno v3

24 1 Updated Mar 24, 2024

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Jupyter Notebook 7,614 743 Updated Jun 24, 2024

Brand new TTS solution

Python 13,656 1,024 Updated Oct 25, 2024

My ComfyUI workflows collection

5,001 469 Updated Oct 19, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 22,083 2,151 Updated Aug 9, 2024

AI powered speech denoising and enhancement

Python 1,381 138 Updated Jun 21, 2024

VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization.

Shell 46 4 Updated May 14, 2024

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)

Python 56,933 7,039 Updated Oct 24, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Python 11,487 1,023 Updated Oct 29, 2024

High-level API for tar-based dataset

Python 10 Updated Feb 3, 2024

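Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.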

Python 34,158 3,574 Updated Sep 23, 2024

Think DSP: Digital Signal Processing in Python, by Allen B. Downey.

Jupyter Notebook 3,960 3,221 Updated May 10, 2024

Audio Codec Speech processing Universal PERformance Benchmark

Python 209 22 Updated Sep 28, 2024

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 618 43 Updated Oct 27, 2024

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 7,879 959 Updated Oct 24, 2024