The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen)
- https://www.zhangxueyao.com/
Stars
Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" (a flow-matching sketch follows this list).
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
Train transformer language models with reinforcement learning.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention); a DPO loss sketch follows this list.
State-of-the-art zero-shot voice conversion & singing voice conversion with in-context learning.
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
The Emotional Voices Database: Towards Controlling the Emotional Expressiveness in Voice Generation Systems
Code for the ICML 2020 paper "CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information" (an estimator sketch follows this list).
[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Inference and training library for high-quality TTS models.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Multilingual large voice generation model providing full-stack inference, training, and deployment capabilities.
The official GitHub page for the survey paper "Foundation Models for Music: A Survey".
A library for speech data augmentation in the time domain.
Diffusion Model for Voice Conversion
PolySinger: Singing-Voice to Singing-Voice Translation From English to Japanese
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI foley artist that adds vivid, synchronized sound effects to your silent videos 😝
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…
Code for "Adam-mini: Use Fewer Learning Rates To Gain More" (https://arxiv.org/abs/2406.16793).
This is the GitHub page for publicly available emotional speech data.
Public Code for Neural Codec Language Models for Disentangled and Textless Voice Conversion (Interspeech 2024)
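The F5-TTS entry above is built on flow matching. Purely as an illustration of the underlying objective, and not code from that repository, the sketch below trains a velocity network on a linear path between noise and data; `VelocityNet`, the hidden size, and the mel-spectrogram shapes are hypothetical placeholders.

```python
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Hypothetical velocity-field model v_theta(x_t, t, cond)."""
    def __init__(self, dim: int, cond_dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + cond_dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x_t, t, cond):
        # Broadcast the scalar timestep to one extra feature per frame.
        t_feat = t.expand(x_t.shape[:-1] + (1,))
        return self.net(torch.cat([x_t, cond, t_feat], dim=-1))

def flow_matching_loss(model, x1, cond):
    """Conditional flow matching on a linear path x_t = (1 - t) * x0 + t * x1."""
    x0 = torch.randn_like(x1)                             # noise endpoint
    t = torch.rand(x1.shape[0], 1, 1, device=x1.device)   # one timestep per batch item
    x_t = (1 - t) * x0 + t * x1
    target_velocity = x1 - x0                             # d x_t / d t along the path
    pred_velocity = model(x_t, t, cond)
    return ((pred_velocity - target_velocity) ** 2).mean()

# Hypothetical usage on mel-spectrograms of shape [batch, frames, n_mels]:
# model = VelocityNet(dim=n_mels, cond_dim=text_embedding_dim)
# loss = flow_matching_loss(model, mel, text_cond)
```

At inference, speech would be generated by integrating the learned velocity field from noise to data with an ODE solver; the actual repositories add their own text/duration conditioning and architectures.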
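The RLHF entries above (TRL, OpenRLHF) both support DPO. As an illustration of the underlying objective only, and not those libraries' APIs, here is the standard DPO loss computed from per-sequence log-probabilities; the argument names are placeholders, and the frameworks handle log-prob extraction, batching, and the frozen reference model themselves.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss.

    Each input is a [batch]-shaped tensor of summed token log-probs for the
    preferred (chosen) or dispreferred (rejected) response, under either the
    policy being trained or the frozen reference model.
    """
    chosen_margin = policy_chosen_logps - ref_chosen_logps        # implicit reward of chosen
    rejected_margin = policy_rejected_logps - ref_rejected_logps  # implicit reward of rejected
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
```

Pairwise preference data such as the Anthropic helpful/harmless dataset listed above supplies the chosen/rejected responses; in iterative DPO, new preference pairs are roughly collected from the updated policy between training rounds.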
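The CLUB entry above defines a mutual-information upper bound, I_CLUB = E_{p(x,y)}[log q(y|x)] − E_{p(x)}E_{p(y)}[log q(y|x)], using a variational approximation q_θ(y|x). A minimal sketch, assuming a diagonal-Gaussian q_θ parameterized by small MLPs (layer sizes and names are illustrative, not the official implementation):

```python
import torch
import torch.nn as nn

class CLUBEstimator(nn.Module):
    """Sampled CLUB upper bound on I(X; Y) with a Gaussian q_theta(y | x)."""

    def __init__(self, x_dim: int, y_dim: int, hidden: int = 256):
        super().__init__()
        self.mu = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, y_dim))
        self.logvar = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(), nn.Linear(hidden, y_dim))

    def _log_prob(self, mu, logvar, y):
        # Gaussian log-density up to an additive constant (constants cancel in the bound).
        return -0.5 * (((y - mu) ** 2) / logvar.exp() + logvar).sum(dim=-1)

    def forward(self, x, y):
        # Upper-bound estimate: matched pairs minus the average over all cross pairs.
        mu, logvar = self.mu(x), self.logvar(x)
        positive = self._log_prob(mu, logvar, y)                    # [N]
        negative = self._log_prob(mu.unsqueeze(1),                  # [N, N]: log q(y_j | x_i)
                                  logvar.unsqueeze(1),
                                  y.unsqueeze(0))
        return (positive - negative.mean(dim=1)).mean()

    def learning_loss(self, x, y):
        # q_theta itself is fit by maximum likelihood on the matched pairs.
        mu, logvar = self.mu(x), self.logvar(x)
        return -self._log_prob(mu, logvar, y).mean()
```

When CLUB is used for disentanglement (e.g., separating speaker from content representations), q_θ is updated with learning_loss while the main encoders are trained to minimize the forward() estimate.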