Official Implementation of SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

Python 446 19 Updated Dec 21, 2025

PersonaLive! : Expressive Portrait Image Animation for Live Streaming

Python 665 67 Updated Dec 19, 2025

Taming large-scale full-parameter few-step training with self-adversarial flows! 👏🏻

Python 298 17 Updated Dec 15, 2025

Kandinsky 5.0: A family of diffusion models for Video & Image generation

Python 603 41 Updated Dec 19, 2025

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Python 10,553 819 Updated Dec 4, 2024

Official implementation of "Towards One-Step Causal Video Generation via Adversarial Self-Distillation" (arXiv 2025). A novel framework for efficient and causal video generation using adversarial s…

19 Updated Nov 4, 2025

This is the official repo for the paper "LongCat-Flash-Omni Technical Report"

Python 445 24 Updated Dec 15, 2025

SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.

Python 2,768 342 Updated Dec 11, 2025

rCM: SOTA Diffusion Distillation & Few-Step Video Generation based on sCM/MeanFlow

Python 385 14 Updated Dec 12, 2025

Official implementation for SSDD Single-Step Diffusion Decoder for Efficient Image Tokenization.

Jupyter Notebook 50 4 Updated Nov 12, 2025
Python 1,461 152 Updated Nov 15, 2025

Cut2Next: Generating Next Shot via In-Context Tuning

30 Updated Aug 21, 2025

A minimal implementation of DeepMind's Genie world model

Python 1,062 84 Updated Nov 22, 2025

LongLive: Real-time Interactive Long Video Generation

Python 921 63 Updated Dec 4, 2025

MiMo-Audio: Audio Language Models are Few-Shot Learners

Python 906 87 Updated Sep 20, 2025

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python 3,035 328 Updated Dec 20, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,682 1,354 Updated Dec 17, 2025

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Python 1,029 178 Updated Oct 19, 2025

Official repository for the UAE paper, unified-GRPO, and unified-Bench

Python 151 6 Updated Sep 12, 2025

SpatialVID: A Large-Scale Video Dataset with Spatial Annotations

Python 449 14 Updated Dec 15, 2025
Python 13 1 Updated Dec 1, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 16,907 2,036 Updated Dec 2, 2025

[NeurIPS 2025] PyTorch implementation of ThinkSound, a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.

Python 1,108 65 Updated Nov 25, 2025

An iterative, feedback-driven benchmark of LLMs' instruction-following ability

Python 46 4 Updated Sep 24, 2025

Code for Vivid-VR: Distilling Concepts from Text-to-Video Diffusion Transformer for Photorealistic Video Restoration

Python 205 19 Updated Oct 30, 2025

Official Repository of "OmniTry: Virtual Try-On Anything without Masks"

Python 234 29 Updated Aug 29, 2025