Stars
PyTorch code and models for V-JEPA 2 self-supervised learning from video.
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
MOVA: Towards Scalable and Synchronized Video–Audio Generation
MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enablin…
[NeurIPS 2025 D&B🔥] OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
[CVPR 2026] UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
InteractAvatar is a novel dual-stream DiT framework that enables talking avatars to perform Grounded Human-Object Interaction (GHOI)
[RSS 2025] Learning to Act Anywhere with Task-centric Latent Actions
[CVPR 2026] OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Towards Scalable Pre-training of Visual Tokenizers for Generation
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
Official code of Motus: A Unified Latent Action World Model
LLM algorithm-role interview questions (with answers): common questions and concept explanations. Keywords: "LLM interview questions", "algorithm-role interviews", "common interview questions", "LLM algorithm interviews", "LLM application fundamentals"
Qwen3-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Seed1.5-VL is a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 of 60 public benchmarks.
[TPAMI 2025] Official Code for "SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation"
[AAAI 2026] EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
HunyuanVideo-1.5: A leading lightweight video generation model
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.