lxa9867

Xiang Li lxa9867

CMU -> Google DeepMind | Multimodal Understanding & Generation

89 followers · 16 following

Google DeepMind
Kirkland, WA
https://lxa9867.github.io/

Stars

Tencent-Hunyuan / HY-WorldPlay

WorldPlay: Interactive World Modeling with Real-Time Latency and Geometric Consistency

Python 605 33 Updated Dec 19, 2025

yuemingPAN / SFD

Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion

Python 297 3 Updated Dec 21, 2025

EzioBy / Ditto

[Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Python 538 43 Updated Oct 29, 2025

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python 1,367 52 Updated Nov 28, 2025

zhenye234 / X-Codec-2.0

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 334 45 Updated Jul 21, 2025

AvaLovelace1 / BrickGPT

Official repository for BrickGPT, the first approach for generating physically stable toy brick models from text prompts.

Python 1,548 94 Updated Nov 9, 2025

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,640 53 Updated Nov 15, 2025

zelaki / ReDi

[NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Python 105 5 Updated Nov 3, 2025

qiuk2 / RobusTok

Image Tokenizer Needs Post-Training

Python 24 2 Updated Oct 4, 2025

Yikai-Wang / nvg

Code for our paper "Next Visual Granularity Generation".

Python 48 1 Updated Oct 7, 2025

FoundationVision / Waver

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

779 87 Updated Aug 27, 2025

Kai-46 / KnapFormer

Python 123 5 Updated Aug 10, 2025

stepfun-ai / NextStep-1

Python 579 16 Updated Nov 10, 2025

QwenLM / Qwen-Image

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,442 361 Updated Dec 19, 2025

Wan-Video / Wan2.2

Wan: Open and Advanced Large-Scale Video Generative Models

Python 12,934 1,506 Updated Dec 17, 2025

ali-vilab / TTS-VAR

Test-time Scaling for VAR models

Python 26 3 Updated Sep 19, 2025

Jiawei-Yang / DeTok

Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers"

Jupyter Notebook 165 4 Updated Dec 17, 2025

guandeh17 / Self-Forcing

Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)

Python 2,982 219 Updated Sep 12, 2025

lxtGH / DenseWorld-1M

Code and dataset link for "DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World"

117 2 Updated Oct 2, 2025

dvlab-research / VisionThink

[NeurIPS 2025] Efficient Reasoning Vision Language Models

Python 439 29 Updated Sep 18, 2025

dc-ai-projects / DC-AR

Python 78 Updated Oct 18, 2025

HW-whistleblower / True-Story-of-Pangu

The true story of the heartache and darkness behind the development of the Noah Pangu large models.

11,369 1,348 Updated Jul 9, 2025

camlab-ethz / GAOT

[NeurIPS 2025] Geometry Aware Operator Transformer As An Efficient And Accurate Neural Surrogate For PDEs On Arbitrary Domains

Python 67 12 Updated Oct 23, 2025

huggingface / trl

Train transformer language models with reinforcement learning.

Python 16,723 2,371 Updated Dec 20, 2025

XueZeyue / DanceGRPO

An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation

Python 1,348 66 Updated Oct 16, 2025

ByteVisionLab / DetailFlow

🔥 Official impl. of "DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction"

Python 161 8 Updated Jul 10, 2025

Paper2Poster / Paper2Poster

[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers

Python 2,987 204 Updated Dec 21, 2025

buoyancy99 / diffusion-forcing

code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"

Python 1,114 60 Updated Nov 9, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,489 481 Updated Oct 27, 2025

SandAI-org / MAGI-1

MAGI-1: Autoregressive Video Generation at Scale

Python 3,611 227 Updated Jun 17, 2025
