- Tsinghua University
- Beijing, China
- https://nicsefc.ee.tsinghua.edu.cn/people/TongchengFang
Stars
Localize a memorized sequence in LLMs (NAACL 2024)
Resurrect Mask AutoRegressive Modeling for Efficient and Scalable Image Generation.
Hierarchical Reasoning Model Official Release
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666.
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"
Programmer's guide to cooking at home (Simplified Chinese only).
[NeurIPS 2025] Latent Zoning Networks
The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
🚀 Efficient implementations of state-of-the-art linear attention models
RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
The open-source Mixture of Depths code and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for Enabling Dynamic Depth in Transformers" (EMNLP 2025)
[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Official repository of Agent Attention (ECCV2024)
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems