Localize a memorized sequence in LLMs (NAACL 2024)

Python 9 1 Updated Jul 17, 2024

Resurrect Mask AutoRegressive Modeling for Efficient and Scalable Image Generation.

Python 14 Updated Jul 21, 2025
Python 164 8 Updated Nov 26, 2025

Hierarchical Reasoning Model Official Release

Python 12,165 1,778 Updated Sep 9, 2025

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666.

624 35 Updated Dec 21, 2025
Python 8 Updated Oct 23, 2025

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention

Python 179 9 Updated Dec 21, 2025

Official implementation of the paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"

Python 57 3 Updated Jul 1, 2025
Python 29 Updated Jun 9, 2025

A programmer's guide to cooking at home (Simplified Chinese only).

Dockerfile 96,595 10,729 Updated Dec 9, 2025

[NeurIPS 2025] Latent Zoning Networks

Python 56 2 Updated Oct 29, 2025
Python 8 3 Updated Sep 11, 2025

The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Jupyter Notebook 672 43 Updated Dec 20, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,098 335 Updated Dec 20, 2025
Jupyter Notebook 383 74 Updated Dec 21, 2025

RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.

Python 1,774 169 Updated Dec 22, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,448 1,999 Updated Nov 1, 2025

The open-source Mixture of Depths code and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for Enabling Dynamic Depth in Transformers" (EMNLP 2025)

Python 26 3 Updated Oct 1, 2025

[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Python 347 22 Updated Aug 11, 2025

📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

Python 474 24 Updated Nov 28, 2025

An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation

Python 1,353 66 Updated Oct 16, 2025

(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Python 1,117 63 Updated Aug 7, 2025

DDT: Decoupled Diffusion Transformer

Python 344 17 Updated Aug 22, 2025

Official repository of Agent Attention (ECCV2024)

Python 654 44 Updated Nov 17, 2024

[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation

Python 2,727 460 Updated Dec 18, 2025

Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)

Python 2,983 219 Updated Sep 12, 2025

Light Video Generation Inference Framework

Python 1,258 80 Updated Dec 19, 2025

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 4,835 322 Updated Dec 21, 2025

A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems

396 19 Updated Sep 22, 2025