Stars
dInfer: An Efficient Inference Framework for Diffusion Language Models
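Sing-box deluxe bundle: a four-in-one protocol script for VPS use, with three exclusive features: switching between self-signed and ACME certificates, Argo fixed and temporary dual tunnels (which can coexist), and Psiphon VPN (30 countries) traffic splitting. Hostuno three-in-one proxy script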
TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models"
This is the official code repository for our paper submitted to ACL Findings 2024, titled "Mitigating Boundary Ambiguity and Inherent Bias for Text Classification in the Era of Large Language Models".
Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
This repository is the code implementation for our paper "Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning"
Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache).
Paper list, tutorial, and nano code snippets for Diffusion Large Language Models.
Improving Pseudo Labels with Global-Local Denoising Framework for Cross-lingual Named Entity Recognition (IJCAI 2024)
[ICML2025] Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Official PyTorch implementation for "Large Language Diffusion Models"
Understanding R1-Zero-Like Training: A Critical Perspective
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
[CVPR2025] Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think