CDLM: Consistency Diffusion Language Models for Faster Sampling
[ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization