Stars
Code for "HiChunk: Evaluating and Enhancing Retrieval-Augmented Generation with Hierarchical Chunking"
Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Upcoming Survey.
A simple yet powerful agent framework that delivers with open-source models
A template repo for Python packages
Customized Inference Engine for Multiverse Models
[ICLR 2024] Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding
A Recipe for Building LLM Reasoners to Solve Complex Instructions
SGLang is a fast serving framework for large language models and vision language models.
Bag of Tricks for Inference-time Computation of LLM Reasoning
Paper list for Efficient Reasoning.
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.
Chain of Thought (CoT) is so hot, and so long! We need a shorter reasoning process!
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models
A high-throughput and memory-efficient inference and serving engine for LLMs
[NeurIPS 2025] A simple extension to vLLM that helps you speed up reasoning models without training.
Democratizing Reinforcement Learning for LLMs
Code and data for the Chain-of-Draft (CoD) paper
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
Awesome-Long2short-on-LRMs is a collection of state-of-the-art long2short methods for large reasoning models. It contains papers, code, datasets, evaluations, and analyses.
Force DeepSeek R1 models to think for as long as you wish