[NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up long-context LLM inference, computes attention approximately with dynamic sparsity, which reduces inference latency by up to 10x for pre-filli…

Python 1,147 63 Updated Sep 30, 2025

THUDM / LongBench

LongBench v2 and LongBench (ACL 25'&24')

Python 1,008 107 Updated Jan 15, 2025

kuleshov-group / bd3lms

[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Python 880 45 Updated Jul 10, 2025

undercasetype / Fraunces

Git Repository for Fraunces Font Family

Python 654 22 Updated Oct 21, 2025

spylang / spy

SPy language

Python 631 36 Updated Nov 4, 2025

spakin / SimpInkScr

Simple Inkscape Scripting

Python 401 35 Updated Oct 28, 2025

OpenLMLab / LEval

[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark

Python 391 13 Updated Jul 9, 2024

thunlp / InfLLM

The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"

Python 387 36 Updated Apr 20, 2024

rjl493456442 / leveldb-handbook

A step-by-step analysis of the leveldb source code

Python 366 73 Updated Nov 11, 2024

NVlabs / GatedDeltaNet

[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule

Python 357 21 Updated Sep 15, 2025

OpenBMB / InfiniteBench

Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718

Python 355 29 Updated Sep 25, 2024

Pold87 / academic-keyword-occurrence

Extracts the historic word occurrence of a search term in academic papers

Python 327 89 Updated Feb 11, 2024

InferenceMAX / InferenceMAX

Python 322 39 Updated Nov 6, 2025

interestingLSY / swiftLLM

A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).

Python 286 34 Updated Jun 10, 2025

FasterDecoding / SnapKV

Python 286 23 Updated Jul 10, 2025

ByteDance-Seed / ShadowKV

[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Python 269 18 Updated May 1, 2025

Infini-AI-Lab / MagicPIG

[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation

Python 238 16 Updated Dec 16, 2024

张丘洋 cs-qyzhang

Starred repositories

Artificial Intelligence

NoSQL

Linux

LaTeX

Go

Data structures

Database

C++

C