- Tsinghua University
- Beijing, China
- https://orcid.org/0000-0003-2975-1732
- @zijie_tian
Highlights
- Pro
- x-attention Public
  Forked from mit-han-lab/x-attention
  [ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring
  Python · Updated Dec 21, 2025
- ShadowKV Public
  Forked from ByteDance-Seed/ShadowKV
  [ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
  Python · Apache License 2.0 · Updated Dec 20, 2025
- RULER Public
  Forked from NVIDIA/RULER
  This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
  Python · Apache License 2.0 · Updated Dec 20, 2025
- SpargeAttn Public
  Forked from thu-ml/SpargeAttn
  [ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
  Cuda · Apache License 2.0 · Updated Dec 18, 2025
- MInference Public
  Forked from microsoft/MInference
  [NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention, which reduces inference latency by up to 10x for pre-filli…
  Python · MIT License · Updated Dec 16, 2025
- nano-vllm Public
  Forked from GeeeekExplorer/nano-vllm
  Nano vLLM
  Python · MIT License · Updated Dec 15, 2025
- Block-Sparse-Attention Public
  Forked from mit-han-lab/Block-Sparse-Attention
  A sparse attention kernel supporting mix sparse patterns
  C++ · BSD 3-Clause "New" or "Revised" License · Updated Dec 14, 2025
- flashinfer Public
  Forked from flashinfer-ai/flashinfer
  FlashInfer: Kernel Library for LLM Serving
  C++ · Apache License 2.0 · Updated Dec 11, 2025
- flash-attention Public
  Forked from Dao-AILab/flash-attention
  Fast and memory-efficient exact attention
  Python · BSD 3-Clause "New" or "Revised" License · Updated Dec 9, 2025
- NEO Public
  Forked from NEO-MLSys25/NEO
  NEO is a LLM inference engine built to save the GPU memory crisis by CPU offloading
  Python · Apache License 2.0 · Updated Dec 9, 2025
- sglang Public
  Forked from sgl-project/sglang
  SGLang is a fast serving framework for large language models and vision language models.
  Python · Apache License 2.0 · Updated Dec 5, 2025
- vllm Public
  Forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python · Apache License 2.0 · Updated Dec 5, 2025
- llama.cpp Public
  Forked from ggml-org/llama.cpp
  LLM inference in C/C++
  C++ · MIT License · Updated Dec 1, 2025
- nano-llama.cpp Public
  Forked from JINO-ROHIT/nano-llama.cpp
  a repo to understand llama.cpp
  C++ · Updated Dec 1, 2025
- T-MAC Public
  Forked from microsoft/T-MAC
  Low-bit LLM inference on CPU with lookup table
- LLMTest_NeedleInAHaystack Public
  Forked from gkamradt/LLMTest_NeedleInAHaystack
  Doing simple retrieval from LLM models at various context lengths to measure accuracy
  Jupyter Notebook · Other · Updated Aug 8, 2025
- ccf-deadlines Public
  Forked from ccfddl/ccf-deadlines
  ⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python Cli, Wechat Applet) / If you find it useful, please star this project, thanks~
  Vue · MIT License · Updated Aug 7, 2025
- KIVI Public
  Forked from jy-yuan/KIVI
  [ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
  Python · MIT License · Updated Aug 5, 2025
- neovim Public
  Forked from neovim/neovim
  Vim-fork focused on extensibility and usability
  Vim Script · Other · Updated Jul 26, 2025
- bookmarks.nvim Public
  Forked from heilgar/bookmarks.nvim
  A Neovim plugin for managing line bookmarks with Telescope integration and SQLite storage. Mark, organize, and quickly navigate between important locations in your codebase.
  Lua · MIT License · Updated Jul 19, 2025
- codex Public
  Forked from openai/codex
  Lightweight coding agent that runs in your terminal
  TypeScript · Apache License 2.0 · Updated May 22, 2025
- ggml Public
  Forked from ggml-org/ggml
  Tensor library for machine learning
  C++ · MIT License · Updated May 13, 2025
- kleidiai Public
  Forked from ARM-software/kleidiai
  This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai
  C · Updated May 3, 2025
- sleef Public
  Forked from shibatch/sleef
  SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
  C · Boost Software License 1.0 · Updated Apr 30, 2025
- Catch2 Public
  Forked from catchorg/Catch2
  A modern, C++-native, test framework for unit-tests, TDD and BDD - using C++14, C++17 and later (C++11 support is in v2.x branch, and C++03 on the Catch1.x branch)
  C++ · Boost Software License 1.0 · Updated Apr 29, 2025