Zijie-Tian

💭

I may be slow to respond.

Zijie Tian Zijie-Tian

PhD Student

30 followers · 268 following

Tsinghua University
Beijing, China
https://orcid.org/0000-0003-2975-1732
@zijie_tian

x-attention Public
Forked from mit-han-lab/x-attention

[ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring

Python Updated Dec 21, 2025
ShadowKV Public
Forked from ByteDance-Seed/ShadowKV

[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Python Apache License 2.0 Updated Dec 20, 2025
RULER Public
Forked from NVIDIA/RULER

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?

Python Apache License 2.0 Updated Dec 20, 2025
SpargeAttn Public
Forked from thu-ml/SpargeAttn

[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.

Cuda Apache License 2.0 Updated Dec 18, 2025
MInference Public
Forked from microsoft/MInference

[NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention, which reduces inference latency by up to 10x for pre-filli…

Python MIT License Updated Dec 16, 2025
nano-vllm Public
Forked from GeeeekExplorer/nano-vllm

Nano vLLM

Python MIT License Updated Dec 15, 2025
Block-Sparse-Attention Public
Forked from mit-han-lab/Block-Sparse-Attention

A sparse attention kernel supporting mix sparse patterns

C++ BSD 3-Clause "New" or "Revised" License Updated Dec 14, 2025
flashinfer Public
Forked from flashinfer-ai/flashinfer

FlashInfer: Kernel Library for LLM Serving

C++ Apache License 2.0 Updated Dec 11, 2025
flash-attention Public
Forked from Dao-AILab/flash-attention

Fast and memory-efficient exact attention

Python BSD 3-Clause "New" or "Revised" License Updated Dec 9, 2025
NEO Public
Forked from NEO-MLSys25/NEO

NEO is an LLM inference engine built to relieve the GPU memory crisis through CPU offloading

Python Apache License 2.0 Updated Dec 9, 2025
sglang Public
Forked from sgl-project/sglang

SGLang is a fast serving framework for large language models and vision language models.

Python Apache License 2.0 Updated Dec 5, 2025
vllm Public
Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python Apache License 2.0 Updated Dec 5, 2025
llama.cpp Public
Forked from ggml-org/llama.cpp

LLM inference in C/C++

C++ MIT License Updated Dec 1, 2025
nano-llama.cpp Public
Forked from JINO-ROHIT/nano-llama.cpp

a repo to understand llama.cpp

C++ Updated Dec 1, 2025
zijie-tian.github.io Public

Astro MIT License Updated Oct 20, 2025
T-MAC Public
Forked from microsoft/T-MAC

Low-bit LLM inference on CPU with lookup table

C++ 1 MIT License Updated Sep 3, 2025
cpubench Public

C++ Updated Aug 17, 2025
LLMTest_NeedleInAHaystack Public
Forked from gkamradt/LLMTest_NeedleInAHaystack

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Jupyter Notebook Other Updated Aug 8, 2025
ccf-deadlines Public
Forked from ccfddl/ccf-deadlines

⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python Cli, Wechat Applet) / If you find it useful, please star this project, thanks~

Vue MIT License Updated Aug 7, 2025
KIVI Public
Forked from jy-yuan/KIVI

[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Python MIT License Updated Aug 5, 2025
neovim Public
Forked from neovim/neovim

Vim-fork focused on extensibility and usability

Vim Script Other Updated Jul 26, 2025
bookmarks.nvim Public
Forked from heilgar/bookmarks.nvim

A Neovim plugin for managing line bookmarks with Telescope integration and SQLite storage. Mark, organize, and quickly navigate between important locations in your codebase.

Lua MIT License Updated Jul 19, 2025
test-repo Public

A new repository for testing purposes

Updated Jul 11, 2025
theme-terminal Public
Forked from wan92hen/theme-terminal

HTML Updated Jun 30, 2025
codex Public
Forked from openai/codex

Lightweight coding agent that runs in your terminal

TypeScript Apache License 2.0 Updated May 22, 2025
headinfer Public
Forked from wdlctc/headinfer

Python Updated May 16, 2025
ggml Public
Forked from ggml-org/ggml

Tensor library for machine learning

C++ MIT License Updated May 13, 2025
kleidiai Public
Forked from ARM-software/kleidiai

This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai

C Updated May 3, 2025
sleef Public
Forked from shibatch/sleef

SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT

C Boost Software License 1.0 Updated Apr 30, 2025
Catch2 Public
Forked from catchorg/Catch2

A modern, C++-native, test framework for unit-tests, TDD and BDD - using C++14, C++17 and later (C++11 support is in v2.x branch, and C++03 on the Catch1.x branch)

C++ Boost Software License 1.0 Updated Apr 29, 2025
