Community governance guidelines for the ModelEngine project group.

42 Updated Mar 27, 2026

[VLDB 26, NeurIPS 25] Scalable long-context LLM decoding that leverages sparsity—by treating the KV cache as a vector storage system.

Python 134 26 Updated Feb 22, 2026

High performance block-sorting data compression library

C 343 64 Updated Feb 3, 2026

High-speed lossless data compression of 16 to 512 bytes--get better average compression than QuickLZ for 512-byte blocks. td512 maintains good compression down to 16-byte blocks.

C 27 Updated Feb 14, 2022

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

Python 112 18 Updated Oct 15, 2024

Daily updated LLM papers. Subscriptions welcome 👏 If you like it, please give it a star 🌟

1,237 53 Updated Jul 31, 2024

Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.

419 26 Updated Mar 3, 2025
Python 310 30 Updated Jul 10, 2025

MQSim is a fast & accurate simulator for modern multi-queue (MQ) and SATA SSDs. MQSim faithfully models new high-bandwidth protocol implementations, steady-state SSD conditions, and full end-to-end…

C++ 354 181 Updated Aug 25, 2025
C++ 75 14 Updated May 30, 2023

Open Source SSD Controller. NVMe and Lightstor variants

Bluespec 17 23 Updated May 21, 2014

An introductory guide to modern graphics engines

C++ 469 53 Updated Dec 16, 2025

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Cuda 238 22 Updated Sep 24, 2023

Open-source DDR3 Controller

Verilog 423 65 Updated Jan 18, 2026

Using LLM to evaluate MMLU dataset.

Python 42 3 Updated Mar 8, 2024

A collection of benchmarks and datasets for evaluating LLMs.

559 34 Updated Jul 13, 2024

qoi and qoi-like implementations optionally using simd

C 12 1 Updated Nov 28, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 55,681 9,492 Updated Nov 12, 2025

📰 Must-read papers on KV Cache Compression (constantly updating 🤗).

677 22 Updated Feb 24, 2026

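AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks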

Jupyter Notebook 16,524 2,343 Updated Sep 3, 2025

KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches. EMNLP Findings 2024

Python 89 3 Updated Feb 27, 2025

Fast LZMA2 Library

C 330 32 Updated Jan 11, 2026

Translation of the SystemVerilog assertion section

8 1 Updated Jul 15, 2024

Insane(ly slow but wicked good) PNG image optimization

Python 3,429 148 Updated Jun 18, 2022

cmix is a lossless data compression program aimed at optimizing compression ratio at the cost of high CPU/memory usage.

C++ 696 53 Updated Dec 7, 2025

Undergraduate final project

C++ 20 4 Updated Nov 2, 2014

A random event driven text-based game engine.

TypeScript 260 38 Updated Aug 19, 2024

state-of-the-art lossless audio compression

C++ 63 6 Updated Mar 1, 2026