- Univ. of Sci. & Tech. of China (USTC)
- China
- https://orcid.org/0000-0003-3065-4606
- https://gitee.com/wangxuan95
- https://www.zhihu.com/people/wang-xuan-12-89/posts
Stars
Scalable long-context LLM decoding that leverages sparsity by treating the KV cache as a vector storage system.
High performance block-sorting data compression library
High-speed lossless data compression of 16 to 512 bytes: better average compression than QuickLZ on 512-byte blocks, and td512 maintains good compression down to 16-byte blocks.
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Daily updated LLM papers. LLM-related papers updated every day; subscriptions welcome 👏 If you like it, give it a star 🌟
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
MQSim is a fast & accurate simulator for modern multi-queue (MQ) and SATA SSDs. MQSim faithfully models new high-bandwidth protocol implementations, steady-state SSD conditions, and full end-to-end…
Open Source SSD Controller. NVMe and Lightstor variants
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Using an LLM to evaluate the MMLU dataset.
A collection of benchmarks and datasets for evaluating LLM.
qoi and qoi-like implementations optionally using simd
The simplest, fastest repository for training/finetuning medium-sized GPTs.
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
AISystem mainly covers AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.
Official implementation for Yuan & Liu & Zhong et al., KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches. EMNLP Findings 2024
Insane(ly slow but wicked good) PNG image optimization
cmix is a lossless data compression program aimed at optimizing compression ratio at the cost of high CPU/memory usage.
A random event driven text-based game engine.