Stars
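A desktop floating-window utility that displays current network speed, CPU usage, and memory usage; it can also display in the taskbar and supports swappable skins.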
FlashMLA: Efficient Multi-head Latent Attention Kernels
High-speed Large Language Model Serving for Local Deployment
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Course resources for the School of Computer Science, University of Science and Technology of China (https://mbinary.xyz/ustc-cs/)
The RaftLib C++ library, streaming/dataflow concurrency via C++ iostream-like operators
[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
ModelChecker: A bit-level model checking tool
Leveraging Critical Proof Obligations for Efficient IC3 Verification