Odysseusq
  • Dept. of EEE, HKU
  • Hong Kong

Highlights

  • Pro


Nano Megatron

Python 1 Updated Feb 6, 2026

📰 Must-read papers on KV Cache Compression (constantly updating 🤗).

714 26 Updated Apr 15, 2026

Official Repo for paper "VLCache: Computing 2% Vision Tokens and Reusing 98% for Vision-Language Inference"

Python 14 1 Updated Mar 28, 2026

[NeurIPS'25] The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing"

Python 90 13 Updated Apr 7, 2026

Nano vLLM

Python 14,022 2,212 Updated Apr 26, 2026

A safetensors extension to efficiently store sparse quantized tensors on disk

Python 292 93 Updated Jun 12, 2026

Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

Python 2,320 201 Updated Mar 27, 2024

A collection of noteworthy MLSys bloggers (Algorithms/Systems)

HTML 341 9 Updated Jan 5, 2025

Expert Parallelism Load Balancer

Python 1,388 203 Updated Mar 24, 2025

HLLM: Enhancing Sequential Recommendations via Hierarchical Large Language Models for Item and User Modeling

Python 623 82 Updated Aug 26, 2025

Fully open reproduction of DeepSeek-R1

Python 26,311 2,439 Updated Apr 2, 2026

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 3,561 317 Updated Jul 17, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 72,156 8,827 Updated Jun 13, 2026

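Llama Chinese community: a continuously updated collection of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use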

Python 14,715 1,301 Updated Apr 6, 2025

ChatGLM2-6B: An Open Bilingual Chat LLM

Python 15,568 1,805 Updated Jun 27, 2024

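A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; it runs on a single 24 GB GPU and yields a chat-llama2 capable of simple Chinese question answering.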

Python 2,921 356 Updated May 21, 2024

Google Research

Jupyter Notebook 38,130 8,429 Updated Jun 12, 2026

PyTorch implementation of the paper: Long-tail Learning via Logit Adjustment

Python 117 11 Updated Sep 7, 2021

EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit

Python 2,178 257 Updated Nov 27, 2024