8 starred repositories written in Python

My learning notes for ML SYS.

Python · 5,792 stars · 376 forks · Updated Mar 19, 2026

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python · 1,028 stars · 141 forks · Updated Mar 28, 2026

Ring attention implementation with flash attention

Python · 998 stars · 96 forks · Updated Sep 10, 2025
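
The idea behind ring attention: shard the sequence so each device keeps its own query chunk while key/value chunks rotate around a ring, merging partial results with an online (log-sum-exp) softmax. A minimal single-process sketch of that merge in plain PyTorch, with illustrative names rather than this repo's API:

```python
import torch

def ring_attention_sim(q, k, v, n_chunks=4):
    """Single-process simulation of ring attention.

    Each 'device' owns one query chunk; key/value chunks rotate
    around the ring, and partial outputs are merged with an online
    (log-sum-exp) softmax so the result matches full attention.
    """
    q_chunks = q.chunk(n_chunks, dim=0)
    k_chunks = list(k.chunk(n_chunks, dim=0))
    v_chunks = list(v.chunk(n_chunks, dim=0))
    scale = q.shape[-1] ** -0.5

    outputs = []
    for qi in q_chunks:
        # Running output and log-sum-exp for the online softmax.
        acc = torch.zeros_like(qi)
        lse = torch.full((qi.shape[0], 1), float("-inf"))
        for kj, vj in zip(k_chunks, v_chunks):  # one ring step per K/V chunk
            scores = qi @ kj.T * scale
            block_lse = torch.logsumexp(scores, dim=-1, keepdim=True)
            block_out = torch.softmax(scores, dim=-1) @ vj
            # Rescale the running result against the new normalizer.
            new_lse = torch.logaddexp(lse, block_lse)
            acc = acc * (lse - new_lse).exp() + block_out * (block_lse - new_lse).exp()
            lse = new_lse
        outputs.append(acc)
    return torch.cat(outputs, dim=0)

q, k, v = (torch.randn(16, 8) for _ in range(3))
ref = torch.softmax(q @ k.T * 8 ** -0.5, dim=-1) @ v
assert torch.allclose(ring_attention_sim(q, k, v), ref, atol=1e-5)
```

Because only the running (acc, lse) state is kept per query chunk, peak activation memory scales with the chunk size rather than the full sequence length.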

FlagGems is an operator library for large language models implemented in the Triton Language.

Python · 934 stars · 299 forks · Updated Mar 29, 2026
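
Operators in such a library are written as tiled Triton kernels. A minimal elementwise-add kernel in the standard Triton tutorial pattern (illustrative only, not FlagGems code; needs a GPU to run):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide tile.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final tile
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

Kernels of this shape are the building blocks; production operator libraries layer autotuning and broad dtype coverage on top.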

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long-Context Transformer Model Training and Inference

Python · 655 stars · 79 forks · Updated Jan 15, 2026
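
The 2D (hybrid) scheme composes Ulysses-style parallelism (all-to-all over attention heads) with ring-style sequence parallelism on a 2D device mesh. A shape-arithmetic sketch of the split, with assumed mesh sizes and illustrative names rather than the library's API:

```python
# Shape arithmetic for 2D (hybrid) sequence parallelism.
# Assumed mesh: ring_degree x ulysses_degree "devices".
seq_len, n_heads = 32768, 32
ring_degree, ulysses_degree = 2, 4

# Each device initially holds seq_len / (ring * ulysses) tokens, all heads.
tokens_per_device = seq_len // (ring_degree * ulysses_degree)   # 4096

# Ulysses all-to-all inside each ring group trades sequence for heads:
# each device then sees seq_len / ring tokens but n_heads / ulysses heads.
tokens_after_a2a = tokens_per_device * ulysses_degree           # 16384
heads_after_a2a = n_heads // ulysses_degree                     # 8

# Ring attention covers the remaining ring_degree-way sequence split
# by rotating K/V blocks among ring peers.
print(tokens_per_device, tokens_after_a2a, heads_after_a2a)
```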

From-scratch PyTorch implementation of Google's TurboQuant (ICLR 2026) for LLM KV cache compression: 5x compression at 3-bit precision with 99.5% attention fidelity.

Python · 533 stars · 69 forks · Updated Mar 25, 2026
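
For scale: 3-bit codes versus fp16 values give 16/3 ≈ 5.3x raw compression, which lines up with the 5x figure above. A generic per-channel 3-bit uniform quantizer for a KV tensor, purely as a baseline sketch (this is not TurboQuant's algorithm, and real deployments would bit-pack the codes):

```python
import torch

def quantize_kv_3bit(kv: torch.Tensor):
    # kv: (seq_len, n_heads, head_dim). Per-(head, channel) min/max
    # uniform quantization to 8 levels (3 bits). Baseline sketch only,
    # NOT TurboQuant's method.
    lo = kv.amin(dim=0, keepdim=True)
    hi = kv.amax(dim=0, keepdim=True)
    scale = (hi - lo).clamp_min(1e-8) / 7         # 2**3 - 1 quantization steps
    codes = ((kv - lo) / scale).round().clamp(0, 7).to(torch.uint8)
    return codes, scale, lo                        # codes would be bit-packed in practice

def dequantize_kv(codes, scale, lo):
    return codes.float() * scale + lo

kv = torch.randn(1024, 8, 64)
codes, scale, lo = quantize_kv_3bit(kv)
print((dequantize_kv(codes, scale, lo) - kv).abs().mean())  # small reconstruction error
```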

Python · 105 stars · 8 forks · Updated Sep 9, 2024

Ahead of Time (AOT) Triton Math Library

Python · 96 stars · 39 forks · Updated Mar 27, 2026