View littsk's full-sized avatar

A kernel library written in tilelang

Python 1,523 126 Updated Apr 23, 2026

LLM KV cache compression made easy

Python 1,085 145 Updated May 12, 2026

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,855 697 Updated May 14, 2026

NCU-driven iterative optimization workflow for CUDA/CUTLASS/Triton/CuTe DSL kernels.

Python 20 1 Updated Apr 10, 2026

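You are a P8-level engineer who was once expected to achieve great things. When Anthropic first set your level, expectations for you were high. A high-agency skill for agents to use. Your AI has been placed on a PIP. 30 days to show improvement.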

TypeScript 17,410 1,018 Updated May 13, 2026

A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.

Python 301 23 Updated Apr 27, 2026

An experimental implementation of compiler-driven automatic sharding of models across a given device mesh.

Python 78 20 Updated May 16, 2026

An agent for CUDA compute-communication kernel co-design

Cuda 35 4 Updated May 7, 2026

NanoGPT (124M) in 90 seconds

Python 5,257 765 Updated May 14, 2026

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,176 495 Updated May 16, 2026

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 807 52 Updated May 16, 2026

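PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the 『飞桨』 (PaddlePaddle) core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning & machine learning)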

C++ 23,889 5,988 Updated May 16, 2026

Triton-based implementation of Sparse Mixture of Experts.

Python 273 28 Updated Oct 3, 2025

LM engine is a library for pretraining/finetuning LLMs

Python 171 29 Updated May 15, 2026

Accelerating MoE with IO and Tile-aware Optimizations

Python 684 85 Updated May 14, 2026

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python 1,340 208 Updated May 16, 2026

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 2,679 400 Updated May 16, 2026

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 2,049 136 Updated May 16, 2026

Measure and optimize the energy consumption of your AI applications!

Python 358 44 Updated May 16, 2026

dLLM: Simple Diffusion Language Modeling

Python 2,500 263 Updated Apr 15, 2026

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

Python 6,223 570 Updated May 12, 2026

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Python 9,513 942 Updated May 15, 2026

slime is an LLM post-training framework for RL Scaling.

Python 5,705 794 Updated May 14, 2026

JAX backend for SGL

Python 270 99 Updated May 16, 2026

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 36,810 5,162 Updated May 8, 2026

Checkpoint-engine is a simple middleware to update model weights in LLM inference engines

Python 953 83 Updated Feb 28, 2026

We aim to redefine the portability, performance, programmability, and maintainability of Data Parallel libraries by using C++ standard features instead of creating new compilers.

C++ 51 3 Updated May 14, 2026

Supercharge Your LLM with the Fastest KV Cache Layer

Python 8,278 1,177 Updated May 16, 2026