Organizations

@ShZh-Playground @ShZh-libraries @ShZh-websites

The official Lark/Feishu CLI tool, maintained by the larksuite team — built for humans and AI Agents. Covers core business domains including Messenger, Docs, Base, Sheets, Calendar, Mail, Tasks, Me…

Go 14,405 991 Updated Jun 19, 2026

A framework for efficient model inference with omni-modality models

Python 5,211 1,144 Updated Jun 19, 2026

Agentic RL Training at Scale

Python 1,493 314 Updated Jun 20, 2026

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python 1,592 268 Updated Jun 19, 2026

Perplexity open source garden for inference technology

Rust 581 56 Updated May 27, 2026

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)

Python 9,659 971 Updated Jun 17, 2026

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 3,247 289 Updated Jun 19, 2026

NVIDIA GPU metrics exporter for Prometheus leveraging DCGM

Go 1,776 305 Updated May 12, 2026

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 42,542 4,863 Updated Jun 18, 2026

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

Python 5,321 523 Updated Jun 19, 2026

slime is an LLM post-training framework for RL Scaling.

Python 6,397 929 Updated Jun 19, 2026

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C 1,388 189 Updated Jun 15, 2026

My learning notes for ML SYS.

Python 6,547 445 Updated Jun 18, 2026

Optimized primitives for collective multi-GPU communication

C++ 4,820 1,299 Updated Jun 17, 2026

A PyTorch native platform for training generative AI models

Python 5,453 864 Updated Jun 19, 2026

NCCL Tests

Cuda 1,560 379 Updated Jun 9, 2026

What would you do with 1000 H100s...

Jupyter Notebook 1,179 73 Updated Jan 10, 2024

Efficient Triton Kernels for LLM Training

Python 6,444 543 Updated Jun 17, 2026

Puzzles for learning Triton, play it with minimal environment configuration!

Python 715 101 Updated Mar 17, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,611 861 Updated Jun 19, 2026

CUDA Python: Performance meets Productivity

Cython 3,293 300 Updated Jun 19, 2026

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.

Python 1,090 88 Updated Sep 4, 2024
C++ 98 9 Updated Mar 26, 2025

A minimal implementation of vllm.

Cuda 73 Updated Jul 27, 2024

FlashInfer: Kernel Library for LLM Serving

Python 5,824 1,062 Updated Jun 19, 2026

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 12,799 1,307 Updated Nov 4, 2025

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 125,670 14,074 Updated Jun 19, 2026

Tile primitives for speedy kernels

Cuda 3,460 299 Updated Jun 15, 2026

Material for gpu-mode lectures

Jupyter Notebook 6,193 624 Updated Jun 15, 2026