
Allow torch tensor memory to be released and resumed later

Python 162 26 Updated Nov 1, 2025

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,270 91 Updated Nov 5, 2025

The best workflows and configurations I've developed, having heavily used Claude Code since the day of its release. Workflows are based on applied learnings from our AI-native startup.

3,042 460 Updated Sep 14, 2025

Training library for Megatron-based models

Python 166 45 Updated Nov 5, 2025

Sequence-level 1F1B schedule for LLMs.

Python 32 2 Updated Aug 26, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,376 242 Updated Nov 5, 2025

Distributed Compiler based on Triton for Parallel Systems

Python 1,213 104 Updated Oct 17, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 3,849 298 Updated Nov 5, 2025

Pipeline Parallelism Emulation and Visualization

Python 70 5 Updated Jun 12, 2025

Analyze computation-communication overlap in V3/R1.

1,112 143 Updated Mar 21, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,691 972 Updated Nov 5, 2025

A library to analyze PyTorch traces.

Python 423 69 Updated Oct 30, 2025

PyTorch centric eager mode debugger

TypeScript 48 1 Updated Dec 16, 2024

A PyTorch native platform for training generative AI models

Python 4,653 595 Updated Nov 5, 2025

LLM101n: Let's build a Storyteller

35,458 1,930 Updated Aug 1, 2024

A PyTorch Toolbox for Grouped GEMM in MoE Model Training

6 1 Updated May 28, 2024

PyTorch bindings for CUTLASS grouped GEMM.

Cuda 164 48 Updated Oct 10, 2025

A Project dedicated to making GPU Partitioning on Windows easier!

PowerShell 5,172 521 Updated Oct 6, 2025

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Python 1,820 294 Updated Jan 16, 2024

Inference code for Mistral and Mixtral hacked up into original Llama implementation

Python 369 40 Updated Dec 9, 2023

chinalist for SwitchyOmega and SmartProxy

Python 146 13 Updated Oct 27, 2025

Python 155 49 Updated Feb 22, 2024

Scalable toolkit for efficient model alignment

Python 843 102 Updated Oct 6, 2025

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 109,717 11,412 Updated Nov 4, 2025