yushengsu-thu

Highlights

  • Pro

Organizations

@thunlp @OpenBMB @RLsys-Foundation

Python 1 Updated Apr 17, 2026
Shell 1 Updated Apr 16, 2026

Agentic Kernel Optimization for All — automated GPU kernel optimization for any kernel, any hardware, any language

Python 142 7 Updated Apr 2, 2026

How to optimize some algorithms in CUDA.

Cuda 2,928 270 Updated Apr 16, 2026

Research on Coding Agents

11,668 19,740 Updated Apr 1, 2026

Train the smallest LM you can that fits in 16MB. Best model wins!

Python 4,883 3,206 Updated Apr 9, 2026

A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-performance systems.

Python 114 6 Updated Apr 15, 2026

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 10,295 1,049 Updated Apr 12, 2026

Claude Opus 4.6 wrote a dependency-free C compiler in Rust, with backends targeting x86 (64- and 32-bit), ARM, and RISC-V, capable of compiling a booting Linux kernel.

Rust 2,634 220 Updated Feb 5, 2026

Tutorials for Triton, a language for writing GPU kernels

Jupyter Notebook 78 10 Updated Aug 23, 2023

HuggingFace conversion and training library for Megatron-based models

Python 1 Updated Apr 9, 2026

Training library for Megatron-based models with bidirectional Hugging Face conversion capability

Python 576 268 Updated Apr 17, 2026

PyTorch-native post-training at scale

Python 670 97 Updated Apr 17, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,013 582 Updated Mar 13, 2026

Getting Started with Triton: A Tutorial for Python Beginners

HTML 50 5 Updated Mar 26, 2026

Tinkering RL

Python 25 3 Updated Apr 14, 2026

A simple, performant and scalable Jax LLM!

Python 2,239 507 Updated Apr 17, 2026

The absolute trainer to light up AI agents.

Python 16,923 1,476 Updated Apr 3, 2026

Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…

TeX 7,011 541 Updated Apr 13, 2026

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python 1,093 157 Updated Apr 17, 2026

🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation feedback, cross-platform NVIDIA/AMD, Kernelbook + KernelBench

Python 135 5 Updated Nov 10, 2025

JAX backend for SGL

Python 265 89 Updated Apr 17, 2026

Kimi Code CLI is your next CLI agent.

Python 7,867 867 Updated Apr 17, 2026
Python 136 17 Updated Mar 5, 2026

The best ChatGPT that $100 can buy.

Python 52,049 6,913 Updated Apr 14, 2026

Open-source coding LLM for software engineering tasks

Python 1,192 152 Updated Sep 30, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 1 Updated Apr 10, 2026

APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM training.

Python 57 3 Updated Oct 11, 2025