- RadixArk | AMD | Tsinghua University
- California, USA
- https://yushengsu-thu.github.io/
- @thu_yushengsu
Stars
Agentic Kernel Optimization for All — automated GPU kernel optimization for any kernel, any hardware, any language
How to optimize algorithms in CUDA.
Train the smallest LM you can that fits in 16MB. Best model wins!
A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-performance systems.
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Claude Opus 4.6 wrote a dependency-free C compiler in Rust, with backends targeting x86 (64- and 32-bit), ARM, and RISC-V, capable of compiling a booting Linux kernel.
Tutorials for Triton, a language for writing GPU kernels
HuggingFace conversion and training library for Megatron-based models
Training library for Megatron-based models with bidirectional Hugging Face conversion capability
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Getting Started with Triton: A Tutorial for Python Beginners
A simple, performant and scalable Jax LLM!
The absolute trainer to light up AI agents.
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation feedback, cross-platform NVIDIA/AMD, Kernelbook + KernelBench
Open-source coding LLM for software engineering tasks
yushengsu-thu / sglang
Forked from sgl-project/sglang. SGLang is a fast serving framework for large language models and vision language models.
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM training.