RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 3,779 527 Updated Jun 13, 2026

Model interpretability and understanding for PyTorch

Python 5,649 559 Updated Jun 11, 2026

The WeightWatcher tool for predicting the accuracy of Deep Neural Networks

Python 1,756 147 Updated May 11, 2026

A flexible and high-performance training framework designed for large-scale foundation model training on AMD GPUs

Python 103 39 Updated Jun 13, 2026

High-performance GPU kernels for LLM inference in OpenAI Triton. Fused RMSNorm, SwiGLU, INT8 GEMM with benchmarks and roofline analysis.

Python 26 4 Updated Apr 5, 2026

Code release for book "Efficient Training in PyTorch"

Python 131 20 Updated Apr 10, 2025

AI fundamentals: GPU architecture, CUDA programming, large-model basics, and AI Agent-related knowledge.

HTML 1,470 235 Updated Jun 13, 2026

TradingAgents: Multi-Agents LLM Financial Trading Framework

Python 85,693 16,550 Updated Jun 1, 2026

AI Infrastructure Engineer Learning Track - Production ML infrastructure curriculum (2-4 years experience)

Python 1,240 198 Updated Jun 1, 2026

Academic Research Skills for Claude Code: research → write → review → revise → finalize

Python 30,908 2,545 Updated Jun 13, 2026

Now, Stronger AI Pushes Frontiers, Stronger Our Shared Future.

TypeScript 3,038 319 Updated Jun 3, 2026

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…

Python 12,010 1,103 Updated Jun 13, 2026

A curriculum for learning GPU performance engineering, from scratch to what the frontier AI labs do

822 100 Updated Apr 27, 2026

AI Infrastructure Performance Engineer Learning Track - GPU optimization, inference optimization, and cost reduction

Python 35 9 Updated May 29, 2026

Universal LLM Deployment Engine with ML Compilation

Python 22,792 2,067 Updated May 11, 2026

An Extensible Deep Learning Library

Python 2,366 405 Updated May 16, 2026

Helpful kernel tutorials, examples and SKILLs for tile-based GPU programming

Python 751 74 Updated Jun 12, 2026
Python 4 Updated Mar 4, 2026

Kernels, of the mega variety :)

Python 751 59 Updated May 26, 2026

Tile primitives for speedy kernels

Cuda 3,427 295 Updated May 27, 2026

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

Python 8,773 1,299 Updated Jun 13, 2026

CUDA Core Compute Libraries

C++ 2,378 410 Updated Jun 13, 2026

PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily write your own.

Python 1,461 114 Updated Jun 8, 2026

A systematic and pedagogical way to derive the correctness structure of 2D Register Allocated GEMM before coding.

HTML 8 Updated May 2, 2026

DeepSeek 4 Flash and PRO local inference engine for Metal, CUDA and ROCm

C 13,575 1,196 Updated Jun 11, 2026

Reverse engineering NVIDIA SASS instruction dictionary, kernel audits and pattern recognition across GPU architectures.

Cuda 294 14 Updated May 18, 2026

Fast Polar Decomposition for Muon

Python 154 13 Updated May 2, 2026

Skills for Real Engineers. Straight from my .claude directory.

Shell 127,449 11,141 Updated Jun 12, 2026