Our library for RL environments + evals

Python 4,218 562 Updated Jun 21, 2026

An asynchronous streaming data management module for efficient post-training.

Python 97 35 Updated Jun 18, 2026

RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI

Python 3,850 543 Updated Jun 18, 2026

Model interpretability and understanding for PyTorch

Python 5,654 559 Updated Jun 20, 2026

The WeightWatcher tool for predicting the accuracy of Deep Neural Networks

Python 1,756 148 Updated May 11, 2026

A flexible and high-performance training framework designed for large-scale foundation model training on AMD GPUs

Python 104 40 Updated Jun 20, 2026

High-performance GPU kernels for LLM inference in OpenAI Triton. Fused RMSNorm, SwiGLU, INT8 GEMM with benchmarks and roofline analysis.

Python 27 4 Updated Apr 5, 2026

Code release for book "Efficient Training in PyTorch"

Python 131 20 Updated Apr 10, 2025

AI fundamentals - GPU architecture, CUDA programming, large-model basics, and AI Agent topics.

HTML 1,541 242 Updated Jun 22, 2026

TradingAgents: Multi-Agents LLM Financial Trading Framework

Python 87,892 16,969 Updated Jun 22, 2026

AI Infrastructure Engineer Learning Track - Production ML infrastructure curriculum (2-4 years experience)

Python 1,354 224 Updated Jun 21, 2026

Academic Research Skills for Claude Code: research → write → review → revise → finalize

Python 33,496 2,752 Updated Jun 21, 2026

Now, Stronger AI Pushes Frontiers, Stronger Our Shared Future.

TypeScript 3,097 326 Updated Jun 15, 2026

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…

Python 12,454 1,131 Updated Jun 21, 2026

A curriculum for learning about GPU performance engineering, from scratch to what the frontier AI labs do

843 102 Updated Apr 27, 2026

AI Infrastructure Performance Engineer Learning Track - GPU optimization, inference optimization, and cost reduction

Python 38 9 Updated Jun 21, 2026

Universal LLM Deployment Engine with ML Compilation

Python 22,835 2,069 Updated May 11, 2026

An Extensible Deep Learning Library

Python 2,367 406 Updated May 16, 2026

Helpful kernel tutorials, examples and SKILLs for tile-based GPU programming

Python 757 78 Updated Jun 17, 2026
Python 4 Updated Mar 4, 2026

Kernels, of the mega variety :)

Python 757 60 Updated May 26, 2026

Tile primitives for speedy kernels

Cuda 3,464 299 Updated Jun 15, 2026

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

Python 9,574 1,367 Updated Jun 22, 2026

CUDA Core Compute Libraries

C++ 2,391 413 Updated Jun 22, 2026

PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily write your own.

Python 1,463 114 Updated Jun 15, 2026

A systematic and pedagogical way to derive the correctness structure of 2D Register Allocated GEMM before coding.

HTML 8 Updated May 2, 2026

DeepSeek 4 Flash and PRO local inference engine for Metal, CUDA and ROCm

C 14,923 1,304 Updated Jun 17, 2026

Reverse engineering NVIDIA SASS instruction dictionary, kernel audits and pattern recognition across GPU architectures.

Cuda 300 14 Updated May 18, 2026