GHGmc2
🎯
Focusing
  • Shanghai


Starred repositories

GTP engine and self-play learning in Go

C++ 4,712 721 Updated Jun 22, 2026

Autoresearch for Go

Python 215 27 Updated May 15, 2026

Modern RL Post-training Infrastructure: Optimized for NVIDIA/AMD GPUs with a focus on vLLM and DeepSpeed integration, CUDA/ROCm/Triton kernels, and transparent hardware-aware scaling.

Python 150 32 Updated Jun 22, 2026

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

PostScript 21,593 2,564 Updated Jun 30, 2025

Python 84 4 Updated Jun 20, 2026

Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.

Python 450 64 Updated Jun 15, 2026

A programmable distributed training system for PyTorch

Python 17 2 Updated Jun 10, 2026

Research artifacts from Recursive's automated AI research system

Python 133 12 Updated Jun 11, 2026

A profiling and performance analysis tool for machine learning

C++ 543 88 Updated Jun 23, 2026

An LLM post-training framework with vLLM for RL Scaling

Python 294 32 Updated Jun 23, 2026

A unified framework for building, running, and training general agents at scale.

Python 359 54 Updated Jun 18, 2026

UniRL is a Framework for Unified Multimodal Model Reinforcement Learning

Python 697 43 Updated Jun 23, 2026

Compositional Muon release

Python 22 4 Updated Jun 5, 2026

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Python 480 32 Updated May 20, 2026

torch_remat fine-grained activation checkpointing API

Python 13 Updated Jun 8, 2026

Provide performance insight capabilities for RL frameworks.

Python 36 26 Updated Jun 23, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,516 598 Updated May 23, 2026

Orchestrate multiple coding agents from desktop and mobile

TypeScript 9,049 862 Updated Jun 23, 2026

GPU-accelerated LLM Training Simulator

Makefile 22 9 Updated Jun 26, 2025

🔥 LeetCode for PyTorch — practice implementing softmax, attention, GPT-2 and more from scratch with instant auto-grading. Jupyter-based, self-hosted or try online.

Jupyter Notebook 4,217 365 Updated May 25, 2026

CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs

Python 216 22 Updated Jun 23, 2026

A high-performance RL training-inference weight synchronization framework, designed to enable second-level parameter updates from training to inference in RL workflows

Python 160 18 Updated May 25, 2026

🎨 NeMo Data Designer: Generate high-quality synthetic data from scratch or from seed data.

Python 2,026 187 Updated Jun 22, 2026

A Lightweight LLM Post-Training Library

Python 2,347 311 Updated Jun 23, 2026

An asynchronous streaming data management module for efficient post-training.

Python 98 35 Updated Jun 22, 2026

Pipeline parallelism for the minimalist

Python 39 1 Updated Aug 6, 2025

Artefacts from the first complete run of the Lossfunk AI Scientist pipeline for paper accepted at Agents4Science 2025.

19 1 Updated Nov 10, 2025