Whitea029
👋
Welcome to contact me!

Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.

Python 1,561 260 Updated Jun 16, 2026

A set of examples based on verl for end-to-end RL training recipes.

Python 291 134 Updated Jun 9, 2026

Original and practical skills for AI builders.

HTML 384 18 Updated Jun 15, 2026

A guide to US stocks.

4,387 679 Updated Jun 15, 2026

Repair malformed JSON from LLMs, APIs, logs, and user input in Python.

Python 4,975 199 Updated Jun 9, 2026

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Python 474 32 Updated May 20, 2026

slime is an LLM post-training framework for RL Scaling.

Python 6,137 895 Updated Jun 15, 2026

This is a Chinese translation of the CUDA programming guide

1,987 291 Updated Nov 13, 2024

An open-source AI coding agent that lives in your terminal.

TypeScript 25,247 2,512 Updated Jun 16, 2026

SkillsBench evaluates how well skills work and how effective agents are at using them.

PDDL 1,356 317 Updated Jun 15, 2026

🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models

Python 10,467 1,109 Updated Jun 14, 2026

Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"

Python 222 13 Updated May 28, 2026

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Python 669 42 Updated May 30, 2026

Skills for Real Engineers. Straight from my .claude directory.

Shell 130,140 11,359 Updated Jun 12, 2026

Agent Skills for Google products and technologies

Python 13,725 1,034 Updated Jun 13, 2026

A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.

176,183 17,989 Updated Apr 20, 2026

AI agent toolkit: unified LLM API, agent loop, TUI, coding agent CLI

TypeScript 62,971 7,649 Updated Jun 15, 2026

Official Repository of "Learning to Reason under Off-Policy Guidance"

Python 452 64 Updated Mar 20, 2026

CL-bench: A Benchmark for Context Learning

Python 559 29 Updated May 12, 2026

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…

4,111 396 Updated Jul 25, 2025

50+ tutorials and implementations for Generative AI Agent techniques, from basic conversational bots to complex multi-agent systems.

Jupyter Notebook 22,616 3,798 Updated Jun 11, 2026

AI infrastructure study notes, with complete high-resolution diagrams and recommended learning paths.

Python 207 6 Updated Feb 27, 2026

Official Implementation of "Simulating Environments with Reasoning Models for Agent Training"

Python 65 3 Updated Feb 18, 2026

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 16,551 1,642 Updated Mar 4, 2026

Autonomous AI development loop for Claude Code with intelligent exit detection

Shell 9,339 714 Updated Jun 15, 2026

💻 vibe coding 2026 | Your first modern coding course, taking beginners to mastery step by step.

JavaScript 16,962 1,599 Updated Jun 10, 2026

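[Three Years of Interviews, Five Years of Mock Exams] An interview handbook for AIGC/LLM/AI Agent algorithm engineers. Covers practical interview and written-test experience and core knowledge across the AI industry, including AIGC, large language models, AI agents, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, big-data mining, embodied AI, the metaverse, and AGI.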

3,934 411 Updated Jun 15, 2026

Agentic Learning Powered by AWorld

Python 111 10 Updated Apr 16, 2026

Use Garry Tan's exact Claude Code setup: 23 opinionated tools that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA

TypeScript 110,430 16,424 Updated Jun 14, 2026

Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

Python 159 11 Updated May 29, 2025