Stars
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
An elegant PyTorch deep reinforcement learning library.
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
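🎓 A systematic course on building large language models | 🛠️ Covers pretraining data engineering, Tokenizer, Transformer, MoE, GPU programming (CUDA/Triton), distributed training, Scaling Laws, inference optimization, and alignment (SFT/RLHF/GRPO) | 🚀 6 progressive, code-driven assignments that build a full-stack understanding of LLMs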
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…
AI agents running research on single-GPU nanochat training automatically
Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
Train transformer language models with reinforcement learning.
SkyRL: A Modular Full-stack RL Library for LLMs
RLAnything & DemyAgent: General and scalable agentic RL algorithms across terminal, GUI, SWE, and tool-call settings
A curated list of reinforcement learning (RL) for agents.
A markdown template for taking notes to summarize research papers.
Minimal reproduction of DeepSeek R1-Zero
Lightweight coding agent that runs in your terminal
Minimalistic 4D-parallelism distributed training framework for educational purposes
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Minimalistic large language model 3D-parallelism training
🔍 Hands-on LLM application development, part 1: a full-stack guide to RAG techniques. Read online: https://datawhalechina.github.io/all-in-rag/
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning