Auto-claude-code-research-in-sleep Public
Forked from wanshuiyin/Auto-claude-code-research-in-sleep
ARIS ⚔️ (Auto-Research-In-Sleep) - Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in - works…
Python MIT License Updated Mar 18, 2026 -
hermes-agent Public
Forked from NousResearch/hermes-agent
The agent that grows with you
Python MIT License Updated Mar 11, 2026 -
MLEvolve Public
Forked from InternScience/MLEvolve
MLEvolve is an open-source autonomous system for end-to-end machine learning algorithm design and optimization powered by progressive search and experience-driven memory.
Python Updated Mar 9, 2026 -
OpenClaw-RL Public
Forked from Gen-Verse/OpenClaw-RL
OpenClaw-RL: Personalize openclaw simply by talking to it
TypeScript MIT License Updated Feb 26, 2026 -
VeriSoftBench Public
Forked from utopia-group/VeriSoftBench
Benchmarking LLMs on Real-World Software Verification in Lean 4
Python MIT License Updated Feb 23, 2026 -
MARTI Public
Forked from TsinghuaC3I/MARTI
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
Python MIT License Updated Feb 19, 2026 -
ML-Master Public
Forked from sjtu-sai-agents/ML-Master
The official implementation of "ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning"
Python Updated Jan 16, 2026 -
qqr Public
Forked from Alibaba-NLP/qqr
qqr is an RL training framework for open-ended agents.
Python Apache License 2.0 Updated Jan 14, 2026 -
Spectral-Sphere-Optimizer Public
Forked from Unakar/Spectral-Sphere-Optimizer
Spectral Sphere Optimizer
Python Apache License 2.0 Updated Jan 14, 2026 -
Seed-Prover Public
Forked from ByteDance-Seed/Seed-Prover
Lean Apache License 2.0 Updated Dec 19, 2025 -
torchforge Public
Forked from meta-pytorch/torchforge
PyTorch-native post-training at scale
Python BSD 3-Clause "New" or "Revised" License Updated Nov 5, 2025 -
sparsity_in_rl Public
Forked from SagnikMukherjee/sparsity_in_rl
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models
Python Updated Oct 20, 2025 -
verl_megatron_practice Public
Forked from ISEEKYAN/verl_megatron_practice
(best/better) practices of megatron on veRL and tuning guide
Shell Apache License 2.0 Updated Sep 26, 2025 -
RLinf Public
Forked from RLinf/RLinf
RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.
Python Apache License 2.0 Updated Sep 22, 2025 -
Verlog Public
Forked from WentseChen/Verlog
Verlog: A Multi-turn RL framework for LLM agents
Python Apache License 2.0 Updated Aug 16, 2025 -
IRL-VLA Public
Forked from IRL-VLA/IRL-VLA
Official repo for IRL-VLA
Apache License 2.0 Updated Aug 13, 2025 -
Agent_Foundation_Models Public
Forked from OPPO-PersonalAI/Agent_Foundation_Models
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL.
Python Apache License 2.0 Updated Aug 11, 2025 -
MiroRL Public
Forked from MiroMindAI/MiroRL
MiroRL is an MCP-first reinforcement learning framework for deep research agents.
Python Apache License 2.0 Updated Aug 8, 2025 -
JAxtar Public
Forked from tinker495/JAxtar
JAxtar is a project with a JAX-native implementation of a parallelizable A* & Q* solver for neural heuristic search research.
Python MIT License Updated Aug 7, 2025 -
terminal-bench-rl Public
Forked from Danau5tin/terminal-bench-rl
GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.
-
vsag Public
Forked from antgroup/vsag
vsag is a vector indexing library used for similarity search.
C++ Apache License 2.0 Updated Jul 29, 2025