Starred repositories
A platform for building proxies to bypass network restrictions.
Xray, Penetrates Everything. Also the best v2ray-core. Where the magic happens. An open platform for various uses.
Scheduling infrastructure for absolutely everyone.
Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.5, GPT-OSS, Llama, and more!
The absolute trainer to light up AI agents.
A fast and simple implementation of learning algorithms for robotics.
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
A toolkit for developing and comparing reinforcement learning algorithms.
Multi-Joint dynamics with Contact. A general purpose physics simulator.
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
High-quality single-file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
RLinf: Reinforcement Learning Infrastructure for Embodied and Agentic AI
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Algorithm powering the For You feed on X
Financial data platform for analysts, quants and AI agents.
MiroTrain is an efficient and algorithm-first framework research agent.
🏆 Top-1 on 5+ benchmarks | Web UI | Supports MiroThinker, Claude, Kimi, OpenAI
MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest models, MiroThinker-1.7 and MiroThinker-H1, achieve 74.0 and 88.2 on BrowseComp, respectively.
Certificate Transparency Log aggregation, parsing, and streaming service written in Elixir
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
A Survey of Reinforcement Learning for Large Reasoning Models
A topic-centric list of HQ open datasets.
Blind & invisible watermark: blind image watermarking; extracting the watermark does not require the original image!
Modeling, training, eval, and inference code for OLMo
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.