Stars
This repository contains a curated collection of 300+ case studies from over 80 companies, detailing practical applications and insights into machine learning (ML) system design. The contents are o…
i2LQR: Iterative LQR for Iterative Tasks in Dynamic Environments (CDC 2023) https://arxiv.org/abs/2302.14246
This repository contains multiple approaches for generating global race trajectories.
[ICLR 2026] Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling
A tiny deep learning training framework implemented from scratch in C++ that follows PyTorch's API.
Reinforcement Learning Tutorial with Demo: DP (Policy and Value Iteration), Monte Carlo, TD Learning (SARSA, Q-Learning), Function Approximation, Policy Gradient, DQN, Imitation, Meta Learning, Pape…
Code for the paper "Language Models are Unsupervised Multitask Learners"
Wife-approved HomeOps driven by Kubernetes and GitOps using Flux
My GitOps-managed home Kubernetes cluster... and more! ⛵
GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.
A benchmark for LLMs on complicated tasks in the terminal
Lightweight coding agent that runs in your terminal
A C library for creating Excel XLSX files.
💫 Toolkit to help you get started with Spec-Driven Development
antgroup / ant-ray
Forked from ray-project/ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. AntRay is forked from ray, offering incremental new features on top …
slime is an LLM post-training framework for RL Scaling.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
Trae Agent is an LLM-based agent for general purpose software engineering tasks.
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
AGENTS.md — a simple, open format for guiding coding agents
Open source software for autonomous drones.
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."