Sponsoring

@dohooo
@AmyTao

Highlights

  • Pro

Organizations

@benchflow-ai

Starred repositories

Gemma Gem runs Google's Gemma 4 model entirely on-device via WebGPU — no API keys, no cloud, no data leaving your machine.

TypeScript 940 102 Updated May 29, 2026

Reproducible, flexible LLM evaluations

Python 380 95 Updated Mar 24, 2026

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 25,161 4,858 Updated Jun 21, 2026

Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge) (CoRL 2024)

Jupyter Notebook 1,096 193 Updated Dec 20, 2025

Makes your AI agent think like the laziest senior dev in the room. The best code is the code you never wrote.

JavaScript 45,664 2,246 Updated Jun 21, 2026

The Framework for Building Agents

TypeScript 2,026 146 Updated Jun 21, 2026

Learn it. Build it. Ship it for others.

Python 35,304 5,759 Updated Jun 14, 2026

Hierarchal Agent Loop Optimizer

TypeScript 891 67 Updated Jun 21, 2026

RLAnything (ICML 2026) & AutoTool (ICML 2026), DemyAgent: Open-Source RL for LLMs and Agentic Scenarios

Python 555 56 Updated Jun 12, 2026

Use your most capable model to audit your codebase and write plans for cheaper models to execute.

5,867 234 Updated Jun 15, 2026

Omnigent is an open-source AI agent framework and meta-harness: orchestrate Claude Code, Codex, Cursor, Pi, and custom agents — swap harnesses without rewriting, enforce policies and sandboxing, an…

Python 4,304 488 Updated Jun 21, 2026

Python 42 5 Updated Jun 18, 2026

Agents' Last Exam

Python 705 29 Updated Jun 21, 2026

Awesome List for Agentic RL

HTML 1,623 63 Updated Jun 20, 2026

An LLM post-training framework with vLLM for RL Scaling

Python 288 29 Updated Jun 21, 2026

Open-source local workbench for multi-agent software development.

TypeScript 1,237 108 Updated Jun 21, 2026

Official Compound Engineering plugin for Claude Code, Codex, Cursor, and more

TypeScript 21,856 1,607 Updated Jun 21, 2026

Drop-in replacement for `claude -p` that drives the interactive Claude Code TUI inside an in-process zmux PTY session.

Zig 386 34 Updated Jun 17, 2026

Drop-in replacement for `claude -p` that runs on your Claude Code subscription instead of metered API pricing.

TypeScript 37 7 Updated Jun 10, 2026

Skills for threat modeling, scanning, triage, patching, plus an autonomous scanning harness you can /customize

Python 6,128 468 Updated Jun 15, 2026

Low-level unprivileged sandboxing tool used by Flatpak and similar projects

C 7,677 353 Updated Jun 2, 2026

Open-source framework for superagents.

Python 88 4 Updated Jun 16, 2026

🚀 An open-source, hands-on curriculum bridging the gap from basic RL concepts to LLM alignment, RLVR, and advanced Agentic systems.

Python 3,038 201 Updated Jun 21, 2026

Scalable, cloud-native infrastructure for evaluating AI agents across any benchmark.

Python 10 Updated Jun 19, 2026

Python 80 21 Updated Nov 17, 2025

A protocol that recasts the primary research object from narrative document to machine-executable knowledge package — so AI agents can navigate, reproduce, and extend published research without re-…

JavaScript 379 40 Updated Jun 18, 2026

Stealth Chromium that passes every bot detection test. Drop-in Playwright replacement with source-level fingerprint patches. 30/30 tests passed.

Python 26,785 2,110 Updated Jun 21, 2026

Paperclip — search, read, and analyze 8M+ biomedical papers from the command line

Python 183 17 Updated May 22, 2026

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …

Python 14,567 1,487 Updated Jun 18, 2026

For adapters paper experiments (correlation study, traj analysis, etc.) and harbor mix selection.

TypeScript 2 1 Updated Jun 21, 2026