- Carnegie Mellon (PhD, AI Safety)
- Pittsburgh, USA
- UTC -04:00
- matanshtepel.com
Highlights
- Pro
Starred repositories
slime is an LLM post-training framework for RL Scaling.
This is a LASR Labs project supervised by Mary Phuong.
GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.
Post-training with Tinker
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
Incentivizing externalization via early exiting
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
Compositional Verification of Security Protocols
Analyze AI agent trajectories: extract actions, summarize, embed, and visualize.
A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code by TÂCHES.
ControlArena is a collection of settings, model organisms and protocols - for running control experiments.
iTerm2 is a terminal emulator for Mac OS X that does amazing things.
A CLI tool that helps AI researchers share datasets responsibly.
A terminal spreadsheet multitool for discovering and arranging data
Practice The CodeSignal Pre-screen for the Industry Coding Framework.
Menubar countdown timer for macOS
An agentic skills framework & software development methodology that works.
Optimize prompts, code, and more with AI-powered Reflective Text Evolution