Stars
KempnerPulse - real-time GPU monitoring dashboard for DCGM metrics.
Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)"
Train the smallest LM you can that fits in 16MB. Best model wins!
Official Code Implementation of Translating Flow to Policy via Hindsight Online Imitation
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
Official Implementation of iMF (https://arxiv.org/abs/2512.02012)
🤗 smolagents: a barebones library for agents that think in code.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows.
[EMNLP 2025 Demo] Extracting internal representations from vision-language models. Beta version.
Frequently updated list of dLLM (Diffusion Large Language Models) papers, models, and other resources
A framework for few-shot evaluation of language models.
[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning
Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
metaTextGrad: Automatically optimizing language model optimizers. Published in NeurIPS 2025.
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Minimal reproduction of DeepSeek R1-Zero
MENTOR is a highly efficient visual RL algorithm that excels in both simulation and real-world complex robotic learning tasks.
Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"
Official PyTorch implementation for "Large Language Diffusion Models"
[ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep
Testing baseline LLM performance across various models
SWE-bench: Can Language Models Resolve Real-world Github Issues?
An LLM trained only on data from certain time periods to reduce modern bias
This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
[NeurIPS 2023] Official code release for the paper: "Can Pre-Trained Text-to-Image Models Generate Visual Goals for Reinforcement Learning?"