Stars
Open source repository of plugins primarily intended for knowledge workers to use in Claude Cowork
Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs.
A benchmark for LLMs on complicated tasks in the terminal
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
TextWorld is a sandbox learning environment for the training and evaluation of reinforcement learning (RL) agents on text-based games.
Lightweight coding agent that runs in your terminal
An open-source AI agent that lives in your terminal.
Python tool for converting files and office documents to Markdown.
Ranking LLMs on agentic tasks
Anthropic's educational courses
Model Context Protocol Servers
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
OpenAI-compatible API server for Apple on-device models
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
A learning environment for man-made Interactive Fiction games.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.
Open-source implementation of AlphaEvolve
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards