Stars
Define your problem and evaluation criteria — EurekAgent coordinates off-the-shelf CLI agents to propose diverse approaches, implement them, run experiments, and iterate. Human intervention is opti…
LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards
Deep-dive notes and source code analysis of Claude Code and AI agent harnesses. Exploring memory mechanics and internal architectures.
💖🧸 Self-hosted, you-owned Grok Companion: a container for the souls of waifus and cyber beings that brings them into our world, aiming to reach Neuro-sama's level. Capable of realtime voice chat, Minec…
Composable HLS library for rapid development of LLM accelerators. FlexLLM enables spatial-temporal hybrid architectures, with parameterized module templates customized for the prefill and decode s…
This repository contains the code and data for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards".
Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"
Tongyi Deep Research, the Leading Open-source Deep Research Agent
DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL
TradingAgents: Multi-Agents LLM Financial Trading Framework
slime is an LLM post-training framework for RL Scaling.
Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"
[NIPS 2025 DB Spotlight] AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"
Model Context Protocol Servers
Pioneering Automated GUI Interaction with Native Agents
A live-stream development of RL tuning for LLM agents
Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.
Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains papers, codes, datasets, evaluations, and analyses.
An invisible desktop application to help you pass your technical interviews.
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
An awesome repository & A comprehensive survey on interpretability of LLM attention heads.