Stars
Official implementation of the paper "How Far Are We from Genuinely Useful Deep Research Agents?"
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language.
NEO Series: Native Vision-Language Models from First Principles
Open-source implementation of AlphaEvolve
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
Data and code for FreshLLMs (https://arxiv.org/abs/2310.03214)
The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"
Code for the paper "Reinforced Vision Perception with Tools"
Python binding for a curl-impersonate fork via cffi. An HTTP client that can impersonate browser TLS/JA3/HTTP2 fingerprints.
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
TempFlow-GRPO (Temporal Flow GRPO), a principled GRPO framework that captures and exploits the temporal structure inherent in flow-based generation.
The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench Verified!
Multimodal Referring Segmentation
The official GitHub repo for "Diffusion Language Models are Super Data Learners".
ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
Codebase for the paper "ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools"
[NeurIPS 2025] Evaluation code for the paper "KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models"
Hierarchical Reasoning Model Official Release
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning