Stars
AgenticPay: A Multi-Agent LLM Negotiation System for Buyer–Seller Transactions
Code for "Variational Reasoning for Language Models"
[CVPR 2026] Official repo for "VideoSSR: Video Self-Supervised Reinforcement Learning"
Official code for the paper "Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding"
[ACL '26] Lang2Act: Fine-Grained Visual Reasoning through Self-Emergent Linguistic Toolchains
A curated collection of papers, technical reports, frameworks, and tools for on-policy distillation of large language models
A user-friendly & efficient knowledge distillation framework for LLMs, supporting off-policy, on-policy (OPD), cross-tokenizer, multimodal, and on-policy self-distillation.
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
Research on the DeepSeek Engram architecture, based on the Qwen-3 and Stable Diffusion series.
EngramX — the cached context spine for AI coding agents. 9 built-in providers + any MCP server as a 10-line plugin, pre-mortem mistake-guard, bi-temporal memory, Anthropic Auto-Memory bridge, SSE s…
[CVPR 2026] LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
⚒ Evolutionary self-improvement for Hermes Agent — optimize skills, prompts, and code using DSPy + GEPA
Semi-automated research assistant for academic research and software development. Supports Claude Code, OpenCode, and Codex CLI across ideation, coding, experiments, writing, and publication.
slime is an LLM post-training framework for RL Scaling.
[ICLR 2026] 🚀 ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum: cold-start pre-training, multimodal reinforcement learning, and text-only reinforcement l…
KEEP: Official code for "KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning". A memory management system optimized for embodied planning via static-dynamic memory cons…
A research agent system deeply rooted in your own Zotero library.
[NeurIPS 2025] Ask a Strong LLM Judge when Your Reward Model is Uncertain
A professional research suite for conducting rigorous academic research using specialized agents and multi-platform CLI commands. Compatible with Claude Code, Gemini CLI, OpenAI Codex, and OpenCode.
A ready-to-fork Claude Code template for academics using LaTeX/Beamer + R. Multi-agent review, quality gates, adversarial QA, and replication protocols.
tmux sidebar for coding agents — Amp, Claude Code, Codex, OpenCode. Per-thread markers, local HTTP API, live session state.
A Claude Code plugin that shows what's happening - context usage, active tools, running agents, and todo progress
Tempo: Small Vision-Language Models are Smart Compressors for Long Video Understanding