Stars
A self-hosted ML coding practice platform. 68 problems from ReLU to flow matching — attention, training, RLHF, diffusion, and more. Instant feedback in the browser.
Official implementation of Seeing with You: Perception-Reasoning Co-evolution for Multimodal Reasoning.
Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey.
(CVPR 26) Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration
A project implementing various agentic RL methods on top of the Slime post-training framework
Synchronize Codex session provider metadata across rollout files and SQLite state.
ALMA (Automated meta-Learning of Memory designs for Agentic systems) is a framework that meta-learns memory designs to replace human-engineered designs for agentic systems.
Official Repository of "Learning to Reason under Off-Policy Guidance"
A Claude Code hook plugin for IP-based access control · Prevent Claude account bans · Claude IP detection · IP geolocation blocking · Claude account protection
MLLM hallucination, LVLM, LLM, Hallucination Mitigation, Training-free hallucination mitigation
Self-hosted AI assistant with tool use, multi-agent orchestration, coding copilot and a lightweight Flask + vanilla JS stack.
Official repo for “Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought”
AutoSkill: Experience-Driven Lifelong Learning via Skill Self-Evolution
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents
Official Implementation of Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training
Code release for paper "Test-Time Training Done Right"
[CVPR 2026] Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
Official repository for paper Auto-scaling Continuous Memory for GUI Agent
Official code for PEARL: Personalized Streaming Video Understanding Model
Official Implementation of "Geometrically-Constrained Agent for Spatial Reasoning"
[CVPR 2025] RAP: Retrieval-Augmented Personalization
"Parallel Test-Time Scaling for Latent Reasoning Models"
Test-Time Mixture of World Models for Embodied Agents in Dynamic Environments [ICLR 2026]
Official code for the paper “Look Where It Matters: Training-Free Ultra-HR Remote Sensing VQA via Adaptive Zoom Search”.