Stars
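微舆: a multi-agent public opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the full picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.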
Echos is a headless, API-driven DAW engine. It’s the backend for building AI tools that automate the entire music production lifecycle.
Cambrian-S: Towards Spatial Supersensing in Video
We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that Sora-2 surpasses GPT5 by 10% on eyeballing puzzles and reache…
Official implementation of "Continuous Autoregressive Language Models"
RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.
The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
This is the official repo for the paper "LongCat-Flash-Omni Technical Report"
Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments
AgenTracer: A Lightweight Failure Attributor for Agentic Systems
"ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)"
On the Effect of Instruction Tuning Loss on Generalization
Official implementation of "Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs".
Efficient Mixture of Experts for LLM Paper List
Source code of "Dr.LLM: Dynamic Layer Routing in LLMs"
Official implementation of "Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models"
"LightAgent: Lightweight and Cost-Effective Mobile Agents"
Demystifying Reinforcement Learning in Agentic Reasoning
Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"
AgentFlow: In-the-Flow Agentic System Optimization
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training
Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give it a star 🌟 if you find it useful.
Official repo for "Self-Forcing++": high-quality long video generation