Stars
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation Model
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
GELab: GUI Exploration Lab. One of the best GUI agent solutions in the galaxy, built by the StepFun-GELab team and powered by Step’s research capabilities.
Incentivizing "Thinking with Long Videos" via Native Tool Calling
Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Official implementation of URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding (AAAI 2026 Oral).
Native Multimodal Models are World Learners
Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)
Awesome curated collection of images and prompts generated by gemini-2.5-flash-image (aka Nano Banana), a state-of-the-art image generation and editing model. Explore AI generated visuals created with…
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"
Tongyi Deep Research, the Leading Open-source Deep Research Agent
A lightweight Python library for simulating Chinese handwriting
So your teacher asked you to upload written assignments? Hate writing assignments? This tool will help you convert your text to handwriting xD
The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
LaTeXML: a TeX and LaTeX to XML/HTML/ePub/MathML translator.
Multilingual Document Layout Parsing in a Single Vision-Language Model
Wan: Open and Advanced Large-Scale Video Generative Models
An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.