mxin262

MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation Model

822 27 Updated Dec 22, 2025

An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone

Python 18,764 2,970 Updated Dec 22, 2025

GELab: GUI Exploration Lab. One of the best GUI agent solutions in the galaxy, built by the StepFun-GELab team and powered by Step’s research capabilities.

Python 1,684 139 Updated Dec 19, 2025

Incentivizing "Thinking with Long Videos" via Native Tool Calling

Python 154 10 Updated Dec 20, 2025
Python 7,650 452 Updated Dec 14, 2025

Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Python 129 5 Updated Dec 17, 2025

SAM 3D Objects

Python 5,040 468 Updated Dec 16, 2025

Official implementation of URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding (AAAI 2026 Oral).

31 Updated Nov 14, 2025

Native Multimodal Models are World Learners

Python 1,367 52 Updated Nov 28, 2025

Contexts Optical Compression

Python 21,540 1,926 Updated Oct 25, 2025

QeRL enables RL for 32B LLMs on a single H100 GPU.

Python 470 46 Updated Nov 27, 2025

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 3,154 193 Updated Oct 9, 2025

Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)

Python 708 69 Updated Dec 19, 2025

MCP for xiaohongshu.com

Go 7,600 1,192 Updated Dec 21, 2025

Awesome curated collection of images and prompts generated by gemini-2.5-flash-image (aka Nano Banana) state-of-the-art image generation and editing model. Explore AI generated visuals created with…

JavaScript 8,171 834 Updated Sep 8, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training".

Python 1,303 117 Updated Dec 11, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,696 1,355 Updated Dec 17, 2025

A lightweight Python library for simulating Chinese handwriting

Python 2,210 256 Updated Apr 6, 2024

So your teacher asked you to upload written assignments? Hate writing assignments? This tool will help you convert your text to handwriting xD

HTML 4,948 1,173 Updated Jul 11, 2021

The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.

Dockerfile 276 7 Updated Sep 26, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 19,456 1,999 Updated Nov 1, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,460 364 Updated Dec 19, 2025

LaTeXML: a TeX and LaTeX to XML/HTML/ePub/MathML translator.

Perl 1,188 128 Updated Dec 13, 2025

Multilingual Document Layout Parsing in a Single Vision-Language Model

Python 5,915 579 Updated Oct 31, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 12,990 1,510 Updated Dec 17, 2025

An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.

TypeScript 19,103 2,702 Updated Dec 18, 2025