gujiaqivadin
Starred repositories

MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research · Browser-hosted Android Simulator · Verifiable Evaluation · Scalable Online RL Training

TypeScript 629 102 Updated Jun 15, 2026

Memory Sparse Attention - A scalable, end-to-end trainable latent-memory framework for 100M-token contexts.

Python 3,481 225 Updated May 6, 2026

The largest open-source medical AI skills library for OpenClaw🦞.

Python 2,714 378 Updated Jun 18, 2026

Paper X-Ray Machine: a Claude Code Skill that deconstructs academic papers and distills them into napkin formulas

581 42 Updated Feb 26, 2026

Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.

Jupyter Notebook 657 58 Updated Mar 17, 2026

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 29,150 6,598 Updated Jun 18, 2026

[NeurIPS 2025]⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.

Python 440 29 Updated Oct 5, 2025

⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.

Jupyter Notebook 131 36 Updated Oct 27, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,223 18,177 Updated Jun 18, 2026

MCP for xiaohongshu.com

Go 14,241 2,135 Updated Jun 17, 2026

An Arena-style Automated Evaluation Benchmark for Detailed Captioning

Python 59 4 Updated Jun 1, 2025

The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.

Python 108 9 Updated May 30, 2025

A Comprehensive Survey on Continual Learning in Generative Models.

159 10 Updated Jun 1, 2026

The open-source CapCut alternative

TypeScript 56,644 6,142 Updated May 27, 2026

The true story of the heartbreak and darkness behind the development of the Noah Pangu large model.

11,530 1,315 Updated Jul 9, 2025

Our code for ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models".

Python 125 2 Updated Feb 7, 2026

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, …

Python 14,557 1,484 Updated Jun 18, 2026

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,485 47 Updated Mar 9, 2026

[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning

Python 348 15 Updated Feb 9, 2026

Geologic models from Llama 4 language model + GemPy!

Python 87 23 Updated Jan 22, 2026

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 394 16 Updated Jun 13, 2025

OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement

Python 152 7 Updated May 25, 2026

An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.

Python 1,917 218 Updated May 26, 2026

[NeurIPS 2025] SpatialLM: Training Large Language Models for Structured Indoor Modeling

Python 4,594 383 Updated Sep 26, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 1,483 101 Updated Jun 15, 2026

[CVPR 2026] MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources

Python 218 9 Updated Sep 26, 2025

Collections of Papers and Projects for Multimodal Reasoning.

109 11 Updated Apr 25, 2025

A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.

75 8 Updated Mar 18, 2025

Explore the Multimodal “Aha Moment” on 2B Model

Python 623 23 Updated Mar 18, 2025

Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'

Jupyter Notebook 2,250 107 Updated Oct 29, 2025