StanLei52
  • National University of Singapore
  • Singapore

AgentFlow: In-the-Flow Agentic System Optimization

Python 1,147 136 Updated Nov 4, 2025

My learning notes/codes for ML SYS.

Python 4,057 247 Updated Oct 6, 2025
Python 718 60 Updated Jun 26, 2025

Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

Python 134 8 Updated May 29, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 4,666 440 Updated Nov 4, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 3,975 297 Updated Nov 3, 2025

Train transformer language models with reinforcement learning.

Python 16,155 2,272 Updated Nov 5, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 16,889 1,285 Updated Nov 3, 2025

Building a comprehensive and handy list of papers for GUI agents

Python 544 30 Updated Oct 27, 2025

This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based reasoning MLLMs!

1,251 58 Updated Oct 18, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

1,983 111 Updated Nov 5, 2025

Collect every awesome work about r1!

Python 422 15 Updated May 2, 2025

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

875 25 Updated Aug 26, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,111 2,424 Updated Nov 5, 2025

Open-source unified multimodal model

Python 5,249 455 Updated Oct 27, 2025

AgentCPM-GUI: An on-device GUI agent for operating Android apps, enhancing reasoning ability with reinforcement fine-tuning for efficient task execution.

Python 1,093 103 Updated Jun 14, 2025

Implementation for "The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer"

Python 68 3 Updated Oct 29, 2025
Python 8,123 571 Updated Nov 5, 2025
Python 545 52 Updated Sep 23, 2025

Fully open reproduction of DeepSeek-R1

Python 25,609 2,400 Updated Sep 8, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 15,960 1,255 Updated Oct 27, 2025

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 12,215 1,117 Updated Sep 26, 2025

A framework for few-shot evaluation of language models.

Python 10,522 2,826 Updated Oct 29, 2025

A collection of benchmarks and datasets for evaluating LLM.

522 30 Updated Jul 13, 2024

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,878 944 Updated Nov 5, 2025

Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation

Python 109 Updated Apr 16, 2025

[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

Python 1,535 108 Updated May 29, 2025

Awesome LLMs on Device: A Comprehensive Survey

1,243 109 Updated Jan 12, 2025

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 1,413 201 Updated Oct 31, 2025

Out-of-the-box (OOTB) GUI Agent for Windows and macOS

Python 1,816 190 Updated May 21, 2025