- National University of Singapore
- Singapore
Stars
AgentFlow: In-the-Flow Agentic System Optimization
My learning notes/codes for ML SYS.
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
Democratizing Reinforcement Learning for LLMs
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Train transformer language models with reinforcement learning.
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Building a comprehensive and handy list of papers for GUI agents
This repository provides a valuable reference for researchers in the field of multimodality. Start exploring RL-based Reasoning MLLMs!
A Survey of Reinforcement Learning for Large Reasoning Models
Collect every awesome work about r1!
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
verl: Volcano Engine Reinforcement Learning for LLMs
AgentCPM-GUI: An on-device GUI agent for operating Android apps, enhancing reasoning ability with reinforcement fine-tuning for efficient task execution.
Implementation for "The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer"
Fully open reproduction of DeepSeek-R1
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
A framework for few-shot evaluation of language models.
A collection of benchmarks and datasets for evaluating LLM.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
Awesome LLMs on Device: A Comprehensive Survey
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Out-of-the-box (OOTB) GUI Agent for Windows and macOS