Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
Qwen3-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
Latest Advances in System-2 Reasoning
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
ScaleCUA is an open-source computer-use agent that can operate in cross-platform environments (Windows, macOS, Ubuntu, Android).
Building a comprehensive and handy list of papers for GUI agents
AndroidWorld is an environment and benchmark for autonomous agents
This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects.
The model, data, and code for the visual GUI agent SeeClick
Paper list for Personal LLM Agents
OS-ATLAS: A Foundation Action Model For Generalist GUI Agents
[ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
[NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
[ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents
GUI Grounding for Professional High-Resolution Computer Use
Neural Code Intelligence Survey 2024; Reading lists and resources
Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
[ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents
An RLHF Infrastructure for Vision-Language Models
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Official Code for "CoSER: Coordinating LLM-Based Persona Simulation of Established Roles"
GUICourse: From General Vision Language Models to Versatile GUI Agents
Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"
[ACL 2025] A Neural-Symbolic Self-Training Framework
[ACL 2025] An inference-time decoding strategy with adaptive foresight sampling