Kanzhi Cheng njucckevin

38 followers · 0 following

Stars

38 stars written in Python

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,145 31,054 Updated Nov 6, 2025

xlang-ai / OSWorld

[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Python 2,292 322 Updated Oct 30, 2025

microsoft / Magma

[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents

Python 1,840 146 Updated Oct 4, 2025

zzli2022 / Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

Python 1,264 73 Updated Jun 8, 2025

web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Python 1,209 190 Updated Oct 3, 2025

OpenGVLab / ScaleCUA

ScaleCUA: open-source computer use agents that can operate in cross-platform environments (Windows, macOS, Ubuntu, Android).

Python 778 43 Updated Oct 3, 2025

OSU-NLP-Group / GUI-Agents-Paper-List

Building a comprehensive and handy list of papers for GUI agents

Python 544 30 Updated Oct 27, 2025

google-research / android_world

AndroidWorld is an environment and benchmark for autonomous agents

Python 494 105 Updated Oct 27, 2025

OS-Copilot / OS-Atlas

OS-ATLAS: A Foundation Action Model For Generalist GUI Agents

Python 397 20 Updated Apr 20, 2025

xlang-ai / aguvis

[ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Python 366 26 Updated Mar 7, 2025

microsoft / GUI-Actor

[NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

Python 350 40 Updated Oct 29, 2025

OSU-NLP-Group / UGround

[ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents

Python 284 12 Updated Jul 18, 2025

likaixin2000 / ScreenSpot-Pro-GUI-Grounding

GUI Grounding for Professional High-Resolution Computer Use

Python 277 31 Updated Oct 27, 2025

cooelf / Auto-GUI

Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)

Python 253 19 Updated Jul 16, 2024

aiming-lab / MDocAgent

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding

Python 240 28 Updated Aug 8, 2025

ltzheng / agent-studio

[ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents

Python 219 24 Updated Jun 16, 2025

TideDra / VL-RLHF

An RLHF Infrastructure for Vision-Language Models

Python 185 8 Updated Nov 15, 2024

Neph0s / CoSER

Official code for "CoSER: Coordinating LLM-Based Persona Simulation of Established Roles"

Python 139 8 Updated Jun 28, 2025

RUCBM / GUICourse

GUICourse: From General Vision Language Models to Versatile GUI Agents

Python 133 7 Updated Jul 17, 2024

OS-Copilot / ScienceBoard

Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"

Python 117 10 Updated Aug 28, 2025

xufangzhi / phi-Decoding

[ACL 2025] An inference-time decoding strategy with adaptive foresight sampling

Python 106 8 Updated May 18, 2025

njucckevin / MM-Self-Improve

A Self-Training Framework for Vision-Language Reasoning

Python 85 1 Updated Jan 23, 2025

LightChen233 / M3CoT

Python 84 3 Updated Jun 7, 2024

CONE-MT / LLaMAX

Python 71 5 Updated Dec 6, 2024

xufangzhi / Genius

[ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework

Python 71 8 Updated Jun 1, 2025

xufangzhi / Symbol-LLM

[ACL 2024] The project of Symbol-LLM

Python 59 4 Updated Jul 10, 2024

njucckevin / CapArena

An Arena-style Automated Evaluation Benchmark for Detailed Captioning

Python 56 3 Updated Jun 1, 2025

HKUNLP / RSA

Forked from chang-github-00/RSA

Retrieved Sequence Augmentation for Protein Representation Learning

Python 53 3 Updated Nov 1, 2023

chengyou-jia / AgentStore

[ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Python 41 4 Updated Dec 19, 2024

AlignGPT-VL / AlignGPT

Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"

Python 34 5 Updated Jul 12, 2024
