Stars
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
A high-throughput and memory-efficient inference and serving engine for LLMs
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Automate browser-based workflows with AI
verl: Volcano Engine Reinforcement Learning for LLMs
Agent S: an open agentic framework that uses computers like a human
AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Mobile-Agent: The Powerful GUI Agent Family
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
A repo that lists papers related to LLM-based agents
Out-of-the-box (OOTB) GUI Agent for Windows and macOS
[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
A curated list of awesome LLM agent frameworks.
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
Building a comprehensive and handy list of papers for GUI agents
Building Open LLM Web Agents with Self-Evolving Online Curriculum RL
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
OS-ATLAS: A Foundation Action Model For Generalist GUI Agents
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment