The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
-
Updated
Dec 15, 2025 - TypeScript
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
Driving all platforms UI automation with vision-based model
Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containerized Linux desktop environment.
Agent S: an open agentic framework that uses computers like a human
Agent Framework For Fintech and Banks
Fara-7B: An Efficient Agentic Model for Computer Use
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
An open-sourced end-to-end VLM-based GUI Agent
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
A framework to enable autonomous android and computer use using any LLM (local or remote)
Desktop app powered by Claude’s computer use capability to control your computer
CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, composable architecture, reasoning modes, and policy-aware features.
The only general AI agent that does NOT requires extra API key, giving you full control on your local and remote MacOs from Claude Desktop App
Add a description, image, and links to the computer-use topic page so that developers can more easily learn about it.
To associate your repository with the computer-use topic, visit your repo's landing page and select "manage topics."