The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Updated Mar 27, 2026 - TypeScript
Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containerized Linux desktop environment.
AI-powered, vision-driven UI automation for every platform.
Browser Operator - The AI browser with a built-in multi-agent platform! Open-source alternative to ChatGPT Atlas, Perplexity Comet, Dia, and Microsoft Copilot Edge Browser
A state-of-the-art computer-using agent scoring 82% on OSWorld-Verified; fully open-source, safe, auditable, and production-ready.
AI-powered computer control for automated testing. Factifai uses vision models (Claude, GPT-4o, Gemini) to interact with applications naturally - clicking, typing, and verifying results just like a human would.
A fully featured, GUI-powered local LLM agent sandbox with complete MCP support. Features both a CLI and a full desktop environment, enabling AI agents to operate browsers, terminals, and other desktop applications just like humans. Based on E2B's open-source code.
AI-powered login automation. Uses Claude to classify login pages and Playwright to interact with them.
The World's First Out-of-the-Box Computer Use Agent Powered by Gemini-CLI @openmule
AI-powered computer control for automated testing in your CI/CD pipelines. Factifai agent uses vision models (Claude, GPT-4o) to interact with applications naturally - clicking, typing, and verifying results just like a human would.
Give AI eyes and hands on your desktop. Open-source MCP server for desktop automation — screenshots, UI control, browser automation, OCR. Works with Claude, Cursor, and any MCP client. macOS + Windows.
💻 Control AI agents to automate tasks on computers, enabling true autonomy with browser, terminal, and desktop interaction. Perfect for developers.
Mark web pages for use with vision-language models
This is OpenAI's computer use hooked up to a Chrome extension.
✨ Use natural language to control your browser, powered by an LLM and Playwright
ChatGPT Agent but in Cloudflare Containers
Anthropic's computer use implementation in Node.js
This is the CRUD backend for our QA test application
Auto-Browse: AI-Enabled Browser Automation