lixinso/vmClaw

vmClaw 🦀

Don't share your pool with your Claws.
Give each AI employee its own identity and digital workspace.
You run the fleet like a boss.

vmClaw captures your VM screen, sends it to an AI vision model, and executes the actions it decides on (clicks, typing, keyboard shortcuts, scrolling) in a continuous loop until the task is done.

Imagine running a company staffed by AI employees. Each one has a unique identity and works inside its own VM, giving every agent a clean, isolated workspace that never touches your host system. From your host machine, you act as the boss: you assign tasks, supervise the work, and interact with each employee separately, while your personal identity and environment stay fully isolated from the identities of your AI workforce.

Why vmClaw?

  • Multi-model: GPT-5.4, Claude Opus 4.6, GPT-4o, DeepSeek, Grok, and 15+ more models.
  • Local: Runs on your Windows machine. Screenshots are sent directly to the AI API you choose and to no other service.
  • Universal: Supports Hyper-V, VMware, VirtualBox, and QEMU VMs.
  • AI Memory: Stores past task executions in a local vector database and recalls similar successes as few-shot examples, so it improves with every run. All memory stays on your machine; nothing is shared or uploaded.
  • Simple: One command to start. No complex setup.
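
The AI Memory idea above can be illustrated with a small sketch. It assumes past runs are stored as pairs of a task embedding and the recorded actions, and that recall is a cosine-similarity search restricted to successful runs. The `MemoryEntry` type, the `recall` function, and the toy three-dimensional vectors are hypothetical and are not taken from vmClaw's code.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    """One remembered task run (hypothetical structure, not vmClaw's schema)."""
    task: str
    embedding: list[float]
    actions: list[str]
    success: bool


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query: list[float], memory: list[MemoryEntry], k: int = 2) -> list[MemoryEntry]:
    """Return the k successful entries most similar to the query embedding."""
    successes = [m for m in memory if m.success]
    return sorted(successes, key=lambda m: cosine(query, m.embedding), reverse=True)[:k]


# Toy data: real embeddings would come from an embedding model.
memory = [
    MemoryEntry("open notepad", [1.0, 0.0, 0.0], ["key win+r", "type notepad", "key enter"], True),
    MemoryEntry("open calculator", [0.9, 0.1, 0.0], ["key win+r", "type calc", "key enter"], True),
    MemoryEntry("install updates", [0.0, 1.0, 0.0], ["click Start"], False),
    MemoryEntry("change wallpaper", [0.0, 0.0, 1.0], ["click Desktop"], True),
]

examples = recall([1.0, 0.05, 0.0], memory, k=2)

# Format the recalled runs as few-shot examples for the next prompt.
few_shot = "\n".join(f"Task: {m.task}\nActions: {'; '.join(m.actions)}" for m in examples)
print(few_shot)
```

Filtering on `success` before ranking keeps failed runs out of the prompt, which matches the "recalls similar successes" wording above.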

Fleet Mode β€” Control All Your VMs From One Screen

What if one AI agent isn't enough? Fleet mode lets you command VMs across every machine on your network from a single GUI.

  ┌───────────────────────────────────────────────────────┐
  │              Your Machine (Hub)                       │
  │   vmClaw GUI  ──────────────────────────────────────  │
  │   ┌───────────────┐  ┌────────────┐  ┌────────────┐   │
  │   │ VM: Alice     │  │ VM: Bob    │  │ VM: Carol  │   │
  │   │ (local)       │  │ (local)    │  │ (local)    │   │
  │   └───────────────┘  └────────────┘  └────────────┘   │
  └──────────────┬────────────────────────────────────────┘
                 │  WebSocket
        ┌────────┴────────┐
        │  Lab Server     │         ┌──────────────────┐
        │  10.0.0.9       │─────────│  More machines   │
        │  ┌────────────┐ │         │  ...             │
        │  │ VM: Dev-01 │ │         └──────────────────┘
        │  │ VM: Dev-02 │ │
        │  │ VM: Dev-03 │ │
        │  └────────────┘ │
        └─────────────────┘

  • One click, any VM: Browse VMs across all machines in a sidebar tree. Click one, assign a task, and watch it execute in real time.
  • Live streaming: Screenshots, logs, and actions from remote nodes stream back over WebSocket instantly. It feels like the VM is running locally.
  • Zero-config discovery: Just add a peer's URL to config.toml. No VPN, no cloud, no port forwarding gymnastics. It works on your LAN out of the box.
  • Scale your AI workforce: Run 3 VMs on your desktop, 5 on a lab server, 10 in a rack. Assign tasks to any of them from one place. Each AI employee works independently inside its own VM.
  • Proxy chains: Node A discovers Node B's peers automatically, so A -> B -> C routing works without configuring every node.

# config.toml: that's all you need
[fleet]
enabled = true
node_name = "my-pc"
listen_port = 8077

[[fleet.peers]]
name = "lab-server"
url = "http://192.168.1.50:8077"

Fleet turns vmClaw from a single-machine tool into a distributed AI operations center. Think Ansible, but instead of running shell commands, your agents see the screen and use it like a human would.
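
The proxy-chain behaviour described above (A reaches C through B) amounts to a graph search over each node's peer list. The following is a minimal sketch, assuming every node can report its direct peers. The `PEERS` map and the `find_route` function are hypothetical stand-ins for whatever vmClaw actually exchanges between nodes.

```python
from collections import deque
from typing import Optional

# Hypothetical view of which node lists which as a direct peer.
PEERS = {
    "A": ["B"],
    "B": ["C"],
    "C": [],
}


def find_route(start: str, target: str, peers: dict[str, list[str]]) -> Optional[list[str]]:
    """Breadth-first search for the shortest chain of nodes from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in peers.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # target is not reachable from start


print(find_route("A", "C", PEERS))
```

Breadth-first search returns the chain with the fewest hops, and the `seen` set keeps the search from looping if two nodes list each other as peers.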

Quick Start

# Create a virtual environment (skip if .venv already exists)
python -m venv .venv

# Install
.\.venv\Scripts\pip.exe install -e .

# Run as Administrator (required to inject input into VM windows)
.\.venv\Scripts\python.exe -m vmclaw run

# Or launch the GUI
.\.venv\Scripts\python.exe -m vmclaw gui

That's it. vmClaw will walk you through selecting a provider, model, and VM window interactively.

How It Works

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Capture VM  │────>│  AI Vision   │────>│  Execute     │
│  Screenshot  │     │  Model       │     │  Action      │
└──────────────┘     └──────────────┘     └──────────────┘
       ^                                        │
       └────────────────────────────────────────┘
                    repeat until done

  1. Capture: Takes a screenshot of the selected VM window
  2. Think: Sends the screenshot + task description to an AI vision model
  3. Act: Executes the AI's decision (click, type, key press, scroll)
  4. Repeat: Loops until the AI reports the task is done (or hits the action limit)
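
The four steps above can be sketched as a small loop. This is an illustration under stated assumptions, not vmClaw's implementation: the `Action` type, the `run_agent` function, and the three callables are hypothetical, and the capture, model, and input-injection parts are replaced by stand-ins so the example runs without a VM or an API key.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    """A single decision from the model (hypothetical shape)."""
    kind: str          # "click", "type", "key", "scroll", or "done"
    payload: str = ""


def run_agent(
    task: str,
    capture: Callable[[], bytes],
    decide: Callable[[bytes, str], Action],
    execute: Callable[[Action], None],
    max_actions: int = 50,
    action_delay: float = 1.0,
) -> bool:
    """Run capture -> think -> act until the model says done or the limit is hit.

    Returns True if the model reported completion, False if max_actions ran out.
    """
    for _ in range(max_actions):
        screenshot = capture()               # 1. Capture
        action = decide(screenshot, task)    # 2. Think
        if action.kind == "done":
            return True
        execute(action)                      # 3. Act
        time.sleep(action_delay)             # 4. Repeat after a short pause
    return False


# Stand-ins: a scripted "model" and a list that records executed actions.
script = iter([Action("click", "100,200"), Action("type", "hello"), Action("done")])
performed: list[Action] = []

finished = run_agent(
    task="type hello into the focused window",
    capture=lambda: b"fake-png-bytes",
    decide=lambda shot, task: next(script),
    execute=performed.append,
    action_delay=0.0,
)
print(finished, [a.kind for a in performed])
```

Passing the three stages in as callables keeps the loop itself testable, and the `max_actions` bound plays the role of the action limit mentioned in step 4.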

Supported Models

| Provider | Models | Auth |
| --- | --- | --- |
| GitHub Copilot (free) | Claude Opus 4.6, Claude Sonnet 4.6, GPT-5.4, GPT-5-mini, GPT-4o, GPT-4.1, o3, o4-mini, DeepSeek-R1, Grok-3, and more | gh auth login (browser) |
| OpenAI (API key) | GPT-4o, GPT-4.1, o3, o4-mini, and any OpenAI model | OPENAI_API_KEY env var |

Commands

python -m vmclaw run         # Start the AI agent loop (CLI)
python -m vmclaw gui         # Launch the graphical interface
python -m vmclaw list        # List detected VM windows
python -m vmclaw list-all    # List all windows (for debugging)
python -m vmclaw capture     # Capture a VM screenshot

Requirements

  • Windows 10/11 with Python 3.10+
  • A running VM (Hyper-V, VMware, VirtualBox, or QEMU)
  • GitHub CLI (gh) for GitHub Copilot auth, or an OpenAI API key

Configuration

vmClaw works out of the box with interactive prompts. For automation, create a config.toml:

[api]
provider = "github"    # or "openai"
model = "claude-opus-4.6"

[agent]
max_actions = 50       # Safety limit
action_delay = 1.0     # Seconds between actions
screenshot_width = 1024
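
The README does not say how `screenshot_width` is used. One plausible reading, stated here purely as an assumption, is that the screenshot is downscaled to that width before it is sent to the model. If so, any coordinates the model returns would need to be mapped back to the real window size. The sketch below shows that arithmetic; `scaled_size` and `to_window_coords` are hypothetical helpers, not vmClaw functions.

```python
def scaled_size(window_w: int, window_h: int, screenshot_width: int = 1024) -> tuple[int, int]:
    """Size of the downscaled screenshot, preserving aspect ratio (assumed behaviour)."""
    if window_w <= screenshot_width:
        return window_w, window_h  # never upscale a small window
    return screenshot_width, round(window_h * screenshot_width / window_w)


def to_window_coords(
    x: int, y: int, window_w: int, window_h: int, screenshot_width: int = 1024
) -> tuple[int, int]:
    """Map a point in the downscaled screenshot back to window pixels."""
    shot_w, shot_h = scaled_size(window_w, window_h, screenshot_width)
    return round(x * window_w / shot_w), round(y * window_h / shot_h)


# A 1920x1080 VM window downscaled to 1024 pixels wide becomes 1024x576,
# so the centre of the screenshot maps to the centre of the window.
print(scaled_size(1920, 1080))
print(to_window_coords(512, 288, 1920, 1080))
```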

How to Contribute

  • Run VS Code as Administrator (vmClaw needs elevated privileges to inject input into VM windows).

License

MIT
