SwiftAutoGUI

A Swift library for macOS automation — mouse, keyboard, screenshots, image recognition, and AI-powered agents.

This project is inspired by PyAutoGUI.

Demo

The AI agent autonomously observes the screen and executes actions to achieve a goal:

sagui agent "Open Safari and search for Swift"

Requirements

  • macOS 26.0+
  • Swift 6.0+

Installation

Swift Package Manager

SwiftAutoGUI is available through Swift Package Manager.

In Package.swift, add the following:

dependencies: [
    // Dependencies declare other packages that this package depends on.
    .package(url: "https://github.com/NakaokaRei/SwiftAutoGUI", branch: "master")
],
targets: [
    .target(
        name: "MyProject",
        dependencies: [..., "SwiftAutoGUI"]
    )
    ...
]
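For orientation, a complete manifest built around the fragment above might look like the following sketch. The package name MyProject and the dependency line come from the fragment; the tools version, the platform string, and the choice of an executable target are assumptions based on the Requirements section and should be adjusted to your project.

```swift
// swift-tools-version: 6.0
// Assumption: tools version 6.0, matching the "Swift 6.0+" requirement.
import PackageDescription

let package = Package(
    name: "MyProject",
    // Assumption: mirrors the "macOS 26.0+" requirement listed above.
    platforms: [.macOS("26.0")],
    dependencies: [
        // Tracks the master branch, as in the fragment above.
        .package(url: "https://github.com/NakaokaRei/SwiftAutoGUI", branch: "master")
    ],
    targets: [
        // Assumption: an executable target; use .target for a library.
        .executableTarget(
            name: "MyProject",
            dependencies: ["SwiftAutoGUI"]
        )
    ]
)
```

Depending on a branch means builds can change whenever master moves; pinning to a revision is an option if you need reproducible builds.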

Homebrew (sagui CLI)

The sagui command-line tool is available via the Homebrew tap:

brew install NakaokaRei/tap/sagui

After installation, grant Accessibility permission to your terminal application in System Settings → Privacy & Security → Accessibility.

Example Usage

For more details, see the DocC-style documentation.

AI Agent

SwiftAutoGUI includes an Agent that can autonomously observe the screen, reason about what it sees, and execute actions in a loop until a goal is achieved. This follows the ReAct (Observe → Think → Act) pattern using a vision-capable LLM.

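import SwiftAutoGUI

// Configure a vision-capable backend; replace the placeholder with your own API key.
let backend = OpenAIVisionBackend(apiKey: "sk-...", model: "gpt-4o")
// Cap the Observe → Think → Act loop at 15 iterations.
let agent = Agent(backend: backend, maxIterations: 15)

// Run until the goal is reached or the iteration limit is hit.
let result = try await agent.run(goal: "Open Safari and search for Swift")
print("Completed: \(result.completed), Steps: \(result.iterationsUsed)")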
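To make the Observe → Think → Act pattern concrete, here is a minimal, self-contained sketch of such a loop. It is not SwiftAutoGUI's implementation: `Decision`, `DecisionBackend`, `LoopResult`, `runLoop`, and `ScriptedBackend` are hypothetical names invented for illustration, and a scripted stand-in replaces the vision-capable LLM so the example runs without a network call or screen access.

```swift
// Hypothetical types for illustration only; none of these are SwiftAutoGUI APIs.

/// What the "model" decides after looking at the latest observation.
enum Decision: Equatable {
    case act(String)   // perform a named action
    case done          // the goal has been reached
}

/// Stand-in for a vision-capable LLM backend.
protocol DecisionBackend {
    func decide(goal: String, observation: String, history: [String]) -> Decision
}

/// Outcome of a run, loosely mirroring the `completed` and `iterationsUsed`
/// fields printed in the example above.
struct LoopResult: Equatable {
    let completed: Bool
    let iterationsUsed: Int
}

func runLoop<B: DecisionBackend>(
    goal: String,
    backend: B,
    maxIterations: Int,
    observe: () -> String,
    perform: (String) -> Void
) -> LoopResult {
    var history: [String] = []
    var iterations = 0
    while iterations < maxIterations {
        iterations += 1
        let observation = observe()                      // Observe
        let decision = backend.decide(goal: goal,        // Think
                                      observation: observation,
                                      history: history)
        switch decision {
        case .done:
            return LoopResult(completed: true, iterationsUsed: iterations)
        case .act(let action):
            perform(action)                              // Act
            history.append(action)
        }
    }
    // The iteration budget ran out before the backend reported success.
    return LoopResult(completed: false, iterationsUsed: iterations)
}

/// A scripted backend that replays fixed steps, then reports completion.
struct ScriptedBackend: DecisionBackend {
    let steps: [String]
    func decide(goal: String, observation: String, history: [String]) -> Decision {
        history.count < steps.count ? .act(steps[history.count]) : .done
    }
}

var performed: [String] = []
let sketchResult = runLoop(
    goal: "Open Safari and search for Swift",
    backend: ScriptedBackend(steps: ["open Safari", "type query", "press return"]),
    maxIterations: 15,
    observe: { "screenshot placeholder" },
    perform: { performed.append($0) }
)
print("Completed: \(sketchResult.completed), Steps: \(sketchResult.iterationsUsed)")
// prints "Completed: true, Steps: 4"
```

The iteration cap is what keeps the loop from running indefinitely when the model never reports success. In this sketch, the final "done" decision also counts as an iteration, which is why three actions take four steps.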

Basic Usage

import SwiftAutoGUI

// Execute single actions
await Action.leftClick.execute()
await Action.write("Hello, World!").execute()
await Action.keyShortcut([.command, .a]).execute()  // Select all

// Build and execute action sequences
let actions: [Action] = [
    .move(to: CGPoint(x: 100, y: 100)),
    .wait(0.5),
    .leftClick,
    .write("Hello, SwiftAutoGUI!"),
    .keyShortcut([.returnKey])
]
await actions.execute()

Claude Code Plugin

SwiftAutoGUI ships as a Claude Code plugin so Claude can control macOS GUI applications via the sagui CLI — taking screenshots, clicking buttons, typing text, scrolling, and more.

Install from the marketplace

Inside Claude Code:

/plugin marketplace add NakaokaRei/SwiftAutoGUI
/plugin install swift-auto-gui@swift-auto-gui

This installs the macos-control skill, which is invoked as /swift-auto-gui:macos-control. The skill walks Claude through installing the sagui binary the first time it's needed (Swift 6.2+ toolchain required).

Permissions

Grant both of the following permissions to the application running Claude Code (Terminal.app, iTerm, etc.):

  • Accessibility — System Settings → Privacy & Security → Accessibility
  • Screen Recording — System Settings → Privacy & Security → Screen Recording
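If you want to confirm from code that both permissions are in place, the following sketch queries them using Apple's system APIs, `AXIsProcessTrusted()` and `CGPreflightScreenCaptureAccess()`. The helper `missingPermissions` is a hypothetical name introduced here, and whether these two checks match exactly what sagui verifies internally is an assumption.

```swift
import ApplicationServices   // AXIsProcessTrusted
import CoreGraphics          // CGPreflightScreenCaptureAccess

/// Hypothetical helper: returns the names of permissions that are still missing.
func missingPermissions(accessibility: Bool, screenRecording: Bool) -> [String] {
    var missing: [String] = []
    if !accessibility { missing.append("Accessibility") }
    if !screenRecording { missing.append("Screen Recording") }
    return missing
}

// Both calls only query the current state; neither shows a system prompt.
let missing = missingPermissions(
    accessibility: AXIsProcessTrusted(),
    screenRecording: CGPreflightScreenCaptureAccess()
)

if missing.isEmpty {
    print("All required permissions are granted.")
} else {
    print("Missing: \(missing.joined(separator: ", ")). Grant them in System Settings → Privacy & Security.")
}
```

When run from a terminal, the result typically reflects the permissions of the terminal application itself, which is why the permissions above are granted to Terminal.app or iTerm rather than to an individual binary.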

For full skill details, see plugins/swift-auto-gui/skills/macos-control/SKILL.md.

License

MIT license. See the LICENSE file for details.
