Atlas

AI agent that lives on your desktop.
It sees your screen, understands what you need, and gets things done — hands-free.


⚠️ Atlas is in active development (v0.2.3).

  • 🤖 LLM support: Gemini (including native Computer Use API) and OpenAI. More providers on the way.
  • 🖥 Screen control: Gemini 3.x models use the native Computer Use API for precise actions. Older models fall back to vision-based coordinate prediction.
  • 💻 Platform: Windows only for now. macOS & Linux support is planned.
  • 🐛 Found a bug? We'd love to hear about it — open an issue.

What is Atlas?

Atlas is an AI-powered desktop agent that works alongside you as a transparent overlay. Press Ctrl+Space, tell it what to do — and it figures out the rest: navigating apps, clicking buttons, typing text, searching the web, finding files, running commands.

Think of it as a copilot for your entire OS.

  • 🖥 Sees your screen — captures what's on your display and understands the context
  • 🧠 Thinks before it acts — plans multi-step tasks and shows progress in real time
  • 🖱 Controls your computer — mouse, keyboard, and terminal — all automated
  • 🎯 Shows what it's doing — you can see the agent's cursor moving on screen
  • 🔍 Searches the web — finds answers and brings them back, no tab-switching needed
  • 📂 Finds your files — searches local files and folders by name, right from chat
  • 🗣 Speaks to you — real-time voice responses with streaming TTS
  • 🎙 Listens to you — local speech-to-text with wake word activation, no cloud required
  • 🔊 Sound feedback — distinct sounds for every state: activation, processing, task complete, warnings
  • 🛡 Asks before doing anything risky — built-in safety system with permission prompts

✨ Key Features

🔮 The Orb

A glowing AI indicator that shows you exactly what Atlas is doing — idle, thinking, acting, or waiting for your input. Always visible, never in the way.

🏝 Islands

Context-aware floating panels that appear when relevant:

  • Action Island — shows the current task and progress
  • Response Island — streams Atlas's thoughts and replies word by word
  • Permission Island — asks for confirmation before risky operations
  • Microtask Island — your task queue with real-time step progress (queue new tasks while the agent is busy)
  • Search Island — web search results and local file search results
  • Listening Island — live transcript display during voice input
  • Warning Island — dismissible warnings for errors and quota issues

🎯 Agent Cursor

When Atlas controls your desktop, you can see its cursor moving on screen — clicking, typing, and scrolling — so you always know what's happening.

🖥 Computer Use

With compatible Gemini 3.x models, Atlas uses the native Computer Use API for precise screen control — clicking, typing, scrolling, navigating, and searching — all without opening extra apps. Multi-monitor setups are supported.

🧩 Smart Task Planning

Before executing complex commands, Atlas breaks them into high-level steps (2–5) and displays them in the Task Queue. You see planned steps before execution begins and watch progress as each step completes.
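As a rough illustration of the planning step described above, here is a minimal sketch of a plan that keeps itself within the 2 to 5 step range and tracks progress. The names `PlanStep`, `buildPlan`, `completeNext`, and `progress` are hypothetical and are not taken from the Atlas codebase. Merging overflow steps into the last one is also an assumption about how a planner might enforce the limit.

```typescript
// Hypothetical types for illustration; Atlas's real planner is not shown in this README.
type StepStatus = "pending" | "done";

interface PlanStep {
  title: string;
  status: StepStatus;
}

const MIN_STEPS = 2;
const MAX_STEPS = 5;

// Turn raw step titles (for example, from an LLM response) into a bounded plan.
function buildPlan(titles: string[]): PlanStep[] {
  const cleaned = titles
    .map((t) => t.trim())
    .filter((t) => t.length > 0);

  if (cleaned.length < MIN_STEPS) {
    throw new Error("plan needs at least " + MIN_STEPS + " steps, got " + cleaned.length);
  }

  // Assumption: extra steps are folded into the final step instead of being dropped.
  const bounded =
    cleaned.length > MAX_STEPS
      ? cleaned.slice(0, MAX_STEPS - 1).concat(cleaned.slice(MAX_STEPS - 1).join("; "))
      : cleaned;

  return bounded.map((title): PlanStep => ({ title, status: "pending" }));
}

// Mark the first unfinished step as done and return a new plan (no mutation).
function completeNext(plan: PlanStep[]): PlanStep[] {
  let advanced = false;
  return plan.map((step): PlanStep => {
    if (!advanced && step.status !== "done") {
      advanced = true;
      return { title: step.title, status: "done" };
    }
    return step;
  });
}

// Human-readable progress such as "1/3" for a task queue display.
function progress(plan: PlanStep[]): string {
  const done = plan.filter((s) => s.status === "done").length;
  return done + "/" + plan.length;
}
```

A UI layer could call `completeNext` after each step finishes and render `progress(plan)` next to the step titles.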

🎭 Personas

Create multiple AI agents with unique personalities, knowledge, and voices. Each persona has its own memory and prompt settings — switch between them from the tray menu.

🧠 Memory

Atlas remembers your preferences and context across sessions. It learns facts about you from conversations and uses them to give better responses over time. Browse conversation history and view, edit, or delete learned facts in Settings.

🎙 Voice Input

Local offline speech-to-text via Vosk — just say the wake word (the active persona's name) and Atlas starts listening. No cloud API required.
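To make the wake word idea concrete, the sketch below scans a transcript for the active persona's name and returns whatever was said after it. It is a simplified illustration only: `extractCommand` is a hypothetical helper, it does not call Vosk, and it assumes the wake word is a single word.

```typescript
// Hypothetical helper: returns the text spoken after the wake word,
// or null when the wake word does not appear in the transcript.
function extractCommand(transcript: string, personaName: string): string | null {
  const words = transcript
    .trim()
    .split(/\s+/)
    .filter((w) => w.length > 0);

  // Compare case-insensitively and ignore punctuation attached to a word.
  const normalized = words.map((w) =>
    w.toLowerCase().replace(/^[.,!?;:]+|[.,!?;:]+$/g, "")
  );

  const wakeIndex = normalized.indexOf(personaName.trim().toLowerCase());
  if (wakeIndex === -1) {
    return null;
  }
  return words.slice(wakeIndex + 1).join(" ");
}
```

With a persona named "Atlas", the transcript "hey atlas open the calculator" would yield "open the calculator", while a transcript without the name would be ignored.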

✍️ Editable Prompts

Full control over the AI's behavior — modify system, action, and safety prompts directly from the Settings UI. Reset to defaults anytime.

⚙️ Customizable Layout

Choose where Atlas appears on screen (left, right, or center) and configure your preferred activation hotkey — all from Settings.

🔧 Debug Logging

Enable per-request session logs to trace the full pipeline: intent classification → LLM calls → actions → response streaming — with precise timing for every stage.
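The sketch below shows one way per-stage timing for a single request could be recorded. `SessionLog` and its methods are hypothetical names, not Atlas's actual logger. The example is synchronous and takes an injectable clock to stay small and testable, whereas a real pipeline would most likely be asynchronous.

```typescript
// Hypothetical per-request log; Atlas's real logging format is not shown in this README.
interface StageTiming {
  stage: string;
  ms: number;
}

class SessionLog {
  private readonly timings: StageTiming[] = [];

  // The clock is injectable so tests can supply deterministic timestamps.
  constructor(private readonly now: () => number = () => Date.now()) {}

  // Run one pipeline stage and record its duration, even if it throws.
  time<T>(stage: string, fn: () => T): T {
    const start = this.now();
    try {
      return fn();
    } finally {
      this.timings.push({ stage, ms: this.now() - start });
    }
  }

  // One-line trace of the stages in the order they ran.
  summary(): string {
    return this.timings.map((t) => t.stage + ": " + t.ms + "ms").join(" → ");
  }
}
```

Wrapping each stage (intent classification, LLM call, action, response streaming) in `log.time(...)` would produce a trace that makes the slowest stage easy to spot.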


🚀 Getting Started

Download & Install

  1. Go to Releases and download the latest installer for Windows

  2. Run the installer — Atlas will appear in your system tray

  3. Get a Gemini API key: go to Google AI Studio → sign in → Create API Key → copy it

  4. Click the Atlas tray icon → Settings → Intelligence tab → paste your API key

  5. Set the recommended models in the Intelligence tab:

    | Setting | Free tier | Paid tier |
    |---|---|---|
    | Text model | gemini-3.1-flash-lite-preview | gemini-3.1-flash-lite-preview |
    | Vision model | gemini-3.1-flash-lite-preview | gemini-3-flash-preview |

    The vision model handles screen control and Computer Use. The paid-tier model is more accurate but requires a billing-enabled API key.

  6. (Optional) For voice output:

    • Alice (free, no API key): Voice tab → select Alice → done!
    • ElevenLabs (premium voices): get an ElevenLabs API key → Voice tab → paste key + voice ID
  7. Press Ctrl+Space and start giving Atlas tasks 🎉

Build from Source

For contributors and developers who want to run Atlas from source.

git clone https://github.com/dortanes/atlas.git
cd atlas
yarn install
yarn dev

Requires: Node.js ≥ 20 · Yarn ≥ 1.22


🗺 Roadmap

| Status | Feature |
|---|---|
| ✅ | Transparent glassmorphism overlay with Orb + Island UI |
| ✅ | LLM integration (Gemini + OpenAI) with multi-provider architecture |
| ✅ | Screen vision + desktop automation (robotjs) |
| ✅ | Native Gemini Computer Use API |
| ✅ | Smart task planning with step-by-step progress |
| ✅ | Agent cursor animations (click, type, scroll overlays) |
| ✅ | Streaming TTS (ElevenLabs + Alice) |
| ✅ | Persona system with isolated memory & custom voices |
| ✅ | Web search + local file search |
| ✅ | Settings UI with prompt editor + debug logging |
| ✅ | Intent classification (direct / action / chat) |
| ✅ | Context caching (Gemini prompt caching for token optimization) |
| ✅ | Voice input (wake word + local STT via Vosk) |
| 🔜 | Action whitelist/blacklist & audit log |
| 🔜 | Onboarding flow |
| 🔜 | Auto-update |

⭐ Support the Project

If you find Atlas useful, please consider giving the repository a star ⭐ — it helps others discover the project and motivates further development!

🤝 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request.

📜 License

Apache License 2.0 — use it, modify it, build on it.


Vibecoded with ❤️ by dortanes

About

An AI-powered computer-use agent built with Electron. Automate desktop tasks by letting AI see and interact with your OS.
