AI agent that lives on your desktop.
It sees your screen, understands what you need, and gets things done — hands-free.
⚠️ Atlas is in active development (v0.2.3).
- 🤖 LLM support: Gemini (including native Computer Use API) and OpenAI. More providers on the way.
- 🖥 Screen control: Gemini 3.x models use native Computer Use API for precise actions. Older models use vision-based coordinate prediction.
- 💻 Platform: Windows only for now. macOS & Linux support is planned.
- 🐛 Found a bug? We'd love to hear about it — open an issue.
Atlas is an AI-powered desktop agent that works alongside you as a transparent overlay. Press Ctrl+Space, tell it what to do — and it figures out the rest: navigating apps, clicking buttons, typing text, searching the web, finding files, running commands.
Think of it as a copilot for your entire OS.
- 🖥 Sees your screen — captures what's on your display and understands the context
- 🧠 Thinks before it acts — plans multi-step tasks and shows progress in real time
- 🖱 Controls your computer — mouse, keyboard, and terminal — all automated
- 🎯 Shows what it's doing — you can see the agent's cursor moving on screen
- 🔍 Searches the web — finds answers and brings them back, no tab-switching needed
- 📂 Finds your files — searches local files and folders by name, right from chat
- 🗣 Speaks to you — real-time voice responses with streaming TTS
- 🎙 Listens to you — local speech-to-text with wake word activation, no cloud required
- 🔊 Sound feedback — distinct sounds for every state: activation, processing, task complete, warnings
- 🛡 Asks before doing anything risky — built-in safety system with permission prompts
A glowing AI indicator that shows you exactly what Atlas is doing — idle, thinking, acting, or waiting for your input. Always visible, never in the way.
Context-aware floating panels that appear when relevant:
- Action Island — shows the current task and progress
- Response Island — streams Atlas's thoughts and replies word by word
- Permission Island — asks for confirmation before risky operations
- Microtask Island — your task queue with real-time step progress (queue new tasks while the agent is busy)
- Search Island — web search results and local file search results
- Listening Island — live transcript display during voice input
- Warning Island — dismissible warnings for errors and quota issues
When Atlas controls your desktop, you can see its cursor moving on screen — clicking, typing, and scrolling — so you always know what's happening.
With compatible Gemini 3.x models, Atlas uses the native Computer Use API for precise screen control — clicking, typing, scrolling, navigating, and searching — all without opening extra apps. Multi-monitor setups are supported.
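At its core, a Computer Use loop maps model-emitted actions onto local input events. Here is a minimal sketch in TypeScript — the action shape and the `InputInjector` interface are illustrative assumptions, not Atlas's actual types or the Gemini SDK's:

```typescript
// Hypothetical shape of an action returned by a Computer Use-style model.
type ScreenAction =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "scroll"; dx: number; dy: number };

// Injector interface so the dispatcher stays testable without robotjs.
interface InputInjector {
  moveAndClick(x: number, y: number): void;
  typeText(text: string): void;
  scroll(dx: number, dy: number): void;
}

// Dispatch one model action to the local input layer; returns a log line
// that a UI element (e.g. the Action Island) could display.
function dispatch(action: ScreenAction, io: InputInjector): string {
  switch (action.kind) {
    case "click":
      io.moveAndClick(action.x, action.y);
      return `click @ (${action.x}, ${action.y})`;
    case "type":
      io.typeText(action.text);
      return `type "${action.text}"`;
    case "scroll":
      io.scroll(action.dx, action.dy);
      return `scroll (${action.dx}, ${action.dy})`;
  }
}
```

Keeping the injector behind an interface is what lets the same dispatcher drive robotjs on one monitor or a virtual screen in tests.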
Before executing complex commands, Atlas breaks them into 2–5 high-level steps and displays them in the Task Queue. You see the planned steps before execution begins and watch progress as each step completes.
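The planner's output can be modeled as a small data structure with a step cursor. This is a sketch under assumed names (`PlannedTask`, `progress`, `completeStep` are hypothetical, not Atlas's internals):

```typescript
// Hypothetical representation of one planned task in the queue.
interface PlannedTask {
  goal: string;
  steps: string[];   // 2–5 high-level steps from the planner
  completed: number; // how many steps have finished
}

// Fraction of steps finished — what a progress bar would render.
function progress(task: PlannedTask): number {
  return task.steps.length === 0 ? 1 : task.completed / task.steps.length;
}

// Advance after a step completes; clamps at the end of the plan.
function completeStep(task: PlannedTask): PlannedTask {
  return { ...task, completed: Math.min(task.completed + 1, task.steps.length) };
}
```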
Create multiple AI agents with unique personalities, knowledge, and voices. Each persona has its own memory and prompt settings — switch between them from the tray menu.
Atlas remembers your preferences and context across sessions. It learns facts about you from conversations and uses them to give better responses over time. Browse conversation history and view, edit, or delete learned facts in Settings.
Local offline speech-to-text via Vosk — just say the wake word (the active persona's name) and Atlas starts listening. No cloud API required.
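With a local STT engine like Vosk, wake-word activation typically reduces to scanning partial transcripts for the persona's name. A rough sketch — the matching rule here is illustrative, not Atlas's actual logic:

```typescript
// Return true once the wake word (the active persona's name) appears
// as a whole word in a partial transcript from the local STT engine.
function heardWakeWord(transcript: string, personaName: string): boolean {
  const words = transcript.toLowerCase().split(/\s+/);
  return words.includes(personaName.toLowerCase());
}
```

Matching whole words (rather than substrings) avoids false triggers from phrases that merely contain the name's letters.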
Full control over the AI's behavior — modify system, action, and safety prompts directly from the Settings UI. Reset to defaults anytime.
Choose where Atlas appears on screen (left, right, or center) and configure your preferred activation hotkey — all from Settings.
Enable per-request session logs to trace the full pipeline: intent classification → LLM calls → actions → response streaming — with precise timing for every stage.
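Per-stage timing of this kind can be captured with a small tracer that wraps each pipeline stage. A sketch with assumed names (`SessionTrace` is not Atlas's actual logger):

```typescript
// Minimal per-request tracer: records how long each pipeline stage takes.
class SessionTrace {
  private stages: { name: string; ms: number }[] = [];

  // Run a stage, recording its duration even if it throws.
  stage<T>(name: string, fn: () => T): T {
    const start = performance.now();
    try {
      return fn();
    } finally {
      this.stages.push({ name, ms: performance.now() - start });
    }
  }

  // One line per stage, e.g. "intent: 12.3ms".
  report(): string[] {
    return this.stages.map((s) => `${s.name}: ${s.ms.toFixed(1)}ms`);
  }
}
```

Wrapping stages like `trace.stage("intent", classify)` then `trace.stage("llm", callModel)` yields a timed trace of the whole pipeline in order.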
1. Go to Releases and download the latest installer for Windows.
2. Run the installer — Atlas will appear in your system tray.
3. Get a Gemini API key: go to Google AI Studio → sign in → Create API Key → copy it.
4. Click the Atlas tray icon → Settings → Intelligence tab → paste your API key.
5. Set the recommended models in the Intelligence tab:

   | Setting | Free tier | Paid tier |
   |---|---|---|
   | Text model | `gemini-3.1-flash-lite-preview` | `gemini-3.1-flash-lite-preview` |
   | Vision model | `gemini-3.1-flash-lite-preview` | `gemini-3-flash-preview` |

   The vision model handles screen control & Computer Use. The paid-tier model is more accurate but requires a billing-enabled API key.

6. (Optional) For voice output:
   - Alice (free, no API key): Voice tab → select Alice → done!
   - ElevenLabs (premium voices): get an ElevenLabs API key → Voice tab → paste key + voice ID
7. Press `Ctrl+Space` and start giving Atlas tasks 🎉
For contributors and developers who want to run Atlas from source.
```bash
git clone https://github.com/dortanes/atlas.git
cd atlas
yarn install
yarn dev
```

| Status | Feature |
|---|---|
| ✅ | Transparent glassmorphism overlay with Orb + Island UI |
| ✅ | LLM integration (Gemini + OpenAI) with multi-provider architecture |
| ✅ | Screen vision + desktop automation (robotjs) |
| ✅ | Native Gemini Computer Use API |
| ✅ | Smart task planning with step-by-step progress |
| ✅ | Agent cursor animations (click, type, scroll overlays) |
| ✅ | Streaming TTS (ElevenLabs + Alice) |
| ✅ | Persona system with isolated memory & custom voices |
| ✅ | Web search + local file search |
| ✅ | Settings UI with prompt editor + debug logging |
| ✅ | Intent classification (direct / action / chat) |
| ✅ | Context caching (Gemini prompt caching for token optimization) |
| ✅ | Voice input (wake word + local STT via Vosk) |
| 🔜 | Action whitelist/blacklist & audit log |
| 🔜 | Onboarding flow |
| 🔜 | Auto-update |
If you find Atlas useful, please consider giving the repository a star ⭐ — it helps others discover the project and motivates further development!
Contributions are welcome! Feel free to open an issue or submit a pull request.
Apache License 2.0 — use it, modify it, build on it.
Vibecoded with ❤️ by dortanes