OfficeAI

Desktop app that turns your AI agents into employees of a virtual isometric office

Every running AI agent (Claude Code, Gemini CLI, Codex CLI, ChatGPT, ...) appears as an animated character in a 2D isometric office. Install, open — see all your agents in real time. Zero changes to your CLI workflow required.


Quick Start

Prerequisites

Install & Run

make install
make dev

The app opens in a native 1280x800 window. The Rust backend automatically starts scanning processes and logs.


Basic Usage

  1. Launch OfficeAI: Start the app using make dev or open the installed binary.
  2. Run your AI Agent: Open a separate terminal and start your preferred agent (e.g., claude, gemini-cli, or codex).
  3. Watch the Office: OfficeAI will automatically detect the new process. An employee character will appear, walk to their assigned desk, and begin reflecting the agent's real-time state (thinking, typing, or using tools).
  4. Interact: Hover over agents to see their latest response or click the status bar to see a full list of active employees.

Concept

Each running AI agent is mapped to an animated office employee in a 2D isometric scene.

Key principles:

  • 1 agent = 1 person — each AI agent process maps to a virtual employee
  • Agent naming: {model name}-{PID}, e.g. claude-423235, gemini-23512, codex-54321, chatgpt-78901
  • Open space — every agent has a personal desk
  • Zero-intrusion — the app never modifies or wraps CLI agents
  • Auto-discovery — the system detects running agents automatically
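
The naming rule above can be illustrated with a small sketch. The helper names `formatAgentName` and `parseAgentName` are hypothetical and are not taken from the OfficeAI codebase; the sketch only assumes the `{model name}-{PID}` shape described in the list.

```typescript
// Hypothetical helpers for the `{model name}-{PID}` naming scheme.
function formatAgentName(model: string, pid: number): string {
  return `${model}-${pid}`;
}

// Splits on the LAST hyphen, because a model name may itself contain hyphens.
function parseAgentName(name: string): { model: string; pid: number } | null {
  const match = /^(.+)-(\d+)$/.exec(name);
  if (match === null) return null;
  return { model: match[1], pid: Number(match[2]) };
}
```

For example, `parseAgentName("claude-423235")` yields the model `claude` and the PID `423235`, while a name without a numeric suffix yields `null`.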

App Tour

A quick walkthrough of what you see when you use OfficeAI.

Full office view

Your AI agents, visualized as office employees. When you open OfficeAI, you see a full isometric office floor. Each running AI agent occupies its own desk. The status bar at the bottom shows the total agent count, and the settings button is in the top-left corner.

Working agent with bouncing balls

Bouncing balls mean the agent is busy. Colored balls appear above an agent's head when it is actively thinking, responding, or using tools. The ball color reflects the model tier — gold for expert models, blue for senior, and so on.

Idle agent roaming the office

No balls — the agent is free. When an agent finishes its task, the balls disappear and the character leaves its desk to roam the office — visiting the water cooler, sofa, or kitchen. The process is still running, just waiting for your next prompt.

Speech bubble with agent response

Speech bubbles show what agents are saying. Hover over a working agent to see a preview of its response right inside the office, without switching to your terminal or browser.

Cancelled request visualization

Cancellations are detected in real time. If you interrupt an agent's task in your terminal (Ctrl+C), OfficeAI picks up the state change the next time it checks the agent's logs. The agent stops its current activity and returns to its idle routine.

Agents panel listing all agents

Click the status bar to see all agents. The agents panel lists every detected agent along with its current status — Thinking, Using tool, Responding, Idle, and more. Use it to quickly check who is busy and who is available.

Sub-agents panel showing background tasks

Sub-agents handle delegated work. Click the SUB-AGENTS tab to see background tasks that a main agent has spawned. This gives you visibility into parallel work happening behind the scenes.

Dock icon with active agent count badge

The app icon shows the active agent count. A red badge on your dock or taskbar icon tells you how many agents are currently working. One glance is enough to know if something is running — no need to open the app.

Settings — General tab

General settings let you control core behavior. Open Settings from the top-left gear icon. The General tab includes scan interval, animation speed, max agents, and other global preferences.

Settings — Discovery tab

Discovery settings configure how agents are found. The Discovery tab controls process scanning parameters and log file monitoring for each supported agent type.

Settings — Display tab

Display settings customize the visual experience. The Display tab adjusts office layout, zoom level, speech bubble behavior, and other rendering preferences.


How It Works

OfficeAI operates on a zero-intrusion principle — it only observes AI agents, never interferes with their work.

OS Processes (sysinfo)    Agent log files       Chrome Extension
        │                       │               (MV3 + Native Messaging)
        ▼                       ▼                       │
  Process Scanner (2s)  Log Watcher (500ms)    Extension HTTP Server
        │                       │               (localhost:7842)
        └──────────┬────────────┴───────────────────────┘
                   ▼
           Agent Registry ──► Tauri IPC Events
                   │
        ┌──────────┴──────────┐
        ▼                     ▼
   Svelte UI             PixiJS Renderer
   (overlay)             (isometric scene)

  1. Process Scanner discovers CLI agents via OS process list
  2. Log Watcher reads agent log files for status changes
  3. Chrome Extension observes browser AI chats (ChatGPT, Gemini, Claude.ai) via DOM MutationObserver
  4. State Classifier (FSM with debounce) determines agent state
  5. Agent Registry maintains state, emits Tauri IPC events, and updates the app icon badge
  6. Frontend renders agents as animated characters in an isometric office
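
Step 4 mentions a state machine with debounce. The sketch below shows one common way to debounce a stream of raw observations: a new state is committed only after it has been observed continuously for a minimum time. The class name, the state names, and the 300 ms window are assumptions for illustration. The real classifier lives in the Rust backend and may work differently.

```typescript
type RawState = "idle" | "thinking" | "responding" | "tool_use";

// Hypothetical debouncer: commits a state only once it has been stable long enough.
class DebouncedClassifier {
  private committed: RawState = "idle";
  private candidate: RawState = "idle";
  private candidateSince = 0;
  private readonly debounceMs: number;

  constructor(debounceMs: number) {
    this.debounceMs = debounceMs;
  }

  // Feed one observation with its timestamp; returns the committed state.
  observe(raw: RawState, nowMs: number): RawState {
    if (raw !== this.candidate) {
      // A different raw state restarts the stability timer.
      this.candidate = raw;
      this.candidateSince = nowMs;
    }
    const stableFor = nowMs - this.candidateSince;
    if (this.candidate !== this.committed && stableFor >= this.debounceMs) {
      this.committed = this.candidate;
    }
    return this.committed;
  }
}
```

With a 300 ms window, a brief flicker from `thinking` back to `idle` never reaches the UI, while a `thinking` signal that persists for 300 ms is committed.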

The app icon badge displays the number of active agents directly on the dock (macOS) or taskbar (Linux), so you always know how many agents are working without switching to the app window.

Bug Report: If you encounter a bug, open Settings and click Bug Report to save a diagnostic JSON file. Attach it to a GitHub Issue — no data is sent automatically.


Model Tiers

When the backend receives a model name from agent logs (e.g. "claude-opus-4-6"), it classifies it into one of four tiers:

| Tier | Keywords in model name | Examples |
|------|------------------------|----------|
| Expert | opus, ultra, gpt-4o (no -mini), o1-*, o3-* (no -mini) | Claude Opus 4, GPT-4o, Gemini Ultra, o3 |
| Senior | sonnet, pro, gpt-4 (not gpt-4o) | Claude Sonnet 4, GPT-4-turbo, Gemini Pro |
| Middle | Everything else (fallback) | Any unknown model |
| Junior | haiku, nano, flash, gpt-3.5, -mini | Claude Haiku 4, GPT-4o-mini, Gemini Flash |

Check order matters. Junior is checked first, so -mini catches o1-mini and o3-mini before the Expert rules can match them. Expert is checked next, then Senior. Any model name that matches none of the keywords falls back to Middle.
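
The check order can be made concrete with a short sketch. It assumes plain, case-insensitive substring matching and treats `o1-*` / `o3-*` as "the name is `o1` or `o3`, optionally followed by a hyphenated suffix". The function name `classifyTier` is hypothetical, and the actual classifier is part of the Rust backend.

```typescript
type Tier = "expert" | "senior" | "middle" | "junior";

function classifyTier(modelName: string): Tier {
  const name = modelName.toLowerCase();
  const hasAny = (keywords: string[]): boolean =>
    keywords.some((keyword) => name.indexOf(keyword) !== -1);

  // 1. Junior first, so "-mini" wins over "gpt-4o", "o1-" and "o3-".
  if (hasAny(["haiku", "nano", "flash", "gpt-3.5", "-mini"])) return "junior";
  // 2. Expert: keyword match, or a bare/suffixed o1 or o3 name.
  if (hasAny(["opus", "ultra", "gpt-4o"]) || /^o[13](-|$)/.test(name)) return "expert";
  // 3. Senior: "gpt-4" only reaches this point when it is not "gpt-4o".
  if (hasAny(["sonnet", "pro", "gpt-4"])) return "senior";
  // 4. Fallback for anything unknown.
  return "middle";
}
```

Swapping steps 1 and 2 would classify `gpt-4o-mini` as Expert, which is exactly the case the ordering is meant to prevent.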

Work Indicator

When an agent is working (thinking, responding, tool_use), three animated bouncing balls appear above its sprite. The ball color is determined by model tier. Balls disappear when the agent finishes and transitions to idle.

| Tier | Color |
|------|-------|
| Expert | 🟡 Gold #FFD700 |
| Senior | 🔵 Blue #4A90E2 |
| Middle | 🟢 Green #5CB85C |
| Junior | ⚪ Gray #AAAAAA |

Balls are positioned horizontally above the sprite head and animated with a staggered sine wave (bounce). Animation speed is controlled by the animationSpeed setting.
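
A staggered sine bounce can be sketched as follows. The colors come from the tier table. The amplitude, period, spacing, and phase step are made-up values for illustration, and the function names are hypothetical rather than taken from the renderer.

```typescript
// Tier colors from the table, as numeric RGB values.
const TIER_COLORS: Record<string, number> = {
  expert: 0xffd700,
  senior: 0x4a90e2,
  middle: 0x5cb85c,
  junior: 0xaaaaaa,
};

const BALL_SPACING_PX = 8;     // assumed horizontal gap between balls
const AMPLITUDE_PX = 6;        // assumed bounce height
const PERIOD_MS = 600;         // assumed time for one sine cycle at speed 1
const PHASE_STEP = Math.PI / 3; // assumed stagger between neighbouring balls

// Horizontal offset of ball 0..2, centred on the sprite head.
function ballOffsetX(ballIndex: number): number {
  return (ballIndex - 1) * BALL_SPACING_PX;
}

// Vertical offset at a given time; negative values are "up" in screen coordinates.
function ballOffsetY(ballIndex: number, timeMs: number, animationSpeed: number): number {
  const phase = (2 * Math.PI * timeMs * animationSpeed) / PERIOD_MS - ballIndex * PHASE_STEP;
  // Math.abs keeps each ball at or above its rest line, so it reads as a bounce.
  return -AMPLITUDE_PX * Math.abs(Math.sin(phase));
}
```

Because each ball subtracts its own phase step, the three balls peak one after another instead of moving in lockstep, and a larger `animationSpeed` shortens the cycle.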


Agent Lifecycle

The visual state of an agent directly reflects its process status:

                    ┌─────────────────────────────────┐
                    │                                 │
                    ▼                                 │
┌──────┐    ┌──────────────┐    ┌──────────┐    ┌─────┴──────┐
│ Idle │───▶│ Walking      │───▶│ Thinking │───▶│ Responding │
│      │    │ to desk      │    │          │    │            │
└──┬───┘    └──────────────┘    └────┬─────┘    └─────┬──────┘
   │                                 │                │
   │                                 ▼                ▼
   │                           ┌──────────┐    ┌────────────┐
   │                           │ Tool Use │    │ Collabora- │
   │                           └────┬─────┘    │ tion       │
   │                                │          └─────┬──────┘
   │                                ▼                │
   │                          ┌───────────┐          │
   │◀─────────────────────────│ Task      │◀─────────┘
   │                          │ Complete  │
   │                          └───────────┘
   │
   │         ┌─────────┐    ┌──────────┐
   └────────▶│  Error  │    │ Offline  │
             └─────────┘    └──────────┘

State Table

| State | Trigger | Animation | Visual Indicator |
|-------|---------|-----------|------------------|
| Idle | Process running, no active request | Agent roams the office: cooler, kitchen, sofa, etc. | Relaxed pose, subtle idle animation |
| Walking to desk | New prompt/task received | Agent walks from current location to desk (A* pathfinding) | Walking animation |
| Thinking | Waiting for LLM response (streaming not started) | Sitting at desk, typing animation | Colored bouncing balls above head (color by model tier) |
| Responding | Token streaming | Active typing animation | Speech bubble with response text preview |
| Tool use | Agent executes shell command, reads files, etc. | Reaches for folder / types in terminal | Colored bouncing balls above head |
| Collaboration | Multi-agent context or sub-agent spawned | Agent sits at desk | Status in data model; visual delegation (planned) |
| Task complete | Response finished, transitioning to idle | Agent stands up, walks back to previous location | Bouncing balls disappear |
| Error | Request failed / crash | Agent grabs head, frustration gesture | Red exclamation mark |
| Offline | Process terminated | Agent walks to the door | Gray semi-transparent avatar |
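
The lifecycle can be written down as a transition map. The sketch below is a literal reading of the diagram plus two assumptions drawn from the state table: Responding may go straight to Task complete ("Response finished"), and any state may go Offline ("Process terminated"). State identifiers such as `walking_to_desk` are hypothetical, and the real state machine may allow more transitions.

```typescript
type AgentState =
  | "idle" | "walking_to_desk" | "thinking" | "responding" | "tool_use"
  | "collaboration" | "task_complete" | "error" | "offline";

// Allowed next states, read off the lifecycle diagram.
const TRANSITIONS: Record<AgentState, AgentState[]> = {
  idle: ["walking_to_desk", "error"],
  walking_to_desk: ["thinking"],
  thinking: ["responding", "tool_use"],
  // "task_complete" here is an assumption based on the state table, not the diagram.
  responding: ["walking_to_desk", "collaboration", "task_complete"],
  tool_use: ["task_complete"],
  collaboration: ["task_complete"],
  task_complete: ["idle"],
  error: [],
  offline: [],
};

function canTransition(from: AgentState, to: AgentState): boolean {
  // Assumption: the process can terminate at any point, so "offline" is always reachable.
  if (to === "offline") return from !== "offline";
  return TRANSITIONS[from].indexOf(to) !== -1;
}
```

A table like this keeps the renderer from playing an animation for a jump the diagram does not allow, for example from Idle directly to Responding.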

Idle Zones

When an agent has no active task, it randomly roams between rest areas in the office:

| Location | Animation |
|----------|-----------|
| Water Cooler (water_cooler) | Agent pours and drinks water |
| Kitchen (kitchen) | Interacts with coffee machine |
| Sofa (sofa) | Reads / scrolls phone |
| Meeting Room (meeting_room) | Whiteboard discussion (for multi-agent setups) |
| Standing Desk (standing_desk) | Stretching / casual browsing |
| Bathroom (bathroom) | Agent stepped away |
| HR Zone (hr_zone) | Chatting at the HR stand |
| Lounge (lounge) | Relaxing in the lounge area |
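
Random roaming between these zones could look like the sketch below. The zone identifiers are the ones listed in the table. The function name and the rule "never pick the zone the agent is already in" are assumptions. The random source is injectable so the behavior can be tested deterministically.

```typescript
const IDLE_ZONES: string[] = [
  "water_cooler", "kitchen", "sofa", "meeting_room",
  "standing_desk", "bathroom", "hr_zone", "lounge",
];

// `random` must return a number in [0, 1), like Math.random.
function pickNextZone(current: string | null, random: () => number = Math.random): string {
  // Assumption: an agent does not "roam" to the zone it is already in.
  const candidates = IDLE_ZONES.filter((zone) => zone !== current);
  const index = Math.floor(random() * candidates.length);
  return candidates[index];
}
```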

Agent Discovery

The system uses different detection strategies depending on the agent type:

| Agent Type | Status | Detection Method | State Extraction |
|------------|--------|------------------|------------------|
| Claude Code (CLI) | Implemented | Process scanning via sysinfo crate. Monitoring ~/.claude/projects/ directory | Log file parsing: user_prompt, assistant_start, tool_use, assistant_end |
| Gemini CLI | Implemented | Process scanning (gemini, node.*gemini). Monitoring ~/.gemini/tmp/ directory | JSON-array session parsing: user, gemini, info messages |
| Codex CLI | Implemented | Process scanning (codex). Monitoring ~/.codex/sessions/ directory | JSONL parsing: message, function_call_output, exec_result events |
| Cursor (IDE) | Implemented | Process scanning (Cursor) with TTY bypass for GUI apps. Monitoring ~/.cursor/ai-tracking/ | File activity monitoring: mtime changes on AI tracking database |
| Windsurf (IDE) | Implemented | Process scanning (Windsurf) with TTY bypass for GUI apps. Monitoring ~/.codeium/windsurf/cascade/ + ~/.codeium/implicit/ + ~/.codeium/cascade/ | File activity monitoring: mtime changes on Codeium protobuf files |
| ChatGPT (Browser) | Implemented | Chrome MV3 extension with DOM MutationObserver on chatgpt.com | CSS selector detection: stop button, streaming response, code interpreter |
| Gemini (Browser) | Implemented | Chrome MV3 extension with DOM MutationObserver on gemini.google.com | Web Component attributes: model-response[loading], mat-progress-spinner |
| Claude (Browser) | Implemented | Chrome MV3 extension with DOM MutationObserver on claude.ai | CSS selector detection: [data-is-streaming], artifact panel, thinking indicator |

Browser Extension

OfficeAI includes a Chrome MV3 extension that tracks AI agent activity directly in browser tabs. It monitors ChatGPT, Gemini, and Claude.ai sessions in real time — each open chat appears as a separate employee in the office, just like CLI agents.

Browser extension tracking ChatGPT, Gemini, and Claude agents

How it works:

  • Content scripts use MutationObserver to detect DOM changes on AI chat pages (streaming responses, thinking indicators, tool use)
  • The background Service Worker bridges content scripts to a Native Messaging Host (Node.js)
  • The host forwards agent state via HTTP to the Tauri desktop app (localhost:7842)
  • Each browser tab gets a unique agent ID: browser-{platform}-{hash} (e.g. browser-chatgpt-a1b2c3d4)
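
The ID format `browser-{platform}-{hash}` can be illustrated with a sketch. The README does not say which hash is used or what it is computed from, so everything here is an assumption: the sketch hashes a hypothetical per-tab key with 32-bit FNV-1a and prints it as 8 hex digits, which matches the shape of the example `browser-chatgpt-a1b2c3d4`.

```typescript
// 32-bit FNV-1a. The shift-add line multiplies by the FNV prime 16777619 modulo 2^32.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash >>> 0;
}

// `tabKey` is a hypothetical stable identifier for one chat tab (for example its URL).
function browserAgentId(platform: string, tabKey: string): string {
  const hex = ("0000000" + fnv1a32(tabKey).toString(16)).slice(-8);
  return `browser-${platform}-${hex}`;
}
```

Any stable hash would do. The point is that the same tab key always maps to the same ID, so a tab keeps its employee across state updates.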

For full details — architecture, CSS selectors, detection algorithms, native messaging protocol — see EXTENSION.md.


IDE Support

OfficeAI natively supports Cursor and Windsurf — two popular AI-powered code editors. Their built-in AI assistants are detected automatically and appear as office employees, just like CLI agents.

| IDE | Detection | Monitored Paths | State Extraction |
|-----|-----------|-----------------|------------------|
| Cursor | Process scanning (Cursor) with TTY bypass for GUI apps | ~/.cursor/ai-tracking/ | File activity monitoring: mtime changes on AI tracking database |
| Windsurf | Process scanning (Windsurf) with TTY bypass for GUI apps | ~/.codeium/windsurf/cascade/, ~/.codeium/implicit/, ~/.codeium/cascade/ | File activity monitoring: mtime changes on Codeium protobuf files |

How it works:

  • The process scanner detects running Cursor/Windsurf processes via OS process list. GUI apps bypass the TTY filter since they don't have a terminal attached.
  • The log watcher monitors IDE-specific directories for file activity changes (mtime polling).
  • Since IDE agents cannot signal task completion explicitly, a 15-second inactivity timeout is used — if no file changes are detected within 15s, the agent transitions to idle.
  • Each Cursor session gets a unique agent ID: log-cursor--{session-hash}. Windsurf uses a fixed ID: log-windsurf--activity.
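
The inactivity rule can be sketched as a pure function over timestamps. The 15-second value comes from the text above. The function name, the `working` label, and the choice to treat exactly 15 s of silence as idle are assumptions.

```typescript
type IdeStatus = "working" | "idle";

const IDE_INACTIVITY_TIMEOUT_MS = 15000;

// lastMtimeMs: most recent modification time seen in the monitored paths.
function ideStatus(
  lastMtimeMs: number,
  nowMs: number,
  timeoutMs: number = IDE_INACTIVITY_TIMEOUT_MS,
): IdeStatus {
  // Assumption: reaching the timeout exactly already counts as idle.
  return nowMs - lastMtimeMs >= timeoutMs ? "idle" : "working";
}
```

Because the only signal is "a file changed recently", this approach cannot tell thinking from responding. It only separates activity from silence.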

Setup: No configuration needed — just launch Cursor or Windsurf and start using their AI features. OfficeAI will detect them automatically.


Troubleshooting

  • Agent not appearing? Verify the log root in Settings > Discovery. For example, Claude Code logs are usually in ~/.claude/projects/.
  • Process not detected? Some agents run via node or python. Ensure your agent_process_patterns in settings include the correct regex for your environment.
  • Diagnostic Log: If you run into issues, go to Settings > General and click Bug Report. This generates a diagnostic.json file for debugging.
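
To check a pattern before adding it to agent_process_patterns, it can help to try it against the command line you expect to see. The sketch below uses JavaScript regular expressions purely as a test bench. The backend is written in Rust, so its regex dialect may differ in details, and the helper name and sample command lines are hypothetical.

```typescript
// Returns the first pattern that matches the command line, or null if none does.
function matchAgentPattern(commandLine: string, patterns: string[]): string | null {
  for (const source of patterns) {
    if (new RegExp(source).test(commandLine)) return source;
  }
  return null;
}

// Example patterns; "node.*gemini" is the form shown in the Agent Discovery table.
const examplePatterns: string[] = ["claude", "node.*gemini", "codex"];
```

For instance, `matchAgentPattern("/usr/bin/node /opt/gemini-cli/index.js", examplePatterns)` returns `"node.*gemini"`, while an unrelated command such as `vim notes.txt` returns `null`.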

Non-Goals

  • The app never modifies CLI agent behavior, injects middleware, or requires config changes.
  • No network requests to external servers — all processing is local.
  • This is not a CLI replacement — the visualizer is a companion/monitor tool only.
  • Prompt data is never stored or transmitted — only metadata is used (state, model name, token counts).

Roadmap

  • ChatGPT CLI Support — integration with official and community-built CLIs.
  • Browser Model Tracking — ChatGPT, Gemini, Claude.ai web sessions tracked via Chrome MV3 extension.
  • Office Customization — changeable floor plans, custom furniture, and skins.
  • Collaboration Mode — visual links/indicators when multiple agents are delegating tasks to each other.
  • New Idle Zones — gym area, library, and outdoor garden for more character variety.

Commands

| Command | Description |
|---------|-------------|
| make install | Install all dependencies (npm + cargo) |
| make dev | Full Tauri + Vite dev server |
| make build | Production build (AppImage/DMG/MSI) |
| make build-debug | Debug build (faster, no optimizations) |
| make build-frontend | Build frontend only (to dist/) |
| make test-js | TypeScript tests (vitest) |
| make test-rust | Rust tests |
| make test-all | All tests (TS + Rust) |
| make test-watch | TypeScript tests in watch mode |
| make bench | Run performance benchmarks |
| make check | svelte-check + clippy + fmt |
| make lint | Svelte type checker only |
| make fmt | Format Rust code |
| make clippy | Rust linter (warnings as errors) |
| make assets | Regenerate sprites + tiles + effects |
| make icons | Generate Tauri app icons |
| make clean | Remove dist/ + cargo clean |
| make clean-all | Remove artifacts + node_modules |

Tech Stack

| Layer | Technology |
|-------|------------|
| Desktop runtime | Tauri v2 (Rust) |
| Frontend framework | Svelte 5 (runes) |
| 2D rendering | PixiJS v8 (isometric) |
| Process discovery | sysinfo (Rust) |
| Async runtime | Tokio |
| IPC | Tauri events + commands |
| Browser extension | Chrome MV3 + Native Messaging |
| Config storage | TOML (~/.config/office-ai/config.toml) |
| TS testing | Vitest (~411 tests) |
| Rust testing | cargo test (~371 tests) |

Cross-Platform Support

| Feature | macOS | Linux | Windows |
|---------|-------|-------|---------|
| Process scanning | Yes | Yes | Yes |
| Agent log parsing | Yes | Yes | Yes |
| Isometric office rendering | Yes | Yes | Yes |
| App icon badge (active agent count) | Yes (Dock) | Yes (Unity/KDE) | No |
| Chrome Extension | Yes | Yes | Planned |
| Production build | DMG | AppImage | MSI |

The app icon badge shows the number of currently active agents (not idle, not offline) as a numeric indicator on the dock/taskbar icon. When no agents are active, the badge is removed. The badge updates automatically on every agent state change — registration, status transition, and removal.
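
The badge rule can be expressed in a few lines. This is a sketch of the rule as stated ("not idle, not offline"), not the backend's code. The function name is hypothetical, and `null` stands in for "remove the badge".

```typescript
// Returns the number to show on the dock/taskbar icon, or null to remove the badge.
function badgeCount(statuses: string[]): number | null {
  const active = statuses.filter(
    (status) => status !== "idle" && status !== "offline",
  ).length;
  return active > 0 ? active : null;
}
```

Recomputing this value on every registration, status transition, and removal is enough to keep the badge in sync, since it depends only on the current list of agent statuses.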


Documentation

| Document | Description |
|----------|-------------|
| ARCHITECTURE.md | Detailed system architecture, data structures, algorithms |
| FRONTEND.md | TypeScript frontend: Svelte 5, PixiJS v8, stores, UI |
| BACKEND.md | Rust backend: process scanner, log parser, IPC |
| CONFIGURATION.md | All settings explained (defaults, behavior) |
| EXTENSION.md | Chrome extension: setup, architecture, detection |
| TESTING.md | Test structure, commands, coverage, CI/CD |
| CHANGELOG.md | Project history, version changes, and release notes |
| CONTRIBUTING.md | How to contribute, code style, commit conventions |
| LICENSE | MIT License |
