uuchoaa/jards

Jards

Invisible personal copilot. Always listening. Never shows up.

Jards is a local macOS agent that continuously listens to audio, identifies intent, transcribes meetings, and executes web automations — no wake word, no mandatory cloud, no visible traces.


What it does

  • Continuous listening — mic always open, lightweight background processing
  • Smart routing — classifies every audio chunk: command, meeting, or noise
  • Meeting transcription — identifies speakers via DOM scraping (Meet/Teams) or OCR (Zoom)
  • Keyword watching — acts silently when it hears trigger words during calls
  • Web automations — controls the browser with human-like behavior via Ferrum
  • Screen recording — captures, processes and exports feature demos
  • Hidden terminal — Touch ID-protected control panel, invisible during screen share
  • Discreet menu bar — neutral icon, no Dock, no notifications

Architecture

Audio pipeline

Mic (always open)
    ↓
Cobra — VAD (is there a human voice?)
    ↓
Cheetah — real-time STT
    ↓
lightweight local filter (worth processing?)
    ↓
Rhino — Speech-to-Intent
    ↓
    ├── COMMAND      → ActionManager executes
    ├── MEETING      → buffer accumulates + OCR identifies speaker
    └── NOISE        → silently discarded
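The final branch of the pipeline can be sketched as a small router. This is a minimal illustration, not the project's code: `Chunk`, `ChunkRouter`, and the symbols `:command`, `:meeting`, and `:noise` are hypothetical names, and the intent label is assumed to arrive already classified by the earlier stages.

```ruby
# Hypothetical value object: a transcribed chunk plus its intent label.
Chunk = Struct.new(:text, :intent, keyword_init: true)

# Hypothetical router for the last pipeline step: the intent label
# decides what happens to each transcribed chunk.
class ChunkRouter
  attr_reader :executed, :meeting_buffer

  def initialize
    @executed = []        # commands handed to the action layer
    @meeting_buffer = []  # transcript lines accumulated during a meeting
  end

  # Returns the route taken so callers can log or assert on it.
  def route(chunk)
    case chunk.intent
    when :command
      @executed << chunk.text
      :executed
    when :meeting
      @meeting_buffer << chunk.text
      :buffered
    else
      :discarded # noise and anything unrecognized is dropped silently
    end
  end
end

router = ChunkRouter.new
router.route(Chunk.new(text: "take screenshot", intent: :command))
router.route(Chunk.new(text: "let's review the roadmap", intent: :meeting))
router.route(Chunk.new(text: "(keyboard clatter)", intent: :noise))
```

Treating every unknown label as noise keeps the default path silent, which matches the "silently discarded" branch in the diagram.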

Swift ↔ Rails — WebSocket (Action Cable)

Swift is a thin native client; all business logic lives in Rails. The two communicate over a persistent WebSocket, which avoids the per-request connection and header overhead of HTTP, even on localhost.

Swift (thin client)                    Rails (orchestrator)
─────────────────────                  ────────────────────
AVAudioEngine                          ActionCable (WebSocket)
Cobra — VAD              ←——————————→  JardsChannel
Cheetah — STT                          ActionRegistry
Vision — OCR                           MeetingSession
ScreenCaptureKit                       BrowserPool + Ferrum
Menu bar + native UI                   picoLLM

Swift owns: audio capture, VAD, STT, OCR, native UI, WebSocket client

Rails owns: all logic, state, browser automations, LLM decisions, persistence
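For reference, the frames a thin client sends over Action Cable look roughly like this. The envelope (`command`, `identifier`, `data`, with `identifier` and `data` each JSON-encoded as strings) follows the Action Cable wire protocol. The `transcript` action name and the payload keys `text` and `final` are assumptions made for illustration, and the sketch is written in Ruby only to show the JSON shape the Swift client would need to produce.

```ruby
require "json"

# Action Cable identifies a subscription by a JSON-encoded string.
IDENTIFIER = JSON.generate({ channel: "JardsChannel" })

# Frame the client sends once after the socket opens.
def subscribe_frame
  JSON.generate({ command: "subscribe", identifier: IDENTIFIER })
end

# Frame the client sends for each event. `data` is itself JSON-encoded,
# and its "action" key selects the channel method to invoke.
# The action name and payload keys used below are hypothetical.
def message_frame(action, payload = {})
  data = JSON.generate(payload.merge(action: action))
  JSON.generate({ command: "message", identifier: IDENTIFIER, data: data })
end

frame = message_frame("transcript", { text: "take screenshot", final: true })
decoded = JSON.parse(frame)
puts decoded["command"]                     # prints "message"
puts JSON.parse(decoded["data"])["action"]  # prints "transcript"
```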

App states

IDLE → ALWAYS_ON → MEETING_MODE → SUMMARIZING → ALWAYS_ON
  • IDLE — app paused, mic closed
  • ALWAYS_ON — listening and routing commands
  • MEETING_MODE — accumulating transcript, commands disabled
  • SUMMARIZING — processing buffer with picoLLM
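The state flow above can be modeled as a small transition table. This is a sketch under assumptions: the class name `AppState` and its methods are hypothetical, and the `ALWAYS_ON → IDLE` edge (pausing the app) is not in the diagram and is added here as a guess.

```ruby
class AppState
  # Allowed transitions, following the diagram above.
  # ASSUMPTION: always_on -> idle (pausing) is not shown in the source.
  TRANSITIONS = {
    idle:         [:always_on],
    always_on:    [:meeting_mode, :idle],
    meeting_mode: [:summarizing],
    summarizing:  [:always_on]
  }.freeze

  attr_reader :current

  def initialize
    @current = :idle
  end

  # Moves to the target state or raises if the edge is not allowed.
  def transition_to(target)
    unless TRANSITIONS.fetch(@current).include?(target)
      raise ArgumentError, "cannot go from #{@current} to #{target}"
    end
    @current = target
  end

  # Commands are routed only while ALWAYS_ON; MEETING_MODE disables them.
  def commands_enabled?
    @current == :always_on
  end
end

state = AppState.new
state.transition_to(:always_on)
state.transition_to(:meeting_mode)
state.commands_enabled?  # => false
```

Keeping the table explicit means an illegal jump, such as going from `IDLE` straight to `SUMMARIZING`, fails loudly instead of leaving the app in an undefined state.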

Layers

Layer          Responsibility                  Stack
─────────────  ──────────────────────────────  ───────────────────────────────
Capture        Continuous audio, screenshots   AVAudioEngine, ScreenCaptureKit
Processing     VAD, STT, OCR, intent           Picovoice, Vision Framework
Transport      Bidirectional communication     WebSocket / Action Cable
Orchestration  State, actions, registry        Rails
Automation     Web automations                 Ferrum + Sidekiq
LLM            Ambiguous decisions, summaries  picoLLM (local)

Stack

macOS App (Swift)

  • AVAudioEngine — audio capture
  • Picovoice (Cobra, Cheetah, Rhino, Orca, Falcon) — voice pipeline
  • Vision Framework — on-device OCR to identify speakers
  • ScreenCaptureKit — screen recording
  • LocalAuthentication — Touch ID for hidden terminal
  • NSStatusItem — menu bar
  • URLSessionWebSocketTask — WebSocket client

Rails (local)

  • Action Cable — WebSocket server
  • Ferrum — browser automation
  • Sidekiq — job queue
  • Browser::ActionRegistry — automation registry
  • picoLLM — local LLM for decisions

Invisibility

Risk             Solution
───────────────  ────────────────────────────────
Dock             LSUIElement = true in Info.plist
Mission Control  collectionBehavior = .stationary
Screen share     NSWindow.sharingType = .none
Hidden terminal  Touch ID via LocalAuthentication
Secret hotkey    ⌃⌥⌘Space opens the panel

The macOS orange microphone indicator is enforced by the system for privacy and cannot be removed. It shows only that the mic is in use, not what the audio is being used for.


Setup

Requirements

  • macOS 13+
  • Xcode Command Line Tools
  • Ruby 3.2+
  • Chrome installed

Installation

# 1. System dependencies
xcode-select --install
brew install xcode-build-server redis
brew services start redis

# 2. Rails
cd rails
bundle install
cp .env.example .env
# set PICOVOICE_KEY in .env (get a key at console.picovoice.ai, free tier)
rails db:create db:migrate
rails s

# 3. Swift app
# open Jards.xcodeproj in Xcode or Zed
# in Xcode, press Cmd+R to build and run

Zed (recommended editor)

xcode-build-server config -scheme Jards -project Jards.xcodeproj

// .zed/settings.json
{
  "languages": {
    "Swift": {
      "enable_language_server": true,
      "language_servers": ["sourcekit-lsp"],
      "format_on_save": "on"
    }
  }
}

Action Registry

Voice commands

"record screen"        → ScreenRecordingAction
"start meeting"        → MeetingTranscriptionAction
"summarize"            → SummarizeAction
"stop"                 → StopCurrentAction
"take screenshot"      → ScreenshotAction
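A phrase table like the one above could be resolved with a normalized lookup. This is a simplified stand-in: in the pipeline described earlier, Rhino infers the intent, so exact phrase matching is used here only to illustrate the mapping. `VOICE_COMMANDS` and `resolve_voice_command` are hypothetical names, and the action classes appear as strings because their definitions are not shown in this README.

```ruby
# Phrase-to-action table copied from the list above.
VOICE_COMMANDS = {
  "record screen"   => "ScreenRecordingAction",
  "start meeting"   => "MeetingTranscriptionAction",
  "summarize"       => "SummarizeAction",
  "stop"            => "StopCurrentAction",
  "take screenshot" => "ScreenshotAction"
}.freeze

# Normalizes a transcript (case, punctuation, extra whitespace) and looks
# it up. Returns nil on no match so the caller can treat it as noise.
def resolve_voice_command(transcript)
  normalized = transcript.downcase.gsub(/[^a-z\s]/, "").split.join(" ")
  VOICE_COMMANDS[normalized]
end

resolve_voice_command("Take screenshot.")  # => "ScreenshotAction"
resolve_voice_command("what a nice day")   # => nil
```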

Web automations (Rails)

linkedin.connect       → Browser::Actions::Linkedin::Connect
linkedin.send_message  → Browser::Actions::Linkedin::SendMessage
gmail.send             → Browser::Actions::Gmail::SendEmail
generic.screenshot     → Browser::Actions::Generic::Screenshot
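One way to back the dotted keys above is a registry that maps each key to a callable. The `register`, `call`, and `keys` methods and the `UnknownAction` error are assumptions rather than the project's actual interface. Only the `Browser::ActionRegistry` name and the keys come from this README, and lambdas stand in for the real action classes, which would drive Ferrum.

```ruby
module Browser
  class ActionRegistry
    class UnknownAction < StandardError; end

    def initialize
      @actions = {}
    end

    # Registers any object responding to #call under a dotted key.
    def register(key, handler)
      @actions[key] = handler
    end

    # Looks up the handler and invokes it with keyword parameters.
    def call(key, **params)
      handler = @actions.fetch(key) do
        raise UnknownAction, "no action registered for #{key}"
      end
      handler.call(**params)
    end

    def keys
      @actions.keys.sort
    end
  end
end

registry = Browser::ActionRegistry.new
# Hypothetical handlers: they only return strings instead of using a browser.
registry.register("gmail.send", ->(to:, subject:) { "queued mail to #{to}: #{subject}" })
registry.register("generic.screenshot", ->(url:) { "screenshot of #{url}" })

registry.call("generic.screenshot", url: "https://example.com")
# => "screenshot of https://example.com"
```

An explicit table suits these keys better than deriving class names by convention, because `gmail.send` maps to `SendEmail` rather than `Send`.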

Estimated resource usage

State             RAM     CPU
────────────────  ──────  ───
IDLE              ~20MB   <1%
ALWAYS_ON         ~80MB   ~3%
MEETING_MODE      ~100MB  ~5%
Screen recording  ~120MB  ~8%

Roadmap

  • Swift app base + menu bar
  • Cobra → Cheetah pipeline running
  • WebSocket connection Swift ↔ Rails (Action Cable)
  • ActionRegistry with first voice commands
  • MeetingMode + OCR (Google Meet)
  • MeetingMode + OCR (Zoom)
  • HumanTyper + Ferrum (LinkedIn wrapper)
  • Hidden terminal with Touch ID
  • ScreenCaptureKit + post-processing pipeline
  • picoLLM for ambiguous decisions
  • Screen recording with automatic captions
  • BrowserPool with anti-bot behavior

Jards doesn't exist. It's just a system utility.
