worthmining/bantzv2

 
 


Bantz (It's Still in Early Development)

Bantz is a local-first AI assistant that runs on your Linux machine and acts as a personal butler — it has a voice, remembers things across sessions, runs scheduled jobs overnight, controls your desktop, reads your email, and talks to you like a person who's known you long enough to be useful. The primary interface is the Operations Center desktop app (bantz --ui), with a bantz CLI alongside it. Everything is local by default. Nothing phones home unless you configure it to.


Demo

Screenshots and screen recordings live in bantz-demo/.

[Screenshots: Chat, Vitals, Logs, Directives, Anomaly Watch, Settings]

[GIF walkthroughs]


Architecture

The Brain (core/brain.py) sits at the center. Every request — typed, spoken, or sent via Telegram — goes through the same pipeline:

Input  (Terminal / Voice / Telegram)
  │
  ▼
Translation Layer       core/translation_layer.py
  │  MarianMT Turkish↔English bridge; all internal processing in English
  │
  ▼
Memory Injector         core/memory_injector.py
  │  Injects: recent messages, desktop context, persona state, location
  │
  ▼
OmniMemoryManager       memory/omni_memory.py
  │  Parallel asyncio recall — Graph (35%) + Vector (40%) + Deep (25%)
  │  400-token budget, entity-based re-ranking, zero sequential waiting
  │
  ▼
Routing Engine          core/routing_engine.py + core/intent.py
  │  quick_route(): hardware controls (TTS stop, wake word, ducking)
  │  cot_route():   Chain-of-Thought LLM routing — tool selection + risk
  │
  ▼
Executor / Planner      agent/executor.py + agent/planner.py
  │  Plan-and-Solve multi-step execution with $REF variable binding
  │  Step failure → circuit breaker, optional replan
  │
  ▼
Finalizer               core/finalizer.py
  │  Butler persona enforcement, hallucination checks, error honesty
  │
  ▼
Memory persistence      MemPalace (ChromaDB + SQLite KG)
                        core/memory.py (session log, SQLite WAL)

Supporting systems run alongside the main loop:

  • APScheduler (agent/job_scheduler.py) — persistent SQLAlchemy job store, nightly maintenance/reflection/overnight email poll
  • Ghost Loop (agent/ghost_loop.py) — wake word → VAD capture → STT → brain dispatch
  • Affinity Engine (agent/affinity_engine.py) — cumulative score [-100, 100] drives persona tier
  • Event Bus (core/event_bus.py) — decoupled pub/sub between brain, TUI, voice, notifications
  • GPS Server (core/gps_server.py) — local HTTP server receiving phone location updates
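
The decoupling the Event Bus provides can be illustrated with a minimal synchronous pub/sub hub. This is only a sketch: the class, its method names, and the `brain.reply` topic are hypothetical and are not taken from core/event_bus.py.

```python
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[Any], None]


class EventBus:
    """Tiny synchronous pub/sub hub (illustrative; not the core/event_bus.py API)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> int:
        """Deliver payload to every subscriber of topic; return the delivery count."""
        handlers = list(self._handlers.get(topic, ()))
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = EventBus()
spoken: list[str] = []
bus.subscribe("brain.reply", spoken.append)      # e.g. the voice layer
bus.subscribe("brain.reply", lambda text: None)  # e.g. a notification sink
delivered = bus.publish("brain.reply", "Good evening.")
```

The publisher never imports its consumers, so a subsystem such as voice can be absent without the brain noticing.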

Features

What's working

Conversation and memory

  • Persistent memory via MemPalace: ChromaDB vector store + SQLite knowledge graph
  • Hybrid recall: graph entities + semantic search run in parallel, merged by relevance
  • Session distillation: conversations mined into memory palace after each session
  • Onboarding: first-run identity setup, stored in memory wing
  • 400-token memory budget enforced per request (35/40/25 split across layers)
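
As a rough sketch of the parallel recall and budget split described above: the 400-token budget and the 35/40/25 shares come from this README, while `recall_layer` is a hypothetical stand-in for the real graph, vector, and deep lookups, and counting one word as one token is a simplification.

```python
import asyncio

TOKEN_BUDGET = 400
LAYER_SHARES = {"graph": 0.35, "vector": 0.40, "deep": 0.25}


def layer_budgets(total: int = TOKEN_BUDGET) -> dict[str, int]:
    """Split the token budget across layers by the documented shares."""
    return {name: round(total * share) for name, share in LAYER_SHARES.items()}


async def recall_layer(name: str, query: str, budget: int) -> list[str]:
    """Hypothetical stand-in for one memory layer's lookup."""
    await asyncio.sleep(0)  # placeholder for real I/O
    words = f"{name} memory about {query}".split()
    return words[:budget]   # crude: one word counts as one token


async def recall_all(query: str) -> dict[str, list[str]]:
    budgets = layer_budgets()
    # gather() starts all three lookups at once instead of awaiting them in turn.
    results = await asyncio.gather(
        *(recall_layer(name, query, budgets[name]) for name in budgets)
    )
    return dict(zip(budgets, results))


budgets = layer_budgets()
recalled = asyncio.run(recall_all("tea"))
```

With these shares the per-layer budgets work out to 140, 160, and 100 tokens.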

Voice pipeline

  • Wake word detection via Porcupine (runs on dedicated daemon thread, always-on)
  • VAD-based audio capture via WebRTC VAD (auto-stops when you stop talking)
  • STT via faster-whisper (local, GPU-accelerated if available)
  • TTS via Piper + aplay (local, no cloud)
  • Audio ducking: system volume lowers during Bantz speech
  • Ambient sound classification: silence / speech / noisy from mic energy (no FFT)
  • Conversation window: 60s follow-up without re-triggering wake word
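
Energy-based ambient classification without an FFT can be approximated with a plain RMS over raw PCM samples. The sketch below is an assumption about how such a classifier might look: the two thresholds and the function names are invented for illustration and would need tuning per microphone.

```python
import math
from array import array

# Assumed thresholds on 16-bit RMS amplitude; real values need tuning per microphone.
SILENCE_RMS = 500.0
NOISY_RMS = 8000.0


def rms(samples: array) -> float:
    """Root-mean-square energy of signed 16-bit PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def classify_ambient(samples: array) -> str:
    """Map frame energy to one of the three documented labels."""
    level = rms(samples)
    if level < SILENCE_RMS:
        return "silence"
    if level < NOISY_RMS:
        return "speech"
    return "noisy"


quiet = array("h", [10, -12, 8, -9])
talking = array("h", [3000, -2800, 3100, -2900])
loud = array("h", [20000, -21000, 19000, -20500])
```

Calling `classify_ambient` on each frame keeps the cost to one pass over the samples, which is why no frequency analysis is needed for a three-way label.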

Scheduling and automation

  • APScheduler with SQLAlchemy persistent job store (survives restarts)
  • Nightly maintenance workflow (3am): database compaction, memory distillation, digest prep
  • Nightly reflection (11pm): daily summary written to reflection journal
  • Overnight email/calendar poll (every 2h between 00:00 and 07:00): urgent keyword detection
  • Morning briefing prep (6am): pre-generates briefing for fast delivery at wake-up time
  • Reminder system with repeat support (30s check interval)
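
The schedule above can be written down as cron-style hour fields, which is the shape a cron trigger typically takes. The sketch below uses only the standard library to expand those fields; the job ids other than `nightly_maintenance` (which appears in the CLI examples) are hypothetical, and the real jobs are registered through APScheduler rather than through this table.

```python
# Hour fields for the documented jobs; ids other than nightly_maintenance are illustrative.
SCHEDULE = {
    "nightly_maintenance": "3",
    "nightly_reflection": "23",
    "overnight_poll": "0-7/2",
    "morning_briefing_prep": "6",
}


def expand_hours(field: str) -> list[int]:
    """Expand a cron-style hour field such as '3' or '0-7/2' into concrete hours."""
    span, _, step = field.partition("/")
    start, _, end = span.partition("-")
    first = int(start)
    last = int(end) if end else first
    return list(range(first, last + 1, int(step) if step else 1))


def jobs_due(hour: int) -> list[str]:
    """Names of the jobs whose hour field includes the given hour."""
    return [job for job, field in SCHEDULE.items() if hour in expand_hours(field)]
```

Under this reading the overnight poll fires at 00:00, 02:00, 04:00, and 06:00, so at 6am it coincides with the briefing prep.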

Desktop and computer control

  • Desktop screenshot + optional VLM analysis (self-hosted endpoint)
  • Visual element detection and click via coordinate mapping
  • pyautogui-based GUI automation (mouse, keyboard, window focus)
  • Accessibility tree reading
  • App detector: tracks active application context (optional, polling-based)
  • Browser control via subprocess + xdotool

External integrations

  • Gmail: read, search, compose, reply (Google OAuth2 PKCE flow)
  • Google Calendar: read events, create, check conflicts
  • Google Classroom: assignments, deadlines, announcements
  • Telegram bot: full two-way remote access, screenshot-on-request, whitelist by user ID
  • GPS location from phone via MQTT relay or direct HTTP push

Personality and adaptation

  • 1920s English butler persona enforced at the Finalizer layer
  • Affinity Engine: score persists in SQLite, drives 5-tier formality ladder
    • -100 → clipped and resentful
    • 0 → neutral and professional
    • +100 → deeply bonded, proactive, affectionate
  • Highwater protection: once you reach a tier, the score can't drop back below it
  • Bonding Meter: sigmoid-gated interaction scoring, configurable rate/midpoint
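
One way to read the clamped score, the five tiers, and highwater protection together is sketched below. The tier boundaries are assumptions (the README only gives the -100, 0, and +100 anchor points), and the class is illustrative rather than the agent/affinity_engine.py implementation.

```python
TIER_FLOORS = [-100, -60, -20, 20, 60]  # assumed lower bounds of the five tiers


def clamp(score: float) -> float:
    """Keep the score inside the documented [-100, 100] range."""
    return max(-100.0, min(100.0, score))


def tier_of(score: float) -> int:
    """Index (0-4) of the highest tier whose floor the score has reached."""
    return sum(1 for floor in TIER_FLOORS if score >= floor) - 1


class Affinity:
    def __init__(self, score: float = 0.0) -> None:
        self.score = clamp(score)
        self.highwater_tier = tier_of(self.score)

    def apply(self, delta: float) -> float:
        """Add delta, clamp to [-100, 100], and never fall below the best tier's floor."""
        floor = TIER_FLOORS[self.highwater_tier]
        self.score = max(floor, clamp(self.score + delta))
        self.highwater_tier = max(self.highwater_tier, tier_of(self.score))
        return self.score
```

In this sketch a large negative delta after reaching the top tier only moves the score down to that tier's floor, which is the behaviour the highwater bullet describes.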

Security and permissions

  • Risk level propagated through BantzContext: safe / moderate / destructive
  • Two-pass confirm flow: destructive operations require explicit y before execution
  • DESTRUCTIVE_COMMANDS frozenset in tools/shell.py — rm -rf, mkfs, dd, etc.
  • Shell timeout configurable, stderr captured separately

Infrastructure

  • SQLite WAL mode throughout, thread-safe connection pool
  • Auto-migration: JSON data files → SQLite on first run (profile, places, schedule, session)
  • DataLayer singleton: unified init for all stores, called once at startup
  • pydantic-settings config: ~70 env vars via .env, all aliased

Interfaces

  • Operations Center — Tauri + React desktop app, launched with bantz --ui. Six pages, all live over a local WebSocket (:8765):
    • Broadcast Channel — chat with Bantz; streamed responses, including live progress from long jobs
    • Vitals — CPU / RAM / VRAM / DISK, refreshed every 2s
    • Kernel Log — live log stream from the daemon
    • Directives — scheduled jobs and reminders (APScheduler), with a New Directive box that parses natural language ("every morning at 7am …")
    • Anomaly Watch — real-time system anomalies (see below)
    • Settings — provider/model, voice, language, behaviour dials, appearance — written back to the daemon live
  • bantz CLI as a secondary interface: bantz --once "query" for a single query, bantz --daemon for headless/systemd operation, bantz --doctor for health checks

Web intelligence (via the bundled bantz-web pipeline at vendor/bantz-web)

  • web_search — quick web lookup (under ~60s), tiered SearXNG → DuckDuckGo fallback
  • web_research — deep, multi-source research producing a structured report; runs async with live progress streamed to the Broadcast Channel and a cancel control
  • web_news — news pipeline: fetch + extract + summarize current headlines for a topic
  • Wired directly into the tool layer (no HTTP, no subprocess); routed by Chain-of-Thought ("research X", "search X", "news about X", plus Turkish: araştır / ara / haberler)
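
The tiered SearXNG to DuckDuckGo fallback used by web_search can be sketched as an ordered list of backends tried in turn. The two search functions here are fakes that simulate an outage and a success; they do not call either service, and `tiered_search` is a hypothetical name rather than part of bantz-web.

```python
from typing import Callable

SearchFn = Callable[[str], list[str]]


def tiered_search(query: str, tiers: list[tuple[str, SearchFn]]) -> tuple[str, list[str]]:
    """Try each backend in order; return the first non-empty result and its source."""
    errors: list[str] = []
    for name, search in tiers:
        try:
            results = search(query)
        except Exception as exc:  # a failing tier should not abort the lookup
            errors.append(f"{name}: {exc}")
            continue
        if results:
            return name, results
    raise RuntimeError("all search tiers failed: " + "; ".join(errors))


def fake_searxng(query: str) -> list[str]:
    raise ConnectionError("instance unreachable")  # simulate an outage


def fake_duckduckgo(query: str) -> list[str]:
    return [f"result for {query}"]


source, hits = tiered_search(
    "piper tts", [("searxng", fake_searxng), ("duckduckgo", fake_duckduckgo)]
)
```

Returning the source name alongside the results lets the caller report which tier actually answered.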

Anomaly Watch

  • Real-time monitoring of CPU, RAM, disk, and swap pressure, plus recent ERROR/CRITICAL logs grouped by source
  • Thresholds tuned for a personal machine (CPU > 80%, RAM > 85%, disk > 85%, swap > 60% warn / > 85% critical), plus a combined memory-pressure signal
  • Per-anomaly Investigate (asks Bantz to analyze it in the Broadcast Channel) and Snooze 1h (client-side, persists across reloads, auto-expires)
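
The thresholds above translate into a small pure function. This is a sketch under stated assumptions: the README gives severities only for swap, so labelling CPU, RAM, and disk breaches as "warn" is a guess, and the rule for the combined memory-pressure signal is invented for illustration.

```python
def classify_anomalies(cpu: float, ram: float, disk: float, swap: float) -> dict[str, str]:
    """Return {metric: severity} for every percentage that crosses its threshold."""
    found: dict[str, str] = {}
    if cpu > 80:
        found["cpu"] = "warn"   # severity assumed
    if ram > 85:
        found["ram"] = "warn"   # severity assumed
    if disk > 85:
        found["disk"] = "warn"  # severity assumed
    if swap > 85:
        found["swap"] = "critical"
    elif swap > 60:
        found["swap"] = "warn"
    # Assumed rule for the combined signal: RAM and swap are both over their limits.
    if "ram" in found and "swap" in found:
        found["memory_pressure"] = "critical"
    return found
```

Keeping the check free of I/O means the same function can be fed live readings or recorded samples when tuning thresholds.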

Multi-step workflows

  • Chain-of-Thought routing selects tools and builds multi-step plans
  • Plan-and-Solve executor: $REF_STEP_N variable binding between steps
  • YAML-based workflow engine for deterministic step sequences
  • Inline workflow detection: "send email, add to calendar, remind me tomorrow" → 3 tool calls
  • Delegate-to-subagent tool: spawns sub-agents for parallel or specialized tasks
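
The `$REF_STEP_N` binding between plan steps can be sketched as a regex substitution over each step's argument. The helper names, the one-string-argument tool signature, and numbering steps from 1 are assumptions for illustration; agent/executor.py may differ on all three.

```python
import re

REF_PATTERN = re.compile(r"\$REF_STEP_(\d+)")


def bind_refs(value: str, outputs: dict[int, str]) -> str:
    """Replace each $REF_STEP_N with the recorded output of step N."""
    def substitute(match: re.Match) -> str:
        step = int(match.group(1))
        if step not in outputs:
            raise KeyError(f"step {step} has not produced output yet")
        return outputs[step]
    return REF_PATTERN.sub(substitute, value)


def run_plan(steps: list[tuple[str, str]], tools: dict) -> dict[int, str]:
    """Run steps in order, resolving references to earlier outputs first."""
    outputs: dict[int, str] = {}
    for number, (tool_name, argument) in enumerate(steps, start=1):
        outputs[number] = tools[tool_name](bind_refs(argument, outputs))
    return outputs


tools = {
    "search": lambda query: f"notes on {query}",
    "email": lambda body: f"sent: {body}",
}
plan = [("search", "piper voices"), ("email", "Summary: $REF_STEP_1")]
outputs = run_plan(plan, tools)
```

Raising on a reference to a step that has not run yet gives the executor a clear failure to hand to its circuit breaker instead of passing a literal placeholder to a tool.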

i18n

  • MarianMT offline translation (Turkish↔English) — no API key, runs locally
  • Configurable primary language; English used internally, translated for display

What's missing or incomplete

Wake word: requires a Porcupine access key from Picovoice. Without it the voice pipeline silently disables itself. There's no fallback wake word engine.

VLM / vision analysis: screenshot capture works, but VLM analysis requires a self-hosted endpoint (BANTZ_VLM_ENDPOINT). No built-in vision model — you bring your own.

Mood history display: bantz --mood-history prints a stub message. Mood data is recorded in SQLite but there's no display command since the Textual TUI was removed.

Observer: background log analysis via a small Ollama model (default qwen2.5:0.5b) — implemented but disabled by default (BANTZ_OBSERVER_ENABLED=false). Adds latency on low-end hardware.

RL engine: the old Q-learning engine was replaced by the Affinity Engine. The BANTZ_RL_ENABLED flag exists and gates the intervention/proactive systems, but the underlying RL training loop is no longer active.

Proactive interventions: implemented in agent/interventions.py, gated behind BANTZ_RL_ENABLED. Off by default. Not well-tested in the current build.

Autonomy confirmation: the Settings "Autonomy" dial is parsed into a requires_confirm flag on the routing decision (low = always confirm, absolute = never), but the executor does not yet enforce it — destructive shell commands still gate on their own DESTRUCTIVE_COMMANDS confirm, independent of this dial.

web_research is Ollama-bound: deep research runs many local model calls, so on a memory-constrained machine the model can stall mid-run; it falls back gracefully but a full report needs a healthy Ollama.


Installation

Requirements: Python 3.11+, git, Ollama running locally

One-liner (recommended)

curl -fsSL https://raw.githubusercontent.com/miclaldogan/bantzv2/main/install.sh | bash

Checks Python and git, clones the repo to ~/.local/share/bantz/src, installs the package, fixes your PATH if needed, runs an interactive wizard to write your .env, and finishes with bantz --doctor.

Manual

git clone https://github.com/miclaldogan/bantzv2.git
cd bantzv2
pip install -e ".[dev]"
cp .env.example .env   # then edit with your values
bantz --doctor

Voice pipeline (all optional — install only what you need):

pip install pvporcupine pyaudio webrtcvad  # wake word + capture
pip install faster-whisper                  # STT
# Piper TTS: install binary from https://github.com/rhasspy/piper/releases
#            put 'piper' in PATH, download a voice model .onnx file

MemPalace memory:

pip install mempalace

Google integrations:

# Create an OAuth 2.0 client in Google Cloud Console (Desktop app type)
# Download credentials.json to ~/.local/share/bantz/
bantz --setup google gmail
bantz --setup google classroom

Telegram:

bantz --setup telegram

Configuration

Create a .env file in your working directory (or ~/.local/share/bantz/.env). Minimum working config:

BANTZ_OLLAMA_MODEL=llama3.1:8b
BANTZ_OLLAMA_BASE_URL=http://localhost:11434

# Optional: faster routing via a smaller model
BANTZ_OLLAMA_ROUTING_MODEL=qwen2.5:3b

# Conversation provider: ollama (local, default) | claude | gemini | openai
# Switchable live from Settings → Conversation Provider (persists to .env).
BANTZ_LLM_PROVIDER=ollama

# Claude / Anthropic
BANTZ_ANTHROPIC_API_KEY=your_key_here
BANTZ_ANTHROPIC_MODEL=claude-sonnet-4-6

# Gemini
BANTZ_GEMINI_API_KEY=your_key_here
BANTZ_GEMINI_MODEL=gemini-2.0-flash

# OpenAI
BANTZ_OPENAI_API_KEY=your_key_here
BANTZ_OPENAI_MODEL=gpt-4o-mini

# Voice (wake word)
BANTZ_PORCUPINE_ACCESS_KEY=your_picovoice_key

# Primary language (default Turkish)
BANTZ_LANGUAGE=tr

# Memory
BANTZ_MEMPALACE_ENABLED=true

Full reference in src/bantz/config.py — every field has a comment.
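
For a quick look at what a .env file resolves to, the sketch below parses KEY=value lines and masks anything that looks like a credential, similar in spirit to the masked output of bantz --config. It is a standard-library approximation only: the project itself loads configuration through pydantic-settings, and the marker list used for masking is an assumption.

```python
SECRET_MARKERS = ("KEY", "TOKEN", "SECRET")  # assumed naming convention for secrets


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def masked(values: dict[str, str]) -> dict[str, str]:
    """Hide values whose names look like credentials."""
    return {
        key: "****" if any(marker in key for marker in SECRET_MARKERS) else value
        for key, value in values.items()
    }


SAMPLE = """\
BANTZ_OLLAMA_MODEL=llama3.1:8b
# Claude / Anthropic
BANTZ_ANTHROPIC_API_KEY=your_key_here
BANTZ_LANGUAGE=tr
"""
config = masked(parse_env(SAMPLE))
```

Trailing inline comments and quoted values are not handled here, so treat it as a reading aid and not a replacement for the real loader.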


Setup wizards

bantz --setup profile          # name, timezone, city — stored in SQLite
bantz --setup places           # named GPS locations (home, office, etc.)
bantz --setup schedule         # weekly timetable
bantz --setup google gmail     # Google OAuth for Gmail
bantz --setup google classroom # Google OAuth for Classroom
bantz --setup telegram         # Telegram bot token
bantz --setup systemd          # install + enable systemd user service
bantz --setup systemd --check  # show service status, PID, memory, uptime

Running

# Operations Center — the Tauri desktop UI (starts the daemon if not running)
bantz --ui

# Headless daemon — APScheduler drives all background jobs (UI connects to this)
bantz --daemon

# Single query from the CLI, no UI
bantz --once "what's on my calendar today?"

# System health check
bantz --doctor

# Show running config (secrets masked)
bantz --config

Scheduled job management:

bantz --jobs                          # list all APScheduler jobs
bantz --run-job nightly_maintenance   # trigger any job immediately
bantz --maintenance                   # run maintenance workflow now
bantz --reflect                       # run reflection now
bantz --reflections                   # view last 10 reflections
bantz --overnight-poll                # run one overnight poll cycle

Systemd service (recommended for daemon mode):

bantz --setup systemd
# writes ~/.config/systemd/user/bantz.service
# enables linger, enables and starts the service

systemctl --user status bantz
journalctl --user -u bantz -f

Project layout

src/bantz/
├── __main__.py          entry point, CLI argument routing
├── config.py            pydantic-settings, ~70 env vars
├── cli/
│   └── setup.py         all setup wizards and --doctor diagnostics
├── core/
│   ├── brain.py         central orchestrator
│   ├── routing_engine.py quick_route + plan-and-solve dispatch
│   ├── intent.py        CoT LLM routing (cot_route)
│   ├── finalizer.py     butler persona + hallucination check
│   ├── memory_injector.py context assembly before LLM call
│   ├── prompt_builder.py system prompt composition
│   └── workflow.py      inline multi-tool workflow detection
├── memory/
│   ├── bridge.py        MemPalace adapter (replaces 8 old modules)
│   └── omni_memory.py   parallel hybrid recall orchestrator
├── agent/
│   ├── executor.py      plan-and-solve step runner
│   ├── planner.py       LLM plan generator
│   ├── job_scheduler.py APScheduler wrapper
│   ├── affinity_engine.py bonding score + persona tier
│   ├── ghost_loop.py    wake→capture→STT→dispatch cycle
│   ├── wake_word.py     Porcupine always-on listener
│   ├── voice_capture.py WebRTC VAD recording
│   ├── stt.py           faster-whisper transcription
│   ├── tts.py           Piper + aplay synthesis
│   ├── audio_ducker.py  system volume control during speech
│   ├── ambient.py       environment sound classifier
│   ├── observer.py      background log analysis
│   ├── notifier.py      desktop notifications
│   ├── interventions.py proactive suggestion queue
│   └── workflows/       nightly maintenance, reflection, overnight poll
├── data/
│   ├── layer.py         DataLayer singleton, unified store init
│   ├── sqlite_store.py  profile, places, schedule, session, KV stores
│   └── connection_pool.py WAL-mode thread-safe SQLite pool
├── interface/
│   ├── ws_server.py     WebSocket server (:8765) — backend for the desktop UI
│   └── live_ui.py       legacy CLI status view
├── integrations/
│   └── telegram_bot.py  Telegram remote access bot
├── tools/               31 registered tools (shell, gmail, calendar, ...)
├── llm/
│   ├── router.py        provider selector (ollama | claude | gemini | openai)
│   ├── ollama.py        local Ollama client (default)
│   ├── anthropic_client.py  Claude / Anthropic client
│   ├── gemini.py        Gemini client
│   └── openai_client.py OpenAI client
├── personality/
│   ├── persona.py       system prompt persona layer
│   ├── bonding.py       interaction scoring
│   └── greeting.py      morning briefing generation
├── auth/                Google OAuth2 PKCE flow
└── i18n/
    └── bridge.py        MarianMT translation bridge

Tests

pytest                   # full suite
pytest tests/core/       # core modules only
pytest --cov=bantz       # coverage report (target: 65%)

48 tests are known to fail, all in the prompt-content and routing-regex suites: they assert specific LLM output strings, which drift with model changes. Everything structural (core, data, agent, cli) passes.


Dependencies

Core (always installed):

  • ollama — local LLM server (separate binary install, not pip)
  • httpx — async HTTP for Ollama and Gemini
  • pydantic-settings — config from env
  • rich — terminal UI
  • aioconsole — async terminal input
  • apscheduler + sqlalchemy — persistent job scheduling
  • psutil — system stats (CPU/RAM/disk; psutil itself does not report VRAM)
  • python-telegram-bot — Telegram integration
  • mempalace — ChromaDB + KG memory stack

Optional (install as needed):

  • pvporcupine, pyaudio — wake word detection
  • webrtcvad — voice activity detection
  • faster-whisper — local STT
  • piper (binary) — local TTS
  • transformers, torch, sentencepiece — MarianMT translation (pip install -e ".[translation]")
  • pymupdf, python-docx — document reading (pip install -e ".[docs]")
  • pyautogui, pynput — desktop automation (pip install -e ".[automation]")

About

Bantz won't be a helper; he will be your Host. Your computer will become his studio, and your tasks will provide the "entertainment." He won't just execute code; he will "pull strings" behind the scenes with a smile that you can hear through the text.

Languages

  • Python 93.5%
  • TypeScript 3.5%
  • HTML 2.0%
  • Shell 0.5%
  • Jupyter Notebook 0.2%
  • CSS 0.1%
  • Other 0.2%