🕵️ Khabri AI

Your personal khabri (खबरी) — the informant who sees everything, hears everything, and never forgets.

A daemon that lives on your laptop, captures everything you see and hear, and becomes your always-on personal AI assistant powered by a local or cloud LLM.

╔═══════════════════════════════════════════════════════════╗
║  Khabri AI — Captures everything. Forgets nothing.        ║
║  Sees your screen. Hears your meetings.                   ║
║  Tracks your tasks. Knows your context.                   ║
╚═══════════════════════════════════════════════════════════╝

What It Does

Capability	Description
Screen Capture + OCR	Screenshots every 15s, extracts text via EasyOCR, skips unchanged screens
Audio Transcription	Captures system audio (WASAPI loopback) and transcribes it with local Whisper; the audio itself is never saved to disk
Clipboard Monitoring	Watches for new text copied to clipboard
Window Tracking	Tracks active app/window, auto-detects meetings (Teams, Zoom, Webex)
Semantic Search	ChromaDB-backed vector search across all captured content
Meeting Intelligence	Auto-detects meetings, summarizes when they end, extracts action items
Task Management	Auto-extracts tasks from conversations, tracks their status, sends deadline reminders
Daily Digest	End-of-day markdown report: accomplishments, meetings, tasks, app usage
Focus Analysis	Tracks app switches, detects distraction patterns, offers productivity coaching
Timeline Recall	Reconstructs what happened at a specific time or around a topic
Knowledge Synthesis	Connects information across meetings, screen activity, and clipboard
Proactive Alerts	Overdue tasks, stale items, context-switching warnings
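
The "skips unchanged screens" behavior can be approximated by comparing each new frame with the last one that was kept. The following is a minimal sketch, not the project's implementation: it assumes frames arrive as flat lists of grayscale values (0-255), and `frame_changed` and the `0.02` threshold are illustrative names and values that do not come from the Khabri source.

```python
from typing import Optional, Sequence


def frame_changed(prev: Optional[Sequence[int]],
                  curr: Sequence[int],
                  threshold: float = 0.02) -> bool:
    """Return True when `curr` differs enough from `prev` to be worth OCR-ing.

    Frames are flat sequences of grayscale pixel values in 0..255.
    `threshold` is the mean absolute difference, as a fraction of 255.
    """
    if prev is None or len(prev) != len(curr):
        return True  # first frame, or the resolution changed
    if not curr:
        return False  # two empty frames: nothing to compare
    total_diff = sum(abs(a - b) for a, b in zip(prev, curr))
    mean_diff = total_diff / (len(curr) * 255)
    return mean_diff > threshold
```

Running the comparison on a downscaled copy of the screenshot keeps the check cheap relative to OCR, which is the expensive step it is meant to avoid.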

Architecture

src/
├── main.py                ← CLI daemon (interactive chat + background capture)
├── tray.py                ← System tray mode (taskbar icon, right-click menu)
├── capture/
│   ├── screen.py          ← Screenshot + OCR + diff detection + multi-monitor
│   ├── audio.py           ← WASAPI loopback + Whisper transcription
│   ├── clipboard.py       ← Clipboard monitoring (Win32 API)
│   └── window.py          ← Active window tracking + meeting detection
├── models/
│   └── local_llm.py       ← OpenAI-compatible primary + Ollama fallback + retry
├── storage/
│   └── database.py        ← SQLite (events, conversations, tasks, digests, insights)
├── indexing/
│   ├── embeddings.py      ← ChromaDB semantic search
│   └── indexer.py         ← Background indexer + auto-pruning
├── assistant/
│   ├── chat.py            ← Context-aware chat engine (7 capabilities)
│   ├── commands.py        ← 15 slash commands
│   ├── daily_digest.py    ← End-of-day markdown report generator
│   └── proactive.py       ← Autonomous intelligence engine
└── utils/
    └── config.py          ← YAML config loader

Quick Start

Prerequisites

Python 3.11+
Windows (uses WASAPI for audio capture, Win32 API for clipboard/window)
Ollama (optional, for offline fallback): ollama.com

Install

cd khabri-ai
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

Configure

Edit config/settings.yaml:

llm:
  # Primary: any OpenAI-compatible endpoint
  provider: openai_compat
  base_url: "http://localhost:3741/v1"  # your endpoint
  api_key: "your-api-key"              # or set LLM_API_KEY env var
  model: "gpt-4o"                      # model name

  # Fallback: local Ollama (auto-switches when primary is down)
  fallback:
    provider: ollama
    base_url: "http://localhost:11434"
    model: "llama3"
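
The two behaviors described in the comments above (an environment variable as an alternative to the YAML key, and automatic switching to the fallback) could look roughly like this. It is a sketch under assumptions: `resolve_api_key` and `complete_with_fallback` are hypothetical helpers, the environment variable taking precedence over the YAML value is a guess, and the two LLM clients are passed in as plain callables so the example stays self-contained.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_api_key(llm_cfg: Mapping[str, object],
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer the LLM_API_KEY environment variable, else the YAML value."""
    env = os.environ if env is None else env
    return env.get("LLM_API_KEY") or str(llm_cfg.get("api_key", ""))


def complete_with_fallback(prompt: str,
                           primary: Callable[[str], str],
                           fallback: Callable[[str], str],
                           retries: int = 2) -> str:
    """Try `primary` up to `retries` times, then hand the prompt to `fallback`."""
    for _ in range(retries):
        try:
            return primary(prompt)
        except Exception:
            continue  # a real client would catch narrower errors and back off
    return fallback(prompt)
```

Keeping the key out of settings.yaml and in the environment avoids committing credentials alongside the config file.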

Run

# Interactive CLI mode — chat + captures in background
python -m src.main

# System tray mode — lives in your taskbar silently
python -m src.tray

Commands

Command	Description
`/help`	Show all commands
`/status`	Capture stats, event counts, open tasks
`/summary [min]`	Summarize recent activity (default: 60 min)
`/actions`	Extract action items from meeting audio
`/tasks`	Show all open tasks
`/done <id>`	Mark a task as completed
`/search <query>`	Text search across captured content
`/recall <topic>`	Reconstruct timeline for a topic or time
`/synth <topic>`	Synthesize knowledge from all sources
`/focus [hours]`	Analyze focus & productivity (default: 4h)
`/digest`	Generate today's daily digest now
`/insights`	Show proactive intelligence insights
`/history`	Show conversation history
`/clear`	Clear conversation history
`/quit`	Exit the assistant
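
For readers curious how a slash command with an optional argument such as `/summary [min]` might be handled, here is a small sketch. The function names `parse_command` and `summary_minutes` are hypothetical and are not taken from `commands.py`; only the 60-minute default comes from the table above.

```python
from typing import List, Tuple


def parse_command(line: str) -> Tuple[str, List[str]]:
    """Split '/summary 30' into ('summary', ['30']); non-commands give ('', [])."""
    line = line.strip()
    if not line.startswith("/") or len(line) == 1:
        return "", []
    name, *args = line[1:].split()
    return name.lower(), args


def summary_minutes(args: List[str], default: int = 60) -> int:
    """Resolve the optional [min] argument of /summary, falling back to 60."""
    if args and args[0].isdigit():
        return int(args[0])
    return default
```

Anything that does not start with `/` returns an empty command name, so the caller can pass it to the chat engine as an ordinary message.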

How It Captures Meetings

Khabri does not join or record meetings. Instead:

1. The window tracker detects when Teams/Zoom/Webex is the active window.
2. Audio capture grabs your system audio output via WASAPI loopback (the same audio your speakers/headphones play).
3. Whisper transcribes the audio locally, so no audio leaves your machine.
4. The proactive engine detects that the meeting has ended and auto-generates a summary and action items.

Audio is processed in memory and never saved to disk.

No one in the meeting knows. No bots join. No recordings are created.
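
The detection of a meeting starting and ending can be sketched as a small state machine fed by the window poller. This is an illustration only: `is_meeting_window`, `MeetingTracker`, and the `grace_polls` parameter are hypothetical, and plain substring matching on the window title is an assumption (it would also match unrelated titles that happen to contain "zoom").

```python
from typing import Optional

MEETING_APPS = ("teams", "zoom", "webex")  # apps named in this README


def is_meeting_window(title: str) -> bool:
    """Case-insensitive check for a meeting app name in the window title."""
    lowered = title.lower()
    return any(app in lowered for app in MEETING_APPS)


class MeetingTracker:
    """Reports 'started' / 'ended' transitions as window titles are polled."""

    def __init__(self, grace_polls: int = 3) -> None:
        self.grace_polls = grace_polls  # non-meeting polls tolerated before 'ended'
        self.in_meeting = False
        self._misses = 0

    def update(self, title: str) -> Optional[str]:
        if is_meeting_window(title):
            self._misses = 0
            if not self.in_meeting:
                self.in_meeting = True
                return "started"
            return None
        if self.in_meeting:
            self._misses += 1
            if self._misses >= self.grace_polls:
                self.in_meeting = False
                self._misses = 0
                return "ended"
        return None
```

The grace period keeps a brief switch to another window (for example, to check notes) from being treated as the end of the meeting and triggering a premature summary.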

Data Flow

Screen ─────┐
Audio ──────┤
Clipboard ──┼──▶ SQLite DB ──▶ Background Indexer ──▶ ChromaDB
Window ─────┘         │                                   │
                      │                                   │
                      ▼                                   ▼
                 Daily Digest              Semantic Search for Chat
                 Proactive Engine          Timeline Recall
                 Task Extraction           Knowledge Synthesis
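
The middle of the diagram (SQLite to background indexer to vector store) can be sketched with the standard library alone. The `events` schema, the `indexed` flag, and `index_pending` are assumptions made for illustration, and a plain Python list stands in for ChromaDB so the example runs without extra packages.

```python
import sqlite3
from typing import List, Tuple


def index_pending(conn: sqlite3.Connection,
                  store: List[Tuple[int, str]],
                  batch_size: int = 100) -> int:
    """Copy up to `batch_size` unindexed events into `store`; return the count."""
    rows = conn.execute(
        "SELECT id, content FROM events WHERE indexed = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for event_id, content in rows:
        store.append((event_id, content))  # real code would embed + upsert here
        conn.execute("UPDATE events SET indexed = 1 WHERE id = ?", (event_id,))
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, source TEXT, "
    "content TEXT, indexed INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO events (source, content) VALUES (?, ?)",
    [("screen", "Q3 roadmap slide"), ("clipboard", "ticket link"),
     ("audio", "ship by Friday")],
)
store: List[Tuple[int, str]] = []
first = index_pending(conn, store, batch_size=2)   # indexes the two oldest rows
second = index_pending(conn, store, batch_size=2)  # indexes the remaining row
```

Writing captures to SQLite first and indexing them in batches means the capture threads do not have to wait on embedding work.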

Storage

All data is stored locally:

Store	Path	Purpose
SQLite	`data/db/assistant.db`	Events, conversations, tasks, digests, insights
ChromaDB	`data/db/chroma/`	Vector embeddings for semantic search
Digests	`data/digests/`	Daily markdown reports

Auto-pruning keeps the database from growing unbounded (default: 500K events max).
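
One way such a cap could be enforced is to delete everything except the newest N rows. This is a sketch, assuming a hypothetical `events` table whose auto-incrementing `id` reflects insertion order; `prune_events` is an illustrative name, and only the 500K default comes from the text above.

```python
import sqlite3


def prune_events(conn: sqlite3.Connection, max_events: int = 500_000) -> int:
    """Delete the oldest rows so at most `max_events` remain; return rows deleted."""
    cur = conn.execute(
        "DELETE FROM events WHERE id NOT IN ("
        "  SELECT id FROM events ORDER BY id DESC LIMIT ?"
        ")",
        (max_events,),
    )
    conn.commit()
    return cur.rowcount
```

A real pruner would presumably also remove the matching vectors from ChromaDB so the two stores stay in sync.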

Configuration Reference

All settings are in config/settings.yaml:

Section	Key Settings
`llm`	Provider, model, temperature, timeout, retries, fallback
`screen`	Interval (15s), OCR toggle, multi-monitor (-1=all), diff threshold
`audio`	Chunk duration (30s), Whisper model size, silence threshold
`clipboard`	Poll interval (2s), max content length
`window`	Poll interval (3s), meeting app detection
`storage`	DB path, ChromaDB path, max events, prune interval
`daily`	Digest hour (18:00), output directory, retention (90 days)
`proactive`	Check interval (15 min), meeting/focus/pattern/reminder detection
`assistant`	System prompt, max context items, history length
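
A config loader typically overlays the user's settings.yaml on built-in defaults, so that a file containing only a few keys still yields a complete configuration. The sketch below shows that overlay step on plain dictionaries. The default values are the ones quoted in the table above, but every key name (`interval_seconds`, `digest_hour`, and so on) and the `merge_config` helper are guesses, not the actual schema.

```python
from typing import Any, Dict

# Values come from the table above; the key names are hypothetical.
DEFAULTS: Dict[str, Any] = {
    "screen": {"interval_seconds": 15, "monitor": -1},
    "audio": {"chunk_seconds": 30},
    "clipboard": {"poll_seconds": 2},
    "window": {"poll_seconds": 3},
    "daily": {"digest_hour": 18, "retention_days": 90},
    "proactive": {"check_minutes": 15},
}


def merge_config(defaults: Dict[str, Any], user: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay `user` on `defaults` without mutating either."""
    merged = dict(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged
```

With this approach, a settings file that sets only the screen interval leaves every other section at its default.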

Offline Mode

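To run entirely offline, make Ollama the primary provider. Pulling the models needs a network connection once; after that no network is required: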

llm:
  provider: ollama
  base_url: "http://localhost:11434"
  model: "llama3"

ollama pull llama3
ollama pull nomic-embed-text
python -m src.main

Dependencies

Package	Purpose
`openai`	LLM client (OpenAI-compatible / Ollama)
`pyyaml`	Configuration
`mss`	Fast cross-platform screenshots
`Pillow`	Image processing
`easyocr`	Offline OCR text extraction
`numpy`	Array operations
`pyaudiowpatch`	Windows audio loopback capture
`openai-whisper`	Local speech-to-text
`torch`	Whisper backend
`chromadb`	Vector database for semantic search
`pystray`	System tray icon (tray mode)
