Real-time interview co-pilot that listens to call audio, transcribes questions, classifies intent, retrieves relevant context, and shows concise live hints in a floating overlay.
Repository: AhsanRiaz786/clutch-ai
Course: CS-419 Deep Learning (NUST SEECS, Spring 2026)
Clutch.ai runs this online pipeline:
- Capture audio from your selected input device (mic or loopback).
- Detect speech segments (RMS VAD; optional GRU VAD model if trained).
- Transcribe with Faster-Whisper (base.en, CPU int8).
- Classify transcript as technical_question, personal_behavioral, or noise.
- Retrieve supporting context from ChromaDB (+ optional reranker).
- Generate short interview-ready guidance via Groq (default 70B) with Ollama fallback.
- Stream output to a floating PyQt5 overlay.
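The RMS-based speech detection step in the pipeline above can be sketched roughly as follows. This is a minimal illustration, not the code in audio/capture.py: the function names, the frame size, the threshold, and the hangover count are all assumptions chosen for the example.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech_segments(
    samples: Sequence[float],
    frame_size: int = 480,      # assumption: 30 ms frames at 16 kHz
    threshold: float = 0.02,    # assumption: RMS level treated as speech
    hangover_frames: int = 5,   # assumption: quiet frames tolerated inside a segment
) -> List[Tuple[int, int]]:
    """Return (start_sample, end_sample) spans whose frame RMS reaches the threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    quiet = 0
    n_frames = len(samples) // frame_size
    for i in range(n_frames):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i * frame_size
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover_frames:
                # Close the segment at the end of the last loud frame.
                segments.append((start, (i - quiet + 1) * frame_size))
                start, quiet = None, 0
    if start is not None:
        # Audio ended mid-segment: trim any trailing quiet frames.
        segments.append((start, (n_frames - quiet) * frame_size))
    return segments
```

Each returned span would then be handed to the transcription step. The hangover keeps short pauses inside a sentence from splitting one question into several segments.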
- pipeline.py: app entry point and orchestration
- audio/capture.py: capture + VAD + transcription
- audio/devices.py: input device resolution (--list-devices support)
- classifier/predict.py: BiLSTM/MLP classifier inference
- rag/retriever.py: Chroma retrieval + rerank hook
- llm/hint_gen.py: prompting and Groq/Ollama generation
- ui/overlay.py: stealth/demo overlay behavior
git clone https://github.com/AhsanRiaz786/clutch-ai.git
cd clutch-ai
pip install -r requirements.txt
cp .env.example .env
Update .env with your values:
- GROQ_API_KEY: required for Groq
- LLM_MODEL: defaults to llama-3.3-70b-versatile
- CLUTCH_INPUT_DEVICE: optional input selector (e.g. blackhole)
- OVERLAY_DEMO_MODE
  - 0 = stealth (hidden from screen capture on macOS)
  - 1 = demo mode (capture-visible)
- MIN_CLASSIFIER_CONFIDENCE: default 65
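As an illustration of how these variables and their defaults might be read, here is a minimal sketch. The `Settings` class and `load_settings` function are hypothetical names, not taken from the repository. The sketch assumes that an unset OVERLAY_DEMO_MODE means stealth (the source does not state a default). It reads the process environment only, so loading the .env file itself is out of scope here.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values described above."""
    groq_api_key: Optional[str]
    llm_model: str
    input_device: Optional[str]
    overlay_demo_mode: bool
    min_classifier_confidence: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, applying the documented defaults."""
    return Settings(
        # Empty strings are treated the same as unset.
        groq_api_key=env.get("GROQ_API_KEY") or None,
        llm_model=env.get("LLM_MODEL", "llama-3.3-70b-versatile"),
        input_device=env.get("CLUTCH_INPUT_DEVICE") or None,
        # Assumption: anything other than "1" means stealth mode.
        overlay_demo_mode=env.get("OVERLAY_DEMO_MODE", "0") == "1",
        min_classifier_confidence=float(env.get("MIN_CLASSIFIER_CONFIDENCE", "65")),
    )
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dictionary, for example `load_settings({"OVERLAY_DEMO_MODE": "1"})`.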
To capture from the microphone, leave CLUTCH_INPUT_DEVICE unset (or set it to default or mic).
List devices:
python audio/capture.py --list-devices
Then set CLUTCH_INPUT_DEVICE to a matching name fragment or index.
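Matching a selector against device names, by fragment or by index, could look roughly like this. It is a sketch under assumptions rather than the logic in audio/devices.py: `resolve_input_device` is a hypothetical name, the match is a case-insensitive substring test, and the device names are passed in as a plain list instead of being queried from the audio backend.

```python
from typing import Optional, Sequence


def resolve_input_device(selector: Optional[str], device_names: Sequence[str]) -> Optional[int]:
    """Return the index of the selected device, or None for the system default input."""
    if selector is None or selector.strip().lower() in ("", "default", "mic"):
        return None
    text = selector.strip()
    if text.isdigit():
        # A purely numeric selector is treated as a device index.
        index = int(text)
        if index < len(device_names):
            return index
        raise ValueError(f"device index {index} is out of range")
    needle = text.lower()
    for index, name in enumerate(device_names):
        # First device whose name contains the fragment wins.
        if needle in name.lower():
            return index
    raise ValueError(f"no input device matches {selector!r}")
```

With the hypothetical list `["MacBook Pro Microphone", "BlackHole 2ch"]`, the selector `blackhole` resolves to index 1, while an unset selector resolves to `None`.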
To capture system audio on macOS:
- Install BlackHole 2ch.
- In Audio MIDI Setup, create a Multi-Output Device with:
  - BlackHole 2ch
  - your speakers/headphones
- In System Settings → Sound → Output, select that Multi-Output device.
- Set in .env: CLUTCH_INPUT_DEVICE=blackhole
On Windows, use Stereo Mix (or an equivalent loopback input), then set CLUTCH_INPUT_DEVICE accordingly.
Add personal/context documents:
- data/notes/: notes/docs
- data/code/: code files
- data/resume/: resume context
Build vector DB and (optionally) retrain models:
python ingest/ingest.py
python classifier/train.py
python classifier/lstm_classifier.py
python classifier/finetune_embeddings.py
python pipeline.py
Expected startup signals:
- [UI] Overlay mode: DEMO (capture-visible) or STEALTH (capture-hidden)
- [PIPELINE] Prerequisites OK
- [AUDIO] VAD capture ready — listening for speech ...
- Class demo: set OVERLAY_DEMO_MODE=1, and share your entire screen if your meeting app excludes floating overlays in window-only share.
- Interview stealth: set OVERLAY_DEMO_MODE=0 (macOS applies NSWindowSharingNone).
- If Groq key is missing/invalid, app falls back to local Ollama.
- If Ollama is not running, app returns a safe generic fallback response.
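The fallback order described above (Groq, then local Ollama, then a generic response) can be sketched as a chain of callables. This is an illustrative pattern, not the code in llm/hint_gen.py: `generate_hint`, the backend callables, and the `GENERIC_HINT` text are hypothetical, and the real Groq and Ollama client calls are deliberately left out.

```python
from typing import Callable, Iterable

# Assumption: placeholder wording, not the app's actual fallback text.
GENERIC_HINT = "Take a moment, restate the question, and answer step by step."


def generate_hint(
    prompt: str,
    backends: Iterable[Callable[[str], str]],
    fallback: str = GENERIC_HINT,
) -> str:
    """Try each backend in order and return the first non-empty hint."""
    for backend in backends:
        try:
            hint = backend(prompt)
        except Exception:
            # A missing key, a refused connection, or a timeout moves on to the next backend.
            continue
        if hint and hint.strip():
            return hint.strip()
    # Every backend failed or returned nothing usable.
    return fallback
```

In use, the list would hold one callable wrapping the Groq request followed by one wrapping the Ollama request, so that a failure in the first falls through to the second without the caller needing to know which backend answered.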
Start Ollama fallback:
ollama pull llama3.2:3b
ollama serve
Run the evaluation scripts:
python eval/eval_retrieval.py
python eval/eval_latency.py