A real-time fact-checking system that monitors conversations in video meetings and YouTube, extracts factual claims, and verifies them using AI and web search. Features an animated mascot that alerts you when questionable claims are detected.
Technical Challenge: Balancing sub-second latency requirements with the reasoning depth needed for accurate fact verification in live conversations.
- Real-time Audio Monitoring: Captures system audio from Zoom, Google Meet, Teams, and YouTube
- Automatic Claim Extraction: Uses AI to identify factual claims in conversations
- Web-based Verification: Searches trusted sources to verify claims
- Confidence Scoring: Provides confidence levels for each verdict
- Chrome Extension: Animated mascot with speech bubbles for visual fact-check alerts
- Optimised Pipeline: 1.5-2.5s end-to-end latency whilst maintaining reasoning quality
```
System Audio → WebSocket Server (FastAPI) → STT (Groq) → Claim Extraction (PydanticAI)
                                                                    ↓
Chrome Extension ← WebSocket ← Verification (Groq) ← Web Search (Exa)
        ↓
Mascot UI (alerts on false/unclear claims)
```
- Audio Capture: System audio monitoring using sounddevice for real-time capture
- WebSocket Server: FastAPI server with clean architecture and dependency injection (a minimal endpoint is sketched after this list)
- Speech-to-Text: Groq STT with Whisper model
- Claim Extraction: PydanticAI agent to identify factual claims
- Web Search: Exa API for fast neural/keyword search
- Verification: PydanticAI agent to analyse evidence and generate verdicts
- Chrome Extension: Animated mascot UI with speech bubbles for visual alerts
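For orientation, a FastAPI WebSocket server of this shape can be sketched in a few lines. The route, port, and payload below are assumptions for illustration, not the project's actual code:

```python
# Minimal FastAPI WebSocket server sketch; the route and payload are assumed,
# only the framework calls (accept/send_json) are real FastAPI API.
import uvicorn
from fastapi import FastAPI, WebSocket

app = FastAPI()


@app.websocket("/")
async def stream(ws: WebSocket) -> None:
    await ws.accept()
    # The real server would push verdicts here as the pipeline produces them.
    await ws.send_json({"type": "status", "connected": True})


if __name__ == "__main__":
    uvicorn.run(app, host="localhost", port=8765)  # matches ws://localhost:8765
```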
The WebSocket server follows clean architecture principles:
- Separation of Concerns: Each module has a single responsibility
- Dependency Injection: Services are injected, not hardcoded (see the sketch after this list)
- Layered Architecture: Clear separation between API, Core, and Infrastructure layers
- Message Factory Pattern: Consistent message creation across the application
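A minimal sketch of what this layering looks like in Python, assuming hypothetical class names (the project's real interfaces may differ):

```python
# Hypothetical sketch of the dependency-injection style described above;
# class names are illustrative, not the project's actual API.
from typing import Protocol


class STTService(Protocol):
    """Domain-layer interface: any STT provider implements this."""

    async def transcribe(self, audio: bytes) -> str: ...


class GroqSTT:
    """Infrastructure-layer client; would wrap the Groq Whisper API."""

    async def transcribe(self, audio: bytes) -> str:
        raise NotImplementedError  # real client calls Groq here


class AvalonSTT:
    """Alternative provider with the same interface."""

    async def transcribe(self, audio: bytes) -> str:
        raise NotImplementedError


class FactChecker:
    """Core-layer logic: the STT service is injected, never hardcoded."""

    def __init__(self, stt: STTService) -> None:
        self.stt = stt


# Swapping providers is a construction-time decision driven by config:
checker = FactChecker(stt=GroqSTT())  # or FactChecker(stt=AvalonSTT())
```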
```
uhmm-achtually/
├── backend/                    # Python backend server
│   ├── main.py                 # Main entry point for WebSocket server
│   ├── src/
│   │   ├── api/                # API layer (WebSocket + HTTP)
│   │   ├── core/               # Business logic (fact-checking, NLP, transcription)
│   │   ├── domain/             # Domain models and interfaces
│   │   ├── infrastructure/     # External clients (Groq, Exa)
│   │   └── processors/         # Audio processing and claim extraction
│   ├── pyproject.toml          # Python dependencies (uv)
│   └── .env                    # Environment variables (API keys)
│
├── chrome-extension/           # Chrome extension for visual alerts
│   ├── manifest.json           # Extension configuration
│   ├── background/
│   │   └── background.js       # WebSocket client & service worker
│   ├── content/
│   │   ├── content.js          # Mascot UI logic
│   │   └── content.css         # Mascot styling & animations
│   ├── popup/
│   │   ├── popup.html          # Extension popup UI
│   │   ├── popup.js            # Popup logic & settings
│   │   └── popup.css           # Popup styling
│   ├── assets/
│   │   ├── thinking-pose.png   # Mascot default state
│   │   └── talking-pose.png    # Mascot alert state
│   └── icons/                  # Extension icons
│
└── README.md
```
- Python 3.12+
- Google Chrome browser
- API Keys:
  - Groq API key for STT and LLM (required)
  - Exa API key for web search (required)
1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/uhmm-achtually.git
   cd uhmm-achtually
   ```

2. Install the UV package manager (if not already installed):

   ```bash
   curl -LsSf https://astral.sh/uv/install.sh | sh
   ```

3. Navigate to the backend and install dependencies:

   ```bash
   cd backend
   uv sync --all-groups
   ```

4. Configure environment variables by creating a `.env` file in the `backend` directory with your API keys:

   ```bash
   # Required
   GROQ_API_KEY=your_groq_api_key_here
   EXA_API_KEY=your_exa_api_key_here

   # Optional - Customise trusted search domains
   ALLOWED_DOMAINS=stackoverflow.com,github.com,python.org,reactjs.org
   ```

5. Start the backend server:

   ```bash
   uv run python main.py
   ```

   The server will start on `ws://localhost:8765` and begin capturing system audio.
6. Open the Chrome Extensions page:
   - Navigate to `chrome://extensions/` in your Chrome browser
   - Enable Developer mode (toggle in the top-right corner)

7. Load the extension:
   - Click "Load unpacked"
   - Select the `chrome-extension` folder from this repository

8. Verify the installation:
   - You should see "Uhmm Actually - AI Fact Checker" in your extensions list
   - The extension icon should appear in your Chrome toolbar

9. Check the connection status:
   - Click the extension icon to open the popup
   - Verify it shows "Connected" (green indicator)
   - If disconnected, make sure the backend server is running (you can also smoke-test the socket from Python, as sketched below)
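If you want to rule out extension-side issues, a few lines of Python can confirm the socket is reachable. This assumes the third-party `websockets` package is installed (e.g. `uv pip install websockets`):

```python
# Quick reachability check for the backend WebSocket server.
import asyncio

import websockets


async def main() -> None:
    # Endpoint taken from the setup steps above.
    async with websockets.connect("ws://localhost:8765"):
        print("Backend reachable")


asyncio.run(main())
```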
Once both the backend server and Chrome extension are running:
1. Navigate to a supported platform:
   - Zoom (zoom.us)
   - Google Meet (meet.google.com)
   - Microsoft Teams (teams.microsoft.com)
   - YouTube (youtube.com)

2. The mascot will appear:
   - An animated character appears in the bottom-right corner
   - Green dot indicates active WebSocket connection
   - You can drag and drop the mascot to reposition it

3. When false or unclear claims are detected:
   - The mascot switches to "talking" pose
   - A speech bubble appears with the verdict
   - Shows confidence percentage, rationale, and evidence link
   - Bubble auto-hides after 10 seconds

4. Verdict types:
   - Contradicted (red bubble): Claim is false or contradicted by evidence
   - Unclear (yellow bubble): Insufficient evidence to verify
   - Supported: No alert shown (claim is true)
   - Not Found: No claims detected in the conversation
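The wire format isn't shown in this README, so the payload below is a hypothetical illustration of the fields the bubble displays; every field name is assumed:

```python
# Hypothetical verdict payload as the extension might receive it; every
# field name here is an assumption, not taken from the project's protocol.
verdict_message = {
    "type": "verdict",
    "claim": "HTML is for back-end development",
    "verdict": "contradicted",  # contradicted | unclear | supported | not_found
    "confidence": 0.9,          # drives the "90%" shown in the bubble
    "rationale": "HTML is a markup language for front-end document structure.",
    "evidence_url": "https://developer.mozilla.org/en-US/docs/Web/HTML",
}
```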
- Drag & Drop: Click and drag to reposition the mascot anywhere on the page
- Persistent Position: Your chosen position is saved in browser storage
- Floating Animation: Gentle bobbing motion when idle
- Connection Status: Green dot when connected, yellow when disconnected
- Responsive Design: Adapts to different screen sizes
The screenshot shows the extension fact-checking a YouTube video claiming "HTML is for back-end development". The mascot appears in the bottom-right with a red speech bubble indicating the claim is contradicted with 90% confidence, along with a rationale and evidence link.
When the backend is running, you'll see real-time processing in the terminal:
```
INFO: Started server process
INFO: Uvicorn running on http://localhost:8765
INFO: WebSocket connection established
INFO: Transcribing: "HTML is for back-end development"
INFO: Extracting claims from transcript
INFO: Found 1 claim: HTML is for back-end development
INFO: Searching for evidence...
INFO: Verdict: contradicted (90% confidence)
INFO: Sent verdict to Chrome extension
```
When someone makes a false claim in a video:
- Audio captured: System picks up the conversation
- Claim detected: AI identifies a factual claim
- Evidence gathered: Searches trusted sources
- Verdict delivered:
  - Mascot animates to "talking" pose
  - Speech bubble appears with verdict
  - Red bubble: "This claim is contradicted by evidence"
  - Shows 90% confidence with a link to the source
Edit `backend/dev_config.yaml` to customise system behaviour:

```yaml
# Voice Activity Detection (optimised for low latency)
vad:
  disable: false      # Set true to bypass VAD
  start_secs: 0.1     # Faster speech detection
  stop_secs: 0.8      # Natural pause tolerance
  min_volume: 0.5     # Sensitivity threshold (0.0-1.0)

# Speech-to-Text Provider
stt:
  provider: "groq"    # Options: "groq" or "avalon"
  groq:
    model: "whisper-large-v3"
    language: "en"

# LLM Models (Groq)
llm:
  claim_extraction_model: "llama-3.3-70b-versatile"
  verification_model: "llama-3.3-70b-versatile"
  temperature: 0.1    # Lower = more deterministic (0.0-1.0)

# Logging
logging:
  level: "INFO"       # DEBUG, INFO, WARNING, ERROR
  log_transcriptions: true
```

The system supports two STT providers:
- Groq STT (default) - Fast, reliable Whisper-based transcription
- Avalon STT - AquaVoice, 97.3% accuracy on technical terms
To switch providers, edit `backend/dev_config.yaml`:

```yaml
stt:
  provider: "avalon"  # Switch from "groq" to "avalon"
```

The fact-checker searches only trusted domains specified in ALLOWED_DOMAINS. This constraint serves two purposes:
- Speed: Reduces search space for faster results (<500ms)
- Reliability: Ensures evidence comes from authoritative sources
Default domains include:
- docs.python.org (Python documentation)
- kubernetes.io (Kubernetes docs)
- owasp.org (Security best practices)
- nist.gov (Standards and guidelines)
- postgresql.org (PostgreSQL documentation)
Add your own trusted sources by updating backend/.env.
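For illustration, a domain-restricted lookup with the `exa-py` client might look like the sketch below; how the project actually threads ALLOWED_DOMAINS into its search calls is an assumption here:

```python
# Sketch of a domain-restricted lookup with the exa-py client; how the
# project wires ALLOWED_DOMAINS into its calls is assumed, not documented.
import os

from exa_py import Exa

exa = Exa(os.environ["EXA_API_KEY"])
allowed = os.environ.get("ALLOWED_DOMAINS", "docs.python.org").split(",")

results = exa.search_and_contents(
    "Is HTML used for back-end development?",
    type="keyword",           # keyword mode: faster than neural search
    include_domains=allowed,  # evidence only ever comes from trusted sources
    num_results=3,
    text=True,                # fetch page text so the verifier sees real content
)
for r in results.results:
    print(r.url)
```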
Server won't start:
- Ensure you're in the `backend` directory
- Check your Python version: `python --version` (must be 3.12+)
- Verify dependencies: `uv sync --all-groups`
- Check that port 8765 isn't already in use: `lsof -i :8765`
API errors:
- Verify the API keys in your `.env` file
- Check API key validity at the Groq Console and Exa Dashboard
- Ensure no extra spaces or quotes around API keys
No audio being captured:
- Check system audio permissions for the terminal/Python
- Verify sounddevice is properly installed: `uv run python -c "import sounddevice; print(sounddevice.query_devices())"`
- On macOS: Grant microphone access in System Preferences → Security & Privacy
Extension not connecting:
- Ensure the backend server is running on `ws://localhost:8765`
- Check the browser console for WebSocket errors (F12 → Console)
- Click extension icon and try "Reconnect" button
Mascot not appearing:
- Verify you're on a supported platform (Zoom, Meet, Teams, YouTube)
- Check the extension is enabled at `chrome://extensions/`
- Reload the page after starting the backend
No verdicts showing:
- Check backend console for claim detection logs
- Verify ALLOWED_DOMAINS includes relevant sources
- Speech must contain clear factual claims to trigger detection
Run the test suite:

```bash
cd backend
uv run pytest tests/
```

Update the ALLOWED_DOMAINS environment variable in `backend/.env`:

```bash
ALLOWED_DOMAINS=docs.python.org,stackoverflow.com,developer.mozilla.org
```

Real-time fact-checking presents a fundamental challenge: how quickly can we verify claims without sacrificing accuracy? Conversations move fast, and users expect near-instant feedback, yet thorough fact-checking requires:
- Understanding context and nuance in claims
- Searching multiple sources for evidence
- Reasoning about contradictions and reliability
- Generating confident verdicts
We optimise the pipeline at each stage to minimise latency whilst maintaining reasoning quality:
| Component | Latency | Optimisation Strategy |
|---|---|---|
| Speech-to-Text | 200-400ms | Groq's optimised Whisper Large v3 with streaming |
| Claim Extraction | 200-300ms | PydanticAI with Llama 3.3 70B (structured outputs, low temperature) |
| Web Search | <500ms | Exa keyword mode on pre-filtered trusted domains |
| Verification | 500-800ms | Focused prompts with evidence pre-filtering |
| Total Pipeline | 1.5-2.5s | End-to-end fact-check delivered to user |
Speed optimisations:
- Keyword search over neural search: Exa's keyword mode is 3-5x faster whilst maintaining relevance on domain-filtered results
- Sentence-level chunking: Process claims as complete sentences arrive rather than waiting for full paragraphs (see the sketch after this list)
- Pre-filtered domains: ALLOWED_DOMAINS constraint reduces search space and improves speed
- Structured outputs: PydanticAI enforces strict schemas, eliminating parsing overhead
- Low temperature (0.1): Reduces model reasoning time whilst maintaining accuracy for factual tasks
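A minimal sketch of the sentence-level chunking idea, assuming a simple regex boundary detector (the real pipeline's buffering may differ):

```python
# Emit a chunk as soon as a sentence boundary arrives instead of waiting
# for a full paragraph; illustrative only.
import re

SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


class SentenceBuffer:
    """Accumulates streaming transcript text and emits complete sentences."""

    def __init__(self) -> None:
        self._buf = ""

    def feed(self, text: str) -> list[str]:
        self._buf += text
        *done, self._buf = SENTENCE_END.split(self._buf)
        return [s for s in done if s.strip()]


buf = SentenceBuffer()
print(buf.feed("HTML is for back-end development. CSS is"))
# -> ['HTML is for back-end development.']  (checked immediately, no paragraph wait)
print(buf.feed(" used for styling. "))
# -> ['CSS is used for styling.']
```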
Reasoning quality:
- 70B parameter model: Llama 3.3 70B provides strong reasoning whilst being faster than 405B models
- Two-stage AI pipeline: Separate claim extraction and verification agents for specialised reasoning (sketched after this list)
- Evidence-based verification: LLM evaluates actual source content, not just search snippets
- Confidence scoring: Expresses uncertainty when evidence is insufficient
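A sketch of the two-agent split using PydanticAI; prompts and schemas here are illustrative, and the structured-output parameter has been renamed across pydantic-ai releases (`result_type` in older versions, `output_type` in newer ones):

```python
# Illustrative two-agent split with PydanticAI and Groq-hosted Llama 3.3 70B.
from pydantic import BaseModel
from pydantic_ai import Agent


class Claims(BaseModel):
    claims: list[str]


class Verdict(BaseModel):
    verdict: str        # "supported" | "contradicted" | "unclear"
    confidence: float   # 0.0-1.0
    rationale: str


extractor = Agent(
    "groq:llama-3.3-70b-versatile",
    output_type=Claims,
    system_prompt="Extract verifiable factual claims from the sentence.",
)

verifier = Agent(
    "groq:llama-3.3-70b-versatile",
    output_type=Verdict,
    system_prompt="Judge the claim strictly against the provided evidence.",
)

# claims = extractor.run_sync(sentence).output   # `.data` on older releases
```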
Trade-offs we accept:
- May miss subtle claims that require multi-sentence context (sentence-level processing)
- Limited to pre-approved domains (won't find evidence on new or niche sites)
- Favours precision over recall (alerts only on contradicted/unclear claims, not all claims)
- Audio Capture: System audio monitoring using the sounddevice library (see the capture sketch after this list)
- VAD: Silero VAD detects speech segments
- STT: Groq Whisper converts speech to text
- Sentence Buffering: Accumulates text until sentence boundary detected
- Claim Extraction: PydanticAI analyses sentence for factual claims
- Web Search: Exa searches trusted domains for evidence
- Verification: PydanticAI analyses evidence and generates verdict
- WebSocket Delivery: Verdict sent to Chrome extension in real-time
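To make the first stage concrete, here is a minimal capture loop with sounddevice; the sample rate, block handling, and device selection are illustrative, and capturing true system audio (rather than the microphone) requires a platform-specific loopback device:

```python
# Minimal capture sketch for the first pipeline stage; illustrative defaults.
import queue

import numpy as np
import sounddevice as sd

audio_q: "queue.Queue[np.ndarray]" = queue.Queue()


def on_audio(indata, frames, time, status) -> None:
    if status:
        print(status)           # report over/underruns
    audio_q.put(indata.copy())  # copy: PortAudio reuses the buffer


# 16 kHz mono is what Whisper-style STT models typically expect.
with sd.InputStream(samplerate=16000, channels=1, callback=on_audio):
    chunk = audio_q.get()       # downstream: VAD -> STT -> claim extraction
    print(f"captured {len(chunk)} frames")
```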
Clean separation of concerns:
- API layer handles WebSocket/HTTP communication
- Core layer contains business logic (fact-checking, NLP)
- Infrastructure layer manages external API clients
- Domain layer defines interfaces and models
Dependency injection:
- Services are injected via constructors, making testing and swapping implementations easy
- Example: Switch from Groq STT to Avalon STT by changing config, no code changes needed
Message factory pattern:
- Consistent message formatting for WebSocket communication
- Simplifies debugging and ensures protocol compliance
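A hypothetical factory in this spirit; the project's actual message fields aren't documented here, so these are assumed:

```python
# Hypothetical message factory matching the pattern described above.
import json
import time


def make_message(msg_type: str, **payload) -> str:
    """Every outgoing WebSocket message gets the same envelope."""
    return json.dumps({
        "type": msg_type,
        "timestamp": time.time(),
        "payload": payload,
    })


make_message("verdict", verdict="contradicted", confidence=0.9)
make_message("status", connected=True)
```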
To contribute:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- PydanticAI for structured AI outputs
- Exa for fast neural search
- Groq for LLM and STT services
- FastAPI for the WebSocket server framework
- Pipecat for audio processing components
- Avalon for alternative STT services