This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
AI-DX is an autonomous amateur radio operator that conducts QSOs (radio conversations) using the GPT-4o Realtime API for end-to-end audio processing (VAD, STT, LLM, TTS) and wfweb for radio audio I/O and PTT control. It runs on macOS Apple Silicon (M4 tested).
# Install dependencies
uv sync
# Production (requires wfweb + radio hardware)
uv run python radio_operator.py
# Demo mode (mic + speakers, no radio hardware)
uv run python radio_operator.py --demo
uv run python radio_operator.py -d
# Suppress Rich UI, log to console instead
uv run python radio_operator.py --no-ui
# Play RX/TX audio locally (production only)
uv run python radio_operator.py --monitor-audio
There are no tests, linter, or CI configured. The test_tools/ directory contains manual testing utilities, not automated test suites.
Entry point: radio_operator.py — contains RadioOperator, TxBuffer, AudioMonitor, MicCapture, QSO state machine, weather fetching, phonetic spelling, and the main worker loop.
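The phonetic spelling mentioned above can be illustrated with a minimal sketch. The function name `spell_callsign`, the digit words, and the "Stroke" rendering of `/` are assumptions for illustration, not the repository's actual code; the letter words follow the standard ITU/NATO alphabet.

```python
# Hypothetical helper: spell a callsign with the ITU/NATO phonetic alphabet.
PHONETIC = {
    "A": "Alfa", "B": "Bravo", "C": "Charlie", "D": "Delta", "E": "Echo",
    "F": "Foxtrot", "G": "Golf", "H": "Hotel", "I": "India", "J": "Juliett",
    "K": "Kilo", "L": "Lima", "M": "Mike", "N": "November", "O": "Oscar",
    "P": "Papa", "Q": "Quebec", "R": "Romeo", "S": "Sierra", "T": "Tango",
    "U": "Uniform", "V": "Victor", "W": "Whiskey", "X": "X-ray",
    "Y": "Yankee", "Z": "Zulu",
    # Digit words and the "/" rendering are assumptions, not taken from the source.
    "0": "Zero", "1": "One", "2": "Two", "3": "Three", "4": "Four",
    "5": "Five", "6": "Six", "7": "Seven", "8": "Eight", "9": "Nine",
    "/": "Stroke",
}


def spell_callsign(callsign: str) -> str:
    """Return the callsign spelled out word by word; unknown characters pass through."""
    return " ".join(PHONETIC.get(ch, ch) for ch in callsign.upper())


print(spell_callsign("W1AW"))  # Whiskey One Alfa Whiskey
```

Spelling callsigns out in text before they reach the TTS stage is one way to keep them intelligible over a noisy channel; whether the repository does this in the prompt or in code is not stated here.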
Production pipeline:
wfweb WebSocket (RX audio, 48kHz)
→ RealtimeSession.push_audio() [resample 48→24kHz]
→ GPT-4o Realtime API [server VAD + STT + LLM + TTS]
→ TxBuffer [resample 24→48kHz, real-time pacing]
→ wfweb WebSocket (TX audio + PTT)
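The resampling steps in the pipeline (48→24kHz on receive, 24→48kHz on transmit) can be sketched as below. This is a deliberately naive linear-interpolation resampler for mono little-endian PCM16 using only the standard library; the function name `resample_pcm16` is hypothetical, and the repository may use a different method. Note that downsampling without a low-pass filter can alias, so a production implementation would likely filter first.

```python
import struct


def resample_pcm16(pcm: bytes, src_rate: int, dst_rate: int) -> bytes:
    """Resample mono little-endian PCM16 by linear interpolation (illustrative only)."""
    n_src = len(pcm) // 2
    if n_src == 0 or src_rate == dst_rate:
        return pcm
    src = struct.unpack(f"<{n_src}h", pcm[: n_src * 2])
    n_dst = n_src * dst_rate // src_rate
    out = []
    for i in range(n_dst):
        pos = i * src_rate / dst_rate   # fractional index into the source samples
        lo = int(pos)
        hi = min(lo + 1, n_src - 1)     # clamp at the final sample
        frac = pos - lo
        out.append(round(src[lo] * (1 - frac) + src[hi] * frac))
    return struct.pack(f"<{n_dst}h", *out)


rx_48k = struct.pack("<4h", 0, 100, 200, 300)
rx_24k = resample_pcm16(rx_48k, 48_000, 24_000)
print(struct.unpack("<2h", rx_24k))  # (0, 200)
```

The same function covers the demo path (16→24kHz), since it accepts arbitrary integer rates.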
Demo pipeline (--demo):
MicCapture (default mic, 16kHz)
→ RealtimeSession.push_audio() [resample 16→24kHz]
→ GPT-4o Realtime API
→ TxBuffer
→ AudioMonitor (local speaker playback, 24kHz)
Key classes in radio_operator.py:
- TxBuffer: queues 24kHz PCM16 from GPT-4o Realtime and streams it to wfweb with real-time pacing; calls the _on_tx_start/_on_tx_end callbacks; handles demo mode (no wfweb) transparently
- AudioMonitor: optional local playback via sounddevice; used for --monitor-audio (48kHz) and demo mode (24kHz)
- MicCapture: demo mode only; captures the default mic at 16kHz and gates on is_transmitting
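The real-time pacing described for TxBuffer can be sketched as follows. `PacedSender` is a hypothetical stand-in, not the repository's class: it assumes 24kHz mono PCM16 and paces by tracking a deadline that advances by each chunk's playback duration. The clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time
from collections import deque

SAMPLE_RATE = 24_000     # Hz, matching the 24kHz PCM16 stream described above
BYTES_PER_SAMPLE = 2     # PCM16 mono (assumption)


class PacedSender:
    """Hypothetical pacing loop: never sends audio faster than it would play."""

    def __init__(self, send, clock=time.monotonic, sleep=time.sleep):
        self._queue = deque()
        self._send = send
        self._clock = clock
        self._sleep = sleep

    def push(self, chunk: bytes) -> None:
        self._queue.append(chunk)

    def drain(self) -> None:
        """Send every queued chunk, sleeping until each chunk's deadline."""
        next_deadline = self._clock()
        while self._queue:
            chunk = self._queue.popleft()
            delay = next_deadline - self._clock()
            if delay > 0:
                self._sleep(delay)
            self._send(chunk)
            # The next chunk is due once this one has finished playing.
            next_deadline += len(chunk) / (SAMPLE_RATE * BYTES_PER_SAMPLE)


if __name__ == "__main__":
    sender = PacedSender(lambda c: print(f"sent {len(c)} bytes"))
    for _ in range(3):
        sender.push(b"\x00" * 4800)   # 2400 samples = 100 ms at 24kHz
    sender.drain()
```

Advancing a deadline, rather than sleeping a fixed duration per chunk, keeps send-time jitter from accumulating across a long transmission.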
Key modules:
- ai/realtime_client.py: RealtimeSession, an asyncio WebSocket client for the GPT-4o Realtime API; runs its event loop in a background daemon thread; exposes synchronous push_audio(), send_text(), reconnect(), close()
- audio/wfweb_client.py: WfwebClient, a WebSocket client for the wfweb browser protocol; handles RX audio frames, TX audio frames, PTT, and status/meter callbacks
- core/config.py: AppConfig, composed of AudioConfig, RadioConfig, WfwebConfig, RealtimeConfig; loaded from the environment or .env
- core/operator_profiles.py + core/operator_profiles_base.py: system prompt templates per operator style (CALLING_CQ, CONTESTING, MONITORING, SWL)
- core/band_utils.py: frequency → band name mapping
- core/adif_logger.py: ADIF QSO log writer
- ui/radio_ui.py: Rich terminal UI at 10 FPS; panels: header, PTT/TX-RX, S-meter/TX-meters, comms log, QSO bar
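The pattern described for RealtimeSession (an asyncio event loop in a background daemon thread, with synchronous methods on top) can be sketched with the standard library alone. `BackgroundLoop` and its methods are hypothetical and omit the WebSocket entirely; the point is only how sync callers hand work to the loop via `asyncio.run_coroutine_threadsafe`.

```python
import asyncio
import threading


class BackgroundLoop:
    """Hypothetical sketch: asyncio loop in a daemon thread with sync entry points."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()
        self.received = []

    async def _handle_audio(self, pcm: bytes) -> int:
        # A real client would send a WebSocket frame here.
        self.received.append(pcm)
        return len(pcm)

    def push_audio(self, pcm: bytes) -> None:
        """Fire-and-forget: schedule on the loop thread and return immediately."""
        asyncio.run_coroutine_threadsafe(self._handle_audio(pcm), self._loop)

    def push_audio_blocking(self, pcm: bytes, timeout: float = 1.0) -> int:
        """Schedule on the loop thread and wait for the result."""
        future = asyncio.run_coroutine_threadsafe(self._handle_audio(pcm), self._loop)
        return future.result(timeout)

    def close(self) -> None:
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join(timeout=1.0)
        self._loop.close()


if __name__ == "__main__":
    bg = BackgroundLoop()
    print(bg.push_audio_blocking(b"\x00\x01"))
    bg.close()
```

This arrangement lets audio callbacks running on other threads call into the session without ever touching the event loop directly.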
LLM integration: GPT-4o Realtime API via WebSocket (wss://api.openai.com/v1/realtime). Server-side VAD (server_vad mode) — no client-side VAD. Contact tracking uses GPT-4o function calling (update_contact tool). CQ calls injected as text via send_text() with ephemeral=True. reconnect() called between QSOs to clear conversation history.
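As a rough illustration of the server-side VAD and function-calling setup, the sketch below builds a `session.update` event as a plain dictionary. The field layout follows the Realtime API's session configuration event as commonly documented, but the exact nesting varies between API versions, so treat it as an assumption to check against the current reference. The `callsign` parameter and the tool description are hypothetical; only the tool name `update_contact` and its `closing` flag come from this document.

```python
import json


def build_session_update(vad_threshold: float = 0.5, vad_silence_sec: float = 0.6) -> dict:
    """Build a session.update event (layout is an assumption; verify against the API docs)."""
    return {
        "type": "session.update",
        "session": {
            "turn_detection": {
                "type": "server_vad",
                "threshold": vad_threshold,
                # The env setting is in seconds; the API field is in milliseconds.
                "silence_duration_ms": int(round(vad_silence_sec * 1000)),
            },
            "tools": [
                {
                    "type": "function",
                    "name": "update_contact",
                    "description": "Record details of the current contact (hypothetical wording).",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "callsign": {"type": "string"},   # hypothetical field
                            "closing": {"type": "boolean"},   # named in this document
                        },
                        "required": ["callsign"],
                    },
                }
            ],
            "tool_choice": "auto",
        },
    }


print(json.dumps(build_session_update(), indent=2))
```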
QSO state machine (in RadioOperator): CALLING_CQ → IN_QSO → QSO_ENDED → CALLING_CQ (or MONITORING/SWL variants). Contacts logged to ADIF via update_contact(closing=true) tool call.
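The CALLING_CQ cycle above can be sketched as a small transition table. The state names come from this document; the event names (`station_answered`, `contact_closed`, `session_reset`) are hypothetical, and the MONITORING/SWL variants are left out.

```python
from enum import Enum, auto


class QsoState(Enum):
    CALLING_CQ = auto()
    IN_QSO = auto()
    QSO_ENDED = auto()


# Event names are hypothetical; only the state names come from the document.
TRANSITIONS = {
    (QsoState.CALLING_CQ, "station_answered"): QsoState.IN_QSO,
    (QsoState.IN_QSO, "contact_closed"): QsoState.QSO_ENDED,    # e.g. update_contact(closing=true)
    (QsoState.QSO_ENDED, "session_reset"): QsoState.CALLING_CQ,  # e.g. after reconnect()
}


def next_state(state: QsoState, event: str) -> QsoState:
    """Return the next state, staying put on events that do not apply."""
    return TRANSITIONS.get((state, event), state)


state = QsoState.CALLING_CQ
for event in ("station_answered", "contact_closed", "session_reset"):
    state = next_state(state, event)
print(state.name)  # CALLING_CQ
```

Ignoring inapplicable events, rather than raising, suits a live audio loop where late or duplicate tool calls are plausible.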
All configuration is via environment variables or .env file. Key settings:
OPENAI_API_KEY=sk-... # Required
CALLSIGN=W1AW # Required
WFWEB_URL=wss://192.168.x.x:8080 # Required for production
YOUR_NAME=Hiram
LOCATION="Newington, CT"
ANTENNA="Dipole"
POWER=100W
TRANSCEIVER="IC-7300"
OPERATOR_STYLE=CALLING_CQ # CALLING_CQ | CONTESTING | MONITORING | SWL
REALTIME_MODEL=gpt-realtime-1.5
REALTIME_VOICE=ash
VAD_THRESHOLD=0.5
VAD_SILENCE_DURATION=0.6
CQ_INTERVAL_SEC=30
LOG_LEVEL=INFO
See README.md for the complete reference.
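A minimal sketch of validating these settings from the environment is shown below. `Settings` and `load_settings` are hypothetical names, not the repository's AppConfig; the fallback values simply mirror the example values above and are assumptions rather than documented defaults. The sketch reads a mapping such as `os.environ` and does not parse a .env file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_STYLES = {"CALLING_CQ", "CONTESTING", "MONITORING", "SWL"}


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    callsign: str
    wfweb_url: Optional[str]
    operator_style: str
    vad_threshold: float
    vad_silence_duration: float
    cq_interval_sec: int
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None, demo: bool = False) -> Settings:
    """Validate and convert settings; WFWEB_URL is only required outside demo mode."""
    env = os.environ if env is None else env
    required = ["OPENAI_API_KEY", "CALLSIGN"]
    if not demo:
        required.append("WFWEB_URL")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    style = env.get("OPERATOR_STYLE", "CALLING_CQ")
    if style not in VALID_STYLES:
        raise ValueError(f"unknown OPERATOR_STYLE: {style}")
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        callsign=env["CALLSIGN"],
        wfweb_url=env.get("WFWEB_URL") or None,
        operator_style=style,
        # Fallbacks below mirror the example values and are assumptions.
        vad_threshold=float(env.get("VAD_THRESHOLD", "0.5")),
        vad_silence_duration=float(env.get("VAD_SILENCE_DURATION", "0.6")),
        cq_interval_sec=int(env.get("CQ_INTERVAL_SEC", "30")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


demo_settings = load_settings({"OPENAI_API_KEY": "sk-test", "CALLSIGN": "W1AW"}, demo=True)
print(demo_settings.operator_style)  # CALLING_CQ
```

Failing fast on missing or malformed values at startup is usually preferable to discovering them mid-QSO.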
- macOS tested (Apple Silicon M4); no platform-specific dependencies in current architecture
- Requires Python >=3.10, <3.14
- No local STT, TTS, or VAD — all handled server-side by GPT-4o Realtime
- No Hamlib; all radio audio I/O and PTT go via the wfweb WebSocket. In production, sounddevice is used only for optional --monitor-audio playback
- radio_operator.py is large; most application logic lives there
- .env contains API keys; never commit it
- Demo mode logs to logs/demo_YYYYMMDD_HHMMSS.{log,adi} and never touches logs/contacts.adi