Changelog

All notable changes to this project are documented here.


[0.4.0] — 2026-02-26

Added

  • --monitor-audio flag: plays RX and TX audio locally through the computer's speakers via sounddevice, so you can follow the conversation without a radio headset
  • HF radio-themed terminal UI: compact header + S-meter (RX) / power+SWR (TX) + mode/frequency panel, updated at 10 FPS from live wfweb meter data
  • Callsign validation in update_contact tool handler — rejects strings without a digit (e.g. a name accidentally passed as a callsign)
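
The digit check described in the last item can be sketched roughly as follows. This is a minimal illustration rather than the project's actual handler code; the helper name `looks_like_callsign` is hypothetical, and the only rule it applies is the one stated above (a callsign must contain at least one digit).

```python
def looks_like_callsign(value: str) -> bool:
    """Return True if the string is non-empty and contains an ASCII digit.

    Hypothetical helper: it mirrors only the "must contain a digit" rule
    from the changelog entry, not a full callsign grammar.
    """
    text = value.strip()
    return bool(text) and any(ch in "0123456789" for ch in text)


if __name__ == "__main__":
    for candidate in ("W1AW", "EA4XYZ", "Maria"):
        verdict = "accepted" if looks_like_callsign(candidate) else "rejected"
        print(f"{candidate}: {verdict}")
```

A check this loose still accepts strings such as "abc1", so it is best read as a guard against the specific failure named above (a name passed in the callsign field), not as full validation.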

Changed

  • AudioConfig stripped to the two fields that are actually used: VAD_THRESHOLD and VAD_SILENCE_DURATION (server-VAD tuning for GPT-4o Realtime)
  • RadioConfig Hamlib PTT fields removed (PTT_ENABLED, PTT_RIG_MODEL, etc.)
  • AgentConfig removed entirely (LangChain legacy, no longer used)
  • vad_threshold removed from RadioUI constructor (was accepted but never used)

Fixed

  • VAD_SILENCE_DURATION env var name corrected in README (was incorrectly documented as SILENCE_DURATION_SEC)
  • ADIF log frequency race condition: _on_status could fire before adif_logger was created; fixed by caching _current_freq_mhz and reading it at logger init time
  • S-meter, power, and SWR bars now use physical units from wfweb (dB rel. S9, watts, SWR ratio) instead of wrongly assuming 0–255 raw values
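
The ADIF frequency fix above follows a common pattern: cache the latest value when the callback fires, then seed the late-created consumer from that cache. A minimal sketch, assuming a simplified structure: `_on_status`, `adif_logger`, and `_current_freq_mhz` are names from the entry above, while `Station`, `AdifLogger`, and `start_logging` are hypothetical stand-ins.

```python
from typing import Optional


class AdifLogger:
    """Hypothetical stand-in for the real logger; it only stores a frequency."""

    def __init__(self, freq_mhz: Optional[float]) -> None:
        self.freq_mhz = freq_mhz


class Station:
    """Hypothetical owner of the status callback and the logger."""

    def __init__(self) -> None:
        self._current_freq_mhz: Optional[float] = None
        self.adif_logger: Optional[AdifLogger] = None

    def _on_status(self, freq_mhz: float) -> None:
        # Always cache first, so a status message that arrives before the
        # logger exists is not lost.
        self._current_freq_mhz = freq_mhz
        if self.adif_logger is not None:
            self.adif_logger.freq_mhz = freq_mhz

    def start_logging(self) -> AdifLogger:
        # Seed the new logger from the cached value at init time.
        self.adif_logger = AdifLogger(self._current_freq_mhz)
        return self.adif_logger
```

With this shape the order of the two events no longer matters: a status update before `start_logging()` is picked up from the cache, and one after it is forwarded directly.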

[0.3.0] — 2026-02-25

Added

  • GPT-4o Realtime function calling for contact tracking: model calls update_contact(callsign, name, qth, notes, closing) incrementally during each QSO instead of emitting structured text
  • CONTACT_TRACKING prompt section in operator_profiles_base.py instructs the model when and how to call the tool
  • Tool result submission via function_call_output conversation items
  • on_speech_started / on_speech_stopped callbacks in RealtimeSession, wired to UI speech indicator
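
The tool declaration and result submission described above might look roughly like the events below. This is a sketch under assumptions: the parameter names come from the entry above, but their JSON types (for example `closing` as a boolean), the description text, and the helper function names are guesses, and the event shapes should be checked against the current OpenAI Realtime API reference.

```python
import json

# Assumed schema: field names are from the changelog, types are guesses.
UPDATE_CONTACT_TOOL = {
    "type": "function",
    "name": "update_contact",
    "description": "Record details of the current contact as they are heard.",
    "parameters": {
        "type": "object",
        "properties": {
            "callsign": {"type": "string"},
            "name": {"type": "string"},
            "qth": {"type": "string"},
            "notes": {"type": "string"},
            "closing": {"type": "boolean"},
        },
        "required": [],
    },
}


def session_update_event() -> dict:
    """Build a session.update event that registers the tool (hypothetical helper)."""
    return {
        "type": "session.update",
        "session": {"tools": [UPDATE_CONTACT_TOOL], "tool_choice": "auto"},
    }


def tool_result_event(call_id: str, result: dict) -> dict:
    """Build the function_call_output item that answers one tool call."""
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "function_call_output",
            "call_id": call_id,
            "output": json.dumps(result),  # output is sent as a JSON string
        },
    }
```

Leaving every field optional matches the incremental style described above, where the model can call the tool several times during one QSO as details come in.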

Removed

  • [QSO_LOG] text block parsing: _parse_and_update_qso_log, _strip_spoken_metadata, _end_qso_from_transcript, _QSO_CLOSING_PHRASES, QSO_LOGGING_FORMAT

Fixed

  • CQ timer now resets on QSO close, preventing an immediate re-CQ before the delay has elapsed
  • Message truncation ([:120]) removed from all transcript log sites
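
The CQ timer fix above amounts to restarting the delay whenever a QSO closes. A minimal sketch of that idea, with a hypothetical `CqTimer` class and an injectable clock so the behaviour can be checked without waiting:

```python
import time
from typing import Callable


class CqTimer:
    """Hypothetical timer: reports when the next CQ call is allowed."""

    def __init__(self, delay_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._delay_s = delay_s
        self._clock = clock
        self._last_reset = clock()

    def reset(self) -> None:
        # Call this when a QSO closes so the full delay applies again.
        self._last_reset = self._clock()

    def due(self) -> bool:
        return self._clock() - self._last_reset >= self._delay_s
```

Without the `reset()` call on QSO close, a long contact would leave the timer already expired, which is the immediate re-CQ behaviour the entry describes.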

[0.2.0] — 2026-02-24

Changed — architecture rewrite (Phase B)

Complete replacement of the local audio pipeline with a GPT-4o Realtime + wfweb architecture:

| Before (Phase A) | After (Phase B) |
|---|---|
| sounddevice (PortAudio) | wfweb WebSocket binary frames |
| Silero VAD (local) | GPT-4o Realtime server_vad |
| Whisper MLX / faster-whisper | GPT-4o Realtime STT |
| LangChain agent + LLM | GPT-4o Realtime LLM |
| Piper TTS | GPT-4o Realtime TTS |
| Hamlib PTT | wfweb {"cmd":"setPTT"} |

Added

  • audio/wfweb_client.py — WebSocket client speaking the wfweb browser protocol (RX audio, TX audio, PTT, status/meter messages)
  • TxBuffer — streams 24 kHz GPT-4o Realtime audio to wfweb in real-time-paced 20 ms chunks after resampling to 48 kHz
  • RealtimeSession — asyncio GPT-4o Realtime WebSocket session running in a background thread
  • WfwebConfig: WFWEB_URL, WFWEB_CONNECT_TIMEOUT
  • SWL (Short Wave Listener) receive-only operator mode
  • MONITORING operator mode (listens for direct calls, IDs every 5 minutes)
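
The TxBuffer behaviour described above (24 kHz in, 48 kHz out, 20 ms chunks sent at real-time pace) can be illustrated with a small standard-library sketch. Everything here is an assumption for illustration: the function names are hypothetical, samples are treated as plain integers, and linear interpolation stands in for whatever resampler the project actually uses.

```python
import time
from typing import Callable, Iterator, List

DST_RATE_HZ = 48_000
CHUNK_MS = 20
CHUNK_SAMPLES = DST_RATE_HZ * CHUNK_MS // 1000  # 960 samples per chunk


def upsample_2x(samples: List[int]) -> List[int]:
    """Double the sample rate by linear interpolation (simplified resampler)."""
    out: List[int] = []
    for i, s in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)
        out.append((s + nxt) // 2)  # midpoint between neighbouring samples
    return out


def chunks_20ms(samples: List[int]) -> Iterator[List[int]]:
    """Yield 960-sample chunks, zero-padding the final one."""
    for start in range(0, len(samples), CHUNK_SAMPLES):
        chunk = samples[start:start + CHUNK_SAMPLES]
        chunk += [0] * (CHUNK_SAMPLES - len(chunk))
        yield chunk


def send_paced(samples_24k: List[int],
               send: Callable[[List[int]], None],
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> None:
    """Send one chunk per 20 ms, pacing against absolute deadlines."""
    next_deadline = clock()
    for chunk in chunks_20ms(upsample_2x(samples_24k)):
        delay = next_deadline - clock()
        if delay > 0:
            sleep(delay)
        send(chunk)
        next_deadline += CHUNK_MS / 1000
```

Pacing against an absolute deadline, rather than sleeping a fixed 20 ms after each send, keeps timing errors from accumulating over a long transmission.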

Removed

  • sounddevice, PortAudio, Hamlib, Silero VAD, Whisper, Piper TTS, LangChain from the runtime path

[0.1.0] — 2026-01-xx

Initial working version (as demonstrated in the companion video).

Stack

  • Audio I/O: sounddevice (PortAudio)
  • VAD: Silero VAD (local, PyTorch)
  • STT: Whisper MLX (Apple Silicon) or faster-whisper
  • LLM: LangChain multi-provider agent (Anthropic, OpenAI, Groq, Ollama, …)
  • TTS: Piper TTS
  • PTT: Hamlib via rigctld

Features

  • QSO state machine: CALLING_CQ → IN_QSO → QSO_ENDED
  • ADIF contact logging
  • Rich terminal UI
  • CALLING_CQ and CONTESTING operator profiles
  • Weather lookup injected into system prompt
  • Station skip / hijack detection
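
The QSO state machine listed above can be sketched as an enum with an explicit transition table. The three state names are from the list; the `QsoState` and `advance` names are hypothetical, and the return from QSO_ENDED to CALLING_CQ is an assumption not stated in this changelog.

```python
from enum import Enum, auto


class QsoState(Enum):
    CALLING_CQ = auto()
    IN_QSO = auto()
    QSO_ENDED = auto()


# Allowed transitions. The last entry (back to CALLING_CQ) is an assumption.
ALLOWED = {
    QsoState.CALLING_CQ: {QsoState.IN_QSO},
    QsoState.IN_QSO: {QsoState.QSO_ENDED},
    QsoState.QSO_ENDED: {QsoState.CALLING_CQ},
}


def advance(current: QsoState, target: QsoState) -> QsoState:
    """Return the target state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Keeping the transitions in a table makes illegal jumps, such as going from CALLING_CQ straight to QSO_ENDED, fail loudly instead of silently corrupting the log.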