An example Electron application that streams microphone and system audio to OpenAI's Realtime API for real-time transcription. The app provides a simple interface to capture both microphone input and system audio output, transcribe them in real-time, and optionally record the combined audio as WAV files.
It is meant as a starting point for desktop applications that need this kind of capture and transcription. See electron-audio-loopback for details on how system audio is captured.
- Real-time Transcription: Stream microphone and system audio to OpenAI's Realtime API
- Dual Audio Capture: Simultaneously capture microphone input and system audio output
- Multiple Model Support: Choose from different OpenAI transcription models:
  - `whisper-1`
  - `gpt-4o-transcribe`
  - `gpt-4o-mini-transcribe`
- Audio Recording: Record combined microphone and system audio as WAV files
- Modern UI: Clean, dark-themed interface with real-time status indicators
- Cross-platform: Works on macOS, Windows, and Linux
- Node.js (v16 or higher)
- OpenAI API key with access to Realtime API
- Microphone and speakers/audio output device
- Clone or download this repository.
- Install dependencies: `npm install`
- Create a `.env` file in the project root and add your OpenAI API key: `OPENAI_KEY=your_openai_api_key_here`
- Start the application: `npm start`
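If the app starts but transcription never connects, a missing or placeholder key is a likely cause. The sketch below shows one way to fail fast at startup. It assumes the `.env` values have already been loaded into `process.env` (for example by `dotenv`, which this project lists as a dependency). The helper name `requireOpenAiKey` is hypothetical and not part of this repository.

```javascript
// Hypothetical startup check (not from this repository): make sure OPENAI_KEY
// is present and is not the placeholder value from the setup instructions.
function requireOpenAiKey(env = process.env) {
  const key = (env.OPENAI_KEY || "").trim();
  if (key === "" || key === "your_openai_api_key_here") {
    throw new Error(
      "OPENAI_KEY is missing or still set to the placeholder; check your .env file"
    );
  }
  return key;
}
```

Calling `requireOpenAiKey()` early in the main process would surface a configuration problem before any audio capture starts.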
- The app window will open with controls for:
  - Start Streaming: Begin capturing and transcribing audio
  - Stop Streaming: Stop audio capture and transcription
  - Start Recording: Begin recording combined audio as WAV file
  - Microphone Select: Choose input device
  - Model Select: Choose transcription model
- Status indicators show connection state for:
  - Microphone input
  - System audio output
  - Recording status
- Real-time transcription results appear in separate panels for microphone and system audio.
- Main Process (`main.js`): Electron main process with audio loopback initialization
- Renderer Process (`renderer.js`): Frontend logic for audio capture and API communication
- Preload Script (`preload.js`): Secure bridge between main and renderer processes
- Session Class: Manages WebRTC connections to OpenAI Realtime API
- WavRecorder Class: Handles audio recording and WAV file generation
- Audio Loopback: Uses `electron-audio-loopback` for system audio capture
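To make the WAV generation step concrete, here is a minimal sketch of how Float32 audio samples can be wrapped in a 16-bit PCM WAV container. It is not the repository's `WavRecorder` implementation. The function name `encodeWav` and its parameters are assumptions for illustration, and `samples` is assumed to hold interleaved values in the range -1 to 1.

```javascript
// Illustrative sketch, not the project's WavRecorder: encode Float32 samples
// as 16-bit PCM inside a standard 44-byte RIFF/WAVE header.
function encodeWav(samples, sampleRate, numChannels = 1) {
  const bytesPerSample = 2; // 16-bit PCM
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * numChannels * bytesPerSample, true); // byte rate
  view.setUint16(32, numChannels * bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each sample to [-1, 1] and scale it to the signed 16-bit range.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

The returned `ArrayBuffer` could be wrapped in a `Blob` in the renderer or written to disk from the main process. `ArrayBuffer` and `DataView` are available in both environments.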
- `electron`: Desktop application framework
- `electron-audio-loopback`: System audio capture
- `dotenv`: Environment variable management
This application requires an OpenAI API key with access to the Realtime API. The Realtime API is currently in beta and may require special access.
- No audio detected: Ensure microphone permissions are granted to the application
- System audio not captured: On macOS, grant microphone permissions in System Preferences (called System Settings on macOS 13 and later)
- API errors: Verify your OpenAI API key is valid and has Realtime API access
This project is provided as-is for educational and development purposes. Go crazy.
Alec Armbruster @alectrocute