Mic & Speaker Streamer

An example Electron application that streams microphone and system audio to OpenAI's Realtime API for real-time transcription. The app provides a simple interface to capture both microphone input and system audio output, transcribe each in real time, and optionally record the combined audio as a WAV file.

Use it as a starting point for building your own desktop transcription application. See electron-audio-loopback for more information on how system audio is captured.

Features

Real-time Transcription: Stream microphone and system audio to OpenAI's Realtime API
Dual Audio Capture: Simultaneously capture microphone input and system audio output
Multiple Model Support: Choose from different OpenAI transcription models:
- whisper-1
- gpt-4o-transcribe
- gpt-4o-mini-transcribe
Audio Recording: Record combined microphone and system audio as WAV files
Modern UI: Clean, dark-themed interface with real-time status indicators
Cross-platform: Works on macOS, Windows, and Linux

Prerequisites

Node.js (v16 or higher)
OpenAI API key with access to Realtime API
Microphone and speakers/audio output device

Installation

Clone or download this repository
Install dependencies:
```
npm install
```
Create a .env file in the project root and add your OpenAI API key:
```
OPENAI_KEY=your_openai_api_key_here
```
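If the key is missing or still set to the placeholder, API calls will fail later with a less obvious error, so it can help to check it at startup. The following is a minimal sketch, not code from this repository: `requireOpenAiKey` is a hypothetical helper, and it assumes dotenv (or your shell) has already populated `process.env` before it runs.

```javascript
// Hypothetical helper for illustration; not part of this repository.
// Assumes dotenv (or the shell) has already populated process.env.
function requireOpenAiKey(env = process.env) {
  const key = (env.OPENAI_KEY || '').trim();
  // Reject an empty value and the placeholder from the example above.
  if (!key || key === 'your_openai_api_key_here') {
    throw new Error('OPENAI_KEY is missing. Add it to the .env file in the project root.');
  }
  return key;
}
```

Calling this once before opening the main window would surface a misconfigured .env file immediately instead of at the first transcription request.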

Usage

Start the application:
```
npm start
```
The app window will open with controls for:
- Start Streaming: Begin capturing and transcribing audio
- Stop Streaming: Stop audio capture and transcription
- Start Recording: Begin recording combined audio as a WAV file
- Microphone Select: Choose input device
- Model Select: Choose transcription model
Status indicators show connection state for:
- Microphone input
- System audio output
- Recording status
Real-time transcription results appear in separate panels for microphone and system audio

Technical Details

Architecture

Main Process (main.js): Electron main process with audio loopback initialization
Renderer Process (renderer.js): Frontend logic for audio capture and API communication
Preload Script (preload.js): Secure bridge between main and renderer processes

Key Components

Session Class: Manages WebRTC connections to the OpenAI Realtime API
WavRecorder Class: Handles audio recording and WAV file generation
Audio Loopback: Uses electron-audio-loopback for system audio capture
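
To illustrate what combining two streams and generating a WAV file involves, here is a rough sketch. It is not the project's WavRecorder: `mixMono` and `encodeWav` are hypothetical names, and the sketch assumes mono Float32Array samples in the range [-1, 1], a simple averaging mix, and 16-bit PCM output. The actual class may handle channels, sample rates, and buffering differently.

```javascript
// Hypothetical helpers for illustration; the project's WavRecorder may differ.

// Mix two mono streams by averaging; the shorter stream is padded with silence.
function mixMono(a, b) {
  const out = new Float32Array(Math.max(a.length, b.length));
  for (let i = 0; i < out.length; i++) {
    out[i] = ((a[i] || 0) + (b[i] || 0)) / 2;
  }
  return out;
}

// Encode mono float samples in [-1, 1] as a 16-bit PCM WAV file (44-byte header).
function encodeWav(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const view = new DataView(new ArrayBuffer(44 + dataSize));
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);                // file size minus the first 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true);                          // fmt chunk size for PCM
  view.setUint16(20, 1, true);                           // audio format 1 = PCM
  view.setUint16(22, 1, true);                           // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);                          // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);

  // Clamp each sample and scale it to a signed 16-bit integer (little-endian).
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, Math.round(s < 0 ? s * 0x8000 : s * 0x7fff), true);
  }
  return new Uint8Array(view.buffer);
}
```

In a renderer process, the returned bytes could be wrapped in a `Blob` with type `audio/wav` to offer the recording as a download. Averaging keeps the mix within range without clipping, at the cost of halving the level of each source.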

Dependencies

electron: Desktop application framework
electron-audio-loopback: System audio capture
dotenv: Environment variable management

API Requirements

This application requires an OpenAI API key with access to the Realtime API. The Realtime API is currently in beta and may require special access.

Troubleshooting

No audio detected: Ensure microphone permissions are granted to the application
System audio not captured: On macOS, grant microphone permissions in System Preferences (called System Settings on macOS 13 and later)
API errors: Verify your OpenAI API key is valid and has Realtime API access
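
When capture fails, the error name on the rejected `getUserMedia` promise usually points to the cause. The sketch below maps the standard error names to hints like the ones above. `describeCaptureError` is a hypothetical helper, not something this app is known to include, and the hint wording is only a suggestion.

```javascript
// Hypothetical helper for illustration; not part of this repository.
// Maps standard getUserMedia rejection names to troubleshooting hints.
function describeCaptureError(err) {
  switch (err && err.name) {
    case 'NotAllowedError':
      return 'Permission denied: grant microphone access to the application.';
    case 'NotFoundError':
      return 'No audio input device found: check that a microphone is connected.';
    case 'NotReadableError':
      return 'The device could not be read: it may be in use by another application.';
    default:
      return 'Audio capture failed: ' + (err && err.message ? err.message : 'unknown error');
  }
}

// Possible use in a renderer process:
// navigator.mediaDevices.getUserMedia({ audio: true })
//   .catch((err) => console.error(describeCaptureError(err)));
```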

License

This project is provided as-is for educational and development purposes. Go crazy.

Author

Alec Armbruster @alectrocute
