Zen Live

Real-time speech translation for broadcast news monitoring. Powered by Hanzo AI.

Overview

Zen Live is a low-latency simultaneous translation service designed for news control rooms. It takes audio/video input in one language and outputs translated audio and captions in real time via WebRTC.

Default configuration: Spanish → English (configurable across 18 source languages and 15 target languages; see Supported Languages)

Features

  • ~200-500ms end-to-end latency - Suitable for live broadcast monitoring
  • WebRTC streaming - Simple browser-based consumption for control rooms
  • Multiple output formats - WebRTC, HTTP audio streams (PCM/WAV), SSE transcripts
  • Broadcast integration - Convert to SRT/RTMP/NDI via ffmpeg
  • Control room UI - Professional interface with transcript logging
  • Monitor mode - Clean fullscreen view for broadcast displays

Quick Start

Option 1: Docker Hub (Recommended)

# Pull the image from Docker Hub
docker pull zenlm/zen-live:latest

# Create a .env file with your configuration
cat > .env << 'EOF'   # quoted delimiter: values containing $ are written literally
# Hanzo API Configuration
HANZO_API_KEY=your_hanzo_api_key_here
HANZO_NODE_URL=https://your-hanzo-node.com

# Authentication (optional)
ZEN_LIVE_USER=admin
ZEN_LIVE_PASS=YourSecurePassword
EOF

# Run with mounted .env file
docker run -d --name zen-live \
  -p 8000:8000 \
  -v "$(pwd)/.env:/app/.env:ro" \
  --restart unless-stopped \
  zenlm/zen-live:latest

Option 2: Docker Compose

# Clone the repository
git clone https://github.com/zenlm/zen-live.git
cd zen-live

# Copy and configure .env
cp .env.example .env
nano .env  # Edit with your API keys

# Run with Docker Compose
docker-compose up -d

Option 3: Manual Installation

# Clone the repository
git clone https://github.com/zenlm/zen-live.git
cd zen-live

# Install dependencies
pip install -r requirements.txt

# Configure backend (choose one):
export HANZO_NODE_URL=http://localhost:3690   # Recommended: Hanzo Node
# or
export HANZO_API_KEY=your_hanzo_api_key         # Direct Hanzo API

# Run
python app.py

Open http://localhost:8000 in your browser.

Endpoints

Endpoint                 Description
/                        Control room web portal
/monitor                 Simplified broadcast display view
/monitor?autostart=1     Auto-start for OBS/video walls
/api/status              Service health check
/api/sessions            Active session list
/broadcast/info          Integration guide for engineers
/docs                    OpenAPI (Swagger) documentation
/redoc                   ReDoc API documentation
/openapi.json            OpenAPI JSON spec for code generation
/whip                    WHIP endpoint for broadcaster ingestion
/whep                    WHEP endpoint for WebRTC consumption
/outputs?webrtc_id=ID    SSE transcript stream
/audio/stream/ID         Raw PCM16 audio (24kHz)
/audio/wav/ID            WAV-wrapped audio stream
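
As a rough illustration of consuming the SSE transcript stream from Python, the sketch below parses generic Server-Sent Events framing (`data:` lines terminated by a blank line). The payload format of each event is not documented here, so the code yields raw strings; the helper names `transcript_url`, `iter_sse_data`, and `stream_transcripts` are hypothetical and not part of Zen Live.

```python
import urllib.parse
import urllib.request
from typing import Iterable, Iterator


def transcript_url(base_url: str, webrtc_id: str) -> str:
    """Build the /outputs URL for a session (helper name is illustrative)."""
    query = urllib.parse.urlencode({"webrtc_id": webrtc_id})
    return f"{base_url.rstrip('/')}/outputs?{query}"


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event found in an iterable of lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            buffer.append(value)


def stream_transcripts(base_url: str, webrtc_id: str) -> Iterator[str]:
    """Open the stream and yield raw event payloads (assumes no auth is required)."""
    with urllib.request.urlopen(transcript_url(base_url, webrtc_id)) as response:
        yield from iter_sse_data(chunk.decode("utf-8") for chunk in response)
```

A caller would loop over `stream_transcripts("http://server:8000", SESSION_ID)` and decode each payload according to whatever format the server actually emits.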

API Documentation

Full OpenAPI/Swagger documentation is available at /docs when the server is running.

  • Interactive Docs: http://your-server:8000/docs
  • ReDoc: http://your-server:8000/redoc
  • OpenAPI JSON: http://your-server:8000/openapi.json

Use the OpenAPI spec to generate client SDKs in any language.
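
Before generating a client, it can help to see which operations the spec exposes. The following is a minimal sketch that assumes `/openapi.json` returns a standard OpenAPI document with a top-level `paths` object; `fetch_spec` and `list_operations` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Dict, List, Tuple

# HTTP method keys permitted inside an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def fetch_spec(base_url: str) -> Dict[str, Any]:
    """Download and decode the OpenAPI document from a running server."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/openapi.json") as response:
        return json.load(response)


def list_operations(spec: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (METHOD, path) pairs sorted by path, then by method."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))
```

For example, `list_operations(fetch_spec("http://your-server:8000"))` would print the method and path of every documented operation.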

Backend Options

1. Hanzo Node (Recommended)

Connect to a Hanzo Node instance for managed translation infrastructure. Includes support for local Zen Omni models.

export HANZO_NODE_URL=http://your-hanzo-node:3690

2. Direct Hanzo API

Use the hosted Hanzo Zen Live API directly (requires an API key).

export HANZO_API_KEY=your_hanzo_key

Control Room Usage

For News Teams

  1. Open http://server:8000 in Chrome/Edge
  2. Select source language (default: Spanish)
  3. Select target language (default: English)
  4. Click Start to begin translation
  5. Listen to translated audio through speakers/headphones
  6. View live captions on screen

For Broadcast Engineers

OBS Integration:

Add Browser Source → http://server:8000/monitor?autostart=1

Direct Audio Monitoring:

ffplay -f s16le -ar 24000 -ac 1 http://server:8000/audio/stream/SESSION_ID

Convert to SRT:

ffmpeg -f s16le -ar 24000 -ac 1 -i http://server:8000/audio/stream/SESSION_ID \
       -c:a aac -f mpegts 'srt://broadcast-server:9000'

Convert to RTMP:

ffmpeg -f s16le -ar 24000 -ac 1 -i http://server:8000/audio/stream/SESSION_ID \
       -c:a aac -f flv rtmp://server/live/translation
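
When relays are launched from a script rather than by hand, the two ffmpeg invocations above can be assembled programmatically. This is a sketch only: `build_relay_command` is a hypothetical helper, and it simply reproduces the flags shown above, choosing the container from the target URL scheme.

```python
from typing import List

# Input flags describing the raw stream: signed 16-bit little-endian, 24 kHz, mono.
PCM_INPUT_ARGS = ["-f", "s16le", "-ar", "24000", "-ac", "1"]


def build_relay_command(base_url: str, session_id: str, target_url: str) -> List[str]:
    """Return an ffmpeg argv list that relays a session's audio to SRT or RTMP."""
    source = f"{base_url.rstrip('/')}/audio/stream/{session_id}"
    if target_url.startswith("srt://"):
        container = "mpegts"
    elif target_url.startswith("rtmp://"):
        container = "flv"
    else:
        raise ValueError(f"unsupported target scheme: {target_url}")
    return ["ffmpeg", *PCM_INPUT_ARGS, "-i", source,
            "-c:a", "aac", "-f", container, target_url]
```

The resulting list can be passed to `subprocess.run(...)`; passing a list rather than a shell string avoids quoting problems in session IDs and URLs.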

Supported Languages

Source Languages

Spanish, English, Chinese, Portuguese, French, German, Russian, Italian, Korean, Japanese, Cantonese, Indonesian, Vietnamese, Thai, Arabic, Hindi, Greek, Turkish

Target Languages

English, Chinese, Russian, French, German, Portuguese, Spanish, Italian, Korean, Japanese, Cantonese, Indonesian, Vietnamese, Thai, Arabic
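
The two lists are not symmetric: Hindi, Greek, and Turkish appear only as source languages. A small check such as the one below can catch an unsupported pair before a session starts. The names `SOURCE_LANGUAGES`, `TARGET_LANGUAGES`, and `is_supported_pair` are illustrative, and the identifiers the service itself expects (full names or language codes) are not specified here.

```python
SOURCE_LANGUAGES = frozenset({
    "Spanish", "English", "Chinese", "Portuguese", "French", "German",
    "Russian", "Italian", "Korean", "Japanese", "Cantonese", "Indonesian",
    "Vietnamese", "Thai", "Arabic", "Hindi", "Greek", "Turkish",
})

TARGET_LANGUAGES = frozenset({
    "English", "Chinese", "Russian", "French", "German", "Portuguese",
    "Spanish", "Italian", "Korean", "Japanese", "Cantonese", "Indonesian",
    "Vietnamese", "Thai", "Arabic",
})


def is_supported_pair(source: str, target: str) -> bool:
    """True when both ends of the pair appear in the lists above."""
    return source in SOURCE_LANGUAGES and target in TARGET_LANGUAGES


# Languages that can be translated from but not into.
SOURCE_ONLY = sorted(SOURCE_LANGUAGES - TARGET_LANGUAGES)
```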

Audio Format

  • Encoding: PCM16 signed little-endian
  • Sample Rate: 24,000 Hz
  • Channels: Mono (1)
  • Bits per Sample: 16
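
These parameters imply 24,000 samples × 2 bytes × 1 channel = 48,000 bytes per second of audio. The sketch below uses that figure to compute the duration of a captured buffer and wraps raw PCM in a WAV container with Python's standard `wave` module. The function names are illustrative, and this is not how the `/audio/wav/ID` endpoint is necessarily implemented.

```python
import io
import wave

SAMPLE_RATE_HZ = 24_000
CHANNELS = 1
SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
BYTES_PER_SECOND = SAMPLE_RATE_HZ * CHANNELS * SAMPLE_WIDTH_BYTES


def pcm_duration_seconds(pcm: bytes) -> float:
    """Duration of a raw PCM16 mono buffer at 24 kHz."""
    return len(pcm) / BYTES_PER_SECOND


def pcm16_to_wav(pcm: bytes) -> bytes:
    """Wrap raw PCM16 little-endian samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

Bytes read from `/audio/stream/ID` can be passed to `pcm16_to_wav` to produce a file that ordinary audio players open without extra format flags.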

Environment Variables

Variable          Description                            Default
HANZO_NODE_URL    Hanzo Node backend URL                 -
HANZO_API_KEY     Hanzo API key (fallback)               -
ZEN_OMNI_PATH     Local model path                       -
ZEN_LIVE_USER     HTTP Basic Auth username (optional)    -
ZEN_LIVE_PASS     HTTP Basic Auth password (optional)    -
WHIP_ENABLED      Enable WHIP ingestion endpoint         true
PORT              Server port                            8000
MODE              UI for Gradio, PHONE for FastPhone     -
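
The table suggests a precedence between the two backends: HANZO_NODE_URL is the recommended option and HANZO_API_KEY is described as a fallback. The sketch below shows one way a deployment script might resolve these settings. It is an assumption about the intended precedence, not a copy of the logic in app.py, and `resolve_backend` and `resolve_port` are hypothetical names.

```python
import os
from typing import Mapping, Tuple


def resolve_backend(env: Mapping[str, str]) -> Tuple[str, str]:
    """Pick a backend: Hanzo Node if configured, otherwise the direct API key."""
    node_url = env.get("HANZO_NODE_URL", "").strip()
    if node_url:
        return ("node", node_url)
    api_key = env.get("HANZO_API_KEY", "").strip()
    if api_key:
        return ("api", api_key)
    raise RuntimeError("Set HANZO_NODE_URL or HANZO_API_KEY")


def resolve_port(env: Mapping[str, str]) -> int:
    """Server port, falling back to the documented default of 8000."""
    return int(env.get("PORT", "8000"))


if __name__ == "__main__":
    # Fails with RuntimeError if neither backend variable is set.
    print(resolve_backend(os.environ), resolve_port(os.environ))
```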

Authentication

When both ZEN_LIVE_USER and ZEN_LIVE_PASS are set, HTTP Basic Authentication is required for the control room UI and monitor pages. API endpoints remain accessible for integration.

export ZEN_LIVE_USER=operator@news.com
export ZEN_LIVE_PASS=your_secure_password
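
Scripts that fetch the protected pages need to send the credentials in an `Authorization` header. The following minimal sketch builds a standard HTTP Basic header with the Python standard library; `basic_auth_header` and `authenticated_request` are illustrative names, and credentials should only be sent this way over a trusted network or HTTPS.

```python
import base64
import urllib.request


def basic_auth_header(user: str, password: str) -> str:
    """Encode credentials as an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def authenticated_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Build a request for a protected page such as /monitor."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(user, password))
    return request
```

Passing the result to `urllib.request.urlopen(...)` performs the authenticated fetch.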

Architecture

┌─────────────────┐      WebRTC       ┌──────────────────┐
│  Control Room   │◄─────────────────►│    Zen Live      │
│   Browser UI    │                   │  (FastRTC/ASGI)  │
└─────────────────┘                   └────────┬─────────┘
                                               │
                                    ┌──────────┴──────────┐
                                    │                     │
                              ┌─────▼─────┐        ┌──────▼─────┐
                              │   Hanzo   │        │    Hanzo   │
                              │   Node    │        │    API     │
                              └───────────┘        └────────────┘
                                    │                     │
                                    └──────────┬──────────┘
                                               │
                                    ┌──────────▼────────────┐
                                    │  Zen Omni + Zen Live  │
                                    │     (Backend)         │
                                    └───────────────────────┘

License

Apache 2.0
