Open-source voice AI orchestration for India
Build production voice agents in 22+ Indian languages. Plug in any STT, LLM, or TTS provider. Ship compliant, cost-tracked conversations with no vendor lock-in.
Website · Live Demo · Architecture · Tech Specs
Convox is a self-hosted voice AI orchestration platform built on Pipecat. It sits between your application and voice AI providers, adding:
- Provider orchestration: swap STT, TTS, LLM, and telephony providers without changing pipeline code
- Indian language support: first-class support for Hindi, Tamil, Telugu, Bengali, Marathi, Kannada, and all 22 scheduled languages
- Cost tracking: per-session, per-provider cost attribution for every conversation
- Compliance engine: DPDP Act compliant with consent tracking, audit logging, and data retention policies
- Dashboard: monitor active calls, review transcripts, analyze costs, and configure agents
┌──────────────────────────────────────────────────────────┐
│                     CONVOX PLATFORM                      │
│                                                          │
│ ┌───────────┐ ┌───────────┐ ┌──────────────────────┐     │
│ │ Dashboard │ │ Core API  │ │  Compliance Engine   │     │
│ │  (React)  │ │ (FastAPI) │ │    (DPDP / HIPAA)    │     │
│ └───────────┘ └───────────┘ └──────────────────────┘     │
│                                                          │
│ ┌──────────────────────────────────────────────────────┐ │
│ │            ORCHESTRATION LAYER (Pipecat)             │ │
│ │   Agent Runtime · Pipeline Execution · Turn Taking   │ │
│ └──────────────────────────────────────────────────────┘ │
│                                                          │
│ ┌──────────────────────────────────────────────────────┐ │
│ │                PROVIDER PLUGIN LAYER                 │ │
│ │     STT · LLM · TTS · Telephony (all swappable)      │ │
│ └──────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
| Category | Providers |
|---|---|
| STT | Sarvam AI, Deepgram, NVIDIA Riva, Azure Speech, OpenAI Whisper |
| LLM | OpenAI, Anthropic Claude, Sarvam Saarika, Groq |
| TTS | Sarvam AI, Gnani Vachana, ElevenLabs, Azure Neural |
| Telephony | Exotel, Twilio |
Every provider is a pluggable module implementing a standard interface. Add your own with a single Python class.
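As a rough illustration of the plugin idea, the sketch below defines a hypothetical `STTProvider` base class and a name-keyed registry. The class names, the `transcribe` signature, and the `REGISTRY` dictionary are assumptions made for this example, not Convox's actual interface; the real one lives in the `providers/` package.

```python
from abc import ABC, abstractmethod


class STTProvider(ABC):
    """Hypothetical speech-to-text interface (not Convox's real API)."""

    name: str

    @abstractmethod
    def transcribe(self, audio: bytes, language: str) -> str:
        """Return the transcript for one chunk of audio."""


# Hypothetical registry mapping provider names to provider classes.
REGISTRY: dict[str, type[STTProvider]] = {}


def register(cls: type[STTProvider]) -> type[STTProvider]:
    """Class decorator that makes a provider selectable by name."""
    REGISTRY[cls.name] = cls
    return cls


@register
class EchoSTT(STTProvider):
    """Toy provider: decodes the bytes as UTF-8 instead of calling a service."""

    name = "echo"

    def transcribe(self, audio: bytes, language: str) -> str:
        return audio.decode("utf-8")


def get_stt(name: str) -> STTProvider:
    """Instantiate a provider by its configured name."""
    return REGISTRY[name]()
```

With this shape, swapping providers is a configuration change: the pipeline asks for `get_stt("echo")` (or any other registered name) and only depends on the shared `transcribe` method.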
| Layer | Technology |
|---|---|
| Voice Pipeline | Pipecat |
| API Backend | Python 3.12+ / FastAPI / asyncpg (no ORM) |
| Database | PostgreSQL 17 / Redis 7 |
| Migrations | dbmate (plain SQL) |
| Frontend | Vite / React 19 / TypeScript / Tailwind v4 |
| Infrastructure | Docker Compose / single-container monolith |
convox/
├── api/
│   ├── convox/              # Python package
│   │   ├── app.py           # FastAPI app factory
│   │   ├── config.py        # Pydantic settings (env vars)
│   │   ├── database/        # asyncpg + Redis connections
│   │   ├── handler/         # HTTP route handlers
│   │   ├── middleware/      # CORS, logging
│   │   ├── model/           # Pydantic schemas
│   │   ├── providers/       # STT/LLM/TTS/telephony plugins
│   │   ├── repository/      # Data access (raw SQL)
│   │   ├── service/         # Business logic
│   │   ├── compliance/      # DPDP compliance module
│   │   └── ws/              # WebSocket handlers
│   ├── migrations/          # dbmate SQL migrations (8 tables)
│   ├── tests/               # pytest suite
│   └── pyproject.toml       # Python deps (uv)
│
├── web/
│   ├── src/
│   │   ├── routes/          # Page components
│   │   ├── components/      # Reusable UI components
│   │   ├── lib/             # API client, utilities
│   │   └── types/           # TypeScript types
│   └── vite.config.ts
│
├── docs/                    # Architecture & tech specs
├── docker-compose.yml       # Full stack: postgres + redis + app
├── Dockerfile               # Multi-stage build
└── Makefile                 # Dev commands
- Docker & Docker Compose
- Python 3.12+ and uv
- Node.js 20+ and Bun (for frontend dev)
git clone https://github.com/rohansx/convox.git
cd convox
cp .env.example .env
# Edit .env with your provider API keys

make db-up        # Start PostgreSQL
make db-migrate   # Run all migrations

# Backend (from the repository root)
cd api
uv sync --all-extras
uv run uvicorn convox.app:app --reload --port 8000

# Frontend (from the repository root, in a second terminal)
cd web
bun install
bun run dev

# Run the full stack with Docker Compose
docker compose up --build

The dashboard will be available at http://localhost:5173 and the API at http://localhost:8000.
| Endpoint | Description |
|---|---|
| /health | Health check |
| /v1/agents | CRUD for agent definitions |
| /v1/sessions | Call session lifecycle |
| /v1/sessions/{id}/transcript | Stored transcripts |
| /v1/analytics/overview | Cost & latency metrics |
| /v1/providers | Provider configuration |
| /v1/compliance/dpdp/consents | DPDP consent records |
| /ws/call | Real-time audio WebSocket |
See docs/tech-specs.md for the complete API reference.
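For a quick feel of how a client might talk to these endpoints, here is a minimal standard-library sketch. It assumes the API is reachable at `http://localhost:8000` (the local development address) and that endpoints return JSON; the helper names `transcript_url` and `get_json` are made up for this example, and response shapes are deliberately not shown because they are documented in the tech specs rather than here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumed local development address


def transcript_url(session_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for a session's stored transcript."""
    # Percent-encode the id so unusual characters cannot alter the path.
    return f"{base_url}/v1/sessions/{quote(session_id, safe='')}/transcript"


def get_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the API running, `get_json(f"{BASE_URL}/health")` would fetch the health check and `get_json(transcript_url("<session id>"))` would fetch a transcript.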
Inbound Call (Exotel/Twilio)
        │
        ▼
Session Created → Compliance check (DPDP consent)
        │
        ▼
Audio Stream → STT (Sarvam/Deepgram) → Transcript
        │
        ▼
Transcript → LLM (Claude/GPT-4o) → Response
        │
        ▼
Response → TTS (ElevenLabs/Gnani) → Audio
        │
        ▼
Audio → Back to caller
        │
        ▼
Log: transcript, latency, cost per provider
Every step is observable, swappable, and cost-tracked.
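The flow above can be sketched as a chain of stages where each call is timed and tagged with its provider and cost. This is only an illustration: the stage functions are stand-in lambdas, the per-call costs are invented numbers, and `StageLog`, `TurnLog`, `run_stage`, and `run_turn` are hypothetical names rather than Convox internals.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class StageLog:
    stage: str
    provider: str
    latency_ms: float
    cost: float


@dataclass
class TurnLog:
    stages: list[StageLog] = field(default_factory=list)

    def cost_by_provider(self) -> dict[str, float]:
        """Sum the recorded cost of every stage, grouped by provider."""
        totals: dict[str, float] = {}
        for s in self.stages:
            totals[s.provider] = totals.get(s.provider, 0.0) + s.cost
        return totals


def run_stage(log: TurnLog, stage: str, provider: str,
              fn: Callable[[Any], Any], payload: Any, cost: float) -> Any:
    """Run one stage, recording its latency and an (assumed) flat cost."""
    start = time.perf_counter()
    result = fn(payload)
    elapsed_ms = (time.perf_counter() - start) * 1000
    log.stages.append(StageLog(stage, provider, elapsed_ms, cost))
    return result


def run_turn(audio: bytes, log: TurnLog) -> bytes:
    """One conversational turn: audio in, audio out."""
    # Stand-in callables; a real pipeline would call provider plugins here.
    text = run_stage(log, "stt", "sarvam", lambda a: a.decode(), audio, 0.002)
    reply = run_stage(log, "llm", "claude", lambda t: f"You said: {t}", text, 0.010)
    return run_stage(log, "tts", "elevenlabs", lambda r: r.encode(), reply, 0.005)
```

Because every stage goes through `run_stage`, swapping a provider changes only the callable and the label, while latency and cost attribution keep working unchanged.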
Convox is built India-first, not India-as-afterthought:
| Language | Code | STT | TTS |
|---|---|---|---|
| Hindi | hi | Sarvam, Deepgram | Sarvam, Gnani |
| Tamil | ta | Sarvam | Sarvam, Gnani |
| Telugu | te | Sarvam | Sarvam, Gnani |
| Bengali | bn | Sarvam | Sarvam, Gnani |
| Marathi | mr | Sarvam | Sarvam, Gnani |
| Kannada | kn | Sarvam | Sarvam, Gnani |
| Gujarati | gu | Sarvam | Sarvam |
| Malayalam | ml | Sarvam | Sarvam |
| Punjabi | pa | Sarvam | Sarvam |
| Odia | or | Sarvam | Sarvam |
| Assamese | as | Sarvam | Sarvam |
| English | en | All providers | All providers |
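One way to use this table is as routing data: given a language code and a preferred provider, fall back to a provider that actually serves that language. The sketch below copies the availability from the table; the lowercase provider identifiers and the `pick_provider` helper are assumptions for illustration, not Convox configuration keys.

```python
# Provider availability per language code, copied from the table above.
STT_SUPPORT = {
    "hi": ["sarvam", "deepgram"],
    "ta": ["sarvam"], "te": ["sarvam"], "bn": ["sarvam"],
    "mr": ["sarvam"], "kn": ["sarvam"], "gu": ["sarvam"],
    "ml": ["sarvam"], "pa": ["sarvam"], "or": ["sarvam"], "as": ["sarvam"],
}

TTS_SUPPORT = {
    "hi": ["sarvam", "gnani"], "ta": ["sarvam", "gnani"],
    "te": ["sarvam", "gnani"], "bn": ["sarvam", "gnani"],
    "mr": ["sarvam", "gnani"], "kn": ["sarvam", "gnani"],
    "gu": ["sarvam"], "ml": ["sarvam"], "pa": ["sarvam"],
    "or": ["sarvam"], "as": ["sarvam"],
}


def pick_provider(support: dict[str, list[str]], language: str, preferred: str) -> str:
    """Return `preferred` if it serves `language`, else the first listed provider."""
    if language == "en":
        return preferred  # the table lists every provider for English
    candidates = support.get(language)
    if not candidates:
        raise ValueError(f"no provider listed for language {language!r}")
    return preferred if preferred in candidates else candidates[0]
```

For example, asking for Deepgram STT in Tamil falls back to Sarvam, while the same request in Hindi keeps Deepgram.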
The compliance engine is modular, so you can enable only what you need:
- DPDP Act (India): voice consent capture, consent storage, configurable retention TTLs, right-to-erasure, 72-hour breach notification, audit export
- HIPAA: planned
- GDPR: planned
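To make a few of the DPDP items concrete, the sketch below shows a consent gate, a retention TTL check, and the 72-hour notification deadline as plain functions. The `ConsentRecord` fields and all function names are hypothetical, chosen for this example only; the actual behaviour is defined by the `compliance/` module.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ConsentRecord:
    """Hypothetical shape of a stored consent record."""
    caller_id: str
    granted_at: datetime
    withdrawn: bool = False


def has_valid_consent(record: Optional[ConsentRecord]) -> bool:
    """A call may proceed only if consent exists and has not been withdrawn."""
    return record is not None and not record.withdrawn


def is_expired(stored_at: datetime, retention: timedelta, now: datetime) -> bool:
    """True once a stored item has reached its configured retention TTL."""
    return now - stored_at >= retention


def breach_notification_deadline(detected_at: datetime) -> datetime:
    """Latest time to notify under a 72-hour window."""
    return detected_at + timedelta(hours=72)
```

Keeping these as small pure functions makes each rule easy to unit test and to enable or disable independently, which fits the modular design described above.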
# Run tests
cd api && uv run pytest

# Lint
cd api && uv run ruff check .

# Type check frontend
cd web && npx tsc --noEmit

# Build frontend for production
cd web && bun run build

Convox is Apache 2.0 licensed. We welcome contributions:
- Fork the repo
- Create a feature branch (git checkout -b feature/amazing-thing)
- Commit your changes
- Push to your branch
- Open a Pull Request
Built for India. Open to the world.