VoxGlide

Embeddable voice AI SDK for web pages
Speak to fill forms, click buttons, navigate, and ask questions. Works with any LLM provider.



Features

  • Voice & text input — Browser Speech API with automatic text fallback
  • Form filling — Detects fields, fills values, triggers React/Vue/Angular change detection
  • Smart page scanning — Auto-discovers forms, headings, navigation, interactive elements
  • Multi-LLM support — Gemini, OpenAI, Anthropic, Ollama (any OpenAI-compatible API)
  • Themeable UI — Presets, sizes, full color control
  • Conversation workflows — Guided multi-step flows with validation
  • Accessibility — ARIA live regions, keyboard shortcuts, screen reader tools
  • Zero dependencies — Self-contained SDK with Shadow DOM isolation
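The framework change detection mentioned in the form-filling bullet relies on a well-known trick: setting an input's value through the prototype's native setter, then dispatching a bubbling `input` event that React/Vue/Angular listeners pick up. A minimal sketch of the technique (an assumption about the approach, not the SDK's actual code):

```javascript
// Sketch: set an input's value so framework bindings notice the change.
// Frameworks often track or patch the instance `value` property, so we
// call the prototype's native setter directly, then dispatch a bubbling
// 'input' event for their listeners.
function setNativeValue(input, value) {
  const proto = Object.getPrototypeOf(input);
  const setter = Object.getOwnPropertyDescriptor(proto, 'value').set;
  setter.call(input, value);
  input.dispatchEvent(new Event('input', { bubbles: true }));
}
```

In a browser, `Object.getPrototypeOf(input)` resolves to `HTMLInputElement.prototype`, which owns the native `value` accessor.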

Architecture

Browser (SDK)                    Server (proxy)
─────────────                    ──────────────
SpeechRecognition → text ──WS──→ Receives text
Execute DOM actions ←──WS──────← LLM tool calls
SpeechSynthesis (TTS)            LLM API (holds key)
Page context scanning            Session/history mgmt
Shadow DOM UI                    Context caching
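The diagram above shows transcribed text flowing up and LLM tool calls flowing back down over a single WebSocket. A sketch of what the two directions might look like — the message shapes and field names here are illustrative assumptions, not VoxGlide's actual wire format:

```javascript
// Browser -> proxy: the SDK sends transcribed speech as a text message.
// (Envelope shape is an assumption for this sketch.)
function transcriptMessage(text, sessionId) {
  return JSON.stringify({ type: 'transcript', sessionId, text });
}

// Proxy -> browser: the proxy forwards LLM tool calls, which the SDK
// dispatches to DOM actions (fill a field, click a button, navigate).
function dispatchServerMessage(raw, handlers) {
  const msg = JSON.parse(raw);
  if (msg.type === 'tool_call') {
    return handlers[msg.name]?.(msg.args);
  }
  return undefined;
}
```

Keeping the browser side free of API keys is the point of the split: the proxy holds credentials and talks to the LLM, while the SDK only ever sees text and tool calls.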

Quick Start

1. Start the server

The server is a thin proxy that holds your API key, so the key never reaches the browser.

cd server && npm install
GEMINI_API_KEY=your-key npm run dev
Other LLM providers:
# OpenAI / GPT
OPENAI_API_KEY=your-key LLM_PROVIDER=openai npm run dev

# Anthropic / Claude
ANTHROPIC_API_KEY=your-key LLM_PROVIDER=anthropic npm run dev

# Ollama (local, no key needed)
LLM_PROVIDER=ollama npm run dev

2. Add the SDK

Script tag (IIFE):

<script src="https://your-server.com/sdk/voice-sdk.iife.js"></script>
<script>
  const sdk = new VoxGlide.VoiceSDK({
    serverUrl: 'wss://your-server.com',
  });
</script>
ES module import:
import { VoiceSDK } from 'voxglide';

const sdk = new VoiceSDK({
  serverUrl: 'wss://your-server.com',
  autoContext: true,
  tts: true,
});

That's it. The SDK auto-discovers forms and interactive elements on the page.

Configuration

const sdk = new VoiceSDK({
  serverUrl: 'wss://your-server.com',  // Required
  autoContext: true,                     // Auto-scan DOM for context
  context: 'This is a checkout page',   // Developer-supplied context
  language: 'en-US',                     // Speech recognition language
  tts: true,                             // Enable browser text-to-speech
  ui: { theme: 'ocean', size: 'md' },   // UI theming
  debug: false,                          // Verbose logging
  autoReconnect: true,                   // Reconnect after navigation
});

See docs/configuration.md for full configuration reference.

Custom Tools

Pages can expose tools via window.nbt_functions — the SDK auto-discovers them:

<script>
  window.nbt_functions = {
    lookupOrder: {
      description: 'Look up an order by ID',
      parameters: {
        orderId: { type: 'string', description: 'The order ID', required: true },
      },
      handler: async (args) => {
        return await fetch(`/api/orders/${args.orderId}`).then(r => r.json());
      },
    },
  };
</script>

You can also register tools via SDK config or at runtime. See the custom tools docs.
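Since the SDK discovers tools on window.nbt_functions, one way to add a tool at runtime is simply to extend that object (whether the SDK re-scans after initial load is an assumption here). The tool name, the validation helper, and its field names below are hypothetical, illustrating the declared parameter schema from the example above:

```javascript
// Hypothetical tool following the window.nbt_functions shape shown above.
const applyCoupon = {
  description: 'Apply a discount coupon to the cart',
  parameters: {
    code: { type: 'string', description: 'Coupon code', required: true },
  },
  handler: async ({ code }) => ({ applied: code.toUpperCase() }),
};

// Helper (not part of the SDK): check call args against the tool's
// declared parameter schema; returns the names of missing required args.
function checkRequiredArgs(tool, args) {
  return Object.entries(tool.parameters)
    .filter(([name, p]) => p.required && !(name in args))
    .map(([name]) => name);
}

// Register at runtime by merging into the discovery object:
// window.nbt_functions = { ...window.nbt_functions, applyCoupon };
```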

Documentation

Topic                     Link
Configuration & theming   docs/configuration.md
Custom tools              docs/custom-tools.md
Conversation workflows    docs/workflows.md
Events reference          docs/events.md
Server setup              docs/server.md
Architecture overview     docs/architecture.md

Examples

The examples/ directory contains demo pages.

Contributing

We welcome contributions! See CONTRIBUTING.md for setup instructions, code style, and PR guidelines.

git clone https://github.com/billiax/voxglide.git
cd voxglide && npm install
npm run check    # typecheck + lint + test

License

MIT
