@mux/ai

A TypeScript SDK of easy-to-use, purpose-driven, cost-effective, configurable workflow functions for building AI-powered video and audio workflows on the server. Powered by Mux, with support for popular AI/LLM providers (OpenAI, Anthropic, Google).

Turn your Mux video and audio assets into structured, actionable data — summaries, chapters, moderation scores, translations, embeddings, and more — with a single function call. @mux/ai handles fetching media data from Mux, formatting it for AI providers, and returning typed results so you can focus on building your product instead of wrangling prompts and media pipelines.

Quick Start

Install

npm install @mux/ai

Configure

Add your credentials to a .env file (we support dotenv):

MUX_TOKEN_ID=your_mux_token_id
MUX_TOKEN_SECRET=your_mux_token_secret
OPENAI_API_KEY=your_openai_api_key          # or ANTHROPIC_API_KEY, GOOGLE_GENERATIVE_AI_API_KEY

You only need credentials for the AI provider you're using. See the Credentials guide for full setup details including signed playback, S3 storage, and all supported providers.

For multi-tenant apps or cases where you need to provide API keys at runtime rather than through environment variables, every workflow accepts a credentials option. You can also register a global credentials provider with setWorkflowCredentialsProvider() for dynamic key resolution (e.g. per-tenant secrets). When using Workflow DevKit, credentials can be encrypted before crossing workflow boundaries so plaintext secrets never appear in serialized payloads.
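
A minimal sketch of both approaches, assuming illustrative credential field names (check the Credentials guide and API Reference for the exact shapes and import paths):

import { getSummaryAndTags } from "@mux/ai/workflows";
import { setWorkflowCredentialsProvider } from "@mux/ai";

// Per-call credentials. The field names below are illustrative placeholders,
// and `tenant` stands in for your own tenant record.
const summary = await getSummaryAndTags("your-asset-id", {
  provider: "openai",
  credentials: {
    mux: { tokenId: tenant.muxTokenId, tokenSecret: tenant.muxTokenSecret },
    openai: { apiKey: tenant.openaiApiKey },
  },
});

// Or resolve credentials dynamically for every workflow call.
// `loadCredentialsForTenant` is a hypothetical helper you would implement.
setWorkflowCredentialsProvider(async () => loadCredentialsForTenant(currentTenantId));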

Run Your First Workflow

import { getSummaryAndTags } from "@mux/ai/workflows";

const result = await getSummaryAndTags("your-asset-id", {
  provider: "openai",
  tone: "professional",
  includeTranscript: true
});

console.log(result.title);        // "Getting Started with TypeScript"
console.log(result.description);  // "A comprehensive guide to..."
console.log(result.tags);         // ["typescript", "tutorial", "programming"]

⚠️ Note: Many workflows rely on transcripts for best results. Consider enabling auto-generated captions on your Mux assets to unlock the full potential of transcript-based workflows like summarization, chapters, and embeddings. This applies to both video and audio-only assets.
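
Auto-generated captions are requested at asset creation time through the separate @mux/mux-node SDK rather than through @mux/ai. A sketch (field names follow the Mux asset creation API; adjust to your SDK version):

import Mux from "@mux/mux-node";

// Reads MUX_TOKEN_ID and MUX_TOKEN_SECRET from the environment
const mux = new Mux();

// Ask Mux to auto-generate English captions for the new asset
const asset = await mux.video.assets.create({
  input: [
    {
      url: "https://example.com/video.mp4",
      generated_subtitles: [{ language_code: "en", name: "English (auto)" }],
    },
  ],
  playback_policy: ["public"],
});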

Why @mux/ai?

  • Pre-built workflows for media AI tasks. Common multi-step operations (transcript access, frame analysis, LLM calls, and structured parsing) are available as high-level functions.
  • Support for video and audio assets. The same workflows work with video and audio-only assets, including summarization, moderation, chaptering, and more.
  • Provider-flexible API. Choose OpenAI, Anthropic, or Google through workflow options while keeping the same workflow interface.
  • Published evaluation coverage. Workflows include evals for quality, latency, and cost, with results published publicly on pushes to main.
  • Sensible default models. Defaults (gpt-5.1, claude-sonnet-4-5, gemini-3-flash-preview) are selected to balance output quality and runtime cost.
  • Typed end-to-end. Workflow inputs, options, and outputs are fully typed in TypeScript.
  • Operational defaults included. Retry handling, error handling, signed playback support, and Workflow DevKit compatibility are built in.
  • Prompt customization support. Use promptOverrides to adjust sections of workflow prompts for your domain or product requirements.
  • Composable abstractions. Start with full workflows and drop down to lower-level primitives when you need more control.

Workflows

Workflows are high-level functions that handle complete media AI tasks end-to-end — fetching data from Mux, calling AI providers, and returning structured results. Most workflows support both video and audio-only assets.

| Workflow | What it does | Providers | Audio-only |
| --- | --- | --- | --- |
| getSummaryAndTags | Generate titles, descriptions, and tags | OpenAI, Anthropic, Google | Yes |
| getModerationScores | Detect inappropriate content | OpenAI, Hive | Yes |
| hasBurnedInCaptions | Detect hardcoded subtitles in video frames | OpenAI, Anthropic, Google | No |
| askQuestions | Answer yes/no questions about asset content | OpenAI, Anthropic, Google | No |
| generateChapters | Create chapter markers from transcripts | OpenAI, Anthropic, Google | Yes |
| generateEmbeddings | Generate vector embeddings for semantic search | OpenAI, Google | Yes |
| translateCaptions | Translate captions into other languages | OpenAI, Anthropic, Google | Yes |
| translateAudio | Create AI-dubbed audio tracks | ElevenLabs | Yes |

See the Workflows guide for detailed documentation, options, and examples for each workflow. See the API Reference for complete parameter and return type details.

Quick Examples

Content moderation:

import { getModerationScores } from "@mux/ai/workflows";

const result = await getModerationScores("your-asset-id", {
  provider: "openai",
  thresholds: { sexual: 0.7, violence: 0.8 }
});

if (result.exceedsThreshold) {
  console.log("Content flagged for review");
}

Chapter generation:

import { generateChapters } from "@mux/ai/workflows";

const result = await generateChapters("your-asset-id", "en", {
  provider: "anthropic"
});

// [{ startTime: 0, title: "Introduction" }, { startTime: 45, title: "Main Content" }, ...]

Semantic search embeddings:

import { generateEmbeddings } from "@mux/ai/workflows";

const result = await generateEmbeddings("your-asset-id", {
  provider: "openai",
  chunkingStrategy: { type: "token", maxTokens: 500, overlap: 100 }
});

for (const chunk of result.chunks) {
  await vectorDB.insert({ embedding: chunk.embedding, startTime: chunk.metadata.startTime });
}
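
Yes/no content questions (the argument shape below is illustrative; check the API Reference for the exact signature and return type):

import { askQuestions } from "@mux/ai/workflows";

// Questions are assumed to be passed as an option here; this is a sketch,
// not the confirmed signature.
const result = await askQuestions("your-asset-id", {
  provider: "google",
  questions: [
    "Does the video include a product demo?",
    "Is there on-screen text?",
  ],
});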

Prompt Customization

Every workflow prompt is built from a structured template of named sections. The promptOverrides option lets you swap out individual sections with your own instructions while keeping the battle-tested defaults for everything else — no need to rewrite entire prompts.

const result = await getSummaryAndTags(assetId, {
  provider: "openai",
  promptOverrides: {
    title: "Create a search-optimized title (50-60 chars) with the primary keyword front-loaded.",
    keywords: "Focus on high search volume terms and long-tail keyword phrases.",
    // task, description, qualityGuidelines → keep defaults
  },
});

This works with getSummaryAndTags, generateChapters, and hasBurnedInCaptions. The Prompt Customization guide has ready-to-use presets for SEO, social media, e-commerce, and technical analysis, along with tips for writing effective overrides.

Evaluations

Choosing between OpenAI, Anthropic, and Google for a given workflow isn't guesswork. Every workflow in @mux/ai ships with eval coverage that benchmarks providers and models against three dimensions:

  • Efficacy — Does it produce accurate, high-quality results?
  • Efficiency — How fast is it and how many tokens does it consume?
  • Expense — What does each request cost?

Evals run automatically on every push to main and results are published to a public dashboard so you can compare providers side-by-side before choosing one for your use case.

You can also run evals locally against your own assets:

npm run test:eval

See the Evaluations guide for details on the 3 E's framework, adding your own evals, and cross-provider testing.

Primitives

Primitives are low-level building blocks that give you direct access to Mux media data — transcripts, storyboards, thumbnails, and text chunking utilities. Use them when you need full control over your AI prompts or want to build custom workflows.

import { fetchTranscriptForAsset, getStoryboardUrl } from "@mux/ai/primitives";

const transcript = await fetchTranscriptForAsset(asset, playbackId, { languageCode: "en" });
const storyboard = getStoryboardUrl(playbackId, 640);

All pre-built workflows are composed from these primitives internally, so you can always drop down a level when you need to customize behavior.
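
For example, a small custom workflow might pair the transcript primitive with a direct LLM call. Continuing from the snippet above (the openai client usage is standard, but the transcript's exact shape and the prompt are illustrative):

import { fetchTranscriptForAsset } from "@mux/ai/primitives";
import OpenAI from "openai";

// `asset` and `playbackId` come from your Mux integration, as in the snippet above
const transcript = await fetchTranscriptForAsset(asset, playbackId, { languageCode: "en" });

// Reads OPENAI_API_KEY from the environment
const openai = new OpenAI();

const completion = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "List the three most quotable lines from the transcript." },
    // We assume the transcript is text or stringifiable; see the Primitives guide
    { role: "user", content: String(transcript) },
  ],
});

console.log(completion.choices[0].message.content);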

See the Primitives guide for the full list of available functions and examples of building custom workflows.

Package Structure

// Import specific workflows
import { getSummaryAndTags, generateChapters } from "@mux/ai/workflows";

// Import specific primitives
import { fetchTranscriptForAsset, getStoryboardUrl } from "@mux/ai/primitives";

// Or import everything via namespace
import { workflows, primitives } from "@mux/ai";

Prerequisites

  • Node.js (>= 21.0.0)
  • A Mux account (free to sign up)
  • An API key for at least one supported AI provider

Documentation

| Guide | Description |
| --- | --- |
| Workflows | Detailed guide for each pre-built workflow with examples and options |
| API Reference | Complete API docs — all function signatures, parameters, and return types |
| Primitives | Low-level building blocks for custom workflows |
| Prompt Customization | Overriding prompt sections with promptOverrides for custom use cases |
| Credentials | Setting up Mux, AI provider, and cloud storage credentials |
| Workflow DevKit | Integration with Workflow DevKit for observability and orchestration |
| Workflow Encryption | Encrypting credentials across Workflow DevKit boundaries |
| Storage Adapters | Using custom storage SDKs (AWS, Cloudflare R2, MinIO) |
| Audio-Only Workflows | Working with audio-only assets (no video track) |
| Evaluations | AI eval testing with the 3 E's framework — public dashboard |
| Examples | Running the example scripts from the repository |

Contributing

We welcome contributions! Please see the Contributing Guide for details on setting up your development environment, running tests, and submitting pull requests.

For questions or discussions, feel free to open an issue.

License

Apache 2.0
