transcriber

A TypeScript CLI that transcribes video/audio files to text using OpenAI Whisper. Runs locally — no API keys, no cloud services. Models download automatically on first use.

Features

Transcribe any video or audio file to timestamped text
Record from microphone and transcribe on the fly
Coherent sentence boundaries — doesn't break mid-sentence
Multiple Whisper model sizes (tiny → large-v3-turbo)
Auto-downloads models on first use (cached locally)
Auto-installs system dependencies (ffmpeg, sox) via Homebrew when missing

Requirements

Node.js >= 18
macOS (Homebrew-based dependency management; Linux users install ffmpeg/sox manually)
ffmpeg — for extracting audio from video/audio files (auto-installed if missing)
sox — for microphone recording only (auto-installed if missing)

Setup

git clone <repo-url>
cd transcriber
npm install

Usage

Transcribe a file

npm run transcribe -- video.mp4
npm run transcribe -- podcast.mp3
npm run transcribe -- voice-memo.m4a

This creates a .txt file next to the input with timestamped sentences:

[00:00:00 → 00:00:07] Hello guys, my name is Piotr and I'm gonna teach you Laravel.
[00:00:07 → 00:00:17] First, let's start by talking about why it is so popular.

Record from microphone

npm run transcribe -- --mic

Records until you press Ctrl+C, then transcribes. Saves to recording-<timestamp>.txt unless -o is given.

Options

Flag	Description	Default
`-o, --output <path>`	Output file path	`<input>.txt` or `recording-<timestamp>.txt`
`-m, --model <size>`	Whisper model size	`base`
`-l, --language <code>`	Language code (e.g. `en`, `pl`, `de`)	`en`
`--mic`	Record from microphone instead of a file	off
`--no-file`	Print to stdout only, don't write a file	off

Models

Size	Accuracy	Speed	Download
`tiny`	Low	Fastest	~75 MB
`base`	Good	Fast	~150 MB
`small`	Better	Moderate	~500 MB
`medium`	Great	Slow	~1.5 GB
`large-v3-turbo`	Best	Slowest	~3 GB

Models are downloaded from Hugging Face on first use and cached in ~/.cache/huggingface/.

# Use a larger model for better accuracy
npm run transcribe -- lecture.mp4 -m small

# Use tiny for quick drafts
npm run transcribe -- note.m4a -m tiny
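
For orientation, here is a minimal sketch of how a size passed to -m could be resolved to a Hugging Face repository and loaded with @huggingface/transformers. The repository names in MODEL_REPOS and the helper names resolveModel and transcribe are assumptions made for illustration, not necessarily what this tool uses internally. The pipeline call and its options (language, return_timestamps, chunk_length_s) are the library's documented automatic-speech-recognition interface.

```typescript
// Hypothetical mapping from the -m flag to Hugging Face repositories (assumed, not taken from the tool).
const MODEL_REPOS: Record<string, string> = {
  tiny: "Xenova/whisper-tiny",
  base: "Xenova/whisper-base",
  small: "Xenova/whisper-small",
  medium: "Xenova/whisper-medium",
  "large-v3-turbo": "onnx-community/whisper-large-v3-turbo",
};

// Turn a size name into a repository id, failing early on typos.
function resolveModel(size: string): string {
  const repo = MODEL_REPOS[size];
  if (repo === undefined) {
    const known = Object.keys(MODEL_REPOS).join(", ");
    throw new Error(`Unknown model size "${size}". Expected one of: ${known}`);
  }
  return repo;
}

// Imported lazily by name so the mapping above can be used without loading the ONNX runtime.
const TRANSFORMERS_PKG = "@huggingface/transformers";

// Run Whisper over 16 kHz mono samples and return the library's result object,
// which carries timestamped chunks when return_timestamps is set.
async function transcribe(audio: Float32Array, size = "base", language = "en") {
  const { pipeline } = await import(TRANSFORMERS_PKG);
  const recognise = await pipeline("automatic-speech-recognition", resolveModel(size));
  return recognise(audio, {
    language,
    return_timestamps: true,
    chunk_length_s: 30, // process long audio in 30-second windows
  });
}
```

The first call to pipeline for a given repository triggers the download; later calls reuse the cached files.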

Examples

# Transcribe a video, save next to it
npm run transcribe -- ~/recordings/lesson-01.mp4

# Transcribe with better accuracy
npm run transcribe -- interview.mp3 -m small -l en

# Transcribe Polish audio
npm run transcribe -- rozmowa.m4a -l pl

# Just print to terminal, no file
npm run transcribe -- memo.m4a --no-file

# Record a voice memo and transcribe
npm run transcribe -- --mic -o my-thought.txt

# Record with a specific model
npm run transcribe -- --mic -m small

Output format

Each line is a complete sentence with start and end timestamps:

[HH:MM:SS → HH:MM:SS] Sentence text here.

Whisper chunks are merged into coherent sentences — lines break on sentence-ending punctuation (. ! ?), not mid-sentence.
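
The merging rule can be sketched as follows. This is an illustrative version under stated assumptions, not the code in src: the Chunk shape and the names formatTimestamp, mergeIntoSentences, and formatLine are hypothetical, and timestamps are assumed to be rounded down to whole seconds.

```typescript
// One timestamped piece of text (shape assumed for this sketch).
interface Chunk {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

// Format seconds as HH:MM:SS, rounding down to whole seconds.
function formatTimestamp(seconds: number): string {
  const total = Math.max(0, Math.floor(seconds));
  const hh = String(Math.floor(total / 3600)).padStart(2, "0");
  const mm = String(Math.floor((total % 3600) / 60)).padStart(2, "0");
  const ss = String(total % 60).padStart(2, "0");
  return `${hh}:${mm}:${ss}`;
}

// Accumulate consecutive chunks until the text ends in . ! or ?
function mergeIntoSentences(chunks: Chunk[]): Chunk[] {
  const sentences: Chunk[] = [];
  let current: Chunk | null = null;

  for (const chunk of chunks) {
    const text = chunk.text.trim();
    if (text === "") continue; // skip empty chunks

    if (current === null) {
      current = { start: chunk.start, end: chunk.end, text };
    } else {
      current.end = chunk.end;
      current.text += " " + text;
    }

    if (/[.!?]$/.test(current.text)) {
      sentences.push(current);
      current = null;
    }
  }

  // Flush trailing text that never reached sentence-ending punctuation.
  if (current !== null) sentences.push(current);
  return sentences;
}

// Render one sentence in the output format described above.
function formatLine(sentence: Chunk): string {
  return `[${formatTimestamp(sentence.start)} → ${formatTimestamp(sentence.end)}] ${sentence.text}`;
}

const lines = mergeIntoSentences([
  { start: 0, end: 3.2, text: " Hello guys, my name is Piotr" },
  { start: 3.2, end: 7.4, text: " and I'm gonna teach you Laravel." },
]).map(formatLine);
console.log(lines.join("\n"));
// [00:00:00 → 00:00:07] Hello guys, my name is Piotr and I'm gonna teach you Laravel.
```

A purely punctuation-based rule will also split after abbreviations such as "Dr." when they fall at the end of a chunk; handling that would need extra logic beyond this sketch.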

How it works

Audio extraction — ffmpeg converts the input file to raw PCM audio (16kHz, mono, float32)
Mic recording — sox captures from the default microphone in the same format
Transcription — the Whisper model (via @huggingface/transformers + ONNX runtime) processes the audio with timestamps
Sentence merging — raw chunks are merged into complete sentences using punctuation boundaries
Output — formatted lines are printed and saved to a text file
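
As an illustration of the audio extraction step, the sketch below asks ffmpeg for raw 32-bit float PCM on stdout and decodes it into samples. The function names (ffmpegArgs, pcmToFloat32, extractAudio) are hypothetical and the exact flags used by this tool may differ; the flags shown are standard ffmpeg options for mono, 16 kHz, f32le output.

```typescript
import { spawn } from "node:child_process";

const SAMPLE_RATE = 16_000;

// Arguments asking ffmpeg for raw float32 little-endian PCM, mono, 16 kHz, written to stdout.
function ffmpegArgs(inputPath: string): string[] {
  return [
    "-i", inputPath,
    "-vn", // ignore any video stream
    "-ac", "1", // mono
    "-ar", String(SAMPLE_RATE),
    "-f", "f32le",
    "pipe:1",
  ];
}

// Decode raw f32le bytes into samples; a trailing partial sample is dropped.
function pcmToFloat32(bytes: Uint8Array): Float32Array {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const samples = new Float32Array(Math.floor(bytes.byteLength / 4));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getFloat32(i * 4, true); // true = little-endian
  }
  return samples;
}

// Run ffmpeg and collect its stdout. Rejects if ffmpeg is missing or exits non-zero.
function extractAudio(inputPath: string): Promise<Float32Array> {
  return new Promise((resolve, reject) => {
    const ffmpeg = spawn("ffmpeg", ffmpegArgs(inputPath), {
      stdio: ["ignore", "pipe", "ignore"], // stderr is discarded so it cannot fill up and stall ffmpeg
    });
    const parts: Buffer[] = [];
    ffmpeg.stdout.on("data", (part: Buffer) => parts.push(part));
    ffmpeg.on("error", reject);
    ffmpeg.on("close", (code) => {
      if (code === 0) resolve(pcmToFloat32(Buffer.concat(parts)));
      else reject(new Error(`ffmpeg exited with code ${code}`));
    });
  });
}
```

Buffering the whole stream keeps the sketch short. At 16 kHz mono float32, an hour of audio is roughly 230 MB in memory.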

Tests

npm test

Tests cover: timestamp formatting, sentence merging logic, audio loading, and ffmpeg extraction (integration test).

Tech stack

TypeScript + tsx (runtime)
@huggingface/transformers — runs Whisper ONNX models in Node.js
commander — CLI argument parsing
ffmpeg — audio extraction from video/audio files
sox — microphone recording
Node built-in test runner — tests
