tiktoksubs

CLI tool for adding TikTok-style burned-in captions to vertical MP4 videos.

It transcribes speech with whisperkit-cli, groups timed words into short caption chunks, highlights the currently spoken word, and renders the result directly into the final MP4 so the video is ready to upload to TikTok or similar platforms.

Why this approach

Burned-in captions always require re-encoding the video stream. This project keeps audio untouched and uses a high-quality H.264 output profile by default.

The caption renderer does not depend on ffmpeg subtitle filters such as ass or drawtext. Those filters are missing in some local builds. Instead, tiktoksubs generates transparent overlay frames in Go and composites them with ffmpeg, which makes the pipeline more portable.

Requirements

whisperkit-cli available in PATH
ffmpeg and ffprobe available in PATH
Go 1.26+ to build from source

Build

go build -o tiktoksubs .

Quick start

./tiktoksubs -input 20260311_082403.mp4

Default output:

20260311_082403_captioned.mp4

Example

./tiktoksubs \
  -input input.mp4 \
  -output output.mp4 \
  -language en \
  -quality high \
  -font "Verdana Bold"

Options

-input         input MP4 file
-output        output MP4 file
-language      spoken language for WhisperKit, for example en or de
-model         WhisperKit model name
-font          font name or path to a .ttf/.otf file
-quality       high, smaller, or lossless
-keep-temp     keep temporary transcription and overlay files
-uppercase     render captions in uppercase
-max-words     maximum words per caption block
-max-duration  maximum caption duration in seconds

Quality profiles

high: good default for upload-ready output with low visible quality loss
smaller: smaller file size with stronger compression
lossless: lossless H.264 output, much larger files

How it works

Probe the source video with ffprobe.
Transcribe the video audio with whisperkit-cli using word timestamps.
Group words into short caption blocks optimized for short-form video.
Render caption frames with outline, shadow, centered layout, and active-word highlight.
Overlay the transparent caption video on top of the source video with ffmpeg.

Development

Run tests:

go test ./...

Build a local binary:

go build -o tiktoksubs .

Notes

Audio is copied without re-encoding.
Video is always re-encoded because the captions are burned in.
The repository includes a sample vertical MP4 for local testing.

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.gitignore		.gitignore
20260311_082403.mp4		20260311_082403.mp4
AGENTS.md		AGENTS.md
README.md		README.md
go.mod		go.mod
go.sum		go.sum
main.go		main.go
main_test.go		main_test.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

tiktoksubs

Why this approach

Requirements

Build

Quick start

Example

Options

Quality profiles

How it works

Development

Notes

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

tiktoksubs

Why this approach

Requirements

Build

Quick start

Example

Options

Quality profiles

How it works

Development

Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages