
tobiasbischoff/tiktoksubs

tiktoksubs

CLI tool for adding TikTok-style burned-in captions to vertical MP4 videos.

It transcribes speech with whisperkit-cli, groups the timed words into short caption chunks, and highlights the word currently being spoken. The captions are rendered directly into the final MP4, so the video is ready to upload to TikTok or similar platforms.

Why this approach

Burned-in captions always require re-encoding the video stream. This project keeps audio untouched and uses a high-quality H.264 output profile by default.

The caption renderer does not depend on ffmpeg subtitle filters such as ass or drawtext, which are missing from some local ffmpeg builds. Instead, tiktoksubs generates transparent overlay frames in Go and composites them with ffmpeg, which makes the pipeline more portable.
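To illustrate the idea of drawing overlay frames without ffmpeg text filters, here is a minimal sketch using only the Go standard library. It assumes frames are RGBA images encoded as PNG, and it paints a plain white box where the caption text would go. The helper names (newOverlayFrame, drawCaptionBox, encodePNG) are hypothetical and are not taken from the project's source.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/draw"
	"image/png"
)

// newOverlayFrame returns a fully transparent canvas the size of the video.
// Hypothetical helper: the README does not show the project's real renderer.
func newOverlayFrame(width, height int) *image.RGBA {
	// A freshly allocated RGBA image is all zeros, i.e. fully transparent.
	return image.NewRGBA(image.Rect(0, 0, width, height))
}

// drawCaptionBox paints an opaque white placeholder where caption text would go.
func drawCaptionBox(frame *image.RGBA, x0, y0, x1, y1 int) {
	white := color.RGBA{R: 255, G: 255, B: 255, A: 255}
	draw.Draw(frame, image.Rect(x0, y0, x1, y1), image.NewUniform(white), image.Point{}, draw.Src)
}

// encodePNG serializes the frame; PNG keeps the alpha channel intact.
func encodePNG(frame *image.RGBA) ([]byte, error) {
	var buf bytes.Buffer
	if err := png.Encode(&buf, frame); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	frame := newOverlayFrame(1080, 1920) // assumed vertical 1080x1920 video
	drawCaptionBox(frame, 140, 1300, 940, 1420)
	data, err := encodePNG(frame)
	if err != nil {
		panic(err)
	}
	fmt.Printf("encoded %d bytes of PNG\n", len(data))
}
```

Everything outside the box stays transparent, so the source video shows through wherever no caption is drawn.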

Requirements

  • whisperkit-cli available in PATH
  • ffmpeg and ffprobe available in PATH
  • Go 1.26+ to build from source
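A quick way to confirm the external tools are reachable is to resolve them through PATH before running anything. The sketch below uses exec.LookPath from the standard library; the missingTools helper is hypothetical, and tiktoksubs may perform this check differently or not at all.

```go
package main

import (
	"fmt"
	"os/exec"
)

// requiredTools lists the external programs named in the Requirements section.
var requiredTools = []string{"whisperkit-cli", "ffmpeg", "ffprobe"}

// missingTools returns the names that cannot be resolved through PATH.
func missingTools(names []string) []string {
	var missing []string
	for _, name := range names {
		if _, err := exec.LookPath(name); err != nil {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingTools(requiredTools); len(missing) > 0 {
		fmt.Println("missing from PATH:", missing)
		return
	}
	fmt.Println("all required tools found")
}
```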

Build

go build -o tiktoksubs .

Quick start

./tiktoksubs -input 20260311_082403.mp4

Default output:

20260311_082403_captioned.mp4
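The default name shown above can be derived by inserting _captioned before the file extension. The following is a small sketch of that rule; defaultOutputPath is a hypothetical name, and keeping the output in the same directory as the input is an assumption, since the README only shows a bare file name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// defaultOutputPath inserts "_captioned" before the extension of the input path.
// Assumption: the output lands next to the input file.
func defaultOutputPath(input string) string {
	ext := filepath.Ext(input)
	base := strings.TrimSuffix(input, ext)
	return base + "_captioned" + ext
}

func main() {
	fmt.Println(defaultOutputPath("20260311_082403.mp4")) // prints 20260311_082403_captioned.mp4
}
```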

Example

./tiktoksubs \
  -input input.mp4 \
  -output output.mp4 \
  -language en \
  -quality high \
  -font "Verdana Bold"

Options

-input         input MP4 file
-output        output MP4 file
-language      spoken language for WhisperKit, for example en or de
-model         WhisperKit model name
-font          font name or path to a .ttf/.otf file
-quality       high (default), smaller, or lossless
-keep-temp     keep temporary transcription and overlay files
-uppercase     render captions in uppercase
-max-words     maximum words per caption block
-max-duration  maximum caption duration in seconds

Quality profiles

  • high: good default for upload-ready output with low visible quality loss
  • smaller: smaller file size with stronger compression
  • lossless: lossless H.264 output, much larger files
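One plausible way to map these profiles onto ffmpeg encoder arguments is sketched below. The libx264 encoder and the -preset and -crf flags are real ffmpeg options, but the specific values are assumptions chosen for illustration; the README does not state which settings tiktoksubs actually uses. The videoArgs name is hypothetical.

```go
package main

import "fmt"

// videoArgs returns illustrative ffmpeg video-encoder arguments for a profile.
// The CRF and preset values are assumptions, not the project's real settings.
func videoArgs(profile string) ([]string, error) {
	switch profile {
	case "high":
		// Low CRF: little visible quality loss at a moderate file size.
		return []string{"-c:v", "libx264", "-preset", "slow", "-crf", "18"}, nil
	case "smaller":
		// Higher CRF: stronger compression, smaller file.
		return []string{"-c:v", "libx264", "-preset", "medium", "-crf", "26"}, nil
	case "lossless":
		// CRF 0 selects lossless H.264 encoding with libx264 for 8-bit video.
		return []string{"-c:v", "libx264", "-preset", "medium", "-crf", "0"}, nil
	default:
		return nil, fmt.Errorf("unknown quality profile %q", profile)
	}
}

func main() {
	for _, p := range []string{"high", "smaller", "lossless"} {
		args, err := videoArgs(p)
		if err != nil {
			panic(err)
		}
		fmt.Println(p, args)
	}
}
```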

How it works

  1. Probe the source video with ffprobe.
  2. Transcribe the video audio with whisperkit-cli using word timestamps.
  3. Group words into short caption blocks optimized for short-form video.
  4. Render caption frames with outline, shadow, centered layout, and active-word highlight.
  5. Overlay the transparent caption video on top of the source video with ffmpeg.
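Steps 3 and 4 can be sketched as a greedy grouping pass followed by an active-word lookup. This is a simplified illustration, not the project's algorithm: the Word type, groupWords, and activeWord are hypothetical names, and the only rules applied are the two limits exposed as -max-words and -max-duration.

```go
package main

import "fmt"

// Word is one transcribed word with start and end times in seconds.
// Hypothetical type: the real transcript structure is not shown in the README.
type Word struct {
	Text       string
	Start, End float64
}

// groupWords starts a new block whenever adding the next word would exceed
// maxWords, or would stretch the block beyond maxDuration seconds.
func groupWords(words []Word, maxWords int, maxDuration float64) [][]Word {
	var blocks [][]Word
	var cur []Word
	for _, w := range words {
		if len(cur) > 0 {
			tooMany := len(cur) >= maxWords
			tooLong := w.End-cur[0].Start > maxDuration
			if tooMany || tooLong {
				blocks = append(blocks, cur)
				cur = nil
			}
		}
		cur = append(cur, w)
	}
	if len(cur) > 0 {
		blocks = append(blocks, cur)
	}
	return blocks
}

// activeWord returns the index of the word being spoken at time t, or -1.
func activeWord(block []Word, t float64) int {
	for i, w := range block {
		if t >= w.Start && t < w.End {
			return i
		}
	}
	return -1
}

func main() {
	words := []Word{
		{"this", 0.0, 0.4}, {"is", 0.4, 0.8}, {"a", 0.8, 1.2},
		{"short", 1.2, 1.6}, {"clip", 1.6, 2.0},
	}
	blocks := groupWords(words, 3, 2.0)
	fmt.Println(len(blocks))                // prints 2
	fmt.Println(activeWord(blocks[0], 0.5)) // prints 1
}
```

A renderer would call activeWord once per frame timestamp to decide which word in the visible block gets the highlight color.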

Development

Run tests:

go test ./...

Build a local binary:

go build -o tiktoksubs .

Notes

  • Audio is copied without re-encoding.
  • Video is always re-encoded because the captions are burned in.
  • The repository includes a sample vertical MP4 for local testing.
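The first two notes correspond to a familiar ffmpeg pattern: composite with the overlay filter, re-encode the video, and stream-copy the audio. The sketch below only builds and prints such an argument list. The overlay filter, -map, and -c:a copy are real ffmpeg features, but the exact filter graph, the overlay file format, and the overlayArgs helper are assumptions rather than the command tiktoksubs actually runs.

```go
package main

import (
	"fmt"
	"strings"
)

// overlayArgs builds an illustrative ffmpeg argument list that composites a
// transparent caption video over the source and copies the audio unchanged.
func overlayArgs(source, overlay, output string) []string {
	return []string{
		"-y",
		"-i", source, // input 0: original video with audio
		"-i", overlay, // input 1: transparent caption video (assumed format)
		"-filter_complex", "[0:v][1:v]overlay=0:0[v]",
		"-map", "[v]", // composited video stream
		"-map", "0:a?", // source audio, if any
		"-c:v", "libx264", // video must be re-encoded after compositing
		"-c:a", "copy", // audio is passed through without re-encoding
		output,
	}
}

func main() {
	args := overlayArgs("input.mp4", "captions.mov", "output.mp4")
	fmt.Println("ffmpeg " + strings.Join(args, " "))
}
```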
