GoCreator turns local slides plus narration text into narrated videos from the command line.
- Reads slide assets from data/slides
- Infers narration from matching .txt and audio sidecar files in data/slides
- Supports PNG, JPG, JPEG, PDF, MP4, MOV, AVI, MKV, and WEBM inputs
- Expands PDFs into one page per slide before rendering
- Translates narration into multiple output languages
- Generates text-to-speech audio or uses prerecorded narration
- Renders per-slide video segments and combines them into final outputs
- Applies optional post-processing such as voice overrides, subtitles, audio mixing, intro/outro clips, exports, metadata, chapters, and thumbnails
- Caches translations, audio, video segments, and PDF preprocessing artifacts
GoCreator is now local-only and CLI-first:
- gocreator init creates a starter project layout
- gocreator create runs the full generation pipeline
- Slides are discovered only from the top level of data/slides
- Slide ordering uses natural filename order (slide2 comes before slide10)
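As an illustration of natural filename order, here is a minimal sketch of a comparison function that treats runs of digits as numbers. It is not GoCreator's actual implementation; the name naturalLess is hypothetical, and the sketch assumes ASCII, case-sensitive filenames.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isDigit reports whether b is an ASCII digit.
func isDigit(b byte) bool { return b >= '0' && b <= '9' }

// naturalLess (hypothetical helper) reports whether a sorts before b when
// runs of digits are compared as numbers instead of character by character.
func naturalLess(a, b string) bool {
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		if isDigit(a[i]) && isDigit(b[j]) {
			// Find the end of each digit run.
			si, sj := i, j
			for i < len(a) && isDigit(a[i]) {
				i++
			}
			for j < len(b) && isDigit(b[j]) {
				j++
			}
			// Strip leading zeros so "007" and "7" compare as the same value.
			na := strings.TrimLeft(a[si:i], "0")
			nb := strings.TrimLeft(b[sj:j], "0")
			if len(na) != len(nb) {
				return len(na) < len(nb) // fewer digits means a smaller number
			}
			if na != nb {
				return na < nb // equal length: lexical order matches numeric order
			}
			continue
		}
		if a[i] != b[j] {
			return a[i] < b[j]
		}
		i++
		j++
	}
	// One string is a prefix of the other; the shorter one sorts first.
	return len(a)-i < len(b)-j
}

func main() {
	names := []string{"slide10.png", "slide2.png", "slide1.png"}
	sort.Slice(names, func(i, j int) bool { return naturalLess(names[i], names[j]) })
	fmt.Println(names) // [slide1.png slide2.png slide10.png]
}
```

A plain string sort would place slide10.png before slide2.png, which is why numeric runs need special handling.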
The current media contract is:
- One narration entry is required for every final slide after PDF expansion
- Image slides and PDF pages use narration duration
- Video slides use clip duration by default
- Video slides can instead align to narration duration with timing.media_alignment: slide
- Video slides mix embedded clip audio with narration when the clip has audio
Requirements:
- Go 1.24+ if building from source
- ffmpeg and ffprobe in PATH
- OPENAI_API_KEY set in the environment when TTS or translation is needed
- For PDF input: pdfinfo, pdfseparate, and pdftocairo in PATH
PDF support currently relies on those PDF utilities during preprocessing so multi-page PDFs can be split and rendered into slide assets.
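To check a machine against the requirements before a run, a small preflight program along these lines may help. This is a sketch under stated assumptions, not part of GoCreator: the helper missingTools is hypothetical, and the only library calls used are exec.LookPath and os.Getenv from the Go standard library.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// missingTools (hypothetical helper) returns the names that found reports
// as absent. The lookup is injected so the logic can be tested without PATH.
func missingTools(names []string, found func(string) bool) []string {
	var missing []string
	for _, n := range names {
		if !found(n) {
			missing = append(missing, n)
		}
	}
	return missing
}

func main() {
	// onPath reports whether an executable can be resolved via PATH.
	onPath := func(name string) bool {
		_, err := exec.LookPath(name)
		return err == nil
	}

	required := []string{"ffmpeg", "ffprobe"}
	pdfTools := []string{"pdfinfo", "pdfseparate", "pdftocairo"} // only for PDF input

	if m := missingTools(required, onPath); len(m) > 0 {
		fmt.Println("missing required tools:", m)
	}
	if m := missingTools(pdfTools, onPath); len(m) > 0 {
		fmt.Println("missing PDF tools (needed only for PDF input):", m)
	}
	if os.Getenv("OPENAI_API_KEY") == "" {
		fmt.Println("OPENAI_API_KEY is not set (needed only for TTS or translation)")
	}
}
```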
Install:
go install github.com/Napolitain/gocreator/cmd/gocreator@latest
Or build from source:
go build -o gocreator.exe ./cmd/gocreator
Initialize a project:
gocreator init
This creates:
- gocreator.yaml
- data/slides/
- data/out/
- data/cache/
Add your assets:
- Put images, PDFs, and/or video clips in data/slides/
- Put matching narration files in the same folder using the same basename
Examples:
data/slides/01-cover.png
data/slides/01-cover.txt
data/slides/02-demo.mp4
data/slides/02-demo.wav
data/slides/03-summary.png
data/slides/03-summary.fr.txt
Create videos:
gocreator create --lang en --langs-out en,fr,es
GoCreator now infers narration from files placed next to each slide:
- basename.txt: source-language text for TTS or translation
- basename.<lang>.txt: language-specific text override
- basename.mp3 / basename.wav / other supported audio formats: source-language prerecorded audio
- basename.<lang>.mp3 / basename.<lang>.wav: language-specific prerecorded audio
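The sidecar naming scheme can be read as "basename, optional language tag, extension". The sketch below splits a filename along those lines. It is illustrative only: the Sidecar type and parseSidecar function are hypothetical, the language tag is assumed to be two or three lowercase letters, and only .txt, .mp3, and .wav are recognised because the other supported audio formats are not listed here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Sidecar (hypothetical type) describes one narration file next to a slide.
type Sidecar struct {
	Base string // slide basename the sidecar belongs to
	Lang string // empty means the source language
	Kind string // "text" or "audio"
}

// isLangCode assumes language tags are 2 or 3 lowercase ASCII letters.
func isLangCode(s string) bool {
	if len(s) < 2 || len(s) > 3 {
		return false
	}
	for i := 0; i < len(s); i++ {
		if s[i] < 'a' || s[i] > 'z' {
			return false
		}
	}
	return true
}

// parseSidecar splits a filename into basename, language, and kind.
// It returns false for files that are not recognised as sidecars.
func parseSidecar(name string) (Sidecar, bool) {
	ext := filepath.Ext(name)
	var kind string
	switch strings.ToLower(ext) {
	case ".txt":
		kind = "text"
	case ".mp3", ".wav":
		kind = "audio"
	default:
		return Sidecar{}, false
	}
	stem := strings.TrimSuffix(name, ext)
	base, lang := stem, ""
	// A trailing ".<lang>" on the stem marks a language-specific sidecar.
	if i := strings.LastIndex(stem, "."); i > 0 && isLangCode(stem[i+1:]) {
		base, lang = stem[:i], stem[i+1:]
	}
	return Sidecar{Base: base, Lang: lang, Kind: kind}, true
}

func main() {
	for _, n := range []string{"01-cover.txt", "02-demo.wav", "03-summary.fr.txt", "01-cover.png"} {
		s, ok := parseSidecar(n)
		fmt.Printf("%-20s sidecar=%v %+v\n", n, ok, s)
	}
}
```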
Inference rules:
- If matching audio exists for the requested language, GoCreator uses it directly
- Otherwise, if matching text exists for the requested language, GoCreator uses TTS on that text
- Otherwise, if source text exists, GoCreator translates it and uses TTS
- If both text and audio exist for the same slide/language, audio wins
- Sidecars can be interleaved in any order; media ordering is driven only by slide filenames
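The precedence in the inference rules can be expressed as a short decision function. This is a sketch of the rules as stated above, not GoCreator's code: the Available and Source types and the resolve function are hypothetical names. When the requested language is the source language, the caller would treat basename.txt as the matching text.

```go
package main

import "fmt"

// Source (hypothetical type) names how narration is obtained for one
// slide in one output language.
type Source int

const (
	None             Source = iota // no usable sidecar was found
	PrerecordedAudio               // use the audio sidecar directly
	TTSFromText                    // run TTS on text already in the target language
	TranslateThenTTS               // translate the source text, then run TTS
)

func (s Source) String() string {
	switch s {
	case PrerecordedAudio:
		return "prerecorded audio"
	case TTSFromText:
		return "TTS from text"
	case TranslateThenTTS:
		return "translate, then TTS"
	default:
		return "none"
	}
}

// Available records which sidecars exist for one slide and language.
type Available struct {
	LangAudio  bool // audio sidecar for the requested language
	LangText   bool // text sidecar for the requested language
	SourceText bool // source-language text sidecar
}

// resolve applies the precedence: audio, then matching text, then
// translated source text.
func resolve(a Available) Source {
	switch {
	case a.LangAudio:
		return PrerecordedAudio
	case a.LangText:
		return TTSFromText
	case a.SourceText:
		return TranslateThenTTS
	default:
		return None
	}
}

func main() {
	fmt.Println(resolve(Available{LangAudio: true, LangText: true})) // prerecorded audio
	fmt.Println(resolve(Available{LangText: true, SourceText: true})) // TTS from text
	fmt.Println(resolve(Available{SourceText: true}))                 // translate, then TTS
}
```

Since one narration entry is required for every final slide, a None result would presumably be reported as an error rather than silently skipped.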
For PDF pages, use the expanded page basename:
- 02-handout-page-0001.txt
- 02-handout-page-0002.fr.txt
- 02-handout-page-0003.wav
timing.media_alignment still controls video-slide timing:
- video: keep the clip duration
- slide: trim or loop the clip to narration duration
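The timing rules for images, PDF pages, and video slides can be summarised in one function. The sketch below is an interpretation of the media contract, not the actual renderer: segmentSeconds and clipAction are hypothetical names, durations are plain seconds for simplicity, and any alignment value other than slide is treated as the default video behaviour.

```go
package main

import "fmt"

// segmentSeconds (hypothetical helper) returns how long a slide's segment
// lasts. Images and PDF pages follow the narration. Video slides follow
// the clip unless timing.media_alignment is "slide".
func segmentSeconds(isVideo bool, alignment string, clipSec, narrationSec float64) float64 {
	if isVideo && alignment != "slide" {
		return clipSec
	}
	return narrationSec
}

// clipAction (hypothetical helper) says what has to happen to a clip so
// that it matches the narration when alignment is "slide".
func clipAction(clipSec, narrationSec float64) string {
	switch {
	case clipSec > narrationSec:
		return "trim"
	case clipSec < narrationSec:
		return "loop"
	default:
		return "keep"
	}
}

func main() {
	// A 10 s clip with 14 s of narration.
	fmt.Println(segmentSeconds(true, "video", 10, 14)) // 10
	fmt.Println(segmentSeconds(true, "slide", 10, 14)) // 14
	fmt.Println(clipAction(10, 14))                    // loop
	// An image slide always follows the narration.
	fmt.Println(segmentSeconds(false, "video", 0, 6)) // 6
}
```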
PDF handling:
- PDFs are discovered alongside images and videos
- A multi-page PDF is expanded into one final slide per page
- A single-page PDF still goes through the same preprocessing path
- Invalid or encrypted PDFs fail the run
- Expanded PDF artifacts are cached under data/cache/pdf/
If data/slides contains:
01-cover.png
02-handout.pdf # 3 pages
03-demo.mp4
then your narration sidecars could look like:
01-cover.txt
02-handout-page-0001.txt
02-handout-page-0002.txt
02-handout-page-0003.txt
03-demo.wav
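The expanded page basenames in this example follow a regular pattern: the PDF's basename, then -page-, then a four-digit page number starting at 1. The sketch below generates sidecar names on that assumption. The pattern is inferred from the example above and is not a documented guarantee, and pageBasename is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// pageBasename (hypothetical helper) builds the basename for one expanded
// PDF page, assuming the "<pdf-basename>-page-NNNN" pattern seen in the
// example, with 1-based, zero-padded page numbers.
func pageBasename(pdfName string, page int) string {
	stem := strings.TrimSuffix(pdfName, filepath.Ext(pdfName))
	return fmt.Sprintf("%s-page-%04d", stem, page)
}

func main() {
	// List the text sidecars expected for a 3-page handout.
	for page := 1; page <= 3; page++ {
		fmt.Println(pageBasename("02-handout.pdf", page) + ".txt")
	}
}
```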
gocreator init creates a starter config and project layout in the current directory.
gocreator create runs the generation pipeline.
Common flags:
- --lang, -l: input language
- --langs-out, -o: comma-separated output languages
- --config, -c: config file path
- --no-progress: disable the progress UI
The pipeline steps are:
- Load slides from data/slides
- Expand and cache PDF pages when needed
- Match per-slide text and audio sidecars from data/slides
- Translate only the slide texts that do not already have a target-language sidecar
- Generate TTS only for slides that do not already have a matching audio sidecar
- Render one video segment per final slide
- Concatenate segments into a master language render
- Optionally post-process with subtitles, music, intro/outro, exports, metadata, chapters, and thumbnails
The main create flow actively uses:
- input.lang
- output.languages
- output.directory
- output.format
- output.quality
- output.formats
- voice
- cache
- encoding
- effects
- audio
- subtitles
- intro
- outro
- metadata
- chapters
- transition
- timing.media_alignment
- multi_view
Supported effects in the core pipeline are:
- ken-burns for still-image slides
- text-overlay
- blur-background
- color-grade
- vignette
- film-grain
- stabilize for video slides
Effects are optional. When effects is absent, the normal rendering path stays on the lightweight fast path.
Optional post-processing features now supported by create include:
- per-language TTS voice overrides
- background music, ducking, and timed sound effects
- generated .srt / .vtt subtitles plus optional burn-in
- intro/outro clips or generated template cards
- multi-format export (mp4, webm, gif) with encoding presets
- metadata, chapter markers, and thumbnail generation
When those post-processing sections are absent, create keeps the direct fast path and writes the primary video without extra FFmpeg passes.
The config schema is still larger than the runtime surface: pip remains declarative only.
Example projects:
- examples/minimal-sidecar-tts/ - single image plus .txt sidecar (requires API key)
- examples/video-prerecorded/ - single video plus prerecorded .wav sidecar (no API key)
- examples/video-align-to-audio/ - video plus longer prerecorded audio using timing.media_alignment: slide
- examples/language-overrides/ - mixed per-language .txt and prerecorded audio sidecars
- examples/getting-started/ - five-slide starter project with matching .txt sidecars (requires API key)
- examples/demo/ - two-slide multi-language CLI example with inferred sidecars (requires API key)
- examples/demo-multiview/ - multi-view example with per-slide .txt sidecars (requires API key)
- examples/*.yaml - config schema examples
Common commands:
go mod tidy
go fmt ./...
go vet ./...
go test ./...
go build -o gocreator.exe ./cmd/gocreator
go build -o perftest.exe ./cmd/perftest
go build -o cache-perf-test.exe ./cmd/cache-perf-test
The main runtime path is:
cmd/gocreator -> internal/cli/create.go -> internal/services/creator.go
Core services handle:
- text loading
- translation
- audio generation
- slide discovery and PDF preprocessing
- video assembly
- transitions and multi-view layout composition
License: GPL-3.0