mbforge

mbforge is a Go CLI that downloads the latest MusicBrainz JSON dump and builds a flattened libSQL metadata database for Oxygen/Chromakopia.

It is intentionally narrow:

One-shot initial build only
No incremental replication sync
No raw JSON blobs
No annotation import

Incremental updates belong in Chromakopia, which resumes from _meta.replication_sequence.

Commands

mbforge build

Resolves the latest dump directory from LATEST
Downloads the selected *.tar.xz archives into --dump-dir
Streams xz -> tar -> JSONL without extracting the full dumps to disk
Imports artists, release groups, releases, embedded recordings/tracks, and standalone recordings
Defers secondary index creation until after the bulk load
Optionally builds a full-text search index with --search-index
Writes _meta with dump and replication metadata
Runs PRAGMA optimize and VACUUM

mbforge info

Prints database path, size, core row counts, and _meta values

mbforge search

Searches artists, release groups, releases, recordings, and tracks from a single free-form query
Accepts artist names, track titles, album titles, MBIDs, ISRCs, and release barcodes
Uses the full-text search index when present
Falls back to the older slower SQL path when the search index is absent

mbforge search-index

Builds or rebuilds the full-text search index on an existing database
Useful when you already finished a long build run without --search-index

mbforge version

Prints build-time version metadata

Requirements

Go 1.24+
CGO-enabled build environment
Network access to the MusicBrainz dump mirror

Build

go build ./cmd/mbforge

Example:

mbforge build \
  --output /mnt/nvme/metadata.db \
  --dump-dir /mnt/nvme/mbdump \
  --workers 16 \
  --batch-size 5000 \
  --search-index \
  --verbose

Import a subset of entities:

mbforge build \
  --output ./musicbrainz.db \
  --entities artist,release-group

Inspect a finished database:

mbforge info --db ./musicbrainz.db

Search across the main entities:

mbforge search --db ./musicbrainz.db "nirvana"

Build the fast search index on an existing database:

mbforge search-index --db ./musicbrainz.db

Pipeline

Import order:

artist
release-group
release
recording

Important detail: release.tar.xz contains the overwhelming majority of recordings at media[].tracks[].recording. recording.tar.xz only covers the standalone subset. mbforge imports both, and uses INSERT OR IGNORE for recordings so the two sources can coexist safely.

The importer uses multiple JSON parse workers and a single batched SQLite writer to avoid write contention.

Schema Notes

Areas and labels are flattened inline instead of being normalized into separate tables.
Secondary indexes are created after bulk import for speed.
SQLite does not allow expressions inside a table primary key, so nullable key parts from the draft schema are normalized to empty strings for artist_aliases.locale and release_labels key columns.

Azure Build VM

deploy/cloud-init.yaml is a starting point for the one-shot build VM path:

Install Go
Clone the repo
Build mbforge
Build metadata.db on local NVMe
Copy the finished database to an attached disk
Stop the VM

You still need to customize the final handoff step to your long-running sqld VM.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
cmd/mbforge		cmd/mbforge
deploy		deploy
internal		internal
.gitignore		.gitignore
README.md		README.md
go.mod		go.mod
go.sum		go.sum
mbforge.exe		mbforge.exe

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

mbforge

Commands

Requirements

Build

Pipeline

Schema Notes

Azure Build VM

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

mbforge

Commands

Requirements

Build

Pipeline

Schema Notes

Azure Build VM

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages