GitHub - steffenfritz/FileTrove: FileTrove indexes files and creates metadata from them.

VERSION: v1.0.0-BETA-9

FileTrove walks a directory tree, identifies every file, computes metadata, and writes all results into a SQLite database with TSV export support.

What it collects

| Category | Details |
| --- | --- |
| File type | MIME type, PRONOM identifier, format version, identification proof/note, extension (via siegfried) |
| File & directory timestamps | Creation, modification, and access times |
| Hashes | MD5, SHA1, SHA256, SHA512, BLAKE2B-512 |
| Entropy | Shannon entropy (files up to 1 GB) |
| Extended attributes | xattr from ext3/ext4, btrfs, APFS, and others |
| EXIF metadata | Extracted from image files |
| YARA-X | Match results from your own rule files |
| NSRL | Flags known software files via the National Software Reference Library |
| Dublin Core | Optional session-level descriptive metadata |

Each file and directory gets a UUIDv4 as a unique identifier. All results land in a SQLite database and can be exported to TSV.
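
To make the per-file metadata concrete, here is a minimal Python sketch that computes the same five digests, the Shannon entropy in bits per byte, and a UUIDv4 in a single pass over a file. It is an illustration only and not FileTrove's implementation: the function name `describe_file`, the dictionary keys, and the chunk size are assumptions, and the sketch does not enforce the 1 GB entropy limit mentioned in the table.

```python
import hashlib
import math
import uuid
from collections import Counter

CHUNK = 1 << 20  # read 1 MiB at a time so large files are not loaded whole


def describe_file(path):
    """Return digests, Shannon entropy (bits per byte), size, and a fresh UUIDv4."""
    hashers = {
        "md5": hashlib.md5(),
        "sha1": hashlib.sha1(),
        "sha256": hashlib.sha256(),
        "sha512": hashlib.sha512(),
        "blake2b-512": hashlib.blake2b(digest_size=64),
    }
    counts = Counter()  # occurrences of each byte value 0..255
    total = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(CHUNK):
            for hasher in hashers.values():
                hasher.update(chunk)
            counts.update(chunk)
            total += len(chunk)

    # H = -sum(p * log2(p)) over the byte values that occur in the file.
    entropy = sum(-(n / total) * math.log2(n / total) for n in counts.values()) if total else 0.0

    record = {name: hasher.hexdigest() for name, hasher in hashers.items()}
    record.update(entropy=entropy, size=total, uuid=str(uuid.uuid4()))
    return record
```

Calling `describe_file("some/file")` returns a dictionary with one hex digest per algorithm. The entropy ranges from 0.0 (a single repeated byte value) to 8.0 (all 256 byte values equally frequent).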

Installation

Get a distribution bundle — download from the releases page, or build one from source (see BUILDING.md):
```
task dist:bundle    # builds binaries + bundles siegfried.sig
```
The bundle at build/<os>_<arch>/ contains everything you need.
Run the installer from the bundle directory:
```
cd build/darwin_arm64   # or linux_amd64, etc.
./ftrove --install .
```
This creates the scan database (db/filetrove.db) and the logs/ directory. The siegfried signature file is included in the bundle. The NSRL bloom filter (~150–240 MB depending on variant) is downloaded automatically during install. Use --nsrl-variant to select which subset to download (default: all).
You're ready.

Building from source without task dist? You can build the NSRL bloom filter locally. See BUILDING.md for details on task nsrl:build-all and disk space requirements.

YARA-X

YARA-X scanning requires a C library that is not bundled with FileTrove. It is built automatically during task build if not already present. See BUILDING.md for setup instructions.

Example rule files: testdata/yara/
When a rule matches, the rule name, session UUID, and file UUID are recorded in the yara table. The rule file itself is not stored.

NSRL

The NSRL bloom filter is not bundled in the repository. It is downloaded automatically during ftrove --install from the GitHub Releases page. Three variants are available:

| Variant | Subsets | Size |
| --- | --- | --- |
| `modern` | Modern OS software | ~150 MB |
| `mobile` | Modern + Android + iOS | ~200 MB |
| `all` | Modern + Android + iOS + Legacy | ~240 MB |

```
./ftrove --install . --nsrl-variant all     # default
./ftrove --install . --nsrl-variant modern  # smallest
```

NSRL checks are skipped gracefully if no bloom filter is present — scanning still works.

When NIST publishes a new RDS version, rebuild by updating NSRL_VERSION in Taskfile.nsrl.yml and running one of the build targets. See BUILDING.md for details.

You can also build a custom Bloom filter from any newline-delimited list of SHA1 hashes:

```
admftrove --creatensrl hashes.txt --nsrlversion "my-hashset-v1"
```

Optional flags: --nsrl-out (output filename, default nsrl.bloom), --nsrl-estimate (expected hash count; auto-counted from file if omitted) and --nsrl-fpr (false positive rate, default 0.01). Copy the resulting bloom file into db/. ftrove loads db/nsrl-<variant>.bloom based on --nsrl-variant (default all), with a fallback to db/nsrl.bloom.
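
For readers unfamiliar with how an expected hash count and a false positive rate determine a Bloom filter, the following Python sketch shows the textbook sizing formulas and a membership test over newline-delimited SHA1 strings. It is a conceptual illustration under stated assumptions: the class `BloomFilter`, the helper `build_from_hashes`, and the double-hashing scheme are hypothetical, and nothing here reflects the on-disk format or hash functions that admftrove and ftrove actually use.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter; not FileTrove's on-disk format."""

    def __init__(self, expected_items, fpr=0.01):
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.m = max(8, math.ceil(-expected_items * math.log(fpr) / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / expected_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two 64-bit parts of one digest.
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def build_from_hashes(lines, fpr=0.01):
    """Build a filter from an iterable of newline-delimited SHA1 hex strings."""
    hashes = [ln.strip().lower() for ln in lines if ln.strip()]
    bloom = BloomFilter(expected_items=max(1, len(hashes)), fpr=fpr)
    for h in hashes:
        bloom.add(h)
    return bloom
```

With `bloom = build_from_hashes(open("hashes.txt"))`, the expression `some_sha1 in bloom` is always true for hashes that were added and is occasionally true for hashes that were not, at roughly the configured false positive rate. That trade-off is why a lower rate or a larger expected count produces a larger filter.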

Running a scan

```
./ftrove -i $DIRECTORY
```

FileTrove walks $DIRECTORY recursively. Run ./ftrove -h for all available flags.

Viewing results

List all sessions and export one to TSV:

```
./ftrove -l
./ftrove -t 926be141-ab75-4106-8236-34edfcf102f2
```

You can also query the SQLite database directly:

CLI: sqlite3 db/filetrove.db
GUI: sqlitebrowser
Visualisation: Sqliteviz
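
If you prefer scripting over an interactive client, a minimal Python sketch such as the one below lists every table in the database with its row count. It relies only on SQLite's built-in `sqlite_master` catalogue and makes no assumptions about FileTrove's schema (see database_schema.dbml in the repository for the actual tables). The function name `table_row_counts` is hypothetical.

```python
import sqlite3


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in the database."""
    # Note: sqlite3.connect creates an empty file if db_path does not exist.
    con = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # Table names come from the database itself, so quoting them is sufficient here.
        return {n: con.execute(f'SELECT COUNT(*) FROM "{n}"').fetchone()[0] for n in names}
    finally:
        con.close()
```

For example, `table_row_counts("db/filetrove.db")` gives a quick overview of how much data a scan produced before you write more specific queries.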

Exporting to PREMIS v3 XML

FileTrove can export the metadata of a session as PREMIS v3 XML. The export contains one premis:object per file (with fixity values, format information, file size, and storage location) and a premis:event describing the scan. The XML is written to stdout.

```
./ftrove -P 926be141-ab75-4106-8236-34edfcf102f2
```

Redirect stdout to save the output to a file:

```
./ftrove -P 926be141-ab75-4106-8236-34edfcf102f2 > session.premis.xml
```

Use ./ftrove -l to list available session UUIDs.
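
A saved export can be sanity-checked with a few lines of Python. The sketch below counts the premis:object and premis:event elements in a document. It assumes the export uses the standard PREMIS 3 namespace `http://www.loc.gov/premis/v3` and that these elements sit below the root element. Neither detail is confirmed by this README, so adjust the namespace if your output differs. The function name `summarize_premis` is hypothetical.

```python
import xml.etree.ElementTree as ET

# PREMIS 3 namespace as published by the Library of Congress (assumed for the export).
PREMIS_NS = {"premis": "http://www.loc.gov/premis/v3"}


def summarize_premis(xml_data):
    """Count premis:object and premis:event elements below the root element."""
    root = ET.fromstring(xml_data)
    return {
        "objects": len(root.findall(".//premis:object", PREMIS_NS)),
        "events": len(root.findall(".//premis:event", PREMIS_NS)),
    }
```

For example, `summarize_premis(open("session.premis.xml", "rb").read())` should report as many objects as the session has files, if the assumptions above hold.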

webftrove — Web Interface

webftrove is a companion tool that opens a read-only web interface for an existing FileTrove database. It runs a local HTTP server on port 9000 and opens your default browser automatically.

Features

Browse all sessions with file and directory counts
Filter files by name/path (with optional NOT negation), extension, MIME type (multi-select), NSRL status, and YARA hits
Sort by filename, size, modification time, entropy, extension, or MIME type
Live filtering via HTMX — results update without page reload
File detail view: all hashes, EXIF metadata, YARA matches, extended attributes, NTFS ADS
Directory listing with full-text search
Click 📂 next to any path to open the containing directory in the local file browser
Light and dark theme, toggle in the navigation bar

Installation

webftrove is included in the release packages (.deb for Linux, .tar.gz for macOS). No separate build step is needed — just use the binary from the release bundle.

To build from source instead:

```
git clone https://github.com/steffenfritz/FileTrove.git
cd FileTrove
go build ./cmd/webftrove/
```

This produces a single self-contained webftrove binary (templates are embedded). Copy it to any location you like, e.g.:

```
cp webftrove /usr/local/bin/
```

No additional files are required — webftrove carries everything it needs inside the binary.

Usage

Point webftrove at any filetrove.db file using the --db flag:

```
webftrove --db /path/to/db/filetrove.db
```

The browser opens automatically at http://localhost:9000. The database is opened in read-only mode; no data is ever written or modified.
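
If you write your own scripts against a database that holds evidence, you can get a comparable guarantee by opening the file through an SQLite URI with `mode=ro`. The Python sketch below illustrates the idea. It is an analogue for your own tooling, not a description of how webftrove opens the database internally, and the helper name `open_read_only` is hypothetical.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path):
    """Open an existing SQLite file so that any write attempt fails."""
    # as_uri() yields a percent-encoded file:// URI; mode=ro forbids writes
    # and also refuses to create the file if it is missing.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

On a connection returned by `open_read_only("db/filetrove.db")`, SELECT statements work as usual, while INSERT, UPDATE, or DELETE raise `sqlite3.OperationalError`.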

Typical workflow after a scan:

```
# 1. Run a scan with ftrove
./ftrove -i /media/evidence -p "Case 2025-042" -a "J. Smith"

# 2. Open the results in the browser
webftrove --db db/filetrove.db
```

Requirements

The filetrove.db must exist and be a valid FileTrove database (created by ftrove --install or a previous scan).
Port 9000 must be available on localhost.
An internet connection is required on first load to fetch Tailwind CSS and HTMX from a CDN. Subsequent loads are served from the browser cache.
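
The port requirement can be checked before launching webftrove. The following Python sketch tries to bind a TCP socket to the port on localhost and reports whether that succeeded. The function name `port_is_free` is hypothetical, and the check is only a snapshot: another process could still claim the port between the check and the launch.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP listener can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

Calling `port_is_free(9000)` before starting webftrove tells you whether something else is already listening on that port.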

Background

FileTrove is the successor of filedriller, based on the iPres 2021 paper Marrying siegfried and the National Software Reference Library.

