rsdedup
A fast, Rust-based file deduplication tool.
Features
- Multiple actions: report, delete, hardlink, symlink
- Comparison strategies: size+hash (default), hash-only, byte-for-byte
- Hash algorithms: SHA-256 (default), xxHash, BLAKE3
- Persistent hash cache (~/.rsdedup/cache.db) for fast repeated scans
- Parallel hashing with configurable thread count
- Include/exclude glob filters, min/max size filters
- JSON and text output formats
- Dry-run mode for destructive operations
- Shell completions for bash, zsh, and fish
Installation
From source
cargo install rsdedup
From releases
Download a prebuilt binary from GitHub Releases.
Usage
# Report duplicates in current directory
rsdedup report
# Report duplicates in a specific directory
rsdedup report /home/user/photos
# Delete duplicates, keeping the oldest file
rsdedup delete --keep oldest /home/user/photos
# Hardlink duplicates (dry-run first)
rsdedup hardlink --dry-run /data
# Symlink duplicates
rsdedup symlink /data
# Use BLAKE3 with byte-for-byte comparison and JSON output
rsdedup report --hash blake3 --compare byte-for-byte --output json
# Only find duplicate images, excluding thumbnails
rsdedup report --include '*.jpg' --include '*.png' --exclude '*thumb*'
# Filter by file size
rsdedup report --min-size 1024 --max-size 100000000
# Pre-populate the hash cache
rsdedup scan /home/user/photos
# Show cache statistics
rsdedup cache stats
# Clear the cache
rsdedup cache clear
# Generate shell completions
rsdedup completions bash > ~/.local/share/bash-completion/completions/rsdedup
# Show version and build info
rsdedup version
Subcommands
| Command | Description |
|---|---|
| report | Find and list duplicate files (read-only) |
| delete | Remove duplicate files, keeping one per group |
| hardlink | Replace duplicates with hardlinks |
| symlink | Replace duplicates with symlinks |
| scan | Populate the hash cache without taking action |
| cache | Manage the hash cache (clear, stats) |
| completions | Generate shell completions |
| version | Show version and build information |
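To illustrate what "replace duplicates with hardlinks" can mean in practice, here is a minimal sketch using only the Rust standard library. It is not taken from rsdedup's source: the function name `replace_with_hardlink`, the `.rsdedup-tmp` suffix, and the link-then-rename approach are all assumptions chosen for illustration. The idea is to create the new link under a temporary name and rename it over the duplicate, so the duplicate's path never disappears midway.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Replace `duplicate` with a hard link to `original`.
/// Hypothetical helper for illustration; not rsdedup's actual implementation.
fn replace_with_hardlink(original: &Path, duplicate: &Path, dry_run: bool) -> io::Result<()> {
    if dry_run {
        // Dry run: describe the action without touching the filesystem.
        println!("would link {} -> {}", duplicate.display(), original.display());
        return Ok(());
    }

    // Build a temporary name next to the duplicate so the rename stays
    // on the same filesystem. The suffix is an arbitrary assumption.
    let mut tmp_name = duplicate
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "duplicate has no file name"))?
        .to_os_string();
    tmp_name.push(".rsdedup-tmp");
    let tmp = duplicate.with_file_name(tmp_name);

    // Create the new link first, then move it over the duplicate.
    fs::hard_link(original, &tmp)?;
    let result = fs::rename(&tmp, duplicate);

    // Best-effort cleanup: covers a failed rename, and the POSIX case where
    // rename is a no-op because both names already refer to the same file.
    let _ = fs::remove_file(&tmp);
    result
}
```

Hard links only work within a single filesystem, so a sketch like this would fail with an I/O error when the two paths live on different mounts.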
How It Works
rsdedup uses a multi-stage pipeline to efficiently find duplicates:
- Scan — Walk the directory tree and collect file metadata
- Group — Group files by size (unique sizes are skipped)
- Filter — Apply size and glob filters
- Compare — Compare candidates using the chosen strategy
- Act — Perform the requested action on duplicate groups
The default size-hash comparison strategy is optimized for speed:
files are first grouped by size, then a partial 4 KB hash eliminates
most non-duplicates, and only the remaining candidates get a full hash.
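The size, partial-hash, full-hash narrowing described above can be sketched roughly as follows. This is an illustration under stated assumptions, not rsdedup's code: the names `refine`, `hash_prefix`, and `find_duplicates` are hypothetical, the standard library's `DefaultHasher` stands in for SHA-256, xxHash, or BLAKE3 so the example needs no external crates, and the partial hash is assumed to cover the first 4096 bytes of each file.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs::{self, File};
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// Assumed size of the partial-hash prefix.
const PARTIAL_BYTES: u64 = 4096;

/// Hash at most `limit` bytes from the start of a file.
/// DefaultHasher is a non-cryptographic stand-in for the real algorithms,
/// and reading into memory keeps the sketch short; a real tool would stream.
fn hash_prefix(path: &Path, limit: u64) -> io::Result<u64> {
    let mut data = Vec::new();
    File::open(path)?.take(limit).read_to_end(&mut data)?;
    let mut hasher = DefaultHasher::new();
    hasher.write(&data);
    Ok(hasher.finish())
}

/// Split every group by `key` and drop sub-groups with a single member,
/// since a file with a unique key cannot have a duplicate.
fn refine<K, F>(groups: Vec<Vec<PathBuf>>, mut key: F) -> io::Result<Vec<Vec<PathBuf>>>
where
    K: std::hash::Hash + Eq,
    F: FnMut(&Path) -> io::Result<K>,
{
    let mut out = Vec::new();
    for group in groups {
        let mut buckets: HashMap<K, Vec<PathBuf>> = HashMap::new();
        for path in group {
            let k = key(path.as_path())?;
            buckets.entry(k).or_default().push(path);
        }
        out.extend(buckets.into_values().filter(|g| g.len() > 1));
    }
    Ok(out)
}

/// Narrow candidates in three passes: size, partial hash, full hash.
fn find_duplicates(files: Vec<PathBuf>) -> io::Result<Vec<Vec<PathBuf>>> {
    let by_size = refine(vec![files], |p| Ok(fs::metadata(p)?.len()))?;
    let by_partial = refine(by_size, |p| hash_prefix(p, PARTIAL_BYTES))?;
    refine(by_partial, |p| hash_prefix(p, u64::MAX))
}
```

Each pass is cheaper than the next, so most files are ruled out before any full read happens. A 64-bit stand-in hash can collide, which is one reason a byte-for-byte strategy is useful when certainty matters.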
Exit Codes
| Code | Meaning |
|---|---|
| 0 | No duplicates found |
| 1 | Duplicates found |
| 2 | Error |
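Because exit code 1 means "duplicates found" rather than failure, a caller has to tell it apart from a real error. The sketch below shows one way a Rust program might interpret the codes in the table. The `ScanOutcome` enum and both function names are hypothetical, and treating every value other than 0 and 1 (including termination by signal) as an error is an assumption.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Outcomes corresponding to the documented exit codes.
#[derive(Debug, PartialEq, Eq)]
enum ScanOutcome {
    NoDuplicates,
    DuplicatesFound,
    Error,
}

/// Map a process exit code to an outcome.
fn classify_exit_code(code: Option<i32>) -> ScanOutcome {
    match code {
        Some(0) => ScanOutcome::NoDuplicates,
        Some(1) => ScanOutcome::DuplicatesFound,
        // 2 is the documented error code. Folding all other values, and
        // None (process killed by a signal), into Error is an assumption.
        _ => ScanOutcome::Error,
    }
}

/// Run `rsdedup report <dir>` and classify the result.
/// Requires rsdedup to be installed and on PATH.
fn run_report(dir: &Path) -> io::Result<ScanOutcome> {
    let status = Command::new("rsdedup").arg("report").arg(dir).status()?;
    Ok(classify_exit_code(status.code()))
}
```

The same distinction matters in shell scripts: a script that aborts on any non-zero status would stop whenever duplicates are found.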
License
MIT