This project is currently under development.
A fast Python tool that computes cryptographic hashes for files and folders to verify system integrity. It generates two different hashes for each folder: one for content and one for structure.
Syntegrity scans directories recursively and computes:
- Individual file hashes (SHA-256; see the sketch below)
- Folder content hashes (hash1) - a combination of all file and subfolder hashes
- Folder structure hashes (hash2) - based on folder metadata and organization
This lets you detect both content changes and structural modifications in your file system.
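
Per-file hashing is the basic building block. As a rough illustration (a minimal sketch; `file_sha256` is an illustrative name, not the tool's actual API):

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file's contents with SHA-256, reading in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Chunked reads keep memory use flat regardless of file size.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```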
- Fast processing - Uses parallel processing and memory mapping for large files
- Smart caching - Saves computed hashes to avoid re-processing unchanged files
- Dual hash system - Separate hashes for content vs structure integrity
- Recursive scanning - Processes entire directory trees automatically
- Error handling - Gracefully handles permission errors and missing files
```bash
# Run the analyzer
python3 analyzer.py
```
The tool outputs file hashes and folder hashes in this format:
```
Processing files:
/home/test-syntegrity/file1: 32c66107f0f4f2053128e519681fc8e88806d0d2b17607ce9f2362aff66ad6c7
/home/test-syntegrity/file2: 85df9a7c92f2e8c562629361ed51d54efb76e0f12ffd2a588f25f93a29d2a43e
Processing folders:
test-syntegrity:[content_hash];[structure_hash]
folder1:[content_hash];[structure_hash]
```
For folders, the format is `foldername:[hash1];[hash2]`:
- hash1 = content integrity (all files and subfolders)
- hash2 = structure integrity (folder metadata and organization)
hash1 (content integrity) is a hash of all file hashes within the folder, computed recursively. It detects when any file content changes.
hash2 (structure integrity) is a hash of the folder's immediate structure - file names, sizes, subfolder names, and modification times. It detects structural changes such as renames, moves, additions, or removals.
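
Conceptually, the two folder hashes could be combined like this (a sketch under assumed conventions - sorted traversal and digest concatenation are illustrative choices, not necessarily the tool's exact scheme; `file_sha256` is the helper sketched earlier):

```python
import hashlib
import os

def folder_hashes(path: str) -> tuple[str, str]:
    """Return (content_hash, structure_hash) for one folder."""
    content = hashlib.sha256()
    structure = hashlib.sha256()
    # Sort entries so the result is deterministic across runs.
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        if entry.is_file(follow_symlinks=False):
            # hash1 folds in each file's content digest.
            content.update(file_sha256(entry.path).encode())
            # hash2 folds in metadata only: name, size, mtime.
            st = entry.stat(follow_symlinks=False)
            structure.update(f"{entry.name}:{st.st_size}:{st.st_mtime}".encode())
        elif entry.is_dir(follow_symlinks=False):
            # hash1 recurses into subfolders; hash2 stays immediate.
            sub_content, _ = folder_hashes(entry.path)
            content.update(sub_content.encode())
            structure.update(entry.name.encode())
    return content.hexdigest(), structure.hexdigest()
```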
- Uses multiprocessing to utilize all CPU cores (see the sketch after this list)
- Memory maps large files (>1MB) for faster reading
- Caches results to avoid re-computing unchanged files
- Single-pass directory discovery reduces I/O operations
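
File hashing parallelizes naturally across a process pool. A sketch (the worker cap mirrors the `max_workers` setting shown under Troubleshooting; `file_sha256` is the illustrative helper from above and must live at module top level so it can be pickled):

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import cpu_count

def hash_files_parallel(paths: list[str]) -> dict[str, str]:
    """Hash many files concurrently, one worker per CPU core (capped at 8)."""
    max_workers = min(cpu_count(), 8)
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so they pair up with paths.
        return dict(zip(paths, pool.map(file_sha256, paths)))
```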
Edit the directories list in `main()`:

```python
directories_to_process = [
    "/home/test-syntegrity",
    "/etc/config",
    "/var/log",
]
```
```bash
# Create baseline
python3 analyzer.py > baseline.txt

# Later check for changes
python3 analyzer.py > current.txt
diff baseline.txt current.txt
```
- Detect unauthorized file modifications
- Monitor system configuration changes
- Verify backup integrity
- Ensure build artifacts haven't changed
If you see the same hash for different files/folders, clear the cache:

```bash
rm .hash_cache.pkl
```
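
One plausible shape for that cache (a sketch, assuming entries are keyed by path and invalidated when a file's mtime or size changes; the real on-disk format may differ):

```python
import os
import pickle

CACHE_PATH = ".hash_cache.pkl"

def load_cache() -> dict:
    """Load the pickled cache, or start fresh if it is missing or corrupt."""
    try:
        with open(CACHE_PATH, "rb") as f:
            return pickle.load(f)
    except (OSError, pickle.UnpicklingError):
        return {}

def cached_hash(path: str, cache: dict) -> str:
    """Reuse a cached digest unless the file's mtime or size has changed."""
    st = os.stat(path)
    key = (st.st_mtime, st.st_size)
    entry = cache.get(path)
    if entry is not None and entry[0] == key:
        return entry[1]
    digest = file_sha256(path)  # illustrative helper sketched earlier
    cache[path] = (key, digest)
    return digest
```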
The script handles permission errors gracefully and logs them to stderr. Check the error output for details.
Adjust the worker count based on your system:

```python
max_workers = min(cpu_count(), 8)  # Default max 8 workers
```
- Time complexity: O(n) where n = number of files
- Space complexity: O(p) where p = number of parallel workers
- Hash algorithm: SHA-256
- Cache location: `.hash_cache.pkl`
The script adapts its approach based on file size (see the sketch after this list):
- Small files (<1MB): Chunked reading
- Large files (1-10MB): Memory mapping
- Very large files (>10MB): Chunked memory mapping
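
A dispatch along those lines might look like this (a sketch; the thresholds come from the list above, and the chunk sizes are illustrative):

```python
import hashlib
import mmap
import os

MB = 1024 * 1024

def adaptive_sha256(path: str) -> str:
    """Pick a read strategy by file size, per the tiers listed above."""
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        if size < 1 * MB:
            # Small (and empty) files: plain chunked reads.
            for chunk in iter(lambda: f.read(65536), b""):
                digest.update(chunk)
        elif size <= 10 * MB:
            # Large files: hash the whole memory mapping at once.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                digest.update(mm)
        else:
            # Very large files: walk the mapping in chunks.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                for off in range(0, size, 8 * MB):
                    digest.update(mm[off:off + 8 * MB])
    return digest.hexdigest()
```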
- Python 3.7+
- Linux/Unix system (for optimal performance)
- Standard library modules only (no external dependencies)
MIT License - see LICENSE file for details.