Art File Management System

File management for the lost.

Core Concept

Content-addressable storage with a SQLite database to track everything. Files are deduplicated automatically and can live in multiple locations (local + cloud).

Storage Structure

Local:

storage/
├── a3/
│   ├── b5/
│   │   ├── a3b5c7d9e1f2a4b6...           # actual file
│   │   └── a3b5c7d9e1f2a4b6....meta.json # optional sidecar backup

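Hash-based sharding: the first two pairs of hex characters of the hash name the two directory levels (256 × 256 = 65,536 possible directories)
~15 files per directory on average with 1M files (1,000,000 / 65,536 ≈ 15.3)

Cloud: the same hash-based structure is mirrored in the bucket (S3/Backblaze B2) for consistency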
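
A minimal sketch of how a file could be mapped to its sharded location, assuming the layout shown in the tree above. The function names and the `storage` root are illustrative, not taken from the repo.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def shard_path(root: Path, file_hash: str) -> Path:
    """Map a hash to root/<hex chars 0-1>/<hex chars 2-3>/<full hash>."""
    return root / file_hash[:2] / file_hash[2:4] / file_hash


# Two levels of two hex characters each: 256 * 256 directories.
SHARD_DIRS = (16 ** 2) * (16 ** 2)
```

Because the path is derived from the hash alone, the same function can build the object key for the cloud copy.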

Database Schema

CREATE TABLE files (
    hash TEXT PRIMARY KEY,           -- SHA-256 of file content
    size INTEGER,
    mime_type TEXT,
    original_filename TEXT,
    file_extension TEXT,
    created_at TIMESTAMP,
    modified_at TIMESTAMP,
    imported_at TIMESTAMP,
    
    -- Storage locations
    local_path TEXT,                 -- storage/a3/b5/a3b5c7d9...
    s3_url TEXT,                     -- s3://bucket/a3/b5/a3b5c7d9...
    
    -- Origin tracking
    original_paths TEXT,             -- Projects/MyWebsite/images/logo.png
    original_source TEXT,            -- "Dropbox", "iCloud", "OldMacDrive"
    
    -- Flexible data
    tags TEXT,                       -- JSON array: ["Projects", "MyWebsite", "vacation"]
    metadata TEXT                    -- EXIF, dimensions, AI analysis, etc.
);
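
A rough sketch of how the JSON columns could be written and read from Python with the standard `sqlite3` module. It uses a trimmed subset of the schema above, and the helper names (`open_db`, `add_file`, `get_file`) are hypothetical.

```python
import json
import sqlite3

# Trimmed subset of the schema above, enough to show the JSON round trip.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    hash TEXT PRIMARY KEY,
    size INTEGER,
    tags TEXT,       -- JSON array of strings
    metadata TEXT    -- JSON object
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows can be read by column name
    conn.executescript(SCHEMA)
    return conn


def add_file(conn, file_hash, size, tags, metadata):
    """Insert one record, serializing the flexible columns as JSON text."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO files (hash, size, tags, metadata) VALUES (?, ?, ?, ?)",
            (file_hash, size, json.dumps(tags), json.dumps(metadata)),
        )


def get_file(conn, file_hash):
    """Fetch one record as a dict with JSON columns decoded, or None."""
    row = conn.execute(
        "SELECT * FROM files WHERE hash = ?", (file_hash,)
    ).fetchone()
    if row is None:
        return None
    record = dict(row)
    record["tags"] = json.loads(record["tags"])
    record["metadata"] = json.loads(record["metadata"])
    return record
```

Since `hash` is the primary key, inserting the same content twice raises `sqlite3.IntegrityError`, which is one way the database itself can enforce deduplication.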

Key Features

Automatic deduplication - same file stored only once (by hash)
Multiple locations - track local, cloud, and legacy (Dropbox/iCloud) locations
Preserves context - original paths saved, auto-extracted as tags
Flexible metadata - JSON columns for file-specific data
Simple deployment - just SQLite + Python, no servers
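
One possible reading of "auto-extracted as tags": treat each directory name in the original path as a tag. This is a sketch under that assumption, and `tags_from_path` is a hypothetical name.

```python
from pathlib import PurePosixPath


def tags_from_path(original_path: str) -> list[str]:
    """Use each directory component of the original path as a tag."""
    parent = PurePosixPath(original_path).parent
    # Drop the root marker so absolute paths do not produce a "/" tag.
    return [part for part in parent.parts if part != "/"]


print(tags_from_path("Projects/MyWebsite/images/logo.png"))
# ['Projects', 'MyWebsite', 'images']
```

A file with no directory part (for example `logo.png`) yields an empty tag list.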

Implementation (Python)

Core operations:

Hash file → SHA-256
Check if hash exists in DB (dedupe)
Store in hash-sharded directory
Extract tags from original path
Record in database
Optionally sync to cloud
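
The first five steps above could be sketched as a single function like this. It is not the code in `ingest.py`; the trimmed table, the `ingest` signature, and the choice to append the new origin to `original_paths` when a duplicate is found are all assumptions. Cloud sync is left out.

```python
import hashlib
import json
import shutil
import sqlite3
from pathlib import Path, PurePosixPath

# Hypothetical, trimmed table: only the columns this sketch touches.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    hash TEXT PRIMARY KEY,
    size INTEGER,
    original_filename TEXT,
    local_path TEXT,
    original_paths TEXT,
    tags TEXT
)
"""


def ingest(conn: sqlite3.Connection, storage_root: Path,
           src: Path, original_path: str) -> tuple[str, bool]:
    """Ingest one file. Returns (hash, True if the content was new)."""
    # 1. Hash file -> SHA-256, read in 1 MiB chunks.
    digest = hashlib.sha256()
    with open(src, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    file_hash = digest.hexdigest()

    # 2. Check if the hash exists in the DB (dedupe).
    row = conn.execute(
        "SELECT original_paths FROM files WHERE hash = ?", (file_hash,)
    ).fetchone()
    if row is not None:
        # Assumption: record the extra origin instead of storing a second copy.
        paths = json.loads(row[0])
        if original_path not in paths:
            paths.append(original_path)
            with conn:
                conn.execute(
                    "UPDATE files SET original_paths = ? WHERE hash = ?",
                    (json.dumps(paths), file_hash),
                )
        return file_hash, False

    # 3. Store in the hash-sharded directory.
    dest = storage_root / file_hash[:2] / file_hash[2:4] / file_hash
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)

    # 4. Extract tags from the original path (directory names only).
    parent = PurePosixPath(original_path).parent
    tags = [part for part in parent.parts if part != "/"]

    # 5. Record in the database.
    with conn:
        conn.execute(
            "INSERT INTO files (hash, size, original_filename, local_path,"
            " original_paths, tags) VALUES (?, ?, ?, ?, ?, ?)",
            (file_hash, src.stat().st_size, PurePosixPath(original_path).name,
             str(dest), json.dumps([original_path]), json.dumps(tags)),
        )
    return file_hash, True
```

The caller is expected to run `conn.execute(SCHEMA)` once before the first `ingest` call. Ingesting the same bytes from a second location returns the same hash with `False` and leaves a single copy on disk.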

Why This Works for You

One-person system - SQLite is enough, with no server or extra moving parts
Gradual migration - can index legacy locations first, move files later
Extensible - JSON lets you add features (AI analysis, etc.) without schema changes
Resilient - easy DB backups
Scales to your needs - handles 1TB/1M files easily
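
As an illustration of the backup point, the standard library's `sqlite3.Connection.backup` can copy the database while it is in use. This is only a sketch: `backup_db` and any file names passed to it are hypothetical.

```python
import sqlite3


def backup_db(src_path: str, dest_path: str) -> None:
    """Copy a SQLite database file using SQLite's online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        with dest:  # commit the copied pages on success
            src.backup(dest)
    finally:
        dest.close()
        src.close()
```

Copying the `.db` file with a plain file copy also works as long as nothing is writing to it at the time.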

Simple, clean, and you're in full control.


owenosborn/Filer
