Kreuzakt - a simple replacement for Paperless

Kreuzakt is a project that takes the best parts of Paperless, drastically improves the OCR using VLMs (vision language models), and throws out 99% of the complexity. Take every boring document in your life and make it instantly easy to find, and (optionally) let AIs search your documents to answer questions for you.

What's Different:

Kreuzakt runs in a single Docker container with an SQLite database, so there aren't a ton of moving parts
Rather than use Tesseract, Kreuzakt uses LLMs to do OCR via Kreuzberg (by default through OpenRouter, but Ollama and other local LLMs work as well). This drastically improves OCR accuracy, and by extension, search accuracy.
Kreuzakt provides a remote MCP server - connect Claude Desktop, Cursor, or any other MCP client to Kreuzakt and ask questions about your documents
Kreuzakt uses an LLM to also derive a title / description / original date for every document, out of the box. Zero manual curation / toil work.
Metadata can always be regenerated from the source documents, so the only thing you need to migrate is the originals

What's the Same:

Kreuzakt always preserves your original documents; it never edits them directly
Ingestion based on file watches works the same: drop documents into the 'ingest' folder and they will automatically be processed

Self-hosting with Docker Compose

services:
  kreuzakt:
    image: ghcr.io/anaisbetts/kreuzakt:latest
    ports:
      - "3000:3000"
    environment:
      OPENROUTER_KEY: ${OPENROUTER_KEY}
      TZ: Europe/Berlin  # Set your local timezone
    volumes:
      - ./docs:/data
    restart: unless-stopped

Drop this in a docker-compose.yml, set OPENROUTER_KEY in your environment or a .env file, and run docker compose up -d. The web UI is at http://localhost:3000.

The ./docs folder on the host is mounted at /data in the container and will be initialized with directories including ./docs/ingest, ./docs/originals, and ./docs/thumbnails.

Ok now what do I do?

Run docker compose up -d
Drop all of your documents into the ingest folder - they will eventually all move to the originals folder. You can see the progress at /settings - if you have a lot of documents it might take a bit.
If you've got an existing Paperless install, you can run the import
You can also simply drag-drop a bunch of files onto the main page
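
If your documents are scattered in some other folder, the "drop them into ingest" step can be scripted. This is a minimal sketch, not part of Kreuzakt: queue_for_ingest is a hypothetical helper, and it assumes the ingest folder is ./docs/ingest on the host (the ./docs volume from the compose file). It copies rather than moves, so your source folder is left alone.

```python
import shutil
from pathlib import Path


def queue_for_ingest(source_dir, ingest_dir):
    """Copy every regular file in source_dir into ingest_dir.

    Returns the list of destination paths that were written.
    """
    source = Path(source_dir)
    ingest = Path(ingest_dir)
    ingest.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue  # skip subfolders; this sketch is not recursive
        target = ingest / path.name
        if target.exists():
            continue  # don't overwrite a file that is still waiting to be processed
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

Calling queue_for_ingest("/path/to/scans", "./docs/ingest") returns the files it queued; running it a second time before they are processed copies nothing new.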

How much is this gonna cost me?

I'm too lazy to do the math on exactly how much it costs per page, but for perspective, importing 440 documents from Paperless (a few of which were up to 80 pages long) cost me ~$5.
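
For a rough per-document number, the one data point above can be scaled linearly. This is back-of-the-envelope arithmetic only: it assumes your documents look like that import (page counts, chosen models, and OpenRouter pricing all move the result), and estimate_cost is just an illustrative name.

```python
# Figures from the paragraph above: a single Paperless import, not a benchmark.
TOTAL_COST_USD = 5.0
DOCUMENTS_IMPORTED = 440

COST_PER_DOCUMENT = TOTAL_COST_USD / DOCUMENTS_IMPORTED


def estimate_cost(document_count):
    """Scale the single observed data point linearly to another library size."""
    return document_count * COST_PER_DOCUMENT


print(f"~${COST_PER_DOCUMENT:.3f} per document")
print(f"~${estimate_cost(1000):.2f} for 1,000 documents")
```

That works out to roughly a cent per document, or about $11 for a 1,000-document library.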

Volume mounts

Everything lives under /data by default — the SQLite database, originals, thumbnails, and the ingest folder. If you want to split things up, override with individual env vars and mount each path separately:

Variable	Default	Description
`INGEST_DIR`	`/data/ingest`	Watched folder for new documents
`IMPORT_DIR`	`/data/import`	Staging folder for orchestrated imports (e.g. Paperless); not watched
`ORIGINALS_DIR`	`/data/originals`	Stored original files
`THUMBNAILS_DIR`	`/data/thumbnails`	Generated thumbnails
`DB_PATH`	`/data/docs-ai.db`	SQLite database
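
To make the override behaviour concrete, here is a small sketch of how the table reads: each variable falls back to its default under /data unless you set it, and anything you point outside /data needs its own volume mount. This is an illustration of the table, not Kreuzakt's actual code; resolve_paths and paths_needing_own_mount are hypothetical helpers, and treating an empty value as unset is an assumption.

```python
import os
from pathlib import PurePosixPath

# Defaults copied from the table above.
DEFAULTS = {
    "INGEST_DIR": "/data/ingest",
    "IMPORT_DIR": "/data/import",
    "ORIGINALS_DIR": "/data/originals",
    "THUMBNAILS_DIR": "/data/thumbnails",
    "DB_PATH": "/data/docs-ai.db",
}


def resolve_paths(env=None):
    """Return the effective path per variable: the env value if set, else the default."""
    env = os.environ if env is None else env
    return {name: env.get(name) or default for name, default in DEFAULTS.items()}


def paths_needing_own_mount(paths, data_root="/data"):
    """List variables whose path lies outside data_root and so needs a separate volume."""
    return [
        name
        for name, value in paths.items()
        if not PurePosixPath(value).is_relative_to(data_root)
    ]
```

For example, with ORIGINALS_DIR set to /mnt/nas/originals, paths_needing_own_mount(resolve_paths({"ORIGINALS_DIR": "/mnt/nas/originals"})) returns ["ORIGINALS_DIR"], which is the one extra entry you would add under volumes.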

Optional environment variables

Variable	Default	Description
`OPENROUTER_KEY`	—	API key for OpenRouter (recommended)
`OPENAI_API_KEY`	—	Alternative: direct OpenAI key
`OPENAI_BASE_URL`	`https://openrouter.ai/api/v1`	Base URL for any OpenAI-compatible API (e.g. Ollama at `http://host.docker.internal:11434/v1`)
`OCR_VLM_MODEL`	`openai/gpt-5.4-mini`	Model used for OCR
`METADATA_LLM_MODEL`	`openai/gpt-5.4`	Model used for title/description extraction
`PORT`	`3000`	Port inside the container
`TZ`	`UTC`	Timezone for date display (e.g. `Europe/Berlin`, `America/New_York`). Use any tz database name.
`INGEST_WATCH_POLL`	`false`	Poll `INGEST_DIR` instead of using inotify. Enable when the ingest folder is on NFS, SMB, or a FUSE mount — inotify does not see changes made on the remote side.
`INGEST_WATCH_POLL_INTERVAL_MS`	`2000`	Poll interval in ms when `INGEST_WATCH_POLL` is enabled.
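
If you're wondering why polling helps on network mounts: inotify only reports changes made through the local kernel, while a poller re-reads the directory listing on a timer and compares it with the previous one. The sketch below shows that idea in general terms. It is not Kreuzakt's watcher; snapshot, new_or_changed, and watch are hypothetical names, and the 2000 ms default simply mirrors INGEST_WATCH_POLL_INTERVAL_MS.

```python
import os
import time


def snapshot(folder):
    """Map each file name in folder to its (size, mtime) at this instant."""
    result = {}
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_file():
                stat = entry.stat()
                result[entry.name] = (stat.st_size, stat.st_mtime_ns)
    return result


def new_or_changed(previous, current):
    """Names that appeared or changed between two snapshots."""
    return sorted(name for name, meta in current.items() if previous.get(name) != meta)


def watch(folder, interval_ms=2000, handle=print):
    """Poll forever, calling handle(name) for each new or changed file."""
    previous = snapshot(folder)
    while True:
        time.sleep(interval_ms / 1000)
        current = snapshot(folder)
        for name in new_or_changed(previous, current):
            handle(name)
        previous = current
```

The trade-off is latency and a bit of extra I/O: a new file is noticed at most one interval after it lands, and every tick lists the whole folder.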

MCP setup

Kreuzakt exposes a remote MCP endpoint at /mcp (Streamable HTTP). Replace the hostname in the snippets below with wherever you serve the app — for example https://docs.your-tailnet.ts.net/mcp when using Tailscale Serve. Most clients will not talk to plain http, so terminating TLS (Serve, a reverse proxy, etc.) is the usual approach.

Claude Desktop — npx mcp-remote@latest …

mcp-remote bridges the HTTP MCP endpoint for clients that expect a local process.

{
  "mcpServers": {
    "docs": {
      "command": "npx",
      "args": ["mcp-remote@latest", "https://docs.your-tailnet.ts.net/mcp"]
    }
  }
}

Cursor — type: "http" in MCP config

Add to .cursor/mcp.json or your project’s MCP settings.

{
  "mcpServers": {
    "docs": {
      "type": "http",
      "url": "https://docs.your-tailnet.ts.net/mcp"
    }
  }
}
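
Before wiring up a client, it can help to check that the endpoint answers at all. The sketch below builds the JSON-RPC initialize request that MCP's Streamable HTTP transport starts with. Treat it as a smoke test under assumptions: the protocol version string, the client name, and the absence of any auth header are guesses on my part, not something Kreuzakt documents, and build_initialize_request is a hypothetical helper.

```python
import json
import urllib.request


def build_initialize_request(endpoint_url):
    """Build (but don't send) a JSON-RPC initialize request for an MCP endpoint."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed version; the server may negotiate a different one.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "kreuzakt-smoke-test", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )
```

To actually send it, pass the result to urllib.request.urlopen(request, timeout=10) with your own hostname in place of docs.your-tailnet.ts.net. A successful response suggests the endpoint is reachable; an error here points at networking or TLS rather than at your MCP client config.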

Example prompts

"Find invoices from Deutsche Telekom."
"What was my health insurance number again?"
"How much did I pay in taxes last year?"

Local development

Prerequisites: Bun (the project runs Next.js and scripts through Bun; see package.json) and a Rust toolchain for the Kreuzberg extraction CLI.

Install dependencies: bun install
Build the local extraction CLI: cargo build -p kreuzakt-kreuzberg
Copy .env.local.example to .env.local and set at least one way to reach an OpenAI-compatible API. The usual choice is OPENROUTER_KEY. For a local LLM, set OPENAI_DEV_URL, OPENAI_DEV_KEY, and optionally OCR_VLM_DEV_MODEL / METADATA_LLM_DEV_MODEL. See .env.local.example for all variables the app and tooling recognize.
Start the dev server: bun dev. The app listens on port 3000 by default (PORT). Runtime data defaults to ./data (SQLite, ingest, originals, thumbnails) unless you override DATA_DIR or individual path variables.

Other useful commands:

bun test — unit tests
cargo test — Rust extraction CLI tests
bun run test:integration — integration tests (loads .env.local via --env-file; requires Paperless-related vars when those tests run)
bun storybook — UI development on port 6006

Text export

POST /api/documents/export-text exports every document whose SQLite content column is non-empty as a ZIP of .txt files. Each file is named {id}-{sanitized-title}.txt and begins with YAML frontmatter (original_filename, document_url, original_url) followed by the extracted document body. The response is an application/zip download named kreuzakt-text-export-YYYYMMDD-HHmmss.zip. Returns 400 if there is no exportable content.

curl -X POST http://localhost:3000/api/documents/export-text -o export.zip
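
Once you have export.zip, you will probably want the frontmatter and body separately. This is a minimal sketch, assuming the frontmatter is delimited by --- lines and contains only flat key: value pairs; anything richer would need a real YAML parser. split_frontmatter and read_export are hypothetical helpers, not part of Kreuzakt.

```python
import io
import zipfile


def split_frontmatter(text):
    """Split '---' delimited frontmatter from the body.

    Only handles flat `key: value` lines.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:])
            return meta, body.lstrip("\n")
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip().strip("\"'")
    return {}, text  # no closing delimiter: treat the whole file as body


def read_export(zip_bytes):
    """Yield (file name, frontmatter dict, body) for each .txt entry in the export."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith(".txt"):
                text = archive.read(name).decode("utf-8")
                meta, body = split_frontmatter(text)
                yield name, meta, body
```

With the file from the curl command above, list(read_export(open("export.zip", "rb").read())) gives one tuple per document, and meta["original_filename"] tells you which source file each text came from.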

So.... why's it called "Kreuzakt"?

It uses the library Kreuzberg, and it is a tool to help you with your "Akte" (files/documents). The name is a portmanteau of the two, just like "Berghain" is a portmanteau of "Kreuzberg" and "Friedrichshain", the two districts in Berlin that it sits between. (today you learn!)
