simpleocr is a macOS OCR CLI written in Swift for image-to-text workflows in local AI pipelines, meaning no data has to be send to the model providers.
It uses Apple's Vision framework, has no third-party dependencies, and is designed to produce output that is easy to pipe into downstream LLM or automation steps.
- OCR for local image files on macOS
- Spatially aware plain-text output for LLM consumption
- Structured JSON output with normalized bounding boxes
- Table-focused JSON output derived from generic layout heuristics
- Searchable PDF generation:
pdf-text: text-only PDFpdf-image: original image plus invisible text layer
- Optional PII redaction for recognized text
- No network dependency and no cloud OCR service
- macOS 13 or newer
brew install tobilg/simpleocr/simpleocrsimpleocr <image-path> [options]
simpleocr - [options] # read image from stdinimage-path: path to the input image file (use-to read from stdin)
--lang <codes>: comma-separated language codes, defaultde-DE,en-US--mode <level>:accurateorfast, defaultaccurate--format <type>:plain,text,json,table-json,pdf-text, orpdf-image, defaulttext--output <path>: output file path for PDF formats; defaults to the input basename with.pdf--min-confidence <val>: minimum confidence threshold between0.0and1.0, default0.3--pii: redact personally identifiable information from recognized text--error-format <type>: error output format:textorjson, defaulttext--describe-formats: describe available output formats and exit--version: print version and exit--help,-h: print help and exit
jpg,jpegpngtiff,tifheic,heifbmpgif
Basic OCR (plain text, best for LLMs):
simpleocr examples/example-bill.png --format plainOCR with spatial coordinates:
simpleocr examples/example-bill.pngJSON output:
simpleocr examples/example-bill.png --format jsonTable-focused JSON output:
simpleocr examples/example-bill.png --format table-jsonFast mode with German-first language hints:
simpleocr examples/example-bill.png --lang de-DE,en-US --mode fastGenerate a searchable image PDF:
simpleocr examples/example-bill.png --format pdf-image --output bill-searchable.pdfRedact PII before returning text:
simpleocr examples/example-bill.png --piiRead image from stdin:
cat screenshot.png | simpleocr - --format plainJSON errors for programmatic consumption:
simpleocr missing.png --error-format json
# stderr: {"error":"Error: File not found or unreadable: missing.png","code":1}Describe available output formats:
simpleocr --describe-formatsPlain text output, one line per recognized text element, sorted top-to-bottom then left-to-right. Best for feeding into LLMs or other text processing tools.
Example:
Muster GmbH
Industriestrasse 42, 80331 Munchen
Spatially-aware text with normalized coordinates (y,x) prepended to each line. Useful when position matters.
Example:
[y=0.08,x=0.06] Muster GmbH
[y=0.11,x=0.06] Industriestrasse 42, 80331 Munchen
Returns document metadata, recognized observations, and inferred structured regions:
{
"image_size": {
"height": 3508,
"width": 2480
},
"language_hints": [
"de-DE",
"en-US"
],
"observations": [
{
"bounding_box": {
"height": 0.03,
"width": 0.22,
"x": 0.06,
"y": 0.08
},
"confidence": 0.98,
"text": "Muster GmbH"
}
],
"pii_redacted": false,
"recognition_level": "accurate",
"source": "invoice.png"
}Returns only inferred table-like regions with row and cell structure derived from geometry:
{
"image_size": {
"height": 1161,
"width": 796
},
"language_hints": [
"de-DE",
"en-US"
],
"pii_redacted": false,
"recognition_level": "accurate",
"source": "example-bill.png",
"tables": [
{
"column_anchors": [0.1, 0.14, 0.49, 0.59, 0.74, 0.82],
"row_count": 2
}
]
}Creates a PDF page containing rendered OCR text only.
Creates a PDF containing the original image with an invisible text layer for search and copy/paste.
This repo includes a Claude Code skill that lets coding agents run OCR directly:
/ocr examples/example-bill.png
/ocr screenshot.png --format json
The skill is defined in .claude/skills/ocr/SKILL.md and is available automatically when Claude Code is used in this project.
To use the skill in other projects, install it to your personal skills directory:
mkdir -p ~/.claude/skills/ocr
curl -fsSL https://raw.githubusercontent.com/tobilg/simpleocr/main/.claude/skills/ocr/SKILL.md -o ~/.claude/skills/ocr/SKILL.mdUse the wrapper script so SwiftPM and Clang caches stay inside the repository:
./scripts/build-local.shRelease build:
./scripts/build-local.sh -c releaseIf you see an error like:
this SDK is not supported by the compiler
your selected Swift toolchain and the active Apple SDK do not match. Fix it by:
- installing a matching Xcode version
- selecting the matching developer directory with
xcode-select - rerunning
./scripts/build-local.sh
The wrapper script exports local cache paths:
SWIFTPM_MODULECACHE_OVERRIDE=.build/module-cacheCLANG_MODULE_CACHE_PATH=.build/clang-module-cache
That avoids writing to global cache locations during local or sandboxed builds.
If plain swift build already works on your machine, you can keep using it.
Package.swift
README.md
.claude/skills/ocr/SKILL.md
Sources/simpleocr/main.swift
Sources/simpleocr/CLI.swift
Sources/simpleocr/Models.swift
Sources/simpleocr/ObservationLayout.swift
Sources/simpleocr/OCREngine.swift
Sources/simpleocr/OutputFormatter.swift
Sources/simpleocr/PDFGenerator.swift
Sources/simpleocr/PIIRedactor.swift
Tests/simpleocrTests/
requirements/ocr-cli-prd.md
examples/example-bill.png