simpleocr

simpleocr is a macOS OCR CLI written in Swift for image-to-text workflows in local AI pipelines, meaning no data has to be sent to model providers.

It uses Apple's Vision framework, has no third-party dependencies, and is designed to produce output that is easy to pipe into downstream LLM or automation steps.

Features

OCR for local image files on macOS
Spatially aware plain-text output for LLM consumption
Structured JSON output with normalized bounding boxes
Table-focused JSON output derived from generic layout heuristics
Searchable PDF generation:
- pdf-text: text-only PDF
- pdf-image: original image plus invisible text layer
Optional PII redaction for recognized text
No network dependency and no cloud OCR service

Requirements

macOS 13 or newer

Install

brew install tobilg/simpleocr/simpleocr

Usage

simpleocr <image-path> [options]
simpleocr - [options]              # read image from stdin

Arguments

image-path: path to the input image file (use - to read from stdin)

Options

--lang <codes>: comma-separated language codes, default de-DE,en-US
--mode <level>: accurate or fast, default accurate
--format <type>: plain, text, json, table-json, pdf-text, or pdf-image, default text
--output <path>: output file path for PDF formats; defaults to the input basename with .pdf
--min-confidence <val>: minimum confidence threshold between 0.0 and 1.0, default 0.3
--pii: redact personally identifiable information from recognized text
--error-format <type>: error output format: text or json, default text
--describe-formats: describe available output formats and exit
--version: print version and exit
--help, -h: print help and exit
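
When simpleocr is called from a script, it can help to assemble the argument list in one place. The following is a minimal Python sketch that uses only the flags listed above; `build_command` and its parameter names are hypothetical helpers for illustration and are not part of simpleocr.

```python
from typing import List, Optional

# Format and mode names copied from the Options list above.
FORMATS = {"plain", "text", "json", "table-json", "pdf-text", "pdf-image"}
MODES = {"accurate", "fast"}


def build_command(
    image_path: str,
    fmt: str = "text",
    mode: str = "accurate",
    langs: Optional[List[str]] = None,
    min_confidence: Optional[float] = None,
    pii: bool = False,
    output: Optional[str] = None,
) -> List[str]:
    """Return an argv list for simpleocr (hypothetical helper, not shipped with the tool)."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    cmd = ["simpleocr", image_path, "--format", fmt, "--mode", mode]
    if langs:
        # --lang takes comma-separated language codes.
        cmd += ["--lang", ",".join(langs)]
    if min_confidence is not None:
        if not 0.0 <= min_confidence <= 1.0:
            raise ValueError("min_confidence must be between 0.0 and 1.0")
        cmd += ["--min-confidence", str(min_confidence)]
    if pii:
        cmd.append("--pii")
    if output:
        cmd += ["--output", output]
    return cmd
```

On a Mac with simpleocr installed, the resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`.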

Supported Input Formats

jpg, jpeg
png
tiff, tif
heic, heif
bmp
gif

Examples

Basic OCR (plain text, best for LLMs):

simpleocr examples/example-bill.png --format plain

OCR with spatial coordinates:

simpleocr examples/example-bill.png

JSON output:

simpleocr examples/example-bill.png --format json

Table-focused JSON output:

simpleocr examples/example-bill.png --format table-json

Fast mode with German-first language hints:

simpleocr examples/example-bill.png --lang de-DE,en-US --mode fast

Generate a searchable image PDF:

simpleocr examples/example-bill.png --format pdf-image --output bill-searchable.pdf

Redact PII before returning text:

simpleocr examples/example-bill.png --pii

Read image from stdin:

cat screenshot.png | simpleocr - --format plain

JSON errors for programmatic consumption:

simpleocr missing.png --error-format json
# stderr: {"error":"Error: File not found or unreadable: missing.png","code":1}
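
A calling script can decode that error object instead of matching on message text. The sketch below assumes the error arrives as a single JSON object on one line of stderr with `error` and `code` keys, as in the example above; `parse_error` is a hypothetical helper name.

```python
import json
from typing import Optional, Tuple


def parse_error(stderr_text: str) -> Optional[Tuple[int, str]]:
    """Return (code, message) from the last JSON error line, or None if there is none."""
    for line in reversed(stderr_text.strip().splitlines()):
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON, e.g. a plain-text warning
        if isinstance(payload, dict) and "error" in payload and "code" in payload:
            return payload["code"], payload["error"]
    return None


# Sample taken from the example above.
sample = '{"error":"Error: File not found or unreadable: missing.png","code":1}'
result = parse_error(sample)
```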

Describe available output formats:

simpleocr --describe-formats

Output Formats

`plain`

Plain text output, one line per recognized text element, sorted top-to-bottom then left-to-right. Best for feeding into LLMs or other text processing tools.

Example:

Muster GmbH
Industriestrasse 42, 80331 Munchen

`text`

Spatially aware text with normalized (y, x) coordinates prepended to each line. Useful when position matters.

Example:

[y=0.08,x=0.06] Muster GmbH
[y=0.11,x=0.06] Industriestrasse 42, 80331 Munchen
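
If a downstream step needs the coordinates as numbers, the prefix can be split off with a regular expression. This is a minimal Python sketch that assumes every line follows the `[y=...,x=...] text` shape shown in the example; `PositionedLine` and `parse_text_output` are hypothetical names.

```python
import re
from typing import List, NamedTuple


class PositionedLine(NamedTuple):
    y: float
    x: float
    text: str


# Matches the "[y=0.08,x=0.06] " prefix shown in the example above.
LINE_RE = re.compile(r"^\[y=([0-9]*\.?[0-9]+),x=([0-9]*\.?[0-9]+)\]\s?(.*)$")


def parse_text_output(output: str) -> List[PositionedLine]:
    """Parse simpleocr `text` output into (y, x, text) tuples, skipping other lines."""
    lines = []
    for raw in output.splitlines():
        match = LINE_RE.match(raw)
        if match:
            lines.append(
                PositionedLine(float(match.group(1)), float(match.group(2)), match.group(3))
            )
    return lines


sample = "[y=0.08,x=0.06] Muster GmbH\n[y=0.11,x=0.06] Industriestrasse 42, 80331 Munchen"
parsed = parse_text_output(sample)
```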

`json`

Returns document metadata, recognized observations, and inferred structured regions:

{
  "image_size": {
    "height": 3508,
    "width": 2480
  },
  "language_hints": [
    "de-DE",
    "en-US"
  ],
  "observations": [
    {
      "bounding_box": {
        "height": 0.03,
        "width": 0.22,
        "x": 0.06,
        "y": 0.08
      },
      "confidence": 0.98,
      "text": "Muster GmbH"
    }
  ],
  "pii_redacted": false,
  "recognition_level": "accurate",
  "source": "invoice.png"
}
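
Because the bounding boxes are normalized, a consumer has to scale them by `image_size` to get pixel values. The Python sketch below does that for the fields shown above and optionally drops low-confidence observations. It assumes `x` and `y` are measured from the top-left corner, which is consistent with the top-to-bottom ordering in the `text` example but is not stated explicitly here; `pixel_boxes` is a hypothetical helper.

```python
from typing import Dict, List


def pixel_boxes(document: Dict, min_confidence: float = 0.0) -> List[Dict]:
    """Scale normalized bounding boxes to pixels (assumes a top-left origin)."""
    width = document["image_size"]["width"]
    height = document["image_size"]["height"]
    boxes = []
    for obs in document["observations"]:
        if obs["confidence"] < min_confidence:
            continue  # skip observations below the caller's threshold
        box = obs["bounding_box"]
        boxes.append({
            "text": obs["text"],
            "left": round(box["x"] * width),
            "top": round(box["y"] * height),
            "width": round(box["width"] * width),
            "height": round(box["height"] * height),
        })
    return boxes


# Trimmed copy of the example document above.
document = {
    "image_size": {"height": 3508, "width": 2480},
    "observations": [
        {
            "bounding_box": {"height": 0.03, "width": 0.22, "x": 0.06, "y": 0.08},
            "confidence": 0.98,
            "text": "Muster GmbH",
        }
    ],
}
boxes = pixel_boxes(document, min_confidence=0.5)
```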

`table-json`

Returns only inferred table-like regions with row and cell structure derived from geometry:

{
  "image_size": {
    "height": 1161,
    "width": 796
  },
  "language_hints": [
    "de-DE",
    "en-US"
  ],
  "pii_redacted": false,
  "recognition_level": "accurate",
  "source": "example-bill.png",
  "tables": [
    {
      "column_anchors": [0.1, 0.14, 0.49, 0.59, 0.74, 0.82],
      "row_count": 2
    }
  ]
}
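
One possible use of `column_anchors` is to assign a text element to a column by picking the anchor closest to its x position. The Python sketch below assumes the anchors are normalized x positions sorted in ascending order, as in the example; that interpretation and the `nearest_column` helper are assumptions, not documented behavior.

```python
from bisect import bisect_left
from typing import List


def nearest_column(x: float, anchors: List[float]) -> int:
    """Return the index of the anchor closest to x; anchors must be sorted ascending."""
    if not anchors:
        raise ValueError("anchors must not be empty")
    i = bisect_left(anchors, x)
    if i == 0:
        return 0
    if i == len(anchors):
        return len(anchors) - 1
    # Choose between the neighbours on either side of x.
    return i if anchors[i] - x < x - anchors[i - 1] else i - 1


# Anchors copied from the example above.
anchors = [0.1, 0.14, 0.49, 0.59, 0.74, 0.82]
column = nearest_column(0.5, anchors)
```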

`pdf-text`

Creates a PDF page containing rendered OCR text only.

`pdf-image`

Creates a PDF containing the original image with an invisible text layer for search and copy/paste.

Claude Code Skill

This repo includes a Claude Code skill that lets coding agents run OCR directly:

/ocr examples/example-bill.png
/ocr screenshot.png --format json

The skill is defined in .claude/skills/ocr/SKILL.md and is available automatically when Claude Code is used in this project.

To use the skill in other projects, install it to your personal skills directory:

mkdir -p ~/.claude/skills/ocr
curl -fsSL https://raw.githubusercontent.com/tobilg/simpleocr/main/.claude/skills/ocr/SKILL.md -o ~/.claude/skills/ocr/SKILL.md

Development

Build from source

Use the wrapper script so SwiftPM and Clang caches stay inside the repository:

./scripts/build-local.sh

Release build:

./scripts/build-local.sh -c release

Troubleshooting

Swift / SDK version mismatch

If you see an error like:

this SDK is not supported by the compiler

your selected Swift toolchain and the active Apple SDK do not match. Fix it by:

installing a matching Xcode version
selecting the matching developer directory with xcode-select
rerunning ./scripts/build-local.sh

Sandbox cache warnings

The wrapper script exports local cache paths:

SWIFTPM_MODULECACHE_OVERRIDE=.build/module-cache
CLANG_MODULE_CACHE_PATH=.build/clang-module-cache

That avoids writing to global cache locations during local or sandboxed builds. If plain swift build already works on your machine, you can keep using it.

Project Layout

Package.swift
README.md
.claude/skills/ocr/SKILL.md
Sources/simpleocr/main.swift
Sources/simpleocr/CLI.swift
Sources/simpleocr/Models.swift
Sources/simpleocr/ObservationLayout.swift
Sources/simpleocr/OCREngine.swift
Sources/simpleocr/OutputFormatter.swift
Sources/simpleocr/PDFGenerator.swift
Sources/simpleocr/PIIRedactor.swift
Tests/simpleocrTests/
requirements/ocr-cli-prd.md
examples/example-bill.png
