PDF/Image to Markdown converter using Claude's vision. Named after Nori, the dwarf scribe from LOTR.
Renders PDF pages to images using macOS native APIs, then uses the Claude Agent SDK to visually read each page and produce faithful markdown — including ASCII art for diagrams.
Use it as a standalone CLI, or as a Claude Code skill that runs entirely inside your editor.
Most PDF-to-markdown tools extract embedded text, which misses diagrams, figures, and visual layout. nori sees each page as an image and converts what it sees, preserving:
- Text with proper heading structure
- Tables as markdown tables
- Diagrams and flowcharts as ASCII art
- Code snippets with syntax highlighting
- Math equations in LaTeX notation
nori also ships as a Claude Code skill, so you
can convert documents without leaving your editor. The skill does not use the Python
pipeline or the Agent SDK — Claude Code is already a vision model, so it reads the pages
itself. The only bundled helper is a tiny macOS PDF rasterizer
(.claude/skills/nori/scripts/render_pdf.js).
The skill lives in this repo at .claude/skills/nori/, so it's active whenever you run
Claude Code inside the project. To use it from any project, copy it to your personal
skills folder:
cp -r .claude/skills/nori ~/.claude/skills/
Invoke it directly:
/nori document.pdf
/nori paper.pdf -o notes.md -w 6 -p 1-5 --scale 3.0
/nori ./exported_pages/
…or just ask in plain language ("convert this PDF to markdown") and Claude loads the skill automatically.
Flags: -o/--output, -w/--workers (parallel transcription subagents, default 4),
-p/--pages (e.g. 1-5), --scale (default 2.0).
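The `-p/--pages` example (`1-5`) reads like an inclusive, 1-based page range. A minimal sketch of how such a value could be expanded, assuming that interpretation; `parse_page_range` is a hypothetical helper, not a function taken from nori:

```python
def parse_page_range(spec: str, total_pages: int) -> list[int]:
    """Expand "1-5" or "3" into 1-based page numbers (hypothetical helper)."""
    start_text, sep, end_text = spec.partition("-")
    start = int(start_text)
    # A bare number such as "3" selects a single page.
    end = int(end_text) if sep else start
    if not 1 <= start <= end <= total_pages:
        raise ValueError(f"page range {spec!r} is outside 1-{total_pages}")
    return list(range(start, end + 1))
```

For a 10-page document, `parse_page_range("1-5", 10)` yields pages 1 through 5, and an out-of-bounds range raises `ValueError` instead of silently clamping.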
| | CLI (uv run nori) | Skill (/nori) |
|---|---|---|
| Conversion engine | Claude Agent SDK (separate runs) | Claude Code's own vision |
| PDF rendering | macOS PDFKit | macOS PDFKit (same script) |
| -w parallelism | hard limit (asyncio semaphore) | best-effort (subagents) |
| Resume after crash | yes (state in ~/.nori/) | no — single session |
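The "hard limit (asyncio semaphore)" entry can be illustrated with a short sketch. This is not nori's actual code: `convert_page` is a hypothetical stand-in for the per-page conversion call, and `convert_all` is an illustrative name. It only shows how an `asyncio.Semaphore` caps the number of pages in flight at the `-w` value.

```python
import asyncio


async def convert_page(page: int) -> str:
    """Hypothetical stand-in for the real per-page conversion."""
    await asyncio.sleep(0.01)  # simulate the time spent waiting on the model
    return f"<!-- page {page} -->"


async def convert_all(pages: list[int], workers: int) -> tuple[list[str], int]:
    """Convert pages concurrently; return results in page order and the peak concurrency."""
    semaphore = asyncio.Semaphore(workers)
    in_flight = 0
    peak = 0

    async def bounded(page: int) -> str:
        nonlocal in_flight, peak
        async with semaphore:  # at most `workers` tasks pass this point at once
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await convert_page(page)
            finally:
                in_flight -= 1

    # gather() returns results in the order the awaitables were passed in.
    results = await asyncio.gather(*(bounded(p) for p in pages))
    return list(results), peak
```

Because every task must acquire the semaphore before doing any work, the limit holds no matter how many pages are queued, which is what distinguishes it from the best-effort behaviour of subagents.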
- macOS (uses native PDFKit for PDF rendering)
- uv (Python package manager)
- Claude subscription (uses Claude Agent SDK — no API key needed)
cd nori && uv sync
uv run nori document.pdf
Output: document.md in the same directory as the PDF.
# A folder of page images
uv run nori ./exported_pages/
# Individual image files
uv run nori page1.png page2.png page3.png
uv run nori --help
positional arguments:
input PDF file, image files, or directories containing images
options:
-o, --output OUTPUT Output markdown file (default: <input_name>.md)
-w, --workers N Number of parallel workers (default: 1)
--scale SCALE Scale factor for PDF rendering (default: 2.0)
--clean Clear saved state and start fresh
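For readers who want to script around the CLI, the help text above maps naturally onto an `argparse` definition. The sketch below is a reconstruction from that help output only, not nori's source; in particular, accepting one or more `input` values (`nargs="+"`) is an assumption based on the multi-image usage example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Reconstruct the documented options (assumed layout, not nori's source)."""
    parser = argparse.ArgumentParser(prog="nori")
    parser.add_argument(
        "input", nargs="+",
        help="PDF file, image files, or directories containing images",
    )
    parser.add_argument(
        "-o", "--output",
        help="Output markdown file (default: <input_name>.md)",
    )
    parser.add_argument(
        "-w", "--workers", type=int, default=1, metavar="N",
        help="Number of parallel workers (default: 1)",
    )
    parser.add_argument(
        "--scale", type=float, default=2.0,
        help="Scale factor for PDF rendering (default: 2.0)",
    )
    parser.add_argument(
        "--clean", action="store_true",
        help="Clear saved state and start fresh",
    )
    return parser
```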
# Custom output path
uv run nori paper.pdf -o notes.md
# Faster with more workers
uv run nori paper.pdf -w 8
# Higher quality rendering
uv run nori paper.pdf --scale 3.0
# Start fresh (discard previous progress)
uv run nori paper.pdf --clean
nori saves progress after each page. If interrupted (Ctrl+C, timeout, error), re-run the same command and it picks up where it left off:
uv run nori big-document.pdf -w 8
# ... Converted 42/96 pages
# ^C (interrupted)
uv run nori big-document.pdf -w 8
# Resuming — 42/96 pages already done
# Converted 96/96 pages — done
State is stored in ~/.nori/ and auto-cleaned after successful completion.
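The resume behaviour can be pictured as a small per-document state file that is rewritten after every finished page. The sketch below assumes a JSON file mapping page numbers to markdown; the real layout of `~/.nori/` is not documented here, and `load_done`, `save_page`, and `pending_pages` are hypothetical names.

```python
import json
from pathlib import Path


def load_done(state_file: Path) -> dict[int, str]:
    """Return {page_number: markdown} for pages finished in an earlier run."""
    if not state_file.exists():
        return {}
    raw = json.loads(state_file.read_text(encoding="utf-8"))
    # JSON object keys are always strings, so convert them back to ints.
    return {int(page): markdown for page, markdown in raw.items()}


def save_page(state_file: Path, done: dict[int, str], page: int, markdown: str) -> None:
    """Record one finished page, writing via a temp file so a crash cannot truncate the state."""
    done[page] = markdown
    state_file.parent.mkdir(parents=True, exist_ok=True)
    tmp = state_file.with_suffix(".tmp")
    tmp.write_text(json.dumps(done), encoding="utf-8")
    tmp.replace(state_file)


def pending_pages(total_pages: int, done: dict[int, str]) -> list[int]:
    """List the 1-based pages that still need converting."""
    return [page for page in range(1, total_pages + 1) if page not in done]
```

On a re-run, only the pages returned by `pending_pages` would need to be sent for conversion; deleting the state file corresponds to what `--clean` is described as doing.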
- PDF to images — Uses macOS native PDFKit (via JXA/CoreGraphics) to render each page as a PNG. No external dependencies like poppler or ImageMagick.
- Images to markdown — Sends each page image to Claude via the Agent SDK's Read tool (multimodal). Claude visually reads the page and outputs markdown.
- Parallel processing — Multiple pages are converted concurrently for speed.
- Assembly — Page markdowns are stitched together in order into the final document.
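The assembly step matters because concurrent workers finish pages out of order. A minimal sketch of stitching them back together, assuming each page's markdown is keyed by its page number and that pages are separated by a blank line (the separator nori actually uses is not stated); `assemble` is a hypothetical name:

```python
def assemble(pages: dict[int, str]) -> str:
    """Join per-page markdown in ascending page order, separated by blank lines."""
    ordered = (pages[number].strip() for number in sorted(pages))
    return "\n\n".join(ordered) + "\n"
```

Sorting by page number rather than by completion order keeps the final document faithful to the source even when, say, page 7 finishes before page 3.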
MIT