Tags: misty-rc/zipdf

v0.2.1

Refactor processor and unify code comments in Japanese

- Extract prepareImages helper to deduplicate quality/reencode logic
- Translate all pdfextract.go doc comments from English to Japanese
- Fix double blank line before uniqueName
- Translate README to English

v0.2.0

Add --recompress mode with lightweight direct PDF parser

Replace pdfcpu-based image extraction with a custom XRef parser
that reads image streams one at a time without loading the entire
PDF into memory. Fixes hangs on large PDFs (600MB+).

- pdfextract.go: parse XRef table, extract FlateDecode and DCTDecode
  image streams directly; FlateDecode+PNG predictor data is wrapped
  in a synthetic PNG container (no decompression needed: the PDF
  stream and a PNG IDAT payload are the same zlib format)
- processor.go: switch to extractPDFImagesDirect, remove pdfcpu dep
- main.go: skip *_compressed.pdf files to avoid reprocessing outputs

Results: 622MB PDF (181 pages) → 116MB at quality 70, no hang.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>