Tags: misty-rc/zipdf
Refactor processor and unify code comments to Japanese

- Extract prepareImages helper to deduplicate quality/reencode logic
- Translate all pdfextract.go doc comments from English to Japanese
- Fix double blank line before uniqueName
- Translate README to English

Add --recompress mode with lightweight direct PDF parser

Replace pdfcpu-based image extraction with a custom XRef parser that reads image streams one at a time without loading the entire PDF into memory. Fixes hangs on large PDFs (600MB+).

- pdfextract.go: parse XRef table, extract FlateDecode and DCTDecode image streams directly; FlateDecode+PNG predictor data is wrapped in a synthetic PNG container (no decompression needed: the PDF stream data and PNG IDAT data use the same zlib format)
- processor.go: switch to extractPDFImagesDirect, remove pdfcpu dep
- main.go: skip *_compressed.pdf files to avoid reprocessing outputs

Results: 622MB PDF (181 pages) → 116MB at quality 70, no hang.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
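
The synthetic PNG container mentioned for pdfextract.go could look roughly like the sketch below. It is not the repository's code: the names wrapFlateAsPNG, writeChunk, sampleStream and decodedSize are hypothetical, and it assumes the image dictionary's width, height, bit depth and color layout map directly onto a PNG color type. The already-compressed stream bytes are placed unchanged into a single IDAT chunk between an IHDR and an IEND chunk.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/binary"
	"fmt"
	"hash/crc32"
	"image/png"
)

// pngSignature is the fixed 8-byte header that starts every PNG file.
var pngSignature = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}

// writeChunk appends one PNG chunk: big-endian length, 4-byte type,
// data, then a CRC-32 computed over type+data.
func writeChunk(buf *bytes.Buffer, typ string, data []byte) {
	var length [4]byte
	binary.BigEndian.PutUint32(length[:], uint32(len(data)))
	buf.Write(length[:])
	buf.WriteString(typ)
	buf.Write(data)

	crc := crc32.NewIEEE()
	crc.Write([]byte(typ))
	crc.Write(data)
	var sum [4]byte
	binary.BigEndian.PutUint32(sum[:], crc.Sum32())
	buf.Write(sum[:])
}

// wrapFlateAsPNG (hypothetical helper) wraps zlib-compressed,
// PNG-predictor-filtered scanlines in a minimal PNG file without
// decompressing them. The caller supplies the image parameters.
func wrapFlateAsPNG(zdata []byte, width, height uint32, bitDepth, colorType byte) []byte {
	var buf bytes.Buffer
	buf.Write(pngSignature)

	ihdr := make([]byte, 13)
	binary.BigEndian.PutUint32(ihdr[0:4], width)
	binary.BigEndian.PutUint32(ihdr[4:8], height)
	ihdr[8] = bitDepth
	ihdr[9] = colorType
	// ihdr[10..12] stay 0: deflate compression, standard filtering, no interlace.

	writeChunk(&buf, "IHDR", ihdr)
	writeChunk(&buf, "IDAT", zdata)
	writeChunk(&buf, "IEND", nil)
	return buf.Bytes()
}

// sampleStream builds a stand-in for a PDF image stream: a 2x2 8-bit
// grayscale image, each row prefixed with filter byte 0, zlib-compressed.
func sampleStream() []byte {
	rows := []byte{0, 10, 20, 0, 30, 40}
	var z bytes.Buffer
	w := zlib.NewWriter(&z)
	w.Write(rows)
	w.Close()
	return z.Bytes()
}

// decodedSize decodes PNG bytes with the standard library and returns
// the image dimensions, as a check that the container is valid.
func decodedSize(data []byte) (int, int, error) {
	img, err := png.Decode(bytes.NewReader(data))
	if err != nil {
		return 0, 0, err
	}
	return img.Bounds().Dx(), img.Bounds().Dy(), nil
}

func main() {
	out := wrapFlateAsPNG(sampleStream(), 2, 2, 8, 0) // color type 0 = grayscale
	w, h, err := decodedSize(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(w, h) // prints "2 2"
}
```

A real extractor would also need to confirm that the stream's predictor setting is one of the PNG predictors and that the color space is one PNG can represent; streams that fail those checks would have to be decompressed and re-encoded instead.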