node.js module for extracting text from html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf and more!
Updated Nov 11, 2025 - HTML
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
Optical Character Recognition app that extracts text from images, built with FastAPI, Tailwind CSS, and Pytesseract