
linkhut
Bookmarks
tagged with:
  • ocr
  • dev

30 Aug 25

CatchTheTornado/text-extract-api: Document (PDF, Word, PPTX ...) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown

https://github.com/CatchTheTornado/text-extract-api
by shubxam 3 months ago
Tags:
  • ocr
  • pdf
  • dev
  • library
  • python
  • repo


linkhut is open source software. You can contribute and report issues on SourceHut at ~mlb/linkhut (v0.1.0)