24 Oct 25

Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing. The pdf-to-markdown GitHub repository hosts a tool designed to convert PDF files into Markdown format for easier text extraction and reformatting, with the process running locally on the user’s machine.

by tmfnk 2 months ago