24 Oct 25
Converts PDF documents to structured Markdown, optimized for Retrieval-Augmented Generation (RAG) and other NLP tasks. Extracts text, tables, and images with formatting preserved, for better information retrieval and processing. The pdf-to-markdown GitHub repository hosts a tool that converts PDF files to Markdown for easier text extraction and reformatting; the conversion runs locally on the user’s machine.
by tmfnk
2 months ago
31 Jan 25
MarkItDown is a utility for converting various files to Markdown (e.g., for indexing or text analysis). It supports: