Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
List of software that allows searching the web with the assistance of AI: https://hf.co/spaces/felladrin/awesome-ai-web-search
HTML to markdown converter
Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.
Use LLMs for building real-world apps
Bedrock Knowledge Base and Agents for Retrieval Augmented Generation (RAG)
Medical RAG QA App using Meditron 7B LLM, Qdrant Vector Database, and PubMedBERT Embedding Model.
A sample chatbot in C# using Microsoft Agent Framework
Logseq Spring Thing Immersive & Agentic Knowledge Development Engine
"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li and Ye Yuan and Zehua Zhang
'Talk to your slide deck' (Multimodal RAG) using foundation models (FMs) hosted on Amazon Bedrock and Amazon SageMaker
A proof-of-concept for a RAG to query the scikit-learn documentation
Anthropic's Contextual Retrieval implementation with visual chunk comparison. Preview context enrichment before/after embedding.