Document intelligence framework for Python - Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.
List of software that allows searching the web with the assistance of AI: https://hf.co/spaces/felladrin/awesome-ai-web-search
Use LLMs for building real-world apps
HTML to markdown converter
A sample chatbot in C# using Microsoft Agent Framework
Medical RAG QA App using Meditron 7B LLM, Qdrant Vector Database, and PubMedBERT Embedding Model.
Source code for the Gilded Age Gourmet, a cooking chat app based on the Boston Cooking-School Cook Book.
Bedrock Knowledge Base and Agents for Retrieval Augmented Generation (RAG)
Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.
'Talk to your slide deck' (Multimodal RAG) using foundation models (FMs) hosted on Amazon Bedrock and Amazon SageMaker
A RAG implementation using an open-source stack. The app is built on BioMistral 7B, with PubMedBERT as the embedding model, Qdrant as a self-hosted vector DB, and LangChain and llama.cpp as orchestration frameworks.