Given a set of PDFs and a query, finds the most relevant PDF using TF-IDF. TF-IDF is implemented from scratch, without any library.
Updated Oct 15, 2019 - Python
The extended version of simhash supports fingerprint extraction of documents and images.
COVID-19 Open Research Dataset (CORD-19) Analysis
Search through all your personal data efficiently like web search.
Information retrieval over text documents using TF-IDF weighting and cosine similarity.
This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.
dead simple document index and search, nothing fancy
Retrieval-Augmented Generation (RAG) is an approach that enhances pre-trained large language models (LLMs) by integrating them with external data sources. The technique combines the generative power of LLMs with the precision of specialized data search mechanisms.
Stichwortfinder: keyword finder for texts in the documents of a directory (for English, see README-en.md)
Semantic document search system with pgvector and PGAI
An in-memory NoSQL database implemented in Python.
An interactive GPT-style web application that lets you query folders of PDFs using open-source LLMs from Meta, Microsoft, Google, Mistral, and more.
Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.
AI-powered hybrid search engine combining keyword, vector, and LLM-based contextual search using RAG with support for AI21, OpenAI or any other LLM.
Local Retrieval-Augmented Generation (RAG) pipeline using LangChain and ChromaDB to query PDF files with LLMs.
Chat with your PDFs using AI! This Streamlit app uses RAG, LangChain, FAISS, and OpenAI to let you ask questions and get answers with page and file references.
Semester project for the Information Retrieval course
SmartRAG is a terminal-based RAG system using LangGraph. It processes queries by retrieving relevant content from markdown or PDFs, then responds using OpenAI GPT. Supports webpage-to-PDF conversion, vector DB search, and modular flow control.
PostgreSQL-native semantic search engine with multi-modal capabilities. Add AI-powered search to your existing database without separate vector databases, vendor fees, or complex setup. Features text + image search using CLIP embeddings, native SQL joins, and 10-minute Docker deployment.
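Several of the projects listed above rank documents with TF-IDF weighting and cosine similarity. The following is a minimal sketch of that idea in plain Python, not the code of any listed repository. The function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `rank`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration; real projects typically add PDF text extraction, stop-word removal, and stemming.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_idf(tokenized_docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    tokenized = [tokenize(d) for d in docs]
    idf = build_idf(tokenized)
    doc_vecs = [tfidf_vector(t, idf) for t in tokenized]
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda s: s[0], reverse=True)
```

Calling `rank("vector embeddings", docs)` on a list of document strings returns the documents ordered by similarity to the query; a document that shares no terms with the query scores 0.0.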