Processing and hashing Slack communication to enable language modelling
Updated Jun 7, 2023 · Python
A fast and easy-to-use Python toolkit for image processing with CLI tools for resizing, cropping, OCR, and optimization, including batch processing support.
A local GPU-accelerated Retrieval-Augmented Generation (RAG) pipeline for PDF question-answering with multi-LLM support and modular NLP components. Process documents locally with privacy-focused information retrieval.
PDF Liberation MCP Server - Break large PDFs into digestible chunks for Claude
Web application to extract text from images
This web application uses OCR to recognize text in uploaded images, then applies spelling correction and word-level improvements. Users upload images containing text and receive accurate, enhanced text results.
FastSnip - Free OCR screen capture tool for Windows. Extract text from anywhere on your screen with Ctrl+Shift+T. Perfect TextSniper alternative with multi-language support.
This assistant tool (WIP) helps you search, browse, and summarize answers to your questions from an uploaded PDF using text analytics, semantic search, and a Large Language Model (LLM).
The objective is to analyze text content from a list of URLs. This involves extracting article titles and text, then performing natural language processing to generate metrics like sentiment, readability, and word usage. Finally, the results are stored for further analysis or visualization.
Extract price amount and currency symbol from a raw text string
OCR tool to extract and structure text from images and scanned PDFs (outputs .docx / .txt) — FR/EN
Python tool for converting PDF files to text. Simplify your document processing tasks.
A privacy-focused, client-side web application that extracts clean, readable content from any webpage and converts it to PDF format. Built with pure HTML, CSS, and JavaScript—no backend required, no tracking, complete privacy.
A Cloud-Native Infrastructure for License Plate Recognition and Text Extraction with Python Integration
Implementing text summarization techniques on the 'CNN/DailyMail' dataset, using both 'Extractive' and 'Abstractive' strategies.
A Python-based application for live video text extraction using the Gemini 1.5 Flash API, hand gesture detection, and UI display.
Convert scrolling article videos into long images and extract text with OCR.
A complete Python pipeline that automates the creation of structured datasets from natural language search queries. This tool searches the web for content matching your query, scrapes and cleans the content, and outputs a structured dataset in multiple formats.
Retrieve data from two different websites, load it into a PostgreSQL database using Python, and combine it to derive and present new information.