text-extraction

Here are 384 public repositories matching this topic...

Gemma-3 OCR combines computer vision and NLP, using the Gemma-3 Vision model for precise OCR and cleaned-up text output. Built on Streamlit and Ollama, this self-contained system converts images into readable, Markdown-rendered text, aiming for high accuracy and confidentiality.

  • Updated Nov 9, 2025
  • Python

⚡ Pen2PDF Suite: an all-in-one 🚀 productivity platform ✨ with 🤖 AI-powered text extraction (PDF/Images → Markdown 📝), 📅 smart timetable management (CSV/Excel import 📊), ✅ todo lists with subtasks 📈, 🧠 AI-generated notes library 📚, and 💬 Isabella AI assistant (OpenAI/Microsoft/llama/Mistral/LongCat/Gemini models 🔄) for context-aware help 🧩.

  • Updated Nov 8, 2025
  • JavaScript

Apache Tika: extract text and metadata from any document format with this pre-built containerised solution. Kubernetes-ready deployment with an intuitive UI, API, and text-to-speech capabilities. Well suited to content indexing, analysis, and document processing workflows.

  • Updated Nov 3, 2025
  • JavaScript

DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boosts efficiency in text extraction, web data extraction, data mining, and document analysis. Offline processing is possible for security and confidentiality.

  • Updated Nov 1, 2025
  • C++
