extract-text

Here are 58 public repositories matching this topic...

KINGPIN707 / PDF-Highlight-Extractor

A Python tool for extracting highlighted text from PDF files while preserving formatting attributes (headers, bold, italic) and removing unwanted line breaks and page breaks. Perfect for integrating with content management systems.

python markdown pdf-converter mobi kindle ebook-reader extract-text kindle-clippings koreader remarkable-tablet highlight-color extract-highlights faiss-backend ai21labs

Updated Nov 13, 2025
Python

shelfio / tika-text-extract

Extract text from a document using Apache Tika

tika npm-package node-module extract-text apache-tika

Updated Nov 13, 2025
TypeScript

lifegpc / msg-tool

Tools for exporting and importing scripts

extractor circus kirikiri extract-text bgi galgame ethornell

Updated Nov 13, 2025
Rust

yaxyobekuz / image-ocr

A simple OCR REST API for extracting text from images, with multi-language support and confidence scoring.

api image ocr public tesseract free extract-text image-ocr free-ocr

Updated Nov 10, 2025
JavaScript

saidsef / tika-document-to-text

Extract text and metadata from any document format with Apache Tika, using this pre-built containerised solution. Kubernetes-ready deployment with an intuitive UI, API, and text-to-speech capabilities; well suited to content indexing, analysis, and document processing workflows.

nodejs python kubernetes text-to-speech docker-container text-extraction extract-text kubernetes-deployment helm-chart document-to-text document-to-text-ui

Updated Nov 3, 2025
JavaScript

datalogics / pdf-rest-api-samples

pdfRest API Toolkit is a REST API service for processing PDF documents, made by developers, for developers. Rapidly integrate PDF workflows with your existing projects and applications, simply and seamlessly. Get started for free in seconds.

Updated Oct 23, 2025
C#

ropensci / rtika

R Interface to Apache Tika

java r parse tika tesseract rstats pdf-files r-package extract-text extract-metadata peer-reviewed

Updated Oct 12, 2025
R

Circle-Company / Circle-Text-Library

Circle Text Library. This code is used to process and analyze text from CircleApp.

parser sentiment-analysis keyword-extraction extract-text text-validation

Updated Oct 11, 2025
TypeScript

Zoltanar / Happy-Reader

VNDB explorer and VNR-like text hooker.

wpf visual-novels extract-text vndb game-launcher vnr ithvnr translation-apis text-hooking

Updated Sep 27, 2025
C#

RPG-Maker-Translation-Tools / rvpacker-txt-rs-lib

Library for extracting text from RPG Maker files.

rust library extract rust-library rpg-maker-mv rpg-maker extract-text rust-crate rpg-maker-xp rpg-maker-vxace rpg-maker-vx rpg-maker-mz

Updated Sep 27, 2025
Rust

rlayers / pawpaw

Text Processing & Segmentation Framework

python nlp parser natural-language-processing tree information-extraction knowledge-graph lexer xml-parser query-engine text-processing query-language extract-text text-segmentation xmlparser hierarchical-text-segmentation

Updated Sep 18, 2025
Python

devmehq / extract-text

Node.js module for extracting text from HTML, PDF, DOC, DOCX, XLS, XLSX, CSV, PPTX, PNG, JPG, GIF, RTF, and more.

pdf ocr extractor tesseract-ocr extract-text tessaract

Updated Nov 11, 2025
HTML

SyncfusionExamples / Extract-text-from-PDF-Flutter

This repository contains examples of extracting text from PDF documents in Flutter apps using the Syncfusion PDF Flutter library.

pdf extract-text flutter-pdf extract-text-from-pdf

Updated Aug 13, 2025
Dart

BitMiracle / Docotic.Pdf.Samples

C# and VB.NET samples for Docotic.Pdf library

Updated Aug 5, 2025
Visual Basic .NET

loganlinn / copy-text-of-selected-area-shortcut

Apple Shortcut to copy text of selected area (screenshot) to clipboard

macos screenshot ios shortcuts extract-text shortcuts-app

Updated Jun 10, 2025

mehmetensarcetin / OCRCapturText

Capture images from your computer and easily extract text from images.

python ocr extract-text ocr-python extract-text-from-image

Updated May 21, 2025
Python

sgerwk / pdftoroff

View PDF on X11 and the Linux framebuffer; resize PDF; convert PDF to text, HTML, TeX, groff

html pdf tex accessibility framebuffer pdf-viewer pdf-files extract-text small-screen groff linux-framebuffer two-columns pdf-scale small-page pdf-resize

Updated May 5, 2025
C

ropensci / antiword

R wrapper for antiword utility

r rstats r-package extract-text antiword

Updated Apr 4, 2025
C

ahmedkhemiri95 / PDFs-TextExtract

Text extraction from multiple and large PDF documents.

python pdf parser data-science pdf-document text-analytics pdfs pypdf2 extract-text pdfminer pdf-processing pdfs-textextract

Updated Feb 10, 2025
Python

islom-pardaboyev / image_to_text_converter

A React-based web app that extracts text from images using Tesseract.js. Upload an image, and the app will process it automatically. Supports manual text extraction as well. 🚀

react image-to-text extract-text react-icons tailwindcss sooner

Updated Feb 10, 2025
TypeScript
