tika

A Java application that uses Lucene and Tika to search documents and display the part of each document in which the match is found, along with precision and recall values.

java search-engine tika lucenesearch

Updated Aug 20, 2017
Java

voltek62 / Rwahoo

Create the ultimate scraper with Apache Tika for R

cran r tika

Updated Mar 23, 2018
R

dmamakas2000 / tiktok-java-app

This project implements a multimedia content sharing system in Java 8 that lets users upload videos and stream them to their subscribers. Inspired by platforms like TikTok, it manages user channels, subscriptions, and real-time video streaming, and develops an event delivery system for efficient content promotion.

java distributed-systems multimedia tika broker java-8 client-server video-streaming

Updated Dec 2, 2024
Java

bastman / azure-functions-kotlin

POC: azure-functions (kotlin, gradle, tika)

kotlin gradle tika azure-functions

Updated Feb 18, 2019
Kotlin

FrodeRanders / disksearch

Indexes a directory hierarchy and provides a crude search interface onto that index

tika pdfbox poi lucene

Updated Dec 16, 2024
Java

kanety / tikarb

A simple Apache Tika binding for Ruby using rjb

ruby tika

Updated May 24, 2020
Ruby

chrisbratlien / aws-bucketeer

Apache Solr/Tika index/search plus SHA256 content-based addressing for files stored into AWS S3 buckets

aws buckets solr tika s3 apache sha256 content-based addressing

Updated Oct 26, 2021
PHP

tirthmehta / Apache-Solr-based-Web-Search-Engine

Deployment of a search engine utilizing Apache Solr, Apache Tika and spelling correction programs.

python java php solr tika

Updated Jul 28, 2017

sesam-community / content-extractor

Extracts textual information from JSON streams using the Apache Tika library

docker tika transform sesam

Updated Apr 25, 2017
Java

AidaRosaCalvo / info-retrieval-system

This project builds an information retrieval system that can handle documents in different formats drawn from an information repository. The application uses tools such as Lucene and Tika to index the documents and extract information from them.

clustering tika javafx java-8 lucene kmeans-clustering linkage c-means-implementation

Updated Jun 23, 2024
Java

vahabov007 / PDFTextSearch

PDFTextSearch is a Spring Boot backend service that extracts text from uploaded PDF documents using Apache Tika and indexes the extracted content into Elasticsearch for full-text search capabilities. Users can upload PDFs, search through their content, and retrieve matching documents.

pdf spring-boot tika elastic-search

Updated Nov 8, 2025
Java

orijtech / tikago

Apache Tika adapter in Go

tika pdf-to-text apache-tika transcribe docs-to-text

Updated Jan 4, 2017
Go

Here are 152 public repositories matching this topic...

dataiku / dss-plugin-nlp-extraction

krish-kunal / task

procesaur / TExASe

sbelassa / SMIR

gcpetri / SiteMap-Python

mrspaceman / elibraryserver

semarslan / ws

albertus82 / extfix

tusharkm / search_engine_using_lucene
