jtbates

Organizations

@dssg

Starred repositories

39 stars written in Java

Apache Iceberg

Java 8,742 3,164 Updated Apr 16, 2026

AI + Data, online. https://vespa.ai

Java 6,876 707 Updated Apr 16, 2026

Statistical Machine Intelligence & Learning Engine

Java 6,365 1,148 Updated Apr 16, 2026

Machine learning software for extracting information from scholarly documents

Java 4,793 543 Updated Apr 15, 2026

Collect, aggregate, and visualize a data ecosystem's metadata

Java 2,166 391 Updated Apr 12, 2026

A native library providing a Tinder-like cards effect. A card can be constructed using an image and displayed with animation effects, dismiss-to-like and dismiss-to-unlike, and use different sortin…

Java 1,469 361 Updated Nov 16, 2020

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Java 1,182 160 Updated Mar 27, 2026

Anserini is a Lucene toolkit for reproducible information retrieval research

Java 1,108 586 Updated Apr 16, 2026

A Java HTTP client for consuming Twitter's realtime Streaming API

Java 955 361 Updated Apr 6, 2022

align and compare tables

Java 903 76 Updated Mar 17, 2026

A programmable, embeddable web browser driver compatible with the Selenium WebDriver spec -- headless, WebKit-based, pure Java

Java 814 142 Updated Jul 29, 2024

INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management.

Java 689 167 Updated Apr 14, 2026

Duke is a fast and flexible deduplication engine written in Java

Java 626 190 Updated Oct 11, 2023

ReverseProxy-Android

Java 511 159 Updated Apr 16, 2018

Latent Dirichlet Allocation (LDA) model for Microblogs (Twitter, weibo etc.)

Java 322 108 Updated May 4, 2018

Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.com/booknlp/booknlp)

Java 316 46 Updated Feb 4, 2022

A Java library of SOCKS5 protocol including client and server

Java 305 109 Updated Jul 16, 2023

Mirror of Apache Samoa (Incubating)

Java 251 102 Updated Apr 16, 2023

An open source, high scalability toolkit in Java for Entity Resolution.

Java 223 43 Updated Jul 12, 2025

Flexible classic and NeurAl Retrieval Toolkit

Java 223 36 Updated Jun 28, 2025

Twitter Tools

Java 222 96 Updated Feb 18, 2018

Warcbase is an open-source platform for managing and analyzing web archives

Java 161 47 Updated Dec 8, 2017

Android app for saving webpages for offline reading.

Java 143 46 Updated Jul 15, 2021

Artificial Intelligence for Digital Response

Java 104 40 Updated Nov 21, 2018

A toolbox for statistical relational learning and reasoning.

Java 103 26 Updated Jul 6, 2022

neonion is a user-centered collaborative semantic annotation webapp developed at the Human-Centered Computing group at Freie Universität Berlin.

Java 70 10 Updated Feb 13, 2019

A spring-boot-starter application with user authentication, registration, and JPA using MySQL.

Java 49 22 Updated Oct 3, 2024

BoostSRL: "Boosting for Statistical Relational Learning." A gradient-boosting based approach for learning different types of SRL models.

Java 31 24 Updated Sep 11, 2023

Simple Kafka producer that ingests data from the Twitter Streaming API into a Kafka broker

Java 28 32 Updated Sep 19, 2016

Egonet is a program for the collection and analysis of egocentric network data. It helps you create the questionnaire, collect data, and provide general global network measures and data matrixes th…

Java 26 10 Updated Apr 3, 2026