jtbates

Organizations

@dssg

Starred repositories

35 stars written in Java

Apache Iceberg

Java 8,748 3,171 Updated Apr 17, 2026

AI + Data, online. https://vespa.ai

Java 6,878 707 Updated Apr 17, 2026

Statistical Machine Intelligence & Learning Engine

Java 6,366 1,148 Updated Apr 17, 2026

A machine learning software for extracting information from scholarly documents

Java 4,796 544 Updated Apr 17, 2026

Collect, aggregate, and visualize a data ecosystem's metadata

Java 2,168 392 Updated Apr 12, 2026

A native library providing a Tinder-like cards effect. A card can be constructed using an image and displayed with animation effects, dismiss-to-like and dismiss-to-unlike, and use different sortin…

Java 1,468 361 Updated Nov 16, 2020

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Java 1,183 160 Updated Apr 17, 2026

Anserini is a Lucene toolkit for reproducible information retrieval research

Java 1,108 586 Updated Apr 18, 2026

A Java HTTP client for consuming Twitter's realtime Streaming API

Java 955 361 Updated Apr 6, 2022

align and compare tables

Java 903 76 Updated Mar 17, 2026

INCEpTION provides a semantic annotation platform offering intelligent annotation assistance and knowledge management.

Java 689 167 Updated Apr 17, 2026

Duke is a fast and flexible deduplication engine written in Java

Java 626 190 Updated Oct 11, 2023

ReverseProxy-Android

Java 511 159 Updated Apr 16, 2018

Latent Dirichlet Allocation (LDA) model for microblogs (Twitter, Weibo, etc.)

Java 322 108 Updated May 4, 2018

A Java library of SOCKS5 protocol including client and server

Java 305 109 Updated Jul 16, 2023

Mirror of Apache Samoa (Incubating)

Java 251 102 Updated Apr 16, 2023

An open source, high scalability toolkit in Java for Entity Resolution.

Java 223 43 Updated Jul 12, 2025

Flexible classic and NeurAl Retrieval Toolkit

Java 223 37 Updated Jun 28, 2025

Twitter Tools

Java 222 96 Updated Feb 18, 2018

Warcbase is an open-source platform for managing and analyzing web archives

Java 161 47 Updated Dec 8, 2017

Android app for saving webpages for offline reading.

Java 143 46 Updated Jul 15, 2021

Artificial Intelligence for Digital Response

Java 104 40 Updated Nov 21, 2018

A toolbox for statistical relational learning and reasoning.

Java 103 26 Updated Jul 6, 2022

neonion is a user-centered collaborative semantic annotation webapp developed at the Human-Centered Computing group at Freie Universität Berlin.

Java 70 10 Updated Feb 13, 2019

BoostSRL: "Boosting for Statistical Relational Learning." A gradient-boosting based approach for learning different types of SRL models.

Java 31 24 Updated Sep 11, 2023

Simple Kafka producer that ingests data from the Twitter Streaming API into a Kafka broker

Java 28 32 Updated Sep 19, 2016

Egonet is a program for the collection and analysis of egocentric network data. It helps you create the questionnaire, collect data, and provide general global network measures and data matrixes th…

Java 26 10 Updated Apr 3, 2026

https://android.scraperclub.com This is an experimental technique to share mobile phones as scraper workers coordinated over a central server. Since these are all normal phones running normal andro…

Java 20 6 Updated Mar 3, 2020

Joint Behavior-Topic Model for Microblogs

Java 15 9 Updated Apr 25, 2015

🔎 📄 SpExtor: Sparse Entity Extractor

Java 11 5 Updated Feb 10, 2020