Organizations

@openphacts @Data2Semantics

Scalable association rule mining from tabular datasets.

Python 27 3 Updated Feb 9, 2026

A modular framework for benchmarking multimodal AI agents in a reproducible, full-OS environment. Using an adaptation of Smolagents' CodeAgent, Docker containers to run the VM in, VMs created …

Jupyter Notebook 1 Updated Feb 9, 2026

Synthetic Patient Population Simulator

Java 2,970 824 Updated Feb 12, 2026

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

Python 4,911 449 Updated Feb 15, 2026

Repository for my AI master's thesis.

HTML 2 Updated Jul 4, 2023

Database system for AI-powered apps

Python 2,681 261 Updated May 17, 2024

potato: portable text annotation tool

Python 364 67 Updated Feb 12, 2026

A simple & elegant experiment tracking framework that integrates persistence logic & best practices directly into Python

Jupyter Notebook 538 16 Updated Jan 14, 2025

Powerful RDF Knowledge Graph Generation with RML Mappings

Python 260 50 Updated Feb 15, 2026

Easily convert common crawl to a dataset of caption and document. Image/text Audio/text Video/text, ...

Python 320 26 Updated Dec 9, 2023

An easy way to extract information from documents

Python 1,786 131 Updated May 3, 2023

Jupyter Notebook 3 Updated Jul 15, 2022

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Python 2,020 488 Updated Feb 13, 2026

Database Reasoning Over Text project for ACL paper

Python 1 Updated Mar 24, 2022

DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not w…

Java 1,474 129 Updated Jul 7, 2023

Data visualization workshop (Ams data science center, 2022Feb)

Jupyter Notebook 9 1 Updated Nov 25, 2025

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

C++ 26,722 4,100 Updated Jun 19, 2025

A beautiful, simple, clean, and responsive Jekyll theme for academics

HTML 15,113 12,801 Updated Feb 16, 2026

Type System for Data Analysis in Python

Python 216 20 Updated Feb 1, 2025

(subjective) overview of projects which are related both to python and semantic technologies (RDF, OWL, Reasoning, ...)

549 36 Updated Dec 7, 2025

Leveraging table semantics for data or knowledge discovery

Jupyter Notebook 2 Updated Jul 14, 2022

MELT - Matching EvaLuation Toolkit

Java 54 12 Updated Oct 1, 2025

Paper, data and code from Investigating Potential Security Vulnerability Manifestation through Various Analyses & Inferences Regarding Internet RFCs

HTML 19 2 Updated Jan 28, 2021

Python implementation of character-level, textual inter-annotator agreement with Krippendorff's alpha.

Python 3 Updated Feb 12, 2024

Graph Engine for Exploration and Search

Python 42 4 Updated Jan 26, 2024

REL: Radboud Entity Linker

Python 317 68 Updated Apr 9, 2024

Entity Linker solution

Python 1,205 235 Updated Sep 21, 2023

Labelling platform for text using weak supervision.

JavaScript 260 18 Updated Jun 24, 2022

openclean - Data Cleaning and data profiling library for Python

Python 83 5 Updated Nov 1, 2021

Java 7 3 Updated Nov 2, 2017