- Amsterdam
- http://pgroth.com
Stars
Scalable association rule mining from tabular datasets.
A modular framework for benchmarking multimodal AI agents in a reproducible, full-OS environment. Using an adaptation of Smolagents' CodeAgent, Docker containers to run the VM in, VMs created …
Synthetic Patient Population Simulator
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
A simple & elegant experiment tracking framework that integrates persistence logic & best practices directly into Python
Powerful RDF Knowledge Graph Generation with RML Mappings
Easily convert common crawl to a dataset of caption and document. Image/text Audio/text Video/text, ...
An easy way to extract information from documents
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
usc-isi-i2 / NeuralDB
Forked from facebookresearch/NeuralDB
Database Reasoning Over Text project for ACL paper
DDlog is a programming language for incremental computation. It is well suited for writing programs that continuously update their output in response to input changes. A DDlog programmer does not w…
Data visualization workshop (Ams data science center, Feb 2022)
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
A beautiful, simple, clean, and responsive Jekyll theme for academics
(subjective) overview of projects which are related both to Python and semantic technologies (RDF, OWL, Reasoning, ...)
Leveraging table semantics for data or knowledge discovery
Paper, data and code from Investigating Potential Security Vulnerability Manifestation through Various Analyses & Inferences Regarding Internet RFCs
Python implementation of character-level, textual inter-annotator agreement with Krippendorff's alpha.
Labelling platform for text using weak supervision.
openclean - Data Cleaning and data profiling library for Python