- probabl.ai
- Paris, France
- https://ogrisel.com
- @ogrisel@sigmoid.social
- @ogrisel.bsky.social
- @ogrisel
Stars
Distributed database specialized in exporting key/value data from Hadoop
Bnd/Bndtools. Tooling to build OSGi bundles including Eclipse, Maven, and Gradle plugins.
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows locally or on a cluster.
S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of da…
Counterclockwise is an Eclipse plugin helping developers write Clojure code
Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps.
Java implementation of a probabilistic set data structure
Machine learning and natural language processing with Apache Pig
Utility to restructure research papers published as US Letter or A4 PDF files, typically to remove the two-column layout.
The Colossal Pipe framework for map/reduce processing.
A repository of user-defined functions for Apache Pig
Personal development repository to prepare contributions and patches for Apache Mahout
Tools for using Clueweb09 and HBase together
A word tokenizer component for UIMA that takes advantage of Unicode general classes. The tokenizer only handles French for the moment, but can be extended quite easily.
Sandbox for the Berlin Buzzwords semantic hackathon