cguegi
  • Scigility Inc.
  • Zurich, Switzerland

Egeria's Guidance on Governance as well as large media files such as presentations and movies

107 30 Updated Oct 20, 2022

This repository contains the notebooks and presentations we use for our Databricks Tech Talks

HTML 734 447 Updated Jan 6, 2025

Code repository for O'Reilly Hadoop Application Architectures book

Java 160 99 Updated May 26, 2015

hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE

Java 296 105 Updated Jan 2, 2023

Kerberos and Hadoop: The Madness beyond the Gate

283 128 Updated Jul 28, 2023

KillrWeather is a reference application (work in progress) showing how to easily integrate streaming and batch data processing with Apache Spark Streaming, Apache Cassandra, Apache Kafka and Akka f…

Scala 1,179 392 Updated Jan 5, 2017

Snippets and small examples demonstrating Kafka features and configs

Java 649 383 Updated Jul 1, 2022

Apache Storm

Java 6,684 4,043 Updated Jun 19, 2026

Scalable machine learning library for Apache Hive/Spark/Pig

501 145 Updated Dec 2, 2016

Streaming MapReduce with Scalding and Storm

Scala 2,125 256 Updated Jan 19, 2022

📚 Freely available programming books

Python 390,479 66,437 Updated Jun 18, 2026

Secondary Index for HBase

Java 589 284 Updated May 18, 2017
Java 559 223 Updated Feb 12, 2022

JeroMQ is a pure Java implementation of the ZeroMQ messaging library, offering high-performance asynchronous messaging for distributed or concurrent applications.

Java 2,446 486 Updated Nov 30, 2025

Elephant Twin is a framework for creating indexes in Hadoop

Java 99 16 Updated Oct 12, 2020

Common Crawl support library to access 2008-2012 crawl archives (ARC files)

C++ 508 92 Updated Nov 29, 2017

Apache Kafka - A distributed event streaming platform

Java 32,882 15,283 Updated Jun 18, 2026