Starred repositories
Everything you need to know to get the job.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
A proof-of-concept tool for generating payloads that exploit unsafe Java object deserialization.
Alluxio, data orchestration for analytics and machine learning in the cloud
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
INACTIVE: A simple docker client for the JVM
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
An implementation of a real-world map-reduce workflow in each major framework.
NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.
Data Stream Development with Apache Spark, Kafka and Spring Boot by Packt Publishing
A collection of tools for accessing Neo4j graph databases from Apache NiFi.
Demonstrates how to link a processor bundle with a custom controller service.
Allows you to see which datanodes contain a file in HDFS.