hadoop

Here are 566 public repositories matching this topic...

📈 A scalable, production-ready data pipeline for real-time streaming & batch processing, integrating Kafka, Spark, Airflow, AWS, Kubernetes, and MLflow. Supports end-to-end data ingestion, transformation, storage, monitoring, and AI/ML serving with CI/CD automation using Terraform & GitHub Actions.

  • Updated Dec 10, 2025
  • Python

A practical coursework-style project from my Master's studies in Big Data Analytics at the University of East London, showcasing hands-on use of big data tools and techniques on a real-world cybersecurity dataset.

  • Updated Nov 18, 2025
  • Python

450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc...

  • Updated Nov 6, 2025
  • Python

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

  • Updated Nov 6, 2025
  • Python

🔍 Model Context Protocol (MCP) server for Apache Ambari API integration. This project provides tools for managing Hadoop clusters, including service operations, configuration management, status monitoring, and request tracking.

  • Updated Nov 24, 2025
  • Python

This project implements a real-time credit card fraud detection system using big data technologies. It simulates a production-grade fraud detection pipeline in which credit card transactions are streamed through Apache Kafka, classified in real time using a trained Mahout Random Forest model, and stored in separate databases based on the fraud prediction.

  • Updated Oct 25, 2025
  • Python
