Big data training material
Updated Jun 29, 2023 - Python
A distributed file system that works like Hadoop, with minor changes. It is a fully working program that incorporates asynchronous file distribution and map and reduce components, and it has its own command-line interface with all the required commands.
An attempt at a best-case Apache Spark working environment for robust data pipelines
Repository containing Python code for MapReduce jobs to answer questions about Udacity forum data.
This repository contains a Python script that handles multiple use cases, from OS tasks such as creating partitions and LVM to Hadoop, AWS, and Docker tasks.
Advanced Topics in Database Systems @ NTUA | 2022-2023
A Simple SQL engine using Hadoop MapReduce
Crime Analytics is a project aimed at analyzing crime data in Los Angeles from 2020 onwards. The project utilizes statistical methods and data analysis techniques to identify trends, high-risk groups, and the effectiveness of law enforcement agencies.
A mini-Hadoop clone capable of performing all DFS functionalities through a CLI
Running MapReduce in Hadoop using Docker
Real-Time Streaming: Twitter Data Pipeline Using Big Data Tools
Step-by-step guide for Hadoop installation on Ubuntu 16.04.3, with a MapReduce example using Streaming
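Several of the repositories above run MapReduce jobs written in Python through Hadoop Streaming. As a rough illustration of that pattern (not code taken from any of the listed repositories), here is a minimal word-count sketch. The function names `mapper`, `reducer`, and `run_local` are hypothetical, and `run_local` only imitates Hadoop's sort-and-shuffle step in memory. In a real Streaming job the mapper and reducer would typically be separate scripts that read `sys.stdin` and print to standard output.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated "word<TAB>1" record per word (hypothetical helper)."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each key; records must already be sorted by key."""
    parsed = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_local(lines):
    """Assumption: an in-memory sort stands in for Hadoop's shuffle phase."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for record in run_local(["Hadoop streaming", "hadoop MapReduce"]):
        print(record)
```

Keeping the map and reduce logic as plain generator functions makes it possible to check them locally before submitting anything to a cluster.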