hadoop

Library for remote JVM ExecutorService with only dependency being password-less SSH -- Run clustered Hadoop/Spark jobs from IDE -- IDE-pimped Spark shell with full auto-completion!

cloud grid hadoop jvm ide spark-shell

Updated Feb 11, 2021
Scala

This Big Data project collects data on vehicle theft in the city of São Paulo and consolidates it into counts and a heat map, in order to show areas with a higher rate of this type of crime. Everything runs on AWS resources.

scala spark hadoop analytics aws-s3 aws-emr aws-sqs hdfs aws-elasticsearch aws-athena spark-sql aws-kinesis-firehose spark-shell

Updated Apr 21, 2023
Scala

kamireddig / GetDailyRevenue

The scope of this project is to calculate daily revenue from retail products.

scala programming sql spark hive hadoop functional-programming uml databases data-warehouse hdfs sparksql retail sqoop retail-data sqoop-documentation

Updated May 28, 2020
Scala

mayankrastogi / faculty-page-rank

A Spark application that processes the DBLP dataset to compute the PageRank of faculty in the UIC CS department, based on their co-authorships on publications.

scala spark hadoop sbt xml aws-emr typesafe-config scalatest

Updated Apr 23, 2019
Scala

Kaushal1011 / CS441SimRankForGraphs

An implementation of an algorithm that finds traceability links between two graphs, where one graph is a perturbed version of the other.

distributed-systems scala hadoop graph mapreduce jaccard

Updated Oct 25, 2023
Scala

multivacplatform / multivac-pubmed

Updates PubMed articles daily on HDFS using a Spark cluster

apache-spark yarn hadoop pubmed pubmed-parser hdfs dataframe spark-sql

Updated Nov 18, 2022
Scala

rupeshtr78 / spark-streaming

Spark Streaming Big Data Hadoop

scala kafka big-data spark cassandra mongodb hive hadoop bigdata spark-streaming hdfs

Updated Apr 21, 2020
Scala

lovescott / spark-streaming-general

Lab with Scala and Spark Streaming

scala big-data spark hadoop

Updated May 18, 2017
Scala

minhhahl / hadoop-balancer

Hadoop balancer which helps balance disks on a single node

hadoop balancer

Updated Mar 13, 2021
Scala

s3ni0r / spark-job-skeleton.g8

A skeleton to generate a Spark job project in Scala with local distributed environment for development, example at (https://github.com/s3ni0r/spark-app-example)

scala spark hadoop sbt docker-compose giter8-templates

Updated Sep 11, 2019
Scala

mehroosali / ABCStoresPipeline

Batch ETL data pipeline built on HDP 3.0 to process daily sales and business data and produce Power BI reports. The pipelines are automated using Airflow.

mysql airflow scala spark hadoop hadoop-cluster powerbi hadoop-hdfs etl-pipeline airflow-dags

Updated Dec 29, 2021
Scala

kchenphy / better-paths

Simple and intuitive Hadoop Paths

scala hadoop hdfs

Updated Jul 28, 2018
Scala

mehassanhmood / hadoop-spark-pipeline

An ETL pipeline that extracts data from HDFS, transforms it using Spark, and writes it back to HDFS.

scala hive hadoop hdfs

Updated Dec 3, 2023
Scala

kapilthakre / Bicycle-Sharing-Demand-Forecasting-Using-Spark-Scala

This project builds a bicycle-sharing demand prediction service using Apache Spark and Scala. It consists of two Spark applications: one for model generation and another for demand prediction.

machine-learning scala spark hadoop spark-streaming spark-sql spark-mllib databricks-notebooks hadoop-hdfs

Updated Jan 8, 2021
Scala

shuuji3 / spark-ceph-connector

🌟Spark Ceph Connector: Implementation of Hadoop Filesystem API for Ceph

spark apache-spark hadoop ceph apache-hadoop

Updated Aug 25, 2020
Scala

huaweiTo / BigDataLearning

Code written while learning Big Data

spark hadoop spark-streaming

Updated Jun 29, 2022
Scala

Here are 140 public repositories matching this topic...

shask9 / chicago_city_crime_analysis

kwartile / spark-benchmark

multivacplatform / multivac-elasticsearch

Starofall / QryGraph

hindog / grid-executor

markoshlima / crimes-map

kamireddig / GetDailyRevenue

mayankrastogi / faculty-page-rank

Kaushal1011 / CS441SimRankForGraphs

multivacplatform / multivac-pubmed

rupeshtr78 / spark-streaming

lovescott / spark-streaming-general

minhhahl / hadoop-balancer

s3ni0r / spark-job-skeleton.g8

mehroosali / ABCStoresPipeline

kchenphy / better-paths

mehassanhmood / hadoop-spark-pipeline

kapilthakre / Bicycle-Sharing-Demand-Forecasting-Using-Spark-Scala

shuuji3 / spark-ceph-connector

huaweiTo / BigDataLearning
