You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Library for remote JVM ExecutorService with only dependency being password-less SSH -- Run clustered Hadoop/Spark jobs from IDE -- IDE-pimped Spark shell with full auto-completion!
This Big Data project consists of obtaining data on vehicle theft in the city of São Paulo and consolidating it in a counting and heat map, in order to show areas with a higher index of this type of crime. All applicable in AWS Resources.
Batch ETL data pipeline built on HDP 3.0 to process daily sales and business data to procedure power Bi reports. Automated the pipelines using Airflow.
A Spark application to process the DBLP dataset to find out the Page Rank of faculty at the UIC CS department based on their co-authorships on publications.
This is the implementation of an algorithm that finds traceability links in two graphs such that the other graph is a perturbed version of the original graph.