# apache-spark

Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

Here are 15 public repositories matching this topic...

AbdullahKhurshid / ecommerce-marketing-analytics

Using Apache Spark for marketing analytics

r apache-spark supervised-learning cloud-computing unsupervised-learning big-data-analytics marketing-analytics

Updated Jan 30, 2025
R

d4rthm4ul / R-Cleaning-Exploration-Imputation-Visualization

This repository contains intermediate-level code for cleaning, exploratory analysis, handling missing data points, outlier detection, and visualization using the graphics, ggplot2, tidycharts, and ggExtra packages. Also, in a particular part of the script, you can get basic information about…

Updated Jun 23, 2021
R

thegothamstak / RProject

Projects created using R

data-science r spark apache-spark dplyr bigdata apache data-analysis sparklyr

Updated Mar 19, 2018
R

kiendang / sparkr-naivebayes-example

r scala apache-spark mllib sparkr

Updated Jul 3, 2017
R

rog33zy / apache-sedona-r-tutorial.github.io

A tutorial showing how to use Apache Spark, Apache Sedona, and Delta Lake for big data analysis in R.

r apache-spark sparklyr sedona delta-lake

Updated Apr 14, 2025
R

slothkong / r_on_gcloud

R workloads running at scale on Google Cloud

r apache-spark sparklyr sparkr unittesting gcloud-sdk gcloud-cli

Updated Apr 25, 2020
R

r-spark / variantspark

A sparklyr extension to analyze genome datasets

apache-spark genomics sparklyr

Updated Jun 14, 2019
R

zero323 / dlt

Mirror of https://gitlab.com/zero323/dlt

r spark apache-spark rstats delta sparkr delta-lake delta-io

Updated Nov 25, 2022
R

etiennebr / sparksf

Enable spatial functions in Spark through the `sparklyr` package

r apache-spark spatial rstats

Updated Feb 4, 2019
R

r-spark / sparklyr.flint

Sparklyr extension making Flint time series library functionalities (https://github.com/twosigma/flint) easily accessible through R

Updated Jan 11, 2022
R

rstudio / sparktf

R interface to Spark TensorFlow Connector

r apache-spark tensorflow keras rstats sparklyr sparklyr-extension

Updated Sep 13, 2021
R

kevinykuo / sparklygraphs

Old repo for R interface for GraphFrames

r apache-spark rstats sparklyr r-package graphframes

Updated Mar 21, 2018
R

rstudio / sparkxgb

R interface for XGBoost on Spark

machine-learning r spark apache-spark xgboost rstats

Updated Sep 11, 2025
R

harryprince / geospark

bring sf to spark in production

r apache-spark gis spatial-analysis spark-sql spatial-queries sparklyr-extension large-scale-spatial-analysis

Updated Dec 13, 2021
R

sparklyr / sparklyr

R interface for Apache Spark

machine-learning r spark apache-spark dplyr ide distributed rstats sparklyr livy remote-clusters

Updated Nov 19, 2025
R

Created by Matei Zaharia

Released May 26, 2014

Followers: 435
Repository: apache/spark
Website: github.com/topics/spark

Related topics