A subproject of Predictiveworks that provides common access to Cassandra, Elasticsearch, HBase, MongoDB, Parquet, JDBC database and other data sources from Apache Spark.
Updated Feb 23, 2015 - Scala
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Simple document classifier using Apache Spark
A standalone Apache Spark application written in Scala against the Spark API. The project is built with Simple Build Tool (SBT).
Using Spark to get some knowledge from Dota 2 match data
An example of real-time stream processing using Spark Streaming, Kafka, and Elasticsearch.
Connect to SQL Server using Apache Spark
Meetup sample code
An example of Spark and GraphX using Twitter as sample data
A movie recommendation system built using Apache Spark and Scala.
Basics of Big Data and Machine Learning using Apache Spark and Scala
A basic solution to MNIST written in Scala using Apache Spark
CSV to Elasticsearch in JSON format using Apache Spark
Created by Matei Zaharia
Released May 26, 2014