mcolebrook
23 stars written in Scala

Apache Spark - A unified analytics engine for large-scale data processing

Scala 42,761 29,054 Updated Feb 6, 2026

Simple and Distributed Machine Learning

Scala 5,197 859 Updated Feb 6, 2026

Spark: The Definitive Guide's Code Repository

Scala 3,092 2,890 Updated Aug 26, 2020

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

Scala 1,045 315 Updated Jul 12, 2025

A connector for Spark that allows reading and writing to/from Redis cluster

Scala 946 367 Updated Oct 22, 2024

This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…

Scala 809 159 Updated Feb 5, 2026

Spark Connector to read and write with Pulsar

Scala 117 52 Updated Feb 6, 2026

An Apache Spark-shell backend for IPython

Scala 105 29 Updated Jul 2, 2021

A Spark port of TFOCS: Templates for First-Order Conic Solvers (cvxr.com/tfocs)

Scala 89 36 Updated Apr 15, 2024

Spark-based variant calling, with experimental support for multi-sample somatic calling (including RNA) and local assembly

Scala 85 21 Updated Jan 13, 2018

Distributed Linear Programming Solver on top of Apache Spark

Scala 79 22 Updated Jan 4, 2021

Apache Spark jobs such as Principal Coordinate Analysis.

Scala 75 37 Updated Jan 30, 2017

Vagrant, Apache Spark and Apache Zeppelin VM for teaching

Scala 44 34 Updated Oct 19, 2017

VariantSpark is a framework for applying Spark-based Machine Learning methods to whole-genome variant information

Scala 33 13 Updated Sep 28, 2017

Machine Learning with Scala Quick Start Guide, published by Packt

Scala 24 14 Updated Jul 20, 2023

A Scala library for IBM ILOG CPLEX

Scala 20 2 Updated Jan 27, 2020

An example of bioinformatics and big data tools playing nicely together

Scala 14 7 Updated May 17, 2016

Samples for Apache Spark GraphX library

Scala 10 Updated Apr 30, 2016

Learning Spark on Kubernetes in a series of Warsaw Data Engineering meetups online!

Scala 8 3 Updated Feb 3, 2021

Sparse logistic regression in Spark

Scala 7 1 Updated Oct 23, 2017

Word count and some basic log file parsing and analytics

Scala 4 5 Updated Mar 10, 2015

Try out Spark 1.5 using VMs provisioned by Vagrant and Ansible

Scala 1 Updated Sep 27, 2015