- Universidad de La Laguna
- Tenerife, Spain
- https://mcolebrook.github.io
- https://orcid.org/0000-0003-3074-1697
- @mcolebrook
Stars
Apache Spark - A unified analytics engine for large-scale data processing
Spark: The Definitive Guide's Code Repository
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
A connector for Spark that allows reading and writing to/from Redis cluster
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…
Spark Connector to read and write with Pulsar
A Spark port of TFOCS: Templates for First-Order Conic Solvers (cvxr.com/tfocs)
Spark-based variant calling, with experimental support for multi-sample somatic calling (including RNA) and local assembly
Distributed Linear Programming Solver on top of Apache Spark
Apache Spark jobs such as Principal Coordinate Analysis.
Vagrant, Apache Spark and Apache Zeppelin VM for teaching
VariantSpark is a framework for applying Spark-based Machine Learning methods to whole-genome variant information
Machine Learning with Scala Quick Start Guide, published by Packt
An example of bioinformatics and big data tools playing nicely together
Learning Spark on Kubernetes in a series of Warsaw Data Engineering meetups online!
Word count and some basic log file parsing and analytics
Try out Spark 1.5 using VMs provisioned by Vagrant and Ansible