A repository for my master's thesis on scaling up the genome sequencing pipeline by using NoSQL or NewSQL datastores.
As a case study, I developed a version of GEMINI, a genome analysis tool, that runs on Apache Cassandra instead of SQLite. It is a fork of GEMINI version 0.1.11.
My thesis text (in Dutch), detailing design decisions, can be found here:
The original GEMINI is described in the following manuscript:
Paila U, Chapman BA, Kirchner R, Quinlan AR (2013) GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations. PLoS Comput Biol 9(7): e1003153. doi:10.1371/journal.pcbi.1003153
GEMINI is freely available under the MIT license.