- Kolmogorov, Mikhail;
- Billingsley, Kimberley J;
- Mastoras, Mira;
- Meredith, Melissa;
- Monlong, Jean;
- Lorig-Roach, Ryan;
- Asri, Mobin;
- Alvarez Jerez, Pilar;
- Malik, Laksh;
- Dewan, Ramita;
- Reed, Xylena;
- Genner, Rylee M;
- Daida, Kensuke;
- Behera, Sairam;
- Shafin, Kishwar;
- Pesout, Trevor;
- Prabakaran, Jeshuwin;
- Carnevali, Paolo;
- Yang, Jianzhi;
- Rhie, Arang;
- Scholz, Sonja W;
- Traynor, Bryan J;
- Miga, Karen H;
- Jain, Miten;
- Timp, Winston;
- Phillippy, Adam M;
- Chaisson, Mark;
- Sedlazeck, Fritz J;
- Blauwendraat, Cornelis;
- Paten, Benedict
Long-read sequencing technologies substantially overcome the limitations of short reads but have not been considered a feasible replacement for population-scale projects because they are some combination of too expensive, insufficiently scalable or too error-prone. Here we develop an efficient and scalable wet lab and computational protocol, Napu, for Oxford Nanopore Technologies long-read sequencing that seeks to address those limitations. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the National Institutes of Health Center for Alzheimer's and Related Dementias. Using a single PromethION flow cell, we can detect single nucleotide polymorphisms with an F1-score comparable to that of Illumina short-read sequencing. Small indel calling remains difficult within homopolymers and tandem repeats but achieves good concordance with Illumina indel calls elsewhere. Further, we can discover structural variants with an F1-score on par with state-of-the-art de novo assembly methods. Our protocol phases small and structural variants at megabase scales and produces highly accurate, haplotype-specific methylation calls.
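
For readers unfamiliar with the metric, the F1-score used above for variant-calling accuracy is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition only; the function name and the example counts are hypothetical and are not taken from the study or from the Napu pipeline.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score from true-positive, false-positive and
    false-negative variant counts (standard definition)."""
    if tp == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of called variants that are correct
    recall = tp / (tp + fn)     # fraction of true variants that were called
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not results from the paper).
example = f1_score(tp=90, fp=10, fn=10)
```

Because it is a harmonic mean, the F1-score is high only when both precision and recall are high, which makes it a convenient single number for comparing call sets against a benchmark.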