The Nanopore Genome Assembly Pipeline (NanoGAP) uses a variety of open-source tools to automate non-hybrid (long reads only) genome assembly and self-correction of error-prone nanopore reads.
- Developed specifically for bacterial genomes.
- Requires raw/uncorrected Oxford Nanopore Technologies (ONT) long reads only (.fastq).
- Assembly of unknown isolates (no predicted genome size needed).
- Identifies the nearest bacterial organism.
Conda is required for installation. You can download and install conda for Linux here.
NanoGAP uses a number of open source projects:

    git clone https://github.com/escasinas/nanogap.git
    cd nanogap
    source install.sh

Input
- A single .fastq file.
or
- A directory containing multiple .fastq files.
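
The two accepted input forms can be illustrated with a minimal sketch. This is not NanoGAP's own code: the function name `collect_fastq_files` and its error handling are assumptions made for illustration, showing one way a script might turn either a single file or a directory into a list of `.fastq` paths.

```python
from pathlib import Path


def collect_fastq_files(input_path):
    """Return a sorted list of .fastq paths for a single file or a directory.

    Hypothetical helper for illustration; NanoGAP may resolve inputs differently.
    """
    path = Path(input_path)

    # Case 1: a single .fastq file.
    if path.is_file():
        if path.suffix != ".fastq":
            raise ValueError(f"Expected a .fastq file, got: {path.name}")
        return [path]

    # Case 2: a directory containing one or more .fastq files.
    if path.is_dir():
        files = sorted(
            p for p in path.iterdir() if p.is_file() and p.suffix == ".fastq"
        )
        if not files:
            raise ValueError(f"No .fastq files found in: {path}")
        return files

    raise FileNotFoundError(f"No such file or directory: {path}")
```

Sorting the result keeps the processing order stable between runs, which makes logs easier to compare.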
Output
- Genome assembly in .fasta format.
- BLAST output of the genome's 16S rRNA.
- CSV output containing assembly information.
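
As a quick sanity check on the assembly output, the sketch below counts contigs and total bases in a FASTA file using only the Python standard library. It relies on the FASTA format itself and not on anything specific to NanoGAP; the function name `summarise_fasta` and the path in the usage comment are hypothetical.

```python
def summarise_fasta(fasta_path):
    """Return (number of contigs, total bases) for a FASTA file."""
    contigs = 0
    total_bases = 0
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                contigs += 1  # each header line starts a new sequence
            else:
                total_bases += len(line)  # sequence lines may be wrapped
    return contigs, total_bases


# Example usage (the file name is an assumption, not a documented NanoGAP path):
# contigs, bases = summarise_fasta("ngap_output/assembly.fasta")
# print(f"{contigs} contig(s), {bases} bp")
```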
Command
For a single fastq file:
    conda activate nanogap
    python nanogap.py path/to/file.fastq [options]
For a directory containing fastq files:
    conda activate nanogap
    python nanogap.py path/to/reads_directory [options]
Options
-h | --help Show help message and exit.
-t | --threads Number of threads/CPUs to run Minimap2, Flye, Racon, Medaka and Barrnap (default: MAX CPU).
-o | --outdir Name of output directory (default: ngap_output).
-m | --model Medaka model. Please see the Medaka repo for more information (default: r941_min_high_g360).
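
For readers who want to see how the options listed above fit together, here is a minimal `argparse` sketch that mirrors the documented flags and defaults. It is a reconstruction for illustration, not the actual contents of `nanogap.py`: the function name `build_parser`, the positional argument name `input`, and the use of `os.cpu_count()` to stand in for "MAX CPU" are all assumptions.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="nanogap.py",
        description="Sketch of the NanoGAP command-line interface.",
    )
    # Positional argument: a single .fastq file or a directory of .fastq files.
    parser.add_argument(
        "input",
        help="Path to a .fastq file or a directory containing .fastq files.",
    )
    # Assumption: "MAX CPU" is taken to mean every CPU reported by the OS.
    parser.add_argument(
        "-t", "--threads", type=int, default=os.cpu_count(),
        help="Number of threads/CPUs (default: MAX CPU).",
    )
    parser.add_argument(
        "-o", "--outdir", default="ngap_output",
        help="Name of output directory (default: ngap_output).",
    )
    parser.add_argument(
        "-m", "--model", default="r941_min_high_g360",
        help="Medaka model (default: r941_min_high_g360).",
    )
    return parser  # -h/--help is added automatically by argparse


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input, args.threads, args.outdir, args.model)
```

With this sketch, running `python nanogap.py path/to/file.fastq -t 4` would leave `--outdir` and `--model` at the defaults documented above.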