Input
Because Nextflow is flexible, this workflow is flexible in how input fastq files are specified.
Cecret can take a sample sheet as input, with the sample name and reads separated by commas. The header must be sample,fastq_1,fastq_2 - even if using nanopore fastq files. As a general rule, each row gives the identifier for the file(s), the file location(s), and the input type if the files are not paired-end fastq files. This is the recommended method for use cases involving the cloud.
Each row matches a sample's file(s) with how they should be processed:
- paired-end reads: sample,read1.fastq.gz,read2.fastq.gz
- single-end reads: sample,sample.fastq.gz,single
- nanopore reads: sample,sample.fastq.gz,ont
- fasta files: sample,sample.fasta,fasta
Example sample sheet:
sample,fastq_1,fastq_2
SRR13957125,/home/eriny/sandbox/test_files/cecret/reads/SRR13957125_1.fastq.gz,/home/eriny/sandbox/test_files/cecret/reads/SRR13957125_2.fastq.gz
SRR13957170,/home/eriny/sandbox/test_files/cecret/reads/SRR13957170_1.fastq.gz,/home/eriny/sandbox/test_files/cecret/reads/SRR13957170_2.fastq.gz
SRR13957177S,/home/eriny/sandbox/test_files/cecret/single_reads/SRR13957177_1.fastq.gz,single
OQ255990.1,/home/eriny/sandbox/test_files/cecret/fastas/OQ255990.1.fasta,fasta
SRR22452244,/home/eriny/sandbox/test_files/cecret/nanopore/SRR22452244.fastq.gz,ont
Example usage with a sample sheet, using docker to manage containers:
nextflow run UPHL-BioNGS/Cecret -profile docker --sample_sheet SampleSheet.csv
If using local computational resources, this workflow can read in files from directories. Paired-end Illumina files, single-end Illumina files, and a single file of nanopore reads should end with 'fastq', 'fastq.gz', 'fq', or 'fq.gz'. Fastas must end with '.fasta', '.fna', or '.fa'.
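The endings listed above can be used to spot files that the workflow would probably skip. This is a rough sketch based only on those endings: `input_kind` and `list_unrecognized` are hypothetical helper names, and the patterns Cecret actually uses internally may differ.

```shell
# input_kind and list_unrecognized are hypothetical helpers, not part of Cecret.
# input_kind classifies a file name using the endings listed in this section.
input_kind() {
  case "$1" in
    *fastq|*fastq.gz|*fq|*fq.gz) echo "fastq" ;;
    *.fasta|*.fna|*.fa)          echo "fasta" ;;
    *)                           echo "unrecognized" ;;
  esac
}

# list_unrecognized prints every regular file in a directory whose name
# matches none of the endings above.
list_unrecognized() {
  local dir="$1" f
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    if [ "$(input_kind "$f")" = "unrecognized" ]; then
      echo "$f"
    fi
  done
  return 0
}
```

Running `list_unrecognized directory` before a run prints nothing when every file in the directory has one of the listed endings.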
WARNING:
- Sometimes nextflow does not catch every name of paired-end fastq files. This workflow is meant to be fairly agnostic, but if paired-end fastq files are not being found, it might be worth renaming them to a format like sample_1.fastq.gz or using a sample sheet.
- Single and paired-end reads cannot be in the same directory
- Nanopore reads are not single-end Illumina reads
- Wildcards do not work well with AWS buckets, so it is recommended that those users use sample sheets.
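When paired-end files are not being found, or when wildcards are unreliable, a sample sheet can be built from a local directory listing. The sketch below is one way to do that and is not part of Cecret: `make_paired_sheet` is a hypothetical helper name, and it assumes mates are named `<sample>_1.fastq.gz` and `<sample>_2.fastq.gz`. Other naming schemes would need a different pattern.

```shell
# make_paired_sheet is a hypothetical helper, not part of Cecret.
# Assumption: mates are named <sample>_1.fastq.gz and <sample>_2.fastq.gz.
make_paired_sheet() {
  local dir="$1" r1 r2 sample
  echo "sample,fastq_1,fastq_2"
  for r1 in "$dir"/*_1.fastq.gz; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    r2="${r1%_1.fastq.gz}_2.fastq.gz"   # expected name of the second mate
    sample="$(basename "$r1" _1.fastq.gz)"
    if [ -e "$r2" ]; then
      echo "$sample,$r1,$r2"
    else
      echo "warning: no mate found for $r1" >&2
    fi
  done
}
```

For example, `make_paired_sheet /path/to/reads > SampleSheet.csv` writes a sheet that can be passed with `--sample_sheet`. Passing an absolute directory path keeps the file locations in the sheet absolute, as in the example sample sheet.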
Each of these directories can be specified with its corresponding param:
params.reads = <path to directory of paired-end Illumina reads>
params.single_reads = <path to directory of single-end Illumina reads>
params.nanopore = <path to directory of single-end nanopore reads>
params.fastas = <path to directory with fasta files>
More information about adjusting parameters can be found on the Params page of this wiki.
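One way to keep these settings together is to write them to a small config file and pass it to nextflow with its `-c` option. The sketch below assumes that approach suits the setup: `write_input_config` is a hypothetical helper name, `inputs.config` is a placeholder file name, and the directory paths are placeholders supplied by the caller.

```shell
# write_input_config is a hypothetical helper, not part of Cecret.
# It writes the reads and fastas directories as params to a config file.
write_input_config() {
  local reads_dir="$1" fastas_dir="$2" out="$3"
  cat > "$out" <<EOF
// Input directories for Cecret (paths supplied by the caller)
params.reads  = '$reads_dir'
params.fastas = '$fastas_dir'
EOF
}

# Possible usage (not run here):
#   write_input_config /path/to/reads /path/to/fastas inputs.config
#   nextflow run UPHL-BioNGS/Cecret -profile docker -c inputs.config
```

The same values can instead be given on the command line as `--reads` and `--fastas`. A config file is mainly convenient when the same directories are reused across runs.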
For a directory of paired-end Illumina reads:
directory
└── *fastq.gz
The command would look similar to
nextflow run UPHL-BioNGS/Cecret -profile docker --reads directory
For a directory of single-end Illumina reads:
directory
└── *fastq.gz
The command would look similar to
nextflow run UPHL-BioNGS/Cecret -profile docker --single_reads directory
For a directory of nanopore reads:
directory
└── *fastq.gz
The command would look similar to
nextflow run UPHL-BioNGS/Cecret -profile docker --nanopore directory
For a directory of fasta files:
directory
└── *fasta
The command would look similar to
nextflow run UPHL-BioNGS/Cecret -profile docker --fastas directory