Young edited this page Oct 28, 2025 · 5 revisions

Inputs

Because Nextflow is flexible, this workflow accepts input files in more than one way: listed in a sample sheet or read from directories.

Using a sample sheet

Cecret can use a comma-separated sample sheet as input. The header must be sample,fastq_1,fastq_2, even if using nanopore fastq files or fasta files. Each row gives the identifier for the file(s), the file location(s), and, for anything other than paired-end fastq files, the input type. This is the recommended method for use cases involving the cloud.

The format of each row tells the workflow how to process the file(s):

  • paired-end reads: sample,read1.fastq.gz,read2.fastq.gz
  • single-end reads: sample,sample.fastq.gz,single
  • nanopore reads: sample,sample.fastq.gz,ont
  • fasta files: sample,sample.fasta,fasta

Example sample sheet:

sample,fastq_1,fastq_2
SRR13957125,/home/eriny/sandbox/test_files/cecret/reads/SRR13957125_1.fastq.gz,/home/eriny/sandbox/test_files/cecret/reads/SRR13957125_2.fastq.gz
SRR13957170,/home/eriny/sandbox/test_files/cecret/reads/SRR13957170_1.fastq.gz,/home/eriny/sandbox/test_files/cecret/reads/SRR13957170_2.fastq.gz
SRR13957177S,/home/eriny/sandbox/test_files/cecret/single_reads/SRR13957177_1.fastq.gz,single
OQ255990.1,/home/eriny/sandbox/test_files/cecret/fastas/OQ255990.1.fasta,fasta
SRR22452244,/home/eriny/sandbox/test_files/cecret/nanopore/SRR22452244.fastq.gz,ont
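
A sample sheet can be checked before a run is launched. The following Python sketch is an illustration only and is not part of Cecret: the function name `classify_rows` is hypothetical, and it assumes that any third-column value other than single, ont, or fasta is the path to the second read file.

```python
import csv
import io

EXPECTED_HEADER = ["sample", "fastq_1", "fastq_2"]
TYPE_KEYWORDS = {"single", "ont", "fasta"}


def classify_rows(sheet_text):
    """Return a list of (sample, kind, files) tuples from sample sheet text."""
    reader = csv.reader(io.StringIO(sheet_text.strip()))
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        raise ValueError("header must be " + ",".join(EXPECTED_HEADER))
    rows = []
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"line {line_number}: expected 3 columns")
        sample, first, third = (field.strip() for field in row)
        if third in TYPE_KEYWORDS:
            rows.append((sample, third, [first]))
        else:
            # Assumption: any other value is the read 2 file of a pair.
            rows.append((sample, "paired", [first, third]))
    return rows
```

Calling `classify_rows(open("SampleSheet.csv").read())` on the example above would label the first two rows as paired and the remaining rows as single, fasta, and ont.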

Example usage with a sample sheet, using Docker to manage containers:

nextflow run UPHL-BioNGS/Cecret -profile docker --sample_sheet SampleSheet.csv

Files from directories

If using local computational resources, this workflow can read in files from directories. Paired-end Illumina files, single-end Illumina files, and a single file of nanopore reads should end with 'fastq', 'fastq.gz', 'fq', or 'fq.gz'. Fastas must end with '.fasta', '.fna', or '.fa'.
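
The suffix rules above can be checked before files are placed in an input directory. This Python sketch is an illustration only, not Cecret's own file matching: `input_kind` is a hypothetical helper, and the case-sensitive comparison is an assumption.

```python
# Suffixes copied from the rules above.
READ_SUFFIXES = ("fastq", "fastq.gz", "fq", "fq.gz")
FASTA_SUFFIXES = (".fasta", ".fna", ".fa")


def input_kind(filename):
    """Return 'reads', 'fasta', or None for an unrecognized file name."""
    if filename.endswith(READ_SUFFIXES):
        return "reads"
    if filename.endswith(FASTA_SUFFIXES):
        return "fasta"
    return None
```

A file name for which `input_kind` returns None, such as a gzipped fasta, would need to be renamed or decompressed to match the documented suffixes.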

WARNING:

  • Nextflow does not always recognize the naming pattern of paired-end fastq files. This workflow is meant to be fairly agnostic, but if paired-end fastq files are not being found, consider renaming them to a sample_1.fastq.gz style of format or using a sample sheet.
  • Single-end and paired-end reads cannot be in the same directory.
  • Nanopore reads are not single-end Illumina reads; specify them with the nanopore param rather than single_reads.
  • Wildcards do not work well with AWS buckets, so it is recommended that those users use sample sheets.
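
When paired-end files are not being found, a sample sheet can be generated from the directory instead. The following Python sketch is an illustration only and is not part of Cecret: the function name `paired_sample_sheet` is hypothetical, and it assumes files are named <sample>_1 and <sample>_2 followed by .fastq, .fq, .fastq.gz, or .fq.gz.

```python
import re
from pathlib import Path

# Assumed naming: <sample>_1.fastq.gz and <sample>_2.fastq.gz (or fq, uncompressed).
PAIR_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<read>[12])\.(fastq|fq)(\.gz)?$")


def paired_sample_sheet(directory):
    """Return sample sheet text for the paired-end reads found in directory."""
    pairs = {}
    for path in sorted(Path(directory).iterdir()):
        match = PAIR_PATTERN.match(path.name)
        if match:
            pairs.setdefault(match["sample"], {})[match["read"]] = path
    lines = ["sample,fastq_1,fastq_2"]
    for sample, reads in sorted(pairs.items()):
        if set(reads) != {"1", "2"}:
            raise ValueError(f"{sample}: missing read 1 or read 2 file")
        lines.append(f"{sample},{reads['1']},{reads['2']}")
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file such as SampleSheet.csv and passed with --sample_sheet. Files that do not match the assumed pattern are ignored, so the output is worth reviewing before use.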

Each directory is specified with its corresponding param:

params.reads = <path to directory of paired-end Illumina reads>
params.single_reads = <path to directory of single-end Illumina reads>
params.nanopore = <path to directory of single-end nanopore reads>
params.fastas = <path to directory with fasta files>

More information about adjusting parameters can be found on the Params page of this wiki.

Example directories for input

For paired-end fastq files

directory
└── *fastq.gz

The command would look similar to

nextflow run UPHL-BioNGS/Cecret -profile docker --reads directory

For single-end fastq files

directory
└── *fastq.gz

The command would look similar to

nextflow run UPHL-BioNGS/Cecret -profile docker --single_reads directory

For nanopore fastq files

directory
└── *fastq.gz

The command would look similar to

nextflow run UPHL-BioNGS/Cecret -profile docker --nanopore directory

For fasta files

directory
└── *fasta

The command would look similar to

nextflow run UPHL-BioNGS/Cecret -profile docker --fastas directory
