GitHub - jcaperella29/Counts_matrix_Nextflow: Minimal, production-ready Nextflow DSL2 RNA-seq pipeline for QC, alignment, gene counting, transcript quantification, coverage tracks, and matrix merging.

Counts_matrix_Nextflow

A portable Nextflow DSL2 RNA-seq pipeline that performs:

QC

Genome alignment

Gene-level counting

Transcript-level quantification

Coverage track generation

Matrix merging and summary reporting

Designed to be minimal, readable, and robust, while still demonstrating real RNA-seq best practices.

Features

This pipeline:

Accepts paired-end FASTQs via a samplesheet

Performs QC → alignment → quantification → matrix merge

Generates gene counts, TPM matrices, and BigWig tracks

Automatically builds STAR and Salmon indexes

Skips coverage generation gracefully for samples with zero mapped reads

Works with Conda, Docker, or Singularity/Apptainer

Uses standard, well-documented tools

Ideal for:

Teaching / demos

Testing infrastructure

Prototyping larger RNA-seq workflows

Portfolio or template pipelines

Pipeline Overview

Steps

1. FASTQ QC & trimming
   - fastp

2. Genome index build (once per run)
   - STAR --runMode genomeGenerate

3. Genome alignment
   - STAR → coordinate-sorted BAM
   - BAM indexing + flagstat

4. Coverage tracks
   - deepTools bamCoverage → BigWig
   - Automatically skipped if no mapped reads

5. Gene-level counting
   - featureCounts (GTF-based)

6. Transcript-level quantification
   - Salmon quant

7. Matrix merging
   - Gene count matrix (counts_matrix.tsv)
   - Transcript TPM matrix (salmon_tpm_matrix.tsv)

8. Summary reporting
   - MultiQC

Input Format

Samplesheet (CSV)

    sample,read1,read2
    S1,data/S1_R1.fastq.gz,data/S1_R2.fastq.gz
    S2,data/S2_R1.fastq.gz,data/S2_R2.fastq.gz

Required columns:

sample

read1

read2
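A quick pre-flight check can catch a malformed samplesheet before Nextflow starts. The following is a minimal sketch and not part of the pipeline: `validate_samplesheet` is a hypothetical helper, and it assumes the header must match `sample,read1,read2` exactly, as in the example samplesheet. It does not check that the FASTQ files exist.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline).
# Returns 0 if the CSV has the expected header, three non-empty fields
# per row, and no duplicate sample names; returns 1 otherwise.
validate_samplesheet() {
  local sheet="$1"
  awk -F',' '
    { sub(/\r$/, "") }                 # tolerate Windows line endings
    NR == 1 {
      if ($0 != "sample,read1,read2") {
        print "bad header: " $0 > "/dev/stderr"; bad = 1
      }
      next
    }
    $0 == "" { next }                  # ignore blank lines
    NF != 3 || $1 == "" || $2 == "" || $3 == "" {
      print "line " NR ": expected sample,read1,read2" > "/dev/stderr"; bad = 1
      next
    }
    seen[$1]++ {                       # true from the second occurrence on
      print "line " NR ": duplicate sample " $1 > "/dev/stderr"; bad = 1
    }
    END { exit (bad ? 1 : 0) }
  ' "$sheet"
}
```

For example, `validate_samplesheet samples.csv && nextflow run main.nf ...` would only launch the run when the sheet passes these checks.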

Required Inputs

Reference genome FASTA (--ref)

Gene annotation GTF (--gtf)

Transcript FASTA for Salmon (--transcripts)

Samplesheet CSV (--samplesheet)

Output Structure

    results/
    ├── qc/
    │   ├── S1.fastp.html
    │   └── S1.fastp.json
    ├── ref/
    │   ├── star/
    │   │   └── STAR_INDEX/
    │   └── salmon/
    │       └── SALMON_INDEX/
    ├── bam/
    │   ├── S1.bam
    │   ├── S1.bam.bai
    │   └── S1.flagstat.txt
    ├── bigwig/
    │   └── S1.bw
    ├── counts_per_sample/
    │   ├── S1.counts.tsv
    │   └── S2.counts.tsv
    ├── counts_matrix.tsv
    ├── salmon_tpm_matrix.tsv
    └── multiqc_report.html
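The files under counts_per_sample/ are combined into counts_matrix.tsv. To illustrate what that kind of merge involves, here is a rough sketch; it is not the pipeline's actual merge code. It assumes each per-sample file is a headerless, tab-separated, two-column table (gene ID, count) named `<sample>.counts.tsv`. Raw featureCounts output carries a comment line and extra annotation columns, so it would need trimming before this would apply. `merge_counts` is a hypothetical name, and filling missing genes with 0 is a choice made for the sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: merge two-column per-sample count files into one
# gene x sample matrix on stdout. Assumes "<sample>.counts.tsv" naming.
merge_counts() {
  awk -F'\t' -v OFS='\t' '
    FNR == 1 {                          # first line of each input file
      nfiles++
      name = FILENAME
      sub(/^.*\//, "", name)            # drop the directory part
      sub(/\.counts\.tsv$/, "", name)   # drop the suffix to get the sample name
      sample[nfiles] = name
    }
    {
      if (!($1 in seen)) { seen[$1] = 1; order[++ngenes] = $1 }
      count[$1, nfiles] = $2
    }
    END {
      header = "gene"
      for (j = 1; j <= nfiles; j++) header = header OFS sample[j]
      print header
      for (i = 1; i <= ngenes; i++) {
        g = order[i]; row = g
        for (j = 1; j <= nfiles; j++)   # genes absent from a sample get 0
          row = row OFS (((g, j) in count) ? count[g, j] : 0)
        print row
      }
    }
  ' "$@"
}
```

Usage would look like `merge_counts results/counts_per_sample/*.counts.tsv > counts_matrix.tsv`. Genes appear in the order they are first seen, and columns follow the order of the file arguments.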

Deployment options (all code is bash)

To run with Docker

    docker build -t rnaseq-pipeline .
    nextflow run main.nf -profile docker \
      --samplesheet samples.csv \
      --ref genome.fa \
      --gtf genes.gtf \
      --transcripts transcripts.fa

To run with Apptainer

    singularity build containers/rnaseq-pipeline.sif docker://rnaseq-pipeline
    nextflow run main.nf -profile singularity \
      --samplesheet samples.csv \
      --ref genome.fa \
      --gtf genes.gtf \
      --transcripts transcripts.fa

Parameters

Parameter	Default	Description
`--samplesheet`	(required)	CSV mapping samples to FASTQs
`--ref`	(required)	Reference genome FASTA
`--gtf`	(required)	Gene annotation GTF
`--transcripts`	(required)	Transcript FASTA for Salmon
`--outdir`	`results`	Output directory
`--threads`	`4`	Threads per task
`--bw_binsize`	`10`	BigWig bin size
`--bw_norm`	`CPM`	BigWig normalization
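As a worked illustration of the default normalization, and assuming `--bw_norm` is handed to deepTools bamCoverage as its normalization method (the wiring inside main.nf is not shown here), CPM scales the read count in each bin by the number of mapped reads in millions:

$$
\mathrm{CPM}_b = \frac{r_b}{N / 10^{6}}
$$

Here $r_b$ is the number of reads counted in bin $b$ and $N$ is the total number of mapped reads used for scaling. For instance, a bin with $r_b = 50$ in a sample with $N = 20{,}000{,}000$ gives $50 / 20 = 2.5$. With $N = 0$ the ratio is undefined, which is one plausible reason for skipping coverage generation on samples with no mapped reads.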

Requirements

Nextflow ≥ 23

Docker

Singularity / Apptainer

fastp

STAR

samtools

subread (featureCounts)

salmon

deeptools

multiqc

Notes & Design Philosophy

Intentionally simple and readable

Avoids over-engineering

Uses explicit channels instead of heavy abstraction

Guards against common RNA-seq failure modes (e.g. zero-mapped samples)
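The zero-mapped guard can be pictured as a small check on the samtools flagstat output before coverage is computed. This is a sketch under assumptions, not the logic in main.nf: `should_make_bigwig` is a hypothetical helper, and it relies on flagstat's summary line of the form `N + M mapped (...)`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) only if the flagstat report shows
# at least one mapped read. Matches the line whose 4th field is "mapped",
# which excludes "primary mapped" and the "with ... mate mapped" lines.
should_make_bigwig() {
  local n
  n=$(awk '$4 == "mapped" { print $1; exit }' "$1")
  [ "${n:-0}" -gt 0 ]
}

# Example use (file names follow the output tree; the bamCoverage options
# mirror the documented defaults and are an assumption about the wiring):
#   if should_make_bigwig S1.flagstat.txt; then
#     bamCoverage -b S1.bam -o S1.bw --binSize 10 --normalizeUsing CPM
#   else
#     echo "S1: no mapped reads, skipping BigWig" >&2
#   fi
```

Checking the flagstat text rather than the BAM keeps the test cheap, since the report is already produced during the alignment step.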

Designed to be:

Extended with DESeq2 / edgeR

Modularized into modules/

Integrated into larger multi-omics workflows

