micrite-gethuman

micrite-gethuman is a Nextflow pipeline designed to extract high-confidence host (human) reads from clinical sequencing data.

Overview

When searching for microbial sequences in human clinical samples, it is helpful to have a "ground truth" subset of human DNA from those same samples. This subset serves as a baseline against which putative microbial hits can be compared on metrics such as base quality (Phred scores), helping to distinguish true biological signals from sequencing noise or artifacts.
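As a rough illustration of the base-quality comparison described above, the sketch below decodes a FASTQ/SAM quality string into Phred scores and converts them to a mean error probability. It assumes the common Phred+33 encoding; the function names are hypothetical and are not part of this pipeline.

```python
def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_string]


def mean_error_probability(quality_string):
    """Average per-base error probability, using P = 10 ** (-Q / 10)."""
    scores = phred_scores(quality_string)
    if not scores:
        raise ValueError("quality string is empty")
    return sum(10 ** (-q / 10) for q in scores) / len(scores)


# Hypothetical comparison: a host baseline read versus a putative microbial hit.
host_baseline = mean_error_probability("IIII")   # high-quality bases
candidate_hit = mean_error_probability("5555")   # lower-quality bases
looks_noisier = candidate_hit > host_baseline
```

A candidate whose error probability is far above the host baseline from the same sample may deserve extra scrutiny before being called a true microbial signal.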

This pipeline processes BAM files (paired-end reads aligned to a human reference) and applies multiple filters to ensure only the most reliable host reads are retained.

Filtering Logic

The pipeline extracts reads that meet the following criteria:

Primary Alignments Only: Excludes secondary or supplementary alignments.
Proper Pairs: Both reads must be oriented and spaced as expected by the aligner, and cannot be marked as PCR/optical duplicates.
Expected Reference Chromosome: Maps to one of a user-defined set of chromosomes given by --hostchroms (e.g., "chr1 chr2"), so that alignments to decoy contigs do not contaminate the output.
Quality Thresholds: Exceeds a user-specified minimum Mapping Quality (MAPQ), set with --min_mapping_quality.
Length Thresholds: Exceeds a user-specified minimum Query Length, set with --min_query_length.
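The criteria above can be sketched as a single predicate over SAM fields. This is an illustrative sketch only, not the pipeline's actual implementation: the Read class and function name are hypothetical, the flag bit values come from the SAM specification, and the strict "greater than" comparisons follow the word "exceeds" literally, since the text does not say whether the thresholds are inclusive.

```python
from dataclasses import dataclass

# SAM flag bits, as defined in the SAM specification.
FLAG_PROPER_PAIR = 0x2
FLAG_SECONDARY = 0x100
FLAG_DUPLICATE = 0x400
FLAG_SUPPLEMENTARY = 0x800


@dataclass
class Read:
    """Hypothetical minimal stand-in for one aligned read."""
    flag: int
    chrom: str
    mapq: int
    query_length: int


def is_confident_host_read(read, hostchroms, min_mapq, min_query_length):
    """Return True if the read passes every filter listed above."""
    # Primary alignments only: reject secondary and supplementary records.
    if read.flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
        return False
    # Proper pairs that are not marked as PCR/optical duplicates.
    if not read.flag & FLAG_PROPER_PAIR or read.flag & FLAG_DUPLICATE:
        return False
    # Must map to one of the expected host chromosomes (excludes decoys).
    if read.chrom not in hostchroms:
        return False
    # Assumption: thresholds are strict, following "exceeds" in the text.
    return read.mapq > min_mapq and read.query_length > min_query_length


hostchroms = {"chr1", "chr2"}
read = Read(flag=99, chrom="chr1", mapq=60, query_length=150)
keep = is_confident_host_read(read, hostchroms, min_mapq=20, min_query_length=10)
```

Flag 99 here is a paired, properly paired first-in-pair read whose mate is on the reverse strand, so it passes the flag checks.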

After this initial filter, the pipeline randomly subsamples the remaining reads down to the number given by the --nreads argument. Random subsampling uses the seed 111 by default, but this can be changed using the process directive (task.ext.seed = ).
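To show why a fixed seed makes the subsample reproducible, here is a minimal sketch using Python's standard random module. It is an assumption-laden illustration: the tool the pipeline actually uses for subsampling is not stated here, the function name is hypothetical, and sampling by read name is only one possible way to select reads.

```python
import random

DEFAULT_SEED = 111  # default seed stated in the text above


def subsample_read_names(read_names, nreads, seed=DEFAULT_SEED):
    """Pick up to `nreads` distinct read names, reproducibly for a given seed."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    names = sorted(set(read_names))  # fixed order: result depends only on the seed
    return rng.sample(names, min(nreads, len(names)))


# Hypothetical read names that survived the initial filter.
filtered = [f"read{i}" for i in range(100)]
first_run = subsample_read_names(filtered, nreads=5)
second_run = subsample_read_names(filtered, nreads=5)
same_result = first_run == second_run  # same seed, same input, same subsample
```

Passing a different seed to the function gives an independent draw from the same filtered reads, while the default keeps repeated runs comparable.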

Quick Start

nextflow run selkamand/micrite-gethuman -profile docker \
  --sampleid testsample \
  --hostchroms "chr1 chr2" \
  --min_query_length 10 \
  --min_mapping_quality 20 \
  --nreads 5

Testing

To verify the installation and workflow logic, run the built-in test profile:

nextflow run . -profile docker,test

See the test file README for details on what to expect in the test run output.
