Refactor postvariantcalling. Split out varlociraptor vs other options #2043

FriederikeHanssen · 2025-10-30T12:32:05Z

Previously, the post-variant calling logic allowed different post-processing strategies (varlociraptor, normalization, concatenation) to be mixed in confusing ways. The main workflow used complex conditional logic to determine which VCFs were used based on the post-processing requested, and much of the computation was repeated on both raw and post-processed variants. This made the data flow hard to follow and error-prone.

Changes

Enforced either-or logic in post-variant calling:

If varlociraptor is requested → process through varlociraptor workflows exclusively
Else if concatenate_vcfs or normalize_vcfs → perform standard post-processing
Else → pass through original VCFs unchanged
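
The either-or rule above can be sketched in a few lines. This is a minimal illustration in Python, not the pipeline's Nextflow code: the `PostProcessingOptions` class, the `select_post_processing` function, and the returned branch labels are hypothetical names, and it assumes each option can be treated as a simple boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostProcessingOptions:
    """Hypothetical stand-in for the options named in this PR."""
    varlociraptor: bool = False
    concatenate_vcfs: bool = False
    normalize_vcfs: bool = False


def select_post_processing(opts: PostProcessingOptions) -> str:
    """Return exactly one branch label; the branches are never mixed."""
    if opts.varlociraptor:
        # varlociraptor takes precedence and excludes standard post-processing
        return "varlociraptor"
    if opts.concatenate_vcfs or opts.normalize_vcfs:
        return "standard"
    # nothing requested: the original VCFs are forwarded unchanged
    return "passthrough"
```

Because the checks are ordered, requesting varlociraptor together with normalization still yields only the varlociraptor branch, which is the exclusivity the PR describes.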

Simplified main workflow:

Removed conditional VCF gathering logic from main SAREK workflow
POST_VARIANTCALLING now always outputs VCFs (either processed or pass-through)
Annotation always consumes POST_VARIANTCALLING.out.vcfs
VCF QC now runs exclusively on raw variant calls (before any post-processing)

Additional improvements:

Standardized varlociraptor scenario file handling in main.nf
Disabled redundant publishing of intermediate normalized files
Increased the MultiQC timeout to ensure all plots are published
Varlociraptor subworkflows now emit separate vcf and tbi outputs, matching the pattern used by other post-processing subworkflows
Added branching logic to handle single chunks without concatenation
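
The single-chunk branching in the last item could look roughly like the following. This is a hedged Python sketch of the idea only; the pipeline itself does this with Nextflow channels, and the `merge_chunks` function, its `concatenate` callback, and the file names in the usage example are all hypothetical.

```python
from typing import Callable, Dict, List


def merge_chunks(
    chunks_by_key: Dict[str, List[str]],
    concatenate: Callable[[List[str]], str],
) -> Dict[str, str]:
    """Map each key to one VCF, concatenating only when there are several chunks."""
    merged: Dict[str, str] = {}
    for key, chunks in chunks_by_key.items():
        if not chunks:
            raise ValueError(f"no chunks for {key!r}")
        if len(chunks) == 1:
            # single chunk: forward it as-is and skip the concatenation step
            merged[key] = chunks[0]
        else:
            merged[key] = concatenate(chunks)
    return merged


# Usage with a stand-in concatenation step that only records what it was given
calls: List[List[str]] = []

def fake_concat(chunks: List[str]) -> str:
    calls.append(list(chunks))
    return "merged.vcf.gz"

result = merge_chunks(
    {"sample1": ["chunk0.vcf.gz"], "sample2": ["chunk0.vcf.gz", "chunk1.vcf.gz"]},
    fake_concat,
)
```

In this sketch the concatenation callback runs only for `sample2`; `sample1` keeps its single input file untouched.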

Resume limitations with varlociraptor

While the overall data flow is cleaner, varlociraptor does not currently resume reliably. I marked the likely location where this occurs, but it is not yet clear to me why.

nf-core-bot · 2025-10-30T12:32:39Z

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.3.2.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

maxulysse

Looking good

FriederikeHanssen · 2025-10-31T10:45:10Z

Waiting for a conclusion on the discussion here before updating all the checksums: https://nfcore.slack.com/archives/C05V9FRJYMV/p1761906884846409?thread_ts=1761564791.923049&cid=C05V9FRJYMV

maxulysse

Minor comments

main.nf

subworkflows/local/post_variantcalling/tests/main.nf.test

FriederikeHanssen · 2025-11-04T11:20:47Z

Merging with failing sentieon tests only

Based on #2043

This PR adds optional VCF filtering functionality to the post-variant calling workflow, enabling users to filter variant calls from all variant callers using `bcftools view`. Additionally, it refactors the post-variant calling logic to handle implicit index creation and improve handling of variant callers that produce multiple outputs.

**Why These Changes**

1. Streamlines the workflow by providing PASS-filtered variants directly
2. Maintains flexibility through customizable filtering criteria
3. Integrates seamlessly with existing normalization and concatenation steps
4. Improves handling of variant callers with complex output structures (e.g., Strelka, Manta)

**New Filtering Feature**

- Adds `--filter_vcfs` parameter to enable optional VCF filtering with `bcftools view`, with a default PASS filter
- Supports custom filtering criteria through the `--bcftools_filter_criteria` parameter
- Publishes filtered VCFs to `variant_calling/filtered/<sample>/<sample>.<variantcaller>.bcftools_filtered.vcf.gz`

**Workflow Refactoring**

- Makes index computation implicit for all bcftools operations, simplifying channel handling
- Updates emit channel structure to properly account for implicit index creation (.tbi/.csi)
- Uses basename consistently to handle variant callers that produce multiple outputs
- Fixes channel wiring between filtering, normalization, and concatenation steps
- Removes unnecessary explicit tabix indexing process
- Fixes output structure in docs

**Testing & CI**

- Adds filtering tests for multiple variant callers
- Enables filtering on full-size test profiles
- Fixes FreeBayes filtered output publishing
- Updates snapshots to reflect new output structure

**Documentation**

- Updates docs/output.md with new filtering section
- Updates subway map diagrams to reflect new workflow

## PR checklist

- [ ] This comment contains a description of changes (with reason).
- [ ] If you've fixed a bug or added code that should be tested, add tests!
- [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-core/sarek/tree/master/.github/CONTRIBUTING.md)
- [ ] If necessary, also make a PR on the nf-core/sarek _branch_ on the [nf-core/test-datasets](https://github.com/nf-core/test-datasets) repository.
- [ ] Make sure your code lints (`nf-core pipelines lint`).
- [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- [ ] Usage Documentation in `docs/usage.md` is updated.
- [ ] Output Documentation in `docs/output.md` is updated.
- [ ] `CHANGELOG.md` is updated.
- [ ] `README.md` is updated (including new tool citations and authors/contributors).

FriederikeHanssen added 3 commits October 30, 2025 11:02

clean up logic about which vcfs get annotated to be data flow based

02fc2b9

pull up parameter handeling, standardise interface

44fda17

add tests

1c9f14d

add changelog

977e856

maxulysse reviewed Oct 30, 2025

rework annotation vcf logic

5c2f7f0

FriederikeHanssen added 10 commits October 31, 2025 18:17

update multiqc

22b8fbf

fix ordering to preserve resume

8959f16

increase timeout to avoid plots not being exported

3e3107f

update snapshots with naming

66f98ed

undo mqc update

6936920

update snapshots

19dc23c

update tests

39e2f56

update snapshot for germline calling

b5c49d6

don't publish normaized files twice

b840799

update readme to reflect new logic

10bc7d5

FriederikeHanssen mentioned this pull request Nov 2, 2025

Add filtering with bcftools #2044

Merged

11 tasks

FriederikeHanssen added 7 commits November 2, 2025 18:21

update snapshot

5cf7739

remove jq

7a6a68c

remove jq from script

446958e

fix subworkflow test, and add single chunk handeling

65d4a6c

merge dev

2f3772a

remove todo string

2e7e129

comment tests back in

9ac85dd

FriederikeHanssen marked this pull request as ready for review November 3, 2025 18:34

maxulysse approved these changes Nov 3, 2025

FriederikeHanssen requested a review from famosab November 3, 2025 18:54

adress review comments

d363908

FriederikeHanssen added 3 commits November 4, 2025 10:46

fix snapshot

b52ab9a

update rocrate

1c78f64

add missing snapshot

cf4579e

FriederikeHanssen merged commit f6abd79 into nf-core:dev Nov 4, 2025
29 of 41 checks passed

FriederikeHanssen deleted the refactor_postvc branch November 4, 2025 11:21

FriederikeHanssen mentioned this pull request Dec 2, 2025

Release PR 3.7.0 #2063

Merged
