Merged
52 commits
105a51b
BIgMAG compatibility
jeffe107 Sep 16, 2025
5e430e3
[automated] Fix code linting
nf-core-bot Sep 17, 2025
56a200d
Update docs/output.md
jeffe107 Sep 17, 2025
0168b46
Update docs/output.md
jeffe107 Sep 17, 2025
5abd21b
Update main.nf
jeffe107 Sep 17, 2025
926d01c
Update main.nf
jeffe107 Sep 17, 2025
f5b4502
Update main.nf
jeffe107 Sep 17, 2025
83ae6a0
Update modules/local/bigmag_summary/main.nf
jeffe107 Sep 17, 2025
7ef95af
Update modules/local/bigmag_summary/main.nf
jeffe107 Sep 17, 2025
99e5bf8
Update modules/local/bigmag_summary/main.nf
jeffe107 Sep 17, 2025
38e48b2
Update bigmag_summary.py
jeffe107 Sep 17, 2025
22517cd
Update usage.md
jeffe107 Sep 17, 2025
9ca66ae
Update nextflow.config
jeffe107 Sep 17, 2025
568abc5
Update mag.nf
jeffe107 Sep 17, 2025
d34d2ad
Update nextflow_schema.json
jeffe107 Sep 17, 2025
ed196db
Update CHANGELOG.md
jeffe107 Sep 17, 2025
0f3476d
Update main.nf
jeffe107 Sep 17, 2025
bb864fa
Update meta.yml
jeffe107 Sep 17, 2025
07a0f3e
Merge branch 'dev' into dev
jfy133 Nov 3, 2025
f6e0fa5
Merge branch 'dev' into dev
jfy133 Nov 7, 2025
658f63f
BIgMAG compatibility
jeffe107 Nov 21, 2025
2ce8180
Merge branch 'dev' into dev
jfy133 Nov 21, 2025
a670f5e
[automated] Fix code linting
nf-core-bot Nov 21, 2025
be41a8a
Delete modules/local/concat_bigmag directory
jeffe107 Nov 27, 2025
bb198c3
Update docs/output.md
jeffe107 Nov 27, 2025
a65a4ef
Update docs/usage.md
jeffe107 Nov 27, 2025
af7cbc3
Update usage.md
jeffe107 Nov 27, 2025
2e5011c
Update subworkflows/local/bin_qc/main.nf
jeffe107 Nov 27, 2025
7906c2b
Add validation for BIgMAG file generation parameters
jeffe107 Nov 27, 2025
fa84fcf
Update workflows/mag.nf
jeffe107 Nov 27, 2025
8e58f35
Clarify help text for generate_bigmag_file
jeffe107 Nov 27, 2025
5c3535a
Fix error message for BIgMAG file generation
jeffe107 Nov 27, 2025
e6b272b
Fix BIgMAG parameter validation logic
jeffe107 Nov 27, 2025
c148f2d
Fix logical error in BIgMAG parameters check
jeffe107 Nov 27, 2025
547222d
Add BigMAG compatibility badge to README
jeffe107 Nov 27, 2025
819a5a6
Update BigMAG badge link in README.md
jeffe107 Nov 27, 2025
c942602
Update badge for BigMAG compatibility
jeffe107 Nov 27, 2025
a54d008
[automated] Fix code linting
nf-core-bot Nov 28, 2025
dab159e
Update docs/output.md
jeffe107 Nov 28, 2025
a6429c5
Update docs/output.md
jeffe107 Nov 28, 2025
4a05fa9
Update nextflow_schema.json
jeffe107 Nov 28, 2025
97840fd
Update test_assembly_input.config
jeffe107 Nov 28, 2025
bfb0e37
Merge branch 'dev' into dev-jeffe107
dialvarezs Nov 28, 2025
6201615
[automated] Fix code linting
nf-core-bot Nov 28, 2025
a20cf8e
Add BigMAG compatibility badge to README
jeffe107 Nov 28, 2025
cb2863a
Update BigMAG compatibility badge color
jeffe107 Nov 28, 2025
9d82e0f
Update CITATIONS.md
jeffe107 Nov 28, 2025
0bc2b1d
Update docs/usage.md
jeffe107 Nov 28, 2025
4806731
Fix punctuation in BIgMAG compatibility section
jeffe107 Nov 28, 2025
d959f3c
Fix description formatting for generate_bigmag_file
jeffe107 Nov 28, 2025
d31346d
Update snapshot
dialvarezs Nov 28, 2025
7d1a105
Move entry to dev section
jfy133 Dec 1, 2025
1 change: 1 addition & 0 deletions CHANGELOG.md
Expand Up @@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- [#905](https://github.com/nf-core/mag/pull/905) - Add nf-test snapshot for `test_assembly_input` profile (by @dialvarezs)
- [#930](https://github.com/nf-core/mag/pull/930) - Add binner SemiBin2 (by @d4straub)
- [#861](https://github.com/nf-core/mag/pull/861) - Added `--generate_bigmag_file` to execute the bigmag workflow that generates the file to be used as input for [BIgMAG](https://github.com/jeffe107/BIgMAG) (added by @jeffe107)

### `Changed`

4 changes: 4 additions & 0 deletions CITATIONS.md
Expand Up @@ -94,6 +94,10 @@

> Orakov, A., Fullam, A., Coelho, A. P., Khedkar, S., Szklarczyk, D., Mende, D. R., Schmidt, T. S. B., and Bork, P.. 2021. “GUNC: Detection of Chimerism and Contamination in Prokaryotic Genomes.” Genome Biology 22 (1): 178. doi: 10.1186/s13059-021-02393-0.

- [BIgMAG](https://doi.org/10.12688/f1000research.152290.2)

> Yepes-García, J., Falquet, L. (2024). Metagenome quality metrics and taxonomical annotation visualization through the integration of MAGFlow and BIgMAG. F1000Research 13:640. doi.org/10.12688/f1000research.152290.2

- [MaxBin2](https://doi.org/10.1093/bioinformatics/btv638)

> Yu-Wei, W., Simmons, B. A. & Singer, S. W. (2015) MaxBin 2.0: An Automated Binning Algorithm to Recover Genomes from Multiple Metagenomic Datasets. Bioinformatics 32 (4): 605–7. doi: 10.1093/bioinformatics/btv638.
4 changes: 4 additions & 0 deletions README.md
Expand Up @@ -16,12 +16,15 @@
[![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/)
[![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)
[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)

[![Launch on Seqera Platform](https://img.shields.io/badge/Launch%20%F0%9F%9A%80-Seqera%20Platform-%234256e7)](https://cloud.seqera.io/launch?pipeline=https://github.com/nf-core/mag)

[![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23mag-4A154B?labelColor=000000&logo=slack)](https://nfcore.slack.com/channels/mag)[![Follow on Bluesky](https://img.shields.io/badge/bluesky-%40nf__core-1185fe?labelColor=000000&logo=bluesky)](https://bsky.app/profile/nf-co.re)[![Follow on Mastodon](https://img.shields.io/badge/mastodon-nf__core-6364ff?labelColor=FFFFFF&logo=mastodon)](https://mstdn.science/@nf_core)[![Watch on YouTube](http://img.shields.io/badge/youtube-nf--core-FF0000?labelColor=000000&logo=youtube)](https://www.youtube.com/c/nf-core)

![HiRSE Code Promo Badge](https://img.shields.io/badge/Promo-8db427?label=HiRSE&labelColor=005aa0&link=https%3A%2F%2Fgo.fzj.de%2FCodePromo)

[![Static Badge](https://img.shields.io/badge/%F0%9F%8D%94%20%20BIgMAG-compatible-%2324B064)](https://github.com/jeffe107/BIgMAG)

## Introduction

**nf-core/mag** is a bioinformatics best-practise analysis pipeline for assembly, binning and annotation of metagenomes.
Expand Down Expand Up @@ -97,6 +100,7 @@ Other code contributors include:
- [Greg Fedewa](https://github.com/harper357)
- [Vini Salazar](https://github.com/vinisalazar)
- [Alex Caswell](https://github.com/AlexHoratio)
- [Jeferyd Yepes](https://github.com/jeffe107)

Long read processing was inspired by [caspargross/HybridAssembly](https://github.com/caspargross/HybridAssembly) written by Caspar Gross [@caspargross](https://github.com/caspargross)

71 changes: 71 additions & 0 deletions bin/bigmag_summary.py
@@ -0,0 +1,71 @@
#!/usr/bin/env python

## Originally written by Jeferyd Yepes and released under the MIT license.
## See git repository (https://github.com/nf-core/mag) for full license text.

import pandas as pd
import re
import argparse
import sys
import warnings

def parse_args(args=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", "--summary", metavar="FILE", help="Pipeline summary file.")
    parser.add_argument("-g", "--gunc_summary", metavar="FILE", help="GUNC summary file.")

    parser.add_argument(
        "-o",
        "--out",
        required=True,
        metavar="FILE",
        type=argparse.FileType("w"),
        help="Output file containing final bigmag summary.",
    )
    return parser.parse_args(args)


def main(args=None):
    args = parse_args(args)

    if (
        not args.summary
        and not args.gunc_summary
    ):
        sys.exit(
            "No summary specified! "
            "Please specify the pipeline summary and the GUNC summary."
        )

    df_summary = pd.read_csv(args.summary, sep='\t')
    df_summary.columns = df_summary.columns.str.replace(r'(_busco|_checkm2|_checkm|_gtdbtk|_gunc|_quast)$', '', regex=True)
    for i in range(len(df_summary["bin"])):
        name = df_summary["bin"][i]
        name = re.sub(r'\.(fa|fasta)(\..*)?$', '', name)
        df_summary.at[i,"bin"] = name
    df_summary = df_summary.sort_values(by='bin')
    df_summary["bin"] = df_summary["bin"].astype(str)

    df_gunc = pd.read_csv(args.gunc_summary, sep='\t')
    df_gunc["genome"] = df_gunc["genome"].astype(str)
    df_gunc = df_gunc.sort_values(by='genome')

    df_summary = pd.merge(df_summary, df_gunc, left_on='bin', right_on='genome', how='left')

    df_summary.rename(columns={'bin': 'Bin'}, inplace=True)
    columns_to_remove = ['Name', "genome", 'Input_file', 'Assembly', 'Bin Id']
    df_summary = df_summary.drop(columns=columns_to_remove, errors="ignore")

    df_summary['sample'] = None
    for f in range(len(df_summary["Bin"])):
        match = re.search(r'^.*?-.*?-(.*)$', df_summary["Bin"][f])
        if match:
            name = match.group(1)
            name = re.sub(r'\.(unbinned|noclass)(\..*)?$', '', name)
            name = re.sub(r'\.\d+(\.[^.]+)?$', '', name)
            df_summary.at[f,"sample"] = name

    df_summary.to_csv(args.out, sep="\t", index=True)

if __name__ == "__main__":
    sys.exit(main())
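
The two regex steps in `bin/bigmag_summary.py` (stripping the FASTA extension, then deriving a `sample` value from the bin name) can be tried in isolation. The following is a minimal sketch using only the standard library; the helper names `strip_fasta_extension` and `sample_from_bin` and the example bin names are hypothetical, while the patterns are copied from the script above.

```python
import re
from typing import Optional


def strip_fasta_extension(bin_name: str) -> str:
    """Drop a trailing .fa/.fasta extension plus anything after it (e.g. .gz)."""
    return re.sub(r"\.(fa|fasta)(\..*)?$", "", bin_name)


def sample_from_bin(bin_name: str) -> Optional[str]:
    """Return the part after the second hyphen, minus bin-number suffixes.

    Assumes names shaped like '<assembler>-<binner>-<sample>...'; returns None
    when the name does not contain two hyphens.
    """
    match = re.search(r"^.*?-.*?-(.*)$", bin_name)
    if not match:
        return None
    name = match.group(1)
    # Remove '.unbinned' / '.noclass' and whatever follows them.
    name = re.sub(r"\.(unbinned|noclass)(\..*)?$", "", name)
    # Remove a trailing '.<digits>' bin counter (optionally with one more suffix).
    name = re.sub(r"\.\d+(\.[^.]+)?$", "", name)
    return name


# Hypothetical bin file names, for illustration only.
for raw in [
    "MEGAHIT-MetaBAT2-sample1.1.fa",
    "SPAdes-MaxBin2-sampleA.unbinned.3.fa.gz",
    "bin1.fasta",
]:
    clean = strip_fasta_extension(raw)
    print(raw, "->", clean, "->", sample_from_bin(clean))
```

Names without two hyphens yield no sample value, which matches the script leaving `sample` as `None` for such rows.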
1 change: 1 addition & 0 deletions bin/combine_tables.py
Expand Up @@ -202,6 +202,7 @@ def main(args=None):
"Coding_Density",
"Translation_Table_Used",
"Total_Coding_Sequences",
"Genome_Size"
]
checkm2_results = pd.read_csv(
args.checkm2_summary, usecols=use_columns, sep="\t"
9 changes: 9 additions & 0 deletions conf/modules.config
Expand Up @@ -990,4 +990,13 @@ process {
            saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
        ]
    }
    withName: BIGMAG {
        publishDir = [
            [
                path: { "${params.outdir}/GenomeBinning/BIgMAG/" },
                mode: params.publish_dir_mode,
                pattern: '*.tsv',
            ]
        ]
    }
}
2 changes: 2 additions & 0 deletions conf/test_assembly_input.config
Expand Up @@ -55,8 +55,10 @@ params {

// TODO: enable when we have a suitable way to run a small test
// GUNC fails with exit code 1 if no matches, see https://github.com/grp-bork/gunc/issues/42
// To generate the BIgMAG file, it is necessary to include GUNC in the execution
run_gunc = false
gunc_db = params.pipelines_testdata_base_path + 'mag/databases/gunc/gunc-mock.dmnd'
//generate_bigmag_file = true

skip_metaeuk = false
metaeuk_mmseqs_db = 'Kalamari'
11 changes: 11 additions & 0 deletions docs/output.md
Expand Up @@ -843,6 +843,17 @@ Note that in contrast to the other tools, for CheckM the bin name given in the c

All columns other than the primary `bin` key column, and the `Depth <sample name>` columns, will include a suffix specifying from which bin QC tool the column is derived from to distinguish identically named columns from different tools.

## Summary file to be used as input for BIgMAG

<details markdown="1">
<summary>Output files</summary>

- `GenomeBinning/BIgMAG/bigmag_summary.tsv`: Summary of bin sequencing depths together with GUNC, QUAST, GTDB-Tk, BUSCO and CheckM2 results.

</details>

The output file in this directory is used as input for the dashboard [BIgMAG](https://github.com/jeffe107/BIgMAG) for visualisation and evaluation of MAG quality.
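
As a rough illustration of how column names in `bigmag_summary.tsv` differ from those in `bin_summary.tsv`: the script removes the per-tool suffix from each column name. The sketch below reproduces that step with the standard library only; the pattern is copied from `bin/bigmag_summary.py`, while the helper name `strip_tool_suffix` and the example column names are hypothetical.

```python
import re

# Suffix pattern copied from bin/bigmag_summary.py.
TOOL_SUFFIX = re.compile(r"(_busco|_checkm2|_checkm|_gtdbtk|_gunc|_quast)$")


def strip_tool_suffix(column: str) -> str:
    """Remove a trailing bin QC tool suffix from a column name, if present."""
    return TOOL_SUFFIX.sub("", column)


# Hypothetical column names, for illustration only.
columns = ["bin", "Depth sample1", "Completeness_checkm2", "N50_quast"]
print([strip_tool_suffix(c) for c in columns])
```

Columns without a tool suffix, such as `bin` and the `Depth <sample name>` columns, pass through unchanged.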

## Ancient DNA

Optional, only running when parameter `-profile ancient_dna` is specified.
6 changes: 6 additions & 0 deletions docs/usage.md
Expand Up @@ -543,3 +543,9 @@ Up until version 4.0.0, this pipeline offered raw read taxonomic profiling using
This feature was removed in version 5.0.0 to strengthen the pipeline's focus on metagenome assembly and binning.

If you require taxonomic profiling of raw reads, we recommend using [nf-core/taxprofiler](https://nf-co.re/taxprofiler/), which is specifically designed for taxonomic profiling of raw reads and supports a wide range of tools for this purpose.

## BIgMAG compatibility

With the parameter `--generate_bigmag_file`, a module is triggered that generates a single file combining the output of all bin-quality tools.
This file can be uploaded to the [BIgMAG](https://github.com/jeffe107/BIgMAG) dashboard for visualising and evaluating MAGs.
Please note that generating this file requires the parameters `--run_busco`, `--run_gunc` and `--run_checkm2`, and that bin QC, GTDB-Tk and QUAST are executed (i.e., not skipped).
The file `bigmag_summary.tsv` located at `GenomeBinning/BIgMAG` is the only file needed to run the BIgMAG dashboard.
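
The parameter combination that the pipeline accepts can be restated as a small truth check. The actual validation in this PR is written in Groovy inside `validateInputParameters`; the Python function below is only an illustrative restatement of that condition, and the name `bigmag_params_valid` and the use of a plain dict for parameters are assumptions.

```python
def bigmag_params_valid(params: dict) -> bool:
    """Return True when the parameters allow generating the BIgMAG file.

    Illustrative restatement of the pipeline's Groovy check, not the real code.
    """
    if not params.get("generate_bigmag_file", False):
        return True  # nothing to validate when the feature is off

    required_on = ("run_gunc", "run_checkm2", "run_busco")
    must_not_skip = ("skip_gtdbtk", "skip_quast", "skip_binqc")
    return all(params.get(p, False) for p in required_on) and not any(
        params.get(p, False) for p in must_not_skip
    )


ok = {"generate_bigmag_file": True, "run_gunc": True, "run_checkm2": True, "run_busco": True}
print(bigmag_params_valid(ok))
print(bigmag_params_valid({**ok, "skip_quast": True}))
```

The second call fails the check because QUAST is skipped, which is the situation in which the pipeline raises its error.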
7 changes: 7 additions & 0 deletions modules/local/bigmag/environment.yml
@@ -0,0 +1,7 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json
channels:
- conda-forge
- bioconda
dependencies:
- conda-forge::python=3.10.6
- conda-forge::pandas=1.4.3
36 changes: 36 additions & 0 deletions modules/local/bigmag/main.nf
@@ -0,0 +1,36 @@
process BIGMAG {

    conda "conda-forge::pandas=1.4.3"
    container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container
        ? 'https://depot.galaxyproject.org/singularity/pandas:1.4.3'
        : 'biocontainers/pandas:1.4.3'}"

    input:
    path summary
    path gunc_sum

    output:
    path "bigmag_summary.tsv", emit: bigmag_summary
    path "versions.yml"      , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def summary = summary.sort().size() > 0 ? "--summary ${summary}" : ""
    def gunc_summary = gunc_sum.sort().size() > 0 ? "--gunc_summary ${gunc_sum}" : ""
    """
    bigmag_summary.py \
        ${args} \
        ${summary} \
        ${gunc_summary} \
        --out bigmag_summary.tsv

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        python: \$(python --version 2>&1 | sed 's/Python //g')
        pandas: \$(python -c "import pkg_resources; print(pkg_resources.get_distribution('pandas').version)")
    END_VERSIONS
    """
}

Review comments on the `${gunc_summary} \` line:

Member: I'm wondering now why we don't have GUNC already in the bin_summary.tsv table by default 🤔 I think it would make sense to have it anyway...

Member: Then all you need to do is rename the columns etc.
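
One observation on how the process and the script interact: the process passes `--summary` and `--gunc_summary` only when the corresponding file is staged, while the guard in `bin/bigmag_summary.py` exits only when both are missing. Because the script reads both files unconditionally, a single missing input would presumably surface later as a less clear pandas error. A stricter variant, sketched here as an assumption rather than as the merged code, could let `argparse` enforce both arguments. For simplicity the sketch takes `--out` as a plain path instead of `argparse.FileType("w")`.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical stricter parser: both summaries are mandatory."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-s", "--summary", metavar="FILE", required=True,
        help="Pipeline summary file.",
    )
    parser.add_argument(
        "-g", "--gunc_summary", metavar="FILE", required=True,
        help="GUNC summary file.",
    )
    parser.add_argument(
        "-o", "--out", metavar="FILE", required=True,
        help="Output file containing final bigmag summary.",
    )
    return parser.parse_args(argv)


args = parse_args(["-s", "bin_summary.tsv", "-g", "gunc_summary.tsv", "-o", "bigmag_summary.tsv"])
print(args.summary, args.gunc_summary, args.out)
```

With `required=True`, omitting either summary makes `argparse` print a usage message and exit before any file is read.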
1 change: 1 addition & 0 deletions nextflow.config
Expand Up @@ -167,6 +167,7 @@ params {
gunc_database_type = 'progenomes'
gunc_db = null
gunc_save_db = false
generate_bigmag_file = false

// Reproducibility options
megahit_fix_cpu_1 = false
5 changes: 5 additions & 0 deletions nextflow_schema.json
Expand Up @@ -1027,6 +1027,11 @@
"type": "boolean",
"description": "Save the used GUNC reference files downloaded when not using --gunc_db parameter.",
"help_text": "If specified, the corresponding DIAMOND file downloaded from the GUNC server will be stored in your output directory alongside your GUNC results."
},
"generate_bigmag_file": {
"type": "boolean",
"description": "Make a BIgMAG input file including GUNC results.",
"help_text": "Requires --run_gunc, --run_checkm2 and --run_busco. Bin QC, GTDB-Tk and QUAST must not be skipped."
}
}
},
5 changes: 3 additions & 2 deletions subworkflows/local/bin_qc/main.nf
Expand Up @@ -29,7 +29,7 @@ workflow BIN_QC {
ch_busco_final_summaries = channel.empty()
ch_checkm_final_summaries = channel.empty()
ch_checkm2_final_summaries = channel.empty()

ch_gunc_summary = channel.empty()

/*
================================
Expand Down Expand Up @@ -213,7 +213,7 @@ workflow BIN_QC {
ch_versions.mix(GUNC_RUN.out.versions)

// Make sure to keep directory in sync with modules.conf
GUNC_RUN.out.maxcss_level_tsv
ch_gunc_summary = GUNC_RUN.out.maxcss_level_tsv
.map { _meta, gunc_summary -> gunc_summary }
.collectFile(
name: "gunc_summary.tsv",
Expand Down Expand Up @@ -242,6 +242,7 @@ workflow BIN_QC {
busco_summary = ch_busco_final_summaries
checkm_summary = ch_checkm_final_summaries
checkm2_summary = ch_checkm2_final_summaries
gunc_summary = ch_gunc_summary
multiqc_files = ch_multiqc_files
versions = ch_versions
}
5 changes: 5 additions & 0 deletions subworkflows/local/utils_nfcore_mag_pipeline/main.nf
Expand Up @@ -396,6 +396,11 @@ def validateInputParameters(hybrid) {
    if (params.prokka_with_compliance && !params.prokka_compliance_centre) {
        error('[nf-core/mag] ERROR: Invalid parameter combination: running PROKKA with compliance mode requires a centre name specified with `--prokka_compliance_centre <XYZ>`!')
    }

    // Check BIgMAG parameters
    if (params.generate_bigmag_file && (!params.run_gunc || !params.run_checkm2 || !params.run_busco || params.skip_gtdbtk || params.skip_quast || params.skip_binqc)) {
        error('[nf-core/mag] ERROR: To generate the BIgMAG file you need to set `--run_gunc`, `--run_checkm2` and `--run_busco`, and you must not skip bin QC, GTDB-Tk or QUAST.')
    }
}

//
4 changes: 2 additions & 2 deletions tests/test_alternatives.nf.test.snap
Expand Up @@ -240,15 +240,15 @@
"MEGAHIT-MetaBAT2-prokarya-unrefined-group-0_checkm2_report.tsv:md5,ec01903b7f8a7203856a35a7bd2d4c34",
"checkm2_summary.tsv:md5,ec01903b7f8a7203856a35a7bd2d4c34",
"tiara_summary.tsv:md5,4cbfb0fd90ba48dc33d75d10c1eddb17",
"bin_summary.tsv:md5,e6a07c3280c6db0f7822688559afb053",
"bin_summary.tsv:md5,cf879345b96d29b5b8584795504b81cf",
"bin_depths_summary.tsv:md5,a73f2fc180a2c52038c88fbbfa6a95c3"
]
],
"meta": {
"nf-test": "0.9.3",
"nextflow": "25.10.0"
},
"timestamp": "2025-11-16T01:37:18.534799185"
"timestamp": "2025-11-28T14:15:43.756748108"
},
"multiqc": {
"content": [
9 changes: 9 additions & 0 deletions workflows/mag.nf
Expand Up @@ -42,6 +42,7 @@ include { QUAST } from '../modules/local/quast_run/mai
include { QUAST_BINS } from '../modules/local/quast_bins/main'
include { QUAST_BINS_SUMMARY } from '../modules/local/quast_bins_summary/main'
include { BIN_SUMMARY } from '../modules/local/bin_summary/main'
include { BIGMAG } from '../modules/local/bigmag/main'

workflow MAG {
take:
Expand Down Expand Up @@ -474,6 +475,14 @@ workflow MAG {
        )
        ch_versions = ch_versions.mix(BIN_SUMMARY.out.versions)
    }
    if (params.generate_bigmag_file) {
        BIGMAG(
            BIN_SUMMARY.out.summary,
            BIN_QC.out.gunc_summary
        )
        ch_bigmag_summary = BIGMAG.out.bigmag_summary
        ch_versions = ch_versions.mix(BIGMAG.out.versions)
    }

/*
* Prokka: Genome annotation