
AuReMe/prolipipe

 
 


Prolipipe: large-scale assessment of bacterial metabolic profiles, focusing on specific pathways.

Prolipipe assesses the capacity to synthesize or degrade specific compounds across a large set of bacterial metabolic networks, and screens the networks accordingly.

This workflow is licensed under the GNU GPL-3.0-or-later, see the LICENSE file for details.

Prolipipe relies on outputs from the "AuFAMe" package. It runs with Python >= 3.8.

These python packages are needed :

Prolipipe also needs Quarto to generate the interactive report (version > 1.7).

Prolipipe is available on the conda channel "fermentsdufutur" and can be installed with:

conda create -n prolipipe
conda activate prolipipe
conda install -c fermentsdufutur prolipipe

Prolipipe and AuFAMe both rely on metadata about the genomes. AuFAMe needs a taxonomic ID to build accurate GSMs, while Prolipipe needs the species name and assembly level category to cluster genomes during its analyses. A single TSV file with the following columns is compatible with both:

  • "Species" : space-separated species name (used to categorize genomes in reports and heatmaps)
  • "Taxon_id" : exact taxID of the species ; can be set to "2" (Bacteria) if the species-level ID is not needed
  • "Filename" : strict name of the genome file, without file extension
  • "Strain" : space-separated strain name
  • "Status" : assembly level, used as another way to categorize genomes by assembly quality ; one of "Complete", "Chromosome", "Scaffold" or "Contig"
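
The column layout above can be checked before launching a run. The following is a minimal sketch, not part of Prolipipe itself: the function name `check_taxfile` and the example rows are hypothetical, and the rule that "Taxon_id" must be purely numeric is an assumption rather than something the README states.

```python
import csv
import io

# Column names and allowed Status values as listed in the README
REQUIRED_COLUMNS = ["Species", "Taxon_id", "Filename", "Strain", "Status"]
ALLOWED_STATUS = {"Complete", "Chromosome", "Scaffold", "Contig"}


def check_taxfile(handle):
    """Return a list of human-readable problems found in a taxonomy TSV.

    Hypothetical helper: it only checks the header and two columns.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        # Assumption: taxIDs are plain integers
        if not (row["Taxon_id"] or "").isdigit():
            problems.append(f"line {line_number}: Taxon_id is not numeric")
        if row["Status"] not in ALLOWED_STATUS:
            problems.append(
                f"line {line_number}: unexpected Status {row['Status']!r}"
            )
    return problems


# Made-up rows for illustration only; the second one has an invalid Status
example = (
    "Species\tTaxon_id\tFilename\tStrain\tStatus\n"
    "Lactococcus lactis\t1358\tGCF_000000001\tstrain A\tComplete\n"
    "Unknown bacterium\t2\tgenome_42\tstrain B\tDraft\n"
)
print(check_taxfile(io.StringIO(example)))
```

To check a real file, the same function can be given an open file handle, for example `check_taxfile(open("TAXFILE", newline=""))`, where TAXFILE is the path passed to `--tax`.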

To run Prolipipe with padmet files from AuFAMe as input and generate both TSV files and an interactive Quarto report:

prolipipe -pad DIRECTORY --tax TAXFILE --pwy PWY_FOLD

To generate TSV files without the Quarto report:

prolipipe -pad DIRECTORY --tax TAXFILE --pwy PWY_FOLD --no-report

To run Prolipipe with TSV files from AuFAMe as input and generate an interactive Quarto report:

prolipipe -i DIRECTORY --tax TAXFILE --pwy PWY_FOLD

To regenerate the Quarto report from TSV files created by Prolipipe:

prolipipe-report -i DIRECTORY -d OUT_DIRECTORY

About

The prolific pipeline covers genome annotation and metabolic pathway reconstruction. Starting from .fasta genomes, it determines whether bacterial strains are theoretically able to produce metabolites.
