alejandrogzi/xorf

xorf

end-to-end robust and comprehensive ORF prediction pipeline
The Hiller Lab at the Senckenberg Research Institute

orf . pipeline . us


genome       ─────────────────────────────────────────────
ORFs           ATG════TAA     ATG══════════TGA   ATG══TAG

translationAi  █████░░░░      ████████░░░░       ██░░░░░
RNASamba       ████░░░░░      ███████░░░░░       █░░░░░░
TRANSAID       ██████░░░      █████████░░░       ██░░░░░
Netstart2      ███░░░░░░      ██████░░░░░        █░░░░░░
BLAST          ███████░░      ████████░░░        ░░░░░░░

features       └──────┘       └──────────┘       └─────┘
                   │                │                │
                   ▼                ▼                ▼
              GBoost: +        GBoost: +        GBoost: −

Usage

Note

Requirements: Nextflow ≥ 25.04.6, Docker or Apptainer, Java.
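The requirements above can be checked before the first run. The following is a minimal sketch, not part of the repository: the function names `version_ge` and `check_prereqs` are hypothetical, `sort -V` is assumed to be available (GNU coreutils or a compatible `sort`), and the Nextflow version is assumed to appear as an `x.y.z` token in the output of `nextflow -version`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not shipped with xorf.

# Succeeds if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) placing the smaller version first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Checks for Java, a container runtime, and a recent enough Nextflow.
check_prereqs() {
  local min_nf="25.04.6" nf_ver
  command -v java >/dev/null 2>&1 \
    || { echo "java not found" >&2; return 1; }
  command -v docker >/dev/null 2>&1 || command -v apptainer >/dev/null 2>&1 \
    || { echo "neither docker nor apptainer found" >&2; return 1; }
  command -v nextflow >/dev/null 2>&1 \
    || { echo "nextflow not found" >&2; return 1; }
  # Assumption: the first x.y.z token in the banner is the Nextflow version.
  nf_ver="$(nextflow -version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  version_ge "$nf_ver" "$min_nf" \
    || { echo "nextflow ${nf_ver:-unknown} is older than $min_nf" >&2; return 1; }
  echo "prerequisites look fine"
}
```

Source the snippet and call `check_prereqs`; a non-zero exit status means at least one requirement is missing or too old.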

git clone https://github.com/hillerlab/xorf.git
cd xorf

Edit params.json (set regions, sequence, database), then:

# Docker
nextflow run main.nf -params-file params.json -profile docker

# Apptainer / Singularity
nextflow run main.nf -params-file params.json -profile apptainer

Smoke test:

nextflow run main.nf -profile test,apptainer

Note

You can also specify these options directly in params.json.

A helper shell script is provided to run the pipeline on a SLURM cluster; see the details below.

Edit the path variables at the top of assets/hpc/xorf.sh (cache dir, container image, manifest path), then submit:

sbatch --array=1-<N> xorf.sh

Each array task spawns one Nextflow head job that submits all compute as child SLURM jobs.

PREDICT_ORFS tasks run as SLURM job arrays. Partition routing, array sizes, and resource tiers are documented inline in nextflow.config; edit them there to match your cluster.
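The array size `<N>` has to match the number of runs you want to launch. A minimal sketch, assuming the manifest is a plain-text file with one run per non-empty line (the README does not state the manifest format, so check assets/hpc/xorf.sh before relying on this). The function names `manifest_size` and `manifest_entry` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the one-run-per-line manifest layout is an assumption.

# Count the non-empty lines of a manifest file.
manifest_size() {
  grep -c -v '^[[:space:]]*$' "$1"
}

# Print the N-th non-empty line of a manifest. Inside an array task,
# N would typically be "$SLURM_ARRAY_TASK_ID".
manifest_entry() {
  grep -v '^[[:space:]]*$' "$1" | sed -n "${2}p"
}
```

With these helpers, `sbatch --array=1-"$(manifest_size manifest.txt)" xorf.sh` submits one array task per manifest entry, where `manifest.txt` stands in for your own manifest path.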


Output

results/
├── 00_concat/       *bed
│   └── raw/         *bed
├── 01_duplicates/   *bed
├── 02_results/      *bed
└── pipeline_info/   timeline, trace, DAG
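After a run, it can help to see how many records each stage produced. A minimal sketch, assuming the stage directories listed above and that every file matching `*bed` is a plain-text BED file with one record per line. The function name `summarize_results` is hypothetical, and comment lines starting with `#` are skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper; not shipped with xorf.

# Print "<stage> <TAB> <file count> <TAB> <record count>" for each stage
# directory that exists under the given results root (default: results).
summarize_results() {
  local root="${1:-results}" stage files records
  for stage in 00_concat 00_concat/raw 01_duplicates 02_results; do
    [ -d "$root/$stage" ] || continue
    # Only files directly inside the stage directory are counted.
    files="$(find "$root/$stage" -maxdepth 1 -type f -name '*bed' | wc -l | tr -d ' ')"
    # Concatenate the BED files and count lines that are neither empty nor comments.
    records="$(find "$root/$stage" -maxdepth 1 -type f -name '*bed' -exec cat {} + \
      | grep -c -v -E '^(#|$)')"
    printf '%s\t%s\t%s\n' "$stage" "$files" "$records"
  done
}
```

Call `summarize_results results` from the launch directory. A sharp drop in records between stages is a prompt to inspect the intermediate files, not a verdict on the run.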

Where to edit

| File | What | Edit frequency |
| --- | --- | --- |
| params.json | Genome paths, alignment settings, checkpoints | per run |
| nextflow.config | Compute resources, profiles, container, SLURM | rarely |
