PR #158 produces a merged StringTie GTF but does not feed it into the ORF callers — novel transcript discovery currently has no effect on biological outputs. This issue wires the hybrid GTF (canonical backbone from #161 + class u novel transcripts from #164) into Ribo-TISH and Ribotricer.
Blocked by: #161 (canonical backbone), #164 (StringTie/gffcompare filter), nf-core/modules#11644 (ribotish/predict -a flag), #162 (ribotish/quality investigation). Gated on --extended_orf_analysis.
STAR --quantMode TranscriptomeSAM produces a BAM keyed to the reference transcriptome at alignment time. RiboCode and riboWaltz consume this transcriptome BAM — novel StringTie sequences not present at alignment time are invisible to them.
Resolution:
riboWaltz: canonical annotation only, permanently. It is a QC tool; hybrid GTF input has no scientific value and would degrade CDS-diagnostic plot readability.
RiboCode: canonical only in this phase. Equal novel-locus coverage for RiboCode requires a second STAR alignment against a hybrid transcriptome, which is addressed separately in #168 (feat: ORF-level differential translation analysis (DT, DTE, and DOU); second STAR pass). That issue must be filed and addressed to restore three-caller parity for novel ORFs.
Ribo-TISH + Ribotricer: genome-BAM tools — can accept any GTF directly. Updated in this issue.
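The resolution above amounts to a routing rule from tool to annotation. A minimal sketch, assuming hypothetical tool keys and a hypothetical `annotation_for` helper; the fallback to the canonical GTF when `--extended_orf_analysis` is off is also an assumption, not something this issue specifies:

```python
# Hypothetical routing helper; file names are taken from this issue,
# everything else (function name, tool keys) is illustrative only.
CANONICAL_GTF = "canonical_backbone.gtf"
HYBRID_GTF = "hybrid_reference.gtf"

# Tools reading the STAR transcriptome BAM cannot see transcripts that
# were absent from the reference at alignment time.
TRANSCRIPTOME_BAM_TOOLS = {"ribocode", "ribowaltz"}
# Genome-BAM tools can accept any GTF directly.
GENOME_BAM_TOOLS = {"ribotish", "ribotricer"}


def annotation_for(tool: str, extended_orf_analysis: bool) -> str:
    """Return the GTF a tool should receive under the rules sketched above."""
    key = tool.lower()
    if key in TRANSCRIPTOME_BAM_TOOLS:
        # Canonical only, regardless of the flag.
        return CANONICAL_GTF
    if key in GENOME_BAM_TOOLS:
        # Assumption: without the gate, these tools keep the canonical GTF.
        return HYBRID_GTF if extended_orf_analysis else CANONICAL_GTF
    raise ValueError(f"no annotation rule defined for tool: {tool}")
```

Keeping the rule in one place makes the transcriptome-BAM constraint explicit instead of scattering it across per-tool wiring.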
Ribo-TISH predict (requires nf-core/modules#11644):
-g novel_intergenic.gtf # discovery target
-a canonical_backbone.gtf # background model + ORF classification
Or pass hybrid_reference.gtf to -g (simpler; slightly less clean separation).
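The two Ribo-TISH options above can be sketched as an argument builder. This is an illustration only: the helper name and the BAM, FASTA, and output file names are hypothetical, and it assumes the standard `ribotish predict` options `-b`, `-f`, and `-o` alongside the `-g` and `-a` flags discussed here.

```python
from typing import List, Optional


def ribotish_predict_args(
    ribo_bam: str,
    genome_fasta: str,
    output: str,
    discovery_gtf: str,
    typing_gtf: Optional[str] = None,
) -> List[str]:
    """Build a `ribotish predict` argument list (hypothetical helper).

    discovery_gtf is passed to -g; typing_gtf, when given, is passed to -a
    as the additional annotation used for ORF typing.
    """
    args = [
        "ribotish", "predict",
        "-b", ribo_bam,
        "-g", discovery_gtf,
        "-f", genome_fasta,
        "-o", output,
    ]
    if typing_gtf is not None:
        args += ["-a", typing_gtf]
    return args


# Split mode: novel transcripts as discovery target, canonical for typing.
split_mode = ribotish_predict_args(
    "ribo.bam", "genome.fa", "pred.txt",
    discovery_gtf="novel_intergenic.gtf",
    typing_gtf="canonical_backbone.gtf",
)

# Simpler mode: the hybrid GTF goes to -g and -a is omitted.
hybrid_mode = ribotish_predict_args(
    "ribo.bam", "genome.fa", "pred.txt",
    discovery_gtf="hybrid_reference.gtf",
)
```

The split mode depends on the module exposing `-a`, which is why it is blocked on nf-core/modules#11644; the hybrid mode needs no module change.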
Ribotricer prepare-orfs: pass hybrid_reference.gtf directly. No module change needed — Ribotricer classifies all CDS-absent transcripts as novel automatically.
Plastid metagene_generate / psite: canonical backbone GTF only (requires CDS for ROI generation). Plastid wiggle tracks are genome-wide — they can quantify P-sites at any coordinate including novel ORF loci without a GTF change.
Do not pre-annotate novel transcripts with TransDecoder — labelling sequence-predicted ORFs as annotated CDS confuses all downstream classifiers and conflates sequence-based prediction with Ribo-seq evidence.
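One way to guard against accidental pre-annotation is to check that the novel transcripts carry no CDS features before the hybrid GTF is handed to the callers. A minimal sketch, assuming tab-separated GTF lines with a `transcript_id` attribute; the `split_by_cds` helper is hypothetical and not part of any tool named in this issue:

```python
import re
from typing import Iterable, Set, Tuple

_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def split_by_cds(gtf_lines: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (transcripts with a CDS feature, transcripts without one)."""
    seen: Set[str] = set()
    with_cds: Set[str] = set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        match = _TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue  # e.g. gene-level records without a transcript_id
        transcript = match.group(1)
        seen.add(transcript)
        if fields[2] == "CDS":
            with_cds.add(transcript)
    return with_cds, seen - with_cds


example = [
    'chr1\tref\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tref\tCDS\t120\t180\t.\t+\t0\tgene_id "G1"; transcript_id "T1";',
    'chr1\tStringTie\texon\t900\t990\t.\t+\t.\tgene_id "N1"; transcript_id "N1.1";',
]
annotated, cds_absent = split_by_cds(example)
```

In this example `T1` lands in `annotated` and `N1.1` in `cds_absent`; a novel transcript showing up in the first set would indicate that a sequence-predicted CDS has leaked into the annotation.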
Hybrid GTF construction
Emit hybrid_reference.gtf as a published output at <outdir>/stringtie/hybrid_reference.gtf.
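The hybrid GTF (canonical backbone plus class u novel transcripts) could be assembled roughly as follows. This is a sketch under stated assumptions: it assumes the gffcompare annotated GTF carries `class_code` on transcript records only, the `build_hybrid_gtf` helper is hypothetical, and the actual filter from #164 may apply further criteria.

```python
import re
from typing import Iterable, Iterator, List, Tuple

_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')
_CLASS_CODE = re.compile(r'class_code "([^"]+)"')


def _records(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (raw line, fields) for well-formed 9-column GTF records."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) == 9:
            yield line, fields


def build_hybrid_gtf(
    canonical_lines: Iterable[str], annotated_lines: Iterable[str]
) -> List[str]:
    """Canonical records followed by all records of class u transcripts."""
    annotated = list(_records(annotated_lines))

    # Pass 1: collect IDs of transcripts that gffcompare labelled class u.
    novel_ids = set()
    for _, fields in annotated:
        if fields[2] != "transcript":
            continue
        code = _CLASS_CODE.search(fields[8])
        tx = _TRANSCRIPT_ID.search(fields[8])
        if code and tx and code.group(1) == "u":
            novel_ids.add(tx.group(1))

    # Pass 2: keep every record (transcript and exon) of those transcripts.
    hybrid = [line for line, _ in _records(canonical_lines)]
    for line, fields in annotated:
        tx = _TRANSCRIPT_ID.search(fields[8])
        if tx and tx.group(1) in novel_ids:
            hybrid.append(line)
    return hybrid
```

The output is unsorted; a real implementation would likely sort by coordinate before publishing the file.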