Filter out transcripts with incomplete CDS 5' and/or 3' #1264

@susannasiebert

Some transcripts are annotated with an incomplete CDS at the 5' or 3' end. This is reflected, for example, by the first amino acid being X in the reference peptide FASTA. Such transcripts are not desired and should be filtered out.

VEP annotates such transcripts in the FLAGS (transcript quality) CSQ field. This field is either empty or contains cds_start_NF and/or cds_end_NF. If either value is set, the transcript should not be included for processing in pVACseq.
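As a rough illustration of the check described above, here is a minimal Python sketch. The function names `csq_field_names` and `has_incomplete_cds` are hypothetical and not part of pVACseq. The sketch assumes the CSQ header description ends with `Format: ` followed by the `|`-separated field names, and that multiple FLAGS values within one CSQ entry are joined with `&`.

```python
# Hypothetical helpers, not existing pVACseq code.
INCOMPLETE_CDS_FLAGS = {"cds_start_NF", "cds_end_NF"}


def csq_field_names(description):
    """Extract the CSQ field names from the VCF header description.

    Assumes the description ends with 'Format: A|B|C'.
    """
    return description.split("Format: ", 1)[1].strip('"> ').split("|")


def has_incomplete_cds(csq_entry, field_names):
    """Return True if this CSQ entry carries cds_start_NF and/or cds_end_NF."""
    values = dict(zip(field_names, csq_entry.split("|")))
    flags = values.get("FLAGS", "")
    # Assumption: multiple flags in one entry are '&'-separated.
    return any(flag in INCOMPLETE_CDS_FLAGS for flag in flags.split("&") if flag)


if __name__ == "__main__":
    header = "Consequence annotations from Ensembl VEP. Format: Allele|Consequence|Feature|FLAGS"
    names = csq_field_names(header)
    for entry in ("A|missense_variant|ENST_A|cds_start_NF", "A|missense_variant|ENST_B|"):
        print(entry, has_incomplete_cds(entry, names))
```

The transcript IDs in the example are placeholders. If the VCF was annotated without the FLAGS field, `values.get` falls back to an empty string and nothing is flagged.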

The exact behavior here is still TBD. Some options are:

  1. Hard-filter these upstream when we parse the VCF
  2. Hard-filter these upstream but only when a new variable is set. If we go this route, we probably also want to deprioritize affected transcripts when generating the aggregated report in cases where users do not set the variable to hard-filter them.
  3. When creating the aggregated report, add a new option to the transcript prioritization strategy (transcript_quality, maybe) that will deprioritize transcripts with a FLAGS value set
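To make the difference between hard-filtering and deprioritizing concrete, here is a small Python sketch. Everything in it is hypothetical: the `select_transcripts` function, the `hard_filter` parameter, and the dict layout with `id` and `flags` keys are illustrative only and do not reflect pVACseq's actual data structures or options.

```python
# Hypothetical sketch of options 2 and 3, not existing pVACseq code.
INCOMPLETE_CDS_FLAGS = {"cds_start_NF", "cds_end_NF"}


def is_flagged(transcript):
    """True if the transcript's FLAGS include an incomplete-CDS marker."""
    return bool(set(transcript["flags"]) & INCOMPLETE_CDS_FLAGS)


def select_transcripts(transcripts, hard_filter=False):
    """Drop flagged transcripts, or move them to the end of the list.

    With hard_filter=True, flagged transcripts are removed entirely.
    Otherwise they are kept but sorted after all unflagged transcripts;
    the sort is stable, so the existing order is preserved within each group.
    """
    if hard_filter:
        return [t for t in transcripts if not is_flagged(t)]
    return sorted(transcripts, key=is_flagged)


if __name__ == "__main__":
    candidates = [
        {"id": "tx1", "flags": ["cds_start_NF"]},
        {"id": "tx2", "flags": []},
        {"id": "tx3", "flags": ["cds_end_NF"]},
    ]
    print([t["id"] for t in select_transcripts(candidates)])
    print([t["id"] for t in select_transcripts(candidates, hard_filter=True)])
```

In a real prioritization strategy the flag check would presumably be one criterion among several, so its position relative to the other criteria would still need to be decided.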
