@van-lamnguyen van-lamnguyen commented Jan 15, 2025

This pull request adds or modifies an ECC.

ECC: H3 K27M Mutation
This is the first introduction of the ECC H3 K27M Mutation.

Close #8
Describe the changes you have implemented and link to any relevant issues.

Before submitting this PR, please make sure:

  • You have added a few sentences describing the PR here.
  • You have added yourself or the appropriate individual as the assignee.
  • You have added at least one relevant code reviewer to the PR.
  • You have added an entry to the CHANGELOG.md (see "keep a changelog" for more information).
  • Your commit messages follow the conventional commit style.

@van-lamnguyen van-lamnguyen self-assigned this Jan 15, 2025
@claymcleod claymcleod requested review from a team and removed request for claymcleod and mcrusch January 15, 2025 22:34
@claymcleod claymcleod added the E-MOLEC A molecular characteristic. label Jan 15, 2025
kind: binary
description:
"true":
summary: Here is a summary for the 'true' value.
Member

This still needs to be filled out: how will the ECC be assigned, what will the permissible values be, and what does each of the permissible values mean?
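
As a rough illustration of the request above, the sketch below checks that a binary ECC spells out each permissible value with a summary. The `kind`, `description`, and `summary` field names come from the diff; the assumption that a binary characteristic has exactly the values `"true"` and `"false"`, and the helper name `check_binary_values`, are hypothetical and not part of the repository.

```python
def check_binary_values(ecc: dict) -> list[str]:
    """Return a list of problems found in a binary ECC's value descriptions.

    Assumes (hypothetically) that a binary ECC must describe exactly the
    permissible values "true" and "false", each with a non-empty summary.
    """
    if ecc.get("kind") != "binary":
        return ["kind is not 'binary'"]

    # An empty `description:` key would load as None, so fall back to {}.
    description = ecc.get("description") or {}
    problems = []

    for value in ("true", "false"):
        entry = description.get(value)
        if entry is None:
            problems.append(f"missing permissible value {value!r}")
        elif not str(entry.get("summary", "")).strip():
            problems.append(f"value {value!r} has no summary")

    # Anything other than the two expected keys is flagged as well.
    for key in sorted(set(description) - {"true", "false"}):
        problems.append(f"unexpected value {key!r}")

    return problems


# A draft that only describes the "true" value, as in the diff context above.
draft = {
    "kind": "binary",
    "description": {
        "true": {"summary": "Here is a summary for the 'true' value."},
    },
}
print(check_binary_values(draft))
```

Run against the draft, this reports only that the `"false"` value is missing; how the values should actually be worded is still a question for the reviewers.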

Author

Thank you for your comments. However, I am not sure what the description of the binary feature refers to.
In my understanding, it could be one of two cases:

  1. It describes the ECC, such as how we evaluate or confirm that ECC - criteria to define this ECC, for example:
    descriptions:
    "true":
    summary: Confirm by Sanger sequencing.
    "true":
    summary: confirm by RNA seq.

  2. It describes the ontology characteristics, meaning it should be in the definition of an ontology, for example:

Ontology Class: Diffuse midline glioma, H3 K27-altered
ECC:
- identifier: ECC-MOLEC-00001
name: H3 K27M Mutation
appear: true
- identifier: ECC-MOPHL-00001
name: Vascular Proliferation
appear: false
...

Please let us know your ideas or examples if possible, and correct me if I misunderstood.

@Ssandor13 Ssandor13 left a comment

H3 K27M refers to a specific mutation in the histone H3 protein, where lysine at position 27 is replaced by methionine.
This mutation is primarily associated with diffuse midline gliomas (DMGs), a category of highly aggressive brain tumors that predominantly affect children but can also occur in adults.
The mutation is linked to poor prognosis, with median overall survival ranging from 10.1 to 14.4 months post-diagnosis (PMC7739048, PMID 38102230).

@van-lamnguyen
A few things to add:

  1. I think we should describe if this variant is hg38 or hg19 and what it is in the alternative ref genome as this is important to note when classifying the biomarker.
  2. What is the Human Genome Variation Society (HGVS) nomenclature?
  3. This often occurs in the H3F3A (H3.3) or HIST1H3B/C (H3.1) genes.
  4. When this variant exists, what biological function is disrupted?

Author

van-lamnguyen commented Jan 30, 2025

  1. I think we should describe if this variant is hg38 or hg19 and what it is in the alternative ref genome as this is important to note when classifying the biomarker.
  2. What is the Human Genome Variation Society (HGVS) nomenclature?
  3. This often occurs in the H3F3A (H3.3) or HIST1H3B/C (H3.1) genes.
  4. When this variant exists, what biological function is disrupted?

@Ssandor13 Thank you for your comments. They all concern details of the variant. Since we have a variant database, would it be possible to add a field for the variant's ID from that database? That would make it easy to look up more information when needed and would keep the content consistent.

@Ssandor13

#10 (comment)
So we have these in PeCan, but they are pretty stale and outdated and I don't recommend sharing this: https://pecan.stjude.cloud/variants/details/30241

The characteristic is having this variant specifically, so I think the content should be about the variant. If we can point to other databases such as OMIM or ClinVar that provide a detailed description, it is worth discussing internally with the team what the best practice is. I'd love to hear your thoughts on which direction we should take.

@claymcleod claymcleod force-pushed the main branch 2 times, most recently from fc80b55 to ffb8ac0 Compare April 2, 2025 23:55
@van-lamnguyen
Author

Hi all, I have updated the values for the H3 K27M ECC based on the initial question of whether this mutation is observed in the sample or not. The mutation can be identified by both immunohistochemistry (IHC) and, at the genetic level, NGS techniques. So, I've described both IHC and NGS in the value's description. However, I have a question: since we are considering this ECC a molecular characteristic, should we consider only NGS testing results for this ECC, or should we also describe IHC information in the value's description?
Additionally, I've added a note explaining why K27M and K28M are used interchangeably in the ECC's description.

@claymcleod
Member

That's a good question. Off the cuff, my feeling is that a characteristic should be as specific as reasonably possible. To that end, I feel this should really just focus on the molecular detection (not IHC).

@@ -0,0 +1,70 @@
state: proposed
name: H3 K27M Mutation
Member

Suggested change
name: H3 K27M Mutation
name: H3 K27M alteration

I submit we should center on the name ("" alteration) for point mutations.

Member

I'm not going to suggest changes for every instance of mutation -> alteration, but please make those (and use alteration in the future).

Author

Thanks for the suggestions. I'd like to note that "alteration" is a broad term in genetics, which can refer to various types of DNA changes, including structural variations, deletions, insertions, and more, not just point mutations. Personally, I think that "mutation" or "mutant" would be more precise in this context. These terms are commonly used in scientific papers. That said, I also understand that we might want to generalize our terminology, so let's discuss further to find the best convention for us and avoid ambiguity.

I lean towards mutation when the variant is limited to SNV and/or Indel such as in this case. When the type of hit can be more broad, then alteration.

kind: binary
description:
"true":
summary: H3 K27M mutation is observed from the sample.
Member

Suggested change
summary: H3 K27M mutation is observed from the sample.
summary: The H3 K27M alteration is present in the associated sample.

How about "The H3 K27M mutation is present in the sample."

Member

@claymcleod claymcleod left a comment

Overall this looks pretty good. I think it does maybe inform the data model a bit, as there are both molecular and histological characteristics that roll up to the same general idea. We should discuss this in a meeting with all 5 of us.

@claymcleod claymcleod requested review from Ssandor13 and mcrusch April 15, 2025 15:17
van-lamnguyen and others added 3 commits April 18, 2025 08:38
Co-authored-by: Clay McLeod <3411613+claymcleod@users.noreply.github.com>
Co-authored-by: Clay McLeod <3411613+claymcleod@users.noreply.github.com>
Co-authored-by: Clay McLeod <3411613+claymcleod@users.noreply.github.com>
@mcrusch mcrusch left a comment

This is impressive. I think the two big points to figure out are mutation vs alteration and the "details" section of the values.

This mutation disrupts normal epigenetic regulation by inhibiting the trimethylation of histone H3 at lysine 27 (H3K27me3), leading to widespread changes in gene expression that drive tumorigenesis.
The H3 K27M mutation is primarily associated with diffuse midline gliomas (DMGs) [22286216](https://pubmed.ncbi.nlm.nih.gov/22286216/), a group of highly aggressive brain tumors that predominantly affect children, though they can also occur in adults.
This mutation is linked to a poor prognosis, with median overall survival ranging from 10 to 14 months after diagnosis [PMC7739048](https://pmc.ncbi.nlm.nih.gov/articles/PMC7739048/), [38102230](https://pubmed.ncbi.nlm.nih.gov/38102230/).
Note that the H3-K27M mutation is sometimes also referred to as K28M in annotations that include the initiator methionine in protein numbering. In histone biology literature, the convention is to exclude the initiator methionine [29766298](https://pubmed.ncbi.nlm.nih.gov/29766298/), thus the mutation is commonly described as K27M.

This is a very important comment. I was going to suggest it if it wasn't in here.
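
The numbering note in the diff above can be made concrete with a short sketch. It assumes only what the note states (the histone convention skips the initiator methionine, so positions differ by one) plus the HGVS rule that protein positions count the initiator methionine as residue 1. The function names are hypothetical and not part of this repository.

```python
import re

# Standard one-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def _parse(label: str) -> tuple[str, int, str]:
    """Split a substitution label such as 'K27M' into (ref, position, alt)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def with_initiator_met(histone_label: str) -> str:
    """Shift a histone-convention label by one to count the initiator Met."""
    ref, pos, alt = _parse(histone_label)
    return f"{ref}{pos + 1}{alt}"


def to_hgvs_protein(histone_label: str) -> str:
    """Format the shifted label in HGVS three-letter protein style."""
    ref, pos, alt = _parse(histone_label)
    return f"p.{THREE_LETTER[ref]}{pos + 1}{THREE_LETTER[alt]}"


print(with_initiator_met("K27M"))  # K28M
print(to_hgvs_protein("K27M"))     # p.Lys28Met
```

This covers only the protein-level label; the gene, transcript, and reference genome raised earlier in the thread would still need to be stated separately.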

"true":
summary: H3 K27M mutation is observed from the sample.
details: |
The presence of the H3 K27M mutation can be initially identified by positive nuclear staining in tumor cells using a H3 K27M mutation-specific immunohistochemistry (IHC) antibody, which strongly indicates the presence of the mutant histone protein.

I would think that for most modern datasets, we would be basing the call purely on NGS. The way this is written, it sounds like NGS would only be used for confirmation.

Overall, I don't really like the details section here. The way I think of it is that we want the semantics to be clear, i.e. it's important that people know what true means and what false means. I think that's abundantly clear from the summary alone. However, different institutions may have different mechanisms and even possibly different standards for deciding the presence/absence of a variant, and I think it's OK (and even preferable) to allow that variation. And it can change over time. The more detail you give about what we really mean by presence and absence, the more in sync everybody will be in theory, but it's also more work to maintain the characteristics, and I think by being so prescriptive we could find that it opens up a lot of debate that we would be better to avoid by just leaving those details up to each user. In this version we partly specify the criteria for determining presence/absence, but we leave a lot of it open to interpretation (e.g. variant calling methods and criteria on the NGS data). I think it's probably the worst of both worlds.

My suggestion is that we omit the information currently in details entirely, and I would say we shouldn't need value details for Booleans generally (and possibly not for some others as well). If the guidance that we put in here is from a referenced resource, then you could move this information to the "context" of that reference, where you could possibly summarize it as "This paper recommends practices for determining the presence of H3 K27M using immunohistochemistry and NGS testing".

Labels

E-MOLEC A molecular characteristic.

Development

Successfully merging this pull request may close these issues.

[ECC] H3 K27M alteration

5 participants