Merging axiom annotations causes unexpected loss of information in OBO output #1244

@gouttegd

Now that the repair --merge-axiom-annotations command has been fixed (#1240), I noticed that this feature has an unexpected effect when writing to OBO.

Consider the following minimal example:

Prefix(fbbt:=<http://purl.obolibrary.org/obo/fbbt#>)
Prefix(obo:=<http://purl.obolibrary.org/obo/>)
Prefix(rdfs:=<http://www.w3.org/2000/01/rdf-schema#>)
Prefix(oboInOwl:=<http://www.geneontology.org/formats/oboInOwl#>)


Ontology(<http://purl.obolibrary.org/obo/fbbt.owl>
Declaration(Class(obo:FBbt_00007113))
Declaration(AnnotationProperty(fbbt:HARTENSTEIN_BRAIN_LINEAGE))
Declaration(AnnotationProperty(fbbt:ITO_LEE_BRAIN_LINEAGE))
Declaration(AnnotationProperty(oboInOwl:SynonymTypeProperty))
Declaration(AnnotationProperty(oboInOwl:hasExactSynonym))
Declaration(AnnotationProperty(oboInOwl:hasSynonymType))
Declaration(AnnotationProperty(rdfs:label))

# Annotation Property: fbbt:HARTENSTEIN_BRAIN_LINEAGE (name in Hartenstein brain lineage nomenclature scheme)
AnnotationAssertion(rdfs:label fbbt:HARTENSTEIN_BRAIN_LINEAGE "name in Hartenstein brain lineage nomenclature scheme")
SubAnnotationPropertyOf(fbbt:HARTENSTEIN_BRAIN_LINEAGE oboInOwl:SynonymTypeProperty)

# Annotation Property: fbbt:ITO_LEE_BRAIN_LINEAGE (name in Ito/Lee brain lineage nomenclature scheme)
AnnotationAssertion(rdfs:label fbbt:ITO_LEE_BRAIN_LINEAGE "name in Ito/Lee brain lineage nomenclature scheme")
SubAnnotationPropertyOf(fbbt:ITO_LEE_BRAIN_LINEAGE oboInOwl:SynonymTypeProperty)

# Class: obo:FBbt_00007113 (neuroblast MB)
AnnotationAssertion(Annotation(oboInOwl:hasSynonymType fbbt:HARTENSTEIN_BRAIN_LINEAGE) oboInOwl:hasExactSynonym obo:FBbt_00007113 "neuroblast MBp")
AnnotationAssertion(Annotation(oboInOwl:hasSynonymType fbbt:ITO_LEE_BRAIN_LINEAGE) oboInOwl:hasExactSynonym obo:FBbt_00007113 "neuroblast MBp")
AnnotationAssertion(rdfs:label obo:FBbt_00007113 "neuroblast MB")

)

where the obo:FBbt_00007113 class has two (identical) exact synonyms (“neuroblast MBp”) with two different types (HARTENSTEIN_BRAIN_LINEAGE and ITO_LEE_BRAIN_LINEAGE).

Converting to OBO, this yields (as expected):

[Term]
id: FBbt:00007113
name: neuroblast MB
synonym: "neuroblast MBp" EXACT ITO_LEE_BRAIN_LINEAGE []
synonym: "neuroblast MBp" EXACT HARTENSTEIN_BRAIN_LINEAGE []

But if we merge the axiom annotations with repair --merge-axiom-annotations, the original file becomes (as expected):

# showing only what differs from the file above...
AnnotationAssertion(Annotation(oboInOwl:hasSynonymType fbbt:HARTENSTEIN_BRAIN_LINEAGE) 
                    Annotation(oboInOwl:hasSynonymType fbbt:ITO_LEE_BRAIN_LINEAGE)
                    oboInOwl:hasExactSynonym obo:FBbt_00007113 "neuroblast MBp")

where we now have a single annotation assertion axiom for “neuroblast MBp”, carrying two axiom annotations that represent the two different synonym types. Converting to OBO, this yields (unexpectedly):

[Term]
id: FBbt:00007113
name: neuroblast MB
synonym: "neuroblast MBp" EXACT ITO_LEE_BRAIN_LINEAGE []

Presumably, the OWL2OBO converter in the OWLAPI does not expect an oboInOwl:has*Synonym assertion axiom to have more than one oboInOwl:hasSynonymType annotation, and when it encounters several of them it uses only one and discards the others.
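
To make the suspected mechanism concrete, here is a minimal Python sketch. It is a toy model, not the actual ROBOT or OWLAPI code: the tuple representation of axioms and the functions merge_axiom_annotations and naive_synonym_lines are hypothetical, and the choice of "keep the last type" is only an assumption that happens to reproduce the output above.

```python
SYNONYM_TYPE = "oboInOwl:hasSynonymType"

# Toy axiom: (axiom annotations, property, subject, value).
# This representation is hypothetical; it only mirrors the example above.
AXIOMS = [
    (((SYNONYM_TYPE, "fbbt:HARTENSTEIN_BRAIN_LINEAGE"),),
     "oboInOwl:hasExactSynonym", "obo:FBbt_00007113", "neuroblast MBp"),
    (((SYNONYM_TYPE, "fbbt:ITO_LEE_BRAIN_LINEAGE"),),
     "oboInOwl:hasExactSynonym", "obo:FBbt_00007113", "neuroblast MBp"),
]


def merge_axiom_annotations(axioms):
    """Merge axioms that differ only in their axiom annotations."""
    merged = {}
    for annots, prop, subject, value in axioms:
        # Axioms with the same (property, subject, value) share one entry.
        merged.setdefault((prop, subject, value), []).extend(annots)
    return [(tuple(a), p, s, v) for (p, s, v), a in merged.items()]


def naive_synonym_lines(axioms):
    """Write one synonym: tag per axiom, keeping a single synonym type.

    Assumption: when several types are present, only the last one is kept.
    """
    lines = []
    for annots, _prop, _subject, value in axioms:
        types = [v.split(":")[-1] for k, v in annots if k == SYNONYM_TYPE]
        type_part = f" {types[-1]}" if types else ""
        lines.append(f'synonym: "{value}" EXACT{type_part} []')
    return lines


if __name__ == "__main__":
    print(len(naive_synonym_lines(AXIOMS)))  # two tags before merging
    for line in naive_synonym_lines(merge_axiom_annotations(AXIOMS)):
        print(line)  # a single tag after merging
```

In this model the merge itself loses nothing (both annotations are still on the merged axiom); the information is lost only when the writer assumes one type per synonym.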

I am not entirely sure whether ROBOT or the OWLAPI is “at fault” here, or even whether there is a fault at all. I think ROBOT is right to merge the two axioms, but the OWLAPI is not wrong to assume that only one synonym type is allowed (AFAIK the OBO format does not permit one synonym tag to have more than one type).

Possible fixes:

(1) Amend ROBOT’s repair --merge-axiom-annotations to specifically detect cases like the one above and avoid merging in such cases.

(2) Amend the OWLAPI OWL2OBO converter so that a single annotation assertion for a synonym carrying more than one synonym type annotation gets converted into distinct synonym: tags.

I think (2) is a better solution but I’d like to know what other people think.
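
To illustrate what the two options would amount to, here is a rough Python sketch on a toy model of axioms. Nothing here is ROBOT or OWLAPI code: the tuple layout and the functions synonym_types, can_merge, and expanded_synonym_lines are hypothetical, and other axiom annotations (such as xrefs) are simply ignored.

```python
SYNONYM_TYPE = "oboInOwl:hasSynonymType"

# Mapping from oboInOwl synonym properties to OBO synonym scopes.
SCOPES = {
    "oboInOwl:hasExactSynonym": "EXACT",
    "oboInOwl:hasBroadSynonym": "BROAD",
    "oboInOwl:hasNarrowSynonym": "NARROW",
    "oboInOwl:hasRelatedSynonym": "RELATED",
}


def synonym_types(annotations):
    """Return the distinct synonym types among axiom annotations, in order."""
    seen = []
    for key, val in annotations:
        if key == SYNONYM_TYPE and val not in seen:
            seen.append(val)
    return seen


def can_merge(annotation_sets):
    """Option (1): refuse a merge that would leave more than one synonym type."""
    combined = [a for annots in annotation_sets for a in annots]
    return len(synonym_types(combined)) <= 1


def expanded_synonym_lines(axiom):
    """Option (2): emit one synonym: tag per synonym type on a single axiom."""
    annots, prop, _subject, value = axiom
    scope = SCOPES[prop]
    types = [t.split(":")[-1] for t in synonym_types(annots)]
    if not types:
        return [f'synonym: "{value}" {scope} []']
    return [f'synonym: "{value}" {scope} {t} []' for t in types]


# The merged axiom from the example above, in the toy representation.
MERGED = (
    ((SYNONYM_TYPE, "fbbt:HARTENSTEIN_BRAIN_LINEAGE"),
     (SYNONYM_TYPE, "fbbt:ITO_LEE_BRAIN_LINEAGE")),
    "oboInOwl:hasExactSynonym", "obo:FBbt_00007113", "neuroblast MBp",
)

if __name__ == "__main__":
    for line in expanded_synonym_lines(MERGED):
        print(line)
```

With option (2) the merged OWL axiom stays as it is and the writer restores the two synonym: tags; with option (1) the merge would be skipped for this pair of axioms, since can_merge returns False for them.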
