An ontological definition of disease enables each type of disease to be uniquely classified in a formalized structure. In principle, the use of disease ontology terms should make it possible to cross-link information between separate disease-related knowledge resources for a given domain.
However, multiple disease ontology frameworks have been developed for human disease (e.g. OncoTree, Experimental Factor Ontology (EFO), Disease Ontology (DO), UMLS, ICD-10), and they are used to different extents across knowledge resources in the oncology domain, such as the following:
- gene-disease associations
- drug-disease indications
- variant-disease associations
To integrate such knowledge resources, there is therefore a need to cross-link, or map, the entries across disease ontologies to the extent possible.
phenOncoX is an R data package that attempts to address this challenge. In short, phenOncoX provides a global cross-mapped set of phenotype ontology terms attributed to cancer phenotypes.
The mapping established within phenOncoX is semi-manually curated, using OncoTree as the starting point for a list of UMLS phenotype terms per cancer subtype/primary site. Next, phenOncoX appends a number of phenotypes attributed to heritable cancer conditions. Furthermore, each cancer subtype entry in OncoTree is expanded with additional subtypes that are found in the UMLS child-parent hierarchy of disease terms.
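The hierarchy-based expansion described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the package's actual curation code; the identifiers, the `CHILDREN` table, and the function `expand_with_descendants` are all hypothetical placeholders, not real UMLS concepts or phenOncoX functions. The sketch assumes the hierarchy is available as a parent-to-children lookup and collects every term reachable from a starting subtype.

```python
from collections import deque

# Hypothetical parent -> children edges. The identifiers are placeholders,
# not real UMLS concept identifiers.
CHILDREN = {
    "CUI_A": ["CUI_A1", "CUI_A2"],
    "CUI_A1": ["CUI_A1a"],
}


def expand_with_descendants(seed, children):
    """Return the seed term followed by all terms reachable via child links.

    Uses a breadth-first traversal and skips terms already collected, so a
    term that is reachable through several parents is listed only once.
    """
    collected = [seed]
    queue = deque([seed])
    while queue:
        term = queue.popleft()
        for child in children.get(term, []):
            if child not in collected:
                collected.append(child)
                queue.append(child)
    return collected


expanded = expand_with_descendants("CUI_A", CHILDREN)
print(expanded)
```

Under these assumptions, a single starting entry grows into a list that also contains its more specific subtypes, which is the effect the expansion step aims for.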
For each entry in the final list of phenotype terms, we create cross-mappings to phenotype terms from EFO, DO, and the ICD-10 classification.
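To show how such a cross-mapping can be used for integration, here is a minimal sketch in Python. It does not use the phenOncoX API or its actual column names: the `CROSS_MAP` rows, the field names (`cui`, `efo_id`, `do_id`, `icd10_code`), the two example resources, and the helper `link_via_cui` are all assumptions made for illustration, and every identifier is a placeholder. The idea is that two resources annotated with different ontologies can be joined by translating both to a shared phenotype term.

```python
# Hypothetical cross-map: one row per phenotype term. All identifiers are
# placeholders; a value of None stands for a missing cross-reference.
CROSS_MAP = [
    {"cui": "CUI_A", "efo_id": "EFO_X1", "do_id": "DOID_Y1", "icd10_code": "ICD_Z1"},
    {"cui": "CUI_B", "efo_id": "EFO_X2", "do_id": None, "icd10_code": "ICD_Z2"},
]

# Two hypothetical knowledge resources keyed by different ontologies.
gene_disease = [
    {"gene": "GENE1", "efo_id": "EFO_X1"},
    {"gene": "GENE2", "efo_id": "EFO_X2"},
]
drug_indication = [
    {"drug": "DRUG1", "do_id": "DOID_Y1"},
]


def link_via_cui(left, left_key, right, right_key, cross_map):
    """Join two record lists whose ontology identifiers map to the same term."""
    # Build lookups from each ontology identifier to the shared term,
    # ignoring rows where the cross-reference is missing.
    left_to_cui = {row[left_key]: row["cui"] for row in cross_map if row[left_key]}
    right_to_cui = {row[right_key]: row["cui"] for row in cross_map if row[right_key]}

    linked = []
    for left_rec in left:
        cui = left_to_cui.get(left_rec[left_key])
        if cui is None:
            continue  # no mapping available for this record
        for right_rec in right:
            if right_to_cui.get(right_rec[right_key]) == cui:
                linked.append({**left_rec, **right_rec, "cui": cui})
    return linked


links = link_via_cui(gene_disease, "efo_id", drug_indication, "do_id", CROSS_MAP)
print(links)
```

In this sketch only `GENE1` and `DRUG1` are linked, because they resolve to the same placeholder term. `GENE2` finds no partner, which also shows how a missing cross-reference silently reduces the number of links.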
As of mid-November 2024, the following ontology versions are used to create the mapping:
- OncoTree (2021_11_02)
- Experimental Factor Ontology v3.72.0 (2024-11-18)
- Disease Ontology (v2024-11-01)
IMPORTANT NOTE: The mapping established by phenOncoX attempts to be comprehensive, but we acknowledge that missing or erroneous cross-references might still occur.
sigven AT ifi.uio.no