Copyright © 2016 the Contributors to the Lexicon Model for Ontologies: Community Report, 10 May 2016 Specification, published by the Ontology-Lexicon Community Group under the W3C Community Final Specification Agreement (FSA). A human-readable summary is available.
This document describes the lexicon model for ontologies (lemon) as a main outcome of the work of the Ontology Lexicon (Ontolex) community group.
Ontologies are an important component of the Semantic Web but current ontology languages such as OWL and RDF(S) lack support for enriching ontologies with linguistic information, in particular with information concerning how ontology entities, i.e. properties, classes, individuals, etc. can be realized in natural language. The model described in this document aims to close this gap by providing a vocabulary that allows ontologies to be enriched with information about how the vocabulary elements described in them are realized linguistically, in particular in natural languages.
OWL and RDF(S) rely on the RDFS label property to capture the relation between a vocabulary element and its (preferred) lexicalization in a given language. This lexicalization provides a lexical anchor that makes the class, property, individual etc. understandable to a human user. The use of a simple label for linguistic grounding as available in OWL and RDF(S) is far from being able to capture the necessary linguistic and lexical information that Natural Language Processing (NLP) applications working with a particular ontology need.
The aim of lemon is to provide rich linguistic grounding for ontologies. Rich linguistic grounding includes the representation of morphological and syntactic properties of lexical entries as well as the syntax-semantics interface, i.e. the meaning of these lexical entries with respect to an ontology or vocabulary.
This specification was published by the Ontology-Lexicon Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Final Specification Agreement (FSA) other conditions apply. Learn more about W3C Community and Business Groups.
This specification was published by the OntoLex Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.
This document is the first official report of the OntoLex community group. It does not represent the view of single individuals but reflects the consensus and agreement reached as part of the regular group discussions. The report should be regarded as the official specification of lemon.
If you wish to make comments regarding this document, please send them to public-ontolex@w3.org (subscribe, archives).
This document describes the specification of the lexicon model for ontologies (lemon) as resulting from the work of the W3C Ontology Lexicon Community Group.
The aim of the lexicon model for ontologies (lemon) is to provide rich linguistic grounding for ontologies. Rich linguistic grounding includes the representation of morphological and syntactic properties of lexical entries as well as the syntax-semantics interface, i.e. the meaning of these lexical entries with respect to an ontology or vocabulary.
This document is structured into nine sections, where the first five correspond to the main modules of lemon. Depending on their needs and requirements, applications will use one or more of the modules mentioned below, with the use of the OntoLex module being the minimal choice.
The last three sections do not describe the formal modelling but clarify
Ontologies are an important component of the Semantic Web but current standards such as OWL only support the addition of a simple label to entities in the ontology. It is not currently possible to add inflected forms, different genders, usage notes or create a full lexical resource such as Princeton WordNet. The model described in this document aims to close this gap by providing a vocabulary that allows ontologies to be enriched with information about how the vocabulary elements described in them are realized linguistically, in particular in natural languages. The goal is to render ontologies suitable for supporting meaningful interaction with and manipulation of them by human users, and to allow NLP tools to work with ontologies.
OWL and RDF(S) rely on a property rdfs:label to capture the relation between a vocabulary element and its (preferred) lexicalization in a given language. This lexicalization provides a lexical anchor that makes the concept, property, individual etc. understandable to a human user. The use of a simple label for linguistic grounding as available in OWL and RDF(S) is far from being able to capture the necessary linguistic and lexical information that Natural Language Processing (NLP) applications working with a particular ontology need. Such NLP applications are for example:
The purpose of the model is to support linguistic grounding of a given ontology by adding information about how the elements in the vocabulary of the ontology (individuals, classes, properties) are lexicalized in a given natural language.
The model follows the principle of semantics by reference [1] in the sense that the semantics of a lexical entry is expressed by reference to an individual, class or property defined in the ontology. In some cases, the lexicon itself can add named concepts which are not made explicit in the ontology.
The model described here is open in the sense that it provides a core vocabulary to add information about the linguistic realization of ontology and vocabulary elements. This vocabulary can and should be extended as required by a particular application. In particular, the model abstracts from specific linguistic theory or category systems used to describe the linguistic properties of lexical entries and their syntactic behavior, encouraging reuse of existing data category systems or linguistic ontologies. The model is thus agnostic with respect to the linguistic theory and category systems. We make explicit in this document at which points we refer to an external repository of data categories or introduce novel sub-properties of properties defined in lemon.
The model as presented here is inspired by many other models, in particular the Lexical Markup Framework (LMF), the LexInfo model, the LIR model, the Linguistic Meta Model (LMM), the semiotics.owl ontology design pattern, and the Senso Comune core model.
It is important to also mention what is not the purpose of the model:
The model is available with the following sub-namespaces for the various modules of the overall model:
All modules may be imported from the following URL:
Throughout this document, we will use Turtle RDF Syntax to provide examples showing the use of the model. Axioms will be paraphrased in natural language. We will assume the following namespaces throughout all the examples in this document:
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix synsem: <http://www.w3.org/ns/lemon/synsem#> .
@prefix decomp: <http://www.w3.org/ns/lemon/decomp#> .
@prefix vartrans: <http://www.w3.org/ns/lemon/vartrans#> .
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
As we frequently also refer to other models, we will also assume the following namespaces in all examples:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix skos: <http://www.w3.org/2004/02/skos/core#>.
@prefix dbr: <http://dbpedia.org/resource/>.
@prefix dbo: <http://dbpedia.org/ontology/>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#>.
@prefix semiotics: <http://www.ontologydesignpatterns.org/cp/owl/semiotics.owl#>.
@prefix oils: <http://lemon-model.net/oils#>.
@prefix dct: <http://purl.org/dc/terms/>.
@prefix provo: <http://www.w3.org/ns/prov#>.
Furthermore, we require that instances of the model adhere to the RDF 1.1 specification and follow the appropriate guidelines. In particular, we require that language tags adhere to Best Current Practice 47 (BCP 47), where tags are made up of a language code (based on ISO 639 codes part 1, 2, 3 or 5), optionally followed by a hyphen and an ISO 3166-1 country code. Language tags may also contain further subtags expressing e.g. the region, script or further variants.
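The effect of such subtags can be illustrated with a small sketch. The default namespace and the form identifiers below are hypothetical and serve only this illustration; the language tags themselves are ordinary BCP 47 tags.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
# Hypothetical namespace used only for this illustration.
@prefix : <http://example.org/lexicon#> .

# Language subtag only.
:form_sprache a ontolex:Form ;
    ontolex:writtenRep "Sprache"@de .

# Language subtag plus script subtag: the same Serbian form
# written in Cyrillic and in Latin script.
:form_jezik a ontolex:Form ;
    ontolex:writtenRep "језик"@sr-Cyrl , "jezik"@sr-Latn .
```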
In all examples in this document, the above namespaces are introduced using an appropriate @prefix statement. Prefixes are omitted from class and object property definitions if the referenced ontology element is defined in the same module. For cross-module and external references, the prefix is made explicit.
In many examples we will use the LexInfo ontology to describe grammatical categories, although this is not required for using this model. The LexInfo model and guidelines for constructing and extending linguistic category schemes are provided in the section on linguistic description.
The following diagram depicts the core model (ontolex). Boxes represent classes of the model. Arrows with filled heads represent object properties, while arrows with empty heads represent subclass relations. In arrows labeled 'X/Y' (e.g. sense/isSenseOf), X (sense) is the name of the object property and Y (isSenseOf) the name of the inverse property.
The main class of the core of the lexicon ontology model is the class Lexical Entry. A lexical entry is defined as follows:
SubClassOf: lexicalForm min 1 Form, canonicalForm max 1 Form, semiotics:Expression
A Lexical Entry thus needs to be associated with at least one form, and has at most one canonical form (see below).
Lexical entries are further specialized into words, affixes (e.g., suffix, prefix, infix or circumfix) and multiword expressions.
A multiword expression is a lexical entry that consists of two or more words.
SubClassOf: LexicalEntry
An affix is a lexical entry that represents a morpheme (suffix, prefix, infix, circumfix) that is attached to a word stem to form a new word.
SubClassOf: LexicalEntry
The following Turtle code gives examples of lexical entries for each of these subclasses, corresponding to the word cat, the multiword expression minimum finance lease payments and the affix anti:
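A minimal sketch of such entries is given here. The classes ontolex:Word, ontolex:MultiwordExpression and ontolex:Affix are the subclasses of LexicalEntry; the default namespace, the entry and form identifiers, and the written representation chosen for the affix are assumptions made for illustration.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
# Hypothetical namespace and identifiers, used only for this illustration.
@prefix : <http://example.org/lexicon#> .

# A single word.
:lex_cat a ontolex:Word ;
    ontolex:canonicalForm :form_cat .
:form_cat a ontolex:Form ;
    ontolex:writtenRep "cat"@en .

# A multiword expression.
:lex_minimum_finance_lease_payments a ontolex:MultiwordExpression ;
    ontolex:canonicalForm :form_minimum_finance_lease_payments .
:form_minimum_finance_lease_payments a ontolex:Form ;
    ontolex:writtenRep "minimum finance lease payments"@en .

# An affix (here a prefix).
:lex_anti a ontolex:Affix ;
    ontolex:canonicalForm :form_anti .
:form_anti a ontolex:Form ;
    ontolex:writtenRep "anti-"@en .
```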
A lexical entry can be realized in different ways from a grammatical point of view. These different grammatical realizations are represented as different forms of the lexical entry. A form is defined as follows:
A form represents one grammatical realization of a lexical entry.
SubClassOf: writtenRep min 1 rdf:langString
A lexical entry can be associated with one of its forms by means of the lexicalForm property, although it is preferred to use one of the two subproperties (canonical form, other form) defined below.
The lexical form property relates a lexical entry to one grammatical form variant of the lexical entry.
Domain: LexicalEntry
Range: Form
Each form can thus have one or more written representations, defined as follows:
The written representation property indicates the written representation of a form.
Domain: Form
Range: rdf:langString
SubPropertyOf: representation
A simple example of a lexical entry with two different forms corresponding to two different grammatical realizations (as singular and plural noun, respectively) is given below:
:lex_child a ontolex:LexicalEntry ;
ontolex:lexicalForm :form_child_singular, :form_child_plural .
:form_child_singular a ontolex:Form ;
ontolex:writtenRep "child"@en .
:form_child_plural a ontolex:Form ;
ontolex:writtenRep "children"@en .
Different forms are used to express different morphological forms of the entry. They should not be used to represent orthographical variants, which should be represented as different representations of the same form. For example, for the lexical entry color, we would have two different representations of the same form, one for the British English written representation colour and one for the American English written representation color. Both representations have the same pronunciation and the same meaning, so they are two different lexicographic variants of the same lexical entry:
:lex_color a ontolex:LexicalEntry;
ontolex:lexicalForm :form_color.
:form_color a ontolex:Form;
ontolex:writtenRep "colour"@en-GB, "color"@en-US.
A form may also have a phonetic representation, indicating the pronunciation of the word.
The phonetic representation property indicates one phonetic representation of the pronunciation of the form using a scheme such as the International Phonetic Alphabet (IPA).
Domain: Form
Range: rdf:langString
SubPropertyOf: representation
The following example shows how we can represent two different pronunciations for one form of a lexical entry using the example of "privacy" (the phonetic code is based on IPA):
:lex_privacy a ontolex:LexicalEntry;
ontolex:lexicalForm :form_privacy.
:form_privacy a ontolex:Form;
ontolex:writtenRep "privacy"@en;
ontolex:phoneticRep "ˈpɹɪv.ə.si"@en-GB-fonipa;
ontolex:phoneticRep "ˈpɹaɪ.və.si"@en-US-fonipa.
Phonetic representation and written representation are both considered to be sub-properties of a more general property representation, for which users may define extra sub-properties as required.
The representation property indicates a string by which the form is represented according to some scheme.
Domain: Form
Range: rdf:langString
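As a sketch of how a user-defined sub-property of representation might look, the following declares a property for romanized representations. The property :romanizedRep, the default namespace and the form identifier are hypothetical and not part of lemon.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
# Hypothetical namespace used only for this illustration.
@prefix : <http://example.org/lexicon#> .

# Hypothetical sub-property for a romanized (transliterated) representation.
:romanizedRep a owl:DatatypeProperty ;
    rdfs:subPropertyOf ontolex:representation .

:form_tokyo a ontolex:Form ;
    ontolex:writtenRep "東京"@ja ;
    :romanizedRep "Tōkyō"@ja-Latn .
```

Because :romanizedRep is declared as a sub-property of ontolex:representation, an RDFS-aware consumer can still treat its values as representations of the form.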
A lexical entry has a canonical form, which is the form that primarily identifies this entry and may be used as an index term in the lexicon. The canonical form for single words is typically the lemma of that word and is determined by lexicographic conventions for that language. In the case of verbs, the lemma is typically the infinitive form or, alternatively, the present tense of the verb (note that if an external particle is used to indicate the infinitive as in English "to play", this particle should be omitted). For nouns it is the noun singular form, while for adjectives it is the positive (i.e., non-negative, non-graded) form. For multiword entries it is assumed that the same principles of lemmatization are applied to the head word.
The property canonical form has a LexicalEntry as domain and a Form as range. It is a subproperty of the property lexicalForm. The canonical form has to be unique, so that the property canonical form is declared to be functional:
The canonical form property relates a lexical entry to its canonical or dictionary form. This usually indicates the "lemma" form of a lexical entry.
Domain: LexicalEntry
Range: Form
Characteristics: Functional
SubPropertyOf: lexicalForm
It is recommended to use the rdfs:label property to indicate the canonical form in addition to the property canonicalForm to ensure compatibility with RDFS-based systems that expect an RDFS label. The lexical entries for the noun "cat", the verb "marry" and the adjective "high" would look as follows (in Turtle syntax):
:lex_cat a ontolex:LexicalEntry, ontolex:Word;
ontolex:canonicalForm :form_cat;
rdfs:label "cat"@en .
:form_cat a ontolex:Form;
ontolex:writtenRep "cat"@en .
:lex_marry a ontolex:LexicalEntry, ontolex:Word;
ontolex:canonicalForm :form_marry;
rdfs:label "marry"@en .
:form_marry a ontolex:Form;
ontolex:writtenRep "marry"@en .
:lex_high a ontolex:LexicalEntry, ontolex:Word;
ontolex:canonicalForm :form_high;
rdfs:label "high"@en .
:form_high a ontolex:Form;
ontolex:writtenRep "high"@en .
Of course, lexical entries need not correspond to a single word; they can also correspond to a multiword term, as the following example for the lexical entry "intangible assets" shows:
:lex_intangible_assets a ontolex:LexicalEntry, ontolex:MultiwordExpression;
ontolex:canonicalForm :form_intangible_assets;
rdfs:label "intangible assets"@en .
:form_intangible_assets a ontolex:Form;
ontolex:writtenRep "intangible assets"@en .
Multiword expressions are assumed to be distinct in both their full form and any abbreviated form as there may be distinct lexical and pragmatic properties associated with the two different forms of the term. Links using other vocabularies such as LexInfo may be used to describe the type of abbreviation:
:nasa a ontolex:LexicalEntry, lexinfo:Acronym ;
ontolex:canonicalForm :form_nasa ;
lexinfo:abbreviationFor :national_aeronautics_and_space_administration;
rdfs:label "NASA"@en .
:form_nasa a ontolex:Form ;
ontolex:writtenRep "NASA"@en .
:national_aeronautics_and_space_administration a ontolex:LexicalEntry, ontolex:MultiwordExpression ;
ontolex:canonicalForm :form_national_aeronautics_and_space_administration ;
rdfs:label "National Aeronautics and Space Administration"@en .
:form_national_aeronautics_and_space_administration a ontolex:Form ;
ontolex:writtenRep "National Aeronautics and Space Administration"@en .
It is also possible to indicate non-canonical forms of lexical entries, which we call other forms:
The other form property relates a lexical entry to a non-preferred ("non-lemma") form that realizes the given lexical entry.
Domain: LexicalEntry
Range: Form
SubPropertyOf: lexicalForm
For example, we may specify non-canonical forms of the verb (to) marry as follows:
:lex_marry a ontolex:LexicalEntry ;
ontolex:canonicalForm :form_marry ;
ontolex:otherForm :form_marries .
:form_marry a ontolex:Form;
ontolex:writtenRep "marry"@en .
:form_marries a ontolex:Form;
ontolex:writtenRep "marries"@en .
The morphological class (i.e., declension, conjugation or similar) may be specified with the morphological pattern property to avoid having to list all regular forms of a word. The implementation of these patterns is not specified by this document (but should be provided by some suitable vocabulary such as LIAM).
The morphological pattern property indicates the morphological class of a word.
Domain: LexicalEntry
The following example shows how to indicate the conjugation for the Latin words amare and videre.
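A minimal sketch of such a description follows. The property ontolex:morphologicalPattern is the one defined above; the pattern resources :latin_first_conjugation and :latin_second_conjugation, the default namespace and the entry identifiers are hypothetical placeholders for resources that a suitable morphology vocabulary would provide.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
# Hypothetical namespace; the pattern resources below are placeholders
# for descriptions supplied by an external morphology vocabulary.
@prefix : <http://example.org/lexicon#> .

# amare belongs to the first conjugation.
:lex_amare a ontolex:LexicalEntry ;
    ontolex:canonicalForm :form_amare ;
    ontolex:morphologicalPattern :latin_first_conjugation .
:form_amare a ontolex:Form ;
    ontolex:writtenRep "amare"@la .

# videre belongs to the second conjugation.
:lex_videre a ontolex:LexicalEntry ;
    ontolex:canonicalForm :form_videre ;
    ontolex:morphologicalPattern :latin_second_conjugation .
:form_videre a ontolex:Form ;
    ontolex:writtenRep "videre"@la .
```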
The model supports the specification of the meaning of lexical entries with respect to a given ontology. The lexicon model for ontologies follows the paradigm of semantics by reference in the sense that the meaning of a lexical entry is specified by pointing to the ontological concept that captures or represents its meaning.
The property denotes is defined as follows:
The denotes property relates a lexical entry to a predicate in a given ontology that represents its meaning and has some denotational or model-theoretic semantics.
Domain: LexicalEntry
Range: rdfs:Resource
SubPropertyOf: semiotics:denotes
InverseOf: isDenotedBy
PropertyChain: sense o reference
For the lexical entries cat and marriage, the meaning could be expressed by pointing to the corresponding DBpedia resources:
:lex_cat a ontolex:LexicalEntry;
ontolex:canonicalForm :form_cat;
ontolex:denotes <http://dbpedia.org/resource/Cat>.
:form_cat a ontolex:Form;
ontolex:writtenRep "cat"@en.
:lex_marriage a ontolex:LexicalEntry;
ontolex:canonicalForm :form_marriage;
ontolex:denotes <http://dbpedia.org/resource/Marriage>.
:form_marriage a ontolex:Form;
ontolex:writtenRep "marriage"@en .
A word may be ambiguous with respect to the meanings it denotes. For example, the word 'troll' can refer both to a mythical creature and to someone who makes inflammatory posts on the internet. These two meanings can be captured as shown in the following example:
:troll a ontolex:LexicalEntry ;
ontolex:denotes <http://dbpedia.org/resource/Troll> ;
ontolex:denotes <http://dbpedia.org/resource/Internet_troll> .
Two terms may be different lexical entries if they are distinct in part-of-speech, gender, inflected forms or etymology. For example the following words with lemma 'bank' are all considered distinct:
:bank1_en a ontolex:LexicalEntry ;
dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
lexinfo:partOfSpeech lexinfo:noun ;
lexinfo:etymologicalRoot :banque_frm ;
ontolex:denotes <http://dbpedia.org/resource/Bank> .
:bank2_en a ontolex:LexicalEntry ;
dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
lexinfo:partOfSpeech lexinfo:noun ;
lexinfo:etymologicalRoot :hobanca_ang ;
ontolex:denotes <http://dbpedia.org/resource/Bank_(geographic)> .
:bank3_en a ontolex:LexicalEntry ;
dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
lexinfo:partOfSpeech lexinfo:verb ;
lexinfo:etymologicalRoot :hobanca_ang ;
ontolex:denotes <http://dbpedia.org/resource/Banked_turn> .
:bank1_de a ontolex:LexicalEntry ;
dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
lexinfo:partOfSpeech lexinfo:noun ;
lexinfo:gender lexinfo:feminine ;
ontolex:denotes <http://dbpedia.org/resource/Bank> ;
ontolex:otherForm :banken .
:banken ontolex:writtenRep "Banken"@de ;
lexinfo:number lexinfo:plural .
:bank2_de a ontolex:LexicalEntry ;
dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
lexinfo:partOfSpeech lexinfo:noun ;
lexinfo:gender lexinfo:feminine ;
ontolex:denotes <http://dbpedia.org/resource/Bench_(furniture)> ;
ontolex:otherForm :baenke .
:baenke ontolex:writtenRep "Bänke"@de ;
lexinfo:number lexinfo:plural .
Note that the target of a denotation does not need to be an individual in the ontology but may also refer to a class, property or datatype property defined by the ontology. The model is agnostic with respect to the ontology language used to express the ontological meaning referred to. The assumption is merely that the entity in the range represents some predicate that has a denotational semantics in some formal logical system.
Properties in the model for linking to ontologies have an inverse property named as "is x-ed by", where x is the original property name to enable the lexicon to be defined in an ontology focused manner. In the case of denotes this property is isDenotedBy.
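As a small sketch of this ontology-focused style, the triple below is stated with the ontology entity as subject. The default namespace and the identifier :lex_cat are assumptions made for illustration.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
# Hypothetical namespace used only for this illustration.
@prefix : <http://example.org/lexicon#> .

# Ontology-focused statement, with the ontology entity as subject.
<http://dbpedia.org/resource/Cat> ontolex:isDenotedBy :lex_cat .

# Since isDenotedBy is the inverse of denotes, this corresponds to
# the lexicon-focused statement:
# :lex_cat ontolex:denotes <http://dbpedia.org/resource/Cat> .
```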
In some cases the meaning of a lexical entry is not explicit in the given ontology. Yet, to represent the meaning of a lexical entry we might want to create a new class at the interface between lexicon and ontology by reusing atomic ontological entities defined in the ontology in question. For example, we might want to express the meaning of an adjective by creating an anonymous restriction class at the level of the lexicon-ontology interface. This is illustrated below for the adjective "female" expressing the membership of an anonymous class ∃gender.{female}:
:female a ontolex:LexicalEntry;
lexinfo:partOfSpeech lexinfo:adjective;
ontolex:canonicalForm :female_canonical_form;
ontolex:sense :female_sense.
:female_canonical_form ontolex:writtenRep "female"@en.
:female_sense ontolex:reference [
a owl:Restriction;
owl:onProperty <http://dbpedia.org/ontology/gender> ;
owl:hasValue <http://dbpedia.org/resource/Female> ] ;
synsem:isA :female_arg .
For many practical modelling situations, the denotes property is not sufficient to capture the precise linking between a lexical entry and its meaning with respect to a given ontology. Thus, lemon introduces an intermediate element called lexical sense to capture the particular sense of a word that refers to the particular ontology entity. The lexical entry is linked to a lexical sense by means of the sense property and the lexical sense is linked to the ontology by means of the reference property. The chain sense ∘ reference is equivalent to the property denotes introduced above.
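The relation between the two modelling styles can be sketched as follows. The default namespace and the identifiers :lex_cat and :cat_sense are assumptions made for illustration, and the commented triple is the one that the property chain is expected to yield under OWL 2 reasoning.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
# Hypothetical namespace used only for this illustration.
@prefix : <http://example.org/lexicon#> .

# The entry is linked to the ontology through an explicit sense.
:lex_cat a ontolex:LexicalEntry ;
    ontolex:sense :cat_sense .
:cat_sense a ontolex:LexicalSense ;
    ontolex:reference <http://dbpedia.org/resource/Cat> .

# Following the chain sense o reference, a reasoner may derive:
# :lex_cat ontolex:denotes <http://dbpedia.org/resource/Cat> .
```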
A lexical sense represents the lexical meaning of a lexical entry when interpreted as referring to the corresponding ontology element. A lexical sense thus represents a reification of a pair of a uniquely determined lexical entry and a uniquely determined ontology entity it refers to. A link between a lexical entry and an ontology entity via a Lexical Sense object implies that the lexical entry can be used to refer to the ontology entity in question.
SubClassOf: reference exactly 1 rdfs:Resource, isSenseOf exactly 1 LexicalEntry, semiotics:Meaning
Via the lexical sense object we can attach additional properties to a pair of lexical entry and ontological predicate that it denotes to describe under which conditions (context, register, domain, etc.) it is valid to regard the lexical entry as having the ontological entity as meaning. For example, we may wish to express the usages of the word "consumption" in terms of the topic and diachronic usage of the word. As shown in the following example, we can use the Dublin Core property subject to indicate the topic of the Sense. The example also shows how to use the property dating defined in the LexInfo ontology to specify that the fourth sense of consumption is outdated.
:lex_consumption a ontolex:LexicalEntry;
ontolex:canonicalForm :form_consumption;
ontolex:sense :consumption_sense1;
ontolex:sense :consumption_sense2;
ontolex:sense :consumption_sense3;
ontolex:sense :consumption_sense4 .
:form_consumption ontolex:writtenRep "consumption"@en.
:consumption_sense1 a ontolex:LexicalSense;
dct:subject <http://dbpedia.org/resource/Ecology> ;
ontolex:reference <http://dbpedia.org/resource/Consumption_(ecology)> .
:consumption_sense2 a ontolex:LexicalSense;
dct:subject <http://dbpedia.org/resource/Anatomy> ;
ontolex:reference <http://dbpedia.org/resource/Ingestion> .
:consumption_sense3 a ontolex:LexicalSense;
dct:subject <http://dbpedia.org/resource/Economics> ;
ontolex:reference <http://dbpedia.org/resource/Consumption_(economics)> .
:consumption_sense4 a ontolex:LexicalSense;
dct:subject <http://dbpedia.org/resource/Medicine> ;
lexinfo:dating lexinfo:old ;
ontolex:reference <http://dbpedia.org/resource/Tuberculosis> .
The lexical sense has a single lexical entry and a single reference in the ontology. As a consequence, the properties "sense" and "reference" are defined as inverse functional and functional, respectively.
The sense property relates a lexical entry to one of its lexical senses.
Domain: LexicalEntry
Range: LexicalSense
InverseOf: isSenseOf
Characteristics: Inverse Functional
The interpretation of a word (lexical entry) with respect to a meaning defined in a given ontology is often modulated by usage conditions or pragmatic implications, in particular due to register, connotations or meaning nuances of a word. Consider, for example, the French words 'rivière' and 'fleuve', which refer to rivers flowing into other rivers and rivers flowing into a sea, respectively. As corresponding ontological classes to capture the specific meanings of these French words might not be available in the ontology, these meaning nuances can be specified using the property usage, which captures information about the usage conditions and pragmatic implications under which the lexical entry can be used to refer to the ontological meaning in question. These usage conditions do not replace a formally defined sense but complement the corresponding sense with additional information describing the usage of the lexical entry.
How exactly constraints on the usage of senses are defined is not specified by lemon. Yet, we give an example below that shows how to model the lexical meaning of 'rivière' and 'fleuve' when used to refer to the DBpedia class River:
:riviere a ontolex:LexicalEntry ;
ontolex:sense :riviere_sense .
:fleuve a ontolex:LexicalEntry ;
ontolex:sense :fleuve_sense .
:riviere_sense ontolex:reference <http://dbpedia.org/ontology/River> ;
ontolex:usage [
rdf:value "A riviere is a river that flows into another river"@en
] .
:fleuve_sense ontolex:reference <http://dbpedia.org/ontology/River>;
ontolex:usage [
rdf:value "A fleuve is a river that flows into the sea"@en
] .
We have seen above how to capture the fact that a certain lexical entry can be used to denote a certain ontological predicate. We capture this by saying that the lexical entry denotes the class or ontology element in question. However, sometimes we would like to express the fact that a certain lexical entry evokes a certain mental concept rather than that it refers to a class with a formal interpretation in some model. Thus, in lemon we introduce the class Lexical Concept that represents a mental abstraction, concept or unit of thought that can be lexicalized by a given collection of senses. A lexical concept is thus a subclass of skos:Concept.
A lexical concept represents a mental abstraction, concept or unit of thought that can be lexicalized by a given collection of senses.
SubClassOf: skos:Concept
The lexical entry is said to evoke a particular lexical concept, similar to how a lexical entry denotes an ontology reference.
The evokes property relates a lexical entry to one of the lexical concepts it evokes, i.e. the mental concept that speakers of a language might associate when hearing the lexical entry.
Domain: LexicalEntry
Range: LexicalConcept
InverseOf: isEvokedBy
PropertyChain: sense o isLexicalizedSenseOf
The evoked concept is different from the reference in the ontology, as the reference primarily gives an interpretation of a word in terms of the identifiers that would be generated by the semantic parsing of the sentence. For example, if we were to understand the sentence "John F. Kennedy died in 1963", we may understand the verb "die (in)" as generating the URI deathDate within a SPARQL query. However, we might also want to record the actual lexical sense of the word with respect to a mental lexicon, in which die evokes the event of dying, as modelled in the following example:
:die a ontolex:Word ;
ontolex:denotes <http://dbpedia.org/ontology/deathDate> ;
ontolex:evokes :Dying .
We can link a lexical concept to a lexical sense that lexicalizes the concept via the property lexicalized sense:
The lexicalized sense property relates a lexical concept to a corresponding lexical sense that lexicalizes the concept.
Domain: LexicalConcept
Range: LexicalSense
InverseOf: isLexicalizedSenseOf
A simple example involving the use of a lexical concept is the following:
:temporary_change_of_possession a ontolex:LexicalConcept;
ontolex:lexicalizedSense :borrow_sense;
ontolex:lexicalizedSense :lend_sense;
ontolex:isEvokedBy :borrow_le;
ontolex:isEvokedBy :lend_le.
:borrow_le a ontolex:LexicalEntry;
ontolex:sense :borrow_sense;
ontolex:evokes :temporary_change_of_possession.
:lend_le a ontolex:LexicalEntry;
ontolex:sense :lend_sense;
ontolex:evokes :temporary_change_of_possession.
Similarly, we can link a lexical concept to a reference in the ontology by means of the concept property:
The concept property relates an ontological entity to a lexical concept that represents the corresponding meaning.
Domain: owl:Thing
Range: LexicalConcept
InverseOf: isConceptOf
The combined usage of the properties denotes, sense, evokes, concept and lexicalized sense is demonstrated in the example below for the case of a lexical resource such as Princeton WordNet. Roughly, the synsets in a wordnet correspond to a lexical concept in lemon. The modelling would thus look as follows:
:cat_lex a ontolex:LexicalEntry ;
ontolex:canonicalForm :cat_form ;
ontolex:sense :cat_sense ;
ontolex:denotes <http://dbpedia.org/resource/Cat> ;
ontolex:evokes pwn:102124272-n .
:cat_form ontolex:writtenRep "cat"@en .
:cat_sense a ontolex:LexicalSense ;
ontolex:reference <http://dbpedia.org/resource/Cat> ;
ontolex:isLexicalizedSenseOf pwn:102124272-n ;
ontolex:isSenseOf :cat_lex .
<http://dbpedia.org/resource/Cat>
ontolex:concept pwn:102124272-n ;
ontolex:isReferenceOf :cat_sense ;
ontolex:isDenotedBy :cat_lex .
pwn:102124272-n a ontolex:LexicalConcept;
ontolex:isEvokedBy :cat_lex ;
ontolex:lexicalizedSense :cat_sense ;
ontolex:isConceptOf <http://dbpedia.org/resource/Cat> .
A definition can be added to a lexical concept as a gloss by using the skos:definition property.
In addition to organizing a lexicon by lexical entries, we may alternatively create a lexicon of concepts, by means of the concept set class, defined as follows:
A concept set represents a collection of lexical concepts.
SubClassOf: skos:ConceptScheme, void:Dataset
EquivalentClass: skos:inScheme min 1 LexicalConcept
In this way lexicons can be ordered onomasiologically, that is, by meanings rather than by lemmas. The concept set is a special type of skos:ConceptScheme. A lexical concept is linked to a ConceptSet using the property skos:inScheme:
:conceptLexicon a ontolex:ConceptSet .
:consumption1 a ontolex:LexicalConcept ;
ontolex:isConceptOf <http://dbpedia.org/resource/Tuberculosis> ;
skos:definition "Tuberculosis, MTB, or TB (short for tubercle bacillus), in the past also called phthisis, phthisis pulmonalis, or consumption, is a widespread, and in many cases fatal, infectious disease caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. Tuberculosis typically attacks the lungs, but can also affect other parts of the body. It is spread through the air when people who have an active TB infection cough, sneeze, or otherwise transmit respiratory fluids through the air."@en;
ontolex:isEvokedBy :consumption ;
skos:inScheme :conceptLexicon .
:consumption2 a ontolex:LexicalConcept ;
ontolex:isConceptOf <http://dbpedia.org/resource/Consumption_(Economics)> ;
skos:definition "Consumption is a major concept in economics and is also studied by many other social sciences. Economists are particularly interested in the relationship between consumption and income, and therefore in economics the consumption function plays a major role."@en;
ontolex:isEvokedBy :consumption ;
skos:inScheme :conceptLexicon .
:tuberculosis1 a ontolex:LexicalConcept ;
ontolex:isConceptOf <http://dbpedia.org/resource/Tuberculosis> ;
skos:definition "Tuberculosis, MTB, or TB (short for tubercle bacillus), in the past also called phthisis, phthisis pulmonalis, or consumption, is a widespread, and in many cases fatal, infectious disease caused by various strains of mycobacteria, usually Mycobacterium tuberculosis. Tuberculosis typically attacks the lungs, but can also affect other parts of the body. It is spread through the air when people who have an active TB infection cough, sneeze, or otherwise transmit respiratory fluids through the air."@en;
ontolex:isEvokedBy :tuberculosis ;
skos:inScheme :conceptLexicon .
:consumption a ontolex:LexicalEntry ;
ontolex:canonicalForm :consumption_lemma .
:consumption_lemma ontolex:writtenRep "consumption"@en .
:tuberculosis a ontolex:LexicalEntry ;
ontolex:canonicalForm :tuberculosis_lemma .
:tuberculosis_lemma ontolex:writtenRep "tuberculosis"@en .
Most words in a language do not stand on their own, but have a certain syntactic behavior in the sense that they appear in certain syntactic structures and require a number of syntactic arguments to be complete. Examples of this are i) transitive verbs (e.g. to own), which require a syntactic subject and a syntactic object, ii) relational nouns (e.g. capital (of), mother (of), son (of), brother (of), etc.), which require a prepositional object, or iii) adjectives, which require a noun to modify, etc. The syntactic behavior of a lexical entry is defined in lemon by a syntactic frame:
A syntactic frame represents the syntactic behavior of an open class word in terms of the (syntactic) arguments it requires. It essentially describes the so-called subcategorization structure of the word in question.
In order to relate a lexical entry to one of its various syntactic behaviors as captured by a syntactic frame, the synsem module defines the syntactic behavior property. Each lexical entry should have its own syntactic frame instance; generic behavior such as 'transitive' should be captured by classes.
The syntactic behavior property relates a lexical entry to one of its syntactic behaviors as captured by a syntactic frame.
Domain: ontolex:LexicalEntry
Range: SyntacticFrame
Characteristics: InverseFunctional
The following example shows how to indicate that the verb (to) own can be used as a transitive verb. This is accomplished by adding a frame own_frame_transitive that is declared as a transitive frame, using the class TransitiveFrame defined in the LexInfo Ontology.
:own_lex a ontolex:LexicalEntry ;
ontolex:canonicalForm :own_form ;
synsem:synBehavior :own_frame_transitive .
:own_frame_transitive a synsem:SyntacticFrame, lexinfo:TransitiveFrame.
:own_form ontolex:writtenRep "own"@en .
Arguments of a syntactic frame are represented by the class Syntactic Argument:
A syntactic argument represents a slot that needs to be filled for a certain syntactic frame to be complete. Syntactic arguments typically realize a certain grammatical function (e.g. subject, direct object, indirect object, prepositional object, etc.).
The object property synArg is used to relate a (syntactic) frame to one of its syntactic arguments.
The object property synArg relates a syntactic frame to one of its syntactic arguments.
Domain: SyntacticFrame
Range: SyntacticArgument
The following example shows how to extend the example for the verb (to) own by explicitly indicating the arguments, in this case via two specific sub-properties of synArg, i.e. lexinfo:subject and lexinfo:directObject, defined in the external LexInfo ontology.
:own_lex a ontolex:LexicalEntry ;
ontolex:canonicalForm :own_form ;
synsem:synBehavior :own_frame_transitive .
:own_form ontolex:writtenRep "own"@en.
:own_frame_transitive a lexinfo:TransitiveFrame;
lexinfo:subject :own_frame_subj;
lexinfo:directObject :own_frame_obj.
Note that if an external ontology is used to describe the type of arguments in more detail, e.g. indicating the grammatical function as in the example above, the external property used needs to be a sub-property of synArg.
At the lexicon-ontology interface, syntactic frames need to be mapped or bound to ontological structures that represent their meaning. In the same way that a lexical sense binds a lexical entry to an ontology entity, the OntoMap maps a syntactic frame onto an ontology entity.
An ontology mapping (OntoMap for short) specifies how a syntactic frame and its syntactic arguments map to a set of concepts and properties in the ontology that together specify the meaning of the syntactic frame.
In order to link an ontology map to a corresponding sense, the model foresees the property ontoMapping, which is defined as functional and inverse functional; that is, an ontology map stands in an exact 1:1 relationship with a lexical sense. It is therefore recommended that, when a lexicon requires both the ontology map and the lexical sense, these two entities are defined using the same URI, as there is no technical reason to distinguish them and they have very similar functions.
The ontoMapping property relates an ontology mapping to its corresponding lexical sense.
Domain: OntoMap
Range: LexicalSense
Characteristics: Functional, InverseFunctional
The synsem module introduces the property ontoCorrespondence to establish a mapping between an argument of a predicate defined in the ontology and the syntactic argument that realizes this predicate argument in a given syntactic frame:
The ontoCorrespondence property binds an argument of a predicate defined in the ontology to a syntactic argument that realizes this predicate argument syntactically.
Domain: OntoMap or LexicalSense
Range: SyntacticArgument
Without loss of generality, we assume that an ontology consists of symbols representing individuals, unary predicates and binary predicates, as indicated by the following table:
Type | Predicate | Predicate Logic Notation | RDF Notation |
---|---|---|---|
Class | Unary predicate | City(x) | |
Object, Datatype or Annotation Property | Binary predicate | knows(x,y) | |
Individual | Constant (null-ary predicate) | London | |
Predicates with an arity of more than two can be represented by complex senses (see below). This is due to the fact that this module is aligned to RDF and OWL, which distinguish between: individuals/resources (constants), classes (unary predicates) and properties (predicates of arity "2").
In the following, we introduce three sub-properties of the ontoCorrespondence property. The first property, isA, is used to refer to the single argument of a unary predicate in the ontology:
The isA property represents the single argument of a class or unary predicate.
SubPropertyOf: ontoCorrespondence
Following the terminology used in RDF/OWL, we call the first argument of a property its subject and the second argument its object. The synsem module defines two properties, subjOfProp and objOfProp, that can be used to refer to the 1st (subject) and 2nd (object) argument of a property, that is, a predicate of arity "2".
The subjOfProp property represents the 1st argument or subject of a binary predicate (property) in the ontology.
SubPropertyOf: ontoCorrespondence
The objOfProp property represents the 2nd argument or object of a binary predicate (property) in the ontology.
SubPropertyOf: ontoCorrespondence
Finally, we can specify the reference owner that expresses the meaning of "to own" with respect to the DBpedia ontology, specifying the mapping between arguments of the property owner and the arguments that realize these arguments syntactically.
:own_lex a ontolex:LexicalEntry ;
ontolex:canonicalForm :own_form ;
synsem:synBehavior :own_frame_transitive ;
ontolex:denotes <http://dbpedia.org/ontology/owner> .
:own_form ontolex:writtenRep "own"@en.
:own_frame_transitive a lexinfo:TransitiveFrame;
lexinfo:subject :own_subj;
lexinfo:directObject :own_obj.
:own_ontomap a synsem:OntoMap;
synsem:subjOfProp :own_obj;
synsem:objOfProp :own_subj.
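The direction of the binding can be made concrete with a small Python sketch. It is not part of the model: the dictionary layout, the function name `realize` and the filler names are illustrative assumptions, and the reference is taken from the denotes statement of the entry. Only the identifiers :own_subj and :own_obj and the DBpedia property come from the example above.

```python
# Hypothetical in-memory copy of the bindings in :own_ontomap.
OWN_ONTOMAP = {
    "reference": "http://dbpedia.org/ontology/owner",
    "subjOfProp": ":own_obj",   # the direct object fills the 1st argument of owner
    "objOfProp": ":own_subj",   # the syntactic subject fills the 2nd argument
}


def realize(ontomap, fillers):
    """Return the (subject, predicate, object) triple expressed by a frame.

    `fillers` maps syntactic-argument identifiers to the entities found
    in those positions of a sentence.
    """
    return (
        fillers[ontomap["subjOfProp"]],
        ontomap["reference"],
        fillers[ontomap["objOfProp"]],
    )


# "Alice owns Villa_X": Alice is the syntactic subject, Villa_X the direct object.
triple = realize(OWN_ONTOMAP, {":own_subj": "Alice", ":own_obj": "Villa_X"})
print(triple)
```

Because the syntactic subject is bound via objOfProp, the owner ends up in the object position of the resulting triple, which is the reading the example intends.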
As a further example we show a lexical entry for the relational noun "father (of)". The entry indicates that the relational noun "father (of)" can be used to verbalize the DBpedia property father, whereby the subject in a copula construction such as "X is father of Y" (:arg1 below) corresponds to the 2nd argument of the property father, and the prepositional argument at position Y (:arg2 below) corresponds to the 1st argument of the property father. We use the LexInfo vocabulary to provide linguistic information.
:father_of a ontolex:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:noun ;
ontolex:canonicalForm :father_form;
synsem:synBehavior :father_of_nounpp;
ontolex:sense :father_sense_ontomap.
:father_form a ontolex:Form;
ontolex:writtenRep "father"@en.
:father_of_nounpp a lexinfo:NounPPFrame;
lexinfo:subject :arg1;
lexinfo:prepositionalArg :arg2.
:father_sense_ontomap a synsem:OntoMap, ontolex:LexicalSense;
synsem:ontoMapping :father_sense_ontomap;
ontolex:reference <http://dbpedia.org/ontology/father>;
synsem:subjOfProp :arg2;
synsem:objOfProp :arg1.
:arg2 synsem:marker :of .
The object property marker indicates the marker of a syntactic argument; this can be a case marker or some other lexical entry such as a preposition or particle.
Domain: SyntacticArgument
Range: rdfs:Resource
The following example shows how to specify that the intransitive verb operate, subcategorizing a prepositional phrase introduced by the preposition in, can be used to denote the property regionServed in DBpedia. The entry specifies that in a construction such as 'X operates in Y', the X refers to the subject of the property regionServed, and the Y refers to the object of the property regionServed. Again, we use the LexInfo ontology in our example to provide linguistic information:
:operate_in a ontolex:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:verb ;
ontolex:canonicalForm :operate_form;
synsem:synBehavior :operate_intransitivepp;
ontolex:sense :operate_sense_ontomap.
:operate_form a ontolex:Form;
ontolex:writtenRep "operate"@en.
:operate_intransitivepp a synsem:SyntacticFrame;
lexinfo:subject :operate_subj ;
lexinfo:prepositionalArg :operate_pobj.
:operate_sense_ontomap a ontolex:LexicalSense, synsem:OntoMap;
synsem:ontoMapping :operate_sense_ontomap;
ontolex:reference <http://dbpedia.org/ontology/regionServed>;
synsem:subjOfProp :operate_subj;
synsem:objOfProp :operate_pobj.
:operate_pobj synsem:marker :in .
In many cases, the meaning of a syntactic frame cannot be expressed by exactly one binary predicate as in the examples given above. Take for instance the case of the transitive verb (to) launch, which subcategorizes a subject expressing the company that launched a product, a direct object expressing the launched product, and a prepositional object introduced by the preposition in indicating the year of the launch of the product in question. The important thing here is that there are three syntactic arguments (subject, object and prepositional object, represented as arg1, arg2 and arg3 below, respectively) that realize the arguments of a complex predicate that consists of the sub-predicates dbpedia:product and dbpedia:launchDate.
Thus, the synsem module introduces the property submap that relates a (complex) ontological map involving various ontological predicates to a set of less complex ontological maps that bind the arguments of one of the involved predicates to a syntactic argument that realizes it.
The submap property relates a (complex) ontological mapping to a set of bindings that together bind the arguments of the involved predicates to a set of syntactic arguments that realize them syntactically.
Domain: OntoMap
Range: OntoMap
The following example shows how to use the submap property to indicate that the meaning of the phrase X launched Y in Z is a composition of the properties dbpedia:product and dbpedia:launchDate, which together express the meaning of the syntactic frame:
:launch a ontolex:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:verb ;
ontolex:canonicalForm :launch_canonical_form;
synsem:synBehavior :launch_transitive_pp;
ontolex:sense :launch_sense_ontomap.
:launch_canonical_form ontolex:writtenRep "launch"@en.
:launch_transitive_pp a lexinfo:TransitivePPFrame;
lexinfo:subject :arg1 ;
lexinfo:directObject :arg2 ;
lexinfo:prepositionalAdjunct :arg3.
:arg3 synsem:marker :in ;
synsem:optional "true"^^xsd:boolean .
:launch_sense_ontomap a ontolex:LexicalSense, synsem:OntoMap;
synsem:ontoMapping :launch_sense_ontomap;
synsem:submap :launch_submap1;
synsem:submap :launch_submap2.
:launch_submap1 ontolex:reference <http://dbpedia.org/ontology/product>;
synsem:subjOfProp :arg1;
synsem:objOfProp :arg2.
:launch_submap2 ontolex:reference <http://dbpedia.org/ontology/launchDate>;
synsem:subjOfProp :arg2;
synsem:objOfProp :arg3.
It is possible to specify that a certain argument is not compulsory by means of the optional property. It is generally only advised to use this property with complex senses. Indicating that an argument is optional means that it does not have to be realized syntactically, in which case, from a semantic point of view, the corresponding argument of the ontological predicate is existentially quantified over. In the above example we have indicated that arg3 is optional, which allows the correct semantics to be assigned to an expression such as X launched Y by existentially quantifying over the year.
The optional property indicates whether a syntactic argument is optional, that is, it can be syntactically omitted.
Domain: SyntacticArgument
Range: xsd:boolean
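The effect of marking arg3 as optional in the launch example can be sketched in Python as follows. This is an illustration under stated assumptions rather than a prescribed algorithm: the data layout, the function name `interpret`, the filler names and the `?v1`-style variable names are hypothetical, and unrealized optional arguments are simply replaced by fresh variables to stand for existential quantification.

```python
from itertools import count

# Hypothetical in-memory copy of the two submaps of :launch_sense_ontomap.
SUBMAPS = [
    {"reference": "dbpedia:product", "subjOfProp": ":arg1", "objOfProp": ":arg2"},
    {"reference": "dbpedia:launchDate", "subjOfProp": ":arg2", "objOfProp": ":arg3"},
]
OPTIONAL = {":arg3"}  # arguments marked synsem:optional "true"


def interpret(submaps, optional, fillers):
    """Build one triple per submap.

    Optional arguments that are not realized become fresh variables
    (read: existentially quantified); missing required arguments raise.
    """
    fresh = count(1)
    bound = dict(fillers)
    triples = []
    for submap in submaps:
        for role in ("subjOfProp", "objOfProp"):
            arg = submap[role]
            if arg not in bound:
                if arg not in optional:
                    raise ValueError(f"required argument {arg} is not realized")
                bound[arg] = f"?v{next(fresh)}"
        triples.append(
            (bound[submap["subjOfProp"]], submap["reference"], bound[submap["objOfProp"]])
        )
    return triples


# "X launched Y" without the optional "in Z" phrase.
result = interpret(SUBMAPS, OPTIONAL, {":arg1": "CompanyX", ":arg2": "ProductY"})
print(result)
```

With the prepositional phrase omitted, the second triple carries a variable in place of the year, while omitting a required argument such as :arg2 is rejected.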
The following example shows how we can capture the diathesis alternation between X gave Y Z and X gave Z to Y, which in our modelling represent the same ontological meaning:
:give a ontolex:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:verb ;
ontolex:canonicalForm :give_form;
synsem:synBehavior :give_ditransitive;
synsem:synBehavior :give_transitive_pp;
ontolex:sense :giving_sense_ontomap.
:give_form a ontolex:Form;
ontolex:writtenRep "give"@en.
:give_transitive_pp a lexinfo:TransitivePPFrame;
lexinfo:subject :give_subj1 ;
lexinfo:directObject :give_dobj1;
lexinfo:prepositionalArg :give_pobj1.
:give_ditransitive a lexinfo:DitransitiveFrame;
lexinfo:subject :give_subj2 ;
lexinfo:indirectObject :give_iobj2;
lexinfo:directObject :give_dobj2.
:giving_sense_ontomap a ontolex:LexicalSense, synsem:OntoMap;
synsem:ontoMapping :giving_sense_ontomap;
ontolex:reference <http://www.ontologyportal.org/SUMO.owl#Giving>;
synsem:submap :giving_submap1;
synsem:submap :giving_submap2;
synsem:submap :giving_submap3.
:giving_submap1 ontolex:reference <http://www.ontologyportal.org/SUMO.owl#agent>;
synsem:subjOfProp :giving_event;
synsem:objOfProp :give_subj1;
synsem:objOfProp :give_subj2.
:giving_submap2 ontolex:reference <http://www.ontologyportal.org/SUMO.owl#patient>;
synsem:subjOfProp :giving_event;
synsem:objOfProp :give_dobj2;
synsem:objOfProp :give_dobj1.
:giving_submap3 ontolex:reference <http://www.ontologyportal.org/SUMO.owl#destination>;
synsem:subjOfProp :giving_event;
synsem:objOfProp :give_iobj2;
synsem:objOfProp :give_pobj1.
:give_pobj1 synsem:marker :to .
Adjectives may be modelled as follows:
:female a ontolex:LexicalEntry;
lexinfo:partOfSpeech lexinfo:adjective;
ontolex:canonicalForm :female_canonical_form;
synsem:synBehavior :female_syn,:female_syn1;
ontolex:sense :female_sense_ontomap.
:female_canonical_form ontolex:writtenRep "female"@en.
:female_sense_ontomap ontolex:reference [
a owl:Restriction;
owl:onProperty <http://dbpedia.org/ontology/gender> ;
owl:hasValue <http://dbpedia.org/resource/Female> ] ;
synsem:ontoMapping :female_sense_ontomap;
synsem:isA :female_arg .
:female_syn a lexinfo:AdjectivePredicateFrame;
lexinfo:copulativeSubject :female_arg.
:female_syn1 a lexinfo:AdjectiveAttributiveFrame ;
lexinfo:attributiveArg :female_arg.
Note that in the above example the synsem:isA property is used to mark the single argument/variable of the class of all the things that have female gender. The copulative subject in an expression such as "Mary is female" is bound to this single argument of the corresponding ontological predicate. The semantics is thus in essence the characteristic function that for each element decides if it is in the set denoted by the class.
Conditions describe precise conditions that must be met by a context in which a lexical entry can be used to refer to a certain ontological predicate (reference). These contextual conditions are attached to the lexical sense that mediates the relation between a lexical entry and the ontological predicate it can be used to express.
The condition property defines an evaluable constraint that derives from using a certain lexical entry to express a given ontological predicate.
Domain: LexicalSense
Range: rdfs:Resource
SubPropertyOf: usage
Two special types of conditions are defined in the synsem module, which formulate constraints on the type of arguments that can be used at the first or second position of a property when a certain lexical entry is used to express that property. Take for instance the distinction between the English verbs (to) ride and (to) drive. Both express the means of transportation, but have different implications. Ride implies that the means of transportation is a bike. Instead of introducing different ontological predicates and different senses, the modulation can be captured by specifying restrictions on the values that can fill the 1st or 2nd argument of the corresponding ontological predicate. This is illustrated by the example below:
:ride a ontolex:LexicalEntry ;
ontolex:sense :ride_sense1 .
:ride_sense1 a ontolex:LexicalSense ;
ontolex:reference :methodOfTransportation ;
synsem:propertyRange :Bicycle ;
synsem:semArg :subj, :obj .
:methodOfTransportation a rdf:Property ;
rdfs:range :Vehicle .
It is important to note that the propertyDomain or propertyRange properties do not modify in any way the ontological status or commitment of the corresponding property (here: methodOfTransportation). Instead, they make explicit certain implications on the type of arguments involved that derive from the use of a certain lexical entry to express the property in question.
The propertyDomain property specifies a constraint on the type of arguments that can be used at the first position of the property that is referenced by the given sense.
Domain: LexicalSense
Range: rdfs:Resource
Decomposition is the process of indicating which elements constitute a multiword or compound lexical entry. The simplest way to do this is by means of the subterm property, which indicates that a lexical entry is a part of another entry. This property allows us to specify which lexical entries a certain compound lexical entry is composed of.
The property subterm relates a compound lexical entry to one of the lexical entries it is composed of.
Domain: LexicalEntry
Range: LexicalEntry
The subterm property is used to indicate which terms have been derived from another term by means of adding or removing words, for example
The subterm property may also be used to indicate the decomposition of compound words. The following example shows how to indicate that the German compound Lungenentzündung ('pneumonia' literally 'lung inflammation') is decomposed into the lexical entries Lunge and Entzündung:
:Lungenentzündung a ontolex:LexicalEntry ;
decomp:subterm :Lunge_lex;
decomp:subterm :Entzündung_lex .
It is important to mention that the subterm property is a relation between lexical entries and indicates neither the specific inflected form of a lexical entry that appears in the compound nor the position at which it appears.
The subterm property allows us to indicate which lexical entries a compound is composed of, but it does not indicate the internal structure of the compound. This can be achieved by introducing so-called components. Such components represent a fixed list of each of the elements that compose a lexical entry. In the most common case of a multiword expression, the components of the lexical entry are the individual tokens that compose that entry.
A component is a particular realization of a lexical entry that forms part of a compound lexical entry.
Each component is said to be a constituent of a lexical entry:
The property constituent relates a lexical entry or component to a component that it is constituted by.
Domain: LexicalEntry or Component
Range: Component
:AfricanSwineFever a ontolex:MultiwordExpression ;
decomp:constituent :African_comp , :Swine_comp , :Fever_comp ;
decomp:subterm :SwineFever .
:African_comp a decomp:Component .
:Swine_comp a decomp:Component .
:Fever_comp a decomp:Component .
:SwineFever a ontolex:MultiwordExpression ;
decomp:constituent :Swine_comp , :Fever_comp .
As a component represents a particular realization of a lexical entry which forms part of a compound lexical entry, we need to link the component to the corresponding lexical entry it is a realization of. This is done by the property correspondsTo:
The property correspondsTo links a component to a corresponding lexical entry or argument.
Domain: Component
Range: LexicalEntry or SyntacticArgument
It may be necessary to add inflectional properties to the component to uniquely determine the actual form of the lexical entry. This inflectional information can be attached to the component as shown in the following example for the Spanish term 'comunidad autónoma' (federal state), whose second word is the singular feminine form autónoma instead of the canonical form autónomo.
:comunidad_autonoma_lex a ontolex:LexicalEntry ;
decomp:constituent :comunidad_component;
decomp:constituent :autonoma_component .
:comunidad_component a decomp:Component;
decomp:correspondsTo :comunidad_lex.
:autonoma_component a decomp:Component;
decomp:correspondsTo :autonomo_lex;
lexinfo:gender lexinfo:feminine;
lexinfo:number lexinfo:singular.
If we want to specify the order of the components, we can use the RDF properties rdf:_1, rdf:_2, etc. as in the following example to specify the absolute order, in addition to the constituent properties. Note that the property constituent alone is not sufficient to specify the order of components.
:comunidad_autonoma_lex a ontolex:LexicalEntry ;
decomp:constituent :comunidad_component;
rdf:_1 :comunidad_component;
decomp:constituent :autonoma_component;
rdf:_2 :autonoma_component;
ontolex:denotes <http://dbpedia.org/ontology/federalState>;
ontolex:canonicalForm :comunidad_autonoma_lex_canonical_form.
:comunidad_autonoma_lex_canonical_form ontolex:writtenRep "comunidad autónoma"@es.
:comunidad_component a decomp:Component;
decomp:correspondsTo :comunidad_lex.
:autonoma_component a decomp:Component;
decomp:correspondsTo :autonomo_lex;
lexinfo:gender lexinfo:feminine;
lexinfo:number lexinfo:singular.
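How a consumer of such data might recover the component order from the rdf:_n statements can be sketched in Python. The list of (predicate, object) pairs and the function name `ordered_components` are illustrative assumptions; the statements are deliberately given out of order to show that only the rdf:_n index matters.

```python
import re

# Hypothetical statements about :comunidad_autonoma_lex as (predicate, object) pairs.
STATEMENTS = [
    ("decomp:constituent", ":autonoma_component"),
    ("rdf:_2", ":autonoma_component"),
    ("decomp:constituent", ":comunidad_component"),
    ("rdf:_1", ":comunidad_component"),
]

# Container membership properties have the form rdf:_1, rdf:_2, ...
MEMBERSHIP = re.compile(r"rdf:_([1-9][0-9]*)")


def ordered_components(statements):
    """Return the components sorted numerically by their rdf:_n index."""
    indexed = []
    for predicate, obj in statements:
        match = MEMBERSHIP.fullmatch(predicate)
        if match:
            indexed.append((int(match.group(1)), obj))
    return [obj for _, obj in sorted(indexed)]


order = ordered_components(STATEMENTS)
print(order)
```

The index is compared as an integer, so that rdf:_10 sorts after rdf:_2; the constituent statements are ignored here because they carry no ordering information.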
The constituent property can also be used to specify the structure of a phrase, by means of showing some components as being constituted of further components. In this way, each of the components represents a node in the phrase structure tree and may be annotated with a phrase tag as in the following example:
:AfricanSwineFever_root a decomp:Component ;
decomp:correspondsTo :AfricanSwineFever ;
decomp:constituent :African_node, :SwineFever_node ;
rdf:_1 :African_node;
rdf:_2 :SwineFever_node;
olia:hasTag penn:NP .
:African_node a decomp:Component ;
decomp:correspondsTo :African ;
olia:hasTag penn:JJ .
:SwineFever_node a decomp:Component ;
decomp:constituent :Swine_node, :Fever_node ;
rdf:_1 :Swine_node;
rdf:_2 :Fever_node;
olia:hasTag penn:NP .
:Swine_node a decomp:Component ;
decomp:correspondsTo :Swine ;
olia:hasTag penn:NN .
:Fever_node a decomp:Component ;
decomp:correspondsTo :Fever ;
olia:hasTag penn:NN .
The syntactic categories of the phrases are indicated using the property olia:hasTag from the OLiA vocabulary using the Penn TreeBank tagset.
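As a rough illustration of the tree this example encodes, the following Python sketch walks the components from the root and prints a bracketed representation. The dictionary is a hand-made, hypothetical copy of the example (children listed in rdf:_n order, tags without the penn: prefix), and the function name `bracket` is an assumption for illustration.

```python
# Hypothetical in-memory copy of the component tree from the example above.
# Each node: (tag, ordered child nodes, corresponding lexical entry or None).
NODES = {
    ":AfricanSwineFever_root": ("NP", [":African_node", ":SwineFever_node"], ":AfricanSwineFever"),
    ":African_node": ("JJ", [], ":African"),
    ":SwineFever_node": ("NP", [":Swine_node", ":Fever_node"], None),
    ":Swine_node": ("NN", [], ":Swine"),
    ":Fever_node": ("NN", [], ":Fever"),
}


def bracket(node_id, nodes):
    """Render the subtree rooted at node_id in bracketed notation."""
    tag, children, entry = nodes[node_id]
    if not children:
        # Leaf: show the lexical entry the component corresponds to.
        return f"({tag} {entry.lstrip(':')})"
    inner = " ".join(bracket(child, nodes) for child in children)
    return f"({tag} {inner})"


tree = bracket(":AfricanSwineFever_root", NODES)
print(tree)
```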
The following example shows how to use the synsem module in conjunction with the decomp module to indicate the phrase structure tree of a frame. This is done by making the frame the target of the correspondsTo property and including components in the tree that correspond to individual arguments. In this way, lexicalized grammars can be modelled within the lexicon.
:know a ontolex:Word ;
synsem:synBehavior :know_frame .
:know_frame a synsem:SyntacticFrame ;
lexinfo:subject :subject ;
lexinfo:directObject :directObject .
:know_root a decomp:Component ;
decomp:correspondsTo :know_frame ;
decomp:constituent :X_node, :knowY_node ;
olia:hasTag penn:S .
:X_node a decomp:Component ;
decomp:correspondsTo :subject ;
olia:hasTag penn:NP .
:knowY_node a decomp:Component ;
decomp:constituent :know_node, :Y_node ;
olia:hasTag penn:VP .
:know_node a decomp:Component ;
decomp:correspondsTo :know ;
olia:hasTag penn:V .
:Y_node a decomp:Component ;
decomp:correspondsTo :directObject ;
olia:hasTag penn:NP .
The variation and translation module introduces vocabulary needed to represent relations between lexical entries and lexical senses that are variants of each other. The following diagram provides an overview of the vocabulary introduced by the module:
The model defines a generic class lexico-semantic relation that allows us to relate two lexical entries or two lexical senses to each other. This is done principally by means of two properties, lexicalRel and senseRel, which directly link two lexical entries or two lexical senses that are related.
The lexicalRel property relates two lexical entries that stand in some lexical relation.
Domain: ontolex:LexicalEntry
Range: ontolex:LexicalEntry
The senseRel property relates two lexical senses that stand in some sense relation.
Domain: ontolex:LexicalSense
Range: ontolex:LexicalSense
In general, these properties should not be used directly but instead a sub-property should be introduced, for example:
:fao lexinfo:initialismFor :food_and_agriculture_organization.
:surrogate_mother lexinfo:hypernym :mother.
lexinfo:initialismFor rdfs:subPropertyOf vartrans:lexicalRel.
lexinfo:hypernym rdfs:subPropertyOf vartrans:senseRel.
In the case that further information about the relationship needs to be represented it is possible to create an individual that 'reifies' the relationship.
A lexico-semantic relation represents the relation between two lexical entries or lexical senses that are related by some lexical or semantic relationship.
subClassOf: relates exactly 2 (ontolex:LexicalEntry OR ontolex:LexicalSense OR ontolex:LexicalConcept)
The object property relates links a lexico-semantic relation to the lexical entries or lexical senses between which it establishes the relation:
The relates property links a lexico-semantic relation to the two lexical entries or lexical senses between which it establishes the relation.
Domain: LexicoSemanticRelation
Range: ontolex:LexicalEntry OR ontolex:LexicalSense OR ontolex:LexicalConcept
As many lexico-semantic relations are asymmetric, it is necessary to distinguish the source from the target:
The source property indicates the lexical sense or lexical entry involved in a lexico-semantic relation as a 'source'.
SubPropertyOf: relates
The target property indicates the lexical sense or lexical entry involved in a lexico-semantic relation as a 'target'.
SubPropertyOf: relates
The class lexico-semantic relation is specialized into the following two subclasses: lexical relation and sense relation, which relate two lexical entries or two lexical senses, respectively:
A lexical relation is a lexico-semantic relation that represents the relation between two lexical entries the surface forms of which are related grammatically, stylistically or by some operation motivated by linguistic economy.
subClassOf: LexicoSemanticRelation, relates exactly 2 ontolex:LexicalEntry
By lexical relations we understand those relations at the surface forms, mainly motivated by grammatical requirements, style (Wortklang), and linguistic economy (helping to avoid excessive denominative repetition and improving textual coherence). Examples of lexical relations are the following:
The specific type of lexical or sense relation can be specified via the object property category, which is defined as follows:
The category property indicates the specific type of relation by which two lexical entries or two lexical senses are related.
Domain: LexicoSemanticRelation
Characteristics: Functional
The following example shows how to model the relation between "Food and Agriculture Organization" and its initialism "FAO" as one example of a lexical relation:
:fao a ontolex:LexicalEntry ;
ontolex:sense :fao_sense;
ontolex:lexicalForm :fao_form.
:fao_sense ontolex:reference <http://dbpedia.org/resource/Food_and_Agriculture_Organization> .
:food_and_agriculture_organization a ontolex:LexicalEntry;
ontolex:sense :food_and_agriculture_organization_sense ;
ontolex:lexicalForm :food_and_agriculture_organization_form.
:food_and_agriculture_organization_sense ontolex:reference <http://dbpedia.org/resource/Food_and_Agriculture_Organization> .
:fao_form ontolex:writtenRep "FAO"@en .
:food_and_agriculture_organization_form ontolex:writtenRep "Food and Agriculture Organization"@en .
:fao_initialism a vartrans:LexicalRelation ;
vartrans:source :food_and_agriculture_organization ;
vartrans:target :fao ;
vartrans:category :initialism.
A sense relation is a lexico-semantic relation that represents the relation between two lexical senses the meanings of which are related.
subClassOf: LexicoSemanticRelation, relates exactly 2 ontolex:LexicalSense
Examples of semantic relations are the equivalence relation between two senses, hypernymy and hyponymy relations, synonymy, antonymy, translations, etc.
The following is an example of a sense relation:
:surrogate_mother_lex a ontolex:LexicalEntry ;
ontolex:sense :surrogate_mother_sense ;
ontolex:canonicalForm :surrogate_mother_form.
:surrogate_mother_sense ontolex:reference <http://dbpedia.org/ontology/surrogate_mother>.
:surrogate_mother_form ontolex:writtenRep "surrogate mother"@en .
:mother_lex a ontolex:LexicalEntry ;
ontolex:sense :mother_sense ;
ontolex:canonicalForm :mother_form.
:mother_sense ontolex:reference <http://dbpedia.org/ontology/mother>.
:mother_form ontolex:writtenRep "mother"@en .
:senseRelation a vartrans:SenseRelation ;
vartrans:source :surrogate_mother_sense ;
vartrans:target :mother_sense ;
vartrans:category lexinfo:hypernym .
Further, we consider terminological relations, which are defined as follows:
A terminological relation is a sense relation that relates two lexical senses of terms that are semantically related in the sense that they can be exchanged in most contexts, but their surface forms are not directly related. The variants vary along dimensions that are not captured by the given ontology and are intentionally (pragmatically) caused.
SubclassOf: SenseRelation
Examples of categories of terminological relations include:
We illustrate the use of terminological relations with the following example of a diachronic variant:
:tuberculosis a ontolex:LexicalEntry ;
ontolex:lexicalForm :tuberculosis_form ;
ontolex:sense :tuberculosis_sense.
:tuberculosis_form ontolex:writtenRep "tuberculosis"@en .
:tuberculosis_sense ontolex:reference <http://dbpedia.org/resource/Tuberculosis>.
:phthisis a ontolex:LexicalEntry ;
ontolex:lexicalForm :phthisis_form ;
ontolex:sense :phthisis_sense.
:phthisis_form ontolex:writtenRep "phthisis"@en .
:phthisis_sense ontolex:reference <http://dbpedia.org/resource/Tuberculosis>;
dct:subject <http://dbpedia.org/resource/Medicine> .
:phthisis_diachronic_relation a vartrans:TerminologicalRelation ;
vartrans:source :phthisis_sense ;
vartrans:target :tuberculosis_sense ;
vartrans:category :diachronic.
Finally, it is also possible to state relations between concepts, which is useful for modelling relations between synsets in wordnets and other similar resources.
The conceptRel property relates two lexical concepts that stand in some sense relation.
Domain: ontolex:LexicalConcept
Range: ontolex:LexicalConcept
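A minimal sketch of how conceptRel might be used to relate two synsets. The namespace bound to the default prefix and the two concept identifiers are hypothetical and serve only as illustration:

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix vartrans: <http://www.w3.org/ns/lemon/vartrans#> .
@prefix : <http://www.example.com/wordnet#> .   # hypothetical namespace

# Two hypothetical synsets, modelled as lexical concepts
:synset_dog a ontolex:LexicalConcept .
:synset_animal a ontolex:LexicalConcept .

# Generic statement that the two concepts are related
:synset_dog vartrans:conceptRel :synset_animal .
```

In practice, a resource would typically define a more specific subproperty of conceptRel (for instance for hypernymy) rather than use the generic property directly.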
Translation relates two lexical entries from different languages the meaning of which is 'equivalent'. This 'equivalence' can be expressed at three different levels:
In order to express that the lexical senses of two lexical entries are ontologically equivalent, we do not need other machinery than the one introduced already in the ontolex section above:
:surrogate_mother a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
ontolex:sense :surrogate_mother_sense.
:surrogate_mother_sense ontolex:reference ontology:SurrogateMother.
:madre_de_alquiler a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/es>, <http://lexvo.org/id/iso639-1/es> ;
ontolex:sense :madre_de_alquiler_sense.
:madre_de_alquiler_sense ontolex:reference ontology:SurrogateMother.
:leihmutter a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
ontolex:sense :leihmutter_sense.
:leihmutter_sense ontolex:reference ontology:SurrogateMother.
By this, the corresponding senses of the lexical entries surrogate mother, madre de alquiler and Leihmutter are said to be equivalent in that they denote the same class in the ontology.
The second alternative mentioned above can be realized through the class translation, which relates two senses that can be regarded as equivalent in that they can be exchanged for each other.
A translation is a sense relation expressing that two lexical senses corresponding to two lexical entries in different languages can be translated to each other without any major meaning shifts.
subClassOf: SenseRelation
:zip_code a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
ontolex:sense :zip_code_sense.
:zip_code_sense ontolex:reference <http://dbpedia.org/ontology/zipCode>.
:postleitzahl a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
ontolex:sense :postleitzahl_sense.
:postleitzahl_sense ontolex:reference <http://de.dbpedia.org/resource/Postleitzahl>.
:trans a vartrans:Translation;
vartrans:source :zip_code_sense;
vartrans:target :postleitzahl_sense;
vartrans:category <http://purl.org/net/translation-categories#directEquivalent>.
Thus, in spite of having different denotations, both Postleitzahl and zip code can be seen as cross-lingual equivalents and thus as translations of each other.
Besides the class Translation, which reifies the translation relation between two lexical senses, as a shortcut the model also allows us to directly express the relation of translation between lexical senses by a property translation that is regarded as equivalent to the reification:
The translation property relates two lexical senses of two lexical entries that stand in a translation relation to one another.
subPropertyOf: senseRel
With the translation property, the above example can be replaced with:
:zip_code a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
ontolex:sense :zip_code_sense.
:zip_code_sense ontolex:reference <http://dbpedia.org/ontology/zipCode>.
:postleitzahl a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/de>, <http://lexvo.org/id/iso639-1/de> ;
ontolex:sense :postleitzahl_sense.
:postleitzahl_sense ontolex:reference <http://de.dbpedia.org/resource/Postleitzahl>.
:zip_code_sense vartrans:translation :postleitzahl_sense.
The third option foreseen in the vartrans model is one where we say that a lexical entry can be translated into some other entry in some contexts, underspecifying the exact lexical senses involved and the exact contextual conditions under which this translation is valid. For this, the model introduces the property translatableAs:
The translatableAs property relates a lexical entry in some language to a lexical entry in another language that it can be translated as depending on the particular context and specific senses of the involved lexical entries.
Domain: ontolex:LexicalEntry
Range: ontolex:LexicalEntry
Characteristics: Symmetric
Subproperty of: isSenseOf o translation o sense
The following example shows how to use the relation translatableAs to specify that corner (which can mean street intersection or intersection of two inside walls) can be translated as the Spanish rincón (intersection of two inside walls) or esquina (street intersection), depending on the particular sense involved.
:corner a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> .
:rincón a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/es>, <http://lexvo.org/id/iso639-1/es> .
:esquina a ontolex:LexicalEntry;
dct:language <http://id.loc.gov/vocabulary/iso639-2/es>, <http://lexvo.org/id/iso639-1/es> .
:corner vartrans:translatableAs :rincón.
:corner vartrans:translatableAs :esquina.
We can group translations into a set by using the class translation set, for instance if they come from the same language resource or belong to the same organisation:
In order to relate a translation set to one of the translations contained in it, the model defines a property trans:
The trans property relates a TranslationSet to one of its translations.
Domain: TranslationSet
Range: Translation
:study a ontolex:LexicalEntry ;
ontolex:sense :study_sense ;
dct:language iso639:en .
:Studium a ontolex:LexicalEntry ;
ontolex:sense :Studium_sense ;
dct:language iso639:de .
:Untersuchung a ontolex:LexicalEntry ;
ontolex:sense :Untersuchung_sense ;
dct:language iso639:de .
:staidear a ontolex:LexicalEntry ;
ontolex:sense :staidear_sense ;
dct:language iso639:ga .
:t1 a vartrans:Translation ;
vartrans:source :study_sense ;
vartrans:target :Studium_sense .
:t2 a vartrans:Translation ;
vartrans:source :study_sense ;
vartrans:target :staidear_sense .
:t3 a vartrans:Translation ;
vartrans:source :study_sense ;
vartrans:target :Untersuchung_sense .
:ts1 a vartrans:TranslationSet ;
vartrans:trans :t1, :t3 ;
dc:source "Automatically translated"@en .
:ts2 a vartrans:TranslationSet ;
vartrans:trans :t2 ;
dc:source "Wiktionary"@en .
The LInguistic MEtadata (lime) module allows for describing metadata at the level of the lexicon-ontology interface. This module is intended to complement existing metadata schemas such as Dublin Core, the PROV ontology, DCAT or VoID, as lime provides a profile to describe metadata as related to the lexicon-ontology interface.
Following the conceptual model of the lexicon-ontology interface, lime distinguishes three main metadata entities:
Note: the reference dataset here is not limited to OWL vocabularies, but includes any RDF dataset which contains references to objects of a domain of discourse.
As a metadata vocabulary, lime focuses on summarizing quantitative and qualitative information about these entities and the relations among them.
Metadata is attached in particular to three types of sets that lime distinguishes:
In the following sections, we provide detailed descriptions for the lime vocabulary to describe metadata for the lexicon as a whole as well as for the three types of sets described above. Metadata about ontologies (and domain datasets as well) and lexical concept sets can be provided by means of the already mentioned existing metadata vocabularies.
The main metadata-bearing entity in lemon is a lexicon object that represents a collection of lexical entries for a particular language. A small example lexicon consisting of four lexical entries for cat, marry, high and intangible assets would look as follows:
:lexicon a lime:Lexicon;
lime:language "en";
lime:entry :lex_high;
lime:entry :lex_cat;
lime:entry :lex_marry;
lime:entry :lex_intangible_assets.
A lexicon is expected to consist of at least one lexical entry and is defined as a subclass of void:Dataset:
A lexicon represents a collection of lexical entries for a particular language or domain.
SubClassOf: entry min 1 ontolex:LexicalEntry, language exactly 1 rdfs:Literal, void:Dataset
The property linking a lexicon to a lexical entry is the property entry:
The entry property relates a lexicon to one of the lexical entries contained in it.
Domain: Lexicon
Range: ontolex:LexicalEntry
The language property can be stated on either a lexicon or a lexical entry (note that all entries in the same lexicon should be in the same language and that the language of the lexicon and entry should be consistent with the language tags used on all forms) and its value should be a literal representing the language.
The language property indicates the language of a lexicon, a lexical entry, a concept set or a lexicalization set.
Domain: Lexicon or ontolex:LexicalEntry or ConceptSet or LexicalizationSet
Range: xsd:language
Beyond using the lime:language property, which has a Literal as a range, it is recommended to use the Dublin Core language property with reference to either Lexvo.org or the Library of Congress Vocabulary:
http://www.lexvo.org/id/iso639-3/xxx where xxx is the 3-letter ISO 639-3 code
http://id.loc.gov/vocabulary/iso639-1/xx where xx is the 2-letter ISO 639-1 code
The property lexical entries indicates the number of lexical entries contained in a lexicon. The property is also used for lexicalization and conceptualization sets, indicating in this case the number of lexical entries involved in these sets.
The lexical entries property indicates the number of distinct lexical entries contained in a lexicon, lexicalization set or conceptualization set.
Domain: Lexicon or LexicalizationSet or ConceptualizationSet
Range: xsd:integer
The model also allows us to specify the linguistic (annotation) model used to describe characteristics of lexical entries via the linguisticCatalog property:
The linguistic catalog property indicates the catalog of linguistic categories used in a lexicon to define linguistic properties of lexical entries.
Domain: Lexicon
Range: voaf:Vocabulary
SubPropertyOf: void:vocabulary
As an example we may describe a simple lexicon using the above introduced properties in addition to Dublin Core properties. The part-of-speech of the four lexical entries is indicated using the lexinfo vocabulary, so that the value of linguisticCatalog is set to http://www.lexinfo.net/ontologies/2.0/lexinfo. In the example, there is one (RDF) resource that represents both the lexicon itself and its metadata:
:lexicon a lime:Lexicon;
lime:language "en";
dct:language <http://id.loc.gov/vocabulary/iso639-2/eng>, <http://lexvo.org/id/iso639-1/en> ;
lime:lexicalEntries "4"^^xsd:integer;
lime:linguisticCatalog <http://www.lexinfo.net/ontologies/2.0/lexinfo> ;
dct:description "This is an example lexicon"@en;
dct:description "Questo è un lessico di esempio"@it;
dct:creator <http://john.mccr.ae/>;
void:triples "29"^^xsd:integer ;
lime:entry :lex_high;
lime:entry :lex_cat;
lime:entry :lex_marry;
lime:entry :lex_intangible_assets.
:lex_cat a ontolex:LexicalEntry, lexinfo:Noun;
ontolex:canonicalForm :form_cat.
:form_cat ontolex:writtenRep "cat"@en.
:lex_marry a ontolex:LexicalEntry, lexinfo:Verb;
ontolex:canonicalForm :form_marry.
:form_marry ontolex:writtenRep "marry"@en .
:lex_high a ontolex:LexicalEntry, lexinfo:Adjective;
ontolex:canonicalForm :form_high.
:form_high ontolex:writtenRep "high"@en .
:lex_intangible_assets a ontolex:LexicalEntry, lexinfo:Noun;
ontolex:canonicalForm :form_intangible_assets.
:form_intangible_assets ontolex:writtenRep "intangible assets"@en.
A lexicalization set is a void:Dataset that comprises a collection of so called lexicalizations, which we understand as pairs of a lexical entry and an associated reference in the ontology.
A lexicalization set is a dataset that comprises a collection of lexicalizations, that is pairs of lexical entry and corresponding reference in the associated ontology/vocabulary/dataset.
SubClassOf: void:Dataset, lexiconDataset max 1 lime:Lexicon, referenceDataset exactly 1 void:Dataset, partition only LexicalizationSet, lexicalizationModel exactly 1
The lexicalization set is linked to the ontology and the lexicon by means of the properties reference dataset and lexicon dataset, respectively.
The reference dataset property indicates the dataset that contains the domain objects or vocabulary elements that are either referenced by a given lexicon, providing the grounding vocabulary for the meaning of the lexical entries, or linked to lexical concepts in a concept set by means of a lexical link set.
Domain: LexicalizationSet or LexicalLinkset
Range: void:Dataset
The lexicon dataset property indicates the lexicon that contains the entries referred to in a lexicalization set or a conceptualization set.
Domain: LexicalizationSet or ConceptualizationSet
Range: Lexicon
The optionality of the lexicon dataset property is required to support other lexicalization models (e.g. RDFS, SKOS, SKOS-XL) that do not introduce a separate notion of lexicon, since lexical entries only exist implicitly being part of a lexicalization. The property lexicalization model indicates the specific lexicalization model used.
The lexicalization model property indicates the model used for representing lexical information. Possible values include (but are not limited to) http://www.w3.org/2000/01/rdf-schema# (for the use of rdfs:label), http://www.w3.org/2004/02/skos/core (for the use of skos:pref/alt/hiddenLabel), http://www.w3.org/2008/05/skos-xl (for the use of skosxl:pref/alt/hiddenLabel) and http://www.w3.org/ns/lemon/ontolex for lemon.
Domain: LexicalizationSet
Range: rdfs:Resource
SubPropertyOf: void:vocabulary
The model defines the property references, which indicates the number of vocabulary elements lexicalized by at least one lexical entry. This number can be smaller than the number of entities in the ontology (when some vocabulary elements are not lexicalized) and smaller than the number of lexical entries in the lexicon (when several lexical entries refer to the same ontology element).
The references property indicates the number of distinct ontology or vocabulary elements that are either associated with lexical entries via a lexicalization set or linked to lexical concepts via a lexical link set.
Domain: LexicalizationSet or LexicalLinkset
Range: xsd:integer
In the following example, we describe a lexicalization set expressing how elements of an ontology can be verbalized in Japanese by means of entries from a supplied lexicon. The metadata states which ontology and lexicon are involved in the lexicalization set, namely http://www.example.com/ontology and http://www.example.com/lexicon, respectively, as well as the relevant natural language. Knowing these facts about a lexicalization set allows us to assess its usefulness for a given task as well as to discover relevant lexicalization sets when we are constrained by the choice of an ontology, lexicon or natural language.
The ontology is modelled as an instance of the class voaf:Vocabulary that is a kind of void:Dataset representing vocabularies (both RDFS Schemas and OWL Ontologies). We benefit from the more specific distinctions made by VOAF, by breaking down the total number of entities in the ontology (held by the property void:entities) into separate counts for the classes and properties (held by voaf:classNumber and voaf:propertyNumber, respectively).
Similarly, terms from the lime vocabulary are used to represent statistics about the linguistic content of the lexicon and the lexicalization set. Overall, the ontology defines 100 entities and the lexicon 80 lexical entries; however, only 20 entities from the target ontology have been associated with a total of 50 lexical entries. In this sense, only 20 references from the ontology have been actually lexicalized by linking them to a lexical entry.
When counting the entities in the ontology or, in general, in the reference dataset, we recommend ignoring the resource describing the ontology itself (that is, the instance of the class owl:Ontology) as well as other metadata entities.
:Lexicalization a lime:LexicalizationSet ;
lime:language "ja";
dct:language <http://id.loc.gov/vocabulary/iso639-1/ja>, <http://lexvo.org/id/iso639-3/jpn> ;
lime:lexicalizationModel <http://www.w3.org/ns/lemon/all> ;
lime:referenceDataset <http://www.example.com/ontology> ;
lime:lexiconDataset <http://www.example.com/lexicon> ;
lime:references 20 ;
lime:lexicalEntries 50 .
<http://www.example.com/ontology> a owl:Ontology, voaf:Vocabulary, void:Dataset ;
void:entities 100 ;
voaf:classNumber 60 ;
voaf:propertyNumber 40 .
<http://www.example.com/lexicon> a lime:Lexicon ;
lime:language "ja" ;
dct:language <http://id.loc.gov/vocabulary/iso639-1/ja>, <http://lexvo.org/id/iso639-3/jpn> ;
lime:lexicalEntries 80 .
A lexicalization set comprises a set of pairs of a lexical entry and the corresponding reference that the lexical entry denotes. These pairs are expressed differently depending on the lexical model adopted:
rdfs:label.
skos(-xl):{pref,alt,hidden}Label properties.
In addition to specifying the number of entities in the ontology lexicalized, it is also possible to give the total number of lexicalizations, that is the total connections between lexical entries and references. This number should in most cases be the same as the total number of lexical senses defined in the lexicon. The value may be given by the absolute number of lexicalizations:
The lexicalizations property indicates the total number of lexicalizations in a lexicalization set, that is the number of unique pairs of lexical entry and denoted ontology element.
Domain: LexicalizationSet
Range: xsd:integer
In addition or alternatively to the absolute number of lexicalizations, the model also supports the indication of the average number of lexicalizations per ontology element:
The average number of lexicalizations property indicates the average number of lexicalizations per ontology element.
Domain: LexicalizationSet
Range: xsd:decimal
The average number of lexicalizations is calculated as specified by the following formula:
The following example describes an ontology consisting of 30 ontology elements. The corresponding lexicalization set contains 20 lexicalizations involving 15 lexical entries (so some entries have multiple meanings in the ontology). On average, for each element in the ontology there are thus 20/30 = 0.66 lexicalizations.
:Lexicalization a lime:LexicalizationSet ;
lime:lexicalizations 20 ;
lime:references 20 ;
lime:lexicalEntries 15 ;
lime:avgNumOfLexicalizations 0.66 ;
lime:referenceDataset <http://www.example.com/ontology> ;
lime:lexiconDataset <http://www.example.com/lexicon> .
<http://www.example.com/ontology> a owl:Ontology, void:Dataset ;
void:entities 30 .
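The arithmetic behind this example can be written out as follows. This is a sketch that assumes, in line with the worked figures above, that the average divides the number of lexicalizations by the number of entities in the reference dataset; the symbol names are chosen here for illustration.

```latex
\[
\mathrm{avgNumOfLexicalizations}
  = \frac{\mathrm{lexicalizations}}{\mathrm{entities}}
  = \frac{20}{30}
  = 0.\overline{6}
\]
```

The example records this value truncated to two decimal places (0.66).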
Finally, the percentage property may be used to express the percentage of entities in an ontology which are lexicalized, formally:
In many cases, we want to provide descriptive metadata about a subset of a lexicalization set, that is for the subset representing all the lexicalizations for a certain type of ontology entity (class, property, etc.). To logically partition a lexicalization set, the lime module introduces the property partition:
The partition property relates a lexicalization set or lexical linkset to a logical subset that contains lexicalizations for a given ontological type only.
Domain: LexicalizationSet or LexicalLinkset
Range: LexicalizationSet or LexicalLinkset
SubPropertyOf: void:subset
The resource type property indicates the type of ontological entity of a lexicalization set or lexical linkset.
Domain: LexicalizationSet or LexicalLinkset
Range: rdfs:Class
Characteristics: Functional
For example, we may limit our metadata about lexicalizations to a particular class, e.g. restricting the metadata to the logical partition of lexicalizations that denote an element in the extension of the corresponding class:
:Lexicalization a lime:LexicalizationSet ;
lime:partition :CountryPartition ;
lime:references 2000 .
:CountryPartition
lime:resourceType ontology:Country ;
lime:references 50 .
In addition, it is also possible to give RDF(S) or OWL types as the target of the resource type property. This allows us to state the number of classes that are lexicalized by at least one lexical entry:
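A minimal sketch of such a partition, assuming a hypothetical namespace for the default prefix and hypothetical counts; only the use of owl:Class as the value of the resource type property is the point being illustrated:

```turtle
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix : <http://www.example.com/metadata#> .   # hypothetical namespace

:Lexicalization a lime:LexicalizationSet ;
    lime:partition :ClassPartition ;
    lime:references 2000 .      # hypothetical count

# Partition restricted to resources typed as owl:Class
:ClassPartition a lime:LexicalizationSet ;
    lime:resourceType owl:Class ;
    lime:references 30 .        # hypothetical: 30 classes have at least one lexicalization
```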
Lexical linksets are similar in many ways to the lexicalization sets above in the sense that they connect a concept set to an ontology. The primary purpose of this is to describe the linking of a concept set such as the synsets in a wordnet to an ontology.
A lexical linkset represents a collection of links between a reference dataset and a set of lexical concepts (e.g. synsets of a wordnet).
SubClassOf: void:Linkset, conceptualDataset exactly 1 ontolex:ConceptSet, referenceDataset exactly 1 void:Dataset, partition only LexicalLinkset
The lexical linkset is linked to a concept set by means of the conceptual dataset property:
The conceptual dataset property relates a lexical link set or a conceptualization set to a corresponding concept set.
Domain: LexicalLinkset or ConceptualizationSet
Range: ontolex:ConceptSet
There are several properties that are analogous to properties defined for a lexicalization set. For example, concepts indicates the number of concepts in a concept set:
The concepts property indicates the number of lexical concepts defined in a concept set or involved in either a LexicalLinkset or ConceptualizationSet.
Domain: ontolex:ConceptSet or LexicalLinkset or ConceptualizationSet
Range: xsd:integer
Similarly, the links and avgNumOfLinks properties are analogous to the properties lexicalizations and avgNumOfLexicalizations.
The links property indicates the number of links between concepts in the concept set and entities in the reference dataset.
Domain: LexicalLinkset
Range: xsd:integer
The average number of links property indicates the average number of links to lexical concepts for each ontology element in the reference dataset.
Domain: LexicalLinkset
Range: xsd:decimal
Finally, we note that the references, percentage and partition properties apply to the lexical linkset in the same way as to the lexicalization set.
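The properties of a lexical linkset can be combined as in the following sketch. All identifiers and figures are hypothetical, and the value of avgNumOfLinks assumes, by analogy with avgNumOfLexicalizations, that the number of links is divided by the number of entities in the reference dataset (50/100 = 0.5):

```turtle
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix : <http://www.example.com/metadata#> .   # hypothetical namespace

:Linkset a lime:LexicalLinkset ;
    lime:referenceDataset <http://www.example.com/ontology> ;
    lime:conceptualDataset :ConceptSet ;
    lime:references 25 ;        # ontology elements linked to at least one concept
    lime:concepts 40 ;          # lexical concepts involved in the links
    lime:links 50 ;             # total number of links
    lime:avgNumOfLinks 0.5 .    # assumed to be links / entities

<http://www.example.com/ontology> a void:Dataset ;
    void:entities 100 .

:ConceptSet a ontolex:ConceptSet ;
    lime:concepts 40 .
```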
A conceptualization set is analogous to a lexicalization set, but associates a concept set with a lexicon and consists of conceptualizations, that is pairs formed by a single lexical entry and its associated lexical concept.
A conceptualization set represents a collection of links between lexical entries in a lexicon and lexical concepts in a concept set they evoke.
SubClassOf: void:Dataset, lexiconDataset exactly 1 Lexicon, conceptualDataset exactly 1 ontolex:ConceptSet
A number of properties already described for other metadata entities can also be used in the description of a conceptualization set.
Additional properties have been defined specifically to characterize a given set of conceptualizations:
The conceptualizations property indicates the number of distinct conceptualizations in a conceptualization set.
Domain: ConceptualizationSet
Range: xsd:integer
The average ambiguity property indicates the average number of lexical concepts evoked by each lemma/canonical form in the lexicon.
Domain: ConceptualizationSet
Range: xsd:decimal
The average synonymy property indicates the average number of lexical entries evoking each lexical concept in the concept set.
Domain: ConceptualizationSet
Range: xsd:decimal
The following example shows how to describe the metadata of a version of WordNet 3.0 transformed into RDF. The example illustrates how to describe the main components of the resource (a lexicon, a concept set and a conceptualization relating them). The transformation to RDF is based on a straightforward mapping between the WordNet meta-model and the ontolex model:
With this mapping in mind, it should be clear how some of the statistics about WordNet 3.0 can be specified by means of the vocabulary introduced by the lime module:
:WnConceptualizationSet a lime:ConceptualizationSet ;
lime:conceptualDataset :WnConceptSet ;
lime:lexiconDataset :WnLexicon ;
lime:lexicalEntries "155287"^^xsd:integer ;
lime:concepts "117659"^^xsd:integer ;
lime:conceptualizations "206941"^^xsd:integer ;
lime:avgAmbiguity "1.33"^^xsd:decimal ;
lime:avgSynonymy "1.76"^^xsd:decimal .
:WnConceptSet a ontolex:ConceptSet ;
lime:concepts "117659"^^xsd:integer .
:WnLexicon a lime:Lexicon ;
lime:lexicalEntries "155287"^^xsd:integer .
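The two averages in this example can be checked against the counts. The sketch below assumes that average ambiguity divides the number of conceptualizations by the number of lexical entries, and that average synonymy divides it by the number of concepts; this reading is an assumption, not a formula stated in this section.

```latex
\[
\mathrm{avgAmbiguity} = \frac{\mathrm{conceptualizations}}{\mathrm{lexicalEntries}}
  = \frac{206941}{155287} \approx 1.33
\qquad
\mathrm{avgSynonymy} = \frac{\mathrm{conceptualizations}}{\mathrm{concepts}}
  = \frac{206941}{117659} \approx 1.76
\]
```

Under this assumption, both ratios agree with the values given in the example.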
The lime module essentially provides vocabulary to describe the relation between three sets:
The model considers binary relations over these sets as follows:
For each Ri, it holds that the relation is a subset of the Cartesian product of the involved sets, i.e. Ri ⊆ A × B
For each of these relations Ri ⊆ A × B, we define the following counts:
and ratios:
The lime model does not introduce all the properties to express all of the above counts for all three relations, but has selected to model the following relations:
Relation | Related Dataset | cardinality(Ri) | count(πA(Ri)) | count(πB(Ri)) | coverageA(Ri) | averageA(Ri) | averageB(Ri) |
---|---|---|---|---|---|---|---|
Rlex ⊆ O × L | lime:LexicalizationSet | lime:lexicalizations | lime:references | lime:lexicalEntries | percentage | avgNumOfLexicalizations | N/A |
Rcon ⊆ L × C | lime:ConceptualizationSet | lime:conceptualizations | lime:lexicalEntries | lime:concepts | N/A | avgAmbiguity | avgSynonymy |
Rlink ⊆ O × C | lime:LexicalLinkset | lime:links | lime:references | lime:concepts | percentage | avgNumOfLinks | N/A |
In this section, we describe different publication scenarios for lemon models. The lexicon ontology model essentially describes three types of entities:
Irrespective of their logical dependencies, all of the entities above can be published as physically independent data sources. At the other end of the set of options, the entities can be published together as one data source.
We highlight four common publication scenarios:
rdfs:label, skos or skosxl labeling properties.
Similarly, there is Concept Set for a collection of lexical concepts and ConceptualizationSet for the triples expressing how lexical concepts relate to lexical entries from a given lexicon. Similar considerations to the ones above apply to these datasets.
Identifying a Concept Set as an independent dataset allows reusing the same lexical concepts across different conceptualization sets. For example, this allows us to reuse the same lexical concepts from an existing wordnet to conceptualize a lexicon in a different natural language than the one for which the resource was initially conceived. Otherwise, it is possible to define different concept sets, one for each conceptualization set, and then to relate them via a VoID Linkset.
An important goal of a lexicon is to record linguistic properties of the lexical entries it defines, such as their part of speech, gender, aspect, inflectional pattern, etc. The lemon model does not prescribe any vocabulary for doing so, but leaves it to the discretion of the user of the model to select an appropriate vocabulary that is in line with a given theoretical linguistic framework or grammar. We show below how third-party category systems can be reused to describe the properties of lexical entries in a lemon lexicon. In our examples, we use the lexinfo ontology (http://lexinfo.net/ontology/2.0/lexinfo#) as such a third-party ontology describing relevant linguistic categories and properties.
A lexicon typically indicates the part-of-speech of a given lexical entry. We can specify the part of speech of a word as follows using the lexinfo vocabulary:
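A minimal sketch of such a part-of-speech statement. The entry :cat and the namespace of the default prefix are hypothetical, and the lexinfo namespace IRI used in the prefix declaration is an assumption:

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .   # assumed namespace IRI
@prefix : <http://www.example.com/lexicon#> .                        # hypothetical namespace

# The entry is marked as a noun via the lexinfo partOfSpeech property
:cat a ontolex:LexicalEntry ;
    lexinfo:partOfSpeech lexinfo:noun .
```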
When defining categories, it is crucial to link these categories to other models to establish coherence. The partOfSpeech property is defined as follows in lexinfo:
lexinfo:partOfSpeech
rdfs:label "part of speech"@en ;
rdfs:comment "A category assigned to a word based on its grammatical and semantic properties."@en ;
dcr:datcat <http://www.isocat.org/datcat/DC-1345> ,
<http://www.isocat.org/datcat/DC-396> ;
rdfs:range lexinfo:PartOfSpeech ;
rdfs:subPropertyOf lexinfo:morphosyntacticProperty .
The concrete part of speech "noun" is defined as follows and linked to the ISOcat category DC-1333.
lexinfo:noun
a lexinfo:PartOfSpeech, lexinfo:NounPOS ;
rdfs:label "noun"@en ;
rdfs:comment "Part of speech used to express the name of a person, place, action or thing."@en ;
dcr:datcat <http://www.isocat.org/datcat/DC-1333> .
The following morpho-syntactic properties are defined in the lexinfo ontology:
When using these properties, care should be taken to distinguish between linguistic properties of the entry itself and properties of any of its forms. By default, it should be assumed that a property of a lexical entry also holds for all its forms. In many languages, for example, gender is an entry property for nouns but a form property for adjectives:
:spiaggia a ontolex:Word ;
ontolex:canonicalForm :spiaggia_lemma ;
ontolex:otherForm :spiaggia_plural ;
lexinfo:partOfSpeech lexinfo:noun ;
lexinfo:gender lexinfo:feminine .
:spiaggia_lemma
ontolex:writtenRep "spiaggia"@it ;
lexinfo:number lexinfo:singular .
:spiaggia_plural
ontolex:writtenRep "spiagge"@it ;
lexinfo:number lexinfo:plural .
:famoso a ontolex:Word ;
ontolex:canonicalForm :famoso_lemma ;
ontolex:otherForm :famosa_form, :famose_form, :famosi_form ;
lexinfo:partOfSpeech lexinfo:adjective .
:famoso_lemma
ontolex:writtenRep "famoso"@it ;
lexinfo:number lexinfo:singular ;
lexinfo:gender lexinfo:masculine .
:famosa_form
ontolex:writtenRep "famosa"@it ;
lexinfo:number lexinfo:singular ;
lexinfo:gender lexinfo:feminine .
For convenience, lexinfo also introduces specific classes for each part of speech so that the part of speech of a word can be specified by an rdf:type statement. For example, the part of speech Noun is defined as follows:
Noun ≡ ∃ partOfSpeech.NounPOS
It is recommended to use both the rdf:type statement as well as the lexinfo:partOfSpeech to maximize interoperability in spite of the small redundancy:
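A sketch of the redundant encoding for a verb, stating the part of speech both as a type and as a property value. The entry :marry and the default namespace are hypothetical, and the lexinfo namespace IRI is an assumption:

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .   # assumed namespace IRI
@prefix : <http://www.example.com/lexicon#> .                        # hypothetical namespace

# Part of speech given twice: as an rdf:type and as a partOfSpeech value
:marry a ontolex:LexicalEntry, lexinfo:Verb ;
    lexinfo:partOfSpeech lexinfo:verb .
```

Consumers that only inspect rdf:type and consumers that only inspect lexinfo:partOfSpeech can then both recover the category.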
Pragmatic aspects related to the usage of a lexical entry as well as the paradigmatic relationships between lexical entries can also be described using the lemon model by resorting to some external vocabulary. As for the case of the description of the morphosyntactic properties of lexical entries and their forms, lemon does not prescribe any vocabulary but encourages the use of external vocabularies to describe aspects related to the temporal use of a lexical entry, e.g. to indicate whether the use of the lexical entry is modern or anachronic or to specify lexico-semantic relationships between lexical senses. Examples of such paradigmatic or lexico-semantic relationships are: synonymy, antonymy, holonymy, hypernymy, meronymy, etc.
When describing syntactic frames it is important to specify the grammatical role or function played by different syntactic arguments. We might want to specify, for instance, which argument plays the grammatical role of subject and which argument plays the role of a direct object, etc. LexInfo distinguishes the following types of arguments:
Each argument type is associated with a specific property that links the frame to the object representing the syntactic argument and thereby indicates its grammatical role.
:father a lexinfo:Noun ;
synsem:synBehavior :father_frame.
:father_frame a lexinfo:NounPredicateFrame ;
rdfs:label "X is the father of Y" , "X is Y's father" ;
lexinfo:copulativeArg :father_frame_arg1 ;
lexinfo:possessiveAdjunct :father_frame_arg2 .
:father_frame_arg1 a lexinfo:CopulativeArg .
:father_frame_arg2 a lexinfo:PossessiveAdjunct .
Syntactic or subcategorization frames describe which syntactic arguments a certain lexical entry (verb, noun etc.) requires to be complete. A verb that requires a subject and a direct object is called a transitive verb. The corresponding frame that generalizes across particular verbs is called a transitive frame or transitive construction (in construction grammar theories).
In lexinfo, frames can be axiomatized by describing which type of arguments they subcategorize. A transitive frame would be axiomatized as follows in lexinfo:
TransitiveFrame ≡ VerbFrame ⊓ (=1 subject ⊓ =1 directObject)
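To make the axiom concrete, the following sketch shows a verb entry whose frame has exactly one subject and one direct object. The entry, frame and argument names and the default namespace are hypothetical; the lexinfo namespace URI is assumed:

```turtle
@prefix : <http://example.org/lexicon#> .   # hypothetical namespace for this sketch
@prefix synsem: <http://www.w3.org/ns/lemon/synsem#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .

:own a lexinfo:Verb ;
    synsem:synBehavior :own_frame .

# One subject and one direct object, as the axiom for TransitiveFrame requires
:own_frame a lexinfo:TransitiveFrame ;
    lexinfo:subject :own_subj ;
    lexinfo:directObject :own_obj .

:own_subj a synsem:SyntacticArgument .
:own_obj a synsem:SyntacticArgument .
```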
In addition, it is possible to define other properties in an external resource that may be difficult to translate across resources. An example of such a property is translation confidence, as shown below:
:bench a ontolex:LexicalEntry ;
ontolex:lexicalForm [ ontolex:writtenRep "bench"@en].
:bench-sense a ontolex:LexicalSense ;
ontolex:isSenseOf :bench .
:banco a ontolex:LexicalEntry ;
ontolex:lexicalForm [ ontolex:writtenRep "banco"@es].
:banco-sense a ontolex:LexicalSense ;
ontolex:isSenseOf :banco .
:tranSetEN-ES a vartrans:TranslationSet ;
dc:source <http://hdl.handle.net/10230/17110> ;
vartrans:trans :bench_banco-trans .
:bench_banco-trans a vartrans:Translation ;
vartrans:source :bench-sense ;
vartrans:target :banco-sense .
:tranSetEN-ES a prov:Entity .
:bench_banco-trans a prov:Entity .
:humanTranslationActivity a prov:Activity .
:executionOfMyAlgorithm a prov:Activity .
:bench_banco-trans prov:qualifiedGeneration [
a prov:Generation ;
prov:activity :humanTranslationActivity ;
lexinfo:translationConfidence 1.0 ;
] .
:bench_banco-trans prov:qualifiedGeneration [
a prov:Generation ;
prov:activity :executionOfMyAlgorithm ;
lexinfo:translationConfidence 0.3 ;
] .
Lexical nets, so-called wordnets in particular, are an important type of lexical resource used very often in natural language processing applications. Lexical nets organize the senses of words into groups of equivalent meaning, so-called synsets. Further, synsets are related to each other by lexico-semantic relationships, so that the resource can be regarded as a "net". We discuss below how lexical nets can be represented with the lemon vocabulary, using Princeton WordNet as an example.
As mentioned above, lexical nets indicate the different lexical senses that a word has and group these senses into sets of equivalent senses (so-called synsets). Below we state how the main entities of a lexical net (words, lemmas, senses and synsets) can be represented in lemon:
Lexico-semantic relations should be represented between lexical concepts. The WordNet-RDF ontology defines some of these lexico-semantic relations:
http://globalwordnet.github.io/schemas/wn#
A full description of the Global WordNet Association extension of OntoLex-lemon is available here, and an example of modelling is given here:
<#example-en> a lime:Lexicon ;
rdfs:label "Example wordnet (English)"@en ;
dc:language "en" ;
schema:email "john@mccr.ae" ;
cc:license <https://creativecommons.org/publicdomain/zero/1.0/> ;
owl:versionInfo "1.0" ;
schema:citation "CILI: the Collaborative Interlingual Index. Francis Bond, Piek Vossen, John P. McCrae and Christiane Fellbaum, Proceedings of the Global WordNet Conference 2016, (2016)." ;
schema:url "http://globalwordnet.github.io/schemas/" ;
dc:publisher "Global Wordnet Association" ;
lime:entry <#w1>, <#w2>, <#w3> .
<#w1> a ontolex:LexicalEntry ;
ontolex:canonicalForm [
ontolex:writtenRep "grandfather"@en
] ;
wn:partOfSpeech wn:noun ;
ontolex:sense <#example-10161911-n-1> .
<#example-10161911-n-1> a ontolex:LexicalSense ;
ontolex:reference <#example-10161911-n> .
<#w2> a ontolex:LexicalEntry ;
ontolex:canonicalForm [
ontolex:writtenRep "paternal grandfather"@en
] ;
wn:partOfSpeech wn:noun ;
ontolex:sense <#example-1-n-1> .
<#example-1-n-1> a ontolex:LexicalSense ;
ontolex:reference <#example-1-n> .
[] a vartrans:SenseRelation ;
vartrans:source <#example-1-n-1> ;
vartrans:category wn:derivation ;
vartrans:target <#example-10161911-n-1> ;
dc:creator "John McCrae"@en .
<#w3> a ontolex:LexicalEntry ;
ontolex:canonicalForm [
ontolex:writtenRep "pay"@en
] ;
wn:partOfSpeech wn:verb ;
synsem:synBehavior [
rdfs:label "Sam cannot %s Sue"@en
], [
rdfs:label "Sam and Sue %s"@en
], [
rdfs:label "The banks %s the check"@en
] .
<#example-10161911-n> a ontolex:LexicalConcept ;
skos:inScheme <#example-en> ;
wn:ili ili:i90287 ;
wn:definition [
rdf:value "the father of your father or mother"@en
] .
[]
vartrans:source <#example-10161911-n> ;
vartrans:category wn:hypernym ;
vartrans:target <#example-10162692-n> .
<#example-1-n> a ontolex:LexicalConcept ;
skos:inScheme <#example-en> ;
wn:definition [
rdf:value "the father of your father or mother"@en
] ;
wn:iliDefinition [
rdf:value "the father of your father or mother"@en ;
dc:source "https://en.wiktionary.org/wiki/farfar"
] .
[]
vartrans:source <#example-1-n> ;
vartrans:category wn:hypernym ;
vartrans:target <#example-10162692-n> .
<#example-sv> a lime:Lexicon ;
rdfs:label "Example wordnet (Swedish)"@sv ;
dc:language "sv" ;
schema:email "john@mccr.ae" ;
cc:license <https://creativecommons.org/publicdomain/zero/1.0/> ;
owl:versionInfo "1.0" ;
schema:citation "CILI: the Collaborative Interlingual Index. Francis Bond, Piek Vossen, John P. McCrae and Christiane Fellbaum, Proceedings of the Global WordNet Conference 2016, (2016)." ;
schema:url "http://globalwordnet.github.io/schemas" ;
dc:publisher "Global Wordnet Association" ;
lime:entry <#w4> .
<#w4> a ontolex:LexicalEntry ;
ontolex:canonicalForm [
ontolex:writtenRep "farfar"@sv
] ;
ontolex:otherForm [
ontolex:writtenRep "farfäder"@sv ;
wn:tag "NNS"
] ;
wn:partOfSpeech wn:noun ;
ontolex:sense <#example-2-n-1> .
<#example-2-n-1> a ontolex:LexicalSense ;
ontolex:reference <#example-1-n> ;
wn:example [
rdf:value "Jag vill berätta för er att min farfar var svensk beredskapssoldat vid norska gränsen under andra världskriget, ett krig som Sverige stod utanför"@sv ;
dc:source "Europarl Corpus"
] .
In this section, we informally clarify the relation to other models, in particular SKOS, the Lexical Markup Framework (LMF), and the Open Annotation standard.
SKOS is a vocabulary used to represent so-called knowledge organization systems (KOS), comprising taxonomies, classification schemes, thesauri etc. SKOS thus addresses an orthogonal use case to lemon. lemon was designed to provide detailed information about the linguistic grounding of an ontological vocabulary, specifying in particular by which lexical entries a class or property can be verbalized. SKOS has only a very rudimentary way of doing this, namely by means of the SKOS label properties (prefLabel, altLabel and hiddenLabel). This is by no means a criticism of SKOS; it merely makes clear that SKOS and lemon have been designed with different purposes and use cases in mind.
Nevertheless, SKOS and lemon can be used in conjunction to provide more detailed information about the "labels". The use case we address is one where a thesaurus or other taxonomic resource or classification system in SKOS needs to be enriched with more detailed linguistic information. We recommend using the property evokes and its inverse isEvokedBy to relate a skos:Concept to a lexical entry, as shown in the following example:
:financial_assets a skos:Concept;
ontolex:isEvokedBy :financial_assets_lex.
:financial_assets_lex a ontolex:LexicalEntry;
ontolex:evokes :financial_assets;
ontolex:canonicalForm :financial_assets_form.
:financial_assets_form ontolex:writtenRep "financial assets"@en.
The above represents the recommended way of linking a SKOS concept to a lexical entry in the lexicon ontology model.
To show how to make statements about preferred lexicalizations akin to the properties prefLabel, altLabel and hiddenLabel as used in SKOS, the following example shows how to attach such preference information via the lexical senses:
:tuberculosis a skos:Concept;
ontolex:isEvokedBy :tuberculosis_lex;
ontolex:isEvokedBy :consumption_lex.
:tuberculosis_lex a ontolex:LexicalEntry;
ontolex:sense :tuberculosis_sense;
ontolex:evokes :tuberculosis.
:tuberculosis_sense a ontolex:LexicalSense;
ontolex:isLexicalizedSenseOf :tuberculosis;
ontolex:usage [ rdf:value "preferred" ].
:consumption_lex a ontolex:LexicalEntry;
ontolex:sense :consumption_sense;
ontolex:evokes :tuberculosis.
:consumption_sense a ontolex:LexicalSense;
ontolex:isLexicalizedSenseOf :tuberculosis;
ontolex:usage [ rdf:value "outdated" ].
In case you are using reified labels as in SKOS-XL, it is possible to have forms or lexical entries in the range of the skosxl:prefLabel, skosxl:altLabel and skosxl:hiddenLabel properties. However, we note that from this it would follow that lexical entries and forms would be inferred to be instances of skosxl:Label, which does not correspond to the understanding of forms and lexical entries of this community as linguistic objects rather than mere 'labels'.
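The inference noted above can be illustrated with a small sketch in which a form is used as the value of skosxl:prefLabel. The concept, the form and the default namespace are hypothetical:

```turtle
@prefix : <http://example.org/lexicon#> .   # hypothetical namespace for this sketch
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .

:tuberculosis a skos:Concept ;
    skosxl:prefLabel :tuberculosis_form .

:tuberculosis_form a ontolex:Form ;
    ontolex:writtenRep "tuberculosis"@en .

# Because the range of skosxl:prefLabel is skosxl:Label, an RDFS reasoner
# would additionally infer:
#   :tuberculosis_form a skosxl:Label .
```

Linking the concept to a lexical entry via ontolex:isEvokedBy, as in the earlier examples, avoids this inference.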
The Lexical Markup Framework (LMF) (ISO-24613:2008) is a standard for representing machine-readable lexicons. The model is not suited, however, to publishing lexica on the web as linked data, as it only defines a serialization in XML rather than in RDF. Further, LMF does not address the interface between lexica and ontologies as lemon does.
Nevertheless, the lemon model draws heavy inspiration from the LMF model. lemon has imported many classes/entities from LMF and adopted its core ontology. On the other hand, lemon has added vocabulary to describe the syntax-semantics interface with respect to an ontology and removed a number of classes that create syntactic overhead. A complete description of the relationship between LMF and the original lemon model is provided here. The main differences are summarized here:
In many use cases the need arises to annotate a text corpus with links to entities defined in a lexicon, e.g. lexical entries, forms, lexical senses, lexical concepts etc. lemon does not support this annotation per se, as there are other models that are dedicated exactly to this. This is the case for the Open Annotation standard. In both models an element of the lexicon may be the target of an annotation. This target may be a form, lexical entry, lexical sense or lexical concept, and it is important to give the class to make clear what the target of the annotation is.
We will now give an example of annotating the word "cat" occurring at character 7 in a file at the URL http://www.example.com/doc.txt, where the lemon element is given as the body of the annotation:
@prefix dctypes: <http://purl.org/dc/dcmitype/> .
@prefix oa: <http://www.w3.org/ns/oa#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
<anno> a oa:Annotation ;
oa:hasBody <cat> ;
oa:hasTarget <anno#target> .
<anno#target> a dctypes:Text ;
oa:hasSelector <http://www.example.com/doc.txt#char=7,10> .
<http://www.example.com/doc.txt#char=7,10> a oa:FragmentSelector .
<cat> a ontolex:LexicalEntry .
The following persons have contributed to the creation of this document and are gratefully acknowledged.
A lexical entry represents a unit of analysis of the lexicon that consists of a set of forms that are grammatically related and a set of base meanings that are associated with all of these forms. Thus, a lexical entry is a word, multiword expression or affix with a single part-of-speech, morphological pattern, etymology and set of senses.