Property talk:P4390

From Wikidata

Documentation

mapping relation type
qualifier to more precisely define the relation of the item to the external identifier using SKOS
Represents: SKOS generic mapping relation (Q60817979)
Data type: Item
Domain: statements of properties of type external identifier (Q21754218) (note: this should be moved to the property statements)
Allowed values: close match (Q39893184), exact match (Q39893449), related match (Q39894604), broad match (Q39894595) or narrow match (Q39893967)
Usage notes: Only use this to qualify identifiers linked to the closest matching wikidata item. See Discussion page for details.
Example:
  overseas countries and territories (Q1451600): close match (Q39893184)
  Lake Constance Region (Q34947397): exact match (Q39893449)
  assessment centre (Q265558): related match (Q39894604)
Source:
  According to this template: Simple Knowledge Organization System (Q2288360) (see formal and extended definition of mapping relations)
  According to statements in the property: https://www.w3.org/TR/skos-primer/#secmapping
  When possible, data should only be stored as statements
See also: exact match (P2888), narrower external class (P3950), broader concept (P4900)
Lists
Proposal discussion
Current uses:
  Total: 229,414
  Qualifier: 229,398 (>99.9% of uses)
  Reference: 16 (<0.1% of uses)
Explanations

When to use "mapping relation" qualifiers?

External identifiers express an "equivalence relationship" between the item and an entity in an external database. In the great majority of cases (most prominently, persons) this is self-evident and sufficient. However, some vocabularies - for example, thesauri from the field of social sciences - may comprise concepts where more specific relationships are needed, because the external concept is not an exact match for the Wikidata item. For such cases, it is possible to qualify the relation further by using the "mapping relation" qualifier.

The allowed values stem from the Simple Knowledge Organization System (Q2288360) (SKOS) mapping relations. They are:

  • exact match indicates that two concepts have equivalent meaning, and the link can be exploited across a wide range of applications and schemes. The link is meant to be transitive (A = B and B = C means A = C).
  • close match indicates that two concepts are sufficiently similar that they can be used interchangeably in many applications. This link is not meant to be transitive.
  • narrow match indicates that one concept is narrower than the other (for the representation of hierarchical links). The link is not meant to be transitive.
  • broad match indicates that one concept is broader than the other (inverse of narrowMatch). The link is not meant to be transitive.
  • related match indicates a non-hierarchical associative relationship between two concepts. The link is not meant to be transitive.
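
The transitivity distinction in the list above can be illustrated with a minimal Python sketch. It is not part of the property documentation: the `MappingRelation` enum and the `exact_match_clusters` helper are hypothetical names, and the concept labels in the usage example are placeholders. Only exact-match links are merged into clusters, since the list describes only that relation as transitive.

```python
from enum import Enum


class MappingRelation(Enum):
    """Allowed values of P4390, keyed by the item IDs given in the documentation."""
    EXACT_MATCH = "Q39893449"
    CLOSE_MATCH = "Q39893184"
    BROAD_MATCH = "Q39894595"
    NARROW_MATCH = "Q39893967"
    RELATED_MATCH = "Q39894604"


def exact_match_clusters(mappings):
    """Group concepts that are connected by exact-match links.

    `mappings` is an iterable of (concept_a, relation, concept_b) triples.
    Only EXACT_MATCH links are followed (A = B and B = C means A = C);
    every other relation leaves the two concepts in separate clusters.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, relation, b in mappings:
        root_a, root_b = find(a), find(b)
        if relation is MappingRelation.EXACT_MATCH:
            parent[root_a] = root_b

    clusters = {}
    for concept in list(parent):
        clusters.setdefault(find(concept), set()).add(concept)
    return sorted(sorted(cluster) for cluster in clusters.values())
```

For example, with the placeholder concepts A to D, `exact_match_clusters([("A", MappingRelation.EXACT_MATCH, "B"), ("B", MappingRelation.EXACT_MATCH, "C"), ("C", MappingRelation.CLOSE_MATCH, "D")])` returns `[["A", "B", "C"], ["D"]]`: the close match to D is not propagated.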

These relations are introduced to allow more complete and more precise mappings to and from external knowledge organization systems. They are not intended for adding random relationships to external entities. If an exactly matching item exists, almost always no other relation to the external id is necessary. If no exactly matching item exists, the external identifier should link to the closest matching Wikidata item, not to more distantly related items. For example, "Germany" "narrow match" "id-of-Berlin" or "Barack Obama" "close match" "id-of-Michelle Obama" are not appropriate.

Definitions of external id properties may enforce or proscribe the use of the "mapping relation" qualifier, by setting up a mandatory qualifier or allowed qualifiers constraint. Before adding qualifiers for individual values of a well established external id property, it may be advisable to discuss their use on the talk page of the property.

Scope is as qualifier (Q54828449): the property must be used in the specified way only (Help)
List of violations of this constraint: Database reports/Constraint violations/P4390#Scope, hourly updated report, SPARQL
Allowed entity types are Wikibase item (Q29934200), Wikibase lexeme (Q51885771): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4390#Entity types

Help with usage instructions

@Jura1, ArthurPSmith: I've added the proposed usage instructions from the property proposal as a formal help page, and added a hint in the English description of the property. I hope that helps mitigate the concerns you expressed in the discussion about the property. I'd be very happy to learn how these instructions could be improved. Jneubert (talk) 06:22, 19 October 2017 (UTC)

Descriptions should be for people seeing the property in statements; we generally put more detailed instructions, appropriate for people using a property, in "wikidata usage instructions". I rearranged that, also added the various statements this property should have and constraints, and trimmed the documentation. Hopefully everything looks about right now? ArthurPSmith (talk) 12:42, 19 October 2017 (UTC)
@ArthurPSmith: Thank you very much! I'm fine with the changes. The examples got truncated somehow (missing STW IDs) and perhaps should be removed altogether, because they are on the property page. I have added a source website for the property (P1896) property there, but would suggest leaving it on the documentation page, too. (I somehow did not find a way to change the documentation text and remove the note, and don't understand which parts of the text are generated from which sources. If there is a help page on that, it would be nice if you could provide a link.) Thanks again, Jneubert (talk) 13:59, 19 October 2017 (UTC)
See Template:Property documentation - unfortunately there's a bug in it for wikidata property examples with qualifiers, this is how it looks, but the examples display reasonably clearly on the main property page so I think they should be left as is. ArthurPSmith (talk) 14:02, 19 October 2017 (UTC)
Thanks - so we live with it. I've again removed source website for the property (P1896), which I had inserted earlier, because it messes up the documentation page and is not machine-actionable anyway. Jneubert (talk) 16:50, 19 October 2017 (UTC)
  • Quote: "These relations are introduced to allow more complete and more precise mappings to and from external knowledge organization systems. They not intended for adding random relationships to external entities. E.g., "Germany" "narrow match" "id-of-Berlin" or "Barak Obama" "close match" "id-of-Michelle Obama" are not appropriate."
The sample seems too obvious as both items are likely to exist, but I don't think "random relationship" describes the general problem:
  • Wikidata grows (new items are created and I don't see why one should search for identifiers someone somehow attached to a related item)
  • it can be that an existing exact-match item is described in a language one can't read and even when searching for it one won't find it (so adding such a partial match despite a better existing item is bound to happen)
  • an exact match exists and is well described, but somehow people missed it.
To ensure this doesn't get carried away, one needs to monitor partial matches and periodically check to find better matches.
--- Jura 13:04, 19 October 2017 (UTC)
Well, if I remember correctly, the first example was introduced by you :) Anyway, I'd agree that they are probably all-too-obvious - could you help phrase it better?
Your description of the general problem with mappings and vocabulary evolution is absolutely correct and in my eyes deserves its own paragraph (plus recommendations for processes and the development of tools to support them). These are problems for every vocabulary mapping, and normally handled poorly (for sure you could find lots of examples in existing STW mappings). Wikidata is in a unique position here because:
  1. It is much broader than any single vocabulary, so the overlap is from the start much higher than, e.g., between Agrovoc and STW.
  2. When it makes sense for both sides - for Wikidata and for the purpose of the mapping -, new items can be created easily (in particular if a concept is defined by more than a single external identifier). With traditional knowledge organization systems maintained under the responsibility of different institutions this is completely unimaginable.
  3. Wikidata is open and makes it easy to develop tools for mapping maintenance (in particular for periodical reviews of partial matches - here a first simplistic approach) and to make them re-usable and improvable for others who herd their own vocabulary.
These are the reasons why I'm advocating using Wikidata as a linking hub, and it is very much in my interest to have this framed in a way which does not break things, and to provide tools to encourage a sensible use. Jneubert (talk) 15:06, 19 October 2017 (UTC)
BTW, the first entry in the list linked above ('personnel selection') perfectly illustrates your point. Please leave it untouched for now, I want to discuss it with my colleagues. Jneubert (talk) 15:08, 19 October 2017 (UTC)
  • That some other vocabulary may have the same problem doesn't really help us solve it for Wikidata, especially since Wikidata doesn't have the luxury of a single editor who updates everything. Maintenance needs to be done here and now and not in some distant future by some yet-to-be-developed tool.
    --- Jura 04:04, 20 October 2017 (UTC)
  • It looks like it was sufficient to move the sample two lines further down.
    --- Jura 04:08, 20 October 2017 (UTC)

Populating an external-id property with a default mapping relation type

For the existing ~400 values of STW Thesaurus for Economics ID (P3911), I have built and applied a query which creates qualifier insert statements for QuickStatements2. It does not create any statements if a qualifier is already defined for the property, and worked fine for the purpose, but is otherwise not tested. A query for a list of mappings with labels from both sides and the mapping relation is also available (configured for STW, too). Jneubert (talk) 16:40, 19 October 2017 (UTC)

And of course, don't use this or a similar script without prior consensus for properties with a broader usage. Jneubert (talk) 16:42, 19 October 2017 (UTC)
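
The idea of adding a default qualifier only where none exists yet can be sketched in Python as follows. This is not the query linked above, which is not reproduced on this page; it is a minimal illustration that assumes the statements have already been fetched into dictionaries with the hypothetical keys `item`, `value` and `relation`. It also assumes the tab-separated QuickStatements V1 command layout (item, property, quoted external-id value, qualifier property, qualifier value); check the QuickStatements help page before relying on it. The external ID used in the usage example is a placeholder, not a real STW ID.

```python
EXACT_MATCH = "Q39893449"  # exact match, one of the allowed P4390 values


def default_qualifier_commands(statements, prop="P3911", default=EXACT_MATCH):
    """Build command lines that add a default P4390 qualifier.

    `statements` is an iterable of dicts with the hypothetical keys
    "item" (QID), "value" (external ID string) and "relation"
    (QID of an existing P4390 qualifier, or None).
    Statements that already carry a qualifier are skipped, so existing
    mapping relations are never overwritten.
    """
    lines = []
    for st in statements:
        if st.get("relation"):
            continue  # already qualified, leave untouched
        fields = [st["item"], prop, '"' + st["value"] + '"', "P4390", default]
        lines.append("\t".join(fields))
    return lines
```

Calling `default_qualifier_commands([{"item": "Q265558", "value": "00000-0", "relation": None}])` yields a single tab-separated line for that statement, while a statement whose `relation` is already set produces no output. The caveat above still applies: such a batch should only be run after consensus on the property in question.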

Don't use without prior agreement

Jneubert: Why did you revert?
--- Jura 09:47, 20 October 2017 (UTC)

I was just explaining - see below. Jneubert (talk) 09:53, 20 October 2017 (UTC)

Property description

Hi @Jura1: I've reverted your edit of the property description which would require that the qualifier is used only when it is listed as "allowed qualifier" for the corresponding property. I reverted it for two reasons:

  1. The rule cannot be enforced without much larger side effects, because
  2. Please let's discuss and seek consensus before making such far-reaching changes. I'm very open to such a discussion. (Indeed I already had, following your suggestion in the property proposal, added a hint to the property description, which was rephrased and moved to the usage instructions by @ArthurPSmith: during his streamlining of the property.) So let's discuss what is appropriate.

-- Jneubert (talk) 09:50, 20 October 2017 (UTC)

  • So it's mandatory or allowed qualifier: we can complete that. There is nothing far-reaching. I think the help you designed mentions it shouldn't be applied to random properties.
    --- Jura 09:55, 20 October 2017 (UTC)
  • Problem is, adding an "allowed qualifiers" constraint to a property would require consensus about any qualifier which already is or in future should be used on that property. This would mean that allowing the use of one qualifier would at the same time require a very strict regime over all qualifiers. Jneubert (talk) 10:03, 20 October 2017 (UTC)
  • While I think the constraint of limiting the values of mapping relation type (P4390) to the SKOS values makes sense, I don't see a good reason to generally limit the qualifier to be only used for specific external IDs. It's useful when there's the ability of people to express information about how well a qualifier matches in the specific cases where they feel a need to do so. It's better to be inclusive than exclusive. ChristianKl (talk) 10:06, 20 October 2017 (UTC)
  • @Jneubert: it just needs to be listed beforehand; the list can be expanded: there is nothing fixed about it. Obviously, given your approach with P3911 maybe it should actually be a mandatory qualifier only.
    --- Jura 10:17, 20 October 2017 (UTC)
  • "needs to be listed before" means just the strict regime I wouldn't want to apply to the use of all qualifiers, just in order to allow the use of one qualifier. Making the qualifier mandatory is easy for new or sparsely populated properties. For larger ones, it may be a longer transition, even if there is perfect agreement that it makes sense. Jneubert (talk) 11:05, 20 October 2017 (UTC)
  • My impression from the use of qualifiers in Wikidata is that it is not very regulated by now. I wouldn't disagree that some regulation may make sense (particularly if we can see that things move in a wrong direction), but on the other hand, it could be more encouraging to give some usage instructions on the help page for a start and see how things evolve, how the qualifier is discussed and used by different people on different properties (as I understand, we can monitor the usage via the "Properties using this qualifier" report linked from the documentation page). Jneubert (talk) 11:05, 20 October 2017 (UTC)
  • I suppose that discussions on qualifier usage have taken place on several properties. It would be great if someone could add hints. (PS. I'm out of office now and don't know if I will be able to follow this thread during the weekend - so please don't be surprised by delayed responses.) Jneubert (talk) 11:13, 20 October 2017 (UTC)
  • Somehow the creation discussion failed to get a proper conclusion. We just got a smallish creation notice, but it seems that in the last version you proposed, exact match or another should be used on all statements being qualified. Accordingly, I think we should update the help and mention it in the description. It's odd that this hasn't been clarified before. Sorry for the inconvenience this creation process may have been for you.
    --- Jura 05:51, 21 October 2017 (UTC)
  • Hmm - I'd agree that applying this qualifier to all values of a property - and enforcing that with a mandatory constraint - would be the cleanest solution. However, I don't think this should be the only possible pattern. There are almost 2,000 external id properties, some with hundreds of thousands of uses. Some "property communities" will embrace the qualifier (hopefully), some will oppose its use, most will only think about it if somebody sees value in it and starts a discussion, and perhaps will just let it happen and see how it works out. So I think "discuss on the property page before you use it" is the most important message to people who stumble upon the qualifier and consider it useful for a certain isolated assignment. So I think we should point to the help page and explain there, but this seems to be not so common in property descriptions. Do you have examples where such hints are given in other property descriptions?
    On the other hand, when a discussion has taken place, it would be good if property communities can send a strong signal that this qualifier should not be used with "their" property. There seems to be nothing like a "not allowed qualifiers" constraint, but perhaps this could be useful in other cases, too, where certain particular qualifiers may be considered harmful. However, I have no idea what the process is for introducing new constraints (and I'm not sure either, if the use case is common enough to justify a new constraint). Jneubert (talk) 15:23, 23 October 2017 (UTC)

Reports for the maintenance of vocabulary mappings

For vocabulary mappings to Wikidata which use the mapping relation type (P4390) qualifier, the regular constraint reports do not cover the various occurring cases well. The "single value" and "different values" constraint reports, in particular, imply 1:1 relationships, while with qualified relations, n:m relationships may occur and deserve special attention. Furthermore, inexact relations should be re-evaluated from time to time in order to look for better matching Wikidata items which have been added in the meantime.

Meaningful reports have to include labels and thus require information both from WDQS and a SPARQL endpoint for the linked external vocabulary. Federated queries from WDQS are limited to a list of named endpoints. Therefore, a general solution currently seems not to be achievable. For a particular property with mandatory P4390 qualifier, STW Thesaurus for Economics ID (P3911), there are examples of maintenance reports, using a custom endpoint.

-- Jneubert (talk) 09:23, 7 November 2017 (UTC)
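
The kind of report described above can be approximated offline once the qualified statements have been exported. The following Python sketch is only an illustration of the two checks mentioned (n:m relationships and inexact relations due for review); it is not one of the linked STW maintenance reports. The function name `mapping_report`, the tuple layout and the identifiers in the usage example are assumptions, and labels from the external vocabulary are left out because they would require its own SPARQL endpoint.

```python
from collections import defaultdict

EXACT_MATCH = "Q39893449"  # exact match, one of the allowed P4390 values


def mapping_report(mappings):
    """Summarise qualified mappings for manual review.

    `mappings` is an iterable of (item, external_id, relation) tuples,
    where `relation` is the QID of the P4390 qualifier value.
    """
    ids_per_item = defaultdict(set)
    items_per_id = defaultdict(set)
    inexact = []
    for item, ext_id, relation in mappings:
        ids_per_item[item].add(ext_id)
        items_per_id[ext_id].add(item)
        if relation != EXACT_MATCH:
            # Candidates for a periodic check against newly created items.
            inexact.append((item, ext_id, relation))
    return {
        # One Wikidata item linked to several external concepts.
        "items_with_several_ids": {
            item: sorted(ids) for item, ids in ids_per_item.items() if len(ids) > 1
        },
        # One external concept linked from several Wikidata items.
        "ids_on_several_items": {
            ext: sorted(items) for ext, items in items_per_id.items() if len(items) > 1
        },
        "inexact_to_review": sorted(inexact),
    }
```

With the placeholder input `[("Q1", "x1", "Q39893449"), ("Q1", "x2", "Q39893967"), ("Q2", "x2", "Q39893184")]`, the report lists Q1 as an item with several IDs, x2 as an ID used on several items, and the two non-exact statements as review candidates.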

Add to Mix-n-Match

I've asked Magnus to add this in Mix-n-Match: https://www.wikidata.org/wiki/Topic:U95t8yw7r7i5r1n9 --Vladimir Alexiev (talk) 15:57, 11 March 2018 (UTC)

There has already been some exchange with Magnus about this in the property proposal discussion. Jneubert (talk) 08:44, 12 March 2018 (UTC)