Wikidata:Property proposal/SIMBAD catalog properties (used more than 1 million times)

Gaia Data Release 2 ID

Return to Wikidata:Property proposal/Natural science

   Under discussion
Description: identifier for an astronomical object in Gaia Data Release 2
Data type: External identifier
Domain: astronomical objects
Allowed values: [0-9]{18}
Example 1: BS Cnc (Q2889194) → 661284024235415808
Example 2: Gliese 450 (Q5880899) → 4031586157514097024
Example 3: TYC 3645-2080-1 (Q75838267) → 1943381923013901440
Source: Gaia Data Release 2 (Q51905050)
Planned use: migrate all P528 values qualified with P972 Q51905050 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=Gaia%20DR2%20$1
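
As a rough illustration of how a formatter URL like the one above would be used, the sketch below substitutes an identifier for the `$1` placeholder. The helper name `expand_formatter_url` is hypothetical, and percent-encoding the value before substitution is an assumption about how the link should be built, not something stated in the proposal.

```python
from urllib.parse import quote

# Formatter URL as given in the proposal; "$1" marks where the identifier goes.
FORMATTER_URL = "https://simbad.u-strasbg.fr/simbad/sim-id?Ident=Gaia%20DR2%20$1"


def expand_formatter_url(formatter: str, value: str) -> str:
    """Replace the $1 placeholder with the percent-encoded identifier.

    Hypothetical helper: encoding every reserved character is an assumption.
    """
    return formatter.replace("$1", quote(value, safe=""))


url = expand_formatter_url(FORMATTER_URL, "661284024235415808")
print(url)
```

Encoding matters mostly for identifiers that contain a sign, such as the 2MASS and SDSS designations: an unencoded `+` in a query string is commonly read as a space, so the sketch emits `%2B` instead.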

2MASS ID

   Under discussion
Description: identifier for an astronomical object in the Two Micron All Sky Survey
Data type: External identifier
Domain: astronomical objects
Allowed values: J[0-9]{8}[+-][0-9]{7}
Example 1: BS Cnc (Q2889194) → J08390909+1935327
Example 2: Gliese 450 (Q5880899) → J11510737+3516188
Example 3: TYC 3645-2080-1 (Q75838267) → J23350993+4851114
Source: 2MASS (Q1454942)
Planned use: migrate all P528 values qualified with P972 Q1454942 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=2MASS%20$1

Tycho-2 Catalogue ID

   Under discussion
Description: identifier for an astronomical object in the Tycho-2 Catalogue
Data type: External identifier
Domain: astronomical objects
Allowed values: [0-9]{1,4}-[0-9]{1,4}-1
Example 1: BS Cnc (Q2889194) → 1395-2445-1
Example 2: Gliese 450 (Q5880899) → 2526-2357-1
Example 3: TYC 3645-2080-1 (Q75838267) → 3645-2080-1
Source: The Tycho-2 catalogue of the 2.5 million brightest stars (Q2725928)
Planned use: migrate all P528 values qualified with P972 Q2725928 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=TYC%20$1

Gaia Data Release 1 ID

   Under discussion
Description: identifier for an astronomical object in Gaia Data Release 1
Data type: External identifier
Domain: astronomical objects
Allowed values: [0-9]{18}
Example 1: BS Cnc (Q2889194) → 661284019938140032
Example 2: Gliese 450 (Q5880899) → 4031586157514097024
Example 3: TYC 3645-2080-1 (Q75838267) → 1943381923012780160
Source: Gaia Data Release 1 (Q37859523)
Planned use: migrate all P528 values qualified with P972 Q37859523 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=Gaia%20DR1%20$1

SDSS object ID

   Under discussion
Description: identifier for an astronomical object in the Sloan Digital Sky Survey
Data type: External identifier
Domain: astronomical objects
Allowed values: J[0-9]{6}\.[0-9]{2}[+-][0-9]{7}\.[0-9]
Example 1: BS Cnc (Q2889194) → J083909.03+193532.4
Example 2: Gliese 450 (Q5880899) → J115106.57+351627.2
Example 3: TYC 3645-2080-1 (Q75838267) → J233509.93+485111.4
Source: Sloan Digital Sky Survey (Q840332)
Planned use: migrate all P528 values qualified with P972 Q840332 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=SDSS%20$1

OGLE-III object ID

   Under discussion
Description: identifier for an astronomical object in the Optical Gravitational Lensing Experiment
Data type: External identifier
Domain: astronomical objects
Example 1: R99 (Q22087000) → BRIGHT-LMC-MISC-429
Example 2: R85 (Q28406638) → BRIGHT-LMC-MISC-9
Example 3: SV* HV 2827 (Q74703824) → LMC-CEP-4689
Source: The Optical Gravitational Lensing Experiment. The OGLE-III catalog of variable stars. I. Classical Cepheids in the Large Magellanic Cloud (Q67054966)
Planned use: migrate all P528 values qualified with P972 Q67054966 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=OGLE%20$1
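
Because each proposal pairs an allowed-values pattern with example values, the two can be checked against each other before the properties are created. The sketch below is one way to do that. It assumes the pattern must match the whole value, which is how format constraints are usually applied. The names `PROPOSALS` and `check_examples` are made up for illustration. OGLE-III is left out because its proposal lists no pattern.

```python
import re

# Patterns and example values copied from the proposals above.
PROPOSALS = {
    "Gaia DR2": (r"[0-9]{18}",
                 ["661284024235415808", "4031586157514097024", "1943381923013901440"]),
    "2MASS": (r"J[0-9]{8}[+-][0-9]{7}",
              ["J08390909+1935327", "J11510737+3516188", "J23350993+4851114"]),
    "Tycho-2": (r"[0-9]{1,4}-[0-9]{1,4}-1",
                ["1395-2445-1", "2526-2357-1", "3645-2080-1"]),
    "Gaia DR1": (r"[0-9]{18}",
                 ["661284019938140032", "4031586157514097024", "1943381923012780160"]),
    "SDSS": (r"J[0-9]{6}\.[0-9]{2}[+-][0-9]{7}\.[0-9]",
             ["J083909.03+193532.4", "J115106.57+351627.2", "J233509.93+485111.4"]),
}


def check_examples(proposals):
    """Return {catalogue: [examples that do not fully match the pattern]}."""
    failures = {}
    for name, (pattern, examples) in proposals.items():
        compiled = re.compile(pattern)
        # Assumption: the constraint is a whole-value match, hence fullmatch.
        bad = [ex for ex in examples if compiled.fullmatch(ex) is None]
        if bad:
            failures[name] = bad
    return failures


failures = check_examples(PROPOSALS)
for name, bad in failures.items():
    print(f"{name}: pattern does not match {bad}")
```

Under the whole-value assumption, this flags the 19-digit Gaia examples against `[0-9]{18}` and the SDSS examples, whose declination part has six digits before the decimal point where the pattern asks for seven. The patterns may therefore need loosening before they are used as constraints.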

Motivation

The specific combination of catalog code (P528) qualified by catalog (P972) is used in 24 million statements, the vast majority of which are for astronomical objects. About 14 million of these statements come from six catalogues, so migrating those statements to use these properties would remove the 14 million triples taken up by the P972 qualifiers. (Another 18 catalogues have more statements than the number of statements for inventory number (P217) with qualifier collection (P195) The Palace Museum (Q2047427)—127545 as of 6 August 2024.)
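
For anyone who wants to re-check the per-catalogue counts, a query along the following lines could be used. This is a minimal sketch: the helper `build_count_query` is hypothetical, and it assumes the query is run on the Wikidata Query Service, where the `p:`, `pq:` and `wd:` prefixes are predefined. Counts in the millions may well time out there.

```python
def build_count_query(catalog_qid: str) -> str:
    """Build a SPARQL query counting P528 statements qualified with P972 = catalog_qid.

    Hypothetical helper; it only assembles the query text and does not run it.
    """
    if not (catalog_qid.startswith("Q") and catalog_qid[1:].isdigit()):
        raise ValueError(f"not a QID: {catalog_qid!r}")
    return (
        "SELECT (COUNT(?statement) AS ?count) WHERE {\n"
        "  ?item p:P528 ?statement .\n"
        f"  ?statement pq:P972 wd:{catalog_qid} .\n"
        "}"
    )


# Gaia Data Release 2 (Q51905050), the first catalogue proposed above.
query = build_count_query("Q51905050")
print(query)
```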

(This migration would be similar to the migration that took place after the properties proposed at Wikidata:Property proposal/proper motion components were created. While this page intends to handle only the six largest catalogues, if you believe there are other large catalogues whose catalog codes would do well to be migrated to properties, please say so in a comment.) Mahir256 (talk) 21:56, 6 August 2024 (UTC)
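
The planned migration can be pictured on a simplified in-memory model. Everything in this sketch is an assumption made for illustration: the `Statement` class, the single-valued qualifier dictionary, and the placeholder `P_NEW` (no property number exists yet). It also assumes the stored catalog code already has the form the new property expects. The sketch does not use the Wikidata API.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """Simplified stand-in for a Wikidata statement (hypothetical model)."""
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


def migrate(statements, catalog_qid, new_prop):
    """Move P528 statements qualified with P972 = catalog_qid to new_prop.

    The P972 qualifier is dropped because the new property implies the catalogue.
    Other qualifiers and all unrelated statements are kept as they are.
    """
    migrated = []
    for st in statements:
        if st.prop == "P528" and st.qualifiers.get("P972") == catalog_qid:
            remaining = {k: v for k, v in st.qualifiers.items() if k != "P972"}
            migrated.append(Statement(new_prop, st.value, remaining))
        else:
            migrated.append(st)
    return migrated


before = [
    Statement("P528", "661284024235415808", {"P972": "Q51905050"}),
    Statement("P528", "J08390909+1935327", {"P972": "Q1454942"}),
]
after = migrate(before, "Q51905050", "P_NEW")  # "P_NEW" is a placeholder ID
for st in after:
    print(st)
```

In this model each migrated statement loses exactly one qualifier, which corresponds to the triple saving described in the motivation.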

Discussion

@Mahir256 Is there any specific reason why we want to reduce the number of P528 statements? Ghuron (talk) 00:03, 7 August 2024 (UTC)
@Ghuron: We have dedicated external identifier properties rather than lumping them all in a single property and qualifying them, just as we have dedicated website account properties rather than always using website account on (P553) qualified with website username or ID (P554). This proposal is intended as a logical parallel of both of those decisions. Mahir256 (talk) 17:18, 12 August 2024 (UTC)
@Mahir256: Let me rephrase how I understood your rationale: if p:P528/pq:P972 wd:Q51905050 occurs more than a million times, then that is both a necessary and sufficient condition for creating a new property, since it reduces the number of triples and thus reduces the risk of Blazegraph crashing. Is that a correct summary? Ghuron (talk) 22:44, 12 August 2024 (UTC)
@Ghuron: I would not phrase it quite so absolutely, but I do want to see the number of triples reduced and believe this is a way to do it; an extremely high number of identically structured uses of a generic identification property like catalog code (P528) with the same qualifiers suggests that a more specialized identifier property is worth introducing to streamline things, just as has been done multiple times before. Mahir256 (talk) 16:50, 13 August 2024 (UTC)
As Ghuron asked, is there any reason why we need to reduce the number of P528 statements? In the first place, there are millions of Gaia IDs because of the import of the SIMBAD database (I am NOT against this import, btw).
Also, I wonder why only some catalogues would have their own properties. This will create a weird in-between state, with some catalogues in P528 and others having their own properties. This makes no sense imo.
Romuald 2 (talk) 15:31, 8 August 2024 (UTC)
  • There is nothing wrong with having separate external ID properties for the most-used identifiers, with the correct formatter URL.
    But I have 2 major objections:
  1. I don't see any reason to use https://simbad.u-strasbg.fr/simbad/sim-id?Ident= as the URL. For items that are on SIMBAD, we already have Property:P3083 with the link to SIMBAD. For the rare items that are not on SIMBAD, this link will result in a 404.
  2. With (1) in mind, it would make sense to link to genuinely useful external sources that are only partially synchronized with SIMBAD (like HyperLEDA or the Gaia Archive). That leads to questions about the proposed set of properties:
    1. Why did we choose Gaia DR2, when its IDs are only temporary and the permanent ones are Gaia DR3?
    2. Why did we choose Tycho-2, when it is pretty much 100% imported into SIMBAD?
Ghuron (talk) 12:52, 9 August 2024 (UTC)
  • @Romuald 2: Reducing the number of RDF triples that Wikidata consists of is generally a good thing: there is a lot of discussion going on about the health of the Query Service, and much of it concerns reducing the number of triples that a single running Blazegraph instance holds. Also, I had noted that there were 18 other catalogs with more entries than the most frequent inventory number source; I left them off this page only because it would have become too long. If these six go through, then I will promptly propose properties for those 18 (and as I stated in the motivation above, if you believe there are other large catalogues whose catalog codes would do well to be migrated to properties, please say so in a comment). Mahir256 (talk) 17:18, 12 August 2024 (UTC)
    @Ghuron: The reason I selected the SIMBAD formatter URL is that the external IDs I tried with that URL all seemed to resolve to the right objects; if there are in fact objects for which this resolution doesn't work, it would be great if you could name some. The caveat "(used more than 1 million times)" in the title of this property proposal page is important; because your imports did not yield more than 1 million Gaia DR3 identifiers, I did not think to propose a property for it here, though I'd gladly support one for Gaia DR3 if you think it would be useful. I don't know who "we" is as regards either Gaia DR2 or Tycho-2; you're the one who mass-imported the objects, so I'm working with the catalog codes I see on those objects. Mahir256 (talk) 17:18, 12 August 2024 (UTC)
    @Ghuron and @GZWDer, would you like to give your opinions? Regards, ZI Jony (Talk) 18:37, 16 September 2024 (UTC)
    I view external identifiers somewhat differently than @Mahir256. In my understanding, a new external identifier is needed when it provides a link to a new, previously unrelated external data source. In the proposed cases, we are getting a connection to the same SIMBAD that we are already connected to via Property:P3083. Personally, the proposed identifiers will not bring me any value (nor will they cause any harm).
    I understand the idea that this will reduce the number of triples, but I think that the measly few million that we are discussing here are a drop in the ocean. Our goal is to upload data to Wikidata, not to optimize it in a way that makes life easier for the foundation's engineers. Let them do their job and we will do ours. Ghuron (talk) 19:00, 16 September 2024 (UTC)