Shortcuts: WD:RFBOT, WD:BRFA, WD:RFP/BOT
Wikidata:Requests for permissions/Bot
{{Wikidata:Requests for permissions/Bot/RobotName}}
Old requests go to the archive. Once consensus is obtained in favor of granting the bot flag, please post requests at the bureaucrats' noticeboard.
Contents
- 1 Andrebot 2
- 2 CarbonBot
- 3 QichwaBot
- 4 ThesaurusLinguaeAegyptiaeBot
- 5 Leaderbot
- 6 ZLBot
- 7 UmisBot
- 8 DannyS712 bot
- 9 TapuriaBot
- 10 IliasChoumaniBot
- 11 Browse9ja bot
- 12 OpeninfoBot
- 13 MidleadingBot 5
- 14 So9qBot 9
- 15 So9qBot 8
- 16 HVSH-Bot
- 17 RudolfoBot
- 18 GamerProfilesBot
- 19 MangadexBot
- 20 WingUCTBOT
- 21 MajavahBot
- 22 FromCrossrefBot 1: Publication dates
- 23 UrbanBot
- 24 ACMIsyncbot
- 25 WikiRankBot
- 26 ForgesBot
- 27 IngeniousBot 3
- 28 LucaDrBiondi@Biondibot
- 29 Kalliope 7.3
- 30 DL2204bot 2
- 31 Botcrux 11
- 32 Cewbot 5
- 33 Mr Robot
- 34 RobertgarrigosBOT
- 35 YSObot
- 36 AradglBot
- 37 PodcastBot
Bot Name | Request created | Last editor | Last edited |
---|---|---|---|
Andrebot 2 | 2024-10-28, 13:06:20 | Ymblanter | 2024-11-09, 18:19:27 |
CarbonBot | 2024-10-16, 18:41:08 | Ymblanter | 2024-11-03, 07:46:31 |
QichwaBot | 2024-09-25, 17:03:35 | Wüstenspringmaus | 2024-10-02, 13:06:08 |
ThesaurusLinguaeAegyptiaeBot | 2024-09-17, 10:03:37 | Lhoussine AIT TAYFST | 2024-09-17, 15:28:31 |
Leaderbot | 2024-08-21, 18:17:53 | Lymantria | 2024-09-10, 17:29:51 |
ZLBot | 2024-08-03, 12:45:33 | Wüstenspringmaus | 2024-08-28, 15:26:21 |
UmisBot | 2024-07-25, 16:44:40 | Ymblanter | 2024-08-16, 20:25:14 |
DannyS712 bot | 2024-07-21, 03:09:22 | Ymblanter | 2024-07-26, 04:29:22 |
TapuriaBot | 2024-06-03, 16:18:28 | BrokenSegue | 2024-06-07, 15:31:46 |
IliasChoumaniBot | 2024-06-03, 10:16:37 | IliasChoumaniBot | 2024-07-18, 11:01:28 |
Browse9ja bot | 2024-05-16, 02:16:04 | Browse9ja bot | 2024-05-25, 13:12:09 |
OpeninfoBot | 2024-04-16, 11:14:27 | Ymblanter | 2024-05-09, 19:22:52 |
MidleadingBot 5 | 2024-02-05, 13:04:20 | Ymblanter | 2024-11-05, 19:27:35 |
So9qBot 9 | 2024-01-05, 18:41:06 | Ymblanter | 2024-10-08, 18:41:20 |
So9qBot 8 | 2023-12-17, 15:07:59 | Samoasambia | 2024-10-28, 22:20:08 |
HVSH-Bot | 2023-12-31, 12:37:18 | So9q | 2024-01-02, 10:35:04 |
RudolfoBot | 2023-11-29, 09:29:38 | TiagoLubiana | 2023-11-30, 23:47:22 |
GamerProfilesBot | 2023-10-05, 11:06:23 | Jean-Frédéric | 2024-05-19, 07:39:50 |
MangadexBot | 2023-08-06, 18:01:17 | RPI2026F1 | 2024-01-25, 16:22:21 |
WingUCTBOT | 2023-07-31, 10:07:51 | So9q | 2024-01-02, 10:50:02 |
MajavahBot | 2023-07-11, 19:54:55 | Wüstenspringmaus | 2024-08-29, 11:05:24 |
FromCrossrefBot 1: Publication dates | 2023-07-07, 14:31:17 | Succu | 2023-11-07, 20:19:56 |
UrbanBot | 2023-06-29, 16:04:49 | Urban Versis 32 | 2023-07-15, 02:40:06 |
AcmiBot | 2023-05-16, 00:36:49 | BrokenSegue | 2023-06-22, 20:40:33 |
WikiRankBot | 2023-05-12, 03:36:56 | BrokenSegue | 2024-02-22, 15:59:51 |
ForgesBot | 2023-04-26, 09:30:12 | BrokenSegue | 2023-04-26, 17:13:55 |
IngeniousBot 3 | 2023-03-22, 16:29:58 | Ymblanter | 2023-06-23, 19:04:15 |
LucaDrBiondi@Biondibot | 2023-02-28, 18:25:03 | LucaDrBiondi | 2023-03-31, 16:10:37 |
Kalliope 7.3 | 2022-12-07, 09:16:20 | DannyS712 | 2024-06-09, 07:00:55 |
DL2204bot 2 | 2022-11-30, 11:19:21 | DannyS712 | 2024-06-09, 07:02:03 |
Botcrux 11 | 2022-11-28, 09:05:27 | Wüstenspringmaus | 2024-08-30, 09:13:32 |
Cewbot 5 | 2022-11-15, 02:20:05 | Midleading | 2024-11-04, 15:42:28 |
Mr Robot | 2022-11-04, 14:09:41 | Liridon | 2023-03-02, 13:03:34 |
RobertgarrigosBOT | 2022-10-16, 19:43:23 | Robertgarrigos | 2022-10-16, 19:43:23 |
YSObot | 2021-12-16, 11:33:29 | So9q | 2024-01-02, 10:32:27 |
AradglBot | 2022-03-14, 19:43:27 | Wüstenspringmaus | 2024-08-29, 10:55:49 |
PodcastBot | 2022-02-25, 04:38:31 | Iamcarbon | 2024-10-16, 21:26:09 |
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved--Ymblanter (talk) 18:19, 9 November 2024 (UTC)[reply]
Andrebot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Andrei Stroe (talk • contribs • logs)
Task/s: Update mayors of municipalities of Romania.
Code:
- https://github.com/rowiki/wikiro/tree/master/java/wikiro-java-elections-utils/src/main/java/org/wikipedia/ro/java/elections beginning with class WikidataMayorUpdater
using code from:
- https://github.com/rowiki/wikiro/tree/master/java/wikiro-java-elections-utils/src/main/java/org/wikipedia/ro/java/elections
- https://github.com/andreistroe/wiki-java/tree/master/src/org/wikibase
Function details: For each existing municipality, the bot checks the result of the local elections published by The Authority for Local Elections. These have already been imported in the form of a MongoDB database. If the newly elected mayor has a different name, the bot will create a new item and link it to the municipality item via head of government (P6) and to the position of mayor of the respective municipality via officeholder (P1308). The item will contain the statements: instance of (P31) = human (Q5), occupation (P106) = politician (Q82955), country of citizenship (P27) = Romania (Q218) and, of course, position held (P39) = the position for the head of government (office held by head of government (P1313) of the municipality item) of the respective municipality. If the mayor already exists, the item will be updated with a new position held (P39) qualified by start time (P580) = 1 November 2024 (when all new mayors are scheduled to take office) and elected in (P2715) = 2024 Romanian local elections (Q105494567); the gender will also be filled in when inferrable from the first name (this should be the case for almost all). Party membership information is also added to all of them. I did a test run on the municipalities of Alba County, one of the 41 counties of Romania. -- —Andreitalk 13:06, 28 October 2024 (UTC)[reply]
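The statement-building logic described in the request could be sketched roughly as follows. This is a minimal illustration only, not the bot's actual Java code; the QIDs/PIDs are those cited in the request, while the function names and the example mayor name are invented:

```python
from datetime import date

# QIDs/PIDs cited in the request above
HUMAN = "Q5"
POLITICIAN = "Q82955"
ROMANIA = "Q218"
LOCAL_ELECTIONS_2024 = "Q105494567"
TAKE_OFFICE = date(2024, 11, 1)  # when all new mayors take office

def position_held_claim(position_qid):
    """position held (P39), qualified by start time (P580) and elected in (P2715)."""
    return {
        "value": position_qid,
        "qualifiers": {
            "P580": TAKE_OFFICE.isoformat(),   # start time
            "P2715": LOCAL_ELECTIONS_2024,     # elected in
        },
    }

def new_mayor_item(name, position_qid):
    """Statements for a newly created mayor item (an existing mayor
    would instead only receive the new P39 claim)."""
    return {
        "labels": {"ro": name},
        "claims": {
            "P31": HUMAN,        # instance of: human
            "P106": POLITICIAN,  # occupation: politician
            "P27": ROMANIA,      # country of citizenship: Romania
            "P39": position_held_claim(position_qid),
        },
    }
```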
- Support Strainu (talk) 14:43, 28 October 2024 (UTC)[reply]
- Support --Sîmbotin (talk) 18:51, 30 October 2024 (UTC)[reply]
- Support --Valentin JJ. (talk) 19:36, 7 November 2024 (UTC)[reply]
- Support --Pafsanias (talk) 23:31, 7 November 2024 (UTC)[reply]
- Support --Gdaniel111 (talk) 00:18, 8 November 2024 (UTC)[reply]
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Withdrawn--Ymblanter (talk) 07:46, 3 November 2024 (UTC)[reply]
CarbonBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Iamcarbon (talk • contribs • logs)
Task/s:
1) Add default mul labels to given and family names when the item has an existing default label with a mul language.
2) Remove duplicated aliases matching the item's mul label, when the item has a native label with a mul language. As mul has not been fully adopted, only a limited number of aliases would be modified each day to ensure existing workflows are not disrupted.
I have withdrawn the proposal to delete duplicate aliases, as there are concerns that this would reduce the visibility of these items in the search rankings.
It is expected that these tasks will apply to roughly 800,000 given and family names.
Code:
Function details:
The bot runs as a console application using the new Wikidata REST API.
The application executes a query for items containing a native label that do not yet have a mul label.
--Iamcarbon (talk) 18:41, 16 October 2024 (UTC)[reply]
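The selection logic for task 1 (and the since-withdrawn task 2) could be sketched as below. This is a hypothetical outline, not CarbonBot's actual code; the function names are invented and the example labels are placeholders:

```python
def mul_label_to_add(labels):
    """Given an item's language-code -> label mapping, return the value to
    copy into the 'mul' (default) label when the item has a uniform default
    label but no 'mul' label yet; otherwise return None."""
    if "mul" in labels:
        return None  # already has a default (mul) label
    values = set(labels.values())
    if len(values) == 1:  # all per-language labels agree on one default form
        return values.pop()
    return None

def redundant_aliases(mul_label, aliases):
    """Aliases that merely duplicate the mul label (the withdrawn task 2).
    aliases: language-code -> list of alias strings."""
    return [(lang, a) for lang, alias_list in aliases.items()
            for a in alias_list if a == mul_label]
```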
- Question Will this take into account the issue of duplicate items? StarTrekker (talk) 20:50, 16 October 2024 (UTC)[reply]
- The unique constraint preventing duplicates is based on (label + description). This bot task only proposes to remove duplicate aliases - so this constraint will still prevent duplicates. Iamcarbon (talk) 21:09, 16 October 2024 (UTC)[reply]
- Support but this would probably also be useful for many websites and organizations. A problem there could be that when removing one could lose the info which languages have the default name as in some cases there could be some that do not have the default label. Some info on that would be good later. Prototyperspective (talk) 22:04, 16 October 2024 (UTC)[reply]
- Agreed that this can likely be extended to other types (e.g. editions of works, humans, and organizations) in the future. However, these will each need their own discussions to determine which rules work - and cause the least disruption - as these label sets have 10+ years of history that needs to be considered.
- The initial tasks here are limited to given and family names where there is an explicit default that is known to be mul, without historical baggage. We also aren't removing labels yet, as this impacts search rankings and removes an important check that prevents duplicate items from being created. Iamcarbon (talk) 22:35, 16 October 2024 (UTC)[reply]
- Support Previously discussed at Help talk:Default values for labels and aliases; there's consensus to remove these aliases in favor of mul. Midleading (talk) 01:58, 17 October 2024 (UTC)[reply]
I Oppose it in this form. I only support an intervention where the label and alias are the same within a language. If there is only an alias in the given language, I am convinced that it should not be removed for the time being. I also do not support intervention in non-Latin languages. I don't want to describe the expected problems from the beginning; I have already summarized them here. Unfortunately, subsequent edits ignored my comments. Pallor (talk) 06:42, 20 October 2024 (UTC)[reply]
- Some context for the opposition above: Adding labels in a given language currently gives the item a boost in search, and removing that label removes the boost.
- In languages that have a limited number of localized labels, items (particularly names) can be easily boosted above others by adding a localized or duplicate label.
- For less common names, where the search results only return a few items, removals have no impact - as all items are still returned. However, for other more popular items, these removals can make those names harder to discover. The impact varies per language, depending on how many other items have localized labels.
- We also need to consider that WMDE may change the algorithm in the future, as default labels become more popular, to provide additional weight to site links and other factors. Any short term "boosts" that we get for certain languages are likely to be nullified in the future.
- We should be working toward sustainable long term solutions that do not rely on duplication. For example, adding contextual suggestions (i.e. suggesting only family names, when adding a family name.)
- It should also be considered that while only ~0.5% of items are given and family names, they are currently responsible for nearly 10% of all labels and aliases. At 500 labels per name, this requires us to maintain 350,000,000+ additional labels. These labels have real storage and indexing costs, make the site less responsive, account for millions of unnecessary edits (and related watchlist notifications), and require significant time from the community to provide oversight.
- The default names project has been in the works for years to facilitate the removal of these labels. While we could keep them to preserve the status quo, I believe this would be a grave mistake that would postpone bigger long-term improvements.
- This will cause some short term disruption, but can also be the catalyst for the community to react and improve. Iamcarbon (talk) 05:26, 24 October 2024 (UTC)[reply]
- It has also occurred to me that any lost search rankings may be regained once we delete the duplicated human name labels. Iamcarbon (talk) 16:06, 24 October 2024 (UTC)[reply]
- Until the prioritization of the free-word search engine is improved, I will not support the launch of the bot, but after that I see no obstacle. I think - on other discussion pages - we have both already written all our arguments. Deletion of labels currently results in a decrease in the findability of certain data types and an increase in the number of duplicate elements (which we will not notice, or will notice only with difficulty). Even if I do not consider this the optimal solution for reducing the size of the database, I fundamentally support the "mul" project.
- But only with the reservation that we do everything we can to neutralize the negative effects before starting it. As it stands now, I think launching this bot will do more harm than good. Pallor (talk) 10:17, 25 October 2024 (UTC)[reply]
- I have withdrawn the proposal to delete duplicate aliases. Iamcarbon (talk) 20:43, 25 October 2024 (UTC)[reply]
- Oppose as per Pallor. First we fix the search engine so that deleting labels does not affect results negatively and then we can start removing labels. --So9q (talk) 18:48, 25 October 2024 (UTC)[reply]
- What exactly needs to be fixed with the search engine? Iamcarbon (talk) 19:07, 25 October 2024 (UTC)[reply]
Question As we didn't gain consensus for these tasks, does anyone know if there is a way to withdraw this proposal and propose a new primary task? It appears that our bot submission process requires that the initial bot request be approved before additional tasks can be requested.
- I can close this one as withdrawn, and you can then open another request.--Ymblanter (talk) 20:15, 2 November 2024 (UTC)[reply]
- @Ymblanter Yes, please close this as withdrawn. Thank you for your help. It is much appreciated. Iamcarbon (talk) 04:32, 3 November 2024 (UTC)[reply]
QichwaBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Elwinlhq (talk • contribs • logs)
Task/s: Creating Wikidata lexemes for the Quechua languages
Code: lexeme_upload.py contains the code for creating lexemes for the Quechua languages based on a list extracted from Qichwabase, which is a Wikibase.cloud instance of Quechua lexemes.
Function details: The bot's main task is the creation of lexemes for the Quechua languages based on Qichwabase. The lexemes are already modelled according to the Wikidata lexeme model.
A small subset of the lexemes were already imported into Wikidata using the lexeme_upload.py with the support of Kristbaum (talk • contribs • logs). Here is one example of a Quechua Lexeme: aparquy/aparquy (L1322219).
Afterwards, pronunciation audio was added to the lexemes with the support of the LinguaLibre tool.
Now, I would like to continue this process by creating further lexemes, so that pronunciation audio can be recorded for them.
Thanks for your support and understanding.
--Elwinlhq (talk) 17:03, 25 September 2024 (UTC)[reply]
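A script like lexeme_upload.py would presumably build a payload for the Wikibase wbeditentity API (with new=lexeme). The sketch below is an assumption about its shape, not the actual script: the lexical-category QID (Q24905, verb) and the language item (Q5218, Quechua) are my guesses for the cited example lexeme aparquy (L1322219):

```python
import json

def lexeme_payload(lemma, lang_qid, category_qid, lemma_lang="qu"):
    """JSON body for creating one lexeme via action=wbeditentity&new=lexeme."""
    return {
        "type": "lexeme",
        "lemmas": {lemma_lang: {"language": lemma_lang, "value": lemma}},
        "language": lang_qid,             # item for the Quechua language (assumed QID)
        "lexicalCategory": category_qid,  # e.g. the item for "verb" (assumed QID)
    }

payload = lexeme_payload("aparquy", "Q5218", "Q24905")
body = json.dumps(payload, ensure_ascii=False)  # would be POSTed as the "data" parameter
```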
- Support But that's kind of obvious :) Kristbaum (talk) 20:52, 25 September 2024 (UTC)[reply]
- @Elwinlhq: Please make some test edits to get your bot approved. --Wüstenspringmaus talk 13:06, 2 October 2024 (UTC)[reply]
- Support obviously. Glad to see it happens! Cheers, VIGNERON (talk) 16:56, 30 September 2024 (UTC)[reply]
ThesaurusLinguaeAegyptiaeBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Dwer (talk • contribs • logs) (Research Coordinator of Structure and Transformation in the Vocabulary of the Egyptian Language (Q122750399) project.)
Task/s: The bot is to create and update Hieroglyphic Ancient Egyptian and Coptic lexemes and ancient Egyptian text artefact items. It is also to maintain links to the Thesaurus Linguae Aegyptiae project via approved properties.
Code: to be done.
Function details:
- Upload and update basic lexical items (lexemes) of the Hieroglyphic Ancient Egyptian language (ISO "egy"), like 𓍿𓊃𓅓𓃡/ṯzm (L184933).
- Upload and update basic lexical items (lexemes) of the Coptic language (ISO "cop"), like ⲟⲩⲁ (L700695).
- Upload and update Wikidata items for ancient Egyptian text artefacts (specific papyri, ostraca, stelae, ...), like Papyrus Harris I (Q1578003).
- Upload and update Wikidata items for ancient Egyptian textual works (e.g., Amduat, Teachings of Ptahhotep, ...), like The Satire of the Trades (Q616182).
- Add Thesaurus Linguae Aegyptiae ID properties to respective Wikidata items:
--Dwer (talk) 10:02, 17 September 2024 (UTC)[reply]
Leaderbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Leaderboard (talk • contribs • logs)
Task/s: phab:T370842 and meta:Global reminder bot
Code: https://github.com/Leader-board/userrights-reminder-bot, though this is under development
Function details: See the above phabricator task. It should be noted that
- I'm submitting near-identical requests on multiple wikis, and
- I do not expect this bot to run much (if at all) on Wikidata and it will not require a bot flag; however, Wikidata:Bots explicitly mentions that approval is needed (and a bot flag set, which I find unnecessary and even a bad idea), and
- Here's a test edit (the text will be generalised for Wikidata). Users will be able to opt-out from this using a central page on Meta.
P.S.: I also noticed that it says bots "should never be used to make non-automated edits in the user talk namespace", which my bot will do - not sure if there's a way out of that.
--Leaderboard (talk) 18:17, 21 August 2024 (UTC)[reply]
- Oppose no thanks, see also Wikidata_talk:Bots#Is_a_bot_flag_required_for_a_bot_that_is_expected_to_make_very_few_edits_(if_any)?. Just run it on meta. Multichill (talk) 15:39, 24 August 2024 (UTC)[reply]
- @Multichill: I'm confused - just to be clear, are you suggesting that I message users about Wikidata on Meta? Leaderboard (talk) 18:05, 24 August 2024 (UTC)[reply]
- Can you link examples of temporary rights on Wikidata? Sjoerd de Bruin (talk) 16:34, 26 August 2024 (UTC)[reply]
- @Sjoerddebruin: [1], [2] and [3]. As noted above,
- Wikidata does not make that much use of temporary rights (the flooder right is automatically ignored), and
- many (but not all) of them are IPBE - some communities prefer that the bot exclude them. In that case it will run rarely, like in the case of the third example I shared above.
- Leaderboard (talk) 05:29, 27 August 2024 (UTC)[reply]
- I don't understand how a bot flag is needed for a bot that makes "non-automated edits in the user talk namespace"? This may be my confusion... --Lymantria (talk) 17:11, 9 September 2024 (UTC)[reply]
- @Lymantria:, the edits are automated, just that the frequency is (very) low. Leaderboard (talk) 08:00, 10 September 2024 (UTC)[reply]
- I'd prefer that you go for a global bot account. --Lymantria (talk) 13:00, 10 September 2024 (UTC)[reply]
- @Lymantria But global bots are disabled on this wiki (see Meta:Special:WikiSets/14 where Wikidata is in the opt-out set). If there is consensus from the community that global bots should be allowed to run on Wikidata, that's fine by me as well. To reiterate, I don't even need a bot flag in the first place, just approval to run this bot (without one). Leaderboard (talk) 16:29, 10 September 2024 (UTC)[reply]
- I'm sorry, you are right. --Lymantria (talk) 17:29, 10 September 2024 (UTC)[reply]
ZLBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Micrluo (talk • contribs • logs)
Task/s: request SPARQLs for RAG
Code:
Function details: --Micrluo (talk) 12:45, 3 August 2024 (UTC)[reply]
- @Micrluo Could you give us some more information & fix your request? --Wüstenspringmaus talk 15:26, 28 August 2024 (UTC)[reply]
UmisBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Stuchalk (talk • contribs • logs)
Task/s: This bot will add string representations of units of measurement to Wikidata items for units of measurement.
Code: The Python project on the "Units of Measurement Interoperability Service" (UMIS), that this bot will support/enable, is at https://github.com/usnistgov/nist_umis .
Function details: String representations of different units of measurement are being aligned to allow translation between different unit representation systems. As the developer of UMIS, I have concluded that Wikidata is the best place to organize/align unit representation strings. Once available at nist.gov later this year, the UMIS website will enable users to programmatically translate between unit representation systems, and additional functionality is planned. There are already Wikidata properties for some of the unit representation systems (e.g. QUDT) and additional ones will be requested. This is my first bot permission request, so if more info is needed please let me know. --Stuart Chalk (talk) 16:44, 25 July 2024 (UTC)[reply]
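The alignment described above amounts to keeping, per unit item, one representation string per system, so that any system can be translated to any other through the Wikidata QID. A toy sketch (the system names and the QUDT/UCUM codes shown are illustrative, not the real UMIS data):

```python
# Toy alignment table: Wikidata QID -> per-system representation strings.
UNIT_STRINGS = {
    "Q11573": {"qudt": "M", "ucum": "m", "si": "m"},         # metre
    "Q11570": {"qudt": "KiloGM", "ucum": "kg", "si": "kg"},  # kilogram
}

def translate(value, source_system, target_system):
    """Translate a unit string from one representation system to another,
    pivoting through the Wikidata item that both strings are attached to."""
    for qid, reps in UNIT_STRINGS.items():
        if reps.get(source_system) == value:
            return reps.get(target_system)
    raise KeyError(f"{value!r} not found in system {source_system!r}")
```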
- Please make some test edits. Ymblanter (talk) 20:25, 16 August 2024 (UTC)[reply]
DannyS712 bot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: DannyS712 (talk • contribs • logs)
Task/s: I want to get approval for a bot with translation admin rights that will automatically mark pages for translation if and only if the latest version is identical to the version that is already in the translation system, i.e. only pages with no "net" changes in the pending edits.
Code: not yet
Function details: I am filing almost identical requests for bot approval on a bunch of wikis, and figured I should put some of the details in a central location. Please see meta:User:DannyS712/TranslationBot for further info. --DannyS712 (talk) 03:09, 21 July 2024 (UTC)[reply]
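The "no net changes" condition above reduces to a content-equality check between the revision already marked for translation and the latest revision. A minimal sketch of that check (the function names are invented; the real bot would fetch both revisions via the MediaWiki API before comparing):

```python
import hashlib

def content_digest(text):
    """Stable digest of a revision's wikitext, for cheap comparison."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def should_auto_mark(marked_text, latest_text):
    """Mark for translation only when the latest revision is byte-identical
    to the revision already in the translation system (no "net" changes)."""
    return content_digest(marked_text) == content_digest(latest_text)
```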
- Support. Sure! --Wüstenspringmaus talk 11:47, 23 July 2024 (UTC)[reply]
- @Lymantria @Ymblanter just noting here that I cannot do test edits unless the bot is granted translation admin rights, unless you want me to test under my own account --DannyS712 (talk) 00:57, 26 July 2024 (UTC)[reply]
- Done Ymblanter (talk) 04:29, 26 July 2024 (UTC)[reply]
TapuriaBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: محک (talk • contribs • logs)
Task/s: interwiki
Code: interwikidata.py from PAW, mainly for the Mazandarani and Gilaki Wikipedias.
Function details: novice --محک (talk) 16:18, 3 June 2024 (UTC)[reply]
- there isn't enough info here. i don't understand what this is doing or how it is doing it BrokenSegue (talk) 15:31, 7 June 2024 (UTC)[reply]
IliasChoumaniBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Ilias Choumani / IliasChoumaniBot (talk • contribs • logs)
Task/s: Automatic updating of data from JSON files on German scientists
Code: Will be in Python (not there yet)
Function details: --IliasChoumaniBot (talk) 10:16, 3 June 2024 (UTC)[reply]
- what json files? we need more details BrokenSegue (talk) 15:31, 7 June 2024 (UTC)[reply]
- We are students from TH Köln tasked with automating the process of updating data for scientists on Wikidata. Our objective includes verifying the presence of researchers and creating entries if they are not already listed. Similarly, we extend this process to projects, such as those found in GEPRIS, where these researchers have been involved. Subsequently, our goal is to establish connections between these projects and the respective researchers.
- Our JSON files contain comprehensive data necessary for expanding information on researchers (QID, name) and their associated projects (project name, project ID) within Wikidata. This ensures that accurate and up-to-date information is seamlessly integrated into the Wikidata ecosystem.
- This approach leverages automated tools and careful data handling to contribute valuable knowledge to the scientific community on Wikidata. IliasChoumaniBot (talk) 14:35, 17 June 2024 (UTC)[reply]
- What is the ultimate source of the data, and where is it published such that TH Köln students can access it? Stuartyeates (talk) 19:19, 16 July 2024 (UTC)[reply]
- We have the data from various online sources such as GEPRIS, ORCID or PubMed. We have extracted data on various German scientists and their publications and would like to automatically insert it into Wikidata as part of our studies. IliasChoumaniBot (talk) 11:01, 18 July 2024 (UTC)[reply]
Browse9ja bot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Browse9ja
Task/s: Automated data retrieval and updates for Browse9ja project, focusing on Nigerian and African-based information, integrating a chatbot, NLP API, knowledge graph, and machine learning model.
Code: (Not applicable, as I am using a combination of existing APIs and services)
Function details:
The Browse9ja bot is designed to perform the following tasks:
- Retrieve and update data on Wikidata related to Nigerian and African-based information
- Integrate with a chatbot to provide users with accurate and up-to-date information
- Utilize a natural language processing (NLP) API for text analysis and understanding
- Contribute to the development of a knowledge graph for African-based information
- Apply machine learning models to improve data accuracy and relevance
The bot will operate under the supervision of the operator (Browse9ja) and adhere to Wikidata's policies and guidelines. --Browse9ja (talk) 02:16, 16 May 2024 (UTC)[reply]
- Comment OP has no track record of contributions either here or on any other project.
- Question Can you please give more details of how the chatbot will be integrated? Do you intend to have an LLM suggest content to add to Wikidata? Bovlb (talk) 15:37, 21 May 2024 (UTC)[reply]
- Details of Chat-bot Integration as requested: My chat-bot will be integrated into the Browse9ja.com as a bot to provide users with accurate and up-to-date information related to Nigerian and African-based data on Wikidata. The integration will involve utilizing a natural language processing (NLP) API for text analysis and understanding. The Chat-bot will enable users to interact with the Browse9ja bot in a conversational manner, allowing for seamless access to information and updates on Wikidata. Additionally, the chat-bot will play a role in contributing to the development of a knowledge graph for African-based information. While the chat-bot will facilitate user interaction, the machine learning models will be applied to improve data accuracy and relevance, ensuring that the information provided is of high quality and relevance to the users.
- About LLM Content Suggestion: The chat-bot integrated with Browse9ja bot will have the capability to suggest content to add to Wikidata. Leveraging natural language processing (NLP) and machine learning models, the chat-bot will be able to analyze user queries and suggest relevant content for addition to Wikidata. This functionality aligns with the broader goal of the Browse9ja bot to automate data retrieval and updates for Nigerian and African-based information, ensuring that the information contributed to Wikidata is accurate, up-to-date, and relevant.
- Hope this clarifies my intent and also increases my chances of approval. Thanks a lot.
- Browse9ja bot (talk) 13:12, 25 May 2024 (UTC)[reply]
OpeninfoBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Fordaemdur (talk • contribs • logs)
Task/s: importing financial data (assets, equity, revenue, EBIT, net profit) from openinfo.uz to entries on public Uzbek companies in Wikidata.
Code:
Function details: I have a project going with openinfo.uz which is a state-owned public portal for financial disclosures of all public Uzbek companies. All joint-stock companies and banks in Uzbekistan have to disclose their financials there by law. I have created entries for all Uzbek banks at User:Fordaemdur/Uzbek banks and would like to test imports of financial data there (Openinfo is ready to provide API for that). If successful, the bot will import financials once per quarter. Next steps would also be creating entries for all other notable public Uzbek companies, not just banks, and import financials there too. --Fordaemdur (talk) 11:14, 16 April 2024 (UTC)[reply]
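A quarterly import as described would boil down to writing one quantity statement per financial indicator, qualified by the reporting date and referenced to the Openinfo portal item. A hedged sketch (P2139 total revenue, P585 point in time and P248 stated in are real properties; the currency-unit QID shown is a placeholder, and the function is invented):

```python
def financial_claim(property_id, amount, currency_qid, period_end):
    """One quantity statement, e.g. total revenue (P2139) for a quarter,
    qualified by point in time (P585) and referenced to the Openinfo
    portal item Q125505748 via stated in (P248)."""
    return {
        "property": property_id,
        "value": {"amount": f"{amount:+d}", "unit": currency_qid},
        "qualifiers": {"P585": period_end},
        "references": [{"P248": "Q125505748"}],
    }
```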
- MD Imtiaz Ahammad Kopiersperre Jklamo ArthurPSmith S.K. Givegivetake fnielsen rjlabs ChristianKl Vladimir Alexiev Parikan User:Cardinha00 MB-one User:Simonmarch User:Jneubert Mathieudu68 User:Kippelboy User:Datawiki30 User:PKM User:RollTide882071 Andber08 Sidpark SilentSpike Susanna Ånäs (Susannaanas) User:Johanricher User:Celead User:Finnusertop cdo256 Mathieu Kappler RShigapov User:So9q User:1-Byte pmt Rtnf econterms Dollarsign8 User:Izolight maiki c960657 User:Automotom applsdev Bubalina Fordaemdur DaxServer Laurenz SommerladNotified participants of WikiProject Companies --Fordaemdur (talk) 11:32, 16 April 2024 (UTC)[reply]
- How many companies are we talking about? ChristianKl ❪✉❫ 18:57, 17 April 2024 (UTC)[reply]
- @ChristianKl, currently there are items for about 50 public Uzbek companies (30+ are banks) - all can be found on my userpage. I am planning on creating items for all companies listed on the Tashkent Stock Exchange, so we'll end up with about 150 companies. There are about 600 joint-stock companies in Uzbekistan and I assume at least one third of them are notable. The test will be run on a few companies - a mix of banks and corporates - and I don't expect more than 100 edits in a test run. If the test run is successful, the bot will be occupied with populating the items that I'm manually creating right now (checking notability for each individual entry before creating it). Best, --Fordaemdur (talk) 19:17, 17 April 2024 (UTC)[reply]
- Add:Openinfo.uz now has an entry to facilitate referencing its data: Unified Portal of Corporate Information Data (Q125505748) --Fordaemdur (talk) 19:19, 17 April 2024 (UTC)[reply]
- Support adding all joint-stock companies is fine given the kind of notability rules we have. If you wanted to add small businesses as well, it would be a harder call whether or not to allow it. ChristianKl ❪✉❫ 11:48, 18 April 2024 (UTC)[reply]
- Thank you for clarification. I confirm that I won't be working on small businesses. Openinfo and Tashkent Stock Exchange (which i'm using for data imports) only have data on joint-stock companies. Best, --Fordaemdur (talk) 14:48, 18 April 2024 (UTC)[reply]
- Support - PKM (talk) 23:28, 18 April 2024 (UTC)[reply]
- Support--So9q (talk) 16:48, 2 May 2024 (UTC)[reply]
- Please make test edits.--Ymblanter (talk) 19:22, 9 May 2024 (UTC)[reply]
MidleadingBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Midleading (talk • contribs • logs)
Task/s: Create items for books in National Diet Library (Q477675).
Function details: As part of WikiProject, items for each book in National Diet Library (Q477675) have to be created so files in Wikimedia Commons can link to Wikidata. The items will have properties DOI (P356), NDL Bib ID (P1054), publication date (P577), author (P50), main subject (P921) and others. The number of items to be created is likely to be more than 100,000. --Midleading (talk) 13:03, 5 February 2024 (UTC)[reply]
- Note: If an item with a specific NDL Authority ID (P349) does not yet exist, we need to also create an item for the author (which may include people and organizations). Also there are books not yet in the public domain, so they cannot be uploaded to Commons, but items for them can still be created. GZWDer (talk) 13:48, 5 February 2024 (UTC)[reply]
- Also you may first create some example items.--GZWDer (talk) 13:56, 5 February 2024 (UTC)[reply]
- Like local history of Kagoshima City (Q111372556)? This item was created by @Sakoppi:. Do you know any Japanese users who are interested in this topic? Some items may have already been created by these users. Midleading (talk) 15:14, 5 February 2024 (UTC)[reply]
- I don't think that item is following the guidelines at Wikidata:WikiProject_Books. BrokenSegue (talk) 17:37, 5 February 2024 (UTC)[reply]
- What it means: (1) any created item with a specific DOI (P356)/JPNO (P2687)/NDL Bib ID (P1054)/publication date (P577)/place of publication (P291) should be an "edition" (instance of (P31)=version, edition or translation (Q3331189)); (2) there are possibly multiple editions of the same work, so (in the next step, after we have the edition items) we create an item for the "work" and link the two items with edition or translation of (P629)/has edition or translation (P747). For example c:Category:人事興信録 contains multiple different editions of Q109727675 (a "work"); each edition has a different set of IDs. We should create items for each edition. In the future, the Commons category should be diffused into one for each edition which links to the edition item (instead of the current one for the work item).--GZWDer (talk) 03:17, 6 February 2024 (UTC)[reply]
- Note we have another dedicated property for NDL id (besides JPNO (P2687) and NDL Bib ID (P1054)): NDL Persistent ID (P9836). To prevent fragmentation of data, IDs should not be added to DOI (P356).--GZWDer (talk) 09:31, 6 February 2024 (UTC)[reply]
- I have edited local history of Kagoshima City (Q111372508) as example of work item and local history of Kagoshima City (Q111372556) as example of edition item. Midleading (talk) 14:05, 7 February 2024 (UTC)[reply]
- this still looks wrong? is it a book series? or a written work (Q47461344)? There's also constraint violations? BrokenSegue (talk) 18:33, 7 February 2024 (UTC)[reply]
- It was a book that later became a book series. Anyway, I will use written work (Q47461344) uniformly, because this information isn't in Commons. genre (P136), official website (P856) and copyright status (P6216) for the work item, and follows (P155), followed by (P156), genre (P136), copyright holder (P3931) and official website (P856) for the edition item, also can't be imported. has edition or translation (P747) statements also will not have any qualifiers when imported. Midleading (talk) 03:29, 16 February 2024 (UTC)[reply]
Hello, I want to do this for books in the National Library of Spain. I see that, following Wikidata rules for books, at least two items are needed: one for the written work (Q47461344) and one for each version, edition or translation (Q3331189). I created an example for this entry in datos.bne.es:
- La vida es eterna: biografía de Víctor Jara (Q124538246): written work by Mario Amorós
- La vida es eterna: biografía de Víctor Jara (Q124537888): 2023 edition of written work by Mario Amorós
Do you think they are correct? I think that adding the "(PUBLISHER, YEAR)" to the label for each edition is useful, so you can see all that info quickly in the property has edition or translation (P747) in La vida es eterna: biografía de Víctor Jara (Q124538246). But I am open to suggestions. Of course, after we define that, I will open a request for my bot. Just wanted to use this discussion so we can unify the rules for all "book bots". Emijrp (talk) 18:12, 15 February 2024 (UTC)[reply]
- generally looks good to me. personally I would like to see some more identifiers (see Wikidata:WikiProject_Books e.g. Library of Congress Control Number (LCCN) (bibliographic) (P1144) or Google Books ID (P675)) though having ISBN is good. Also a genre (P136) would be nice. BrokenSegue (talk) 18:27, 15 February 2024 (UTC)[reply]
- oh also the description for the edition should say it's an edition (needs to be distinct from the work's description) BrokenSegue (talk) 18:29, 15 February 2024 (UTC)[reply]
- "2023 edition of book by Mario Amorós"? What about writing the Spanish title in the English label? Is that OK or should I leave it blank when book hasn't been translated? Emijrp (talk) 19:19, 15 February 2024 (UTC)[reply]
- The description for both items is that it's a "book", which is the least helpful description that could possibly be used. In Wikiproject:Books, we never use the word "book" because it could mean a work, an edition, a specific copy, a section within a work, or any of a dozen other meanings. Please do not use "book" as the description for a work or an edition; it isn't helpful and does not distinguish what it is. The data item for an edition should have "edition" in the description, not in the label. --EncycloPetey (talk) 17:18, 16 February 2024 (UTC)[reply]
- I just fixed the labels and descriptions for both items (written work and edition). Is it OK now? Btw, I repeat the same question: is it OK to use the Spanish title as the English label when the work hasn't been translated? Emijrp (talk) 16:15, 19 February 2024 (UTC)[reply]
- Yes, it is ok to use the Spanish title if the work has not been translated. Other than that, are we ready for approval? Ymblanter (talk) 19:55, 6 March 2024 (UTC)[reply]
- @Ymblanter This request was created by @Midleading: for his bot MidleadingBot. I don't know if his bot is ready for approval.
- I am going to open a request for my own bot. Emijrp (talk) 19:43, 7 March 2024 (UTC)[reply]
How is this still going on? The example items (local history of Kagoshima City (Q111372508) for work item and local history of Kagoshima City (Q111372556) for edition item) do not have constraint violations now. In the first phase, edition items will be created, with statements of instance of (P31)=version, edition or translation (Q3331189), title (P1476), country of origin (P495), publication date (P577), language of work or name (P407), document file on Wikimedia Commons (P996), DOI (P356), NDL Persistent ID (P9836), NDL Bib ID (P1054), JPNO (P2687). Other properties will depend on data source. If there is no interest in it, I will close this request in 2024. Midleading (talk) 15:38, 4 November 2024 (UTC)[reply]
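For illustration, the first-phase edition items listed above could be drafted as QuickStatements rows with a small Python helper along these lines (a sketch; the sample values are made up, and further properties such as DOI (P356) or NDL Persistent ID (P9836) would be appended the same way):

```python
def edition_rows(title, lang, pub_date, ndl_bib_id, jpno):
    """Build QuickStatements V1 rows for one edition item.
    All argument values are placeholders; a real run would take
    them from the NDL bibliographic records."""
    return [
        "CREATE",
        "LAST|P31|Q3331189",                    # instance of: version, edition or translation
        f'LAST|P1476|{lang}:"{title}"',         # title (monolingual text)
        "LAST|P495|Q17",                        # country of origin: Japan
        f"LAST|P577|+{pub_date}T00:00:00Z/11",  # publication date, day precision
        f'LAST|P1054|"{ndl_bib_id}"',           # NDL Bib ID
        f'LAST|P2687|"{jpno}"',                 # JPNO
    ]

rows = edition_rows("鹿児島市史", "ja", "1969-01-01", "000001234567", "73012345")
```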
- I am fine with approving the bot, but I want to make sure there are no objections. Ymblanter (talk) 19:27, 5 November 2024 (UTC)[reply]
So9qBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: So9q (talk • contribs • logs)
Task/s: Add DDO identifier to Danish lexemes.
Code: https://github.com/dpriskorn/LexDDO
Function details: Checks whether there are multiple hits in DDO for a lemma; if so, it is skipped. Checks whether there are multiple lexemes with the same lemma and lexical category in WD; if so, it skips. Otherwise we have a match and the upload is done. If we get a 404 from DDO, a "not found in" + time statement is added. This is the easiest, low-hanging-fruit kind of matching. I vetted the edits and they look good to me. See ~50 test edits here https://www.wikidata.org/w/index.php?title=Special:Contributions/So9q&target=So9q&offset=20240105165217--So9q (talk) 18:41, 5 January 2024 (UTC)[reply]
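The skip/match decision described above can be sketched as follows (the function and field names are hypothetical; the real bot works on live DDO responses and Wikidata lexemes):

```python
def decide(lemma, category, ddo_hits, wd_lexemes):
    """Return an action for one lemma, following the matching rules:
    skip on ambiguity on either side, upload on a unique match,
    record a 'not found in' statement on a DDO 404 (hits is None)."""
    if ddo_hits is None:          # DDO returned 404
        return "add_not_found_statement"
    if len(ddo_hits) > 1:         # multiple DDO hits for the lemma
        return "skip_ambiguous_ddo"
    same = [l for l in wd_lexemes
            if l["lemma"] == lemma and l["category"] == category]
    if len(same) > 1:             # several WD lexemes share lemma + category
        return "skip_ambiguous_wd"
    return "upload_match"
```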
- What is this? Ymblanter (talk) 20:01, 11 January 2024 (UTC)[reply]
- It is a placeholder. I add it when there are multiple choices for lexemes or no lexeme match, like in this case. If they were numbered (by a bot or a to-be-written user script, perhaps) one could see that in the second position we don't know which lexeme corresponds. So9q (talk) 08:46, 7 October 2024 (UTC)[reply]
- Are you still interested in the bot approval? Ymblanter (talk) 18:41, 8 October 2024 (UTC)[reply]
So9qBot 8 (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: So9q (talk • contribs • logs)
Task/s: Add missing names of European legal documents to labels and aliases of items with a CELEX identifier
Code: logic diagram, code
Function details: This is important for our coverage of EU legal documents. A bug is blocking creation of 50 test edits.--So9q (talk) 15:07, 17 December 2023 (UTC)[reply]
- The bug has been fixed. See test edits So9q (talk) 17:41, 2 January 2024 (UTC)[reply]
- @Samoasambia thanks for moving the test edits to title as suggested by the model and Ainali <3 So9q (talk) 08:56, 7 October 2024 (UTC)[reply]
Discussion
- Support looks useful, thanks! -Framawiki (please notify !) (talk) 14:34, 6 January 2024 (UTC)[reply]
- Question Wouldn't title (P1476) be better than official name (P1448)? (That is what we used for the Swedish parliamentarian documents.) Ainali (talk) 08:41, 11 January 2024 (UTC)[reply]
- Yes, thanks for the suggestion. So9q (talk) 08:49, 7 October 2024 (UTC)[reply]
- @So9q: FYI, I created some data modeling for EU legal acts here. The EUR-Lex metadata is available through a SPARQL end point which gives us some additional data compared to scraping. –Samoasambia ✎ 18:38, 9 March 2024 (UTC)[reply]
- Oh, I was not aware of the WikiProject. Looks very nice, and title is suggested there like Ainali did above. I'm not sure the SPARQL endpoint is needed or desired for this task. I had a look back when I wrote this request and ditched it. Can't remember why, but this code works and is reasonably fast :) So9q (talk) 08:53, 7 October 2024 (UTC)[reply]
- @Samoasambia, Ainali, Framawiki: I updated the code to use title. I also fixed a small bug which caused duplicate references when the script was rerunning. I also added editgroups so anyone can later undo the changes in bulk easily if needed. I'm ready to run it on all ~4000 items with CELEX id now.--So9q (talk) 21:32, 8 October 2024 (UTC)[reply]
- Are there some test edits with the updated code? Ainali (talk) 21:41, 8 October 2024 (UTC)[reply]
- I'm planning to add data to EU legal acts and to create new items via the EUR-Lex SPARQL endpoint but scraping the titles is fine for me. Makes my life a bit easier :). I'd still add stated in (P248) = EUR-Lex (Q1276282) to the references but otherwise looks great to me. Samoasambia ✎ 22:13, 8 October 2024 (UTC)[reply]
- Fixed, see Test edit.
- Note: no reference is added to existing title-statements (this is to avoid duplicate references with different dates on consecutive runs of the script).
- The script is idempotent. It only adds missing title-statements, never remove or change existing statements.
- I added editgroups so a complete run of the script can be rolled back easily.--So9q (talk) 09:10, 18 October 2024 (UTC)[reply]
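The idempotency guarantee in the points above amounts to a guard like the following (a sketch with simplified data shapes; the real script operates on full Wikidata entities):

```python
def missing_title_claims(items, scraped_titles):
    """Return only the (qid, title) pairs that need an edit.
    Items that already carry a title (P1476) statement are left
    untouched, so reruns never change or duplicate anything."""
    return [(qid, scraped_titles[qid])
            for qid, claims in items.items()
            if "P1476" not in claims and qid in scraped_titles]
```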
- I added extraction of "EUID"s, e.g. "(EU) 1979/110", from en descriptions in WD, adding them as mul aliases. They make it easier to look up laws in Wikidata using the search bar and are used as IDs by e.g. the Swedish government. See test edit. So9q (talk) 12:16, 18 October 2024 (UTC)[reply]
- Looks good to me, So9q. However, there are some issues with the "EUID". The initialisms in the identifier stand for the legal domain under which the act was passed (European Union, European Economic Community, European Atomic Energy Community etc.). The current naming format of legal acts has been in use only since January 2015, so for example "(EU) 1979/110" is not correct; it should be "79/110/EEC" (in English, different in other languages). Since the Lisbon treaty most new acts have legal domain "EU" but some also have "EU, Euratom" or "CFSP". The legal domain abbreviations are language-specific, so while in English it's "EU", in French it's "UE" and in Irish "AE" etc. I added a table of all of them here. More information can be found at the Interinstitutional Style Guide.
- So I would recommend that the bot shouldn't add "EUIDs" with the legal domains to mul aliases, because the format depends on language. However, adding only the year-and-number part (e.g. "79/110", "2016/679") is fine and I support that. I have started working on Python code that would extract short labels for legal acts from the full titles in different languages using regex. Maybe we could work on that together if I add the code to GitHub? Samoasambia ✎ 19:38, 18 October 2024 (UTC)[reply]
- Oh, I was not aware that the EUID had a component that differs along both language and legal domain. Thanks for the table. I can use that to translate the legal domain part before adding the alias.
- This is becoming increasingly complicated. EU is so complicated :sweat smile:
- I dug a little and found a use of the "EUID" without the parentheses, "EU 2023/138", from a Swedish government agency.
- So now we have 5 different EUID forms used by government workers to refer to the same law:
- long EUID with parens e.g. "(EU) 2023/138"
- long EUID without parens e.g. "EU 2023/138"
- short EUID without the legal domain e.g. "2023/138"
- ELI IDs (we are missing a property, see Wikidata:Property proposal/European Legislation Identifier) (used in EUR-Lex, but not by e.g. the Swedish government)
- CELEX ids (used in EUR-Lex and Cellar, but not by e.g. the Swedish government)
- So9q (talk) 12:24, 19 October 2024 (UTC)[reply]
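For illustration, the first three shapes in the list above can be matched with a single regex that always recovers the short year-and-number part (a sketch only; the real script's patterns differ, and suffix-style pre-2015 IDs like "79/110/EEC" yield their number part here while the domain suffix is ignored):

```python
import re

# Optional legal-domain prefix, with or without parentheses,
# followed by the year-and-number part ("short EUID").
EUID_RE = re.compile(
    r"(?:\(?(?P<domain>EU|EEC|Euratom|CFSP)\)?\s+)?(?P<num>\d{2,4}/\d+)"
)

def short_euid(text):
    """Return the short EUID (year/number only) found in text, or None."""
    m = EUID_RE.search(text)
    return m.group("num") if m else None
```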
- I added support for localized EUIDs according to the table provided by @Samoasambia and only add the "short EUID" to mul. I did not add support for Euratom and CFSP for now (I set the script to raise an exception if the EUID cannot be extracted and will implement it if needed when the script fails). See test edit
- Also added support for extracting and adding the localized "EECID" e.g. "80/1177/EEC" to aliases, see test edit
- @Ainali, @Samoasambia WDYT? :) --So9q (talk) 16:53, 19 October 2024 (UTC)[reply]
- Do we really need to add the same alias in multiple languages? If it exists in one language, it shows up in the search independent of what language one is using. Is there some added value for this that I am not seeing? Ainali (talk) 18:26, 19 October 2024 (UTC)[reply]
- It is the lightest-weight approach we have, so yes, it is necessary; if we instead added all the variants to mul as aliases we would lose information. They are valid for each of the languages and deduplicated in the database, so nothing to worry about IMO. So9q (talk) 07:59, 20 October 2024 (UTC)[reply]
- I still have a couple of issues left. Firstly, I think we shouldn't use the full titles as labels; instead we should be using some sort of short titles. Unfortunately they are not directly available on EUR-Lex, but I did some regex magic to extract them out of the full titles in all official languages. You can find it here. Currently it works in 22 out of 24 languages and for nearly all acts published since 1 January 2015. Adjusting it for earlier acts still needs some extra work. The second issue is that I don't think the "long EUID without parens" (e.g. EU 1980/1177) is anything official, so I wouldn't include that. EUR-Lex seems to use only the version with parens, and that is what the interinstitutional style guide says [4][5]. And finally I would put stated in (P248) before the URL in the references since it looks a bit nicer that way :). Otherwise looks good to me! Samoasambia ✎ 22:20, 28 October 2024 (UTC)[reply]
- Do we really need to add the same alias in multiple languages? If it exists in one language, it shows up in the search independent of what language one is using. Is there some added value for this that I am not seeing? Ainali (talk) 18:26, 19 October 2024 (UTC)[reply]
- @Ymblanter: ready for approval?--So9q (talk) 21:34, 25 October 2024 (UTC)[reply]
- I will wait for a few days to see whether there are objections. Ymblanter (talk) 19:34, 26 October 2024 (UTC)[reply]
HVSH-Bot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Historischer_Verein_SH (talk • contribs • logs)
Task/s: Import data about politicians from the Database of important Persons from Schaffhausen (Q119949776), now only partially available online
Code: N/A
Function details: Import of reconciled data from OpenRefine with given name, family name, date of birth, date of death, place of origin, sex or gender, position held, language spoken, country of citizenship. --HVSH-Bot (talk) 12:37, 31 December 2023 (UTC)[reply]
- Could you explain the logic using a PlantUML activity diagram? Could you make 50 test edits and link them here? So9q (talk) 10:35, 2 January 2024 (UTC)[reply]
RudolfoBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: RudolfoMD (talk • contribs • logs)
Task/s: importing the FDA list of Drugs With Black Box Warnings; setting Property / legal status (medicine): boxed warning.
Code: N/A
Function details: Continue importing the FDA list of Drugs With Black Box Warnings, as I've been doing, with OpenRefine. Ideally I hope to create, or have someone run, a bot to maintain the data.
OpenRefine urges me to submit large edit batches for review. I've done ~400 in batches of ~200. I want to do more, like https://www.wikidata.org/w/index.php?title=Q7939256&diff=prev&oldid=2019984699&diffmode=source. This is what's set: Property / legal status (medicine): boxed warning / rank; Property / legal status (medicine): boxed warning / reference; reference URL: https://nctr-crs.fda.gov/fdalabel/ui/spl-summaries/criteria/343802; title: FDA-sourced list of all drugs with black box warnings (use the Download Full Results and View Query links) (English). I want to match more widely - on Q113145171, which has ~500 matches, and the other types which match and are drugs of some kind, listed below. The table has ~1600 rows, and the bulk have a matching drug in Wikidata already. Types to include:
- Q113145171 type of chemical entity (658)
- Q59199015 group of stereoisomers (51)
- Q12140 medication (DONE in the first extract, I think; need to redo to add cites)
- Q169336 mixture (45)
- Q79529 chemical substance (40)
- Q1779868 combination drug (28)
- Q35456 essential medicine (13)
- Q119892838 type of mixture of chemicals (3)
- Q28885102 pharmaceutical product (3)
- Q467717 racemate (3)
- Q8054 protein (biomolecule) (4)
- Q422248 monoclonal antibody (12)
- Q679692 biopharmaceutical (6)
- Q213901 gene therapy (4)
- Q2432100 veterinary drug (3)
Types I do not want to include: Q13442814 article, Q30612 clinical trial, Q7318358 review article, and probably Q16521 taxon.
--RudolfoMD (talk) 09:29, 29 November 2023 (UTC)[reply]
- Comment Looks useful! Can we see some test edits with the actual bot code to be used?
GamerProfilesBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Parnswir (talk • contribs • logs)
Task/s: Backfill GamerProfiles game IDs (P12001)
Code: https://github.com/GP-9000/GamerProfilesBot
Function details: The bot will regularly update existing video games with the GamerProfiles game ID (P12001) sourced from https://gamerprofiles.com. We plan to update the initial batch of around 55,000 games within a month of approval and then switch to a more relaxed (on-demand) update process.
--Parnswir (talk) 11:05, 5 October 2023 (UTC)[reply]
- Question How do you match the GamerProfiles pages to the items? Jean-Fred (talk) 15:01, 5 October 2023 (UTC)[reply]
- We have an existing 1:1 mapping in our database for those games we want to backfill. Parnswir (talk) 15:31, 5 October 2023 (UTC)[reply]
- who is we? BrokenSegue (talk) 19:23, 5 October 2023 (UTC)[reply]
- and how was the mapping made? BrokenSegue (talk) 19:32, 5 October 2023 (UTC)[reply]
- Ah sorry for the confusion, I forgot to mention I am associated with the company behind GamerProfiles.com, so "we" is the company. The games were originally exported from Wikidata and thus we have the original Wikidata ID for each game. Parnswir (talk) 21:38, 5 October 2023 (UTC)[reply]
- Does your association with the company fall inside paid editing? If so, you are obliged to mention it (on your user page). --Lymantria (talk) 11:06, 8 November 2023 (UTC)[reply]
- Thanks for the clarification, I didn't mean to mislead. I added the paid contributions template to both the bot account and this account. Parnswir (talk) 11:53, 9 November 2023 (UTC)[reply]
- @Parnswir: Is Master Jaro (talk • contribs • logs) also your account (uses "we", see Special:Diff/1960163586, Special:Diff/1968406273) or is it another employee? If so, he/she should also disclose the paid editing. Regards Kirilloparma (talk) 06:32, 10 November 2023 (UTC)[reply]
- @Kirilloparma @Lymantria Thank you for the info everyone! I didn't know about the "paid contributions" info before. And yes, I am a different person :) Since high-quality edits are also in the interest of the company, I have added the paid contributions template to my page as well now. Just let me know if anything else is missing. I've learned quite a bit over the last months, and will keep doing my best to produce helpful edits. Master Jaro (talk) 15:33, 10 November 2023 (UTC)[reply]
- Please make 50 test edits and link them here. So9q (talk) 10:38, 2 January 2024 (UTC)[reply]
- The contributions were already made on October 5th 2023: https://m.wikidata.org/wiki/Special:Contributions/GamerProfilesBot Parnswir (talk) 16:40, 2 January 2024 (UTC)[reply]
- @Kirilloparma @Jean-Frédéric @BrokenSegue @Lymantria @So9q Thank you for your efforts everyone! Is there anything more we can do to help move this project forward? We would love to add more of the relevant IDs next to the other game edits we make along the way. Any help is highly appreciated :) Master Jaro (talk) 16:35, 27 March 2024 (UTC)[reply]
- Support The origin of the mapping (the entries were originally exported from WD, as stated above) ensures the quality of the edits. I think the test edits look fine. Happy to support this. Jean-Fred (talk) 19:35, 16 May 2024 (UTC)[reply]
- Comment Meanwhile, User:Kirilloparma performed an import of 84K+ GamerProfiles ids − see Wikidata:Edit groups/QSv2/230179. Jean-Fred (talk) 07:39, 19 May 2024 (UTC)[reply]
MangadexBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Binarycat64 (talk • contribs • logs)
Task/s: add metadata from mangadex to manga with Mangadex manga ID
Code: not yet implemented
Function details: Many manga items have a MangaDex ID specified, but not the IDs for other sites (MangaUpdates, Kitsu, AniList). However, this data exists on MangaDex, so this bot would simply copy over the data.
The initial scope is quite small, only focusing on ID tags. --Binarycat64 (talk) 18:01, 6 August 2023 (UTC)[reply]
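The copy-over step could be sketched as below. This assumes the MangaDex API's `links` object uses keys such as `al` (AniList), `kt` (Kitsu) and `mu` (MangaUpdates); the Wikidata property IDs here are placeholders and would need to be verified before any run:

```python
# Hypothetical mapping from MangaDex "links" keys to Wikidata properties;
# the property IDs below are placeholders, not the real ones.
LINK_PROPS = {"al": "P:anilist-id", "kt": "P:kitsu-id", "mu": "P:mangaupdates-id"}

def map_links(links):
    """Turn a MangaDex 'links' dict (from a /manga/{id} response) into
    {wikidata_property: external_id}, ignoring unmapped keys."""
    return {LINK_PROPS[k]: v for k, v in (links or {}).items() if k in LINK_PROPS}
```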
- I'm concerned you are too inexperienced with wikidata (<500 edits) to be granted bot permissions. At the very least I'm going to need to see some test edits. BrokenSegue (talk) 18:39, 6 August 2023 (UTC)[reply]
- I can certainly provide test edits if that's your concern. I will also adjust my code according to any reasonable concerns that are raised.
- I'll start working on the code, seeing as there are no objections to the goal of the bot.
- Is there a certain way I should do test edits? I can test most of the functionality on sandbox items, but I need a query endpoint to test the functionality of finding items to update, and test.wikidata.org doesn't seem to provide that. Binarycat64 (talk) 16:23, 7 August 2023 (UTC)[reply]
- it is ok to make a small number of test edits on main wikidata using the bot account before approval. just make sure it is relatively few at low speed. BrokenSegue (talk) 06:28, 8 August 2023 (UTC)[reply]
- Please, do so, let your bot make some test edits. --Lymantria (talk) 15:41, 17 September 2023 (UTC)[reply]
- I implemented something very similar last year: https://github.com/PythonCoderAS/wikidata-anime-import
- I'll take some time and revive the codebase, as I've taken an extended break but am ready to come back again. RPI2026F1 (talk) 16:22, 25 January 2024 (UTC)[reply]
WingUCTBOT (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Tadiwa Magwenzi (talk • contribs • logs)
Task/s: Batch upload of Niger-Congo B lexemes, including senses and forms.
Code:https://github.com/Boomcarti/WingUCTBOT
Function details: Upload of 550 isiZulu nouns as lexemes, including their associated forms and senses. --WingUCTBOT (talk) 10:07, 31 July 2023 (UTC)[reply]
- Please make some test edits. Ymblanter (talk) 19:19, 7 August 2023 (UTC)[reply]
- Greetings! I hope you are well. I have performed 200 test edits, as seen on the test Wikidata site, and am awaiting approval to split the 500 isiZulu nouns into batches and then upload them. WingUCTBOT (talk) 23:14, 15 August 2023 (UTC)[reply]
- I am sorry but could you please provide a link to the test edits on Testwiki. Ymblanter (talk) 18:17, 7 September 2023 (UTC)[reply]
- I've just redone about 250 test edits they are on the TestWikidata recent changes page. Some examples: https://test.wikidata.org/wiki/Lexeme:L3768 , https://test.wikidata.org/wiki/Lexeme:L3753 . The link to the page: Recent changes - Wikidata . WingUCTBOT (talk) 18:14, 9 September 2023 (UTC)[reply]
- I took a quick look at the code. Are you aware of the python library WikibaseIntegrator which supports lexemes?
- I prefer if you would use that or a similar library to make sure you honor the max edit thing on the servers.
- Would you be willing to do that? So9q (talk) 10:50, 2 January 2024 (UTC)[reply]
The lexemes were sourced manually by Professor M. Keet and Langa Khumalo.
https://github.com/mkeet/GENIproject/tree/master/isiZulupluraliser/isiZulu
- @WingUCTBOT, Tadiwa Magwenzi: Your code appears to add the same sense multiple times and, among forms, adds the plural of a noun multiple times without including a form for the singular. (You may wish to consider using tfsl for your import; once it is installed, an overview of how it is used may be found here.) Mahir256 (talk) 00:05, 16 August 2023 (UTC)[reply]
- Understood, will fix it now. WingUCTBOT (talk) 17:21, 16 August 2023 (UTC)[reply]
- Good evening. I have addressed your concerns with the code and have uploaded a test batch of 50+ Lexemes( isiZulu Nouns, along with their senses and forms) WingUCTBOT (talk) 22:36, 16 August 2023 (UTC)[reply]
- In time, i do intend to refactor the code to use tfsl WingUCTBOT (talk) 23:09, 16 August 2023 (UTC)[reply]
MajavahBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Taavi (talk • contribs • logs)
Task/s: Import version and metadata information for Python libraries from PyPI.
Function details: For items with PyPI project (P5568) set, imports the following data from PyPI:
- software version identifier (P348) (from PyPI releases). The latest release is marked as preferred, and the preferred rank is removed from older versions if it was added by this bot.
- issue tracker URL (P1401), user manual URL (P2078), source code repository URL (P1324), source code repository URL (P1324) (from the metadata of the latest release)
Additionally the PyPI project (P5568) value will be updated to the normalized name if it's not already in that form.
Taavi (talk) 19:54, 11 July 2023 (UTC)[reply]
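Two pieces of this task can be sketched in a few lines: PEP 503 name normalization (the rule behind the P5568 cleanup step) and the rank bookkeeping for versions. The latest version itself would come from the `info.version` field of the PyPI JSON API at `https://pypi.org/pypi/<name>/json`; the helpers below are a sketch and only cover the pure logic:

```python
import re

def normalize_pypi_name(name):
    """PEP 503 name normalization: runs of '-', '_' and '.'
    collapse to a single '-', and the name is lowercased."""
    return re.sub(r"[-_.]+", "-", name).lower()

def rank_updates(versions, latest):
    """Return {version: rank}, marking the latest release as preferred
    and demoting every other version to normal rank."""
    return {v: ("preferred" if v == latest else "normal") for v in versions}
```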
- how many statements do you think this will add? don't some packages have...lots of versions? BrokenSegue (talk) 20:05, 11 July 2023 (UTC)[reply]
- Good point. There are about 200k releases it could import (for about 2k packages total, so about 90 per package on average). Taking an approach similar to github-wiki-bot and only importing the most recent releases could bring it down to 75k for the last 100 (33 per package on average) or 50k for the last 50 (22 per package on average). Taavi (talk) 20:50, 11 July 2023 (UTC)[reply]
- i don't suppose major releases only is an option? BrokenSegue (talk) 20:54, 11 July 2023 (UTC)[reply]
- I don't think there's a consistent enough definition for that. For example, Home Assistant (Q28957018) now does year.month.patch type releases, so the first digit changing isn't really meaningful.
- However, I can filter out all packages generated from https://github.com/vemel/mypy_boto3_builder, as those are all very similar and not intended for human use directly anyway. That cuts the total number of versions to a third (~70k) even before doing any other per-package limits. Taavi (talk) 21:15, 11 July 2023 (UTC)[reply]
- See also Wikidata:Requests for permissions/Bot/RPI2026F1Bot 5 for discussion of a previous similar task (seems not active) and Github-wiki-bot imports version data from GitHub (see e.g. history of modelscope (Q120550399)); however you should care that version numbers may be different between GitHub and PyPI.--GZWDer (talk) 11:38, 12 July 2023 (UTC)[reply]
- ┌────────────────────────────────────────────────────────────────────────────────────────────────────┘ Oh yes, the RPI2026F1Bot task looks somewhat similar. I'm aware of Github-wiki-bot, but there are quite a few PyPI projects that are not hosted on GitHub, and I think my code should be able to handle items with data from both and ensure the two bots don't start edit warring for example. Taavi (talk) 17:23, 12 July 2023 (UTC)[reply]
- @Taavi: Please make some test edits. --Wüstenspringmaus talk 11:05, 29 August 2024 (UTC)[reply]
FromCrossrefBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Carlinmack (talk • contribs • logs)
Task/s: Using information from Crossref:
- Add publication date to items where they are not present in Wikidata
- Fix publication dates where they are erroneous
Code: Will be using Pywikibot in a similar way as I have done previously with this bot
Function details: Previously this bot has been used to add CC licenses to items, which has been successful. In March 2022 it was realised that other bots/tools were using the wrong date for publication date in Crossref. Since I am working with this dump, I will step up to try to fix this issue.
A simpler task is to fill in the publication dates for items that lack them. I've created a set of 80k items and, once given the go-ahead, I will contribute these dates.
The issue of the wrong dates is a little more complicated as there are some false positives on both sides of this, sometimes Crossref is wrong and sometimes Wikidata is wrong. I'm sure that Wikidata is wrong more often, however before doing any edits I will do some manual validation to check the prevalence of false positives. When I am fairly confident I will start editing and I'll see whether I can deprecate the existing statement, add a reason and add the new date as preferred. If not, due to limitations in Pywikibot, I'll remove the previous statement instead. --Carlinmack (talk) 14:31, 7 July 2023 (UTC)[reply]
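For reference, Crossref's REST API returns dates as nested "date-parts" arrays in which month and day may be absent. A sketch of the parsing step, assuming that representation (the helper name is my own):

```python
def parse_date_parts(message: dict):
    """Extract (year, month, day) from a Crossref work's 'published' field.
    Missing parts come back as None."""
    parts = (message.get("published", {}).get("date-parts") or [[]])[0]
    return tuple(parts[i] if i < len(parts) else None for i in range(3))
```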
- Support This seems useful. However I see only one example edit for this so far, maybe you should do some more just to verify it's doing what we expect? You will be using the "published" date-parts data in the Crossref json files for this? If an item already has the correct published date value will you add the reference? Maybe that should only be done if the published date doesn't already have a reference though... ArthurPSmith (talk) 18:17, 24 July 2023 (UTC)[reply]
- Pls make some test edits.--Ymblanter (talk) 15:53, 9 August 2023 (UTC)[reply]
- @User:Carlinmack: What about "erroneous" in Crossref and corrected in WD? --Succu (talk) 20:19, 7 November 2023 (UTC)[reply]
UrbanBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Urban Versis 32 (talk • contribs • logs)
Note: A discussion at Wikipedia about this bot took place: Wikipedia:Village_pump_(proposals)/Archive_202#Bot_to_add_short_descriptions_to_articles_in_a_category
Task/s: UrbanBot's task is to mass-add English descriptions to items that don't have one.
Code: Main repository for UrbanBot's code Source code file for task
Function details:
1. The bot operator will first enter a category name from the English Wikipedia. This category will be used to group similar pages (items on Wikidata) which will all have the same description added to them.
2. The bot operator will enter the description to be added to the pages in the Wikipedia category.
3. The bot will follow through these steps for each page:
3a. The bot will check if the Wikipedia page has a corresponding item.
3b. The bot will check if the item already has a description
3c. If the Wikipedia page has a corresponding item and the item does not already have a description, the bot will write the description specified by the bot operator in step 2 into the item.
3d. The bot will loop through to the next page in the category and run all steps in step 3 again.
Due to the bot requiring the bot operator to enter the English Wikipedia category and the description for the items, the bot is semi-automated. I have already run the aforementioned process with the bot a few times to add descriptions to items and make sure the code was working properly.
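Steps 1-3 above could be sketched as follows, assuming pywikibot (the function and helper names are my own, and the exception name varies between pywikibot versions):

```python
def should_add_description(descriptions: dict, lang: str = "en") -> bool:
    # Step 3b: only items lacking a description in the target language qualify
    return not descriptions.get(lang)

def add_descriptions(category_name: str, description: str, lang: str = "en"):
    import pywikibot
    from pywikibot import pagegenerators

    site = pywikibot.Site("en", "wikipedia")
    cat = pywikibot.Category(site, category_name)
    for page in pagegenerators.CategorizedPageGenerator(cat):  # step 3d: loop
        try:
            item = pywikibot.ItemPage.fromPage(page)  # step 3a
        except pywikibot.exceptions.NoPageError:
            continue  # page has no corresponding item
        item.get()
        if should_add_description(item.descriptions, lang):  # step 3b
            item.editDescriptions({lang: description},  # step 3c
                                  summary="add description (semi-automated)")
```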
Thanks, Urban Versis 32KB ⚡ (talk | contribs) 16:04, 29 June 2023 (UTC)[reply]
- Support This sounds fine as long as you are aware of Wikidata's style guide for descriptions. Confirm that you've read Help:Description? BrokenSegue (talk) 16:24, 29 June 2023 (UTC)[reply]
- @BrokenSegue Yes, I have and am aware of the formatting of descriptions. Urban Versis 32KB ⚡ (talk | contribs) 03:55, 10 July 2023 (UTC)[reply]
- Support Looks fine to me too, at least if you'll be following pretty much the pattern you've tested with. One note - two items with the same primary label cannot have the same description string in Wikidata; I'm not sure if your bot would ever run into that but it might be an error condition you'll have to check for. ArthurPSmith (talk) 20:33, 29 June 2023 (UTC)[reply]
- Comment Another approach might be to add the short descriptions to enwiki, which are then automatically copied over here by Pi bot. That might help reduce the number of differences of descriptions here and there in the longer term. Thanks. Mike Peel (talk) 16:46, 3 July 2023 (UTC)[reply]
- the style guide for wikipedia/wikidata descriptions are not the same though BrokenSegue (talk) 17:24, 5 July 2023 (UTC)[reply]
- @Mike Peel Actually, this was my original plan and I discussed it at Wikipedia:Village_pump_(proposals)/Archive_202#Bot_to_add_short_descriptions_to_articles_in_a_category but I was suggested to bring it here as the bot would mainly edit Wikidata and editing Wikipedia would only create extra steps. Urban Versis 32KB ⚡ (talk | contribs) 03:57, 10 July 2023 (UTC)[reply]
- @BrokenSegue, Urban Versis 32: Those are both problems that should be fixed. English Wikipedia seems to want the extra steps, it would be useful if they didn't self-contradict themselves... Thanks. Mike Peel (talk) 21:24, 12 July 2023 (UTC)[reply]
- those won't be fixed in this request for permission. BrokenSegue (talk) 22:47, 12 July 2023 (UTC)[reply]
- @Mike Peel Not sure what you mean by English Wikipedia wanting the extra steps, but if an en-wiki article is linked to a Wikidata item with a description, the description takes the place of a short description on Wikipedia. For example, viewing this Wikipedia category with the shortdescs-in-category tool will reveal that some articles have a locally-added short description whereas one page doesn't have a short description but its corresponding Wikidata item did have a description, which took the place of a Wikipedia short description. Urban Versis 32KB ⚡ (talk | contribs) 22:50, 13 July 2023 (UTC)[reply]
- @Mike Peel Actually, I stand corrected. I was looking through the en-wiki Wikiproject Short Descriptions (link here) and it looks like Wikidata descriptions are actually not really used as a replacement for a Wikipedia short description. Therefore, I think I will submit a bot request to en-wiki as you were correct about Short descriptions being a much higher priority on Wikipedia compared to Wikidata descriptions. I will leave this request up however, in case I run into people saying the same thing at Wikipedia as they did before. After the bot (hopefully) gets approved, I will take this one down. Thanks again, Urban Versis 32KB ⚡ (talk | contribs) 02:40, 15 July 2023 (UTC)[reply]
ACMIsyncbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Pxxlhxslxn (talk • contribs • logs)
Task/s: Sync links with ACMI API.
Code: https://github.com/ACMILabs/acmi-wikidata-bot/blob/main/acmi_bot.py
Function details: As part of an upcoming residency with the ACMI (Q4823962) I have written a small bot to pull Wikidata links from their public API and write back to Wikidata to ensure sync between the two resources. The plan was to integrate this as part of the build workflow for the ACMI API (https://github.com/ACMILabs/acmi-api). This is currently set to append only, not removing any links Wikidata-side. While the initial link count is only around 1500, there will likely be significant expansion in the coming weeks as we identify further overlaps. --Pxxlhxslxn (talk) 00:36, 16 May 2023 (UTC)[reply]
- can you add a reference? can you set an edit summary (just add a "summary" arg to the write call)? Otherwise looks good. BrokenSegue (talk) 01:23, 16 May 2023 (UTC)[reply]
- Oh dear, I have tried to change the bot name and now I see I have screwed things up a bit in relation to this form (ie the discussion is still under the old name). Should I just open a new request? I have also added the edit summary to the write function. Pxxlhxslxn (talk) 10:48, 16 May 2023 (UTC)[reply]
- No need to open a new request as far as I am concerned. Ymblanter (talk) 19:06, 17 May 2023 (UTC)[reply]
- We have now finished the test sample group for the bot and it is working as expected. Are there any other requirements or impediments to being added to the "bot" group? I also had a question about something we have encountered: code and credentials work fine when run alone as a standalone Python process, but when integrated as a GitHub action (triggered by the ACMI API build) there is a "wikibaseintegrator.wbi_exceptions.MWApiError: 'You do not have the permissions needed to carry out this action.'" error message. Has anyone ever encountered this issue before? The only factor I can think of is maybe some kind of IP block. --Pxxlhxslxn (talk) 11:52, 2 June 2023 (UTC)[reply]
- I don't think it's an IP block. BrokenSegue (talk) 20:40, 22 June 2023 (UTC)[reply]
WikiRankBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Danielyepezgarces (talk • contribs • logs)
Task/s: Use Alexa rank (P1661)
Code: To be published soon.
Function details: I am making a bot that can track the monthly ranking of websites based on Similarweb Ranking. The bot will receive a list of websites with their corresponding Wikidata IDs and domains to keep the data accurate.
The bot will have to use the Similarweb Top Sites API to get the traffic ranking of each website and store it in a MySQL database along with the date of the ranking. If the website already exists in the database, the bot should update its ranking and date every time there is a new ranking update.
Soon the bot will include some new features that will be communicated in the future.
- The Similarweb ranking is not this property. It is Similarweb ranking (P10768).--GZWDer (talk) 05:16, 12 May 2023 (UTC)[reply]
- That's correct: the bot uses property P10768 and replaces the old property P1661, since the public Alexa Rank data ceased to exist.
- When I wrote "Similarweb Ranking" I didn't mean the property P10768, but that the bot takes the data from the similarweb.com website. Danielyepezgarces (talk) 16:15, 17 May 2023 (UTC)[reply]
- what edits is this bot making? BrokenSegue (talk) 15:59, 22 February 2024 (UTC)[reply]
ForgesBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Dachary (talk • contribs • logs)
Task/s: Add licensing information to software forge entries in accordance with what is found in the corresponding Wikipedia pages. It is used as a helper in the context of the Forges project.
Code: https://lab.forgefriends.org/friendlyforgeformat/f3-wikidata-bot/
Function details: ForgesBot is a CLI tool designed to be used by participants in the Forges project in two steps. First it is run to do some sanity checks, such as verifying that forges are associated with a license. If some information is missing, the participant can add it manually or use ForgesBot to do so.
The implementation includes one plugin for each task. There is currently only one plugin, to verify and edit the license information. The license is deduced by querying the Wikipedia pages of each software project: if they consistently mention the same license, the edit can be done. If there are discrepancies, they are reported and no action is taken.
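The consistency rule described above is simple enough to sketch as a pure function (the name and representation are my own, not taken from the linked repository):

```python
def deduce_license(mentions):
    """Return the license if every Wikipedia page mentions the same one;
    on any discrepancy (or no data) return None so the edit is skipped
    and the discrepancy reported instead."""
    unique = set(mentions)
    return unique.pop() if len(unique) == 1 else None
```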
--Dachary (talk) 09:29, 26 April 2023 (UTC)[reply]
- I don't think I understand the task. Can you do some (~30) test edits? Or try to explain again? BrokenSegue (talk) 17:13, 26 April 2023 (UTC)[reply]
IngeniousBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Premeditated (talk • contribs • logs)
Task/s: Adding identifiers to album items, based on existing identifiers.
Code:
Function details: Adding Spotify album ID (P2205), Apple Music album ID (U.S. version) (P2281), YouTube playlist ID (P4300), SoundCloud ID (P3040), Pandora album ID (P10138), Amazon Standard Identification Number (P5749), Tidal album ID (P4577), Deezer album ID (P2723), Yandex Music release ID (P2819), Anghami album ID (P10972), Boomplay album ID (SOON), and Napster album ID (SOON). Based on previously mentioned properties. --Premeditated (talk) 16:29, 22 March 2023 (UTC)[reply]
- can you go into more detail about how this lookup will be done? link to some test edits? BrokenSegue (talk) 16:36, 22 March 2023 (UTC)[reply]
- @BrokenSegue: Test edits. Lookups are based on a given album identifier, for example Spotify album ID (P2205). UPC, Spotify artist ID (P1902), artist name, number of tracks, names of tracks, ISRC (P1243), and more are compared and looked up on other streaming services' APIs or by scraping, to match "identical" releases. I have made a scoring system where only releases that score 80% or better are added by the bot. The matches that do not get published will be saved to a file, perhaps to be added to Mix'n'match later. - Premeditated (talk) 23:50, 22 March 2023 (UTC)[reply]
- I believe you are misusing the inferred from (P3452) property. Look at the description of that property in English. Please go and fix all the test edits you made. Maybe you want stated in (P248) or similar.
- I think you should add a based on heuristic (P887) statement in the reference? Maybe to record linkage (Q1266546) or similar. This whole workstream seems really similar to what is/was being done by User:Soweego bot. Can you explain how you are different/the same. Maybe we should get input from @Hjfocs:.
- Can you go into more detail about what is creating these scores? How did you verify the scores are meaningful? What kind of model are you using? Is your source code available? What "other streaming services API/scraping" are you using to match "identical" releases? Etc. BrokenSegue (talk) 16:59, 23 March 2023 (UTC)[reply]
- Hey folks, happy to give my 2 cents. I second BrokenSegue's comments: (based on heuristic (P887), record linkage (Q1266546)) reference nodes sound good. @Premeditated: interesting project: it would be great if you could share the code and tell us something more about it. Cheers, Hjfocs (talk) 22:57, 25 March 2023 (UTC)[reply]
- What is the situation here?--Ymblanter (talk) 19:04, 23 June 2023 (UTC)[reply]
LucaDrBiondi@Biondibot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: LucaDrBiondi (talk • contribs • logs)
Task/s: Import us patent from a csv file
For example:
US11387028; Unitary magnet having recessed shapes for forming part of contact areas between adjacent magnets ;Patent number: 11387028;Type: Grant ;Filed: Jan 18, 2019;Date of Patent: Jul 12, 2022;Patent Publication Number: 20210218300;Assignee Whylot SAS (Cambes) Inventors: Romain Ravaud (Labastide-Murat), Loic Mayeur (Saint Santin), Vasile Mihaila (Figeac) ;Primary Examiner: Mohamad A Musleh;Application Number: 16/769,182
US11387027; Radial magnetic circuit assembly device and radial magnetic circuit assembly method ;Patent number: 11387027;Type: Grant ;Filed: Dec 5, 2017;Date of Patent: Jul 12, 2022;Patent Publication Number: 20200075208;Assignee SHENZHEN GRANDSUN ELECTRONIC CO., LTD. (Shenzhen) Inventors: Mickael Bernard Andre Lefebvre (Shenzhen), Gang Xie (Shenzhen), Haiquan Wu (Shenzhen), Weiyong Gong (Shenzhen), Ruiwen Shi (Shenzhen) ;Primary Examiner: Angelica M McKinney;Application Number: 16/491,313
US11387026; Assembly comprising a cylindrical structure supported by a support structure ;Patent number: 11387026;Type: Grant ;Filed: Nov 21, 2018;Date of Patent: Jul 12, 2022;Patent Publication Number: 20210183551;Assignee Siemens Healthcare Limited (Chamberley) Inventors: William James Bickell (Witney), Ashley Fulham (Hinkley), Martin Gambling (Rugby), Martin Howard Hempstead (Ducklington), Graeme Hyson (Milton Keynes), Paul Lewis (Witney), Nicholas Mann (Compton), Michael Simpkins (High Wycombe) ;Primary Examiner: Alexander Talpalatski;Application Number: 16/771,560
Code:
I would like to learn to write my bot to perform this operation. I am using curl in the C language; I have a bot account (for which I now "request for permission") but I get the following error message:
{"login":{"result":"Failed","reason":"Unable to continue login. Your session most likely timed out."}} {"error":{"code":"missingparam","info":"The \"token\" parameter must be set.","*":"See https://www.wikidata.org/w/api.php for API usage.
Probably my bot account is not yet approved...
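That error means the login request was sent without a login token. The MediaWiki API requires a token handshake over a cookie-keeping session; a sketch of the flow follows (the URL-builder helper is my own, and the request steps are left as comments since they need network access and credentials):

```python
import urllib.parse

API = "https://www.wikidata.org/w/api.php"

def build_query(**params) -> str:
    # Every API call needs format=json; sort keys for a stable URL
    params.setdefault("format", "json")
    return API + "?" + urllib.parse.urlencode(sorted(params.items()))

# The flow, using any HTTP client that keeps cookies between requests:
# 1. GET  build_query(action="query", meta="tokens", type="login")
#    -> read query.tokens.logintoken (this also sets session cookies)
# 2. POST action=login with lgname, lgpassword (a bot password) and
#    lgtoken=<logintoken>, reusing the same cookies
# 3. GET  build_query(action="query", meta="tokens", type="csrf")
#    -> csrf token, sent as the "token" parameter in every edit POST
```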
Function details:
Import items into Wikidata starting from title and description, and these properties for now:
P31 (instance of) "United States patent"; P17 (country) "United States"; P1246 (patent number) "link to Google Patents or similar" --LucaDrBiondi (talk) 18:25, 28 February 2023 (UTC)[reply]
- @LucaDrBiondi How many patents are you planning to add this way? ChristianKl ❪✉❫ 12:33, 17 March 2023 (UTC)[reply]
- The bot account to which you link doesn't exist. ChristianKl ❪✉❫ 12:34, 17 March 2023 (UTC)[reply]
- Hi, I am still writing and testing it, and moreover it is not yet a bot, because it is not automatic.
I have imported patent data into a SQL Server database; then I read a patent and, with pywikibot, try for example to search for the assignee (owned by property). If I don't find a match, I search manually. Only if I am sure do I insert the data into Wikidata; this is because I do not want to add data with errors. For example, look at the Q117193724 item. LucaDrBiondi (talk) 18:27, 17 March 2023 (UTC)[reply]
- @ChristianKl
- In the end I developed a bot using pywikibot.
- It is not fully automatic, because the owned by property is mandatory for me.
- So I verify whether Wikidata already has an item to use for this property.
- If I do not find one, I do not import the item (the patent).
- I have already loaded some hundreds of items, for example Q117349404.
- Does a limit exist on the number of items I can import each day?
- At one point I received a warning message from the API.
- Must I do something with my bot user?
- Thank you for your help! LucaDrBiondi (talk) 16:08, 31 March 2023 (UTC)[reply]
Kalliope 7.3 (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Push-f (talk • contribs • logs)
Task/s: Update User:Kalliope 7.3/List of bots every hour.
Code: https://git.push-f.com/wikidata-bots/tree/bots.py
Function details:
I am planning on adding more features e.g. adding a parameter to {{Bot}}
to allow bots to define which properties they edit and then generating a table like:
Property | Bot |
---|---|
software version identifier (P348) | Github-wiki-bot |
but I still have to implement that.
--Push-f (talk) 09:16, 7 December 2022 (UTC)[reply]
- @Push-f: You do not need bot right if the bot only edit subpages of your or your bot's user pages.--GZWDer (talk) 09:26, 7 December 2022 (UTC)[reply]
- @GZWDer: I think I do need something because any attempt to edit a subpage via the API is failing with a Captcha (and I did confirm the email address for the account). --Push-f (talk) 14:12, 7 December 2022 (UTC)[reply]
- You need a confirmed flag for this. GZWDer (talk) 14:13, 7 December 2022 (UTC)[reply]
- Ah ok thanks, then I hereby request the "confirmed" right for my bot. --Push-f (talk) 14:29, 7 December 2022 (UTC)[reply]
- confirmed flags are requested at Wikidata:Requests for permissions/Other rights. BrokenSegue (talk) 06:45, 8 December 2022 (UTC)[reply]
- oh I see you figured that out. never mind. BrokenSegue (talk) 06:45, 8 December 2022 (UTC)[reply]
- confirmed flags are requested at Wikidata:Requests for permissions/Other rights. BrokenSegue (talk) 06:45, 8 December 2022 (UTC)[reply]
- Ah ok thanks, then I hereby request the "confirmed" right for my bot. --Push-f (talk) 14:29, 7 December 2022 (UTC)[reply]
- You need a confirmed flag for this. GZWDer (talk) 14:13, 7 December 2022 (UTC)[reply]
- @GZWDer: I think I do need something because any attempt to edit a subpage via the API is failing with a Captcha (and I did confirm the email address for the account). --Push-f (talk) 14:12, 7 December 2022 (UTC)[reply]
- @Push-f is this request still relevant? I saw that you got the confirmed flag for the bot at one point, but the bot hasn't run since 2023. If you'd like, we can grant a permanent confirmed flag (at least until the account gets autoconfirmed) and then close this request, since you don't really need bot approval if you just edit your own subpages. --DannyS712 (talk) 07:00, 9 June 2024 (UTC)[reply]
DL2204bot 2 (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: DL2204 (talk • contribs • logs)
Task/s: Correct messy entries for scholarly articles of Uztaro. Journal of Humanities and Social Sciences. (Q12268801) and Aldiri. Arkitektura eta abar (Q12253132) journals, add 2020-2022 articles.
Code: We are using WBI 12.0 for interaction with the source Wikibase, and with Wikidata.
Function details: In 2020-21, articles of the two journals (example Q108042527) were uploaded using OpenRefine (see Q108042527 history). That dataset has several problems, such as repeated author statements (with and without "series ordinal" qualifier), incorrect issue numbers, DOIs not present (although existing), download URLs not present (although existing), etc. This proposal consists in re-writing all entries (see all using this query), using data from the newly created Inguma Wikibase (see items for these two journals using this query). Before the operation, we will check the completeness and integrity of the data, and include some missing items (the original source is the SQL database behind https://inguma.eus). --DL2204 (talk) 11:18, 30 November 2022 (UTC)[reply]
- If I'm understanding your query correctly you are planning on editing just 1000 items? Personally I would be comfortable letting you do that without bot approval. Seems like a manual audit would be possible to ensure the quality is acceptable. Either way Support. BrokenSegue (talk) 16:43, 7 December 2022 (UTC)[reply]
- Please make some test edits.--Ymblanter (talk) 20:04, 11 December 2022 (UTC)[reply]
- @DL2204 reminder to make your test edits (or do you want this closed?) --DannyS712 (talk) 07:02, 9 June 2024 (UTC)[reply]
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Withdrawn. --Wüstenspringmaus talk 09:13, 30 August 2024 (UTC)[reply]
Botcrux (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Horcrux (talk • contribs • logs)
Task/s: Replace stated in (P248) to publisher (P123) where the value is Q1158.
Problem description: In 2020 User:Reinheitsgebot made a massive addition of references in which stated in (P248)World Athletics (Q1158) have been added as claim in the reference (edit example). The operation was ok, except that World Athletics (Q1158) is an organization, not a document or a database, therefore thousands of warnings are currently raised up (example). A more suitable property is publisher (P123).
Function details: For technical reasons, I'm not able to fix the source with a single edit, so the bot will:
- copy all the claims in the reference to be removed (except for stated in (P248)World Athletics (Q1158));
- remove the problematic reference;
- add a new reference with all the claims copied plus publisher (P123)World Athletics (Q1158).
The script is ready, here a couple of edits: [6][7]. --Horcrux (talk) 09:04, 28 November 2022 (UTC)[reply]
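The three steps amount to a pure transform on the reference's snaks. A sketch under a simplified {property: [values]} representation (the names and representation are my own, not the script's actual data model):

```python
STATED_IN, PUBLISHER, WORLD_ATHLETICS = "P248", "P123", "Q1158"

def rewrite_reference(snaks: dict) -> dict:
    # Step 1: copy every claim except stated in (P248) -> World Athletics (Q1158)
    new = {p: v[:] for p, v in snaks.items()
           if not (p == STATED_IN and WORLD_ATHLETICS in v)}
    # Step 3: add publisher (P123) -> World Athletics (Q1158) instead
    new.setdefault(PUBLISHER, []).append(WORLD_ATHLETICS)
    return new
```

(Step 2, removing the problematic reference and re-adding the rewritten one, is what actually happens item-side; the transform above just computes the replacement.)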
- Sounds fine, though why not just remove the "stated in" "World Athletics" claim from the reference altogether? Surely that's implied by the athlete ID.
- BrokenSegue (talk) 16:41, 28 November 2022 (UTC)[reply]
- @BrokenSegue: Just because I try to be as complete as I can when I add a reference. But yes, it would also be ok just to execute point #2. --Horcrux (talk) 19:41, 28 November 2022 (UTC)[reply]
- Personally I'd prefer just doing point 2 but I don't care enough to argue either way. I might even argue that this bot doesn't need approval since the scope is so limited and there's warnings. BrokenSegue (talk) 21:39, 28 November 2022 (UTC)[reply]
- Hey Horcrux, are you still interested in doing those edits? Support from me! I also see stated in (P248) + a specific external ID property as a great combo, by the way. I'd say that's how they are usually used as well? The UseAsRef userscript also creates references like that. --Azertus (talk) 17:37, 12 April 2024 (UTC)[reply]
- @Horcrux: If you are still interested, could you make some test edits, please? --Wüstenspringmaus talk 15:38, 28 August 2024 (UTC)[reply]
- It looks like the problem was solved (honestly I don't remember whether by myself after BrokenSegue's reply), hence I withdraw the request. For the record, I've just noticed this statement, which indicates that the correct claim in the reference should be stated in (P248)World Athletics database (Q54960205). --Horcrux (talk) 07:54, 30 August 2024 (UTC)[reply]
- Strangely, from this query I still see 20k+ entries, but I'm not able to trace back the statements they occur in (see the query in the previous comment). --Horcrux (talk) 07:57, 30 August 2024 (UTC)[reply]
Cewbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Kanashimi (talk • contribs • logs)
Task/s: Add sitelink to redirect (Q70893996) for sitelinks to redirects without intentional sitelink to redirect (Q70894304).
Code: github
Function details: Find redirects in wiki projects, and check if there is sitelink to redirect (Q70893996) / intentional sitelink to redirect (Q70894304) or not. Add sitelink to redirect (Q70893996) for sitelinks without sitelink to redirect (Q70893996) or intentional sitelink to redirect (Q70894304). Also see Wikidata:Sitelinks to redirects. --Kanashimi (talk) 02:19, 15 November 2022 (UTC)[reply]
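The badge decision reduces to one check per sitelink; a minimal sketch of that logic (the function name is my own):

```python
SITELINK_TO_REDIRECT = "Q70893996"
INTENTIONAL_REDIRECT = "Q70894304"

def badge_to_add(target_is_redirect: bool, badges: set):
    """Return the badge item to add, or None if nothing should change:
    only redirect targets with neither redirect badge get Q70893996."""
    if target_is_redirect and not badges & {SITELINK_TO_REDIRECT,
                                            INTENTIONAL_REDIRECT}:
        return SITELINK_TO_REDIRECT
    return None
```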
- sounds good. link to the source? BrokenSegue (talk) 05:28, 15 November 2022 (UTC)[reply]
- I haven't started writing code yet. I found that there is already another task Wikidata:Requests for permissions/Bot/MsynBot 10 running. What if I treat this task as a backup task? Or is this not actually necessary? Kanashimi (talk) 03:34, 21 November 2022 (UTC)[reply]
- The complete source code of my bot is here: https://github.com/MisterSynergy/redirect_sitelink_badges. It is a bit of a work-in-progress, since I need to address all sorts of special situations that my bot comes across during the initial backlog processing.
- You can of course come up with something similar, but after the initial backlog has been cleared, there is actually not that much work left to do. Given how complex this task turned out to be, I am not sure whether it is worth making a completely separate implementation for this task. Yet, it's your choice.
- Anyways, my bot would not be affected by the presence of another one in a similar field of work. —MisterSynergy (talk) 18:55, 21 November 2022 (UTC)[reply]
- I haven't started writing code yet. I found that there is already another task Wikidata:Requests for permissions/Bot/MsynBot 10 running. What if I treat this task as a backup task? Or is this not actually necessary? Kanashimi (talk) 03:34, 21 November 2022 (UTC)[reply]
Support Just another implementation of an approved task; why not trust this one? Midleading (talk) 15:42, 4 November 2024 (UTC)[reply]
Mr Robot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Liridon (talk • contribs • logs)
Task/s: Add descriptions/labels/aliases
Code: https://github.com/emijrp/wikidata
Function details: I have been using QuickStatements to work on large numbers of items and properties across many types of items, and have more than 12 million edits so far. I intend to continue to do so, and following this discussion I am applying for the bot flag for this account in order to avoid flooding Recent Changes and watchlists.--Liridon (talk) 14:09, 4 November 2022 (UTC)[reply]
- I don't think we grant blanket approval for bots. Can you specify what tasks you will be working on? BrokenSegue (talk) 16:31, 4 November 2022 (UTC)[reply]
- I've already done some tasks with this account using scripts which are part of the github link, eg ([8], [9] ...) through paws.wmcloud.org. Liridon (talk) 17:35, 8 November 2022 (UTC)[reply]
- that doesn't really answer the question. I don't think we grant blanket approval. BrokenSegue (talk) 17:25, 11 November 2022 (UTC)[reply]
- You guys did approve this one, which had similar task description.--Liridon (talk) 16:46, 13 December 2022 (UTC)[reply]
- @BrokenSegue Hello. Liridon is flooding my Watchlist with his edits adding sq labels to people items. And he's saying he cannot use the bot account because the bot request here was not approved. Can we grant him approval specifically for this kind of edits? Please - for the sake of my watchlist... Thanks... Vojtěch Dostál (talk) 18:28, 18 February 2023 (UTC)[reply]
- @Vojtěch Dostál: I'm not a bcrat. I can't assign the bot flag. BrokenSegue (talk) 18:43, 18 February 2023 (UTC)[reply]
- Or we can block the user for running unapproved bot. Ymblanter (talk) 20:26, 19 February 2023 (UTC)[reply]
- What? You can't block me for this. I query items through https://query.wikidata.org/, find those without a specific label or description, then edit them all with QuickStatements. They are not bot edits. Liridon (talk) 13:42, 20 February 2023 (UTC)[reply]
- the bot policy does not specify what technology the bot uses to make the edits. the point of the policy is to provide some oversight over large batch edits. BrokenSegue (talk) 21:46, 26 February 2023 (UTC)[reply]
- I'm not doing these edits with the candidate bot account (Mr Robot), but with my non-bot account (Liridon). Except for flooding the watchlists of other users with my semi-automated edits (which I'm sure a lot of other users do), nothing is against any rules of Wikidata. Liridon (talk) 13:03, 2 March 2023 (UTC)[reply]
RobertgarrigosBOT (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Robertgarrigos (talk • contribs • logs)
Task/s: I'm using OpenRefine to edit items related to Wikidata:Wikiproject_Lieder, beginning by adding the new subclass lyrico-musical work (Q114586269) to the actual lieder in WD. I hope to gain some experience before moving on to further edits.
Code:
Function details:
--Robertgarrigos (talk) 19:42, 16 October 2022 (UTC)[reply]
YSObot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: YSObot (talk • contribs • logs)
Task/s: Account for mapping Wikidata with General Finnish Ontology (Q27303896) and the YSO-places ontology by adding YSO ID (P2347) and for creating new corresponding concepts in case there are no matches.
Code: n/a. Uploads will be done mainly with OpenRefine, Mix'n'match, and corresponding tools.
Function details: YSO includes over 40,000 concepts and about half of them are already mapped. The mapping includes:
- adding possible missing labels in Finnish, Swedish and English
- adding YSO ID (P2347) with subject named as (P1810) values from YSO
- adding stated in (P248) with value YSO-Wikidata mapping project (Q89345680) and retrieved (P813) with the date.
Matches are checked manually before upload. Double-checking is done afterwards using the constraint violations report.
Flag/s: High-volume editing, Edit existing pages, Create, edit, and move pages
--YSObot (talk) 11:33, 16 December 2021 (UTC)[reply]
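The statement layout described in the function details can be sketched as a Wikibase-style JSON fragment; the helper below is illustrative and only mirrors the properties named above (P2347 with a P1810 qualifier, referenced by P248 = Q89345680 and P813):

```python
def yso_statement(yso_id: str, yso_label: str, retrieved_time: str) -> dict:
    """Build a YSO ID (P2347) claim with subject named as (P1810) and the
    reference described in the request (stated in P248 = Q89345680,
    retrieved P813). retrieved_time uses Wikibase time format,
    e.g. '+2021-12-16T00:00:00Z'."""
    def item_snak(prop: str, qid: str) -> dict:
        return {"snaktype": "value", "property": prop,
                "datavalue": {"type": "wikibase-entityid",
                              "value": {"entity-type": "item", "id": qid}}}

    def string_snak(prop: str, text: str) -> dict:
        return {"snaktype": "value", "property": prop,
                "datavalue": {"type": "string", "value": text}}

    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": string_snak("P2347", yso_id),
        "qualifiers": {"P1810": [string_snak("P1810", yso_label)]},
        "references": [{
            "snaks": {
                "P248": [item_snak("P248", "Q89345680")],
                "P813": [{"snaktype": "value", "property": "P813",
                          "datavalue": {"type": "time",
                                        "value": {"time": retrieved_time,
                                                  "precision": 11,
                                                  "calendarmodel":
                                                      "http://www.wikidata.org/entity/Q1985727"}}}],
            },
        }],
    }
```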
- The bot was running without approval (this page was never included). I asked the operator to first get it approved. Can you please explain the creation of museum (Q113965327) & theatre (Q113965328) and similar duplicate items? Multichill (talk) 16:27, 15 September 2022 (UTC)[reply]
- museo (Q113965327) & teatteri (Q113965328) are part of the Finnish National Land Survey classification for places. These classes will be mapped with existing items if they are exact matches by using Property:P2959.
- Considering duplicate YSO-ID instances: these are most often due to modeling differences between Wikidata and YSO. Some concepts are split in one vocabulary but not the other, and vice versa. These are due to linguistic and cultural differences in vocabularies and concept formation. Currently the duplicates would be added to the exceptions list of the YSO-ID property P2347. However, lifting the single-value constraint for this property is another option here.
- Anyway, YSObot is currently an important tool in the effort to complete the mapping of the 30,000+ concepts of YSO with Wikidata. Uploads of YSO-IDs are made to reconciled items from OpenRefine. See YSO-Wikidata mapping project and the log of YSObot. For the moment, uploads are usually done to only 10-500 items at a time, a few times per day at most. Saarik (talk) 13:46, 23 September 2022 (UTC)[reply]
- That's not really how Wikidata works. All your new creations look like duplicates of existing items, so they shouldn't have been created. Your proposed usage of {{P|P2959}} is incorrect. With the current explanation I Oppose this bot. You should first clean up all these duplicates before doing any more edits with this bot. @Susannaanas: care to comment on this? Multichill (talk) 09:58, 24 September 2022 (UTC)[reply]
- This bot is very important, we just need to reach common understanding about how to model the specific Finnish National Land Survey concepts. I have myself struggled with them previously. There is no need to oppose to the bot itself. – Susanna Ånäs (Susannaanas) (talk) 18:02, 25 September 2022 (UTC)[reply]
- why do we want to maintain permanently duplicated items? this seems like a bad outcome. why not instead make these subclasses of the things they are duplicates of. or attach the identifier to already existing items. BrokenSegue (talk) 20:36, 11 October 2022 (UTC)[reply]
- I think this discussion went a little astray from the original purpose of YSObot.
- The creation of the Finnish National Land Survey place types was erroneously done with the YSObot account, although they are not related to YSO at all. I was adding them manually with OpenRefine but forgot to change the user IDs in my OpenRefine! I thought that would not be a big issue. The comments by @Multichill and @BrokenSegue are not really related to the original use of YSObot and do not belong here at all, but rather on the Q106589826 talk page.
- About the duplicate question: earlier, I did exactly that and added these to already existing items with the "instance of" property. Then I received feedback and was told to create separate items for the types. So now I am getting totally opposite instructions from you. Let's move this discussion to its proper place.
- And please, add the correct rights for this bot account if they are still missing, as we still need to add the remaining 10,000+ identifiers. Saarik (talk) 11:32, 27 October 2022 (UTC)[reply]
- Oppose as per above. If you refrain from creating new items, I would probably support this, provided I could easily see the flow of logic.
- I strongly encourage you to publish an activity diagram (e.g. PlantUML) showing the logic of the matching.
- Thanks in advance. So9q (talk) 10:26, 2 January 2024 (UTC)[reply]
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Not done / Withdrawn. --Wüstenspringmaus talk 10:55, 29 August 2024 (UTC)[reply]
AradglBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Aradgl (talk • contribs • logs)
Task/s:
Create between 100,000 and 200,000 new lexemes in Aragonese language Aragonese (Q8765)
Code:
Function details: --Aradgl (talk) 19:43, 14 March 2022 (UTC)[reply]
Using a small program and the API, the bot will create new lexemes in Aragonese, specifying the lexical category, the language, and some of their forms.
I have about 30,000 lexemes prepared and I have started uploading them
In the coming weeks and months I hope to reach 100,000 or 200,000 new lexemes.
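The creation step described above ("specifying the lexical category, the language and some of its forms") corresponds roughly to a `wbeditentity&new=lexeme` payload like the sketch below; the helper function and its defaults are illustrative, not the operator's actual program:

```python
def aragonese_lexeme_payload(lemma: str, lexical_category_qid: str,
                             form_values: list[str]) -> dict:
    """Data for action=wbeditentity&new=lexeme: an Aragonese (Q8765,
    language code 'an') lemma, its lexical category, and bare forms."""
    return {
        "lemmas": {"an": {"language": "an", "value": lemma}},
        "language": "Q8765",          # Aragonese (Q8765)
        "lexicalCategory": lexical_category_qid,
        "forms": [
            {"add": "",               # marker that this form is new
             "representations": {"an": {"language": "an", "value": value}},
             "grammaticalFeatures": [],
             "claims": []}
            for value in form_values
        ],
    }
```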
- Oppose on principle, since senses (meanings) of these words, or links to references for each lexeme (such as to dictionary entries for these words, or other lexical identifiers for these words) are not also being provided. We already have massive backlogs of senseless lexemes for a bunch of languages (see the bottom of the first table); I will not support making this backlog inordinately larger. Mahir256 (talk) 20:58, 23 March 2022 (UTC)[reply]
- We understand your observations. You are right that no meanings or links are provided at this stage. However, this is only natural since this is the beginning of a broader task that we are starting now.
- Due to the lack of resources of a minority language such as Aragonese (spoken by fewer than 30,000 people), we believe this is the most sensible way to proceed: step by step. Moreover, Aragonese is on the brink of extinction according to UNESCO.
- Undermining any effort to dignify its status will definitely speed up the death of the Aragonese language. On the contrary, we ask for support to promote our beloved language.
- Thank you very much. Aradgl (talk) 18:46, 24 March 2022 (UTC)[reply]
- @Aradgl: I'm not sure where you're getting that I'm interested in undermining Aragonese's dignity or speeding up the death of Aragonese. On the contrary, I'd love to see Aragonese thrive as an independent and flourishing tongue, but there should be just enough in that language's lexemes to begin with such that improvements to them, both from inside and outside the language community, are actually conceivable. Consider Breton lexemes: the language itself is also endangered, and most Breton lexemes currently do not have senses, but they do have links to Lexique étymologique du breton moderne (Q19216625), so that someone else (not necessarily a Breton speaker) can come by and at least add information based on that lexicon (@VIGNERON, Envlh:, who imported them). On the other hand, consider Estonian lexemes; an Estonian non-native speaker created a bunch of them over the course of a few days, all of them without senses, and most still sit as empty shells, with no clear way for non-Estonians to improve them and no indication that actual Estonian speakers even know they exist. I am happy to look around for references you could add to potential Aragonese lexemes, such that you can add some potential resource links based on them, but that is not a reason to begin importing them now without any such resources (especially since you have not indicated how/when you plan to add senses/resource links later). Mahir256 (talk) 20:01, 24 March 2022 (UTC)[reply]
- @Mahir256 Right now we are discussing our timetable in order to implement next steps within Wikidata, with the prospect of relating lexemes with concepts and meanings. We count on finishing the first phase by the end of 2022.
- By no means have we wanted to create lexemes as “empty shells”. We are working in a long-term project in order to provide valuable information for the sake of Aragonese language. We are working together with our Occitan counterparts (Lo Congrès) and in fact, we want to follow their example promoting further contributions from the community. Our reference is AitalDisem, a project initiated by Lo Congrès following its collaboration with Wikidata. This project is the direct continuation of the project AitalvivemBot. Aradgl (talk) 15:09, 25 March 2022 (UTC)[reply]
- @Aradgl: I'll believe that you don't want to create empty shell lexemes, but I find it difficult to believe, given the prior examples of Russian, Estonian, Latin, and Hebrew lexemes, that they won't stay empty shells forever. If you are basing your work on the example of Aitalvivem, then (at least judging from that bot's contributions, which stopped in July 2019) you are not likely to be applying the right amount of attention to senses/resource linkages that would be desired, and (at least judging from the outcome of this bot request, from a user who disappeared after January 2020) you might disappear if prompted later about them.
- You speak of wanting to add "valuable information for the sake of the language", but I fear that if there are no paths to this valuable information (with respect to the meanings of words) early on, then it is unlikely there ever will be such paths. If you are absolutely certain that existing printed/online references about Aragonese are not suitable/worthy of at least being linked to, and thus plan to essentially only crowdsource word meanings the same way the Occitan folks appear to have attempted, then what you could instead do (and what would change my opposition to a support) is have your system create lexemes only when an appropriate meaning has been added to that lexeme in that system by a community member, rather than creating lexemes with just the forms all at once waiting to be filled in on Wikidata. Mahir256 (talk) 15:37, 25 March 2022 (UTC)[reply]
- @Mahir256: I'm the one who was supposed to continue the work on AitalvivemBot. Unfortunately, I have been suffering from long COVID since March 2020 and all my work has been postponed. But we still intend to add Occitan lexemes to Wikidata, if you think that can be useful. I thought the purpose of Wikidata lexemes was to inventory words from languages. I never heard that adding senses to them was a mandatory requirement. Is that the case now? If it is, of course we wouldn't disturb the work done in Wikidata by uploading a lot of words without senses. Minority languages, indeed, don't have a lot of human and financial means, and we can't move forward at the speed the main languages do (you see it with Occitan: one person is sick and much work is postponed for years). Of course, we can't guarantee all the words we upload will be related to a meaning. But we intend to try with the modest means we have. On the other hand, all our words are from recognized dictionaries. Is that still interesting for Wikidata, or would it be better if we keep them for ourselves? Unuaiga (talk) 14:00, 28 March 2022 (UTC)[reply]
- @Unuaiga: I'm sorry to hear that you have had long COVID this whole time—I sincerely hope you can recover! Please re-read my reply from 20:01, 24 March 2022 (UTC) above, and VIGNERON's comments below (in other words, you don't need senses if you can provide a way to add them later). Wikidata lexicographical data can do so much more than "inventory(ing) words from languages"; it's only appropriate that if more isn't done immediately after creating a lexeme, then opportunities for doing so (through the linkages of references) ought to be provided. My offer to find references re: Aragonese to Aradgl from 20:01, 24 March 2022 (UTC) above is extended to you re: Occitan. As for minority languages not moving as fast as main languages, I point you to the examples, in addition to Breton, of Hausa, Igbo, and Dagbani as under-resourced languages making lots of progress on lexemes. Mahir256 (talk) 14:23, 28 March 2022 (UTC)[reply]
- Thanks for your explanations. I will look at the languages you mention with great curiosity. Unuaiga (talk) 16:04, 28 March 2022 (UTC)[reply]
- @Aradgl: this is a wonderful project, but I have to agree with Mahir256: it doesn't seem ready yet (for Breton, after a ~4,000-lexeme import, and even with some information about meanings, I estimated at least a year of weekly manual work to have good lexemes :/ this is already painful; 100,000 to 200,000 lexemes would be overwhelming).
- I have some additional questions:
- what is the source? And is it public or not? (In either case, it would be better to indicate the source in the lexemes themselves.)
- is your bot ready yet? If so, could you do some test edits (like creating 10 lexemes) so we can see exactly what we are talking about and maybe provide some help.
- Cheers, VIGNERON (talk) 13:23, 27 March 2022 (UTC)[reply]
- @VIGNERON: It seems like the edits the requestor has been making in the Lexeme namespace of late resemble those described in this request. Mahir256 (talk) 16:09, 27 March 2022 (UTC)[reply]
- @Mahir256: ah thanks, I looked at the bot's edits but not at the account behind the bot ;) Indeed, these lexemes are way too empty to be of any use. At the very least, you need to add a source (and ideally, multiple). Maybe you can cross-reference them with other datasets. I'm also wondering: why «between 100,000 and 200,000»? Don't you have the exact number?
- Also, I'm pinging @Fjrc282a, Herrinsa, Jfblanc, Universal Life: who speak Aragonese and might want to know about this Bot and maybe even want to help.
- Cheers, VIGNERON (talk) 16:24, 27 March 2022 (UTC)[reply]
- @Aradgl: Thoughts on VIGNERON's reply from 16:24, 27 March 2022 (UTC)? Mahir256 (talk) 20:14, 8 June 2022 (UTC)[reply]
- @Unuaiga, Miguel&IvanV: If either of you know or can get a hold of @Aradgl:, could you tell that user to reply to User:VIGNERON's messages above? Mahir256 (talk) 16:59, 19 July 2022 (UTC)[reply]
- Ok, I write them an email to tell them. 217.119.181.174 12:09, 25 July 2022 (UTC)[reply]
- Sorry I wasn't connected. I write to them. Unuaiga (talk) 12:10, 25 July 2022 (UTC)[reply]
- @Unuaiga: Thank you for doing that; it is a bit disappointing that Aradgl has not replied, since only their ability to edit the lexeme namespace has been blocked and not their ability to do other things on Wikidata. Do you or @Miguel&IvanV: know @Uesca:, and could inform them of this discussion and the messages I placed on their talk page? Mahir256 (talk) 18:05, 30 August 2022 (UTC)[reply]
- Good morning to the Wikidata community. I want to apologize for my delay in replying. For various reasons I have been absent.
- The source used is from the regional government of Aragon in Spain. It can be consulted with the free and public tool: Aragonario. https://aragonario.aragon.es/
- The bot is created and working. Almost all the lexemes created by the user @Aradgl have been created using the bot.
- Please, @Mahir256, unblock my user account (@Aradgl) and allow me to continue working for the protection and dissemination of the Aragonese language.
- Aradgl (talk) 06:54, 31 August 2022 (UTC)[reply]
- @Aradgl: Thank you for finally providing at least an external source for the lexemes you have created. Since it appears each lexeme has its own ID (the number "67731" in https://aragonario.aragon.es/words/67731/, for example), I would like you to do the following first: 1) propose a Wikidata property to store these IDs (maybe call it "Aragonario ID"), 2) once that property is created and I unblock you from the lexeme namespace, add values for this property to all of the Aragonese lexemes already created, and then 3) commit to only creating lexemes alongside their Aragonario IDs, rather than without these IDs. Mahir256 (talk) 07:13, 31 August 2022 (UTC)[reply]
- @Aradgl: As a gesture of goodwill, I have gone ahead and done the first thing, proposing Wikidata:Property proposal/Aragonario ID, which I will insist @Aradgl, Uesca: add to the lexemes they created before creating any further new ones. Mahir256 (talk) 23:04, 31 August 2022 (UTC)[reply]
- It is not possible to add the Aragonario ID, because although the Aragonario and my data come from the same database on a server, the Aragonario ID only exists on the Aragonario website (the ID is generated by the website and is not in the database on the server that belongs to the Government of Aragon).
- As we have already indicated, we are proposing the introduction of the Aragonese language into Wikidata in several phases, which include providing content and, in the final phases, using Wikidata to create chats in Aragonese, translators, etc.
- The first phase consists of uploading the lexemes so that other colleagues can later add the meanings manually, using (paper) dictionaries and other resources. We would have liked to have all the lexemes (without meanings) created beforehand, because it would have been easier, but given the circumstances some colleagues have already begun to add meanings to the lexemes already created. The more lexemes (without meanings) I create, the easier it will be for my colleagues to add meanings; in fact, the ideal would be for all the lexemes (without meanings) to be created before starting phase two.
- I wish we had the means and resources to tackle all the work in a single phase and in a very short period of time, but this is not the case: there are very few of us working for the defense and safeguarding of Aragonese, and many who put obstacles in our way.
- Can you please let me continue with my work? Don't give us bot permissions, but don't block us for creating lexemes in Aragonese. We will be adding meaning manually from now on to the lexemes and at the same time creating new lexemes (without meaning). Aradgl (talk) 08:15, 23 September 2022 (UTC)[reply]
- Good morning,
- As a result of opening this conversation, I found out about the initiative of the user Aradgl in Wikidata and I have seen the problem you mention.
- I have been including verbs in the Aragonese language, and as I work along the same lines, I have contacted Aradgl and Iizquierdogo (another user who includes Aragonese-language content in Wikidata), and we are going to support Aradgl's initiative by manually including the senses in the lexemes.
- Best regards Miguel&IvanV (talk) 10:22, 23 September 2022 (UTC)[reply]
PodcastBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Germartin1 (talk • contribs • logs)
Task/s: Upload new podcast episodes, extracting: title, part of the series, has quality (explicit episode), full work available at URL (mp3), production code, Apple Podcasts episode ID, Spotify episode ID. Regex extraction: talk show guest, recording date (from the description). The bot will be run manually and only for preselected podcasts.
Code: https://github.com/mshd/wikidata-to-podcast-xml/blob/main/src/import/wikidataCreate.ts
Function details:
- Read XML Feed
- Read Apple podcast feed/ and spotify
- Get latest episode date available on Wikidata
- Loop over all new episodes which do not exist in Wikidata yet
- Extract data
- Import to Wikidata using maxlath/wikidata-edit
--Germartin1 (talk) 04:38, 25 February 2022 (UTC)[reply]
- Comment What is your plan for deciding which episodes are notable? Ainali (talk) 06:40, 21 March 2022 (UTC)[reply]
- Oppose for a bot which would do a blanket import of all Apple or Spotify podcasts. ChristianKl ❪✉❫ 22:46, 22 March 2022 (UTC)[reply]
- Have a look at the code, it's only for certain podcasts and will run only manually. Germartin1 (talk) 05:12, 23 March 2022 (UTC)[reply]
- @Germartin1: Bot approvals are generally for a task. If that task is more narrow, that shouldn't be just noticeable from the code but be included in the task description. ChristianKl ❪✉❫ 11:39, 24 March 2022 (UTC)[reply]
How about episodes of podcasts with a Wikipedia article? @Ainali:--Trade (talk) 18:34, 12 June 2022 (UTC)[reply]
- Support Productive user with a high quality track record.--Big bushlips (talk) 19:29, 25 January 2023 (UTC)[reply]
- Support Are we really letting this proposal languish because the request was incomplete at the time of submission? Proposer has since addressed that only a selection of podcasts will be imported. If the podcast is in Wikidata/Wikipedia, I'd say the episodes are notable. Also the other way around, if we already have an item for the guest(s). @Germartin1: are you still interested in editing about this subject (I noticed you publicly archived your repo)? I did some similar editing (semi-automated using OpenRefine) before and might be interested in trying to set your code up and operate it for Richard Herring's Leicester Square Theatre Podcast (Q96757385) and Between the Brackets (Q108093799). --Azertus (talk) 10:09, 23 August 2023 (UTC)[reply]
- Support As long as we limit to notable podcasts (and their episodes), I support. There is a lot of valuable interconnected data that can come from these objects. Iamcarbon (talk) 21:26, 16 October 2024 (UTC)[reply]