This request was spurred by a discussion at en.WP about citation templates emitting duplicate HTML IDs. It would be nice if we could find pages that have duplicate IDs so that we can fix them. That probably means a maintenance category or a lint error.
See also
For the lint error, we just need to surface https://github.com/wikimedia/parsoid/blob/master/lib/utils/DOMUtils.js#L360
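To make the request concrete, here is a minimal sketch of what "find duplicate IDs" means for a single page's rendered HTML. It uses Python's standard html.parser and is not how Parsoid does it; the names IdCollector and find_duplicate_ids are hypothetical and exist only for this illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects every non-empty id attribute value seen on a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names are lowercased.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def find_duplicate_ids(html):
    """Return the sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)


if __name__ == "__main__":
    sample = '<p id="ref1"></p><span id="ref1"></span><div id="other"></div>'
    print(find_duplicate_ids(sample))
```

A lint category would report each repeated occurrence per page; this sketch only answers whether a page has any duplicates and which values they are.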
Change 493116 had a related patch set uploaded (by Farida; owner: Farida):
[mediawiki/services/parsoid@master] Emit lint error when a page has duplicate HTML IDs
This reappeared in relation to T358588, and maybe we should re-triage it as maintenance work for Content-Transform-Team.
Change #493116 abandoned by Subramanya Sastry:
[mediawiki/services/parsoid@master] Emit lint error when a page has duplicate HTML IDs
Reason:
No longer relevant -- partial patch and we are also in PHP land now.
Change #1073572 had a related patch set uploaded (by Arlolra; author: Arlolra):
[mediawiki/extensions/Linter@master] Add a "duplicate-ids" lint category
Change #1073574 had a related patch set uploaded (by Arlolra; author: Arlolra):
[mediawiki/services/parsoid@master] Lint duplicate ids
Change #1074253 had a related patch set uploaded (by C. Scott Ananian; author: Arlolra):
[mediawiki/extensions/Linter@wmf/1.43.0-wmf.23] Add a "duplicate-ids" lint category
Change #1073572 merged by jenkins-bot:
[mediawiki/extensions/Linter@master] Add a "duplicate-ids" lint category
Change #1074253 merged by jenkins-bot:
[mediawiki/extensions/Linter@wmf/1.43.0-wmf.23] Add a "duplicate-ids" lint category
Mentioned in SAL (#wikimedia-operations) [2024-09-19T20:45:58Z] <dreamyjazz@deploy1003> Started scap sync-world: Backport for [[gerrit:1073871|Re-order arguments to DataAccess::addTrackingCategory]], [[gerrit:1074253|Add a "duplicate-ids" lint category (T200517)]]
Mentioned in SAL (#wikimedia-operations) [2024-09-19T21:00:53Z] <dreamyjazz@deploy1003> dreamyjazz, cscott: Backport for [[gerrit:1073871|Re-order arguments to DataAccess::addTrackingCategory]], [[gerrit:1074253|Add a "duplicate-ids" lint category (T200517)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Mentioned in SAL (#wikimedia-operations) [2024-09-19T21:15:19Z] <dreamyjazz@deploy1003> Finished scap sync-world: Backport for [[gerrit:1073871|Re-order arguments to DataAccess::addTrackingCategory]], [[gerrit:1074253|Add a "duplicate-ids" lint category (T200517)]] (duration: 29m 20s)
Change #1073574 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Lint duplicate ids
Change #1075077 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):
[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.20.0-a22
Change #1075077 merged by jenkins-bot:
[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.20.0-a22
For some reason this new category is timing out for me on en.wp with 100k results. Other categories with more results don't (such as misnested tags, which has 300k). Is the category still filling, or is there something weird going on? (I can work around it by selecting a namespace in the URL, so it's not a critical problem, but not everyone is going to know that that is viable.)
Separate comment: I think the name could use some adjustment. These aren't "Duplicate Ids" (see also https://en.wikipedia.org/wiki/Id,_ego_and_superego#Id ), they're (either) "duplicate IDs" or "duplicate id attributes". ("duplicate ids" is almost as bad as "duplicate Ids")
Can someone please perform the rest of the "add a new Linter condition" checklist before closing this ticket? The new condition needs to be added to the lists in which Linter errors appear, including the Page Information entry for each page, and the necessary documentation and help pages need to be completed.
Izno's comment is also correct; this problem does not relate to Freudian psychology. It's better to fix it now than to wait until reports and other systems depend on a suboptimal naming choice.
including the Page Information entry for each page
Do you have an example where it isn't showing up? It appears to be working here:
https://www.mediawiki.org/wiki/Extension:Scribunto?action=info#Lint_errors
Change #1076048 had a related patch set uploaded (by Arlolra; author: Arlolra):
[mediawiki/extensions/Linter@master] Change capitalization of duplicate IDs
Change #1076048 merged by jenkins-bot:
[mediawiki/extensions/Linter@master] Change capitalization of duplicate IDs
The new condition needs to be added to the lists in which Linter errors appear, ... , and the necessary documentation and help pages need to be completed.
These edits have been made:
https://www.mediawiki.org/w/index.php?title=Help%3ALint_errors%2Fduplicate-ids&diff=6772997&oldid=6772988
https://www.mediawiki.org/w/index.php?title=Help%3ALint_errors&diff=6772876&oldid=6616726
They just started showing up on en.WP between the time of my comment and right now. Thanks for the response.
Hmm, I imagine what's happening is that the linter_cat_page_position index isn't being used, and the primary key on linter_id is being used instead. Since this is a new category, all the linter_id values for its errors will be the newest ones, so quite a few rows will need to be scanned before returning the ~50 requested.
Adding a namespace probably forces the linter_cat_namespace index.
P70205#281191 kind of confirms that.
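As a rough model of why that hurts (a toy simulation, not a measurement of the real table): if the rows of the new category all have the highest linter_id values, walking the primary key in order has to pass every older row before it finds the first match, while an index led by the category only touches that category's rows. The function names and row counts below are made up for illustration.

```python
def build_sample_rows(old_rows=10_000, new_rows=200, old_cat=1, new_cat=25):
    """Return (linter_id, linter_cat) pairs where the new category has the newest ids."""
    rows = [(i, old_cat) for i in range(1, old_rows + 1)]
    rows += [(i, new_cat) for i in range(old_rows + 1, old_rows + new_rows + 1)]
    return rows


def rows_scanned_by_id_order(rows, category, limit):
    """Walk rows in linter_id order, filtering by category, until `limit` matches are found."""
    scanned = 0
    matched = 0
    for _linter_id, cat in sorted(rows):
        scanned += 1
        if cat == category:
            matched += 1
            if matched == limit:
                break
    return scanned


def rows_scanned_by_category_index(rows, category):
    """Touch only the rows of one category, which then still have to be sorted by linter_id."""
    return sum(1 for _linter_id, cat in rows if cat == category)


if __name__ == "__main__":
    rows = build_sample_rows()
    print(rows_scanned_by_id_order(rows, 25, 51))
    print(rows_scanned_by_category_index(rows, 25))
```

In this model the primary-key walk costs roughly "all older rows plus the page size", whereas the category index costs "all rows in that category", which is consistent with a new category being slow while older, larger categories are not.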
Forcing that index,
SELECT page_id,page_namespace,page_title,page_is_redirect,page_is_new,page_latest,page_touched,page_len,page_content_model,page_namespace,page_title,linter_id,linter_params,linter_start,linter_end,linter_cat
  FROM `page`
  JOIN `linter` FORCE INDEX (linter_cat_page_position) ON ((page_id=linter_page))
  WHERE linter_cat = 25
  ORDER BY linter_id
  LIMIT 51;
takes the query from,
51 rows in set (47.742 sec)
to,
51 rows in set (1.708 sec)
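For anyone who wants to poke at index hints locally, here is a small sketch using SQLite from Python's standard library. It is a stand-in technique, not the production setup: SQLite spells the hint INDEXED BY rather than MySQL/MariaDB's FORCE INDEX, the table definitions are trimmed down, and the column list of linter_cat_page_position is an assumption rather than a copy of the real schema.

```python
import sqlite3

# Trimmed-down, assumed schema; only the columns needed for the illustration.
SCHEMA = """
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
CREATE TABLE linter (
    linter_id INTEGER PRIMARY KEY,
    linter_page INTEGER,
    linter_cat INTEGER,
    linter_start INTEGER,
    linter_end INTEGER
);
-- Assumed column list for the index; check the real schema before relying on it.
CREATE INDEX linter_cat_page_position
    ON linter (linter_cat, linter_page, linter_start, linter_end);
"""

# SQLite's INDEXED BY plays the role that FORCE INDEX plays in the query above.
FORCED_QUERY = """
SELECT page_id, page_title, linter_id
FROM linter INDEXED BY linter_cat_page_position
JOIN page ON page_id = linter_page
WHERE linter_cat = 25
ORDER BY linter_id
LIMIT 51
"""


def query_plan(conn, sql):
    """Return the detail column of EXPLAIN QUERY PLAN as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
forced_plan = query_plan(conn, FORCED_QUERY)

if __name__ == "__main__":
    print(forced_plan)
```

The printed plan names the index the engine will use for the linter table, which is the same thing one would check with EXPLAIN on the real database before and after adding the hint.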
Change #1080845 had a related patch set uploaded (by Arlolra; author: Arlolra):
[mediawiki/extensions/Linter@master] [WIP] Force using an index when paging by category
Change #1080845 merged by jenkins-bot:
[mediawiki/extensions/Linter@master] Force the use of the category index when paging by category