I'm working with the Growth team on features that keep growing the number of Wikipedia contributors.
User Details
- User Since
- Oct 4 2021, 3:13 PM (163 w, 4 d)
- Availability
- Available
- LDAP User
- Sergio Gimeno
- MediaWiki User
- SGimeno (WMF)
Yesterday
Thu, Nov 21
After implementing a first version of the multiple-topics-per-query approach, some tests started failing, in particular RemoteSearchTaskSuggesterTest, which asserts that the number of requests made is correct and that the results returned contain the expected data. Adapting the test (we now make fewer queries, so collapsing requests was necessary) revealed another problem. Because the search requests perform one query per topic, our suggester assumes it can record that topic as the "most relevant topic" of the article, and it assigns the result's match score to the topic, e.g. art => 0.89268. This information is then added to each task object generated from the search results and sent to the client for instrumentation purposes. In particular, it is meant to be used by the newcomertask schema instruments to populate the topic and match_score properties. This data was probably used in some old GE experiment analysis. However, looking at the events sent to the relevant schema/stream, we found a bug in the client code that prevents the information from being added to the event payload. As a result, this information has not been present in the ingested events for a long time (I haven't determined since when, but can do so if relevant).
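To illustrate why collapsing requests breaks the per-topic score attribution, here is a minimal Python sketch. It is not the actual suggester code: FAKE_INDEX, search, Task and both suggest_* functions are hypothetical names, and the scores other than 0.89268 are made up for illustration. It assumes a search call that returns one score per matching article.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical per-article topic scores standing in for the search index.
FAKE_INDEX = {
    "Mona Lisa": {"art": 0.89268, "history": 0.41},
    "Battle of Hastings": {"history": 0.77},
}


@dataclass
class Task:
    title: str
    # topic name -> match score, e.g. {"art": 0.89268}
    topics: dict[str, float] = field(default_factory=dict)


def search(topics: list[str]) -> list[tuple[str, float]]:
    """Hypothetical search: one (title, score) pair per article matching ANY topic."""
    results = []
    for title, scores in FAKE_INDEX.items():
        matched = [scores[t] for t in topics if t in scores]
        if matched:
            # Only a single score comes back per result, not one per topic.
            results.append((title, max(matched)))
    return results


def suggest_one_query_per_topic(topics: list[str]) -> tuple[list[Task], int]:
    """One request per topic: each result's score can be attributed to that topic."""
    tasks: dict[str, Task] = {}
    requests = 0
    for topic in topics:
        requests += 1
        for title, score in search([topic]):
            tasks.setdefault(title, Task(title)).topics[topic] = score
    return list(tasks.values()), requests


def suggest_collapsed(topics: list[str]) -> tuple[list[Task], int]:
    """One request for all topics: the score no longer maps to a single topic."""
    tasks = [Task(title) for title, _score in search(topics)]
    return tasks, 1
```

With topics ["art", "history"], the first function makes two requests and can fill in topic scores per task, while the second makes one request and leaves the topic data empty, which is the information the instrumentation was expecting.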
Wed, Nov 20
Tentatively resolving. The charts are back to a correct state, see enwiki's.
Tue, Nov 19
Mon, Nov 18
My observation is that this problem is not restricted to the Add a link task; it affects any task type with enough suggestions for the problem to be noticeable. The same problem can be seen with Find references and other tasks.