Capture special mute events in Prefupdate table [4 hour spike]
Closed, Resolved | Public | Oct 20 2020

Assigned To
Authored By
jwang
Aug 27 2020, 9:25 PM
Referenced Files
F32383627: image.png
Oct 13 2020, 7:16 PM
F32379958: image.png
Oct 10 2020, 1:10 AM
F32197912: image.png
Aug 27 2020, 9:25 PM
F32197952: image.png
Aug 27 2020, 9:25 PM
F32197946: image.png
Aug 27 2020, 9:25 PM
F32197949: image.png
Aug 27 2020, 9:25 PM

Description

I created this ticket to discuss whether special mute events should be captured in the prefupdate table.

The AHT team launched the special mute feature, which lets a user mute other users through the Special:Mute page. Before that, a user could only mute others on their preferences page. From the user's perspective, both ways change the user's mute preference.
The interfaces of these two pages are shown below.

  • snapshot of special mute

image.png (478×646 px, 29 KB)

  • snapshot of mute by preference setting

image.png (576×1 px, 63 KB)

Currently, special mute events are stored in event.specialmutesubmit, whose data is removed after 90 days, so we have no data about them beyond that point. Mute events from the preferences page are stored in the prefupdate table, which is sanitized after 90 days. We lose user-level information after 90 days, but we can still count how often the feature was used historically.

Should special mute events also be captured in the prefupdate table, since they are a change to a user preference? Benefits include: the same type of setting change is stored in the same table, and we don't lose historical data on special mute.

Dataset
From the timestamps we can tell that the events in specialmutesubmit are not in prefupdate.

  • data from prefupdate table

image.png (312×1 px, 53 KB)

  • data from specialmutesubmit table on the same day

image.png (350×890 px, 48 KB)

-- SQL for prefupdate
SELECT *
FROM event_sanitized.prefupdate
WHERE year = 2020 AND month = 8 AND day = 1
  AND array_contains(array(event.property), 'echo-notifications-blacklist');

-- OR
SELECT *
FROM event.prefupdate
WHERE year = 2020 AND month = 8 AND day = 1
  AND substr(dt, 1, 18) = '2020-08-01T03:49:2';

-- SQL for specialmute
SELECT *
FROM event.specialmutesubmit
WHERE year = 2020 AND month = 8 AND day = 1;
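
The timestamp comparison described above can also be done programmatically once both result sets are exported. The following is only a rough sketch: it assumes each row is a dict with an ISO-8601 `dt` field (as suggested by the `substr(dt, 1, 18)` filter), and the function names and the 5-second matching window are hypothetical choices, not part of any existing tooling.

```python
from datetime import datetime, timedelta


def parse_dt(dt):
    """Parse the leading 'YYYY-MM-DDTHH:MM:SS' part of a dt string."""
    return datetime.strptime(dt[:19], "%Y-%m-%dT%H:%M:%S")


def unmatched_mute_events(mute_rows, pref_rows, window_seconds=5):
    """Return the specialmutesubmit rows that have no prefupdate row
    within +/- window_seconds of their timestamp."""
    window = timedelta(seconds=window_seconds)
    pref_times = [parse_dt(row["dt"]) for row in pref_rows]
    unmatched = []
    for row in mute_rows:
        t = parse_dt(row["dt"])
        if not any(abs(t - p) <= window for p in pref_times):
            unmatched.append(row)
    return unmatched


# Hypothetical rows for illustration only (not real data).
mute_rows = [{"dt": "2020-08-01T03:49:21Z"}, {"dt": "2020-08-01T05:00:00Z"}]
pref_rows = [{"dt": "2020-08-01T03:49:22Z"}]
print(unmatched_mute_events(mute_rows, pref_rows))
```

A time window is used instead of exact equality because the two events, if both were emitted, would not necessarily carry identical timestamps.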

Another ticket about the prefupdate table that might be related to this discussion: https://phabricator.wikimedia.org/T260867

Details

Due Date
Oct 20 2020, 4:00 AM

Event Timeline

jwang updated the task description. (Show Details)
jwang added a subscriber: Niharika.
jwang updated the task description. (Show Details)
Niharika renamed this task from Capture special mute events in Prefupdate tabe to Capture special mute events in Prefupdate table.Sep 9 2020, 8:10 PM
Niharika renamed this task from Capture special mute events in Prefupdate table to Capture special mute events in Prefupdate table [4 hour spike].Oct 6 2020, 4:18 PM
Niharika triaged this task as Medium priority.
Niharika moved this task from Triage/To be Estimated to The Letter Song on the Anti-Harassment board.
ARamirez_WMF changed the subtype of this task from "Task" to "Deadline".

It took me a bit, but I was finally able to reproduce this problem.

Whenever I use Special:Mute normally, it does in fact report that a log entry is being sent for prefupdate:

-- event --------------------------------------------------------------------
{
  "event": {
    "bucketedUserEditCount": "5-99 edits",
    "isDefault": true,
    "property": "echo-notifications-blacklist",
    "saveTimestamp": "20201010003411",
    "userId": 1,
    "value": "0",
    "version": "2"
  },
  "recvFrom": "589650fa6e83",
  "revision": 19799589,
  "schema": "PrefUpdate",
  "seqId": 2,
  "timestamp": 1602290051,
  "userAgent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0",
  "uuid": "77cd32cd09f25dd0a4ca15a9d6b42ac6",
  "webHost": "localhost:8888",
  "wiki": "my_wiki"
}
-- validation ---------------------------------------------------------------
Valid.

The only way I was able to avoid this was to disable JavaScript and leave the value unchanged (just hit the submit button after page load). Then I would get the specialmutesubmit event without the prefupdate event:

-- event --------------------------------------------------------------------
{
  "event": {
    "emailsAfter": false,
    "emailsBefore": false,
    "notificationsAfter": false,
    "notificationsBefore": false
  },
  "recvFrom": "589650fa6e83",
  "revision": 19265572,
  "schema": "SpecialMuteSubmit",
  "seqId": 0,
  "timestamp": 1602290811,
  "userAgent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0",
  "uuid": "57eb9531c805515db41830bc0e75fe6b",
  "webHost": "localhost:8888",
  "wiki": "my_wiki"
}
-- validation ---------------------------------------------------------------
Valid.

@jwang Do we know how often this is happening? In cases where it does happen, are the before and after values the same? If they are, this would imply that they submitted the form without changing anything (i.e. their preference was not updated).
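
The "submitted without changing anything" condition can be sketched as a small predicate over the SpecialMuteSubmit event payload. The field names follow the event shown earlier in this thread; the function name is hypothetical.

```python
def is_noop_submit(event):
    """True when the form was submitted with no change, i.e. both the
    email and notification mute flags are the same before and after."""
    return (
        event["emailsBefore"] == event["emailsAfter"]
        and event["notificationsBefore"] == event["notificationsAfter"]
    )


# Payload of the SpecialMuteSubmit event captured locally above.
local_event = {
    "emailsAfter": False,
    "emailsBefore": False,
    "notificationsAfter": False,
    "notificationsBefore": False,
}
print(is_noop_submit(local_event))
```

Note that this is broader than checking that all four flags are false: a submit where a flag stays true before and after is also a no-op.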

@dbarratt I found only 2 events in all of 2020 where the before and after values are all false. The timestamps are shown below.

image.png (178×408 px, 12 KB)

My sql code:

SELECT  *
FROM specialmutesubmit
WHERE
 year=2020 
 AND NOT event.emailsbefore 
 AND NOT event.emailsafter
 AND NOT event.notificationsbefore 
 AND NOT event.notificationsafter

May I know what kind of data issue you are trying to reproduce? If needed, we can schedule a time and debug together.

Sorry. I'm asking when they are the same so something like:

SELECT  *
FROM specialmutesubmit
WHERE
year=2020 
AND event.emailsbefore == event.emailsafter
AND event.notificationsbefore  == event.notificationsafter

May I know what kind of data issue you are trying to reproduce? If needed, we can schedule a time and debug together.

On my local machine, when I use Special:Mute, an event is recorded in prefupdate. The only way I can make it not update is by submitting the form without changing anything. I'm not sure why it would work on my local machine but not in production. Unless maybe some wikis do not use/enable WikimediaEvents? But if that were the case, they wouldn't have any prefupdate records at all.

(oops, updated the condition in the example SQL in the previous comment)

Understood. There are 5 events in all of 2020 that have the same before and after values.

image.png (382×400 px, 27 KB)

SQL used:

SELECT  *
FROM event.specialmutesubmit
WHERE
year=2020 
AND event.emailsbefore = event.emailsafter
AND event.notificationsbefore  = event.notificationsafter

@Nuria, does anyone on the analytics team know the answer to @dbarratt's question?

On my local machine, when I use Special:Mute, an event is recorded in prefupdate. The only way I can make it not update is by submitting the form without changing anything. I'm not sure why it would work on my local machine but not in production. Unless maybe some wikis do not use/enable WikimediaEvents? But if that were the case, they wouldn't have any prefupdate records at all.

@jwang I think @Mholloway might be able to help, given that this seems to be an instrumentation issue.

WikimediaEvents is enabled on all open, public, non-fishbowl production wikis.

'wmgUseWikimediaEvents' => [
        'default' => true,
        'closed' => false, // T158721
        'private' => false,
        'fishbowl' => false,
],
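
As a rough illustration of how the setting above yields "enabled on all open, public, non-fishbowl wikis", here is a simplified model. It is not MediaWiki's actual configuration resolution; the function name and the "first matching tag wins" rule are assumptions made for this sketch (the rule's ordering does not matter here because every override is false).

```python
# Mirrors the config fragment above: default on, off for three wiki groups.
WMG_USE_WIKIMEDIA_EVENTS = {
    "default": True,
    "closed": False,
    "private": False,
    "fishbowl": False,
}


def uses_wikimedia_events(wiki_tags, setting=WMG_USE_WIKIMEDIA_EVENTS):
    """Return the setting for a wiki given the groups (tags) it belongs to.

    Simplified assumption: the first tag with an explicit override wins,
    otherwise the default applies.
    """
    for tag in wiki_tags:
        if tag in setting and tag != "default":
            return setting[tag]
    return setting["default"]


print(uses_wikimedia_events([]))            # an open, public wiki
print(uses_wikimedia_events(["private"]))   # a private wiki
```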

Let me make sure I understand the issue: when a user mutes another user via the Special:Mute web UI, we would expect both a PrefUpdate and a SpecialMuteSubmit event to be created and sent to their respective eventlogging tables, and this indeed happens locally, but in production it appears that only the SpecialMuteSubmit event is created and persisted. Is that correct?

After an initial look at the code, I am surprised that the PrefUpdate event is created even locally. Looking at the PrefUpdateInstrumentation class in WikimediaEvents, for preference updates via the web UI (not API), it appears that isUserInitiated() will return false (and the onUserSaveOptions() hook handler will accordingly bail out without creating a PrefUpdate event) unless the update was made via Special:Preferences or Special:MobileOptions.

Let me make sure I understand the issue: when a user mutes another user via the Special:Mute web UI, we would expect both a PrefUpdate and a SpecialMuteSubmit event to be created and sent to their respective eventlogging tables, and this indeed happens locally, but in production it appears that only the SpecialMuteSubmit event is created and persisted. Is that correct?

Yes, that is correct.

After an initial look at the code, I am surprised that the PrefUpdate event is created even locally. Looking at the PrefUpdateInstrumentation class in WikimediaEvents, for preference updates via the web UI (not API), it appears that isUserInitiated() will return false (and the onUserSaveOptions() hook handler will accordingly bail out) unless the update was made via Special:Preferences or Special:MobileOptions.

Yeah.... looking at WikimediaEvents\PrefUpdateInstrumentation::onUserSaveOptions() and WikimediaEvents\PrefUpdateInstrumentation::isUserInitiated() it shouldn't be working locally at all. I'll dig into that and see why it works at all. Thanks for noticing that.
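
The gating behaviour described in the comments above can be modelled roughly as follows. This is a Python sketch of the logic as described in this thread, not the actual PHP in WikimediaEvents; the function names, the event shape, and the handling of anything other than web-UI updates are hypothetical, and the API code path is deliberately left out.

```python
from typing import Optional

# Per the description above, only these special pages count as
# user-initiated preference changes for web-UI requests.
USER_INITIATED_PAGES = frozenset({"Preferences", "MobileOptions"})


def is_user_initiated_web(special_page: Optional[str]) -> bool:
    """Model of the web-UI branch: true only for the allowed pages."""
    return special_page in USER_INITIATED_PAGES


def on_user_save_options(special_page: Optional[str], changed: dict) -> list:
    """Return the PrefUpdate-like events that would be logged.

    Bails out with no events when the save is not user-initiated.
    """
    if not is_user_initiated_web(special_page):
        return []
    return [{"property": k, "value": v} for k, v in changed.items()]


change = {"echo-notifications-blacklist": "0"}
print(on_user_save_options("Preferences", change))
print(on_user_save_options("Mute", change))
```

Under this model a save coming from Special:Mute produces no PrefUpdate event, which matches what is observed in production.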

After doing some background digging, it appears that the PrefUpdate schema is intended specifically to capture explicit, user-initiated preference changes (cf. T260867, a bug filed when the check for user initiation was dropped and PrefUpdate events were being created relating to preference creation on user registration). If you drill down, you'll see that User::saveSettings is called from quite a few different places (including several different special pages), but I am gathering that in many of those cases, for purposes of the PrefUpdate schema, the fact of a preference being updated is incidental to what the user is trying to accomplish. Here, for example, it's essentially an implementation detail that mutes are implemented as a user option; the user doesn't navigate to Special:Mute intending to accomplish a preference change.

Taking a step back, and after rereading the task description, it seems like the real issue here is that you want to have sanitized user mute events preserved for longer than 90 days. Perhaps you'd be better off pursuing that directly (see https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Data_retention#How_to_change_the_purging_strategy_of_a_schema) rather than trying to have events related to mute actions duplicated into the prefupdates table?

@Mholloway thank you for digging into it and clarifying the difference.
@dbarratt @Niharika, given that, do you agree to keep data capture as it is, i.e. not merging Special:Mute events into the prefupdate table?

For data retention, I am following up in another task: https://phabricator.wikimedia.org/T262499

Thanks for the explanation @Mholloway. @jwang I agree that we can keep the data retained for longer in its current state without merging it with prefupdate. Thanks!