This task represents the work involved in running an A/B test to evaluate the impact of
disabling the MobileFrontend talk page overlay and introducing the suite of mobile DiscussionTools:
- Reply Tool
- New Topic Tool
- Topic Subscriptions
- Usability Improvements
Research Questions
In running this A/B test, we are seeking to learn whether introducing the set of DiscussionTools listed above causes the following to happen:
- Junior Contributors are more successful publishing new talk page comments and discussion topics
- Junior Contributors intuitively recognize talk pages as spaces to communicate with other volunteers
- Senior Contributors can assess the level of activity on a talk page with less effort
Decision to be made
This A/B test will help us make the following decision: Is the set of mobile Talk Pages Project features fit to be made available to everyone, at all wikis, by default?
Decision Matrix
We do not think a single metric / KPI will be sufficient for evaluating the cumulative impact of the set of DiscussionTools we are introducing in this test.
Reason being: we do not think there is a single metric that is likely to A) move in response to these changes *and* B) move in a direction that indicates a clear improvement or degradation in people's user experience.
In line with the above, we will take a "guardrail" approach to this analysis. Meaning, we will base the Decision to be made on the presence or absence of two unambiguously negative outcomes.
ID | Scenario | Indicator/Metric | Plan of action |
---|---|---|---|
1. | People are more likely to make destructive edits | Proportion of published edits that are reverted within 48 hours of being made increases by >10% over a sustained period of time | 1) Pause plans for wider deployment, 2) To contextualize change in revert rate, investigate changes in the number of published edits (maybe higher revert rate is a "price" we're willing to "pay" for the increase in good edits), 3) Investigate the type of edits being reverted to understand how the new tools – namely the Reply and New Topic Tools – could be contributing to the uptick |
2. | People are less likely to publish the edits they start | Edit completion rate decreases by >10% over a sustained period of time | 1) Pause plans for wider deployment and 2) Investigate what patterns exist among the people whose edit completion rate has gone down |
3. | People do NOT encounter more difficulty publishing edits and there are no regressions in edit revert and edit completion rates | A) Edit completion rate increases by any percentage or it decreases by <10% over a sustained period of time and B) Edit revert rate decreases by any percentage or it increases by <10% over a sustained period of time | Move forward with opt-out deployment at all Wikimedia wikis |
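To illustrate how the three scenarios in the table relate to each other, here is a minimal sketch in Python. It is not the analysis code for this test. It assumes that ">10%" means a relative change between the control and treatment groups (the table does not say whether relative change or percentage points is meant), it treats a change of exactly 10% as not crossing the guardrail, and it does not model "over a sustained period of time" or statistical uncertainty. The names `GroupRates`, `relative_change`, and `triggered_scenarios` are hypothetical.

```python
from dataclasses import dataclass

# The 10% guardrail from the Decision Matrix.
# Assumption: interpreted as a relative change, not percentage points.
THRESHOLD = 0.10


@dataclass
class GroupRates:
    """Hypothetical container for one test group's two guardrail metrics."""
    revert_rate: float      # share of published edits reverted within 48 hours
    completion_rate: float  # share of started edits that end up published


def relative_change(control: float, treatment: float) -> float:
    """Change in the treatment group relative to the control group."""
    return (treatment - control) / control


def triggered_scenarios(control: GroupRates, treatment: GroupRates) -> list[int]:
    """Return the Decision Matrix scenario IDs that the observed rates match."""
    scenarios = []
    # Scenario 1: revert rate increases by more than 10%.
    if relative_change(control.revert_rate, treatment.revert_rate) > THRESHOLD:
        scenarios.append(1)
    # Scenario 2: edit completion rate decreases by more than 10%.
    if relative_change(control.completion_rate, treatment.completion_rate) < -THRESHOLD:
        scenarios.append(2)
    # Scenario 3: neither guardrail is crossed.
    return scenarios or [3]


if __name__ == "__main__":
    control = GroupRates(revert_rate=0.05, completion_rate=0.40)
    treatment = GroupRates(revert_rate=0.052, completion_rate=0.42)
    print(triggered_scenarios(control, treatment))
```

Scenarios 1 and 2 can both apply at once, which is why the sketch returns a list; scenario 3 applies only when neither guardrail is crossed.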
Curiosities
While the scenarios listed in the Decision Matrix section above will guide
Priority | Impact/Outcome | Metric |
---|---|---|
1. | Junior Contributors intuitively recognize talk pages as places to communicate | Percentage of unique Junior Contributors who visit a talk page and engage with it in some way. //Read: expanding a discussion section, initiating the workflow for starting a new discussion, initiating the workflow for replying to a comment someone else has made, etc. |
2. | Senior Contributors can assess the level of activity on a talk page with less effort | Average time between when a contributor views a talk page and when they first engage with the page in some way |
3. | People across experience levels are more successful publishing new talk page comments and discussion topics | A) Average number of talk page new topics or comments people publish during the course of the test and B) Percentage of people that edit a talk page, grouped by number of new topics or comments (e.g. 1-5, 6-10, 11-15, etc) they publish during the course of the test |
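As a rough illustration of metric 3B in the table above, the sketch below groups people by the number of new topics or comments they published and reports the percentage of people in each group. It assumes the input is a mapping from a person identifier to their publish count during the test, and that everyone in it published at least once; the function names and the input shape are hypothetical, not part of any existing pipeline.

```python
from collections import Counter


def bucket_label(count: int, width: int = 5) -> str:
    """Map a publish count to a range label such as "1-5" or "6-10"."""
    if count < 1:
        raise ValueError("only people with at least one published item are bucketed")
    lower = (count - 1) // width * width + 1
    return f"{lower}-{lower + width - 1}"


def bucket_shares(counts_per_person: dict[str, int]) -> dict[str, float]:
    """Percentage of people falling into each publish-count bucket."""
    buckets = Counter(bucket_label(c) for c in counts_per_person.values())
    total = sum(buckets.values())
    return {label: 100 * n / total for label, n in buckets.items()}


if __name__ == "__main__":
    # Made-up counts purely for demonstration.
    example = {"a": 1, "b": 5, "c": 6, "d": 12}
    print(bucket_shares(example))
```

Metric 3A (the average number of new topics or comments per person) could be computed from the same mapping with `statistics.mean` over its values.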
Wikis
This section will contain the list of wikis participating in the A/B test. See T314950.
Open Questions
- 1. Per the question @dchan raised in Editing Scratch, how long do we anticipate the A/B test needing to run, given the number of people using mobile talk pages and the frequency with which they are using them? See T295180 for more details on mobile talk page usage.
- 2. Should the A/B test be limited to wikis that have NOT had access to any mobile talk page improvements via T298221 or T298222 to-date? See T297448#7575858 for more context.
- Yes. The wikis involved in this A/B test will be limited to those that have NOT had access to any mobile DT features prior to the test beginning. See Selection Criteria within T314950's description for more details.
Done
- A report is published that evaluates the hypotheses above